



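Character recognition generally consists of three parts: acquisition of the text information, analysis and processing of that information, and classification of that information.

Information acquisition: the gray levels of the text on paper are converted into electrical signals and fed into the computer.

Information analysis and processing: noise and interference in the converted signal caused by print quality, paper quality (uniformity, stains, etc.), or the writing instrument are removed, and the text is normalized for size, skew, darkness, stroke thickness, and so on.

Information classification: the denoised, normalized text information is classified to produce the recognition result.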
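As an illustration of the size normalization step, here is a minimal sketch. It assumes the character is already a binary grid (1 = ink, 0 = background), crops it to the bounding box of the ink, and rescales it to a fixed grid by nearest-neighbor sampling. The function name `normalize_size` and the 5 x 7 default are choices made for this example only.

```python
def normalize_size(binary, out_h=7, out_w=5):
    """Crop a binary character image to its ink and rescale it to out_h x out_w."""
    # Rows and columns that contain at least one ink pixel.
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not rows:
        # Blank input: return a blank grid of the target size.
        return [[0] * out_w for _ in range(out_h)]
    top, left = rows[0], cols[0]
    h = rows[-1] + 1 - top
    w = cols[-1] + 1 - left
    # Nearest-neighbor sampling from the cropped box into the target grid.
    return [
        [binary[top + (y * h) // out_h][left + (x * w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```

Skew, darkness, and stroke-thickness normalization would need their own steps; this sketch covers size only.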


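A typical OCR pipeline goes like this:
1. Detect and extract the text region.
2. Correct the text skew, using for example the Radon or Hough transform.
3. Use a projection histogram to split the region into single-line text images.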
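Step 3 can be sketched roughly as follows, assuming the text region is already deskewed and binarized (1 = ink, 0 = background). The function name `split_lines` is made up for this example; a real page would usually need a small threshold instead of an exact zero to tolerate noise.

```python
def split_lines(binary):
    """Return (top, bottom) row ranges of text lines; bottom is exclusive."""
    # Horizontal projection: number of ink pixels in each row.
    profile = [sum(row) for row in binary]
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                    # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y))     # an empty row ends the line
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # line touching the bottom edge
    return lines
```

Each returned range can then be cropped out and passed on to single-line recognition.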

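The last step is OCR on each single line. There are two main approaches to single-line OCR. The first requires segmenting the line into characters. There are many ways to do that; the most widely used takes the extrema of the projection histogram as candidate cut points, then uses a classifier plus beam search to find the best cut points.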
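The candidate cut points can be found along these lines. This is only a sketch of the first half of the method: it takes local minima of the vertical projection as candidates and leaves out the classifier and beam search that would choose among them. The input is assumed to be a binary single-line image (1 = ink), and `candidate_cuts` is a name invented here.

```python
def candidate_cuts(line):
    """Return x positions where the vertical projection has a local minimum."""
    width = len(line[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in line) for x in range(width)]
    cuts = []
    for x in range(1, width - 1):
        # On a flat valley, this picks the last column of the plateau.
        if profile[x] <= profile[x - 1] and profile[x] < profile[x + 1]:
            cuts.append(x)
    return cuts
```

Touching or broken characters produce spurious or missing minima, which is why the candidates are normally scored by a classifier rather than used directly.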


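The general flow is: grayscale -> binarization -> image rectification -> feature extraction (there are many methods, for example PCA, LBP, and so on) -> classifier (typical choices include SVM, ANN, KNN, and so on).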
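The first two stages of that flow might look like the sketch below. The text does not say how the threshold is chosen; Otsu's method is used here as one common option, and the luminance weights are the usual 0.299/0.587/0.114 ones. All function names are illustrative.

```python
def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to 0..255 gray levels."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]

def otsu_threshold(gray):
    """Pick the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the dark class (values <= t)
    sum0 = 0    # sum of gray levels in the dark class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(gray, t):
    """Dark pixels (<= t) become ink (1), the rest background (0)."""
    return [[1 if v <= t else 0 for v in row] for row in gray]
```

This assumes dark text on a light background; for light text on a dark background the comparison in `binarize` would be flipped.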

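Today's CNNs (convolutional neural networks) can largely remove the need for feature engineering. The second approach requires no character segmentation: it is end-to-end recognition, but it presupposes a large labeled dataset.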

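This approach outputs the character sequence directly, without segmenting the image. For short text, such as license plates and CAPTCHAs, multi-label classification can be used. I have tried multi-label classification on license plates myself.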
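One plausible reading of multi-label classification for fixed-length text is that the network emits one score vector per character position, and each position is decoded independently. The sketch below shows only the label encoding and decoding, not the network; `encode_label`, `decode_scores`, and the toy alphabet are assumptions made for this example.

```python
def encode_label(text, alphabet):
    """Encode text as one one-hot vector per character position."""
    return [[1 if ch == a else 0 for a in alphabet] for ch in text]

def decode_scores(scores, alphabet):
    """Pick the highest-scoring character at each position."""
    chars = []
    for row in scores:
        best = max(range(len(row)), key=row.__getitem__)
        chars.append(alphabet[best])
    return "".join(chars)

alphabet = "ABC123"
target = encode_label("A1", alphabet)       # training target for a 2-character plate
recovered = decode_scores(target, alphabet)  # decoding the target gives the text back
```

Because the number of positions is fixed in advance, this only suits inputs of known, short length such as plates or CAPTCHAs.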


What improvements have been made to convolutional neural networks?








For character recognition, use a neural network.

For details, see the Character Recognition application in the neural network guide:

Appcr1: Character Recognition

It is often useful to have a machine perform pattern recognition. In particular, machines that can read symbols are very cost effective. A machine that reads banking checks can process many more checks than a human being in the same time. This kind of application saves time and money, and eliminates the requirement that a human perform such a repetitive task. The demonstration appcr1 shows how character recognition can be done with a backpropagation network.

Problem Statement

A network is to be designed and trained to recognize the 26 letters of the alphabet. An imaging system that digitizes each letter centered in the system's field of vision is available. The result is that each letter is represented as a 5 by 7 grid of Boolean values.

For example, here is the letter A.

[Figure: the letter A on the 5 by 7 grid]

Load the alphabet letter definitions and their target representations.

    [alphabet,targets] = prprob;

However, the imaging system is not perfect, and the letters can suffer from noise.

Perfect classification of ideal input vectors is required, and reasonably accurate classification of noisy vectors.

The twenty-six 35-element input vectors are defined in the function prprob as a matrix of input vectors called alphabet. The target vectors are also defined in this file with a variable called targets. Each target vector is a 26-element vector with a 1 in the position of the letter it represents, and 0's everywhere else. For example, the letter A is to be represented by a 1 in the first element (as A is the first letter of the alphabet), and 0's in elements two through twenty-six.

Neural Network

The network receives the 35 Boolean values as a 35-element input vector. It is then required to identify the letter by responding with a 26-element output vector. The 26 elements of the output vector each represent a letter. To operate correctly, the network should respond with a 1 in the position of the letter being presented to the network. All other values in the output vector should be 0.

In addition, the network should be able to handle noise. In practice, the network does not receive a perfect Boolean vector as input. Specifically, the network should make as few mistakes as possible when classifying vectors with noise of mean 0 and standard deviation of 0.2 or less.

Architecture

The neural network needs 35 inputs and 26 neurons in its output layer to identify the letters. The network is a two-layer log-sigmoid/log-sigmoid network. The log-sigmoid transfer function was picked because its output range (0 to 1) is perfect for learning to output Boolean values.

The hidden (first) layer has 25 neurons. This number was picked by guesswork and experience. If the network has trouble learning, then neurons can be added to this layer. If the network solves the problem well, but a smaller more efficient network is desired, fewer neurons could be tried.

The network is trained to output a 1 in the correct position of the output vector and to fill the rest of the output vector with 0's. However, noisy input vectors can result in the network's not creating perfect 1's and 0's. After the network is trained the output is passed through the competitive transfer function compet. This makes sure that the output corresponding to the letter most like the noisy input vector takes on a value of 1, and all others have a value of 0. The result of this postprocessing is the output that is actually used.

Initialization

Create the two-layer network with newff.

    net = newff(alphabet,targets,25);

Training

To create a network that can handle noisy input vectors, it is best to train the network on both ideal and noisy vectors. To do this, the network is first trained on ideal vectors until it has a low sum squared error.

Then the network is trained on 10 sets of ideal and noisy vectors. The network is trained on two copies of the noise-free alphabet at the same time as it is trained on noisy vectors. The two copies of the noise-free alphabet are used to maintain the network's ability to classify ideal input vectors.

[Figure: network architecture diagram]
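To make the data handling in the excerpt concrete, here is a rough Python sketch (not the MATLAB toolbox code) of three pieces it describes: the 26 one-hot target vectors, Gaussian noise with mean 0 and standard deviation 0.2 added to an input vector, and a winner-take-all postprocessing step in the spirit of compet. The Python function names are invented for this example, and breaking ties in favor of the first maximum is an assumption.

```python
import random

def one_hot_targets(n=26):
    """Target j has a 1 in position j and 0 everywhere else."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]

def add_noise(vector, std=0.2, rng=random):
    """Add zero-mean Gaussian noise to every element of an input vector."""
    return [v + rng.gauss(0.0, std) for v in vector]

def compet(outputs):
    """Winner-take-all: 1 at the largest output, 0 elsewhere (first max wins ties)."""
    winner = max(range(len(outputs)), key=outputs.__getitem__)
    return [1 if i == winner else 0 for i in range(len(outputs))]

targets = one_hot_targets()
# A noisy, imperfect network output still maps to a clean one-hot answer.
cleaned = compet([0.2, 0.9, 0.4])
```

Training itself (the newff network and its noisy training passes) is not reproduced here.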


