[0083] The present invention will be described in detail below in conjunction with specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit the present invention in any form. It should be pointed out that for those of ordinary skill in the art, a number of modifications and improvements can be made without departing from the concept of the present invention. These all belong to the protection scope of the present invention.
[0084] As shown in Figure 1, the printed-character image recognition and verification system consists of four parts: character generation, character printing, character recognition, and character verification. The system automatically recognizes and verifies the character marks on the product surface, so nobody has to compare the recognized characters by eye against the characters actually printed on the surface, which reduces labor intensity. The four parts are described below.
[0085] 1) Character generation part
[0086] The character generation part is the first part of the system. As shown in Figure 2, the generated string can be divided into three parts: the character part (also called the character body), the check code part, and the connecting part.
[0087] The character part is the character string that identifies the product and is the main body of the generated string. It is generally composed of letters, digits, or a mixture of both. When a product enters the first circulation link, the system assigns it an identification string. This string is generated automatically from the production characteristics of the product (such as batch number and material) and uniquely identifies the product during production. In the character string shown in Figure 3, L indicates that the product belongs to a temporary batch, 0328 is the production date number, and 58 is the batch number among similar products produced on that date. Each character of the string can be formed from a dot matrix of M rows and N columns, in which the dots in the foreground color (black) make up the character itself and the dots in the background color make up the blank background.
[0088] The check code part follows the character part. It is computed from the character part under a check rule (parity, BCC, CRC, etc.) and is used to check whether the characters were recognized correctly. It usually consists of its own characters, separate from those of the character part. For example, under a BCC check, the ASCII codes of the characters in the character body are XORed together, giving a single-byte result; written in hexadecimal, that result is usually a two-character string drawn from the digits 0-9 and the letters A-F. The character range of such a check code therefore falls within the character range of the character body. Consequently, if the check code contains a character that also occurs in the character body, and that character is misrecognized in the character body, it is likely to be misrecognized in the check code as well. The check code then cannot be correct and the recognition result is wrong, so the check code fails in its function of checking the character part.
[0089] In view of this, the check code needs a degree of independence and must be easy to recognize, so that it can check more reliably whether the character string was recognized correctly. The check code should meet the following requirements:
[0090] 1) The check code should have low correlation with the characters of the character body, that is, it should differ from them as much as possible, so that it is not mistaken for a character of the character body.
[0091] 2) The check code should be simple to design and implement. Compared with the character body, the characters of the check code part should be easier to recognize.
[0092] 3) The check code should have a degree of redundancy, that is, when the check code is damaged to a certain extent, it is still recognized correctly.
[0093] When the check code satisfies these conditions, it can itself be recognized with high accuracy, so a correctly recognized check code can be used to decide whether the recognized character string is correct. The check code character shown in Figure 4 has good independence, recognizability, and redundancy, and it can be recognized simply by recognizing triangles.
[0094] The connecting part lies between the character part and the check code part. Its characters tell the computer that the character part has ended and the check code part is about to begin. When the characters of the character part and of the check code part look alike, the two parts can be told apart by a suitably designed connecting part. In the present invention the check code part differs markedly from the character part and can be distinguished clearly, so no connecting characters need to be designed and the connecting part is left empty.
[0095] 2) Character printing part
[0096] The character printing part is the second part of the system. It prints the string produced by the character generation part onto the product surface. After the system assigns an identification string to the product according to its production characteristics, the character generation part automatically computes the check code part from the identification string (the character part) and appends the connecting part and the check code part to the end of it, producing the complete string, called the generated string. The generated string is transmitted from the upper computer to the printing control system of the lower computer, which converts it, according to the predetermined character size and number of printing dots, into a dot matrix pattern that the jet printing controller can recognize. The jet printing controller then drives the jet printing actuator according to the dot matrix pattern to print the generated string at a set position on the product surface, forming a visible character string on the product surface.
[0097] 3) Character recognition part
[0098] After printing, the product, which now carries a unique printed character string on its surface, passes through the corresponding circulation links. In each link the printed character string can be read by image recognition, so that the product's passage through the link is recorded. The character recognition process usually comprises string segmentation, character segmentation, feature extraction, and character recognition.
[0099] ① String segmentation
[0100] String segmentation means separating the string printed on the product surface from the background image. It comprises three steps: string image reading, preprocessing, and string segmentation. In string image reading, the string image is captured and stored by the terminal image acquisition device. The captured image is then preprocessed, for example by grayscale conversion and denoising (grayscale conversion is skipped if the color features of the image are to be used). After preprocessing, the string is extracted from the background using an overall feature of the string region (such as its color, grayscale, or grayscale gradient) that differs from the corresponding feature of the background region, and the extracted string region is binarized. The result is a binary image in which the character string is the foreground color and everything else is the background color.
[0101] ② Character segmentation
[0102] After the character string has been separated from the background, it must be split into individual characters; this process is character segmentation. It usually comprises tilt adjustment of the binarized string image, segmentation of single characters, and normalization of single characters.
[0103] In the binarized image obtained by string segmentation, the character string may be tilted, which adversely affects subsequent processing and recognition. Tilt adjustment is therefore required to bring the character string to an approximately horizontal position. The character string is divided along the width direction into a first half and a second half. The angle between the character string and the horizontal direction is determined from the centroid coordinates of the first half and of the second half. Once the angle is known, the whole string is rotated by that angle in the opposite direction to bring it to a horizontal position.
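The centroid-based tilt estimate can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the list-of-rows image representation (1 for foreground, 0 for background), and the use of the image midline as the split between the two halves are hypothetical and not taken from the original text.

```python
import math

def centroid(points):
    """Mean (x, y) of a non-empty list of pixel coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def tilt_angle_degrees(image):
    """Angle between the string and the image x-axis (y grows downward).

    Assumes both halves of the image contain at least one foreground pixel.
    """
    width = len(image[0])
    foreground = [(x, y) for y, row in enumerate(image)
                  for x, value in enumerate(row) if value]
    left = [p for p in foreground if p[0] < width / 2]
    right = [p for p in foreground if p[0] >= width / 2]
    (lx, ly), (rx, ry) = centroid(left), centroid(right)
    return math.degrees(math.atan2(ry - ly, rx - lx))

# A diagonal stroke on an 8 x 8 grid is tilted by 45 degrees.
diagonal = [[1 if x == y else 0 for x in range(8)] for y in range(8)]
angle = tilt_angle_degrees(diagonal)
```

Rotating the image by the negative of the returned angle would then level the string; the rotation itself is left out of the sketch.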
[0104] After the binarized character string has been levelled, single characters are segmented using the vertical and horizontal projection histograms of the foreground pixels. The troughs of the projection histogram curves are located and a threshold is set; if the projected pixel count at a trough is below the threshold, the trough is taken as a boundary between characters. Once the boundaries are determined, the string image is split into individual characters.
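A minimal Python sketch of projection-based segmentation along the width direction is given below. The function names, the binary list-of-rows image, and the default threshold of 1 (a column counts as a gap only when it holds no foreground pixel) are assumptions made for illustration.

```python
def column_projection(image):
    """Number of foreground pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]

def split_columns(image, threshold=1):
    """Return (start, end) column ranges of the characters.

    A column whose projection is below `threshold` is treated as a gap
    between characters; `end` is exclusive.
    """
    projection = column_projection(image)
    spans, start = [], None
    for x, count in enumerate(projection):
        is_gap = count < threshold
        if not is_gap and start is None:
            start = x                      # a character begins here
        elif is_gap and start is not None:
            spans.append((start, x))       # the character ended before x
            start = None
    if start is not None:
        spans.append((start, len(projection)))
    return spans

# Hypothetical 2-row image holding three narrow "characters".
image = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
spans = split_columns(image)
```

The same routine applied to the transposed image gives the row ranges, which corresponds to the projection along the height direction.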
[0105] The segmented characters, which may differ in size, are then normalized to a common size to simplify subsequent feature extraction.
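One simple way to normalize a segmented character is nearest-neighbour resampling, sketched below in Python. The text does not say which mapping is used, so the choice of nearest-neighbour resampling and the function name are assumptions.

```python
def resize_nearest(char, rows, cols):
    """Resample a binary character image to rows x cols (nearest neighbour)."""
    src_rows, src_cols = len(char), len(char[0])
    return [
        [char[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]

# A 2 x 2 pattern enlarged to the (hypothetical) standard size 4 x 4.
small = [[1, 0],
         [0, 1]]
standard = resize_nearest(small, 4, 4)
```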
[0106] ③ Feature extraction
[0107] The differences between the features of characters are the key to telling character categories apart (for example, the digit 8 contains two loops and the digit 1 contains none). Characters can be classified by extracting the same type of feature from every character (for example, the pixel-by-pixel grayscale feature of the normalized character). Commonly used character features include the pixel-by-pixel grayscale feature and the horizontal and vertical grayscale projection features.
[0108] ④ Character recognition
[0109] The characters printed on the surface of the product fall into two parts: the character part, composed of ordinary digits and letters, and the check code part, composed of specially designed patterns. Feature extraction and all earlier steps are the same for both parts, but the recognition step differs.
[0110] 1) Character part
[0111] With a given feature extraction method, the same type of feature can be extracted from every character. Characters of the same category have very similar features. A large number of character samples can therefore be collected and sorted into categories (such as the digits 0-9 and the letters A-Z or a-z), and the same feature extraction method applied to all of them; the features within a category are very similar and differ from those of other categories. A character classifier is built from the statistics of these similarities and differences in the known samples. When a new character is encountered, its features are compared with those of every category in the classifier, and the new character is assigned to the category with the highest similarity. There are many ways to design a classifier, such as neural network classifiers, support vector machine classifiers, and clustering classifiers.
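The "highest similarity" rule can be illustrated with a very small template-matching classifier in Python. This is a simplified stand-in and not one of the classifier designs named above: the similarity measure (fraction of matching pixels), the function names, and the 3 x 3 templates are all hypothetical.

```python
def similarity(a, b):
    """Fraction of positions at which two equal-length 0/1 vectors agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def classify(feature, templates):
    """Return the label whose template is most similar to `feature`."""
    return max(templates, key=lambda label: similarity(feature, templates[label]))

# Hypothetical 3 x 3 templates, flattened row by row.
templates = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}
# An "I" with one spurious foreground pixel in the bottom-right corner.
noisy = [0, 1, 0,
         0, 1, 0,
         0, 1, 1]
label = classify(noisy, templates)
```

In practice each template would be replaced by statistics learned from many samples of the category.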
[0112] 2) Check code characters
[0113] The characters of the check code part are recognized differently from those of the character body. They are not composed of digits and letters, so they cannot be recognized with the classifier used for the character body. Instead, a dedicated recognition method is devised from the design principles and characteristics of the check code characters. The most obvious feature of the check code shown in Figure 4 is that its characters are composed of simple triangles, so the check code can be recognized by recognizing the triangles.
[0114] 4) Character verification part
[0115] The string printed on the surface of the product consists of two parts: the character string of the character part and the string formed by the check code. Recognizing the character part yields its character string, and the check code string is obtained with the recognition method for check code characters. Because the check code is a hexadecimal string computed under a given check rule from the ASCII codes of the characters in the character string, each character string corresponds to exactly one check code string, so whether the character string was recognized correctly can be judged by comparing against the check code string.
[0116] Suppose that, when product A enters the first circulation link, the host computer system assigns it the production order number AS120722 according to its production characteristics, and that the order number is to be identified using the printed-character image recognition and verification method. The identification structure and steps for the order number of product A are shown in Figure 5.
[0117] 1) Obtain the production order number AS120722 of product A from the host computer. The production order number is the character body of the character generation part.
[0118] 2) After the character body is determined, the check method is selected. If the system verifies the final recognized characters with the BCC check method, then under the BCC check rule the check code of the production order number AS120722 is the hexadecimal number 16H.
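The BCC value quoted here can be reproduced with a few lines of Python. The sketch assumes, as the text describes, that the BCC is the XOR of the ASCII codes of all characters, written as two hexadecimal digits; the function name is hypothetical.

```python
from functools import reduce

def bcc_check_code(text):
    """XOR the ASCII codes of all characters; return two uppercase hex digits."""
    value = reduce(lambda acc, byte: acc ^ byte, text.encode("ascii"), 0)
    return f"{value:02X}"

check_code = bcc_check_code("AS120722")
print(check_code)  # prints 16
```

Step by step, the running XOR over A, S, 1, 2, 0, 7, 2, 2 is 41H, 12H, 23H, 11H, 21H, 16H, 24H, 16H.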
[0119] 3) Select the dot matrix pattern used for printing the characters, and design the check code. The production order number of product A is AS120722, and each character of the string is formed from a dot matrix of M rows and N columns, as shown in Figure 6. Following the three requirements on check code design given above, the check code is designed as follows:
[0120] ① Since each character of the character string is formed from a dot matrix of M rows and N columns (M > N), each character of the check code is provisionally set to a square dot matrix of M rows and M columns.
[0121] ② Under the check rule, the check code can always be represented by the binary digits 0 and 1, with a foreground dot matrix area representing 1 and a background dot matrix area representing 0. In jet printing or ordinary printing, the most common defect is that an entire row or column of foreground dots is missing; other kinds of missing dots are less common. If a coding principle similar to a one-dimensional barcode were adopted, in which combinations of bars and spaces of different widths represent different characters, then a missing row or column (especially a column) would easily turn a correct code into a wrong one. As shown in Figure 7, the left picture is correct and represents the binary number 1010, while in the right picture an entire column was not printed, so a thick bar became a thin one and the pattern reads as the binary number 00010, which is a recognition error.
[0122] ③ To deal with this problem, the following solution is adopted.
[0123] Isosceles right triangle representation. According to ①, each character of the check code is a square dot matrix of M rows and M columns. The square dot matrix can be divided into four isosceles right triangles; a triangle filled with foreground dots represents 1, and a blank triangle of background dots represents 0. Composing the square from four isosceles right triangles of the same size has the following advantages. First, the triangle is the simplest and most basic kind of figure and is very easy to recognize. Second, for a given square dot matrix area, isosceles right triangles give a larger number of regions (and so can represent more binary digits), and each triangle is crossed by more horizontal and vertical lines of dots (the more rows and columns cross a triangle, the smaller the effect of losing an entire row or column). Third, the legs and right angle of the isosceles right triangle allow several combinations (for example, two isosceles right triangles form a larger isosceles right triangle, and four form a square), and each combination has its own distinctive features. For example, for single isosceles right triangles the binary digits 0 and 1 can be ordered by the outward normal direction of the hypotenuse (assuming that the characters are ordered from left to right and the outward normals of the hypotenuses are taken in counterclockwise order, the check code in Figure 8 can be expressed as 1110), and the normal direction of the hypotenuse of a large isosceles right triangle formed from two small ones differs from that of any single small triangle.
[0124] The check code described above is recognized as follows. 1) Count the foreground dots and, from their proportion of the whole square dot matrix, determine how many isosceles right triangles are formed by the foreground color and how many by the background color. 2) Detect the triangular regions with a triangle detection method. Among all combinations of the isosceles right triangles, only two produce a foreground shape that is not a triangle. In the first, shown in Figure 8, the foreground region is not triangular but the background region is, so the triangle formed by the background pixels can be detected to confirm how the square dot matrix is composed. In the second, the foreground is a square, that is, the whole square dot matrix is entirely foreground or entirely background; this case is also easy to detect and follows directly from 1). 3) Determine the approximate normal direction of the hypotenuse of each triangle, and from it the order of the foreground triangles and hence the order of the binary digits 0 and 1. For single isosceles right triangles, as in Figure 9, where the check code is 1010, the normal direction of the hypotenuse is easy to determine. For a large isosceles right triangle composed of two small ones, the normal direction of the hypotenuse is also easy to obtain, and it differs markedly from that of a small triangle. The order of the 0s and 1s of the check code is then determined from the normal directions together with the number of triangles found in 1).
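The following Python sketch reads one check code square. It swaps in a simpler technique than the one described above: instead of triangle detection and hypotenuse normals, it takes a majority vote over the dots of each triangular region. It assumes that the square is cut by its two diagonals and that the counterclockwise order starts at the top triangle (top, left, bottom, right); that starting position and all names are hypothetical, since the figures fix the actual order.

```python
ORDER = ("top", "left", "bottom", "right")  # assumed counterclockwise order

def region_of(r, c, m):
    """Triangle containing dot (r, c) of an m x m square, or None on a diagonal."""
    y, x = r - (m - 1) / 2, c - (m - 1) / 2
    if abs(x) == abs(y):
        return None
    if abs(y) > abs(x):
        return "top" if y < 0 else "bottom"
    return "left" if x < 0 else "right"

def render_square(bits, m):
    """Draw an m x m square whose triangles are filled according to `bits`."""
    on = {name for name, bit in zip(ORDER, bits) if bit}
    return [[1 if region_of(r, c, m) in on else 0 for c in range(m)]
            for r in range(m)]

def decode_square(square):
    """Read four bits by majority vote over the dots of each triangle."""
    m = len(square)
    filled = dict.fromkeys(ORDER, 0)
    total = dict.fromkeys(ORDER, 0)
    for r in range(m):
        for c in range(m):
            region = region_of(r, c, m)
            if region is not None:
                total[region] += 1
                filled[region] += square[r][c]
    return [1 if 2 * filled[name] > total[name] else 0 for name in ORDER]

square = render_square([0, 1, 1, 0], 8)
for row in square:
    row[2] = 0                      # simulate an entire missing column
bits = decode_square(square)
```

Because each triangle spans several rows and columns, losing one column removes only a minority of its dots, so the vote is unchanged; this is the redundancy property required of the check code.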
[0125] 4) Once the character part and the check code part have been determined, both are printed on the surface of the product. The character part consists of digits and letters and is printed as such; the check code part is printed as square dot matrix areas composed of isosceles right triangles. The character part is ordered from left to right, and the triangle regions of the check code also run from left to right, counterclockwise within each square. The check code is 16H, which is 00010110 in binary. These 8 binary digits require 8 small isosceles right triangles, that is, 2 square dot matrix areas. The digit 1 occupies the 4th, 6th, and 7th positions from the left, so the 4th, 6th, and 7th isosceles right triangles, counted from left to right and counterclockwise by outward normal direction, are foreground regions. If the foreground is defined as black, the printed image of the check code part is as shown on the right of Figure 8, and the entire string of the product is as shown on the left of Figure 7.
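The mapping from the check code 16H to triangle positions can be worked through in Python. The sketch assumes one hexadecimal digit per square and one bit per triangle in reading order; the function name is hypothetical.

```python
def check_code_to_squares(check_hex):
    """Map each hex digit to the 4 bits of one square, most significant first."""
    return [[int(bit) for bit in f"{int(digit, 16):04b}"] for digit in check_hex]

squares = check_code_to_squares("16")
all_bits = sum(squares, [])                      # 8 bits, left to right
foreground = [i + 1 for i, bit in enumerate(all_bits) if bit]
print(squares)      # [[0, 0, 0, 1], [0, 1, 1, 0]]
print(foreground)   # [4, 6, 7]
```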
[0126] 5) After the identification string of the entire product is generated, the jet printing or printing equipment prints the character string shown in Figure 7 onto the surface of the product.
[0127] 6) When the product reaches a given production link, image acquisition equipment such as an industrial camera (or a handheld image acquisition device) captures the character string printed on the product surface, forming a digital image of the character string as shown in Figure 7.
[0128] 7) After the image of the identification string is obtained, the string is first extracted from the background image. Many methods can be used; here, a clustering algorithm (K-means) and an adaptive threshold segmentation algorithm can be used in combination to extract the string from the background image.
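As a rough illustration of clustering-based extraction, the Python sketch below runs a two-cluster K-means on the gray values alone and binarizes at the midpoint of the two cluster centers. It assumes dark ink on a brighter background and omits the adaptive threshold stage and any spatial information; the names and the sample gray values are hypothetical.

```python
def kmeans_threshold(gray, iterations=20):
    """Two-cluster 1-D K-means on gray values; returns the split value."""
    values = [v for row in gray for v in row]
    dark, bright = min(values), max(values)      # initial cluster centers
    for _ in range(iterations):
        cut = (dark + bright) / 2                # nearest-center boundary
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        if not low or not high:
            break                                # only one cluster present
        new_dark, new_bright = sum(low) / len(low), sum(high) / len(high)
        if (new_dark, new_bright) == (dark, bright):
            break                                # converged
        dark, bright = new_dark, new_bright
    return (dark + bright) / 2

def binarize(gray):
    """1 for (dark) foreground pixels, 0 for background pixels."""
    cut = kmeans_threshold(gray)
    return [[1 if v <= cut else 0 for v in row] for row in gray]

gray = [[200, 210, 30],
        [205, 25, 35]]
binary = binarize(gray)
```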
[0129] 8) The image obtained by extracting the character string from the background is a binary image containing only the character string. The angle between the x-axis of the image reference coordinate system (the width direction of the image) and the line joining the centroid of the left half and the centroid of the right half of the binarized image is then calculated, and the character string is rotated by this tilt angle so that it is parallel to the width direction of the image.
[0130] 9) After the character string has been levelled, the boundaries between characters are determined from the projections of the foreground pixels of the character string along the width and height directions, and the character string is divided into individual characters. The identification string of product A is divided into 10 characters.
[0131] 10) Because the segmented characters may differ in size, a standard character size is chosen (such as M rows and N columns), a mapping from each character to the standard size is established, and characters of inconsistent size are converted to the standard size.
[0132] 11) After all characters have been converted to the same size, the features of each character are extracted. For example, the pixel-by-pixel feature extraction method takes features from the gray level of the pixels: a foreground pixel has the feature value 1 and a background pixel has the feature value 0. Ordered by pixel position, these values form a feature vector of 0s and 1s (an image of M rows and N columns yields M x N values of 0 or 1).
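A minimal Python sketch of the pixel-by-pixel feature vector follows. The row-by-row ordering, the function name, and the 5 x 3 example glyph are assumptions for illustration.

```python
def pixel_features(char):
    """Flatten an M x N binary character, row by row, into M*N values of 0 or 1."""
    return [1 if value else 0 for row in char for value in row]

# Hypothetical 5 x 3 rendering of the digit 1.
glyph = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 1],
]
features = pixel_features(glyph)
```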
[0133] 12) The identification string contains both the character part and the check code part. Each feature vector of the character part is sent to the designed classifier. With a neural network classifier, for example, the feature vector is used as the input layer vector; the output layer vector is computed from the weight matrices from the input layer to the hidden layer and from the hidden layer to the output layer, and the character category of the input is determined from the output layer vector. The check code part is recognized from the positions and areas of the triangles. On the right of Figure 7, reading from left to right and counterclockwise, the first square dot matrix contains one isosceles right triangle, in the 4th position, so this character is the binary number 0001; the second square dot matrix contains two isosceles right triangles, in the 2nd and 3rd positions, so this character is the binary number 0110. Combining the two gives the binary number 00010110, which is the hexadecimal number 16.
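The conversion from recognized triangle positions to the hexadecimal check code can be sketched in Python as follows, assuming four bits per square read in the stated order; the function name is hypothetical.

```python
def squares_to_hex(squares):
    """Turn a list of 4-bit squares into an uppercase hexadecimal string."""
    digits = []
    for bits in squares:
        value = int("".join(str(bit) for bit in bits), 2)
        digits.append(f"{value:X}")
    return "".join(digits)

# First square: triangle 4 filled; second square: triangles 2 and 3 filled.
recognized = [[0, 0, 0, 1], [0, 1, 1, 0]]
check_code = squares_to_hex(recognized)
print(check_code)  # prints 16
```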
[0134] 13) Following step 12), the characters of the character part and of the check code part are recognized separately. If the image acquisition and recognition system recognizes the character part of product A as AS120722, then under the BCC check rule its check code is 16. If the check code part is recognized as the hexadecimal digits 1 and 6, the check code read from the image matches the theoretical check code of the character part, which shows that the character string was recognized correctly. If they do not match, the character recognition is wrong.
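The final comparison can be sketched in Python as below. The BCC is assumed to be the XOR of the ASCII codes written as two hexadecimal digits; the function names and the misread example are hypothetical.

```python
from functools import reduce

def bcc_hex(text):
    """Two-digit uppercase hex BCC (XOR of the ASCII codes) of `text`."""
    return f"{reduce(lambda a, b: a ^ b, text.encode('ascii'), 0):02X}"

def verify(recognized_text, recognized_check):
    """True when the check code read from the image matches the one
    recomputed from the recognized character part."""
    return bcc_hex(recognized_text) == recognized_check.upper()

ok = verify("AS120722", "16")    # correct recognition
bad = verify("A5120722", "16")   # 'S' misread as '5': recomputed BCC is 70
```

A mismatch only signals that something was misread; it does not say which character is wrong.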
[0135] The specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to these specific embodiments, and those skilled in the art can make various variations or modifications within the scope of the claims without affecting the essence of the present invention.