[0046] All features disclosed in this specification, and all steps of any disclosed method or process, may be combined in any manner, except for mutually exclusive features and/or steps.
[0047] Any feature disclosed in this specification (including any accompanying claims and abstract) may, unless specifically stated otherwise, be replaced by other equivalent or alternative features serving a similar purpose. That is, unless otherwise stated, each feature is just one example of a series of equivalent or similar features.
[0048] As shown in Figure 1, this embodiment discloses an automatic book classification device that includes an image collector, a text recognizer, a text matching circuit, and an information extraction circuit connected in sequence, in which:
[0049] The image collector is configured to: collect the image data of the book cover and pass it to the text recognizer;
[0050] The text recognizer is configured to: recognize the book cover text in the image data, and output the cover text information to the text matching circuit;
[0051] The text matching circuit is configured to: receive the cover text information, match it against the book database to obtain the corresponding book attribute information, and output the book attribute information to the information extraction circuit;
[0052] The information extraction circuit is configured to extract the classification information of the book from the book attribute information.
[0053] Preferably, the classification information includes one or more of the subject classification information of the book, the price classification information of the book, the audience classification information of the book, or the evaluation grade classification information of the book.
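The four stages described above can be pictured as a chain of functions. The following is only a minimal sketch under stated assumptions: the function names, the `BookAttributes` fields, and the one-entry in-memory `BOOK_DATABASE` are hypothetical, and the image collector and text recognizer are replaced by trivial stand-ins so that the data flow between stages is visible.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class BookAttributes:
    """Hypothetical book attribute record; all field names are assumptions."""
    title: str
    subject: str
    price_band: str
    audience: str
    rating_grade: str


# Hypothetical in-memory stand-in for the book database (made-up entry).
BOOK_DATABASE: Dict[str, BookAttributes] = {
    "introduction to widgets": BookAttributes(
        title="Introduction to Widgets",
        subject="Engineering",
        price_band="mid",
        audience="undergraduate",
        rating_grade="A",
    ),
}


def collect_image(source: str) -> bytes:
    """Image collector stand-in: pretends the cover image is its text as bytes."""
    return source.encode("utf-8")


def recognize_text(image_data: bytes) -> str:
    """Text recognizer stand-in: 'recognizes' the text by decoding the bytes."""
    return image_data.decode("utf-8")


def match_text(cover_text: str) -> Optional[BookAttributes]:
    """Text matching stage: look the cover text up in the book database."""
    return BOOK_DATABASE.get(cover_text.strip().lower())


def extract_classification(attrs: BookAttributes,
                           fields: Sequence[str] = ("subject",)) -> Dict[str, str]:
    """Information extraction stage: keep only the requested classification fields."""
    return {name: getattr(attrs, name) for name in fields}


def classify_book(source: str,
                  fields: Sequence[str] = ("subject",)) -> Optional[Dict[str, str]]:
    """Run the four stages in sequence; return None when nothing matches."""
    attrs = match_text(recognize_text(collect_image(source)))
    return None if attrs is None else extract_classification(attrs, fields)
```

Passing `fields=("subject", "audience")` would return more than one kind of classification information, in line with the "one or more" wording above.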
[0054] As shown in Figure 2, in one embodiment, the above-mentioned text recognizer includes an image processing circuit, a feature value extraction circuit, and a feature value matching circuit that are connected in sequence, wherein:
[0055] The image processing circuit is configured to: preprocess the image data, and output a book cover image to the feature value extraction circuit;
[0056] The feature value extraction circuit is configured to: extract the feature value of the book cover image and output it to the feature value matching circuit;
[0057] The feature value matching circuit is configured to match the received feature value to the corresponding text in the text feature value library, and output cover text information.
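As an illustration of matching feature values against a text feature value library, the sketch below picks the library entry nearest to each received feature value. The specification does not state how feature values are represented or compared, so the tuple representation (per-column ink counts of a 3x3 glyph), the L1 distance, the `max_distance` threshold, and the three-entry library are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

FeatureValue = Tuple[int, ...]

# Hypothetical text feature value library: character -> per-column ink counts.
TEXT_FEATURE_LIBRARY: Dict[str, FeatureValue] = {
    "I": (0, 3, 0),
    "L": (3, 1, 1),
    "T": (1, 3, 1),
}


def distance(a: FeatureValue, b: FeatureValue) -> int:
    """L1 distance between two feature values of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_feature(feature: FeatureValue,
                  library: Dict[str, FeatureValue] = TEXT_FEATURE_LIBRARY,
                  max_distance: int = 1) -> str:
    """Return the nearest library character, or '?' if nothing is close enough."""
    best_char, best_dist = min(
        ((char, distance(feature, ref)) for char, ref in library.items()),
        key=lambda pair: pair[1],
    )
    return best_char if best_dist <= max_distance else "?"


def recognize(features: Iterable[FeatureValue]) -> str:
    """Match feature values in sequence and join the recognized characters."""
    return "".join(match_feature(f) for f in features)
```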
[0058] Preferably, the preprocessing includes cover image positioning, edge extraction, and binarization processing, and may additionally include morphological processing.
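The preprocessing steps named above can be sketched on a small grayscale image held as nested lists. This is an illustrative simplification, not the disclosed circuit: the threshold value, the bounding-box approach to positioning, the 4-neighbour edge rule, and the cross-shaped dilation used as the morphological step are assumptions, and binarization is performed first here purely for convenience.

```python
from typing import List

Image = List[List[int]]
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map dark pixels (below threshold) to 1 (ink) and all others to 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def locate(binary: Image) -> Image:
    """Positioning: crop to the bounding box of the ink pixels."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        return []
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def is_ink(binary: Image, r: int, c: int) -> bool:
    """True if (r, c) lies inside the image and holds an ink pixel."""
    return 0 <= r < len(binary) and 0 <= c < len(binary[0]) and binary[r][c] == 1


def extract_edges(binary: Image) -> Image:
    """Keep ink pixels that touch background or the image border."""
    return [[1 if binary[r][c] and any(not is_ink(binary, r + dr, c + dc)
                                       for dr, dc in NEIGHBOURS) else 0
             for c in range(len(binary[0]))]
            for r in range(len(binary))]


def dilate(binary: Image) -> Image:
    """Morphological dilation with a cross-shaped structuring element."""
    return [[1 if is_ink(binary, r, c) or any(is_ink(binary, r + dr, c + dc)
                                              for dr, dc in NEIGHBOURS) else 0
             for c in range(len(binary[0]))]
            for r in range(len(binary))]


def preprocess(gray: Image, morphological: bool = False) -> Image:
    """Binarize, locate, extract edges, and optionally dilate."""
    located = locate(binarize(gray))
    if not located:
        return []
    edges = extract_edges(located)
    return dilate(edges) if morphological else edges
```

The `morphological` flag mirrors the two variants in the paragraph above: the three basic steps alone, or the three steps followed by morphological processing.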
[0059] In a specific embodiment, the image collector is a camera device or a scanning device.
[0060] As shown in Figure 3, the feature value matching circuit includes a paragraph dividing module circuit and a library matching module circuit that connects the paragraph dividing module circuit to the text feature value library, wherein:
[0061] The library matching module circuit is configured to: sequentially match the received feature values to the corresponding text in the text feature value library, and output the recognized text to the paragraph dividing module circuit;
[0062] The paragraph dividing module circuit is configured to: receive the text sent by the library matching module circuit, divide the received text into several paragraphs according to the layout of the text in the book cover image, and output the cover text information divided into several paragraphs;
[0063] The text matching circuit is configured to: match the received cover text information against the book database sequentially, paragraph by paragraph; once a paragraph is matched to book attribute information in the book database, stop matching the subsequent paragraphs and output the book attribute information to the information extraction circuit.
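The paragraph-by-paragraph matching with early termination can be sketched as a simple loop. The dictionary-based `BOOK_DATABASE`, its single made-up entry, and the exact-match lookup on normalized text are assumptions for illustration; the function also reports how many paragraphs were examined, which makes the early stop observable.

```python
from typing import Dict, Iterable, Optional, Tuple

Attributes = Dict[str, str]

# Hypothetical book database keyed by normalized cover text (made-up entry).
BOOK_DATABASE: Dict[str, Attributes] = {
    "introduction to widgets": {"subject": "Engineering", "audience": "Undergraduate"},
}


def match_paragraphs(paragraphs: Iterable[str],
                     database: Dict[str, Attributes] = BOOK_DATABASE
                     ) -> Tuple[Optional[Attributes], int]:
    """Try paragraphs in order; stop at the first one found in the database.

    Returns (attributes or None, number of paragraphs examined).
    """
    examined = 0
    for paragraph in paragraphs:
        examined += 1
        attributes = database.get(paragraph.strip().lower())
        if attributes is not None:
            return attributes, examined  # later paragraphs are never matched
    return None, examined
```

For a cover divided into "Second Edition", "Introduction to Widgets", and "Example Press", the loop stops after the second paragraph and never examines the third.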
[0064] In one embodiment, the above-mentioned paragraph dividing module circuit is configured to: add interval identifiers at the positions where the text is not continuous, based on the continuity of the text layout in the book cover image;
[0065] The text matching circuit is configured to: receive the cover text information and match the text between each two consecutive interval identifiers against the book database; once such text is matched to book attribute information in the book database, stop matching the subsequent text and output the book attribute information to the information extraction circuit.
[0066] In a specific embodiment, the above-mentioned book database includes a book name item, a publisher item, and an author item that are related to each other, and the related book name item, publisher item, and author item correspond to the same book attribute information;
[0067] The text matching circuit is configured to: receive the cover text information, and when the text between two consecutive interval identifiers is matched to a corresponding item under the book name item, publisher item, or author item of the book database, extract the book attribute information corresponding to the matched item and output it to the information extraction circuit.
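The interval identifiers and the linked book name, publisher, and author items can be sketched together as follows. Everything concrete here is an assumption made for illustration: the identifier character `SEP`, the `(character, line, column)` representation of recognized text, the rule that any line change or column jump counts as a discontinuity, the `BookRecord` type, and the made-up record.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

SEP = "\x1f"  # hypothetical interval identifier (ASCII unit separator)
PositionedChar = Tuple[str, int, int]  # (character, line, column)


def place(text: str, line: int, start_col: int) -> List[PositionedChar]:
    """Helper for examples: lay a string out on one line, one column per character."""
    return [(ch, line, start_col + i) for i, ch in enumerate(text)]


def add_interval_identifiers(chars: Iterable[PositionedChar]) -> str:
    """Insert SEP wherever consecutive characters are not adjacent in the layout."""
    out: List[str] = []
    prev: Optional[Tuple[int, int]] = None
    for ch, line, col in chars:
        if prev is not None and (line != prev[0] or col != prev[1] + 1):
            out.append(SEP)
        out.append(ch)
        prev = (line, col)
    return "".join(out)


@dataclass
class BookRecord:
    """Linked book name, publisher, and author items sharing one attribute set."""
    book_name: str
    publisher: str
    author: str
    attributes: Dict[str, str]


RECORDS = [  # made-up record for illustration
    BookRecord("Introduction to Widgets", "Example Press", "Jane Doe",
               {"subject": "Engineering"}),
]


def build_index(records: Iterable[BookRecord]) -> Dict[str, BookRecord]:
    """Index every item of every record; the first record wins on duplicates."""
    index: Dict[str, BookRecord] = {}
    for record in records:
        for item in (record.book_name, record.publisher, record.author):
            index.setdefault(item.lower(), record)
    return index


def match_cover_text(cover_text: str,
                     index: Dict[str, BookRecord]) -> Optional[Dict[str, str]]:
    """Match each segment between identifiers; stop at the first hit."""
    for segment in cover_text.split(SEP):
        record = index.get(segment.strip().lower())
        if record is not None:
            return record.attributes
    return None
```

Because all three items of a record point to the same attributes, a hit on the author segment alone is enough to return the book attribute information without examining the title or publisher segments.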
[0068] As shown in Figure 4, in one embodiment, the feature value extraction circuit includes an image projection module circuit, an image preprocessing module circuit, and a feature value extraction module circuit that are connected in sequence, wherein:
[0069] The image projection module circuit is configured to connect to the image processing circuit, and project the book cover image in the horizontal or vertical direction, and divide it into several image blocks;
[0070] The image preprocessing module circuit is configured to: perform preprocessing on the plurality of image blocks and output a plurality of binarized image blocks;
[0071] The feature value extraction module circuit is configured to: sequentially extract the feature values of the several binarized image blocks, and sequentially output the extracted feature values to the feature value matching circuit.
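Projection-based division into image blocks, followed by per-block feature value extraction, can be sketched as below. The sketch assumes an already binarized image (1 for ink, 0 for background), projects onto the horizontal axis by summing each column, and cuts at empty columns. The per-column ink count used as the feature value is a placeholder, since the specification does not define the feature value itself.

```python
from typing import List, Tuple

Image = List[List[int]]


def split_by_projection(binary: Image) -> List[Image]:
    """Cut the image into blocks at columns whose projection (ink count) is zero."""
    if not binary:
        return []
    profile = [sum(column) for column in zip(*binary)]
    blocks: List[Image] = []
    start = None
    # A trailing 0 acts as a sentinel so a block touching the right edge is closed.
    for col, ink in enumerate(profile + [0]):
        if ink and start is None:
            start = col
        elif not ink and start is not None:
            blocks.append([row[start:col] for row in binary])
            start = None
    return blocks


def feature_value(block: Image) -> Tuple[int, ...]:
    """Placeholder feature value: the ink count of each column in the block."""
    return tuple(sum(column) for column in zip(*block))


def extract_features(binary: Image) -> List[Tuple[int, ...]]:
    """Divide the image into blocks, then extract feature values in order."""
    return [feature_value(block) for block in split_by_projection(binary)]
```

Projecting in the other direction amounts to applying the same function to the transposed image, for example `split_by_projection([list(r) for r in zip(*binary)])`.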
[0072] Further, the feature value matching circuit sequentially recognizes the text corresponding to the feature values of the binarized image blocks output by the feature value extraction module circuit, and outputs the recognized text to the text matching circuit as cover text information;
[0073] The text matching circuit is also configured to: sequentially match the text between each two interval identifiers in the cover text information sent by the feature value matching circuit against the book database; when such text is matched to a corresponding item under the book name item, publisher item, or author item of the book database, extract the book attribute information corresponding to the matched item, output it to the information extraction circuit, and send a processing stop signal to the image preprocessing module circuit, so that the image preprocessing module circuit stops processing the subsequent image blocks.
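One way to picture the processing stop signal is a lazily evaluated pipeline in which the upstream stage handles a block only when the matching stage asks for it. In the sketch below a Python generator plays the role of the image preprocessing module circuit and closing the generator stands in for the stop signal. The strings standing in for image blocks, the `log` parameter, and the dictionary database are assumptions for illustration only.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def preprocess_blocks(blocks: Iterable[str], log: List[str]) -> Iterator[str]:
    """Upstream stage: handle one block at a time, only when asked for the next."""
    for block in blocks:
        log.append(block)  # record which blocks were actually processed
        yield block.strip().lower()


def classify_with_stop(blocks: Iterable[str],
                       database: Dict[str, Dict[str, str]],
                       log: List[str]) -> Optional[Dict[str, str]]:
    """Match block by block; on the first hit, stop the upstream stage."""
    stream = preprocess_blocks(blocks, log)
    for text in stream:
        attributes = database.get(text)
        if attributes is not None:
            stream.close()  # stands in for the processing stop signal
            return attributes
    return None
```

With four blocks and a match on the second, `log` ends up holding only the first two blocks, which is the saving the stop signal is meant to provide.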
[0074] Preferably, the book database is an authorized book publisher database or an authorized book agent database.
[0075] The present invention is not limited to the foregoing specific embodiments. The present invention extends to any new feature or any new combination disclosed in this specification, and any new method or process step or any new combination disclosed.