Chinese text recognition method and device
A Chinese text recognition technology, applied in special data processing applications, instruments, electrical digital data processing, etc., that addresses the low recognition rate of special words and achieves higher recognition efficiency and more accurate recognition results.
Active Publication Date: 2018-09-14
CHINA MOBILE GRP GUANGDONG CO LTD +1
Cites: 4; Cited by: 8
AI Technical Summary
Problems solved by technology
[0005] The present invention provides a Chinese text recognition method and device that overcome a defect of existing new-word recognition methods: they recognize all candidate vocabulary in one unified way, so the recognition rate for special vocabulary is low.
Examples
Example 1
[0147] Original search record: Is it better to use a red rope or a black rope for an obsidian pendant?
[0148] Jieba Chinese word segmentation results [8]: obsidian, for pendants, red rope, good, or, black rope
[0149] Jieba word segmentation results based on dynamic thesaurus update: obsidian, pendant, red rope, good, or, black rope
Example 2
[0151] Original search record: Liu Tao Domineering Bidong Yang Zi
[0152] Jieba Chinese word segmentation results: Liu Tao, Domineering, Bidong, Yang, Zi
[0153] Jieba word segmentation results based on dynamic thesaurus update: Liu Tao, Domineering, Bidong, Yang Zi
Example 3
[0155] Original search record: Yuecheng Hospital, Lancheng District
[0156] Jieba Chinese word segmentation results: blue, city, month, city, hospital
[0157] Jieba word segmentation results based on dynamic thesaurus update: Lancheng District, Yuecheng, Hospital
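The effect shown in the three examples, where adding newly recognized words to the word library changes the segmentation, can be illustrated with a small sketch. This is not Jieba and not the patented method: it is a plain forward-maximum-matching segmenter written for illustration, and the Chinese query, the lexicon entries, and the names `segment` and `max_len` are all assumptions (the page gives only translated text).

```python
def segment(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest lexicon word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical lexicon and query, assumed for illustration only.
lexicon = {"医院"}
query = "蓝城区月城医院"

before = segment(query, lexicon)       # place names fall apart into characters
lexicon.update({"蓝城区", "月城"})      # simulate a dynamic word-library update
after = segment(query, lexicon)        # place names now survive as whole words

print(before)
print(after)
```

With the small lexicon the unknown place names are cut into single characters; once they are added, the same query is segmented into whole words.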
[0158] In a second aspect, an embodiment of the present invention provides a Chinese text recognition device which, as shown in Figure 7, includes:
[0159] A keyword acquisition unit 201, configured to obtain the keywords reported by each terminal application program during application program searches, and to store each keyword in the search corpus of the corresponding category according to the keyword's category attribute;
[0160] A character string segmentation unit 202, configured to apply the preset algorithm corresponding to each search corpus to segment the stored keywords repeatedly until single character strings that cannot be segmented further are obtained;
[0161] The preliminary recognition un...
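A minimal sketch of the first two units, under stated assumptions, may help make the data flow concrete. The category names, the `MAX_CANDIDATE_LEN` table, and the functions `store_keywords` and `candidate_strings` are hypothetical. Reading "multiple segmentation" as cutting each keyword into every substring from a per-category maximum length down to single characters is one possible interpretation, not something the text confirms.

```python
from collections import defaultdict

# Hypothetical per-category setting: longest candidate string to extract.
MAX_CANDIDATE_LEN = {"video": 4, "shopping": 3}


def store_keywords(reports):
    """Keyword acquisition: file each (category, keyword) report under its category."""
    corpora = defaultdict(list)
    for category, keyword in reports:
        corpora[category].append(keyword)
    return dict(corpora)


def candidate_strings(keyword, max_len):
    """Cut a keyword into every substring of length max_len down to 1.

    The length-1 substrings are the single characters that cannot be split further.
    """
    return [
        keyword[i:i + n]
        for n in range(min(max_len, len(keyword)), 0, -1)
        for i in range(len(keyword) - n + 1)
    ]


# Made-up reports standing in for keywords reported by terminal applications.
reports = [("video", "abcd"), ("shopping", "xyz"), ("video", "ab")]
corpora = store_keywords(reports)
candidates = {
    category: [candidate_strings(kw, MAX_CANDIDATE_LEN[category]) for kw in keywords]
    for category, keywords in corpora.items()
}
print(corpora)
print(candidates["shopping"])
```

Keeping one corpus per category is what lets a later stage pick a different algorithm or threshold for each kind of application.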
Abstract
The invention relates to a Chinese text recognition method and device. The method includes: acquiring keywords reported by the application programs of terminals in PS domain signaling; classifying the keywords according to the type of the application programs; performing segmentation, preliminary recognition and probability screening on the keywords stored in the different search corpora on the basis of different preset algorithms; and adding the results obtained by screening to a preset word library. Compared with existing recognition methods, the method can process different vocabularies specifically according to the type of application program that reported the keywords, so it is more targeted, obtains more accurate recognition results, and improves recognition efficiency.
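The screening step can be pictured with a short sketch. The abstract does not name the preset algorithms, so the technique below is a swapped-in assumption: a common new-word discovery heuristic that keeps two-character strings which are frequent and have high pointwise mutual information (PMI). The function `screen_candidates`, the thresholds `min_count` and `min_pmi`, and the sample keywords are all hypothetical.

```python
import math
from collections import Counter


def screen_candidates(keywords, lexicon, min_count=2, min_pmi=1.0):
    """Return two-character strings that are frequent, cohesive and not yet in the lexicon."""
    chars = Counter(ch for kw in keywords for ch in kw)
    pairs = Counter(kw[i:i + 2] for kw in keywords for i in range(len(kw) - 1))
    total_chars = sum(chars.values())
    total_pairs = sum(pairs.values())

    accepted = []
    for pair, count in pairs.items():
        if count < min_count or pair in lexicon:
            continue
        # PMI compares the observed pair probability with what independence predicts.
        p_pair = count / total_pairs
        p_indep = (chars[pair[0]] / total_chars) * (chars[pair[1]] / total_chars)
        if math.log2(p_pair / p_indep) >= min_pmi:
            accepted.append(pair)
    return sorted(accepted)


# Hypothetical keywords from one search corpus, and a small preset word library.
keywords = ["壁咚视频", "壁咚", "霸气壁咚"]
lexicon = {"视频"}

new_words = screen_candidates(keywords, lexicon)
lexicon.update(new_words)  # add the screened results to the word library
print(new_words)
```

In a per-category setup, each search corpus could carry its own `min_count` and `min_pmi`, which is one way to realize "different preset algorithms" for different application types.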
Description
Technical field
[0001] The embodiments of the present invention relate to the field of software technology, and in particular to a Chinese text recognition method and device.
Background technique
[0002] With the advent of the Internet age, people rely more and more on search engines for information retrieval. However, traditional mechanical word segmentation methods perform poorly on ever-changing network words and emerging phrases. Chinese word segmentation is the basis of search engines and Chinese natural language processing, and the recognition of unregistered words is a major bottleneck in Chinese word segmentation. Here, unregistered words are words that have not been included in the word segmentation system.
[0003] For identifying unregistered new words, the more commonly used methods obtain web page content, search logs or query logs, and identify new words using rules or statistics based on the content o...