
Method and system for identifying and verifying inflected words

A verification method and inflected-word technology, applied in text database querying, instrumentation, unstructured text data retrieval, etc., that addresses the poor automatic-update performance of existing algorithms, their tendency to misjudge, and the lack of an extensible inflected-word thesaurus and concept library.

Active Publication Date: 2020-10-30
DATAGRAND TECH INC
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0005] The disadvantage of existing inflected-word recognition technology is that its systems and methods rely on a fixed inflected-word thesaurus and concept library. The number and quality of inflected words and training samples are therefore quite limited, which makes misjudgment likely. The algorithm also updates poorly on its own and cannot expand the inflected-word thesaurus or concept library.



Examples


Detailed Description of the Embodiments

[0070] In order to enable those skilled in the art to better understand the technical solutions of the present invention, the present invention will be further described in detail with reference to the accompanying drawings.

[0071] As shown in Figures 1 and 11, a method for identifying and verifying deformed (inflected) words provided by an embodiment of the present invention includes the following steps:

[0072] S101. Obtain a set of sensitive words and training samples;

[0073] Sensitive words are words in a text that violate laws, regulations, or moral standards. The set of sensitive words is stored in the sensitive word database, and the number of sensitive words in it keeps growing as the database is updated. A training sample is a collection of multiple texts containing deformed words; all of the deformed words are stored in the deformed character library. The deformed words in the training sample are determined, so tha...
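The inputs gathered in step S101 can be pictured with a minimal Python sketch. The class and function names (`AnnotatedText`, `SensitiveWordDatabase`, `build_deformed_library`) and the example strings are assumptions made for illustration; the excerpt does not say how the databases are actually stored.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class AnnotatedText:
    """One text of a training sample, with its deformed words already determined."""
    text: str
    deformed_words: List[str]


@dataclass
class SensitiveWordDatabase:
    """Hypothetical sensitive-word store that only grows as updates arrive."""
    words: Set[str] = field(default_factory=set)

    def update(self, new_words: Iterable[str]) -> int:
        # Return how many words were genuinely new, so growth can be tracked.
        before = len(self.words)
        self.words.update(new_words)
        return len(self.words) - before


def build_deformed_library(training_sample: List[AnnotatedText]) -> Set[str]:
    """Collect every annotated deformed word into one library (a plain set here)."""
    library: Set[str] = set()
    for item in training_sample:
        library.update(item.deformed_words)
    return library


# Illustrative data only.
db = SensitiveWordDatabase({"赌博"})
added = db.update(["赌博", "发票"])  # only "发票" is new
training_sample = [
    AnnotatedText("来玩du博吧", ["du博"]),
    AnnotatedText("代开fa票", ["fa票"]),
]
library = build_deformed_library(training_sample)
```

Sets are used so that re-adding a known word is harmless, which matches the description of a database that accumulates entries over successive updates.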



Abstract

The invention discloses a method and system for identifying and verifying deformed words. The method has the following beneficial effects: the deformed-word library can be expanded through phonetic and glyph-based expansion, which increases the number of deformed words in the library, improves their quality, and reduces the probability of misjudgment; training samples are used to train context probabilities, which further reduces the probability of misjudgment during semantic verification of deformed words and improves accuracy; the training samples are updated based on the verification results, which improves the algorithm's automatic-update performance and expands the concept library used for semantic verification; and as verification results accumulate, the probability of misjudgment continues to fall. The system comprises an acquisition unit, a deformation training unit, a recognition unit, and a semantic verification unit, and has the same beneficial effects as the method.
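The three ideas in the abstract (expanding the library by sound and shape, training a context probability, and feeding verification results back into the training samples) can be sketched as follows. This is an illustration under stated assumptions, not the patented algorithm: the substitution tables, the "preceding character" context model with add-one smoothing, the 0.3 threshold, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical substitution tables; a real system would derive them from
# pronunciation and glyph-similarity resources.
HOMOPHONES = {"赌": ["堵", "du"], "博": ["搏", "bo"]}
SIMILAR_GLYPHS = {"赌": ["睹"], "博": ["傅"]}


def expand(word):
    """Return variants of `word` built by phonetic or glyph substitution."""
    options = [
        [ch] + HOMOPHONES.get(ch, []) + SIMILAR_GLYPHS.get(ch, [])
        for ch in word
    ]
    variants = {"".join(combo) for combo in product(*options)}
    variants.discard(word)  # the unmodified word is not a deformed word
    return variants


def train_context(samples):
    """Count which character immediately precedes each known deformed word."""
    counts = Counter()
    for text, variant in samples:
        idx = text.find(variant)
        if idx > 0:
            counts[text[idx - 1]] += 1
    return counts


def context_probability(counts, text, variant):
    """Add-one-smoothed estimate of P(preceding character | deformed word)."""
    idx = text.find(variant)
    if idx <= 0:
        return 0.0
    total = sum(counts.values())
    return (counts[text[idx - 1]] + 1) / (total + len(counts) + 1)


def verify(samples, text, variant, threshold=0.3):
    """Accept `variant` if its context is likely, then feed the result back."""
    counts = train_context(samples)
    accepted = context_probability(counts, text, variant) >= threshold
    if accepted:
        samples.append((text, variant))  # training samples grow with each result
    return accepted


library = expand("赌博")
samples = [("去堵博", "堵博"), ("去du博", "du博")]
accepted = verify(samples, "想去睹博", "睹博")
```

Because accepted results are appended to `samples`, later calls to `verify` train on a larger set, which mirrors the abstract's claim that misjudgment falls as verification results accumulate. A single preceding character is a deliberately crude stand-in for whatever context model the patent actually uses.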

Description

Technical field

[0001] The invention relates to the field of machine recognition of deformed words, and in particular to a method and system for identifying and verifying deformed words.

Background technique

[0002] Deformed sensitive words are common on platforms such as post bars, forums, and news media. Human readers spot them naturally, because they are the "abnormal" parts of a sentence: that sense of abnormality draws attention to the area, and the complete deformed word is then gradually identified. A machine facing these deformed words directly (including intermixed special symbols, homophone substitution, visually similar character substitution, simplified/traditional conversion, radical splitting, etc.) fares poorly. Recognizing deformed words is an important problem in Chinese spam content filtering.

[0003] At present, in the Chinese patent application with appl...
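Three of the deformation types named in the background (intermixed special symbols, simplified/traditional conversion, and radical splitting) can be illustrated with a small normalization pass in Python. The lookup tables hold one entry each and, like the function names, are assumptions for illustration; the excerpt does not describe how the invention handles these cases.

```python
import re

# Hypothetical, deliberately tiny lookup tables.
TRADITIONAL_TO_SIMPLIFIED = {"賭": "赌"}
SPLIT_RADICALS = {"贝者": "赌"}  # 赌 written as its two components

# Anything that is not a letter, digit, or CJK character, plus underscores.
SYMBOLS = re.compile(r"[\W_]+")


def normalize(text):
    """Undo three simple deformations before matching against sensitive words."""
    text = SYMBOLS.sub("", text)  # drop intermixed symbols and spaces
    for parts, whole in SPLIT_RADICALS.items():
        text = text.replace(parts, whole)  # re-join split radicals
    # Map traditional characters to their simplified forms.
    return "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)


def find_sensitive(text, sensitive_words):
    """Return the sensitive words that appear in the normalized text."""
    cleaned = normalize(text)
    return [word for word in sensitive_words if word in cleaned]


hits = find_sensitive("来 贝者.博 吧", ["赌博"])
```

Rule-based normalization like this only covers deformations already present in the tables, which is the limitation of a fixed library that the invention sets out to address.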


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/284, G06F40/30, G06F16/33
CPC: G06F16/3344, G06F40/284, G06F40/30
Inventor: 张健, 江永青, 纪传俊, 陈运文, 高翔
Owner: DATAGRAND TECH INC