
Industry spelling mistake checking method based on user feedback

A spelling error checking method, applied in the field of English spell checking, that addresses problems such as the inability to use multiple corpora effectively, inaccurate calculation results, and low performance. It achieves efficient spelling error checking with fast checking speed and high use efficiency.

Active Publication Date: 2014-06-25
SOUTHEAST UNIV +1

AI Technical Summary

Problems solved by technology

If the corpus is built on grammatical rules, or only counts word frequencies, queries tend to suffer either from poor performance caused by rule explosion or from inaccurate results caused by insufficient statistical data.
To address the problem that multiple corpora cannot be used effectively, the method combines user dictionaries, industry corpora and core corpora through weighted calculation.
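
The weighted combination of the three sources can be illustrated with a minimal sketch. The weights, the use of relative frequency as the per-source score, and the names `WEIGHTS`, `relative_freq` and `combined_score` are assumptions made for illustration; the summary does not state how the patent actually computes or tunes the weights.

```python
from collections import Counter

# Hypothetical weights: the source does not give the actual values.
WEIGHTS = {"user": 0.5, "industry": 0.3, "core": 0.2}


def relative_freq(counts: Counter, word: str) -> float:
    """Share of all tokens in one source that are `word`."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


def combined_score(word: str, sources: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the word's relative frequency in each source."""
    return sum(
        weights[name] * relative_freq(counts, word)
        for name, counts in sources.items()
    )


# Toy counts, invented for the example.
sources = {
    "user": Counter({"stent": 3, "stint": 1}),
    "industry": Counter({"stent": 40, "stint": 10}),
    "core": Counter({"stent": 5, "stint": 95}),
}

# Rank two candidate corrections by their combined score.
ranked = sorted(
    ["stent", "stint"],
    key=lambda w: combined_score(w, sources),
    reverse=True,
)
print(ranked)
```

In this toy data the general-purpose core corpus prefers "stint", but the user dictionary and industry corpus outweigh it, so "stent" ranks first.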



Embodiment Construction

[0044] The present invention will be further described in detail below in conjunction with the accompanying drawings and specific examples.

[0045] The industry spelling error checking method based on user feedback of the present invention mainly addresses two problems in current spelling error checking: the lack of association with the user and the need for rapid search over a large corpus. It involves related technologies such as natural language processing, user dictionary design, and database search. The method uses the user dictionary designed by classification and the N-gram method to check spelling errors in English text, and recommends correct words by searching a large corpus database, thereby achieving spelling error checking that is associated with the user. The N-gram model (Figure 1), a basic method of natural language processing, checks errors in the text using word or sentence features and statistical information in the corpus; the user dictionary designed by classif...
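
As a rough illustration of N-gram based checking, the sketch below uses bigrams (N = 2) over a toy corpus and flags a word when it is unknown or never follows the preceding word in the corpus. The zero-count rule, the tiny corpus, and the function names `train_bigrams` and `suspicious_words` are assumptions for illustration and are not taken from the patent text.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams, with "<s>" marking the sentence start."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def suspicious_words(sentence, unigrams, bigrams):
    """Flag words that are unseen, or unseen after the previous word."""
    tokens = ["<s>"] + sentence.lower().split()
    flagged = []
    for prev, word in zip(tokens, tokens[1:]):
        if unigrams[word] == 0 or bigrams[(prev, word)] == 0:
            flagged.append(word)
    return flagged


# Toy corpus, invented for the example.
corpus = [
    "the patient received a stent",
    "the doctor placed a stent",
]
unigrams, bigrams = train_bigrams(corpus)

print(suspicious_words("the patient received a stnet", unigrams, bigrams))
```

A real checker would smooth the counts rather than treat every unseen bigram as an error, since a small corpus leaves most valid word pairs unobserved.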


Abstract

The invention discloses an industry spelling mistake checking method based on user feedback. The method checks spelling mistakes in English text using an N-gram method and a user dictionary designed by classification, and recommends correct words by searching a large corpus database, thereby achieving spelling mistake checking that is associated with the user. The N-gram method, a basic method of natural language processing, checks mistakes in the text according to the features of words or sentences and statistical information in a corpus. Based on the user's current historical information, the user dictionary designed by classification works together with the statistical data of the corpus to select the recommended words most related to the wrong words in the text input by the user. The Viterbi algorithm is used to search the database for the word chain with the largest product of conditional probabilities, which improves the computational efficiency of the hidden Markov model on the large corpus and the use efficiency of the statistical information in the database.
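
The Viterbi search for the word chain with the largest conditional probability product can be sketched as follows. This is a simplified illustration, not the patented procedure: it assumes each text position has a short list of candidate words, that bigram conditional probabilities P(word | previous word) are available in a plain dictionary rather than a database, and that unseen pairs get a small floor probability. The name `best_word_chain` and all probability values are hypothetical.

```python
import math


def best_word_chain(candidates, cond_prob, start="<s>", floor=1e-6):
    """Return the chain with the largest product of P(word | previous word).

    candidates: list of non-empty candidate-word lists, one per position.
    cond_prob:  dict mapping (prev, word) to P(word | prev).
    Products are accumulated as sums of logs to avoid underflow.
    """
    # best[w] = (log-probability of the best chain ending in w, that chain)
    best = {start: (0.0, [])}
    for options in candidates:
        nxt = {}
        for word in options:
            score, chain = max(
                (
                    (lp + math.log(cond_prob.get((prev, word), floor)), path)
                    for prev, (lp, path) in best.items()
                ),
                key=lambda item: item[0],
            )
            nxt[word] = (score, chain + [word])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Toy probabilities, invented for the example.
cond_prob = {
    ("<s>", "the"): 0.5,
    ("the", "stint"): 0.3,
    ("the", "stent"): 0.2,
    ("stint", "was"): 0.05,
    ("stent", "was"): 0.3,
    ("was", "placed"): 0.4,
}
candidates = [["the"], ["stent", "stint"], ["was"], ["placed"]]

print(best_word_chain(candidates, cond_prob))
```

Here "stint" is locally more likely after "the", but the chain through "stent" has the larger overall product (0.5 × 0.2 × 0.3 × 0.4 versus 0.5 × 0.3 × 0.05 × 0.4), so the dynamic program selects it. Keeping only the best chain per candidate at each position is what avoids enumerating every possible combination.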

Description

Technical field

[0001] The invention relates to an English spelling error checking method that uses a corpus containing a large amount of language information, a natural language statistical model, a hidden Markov model and other related technologies. It belongs to natural language processing, in particular the field of English spelling checking.

Background technique

[0002] First, the abbreviations used in the present invention are defined:

[0003] NLP (Natural Language Processing): natural language processing;

[0004] BNC (British National Corpus): British National Corpus;

[0005] LDC (Linguistic Data Consortium): Linguistic Data Consortium;

[0006] LD (Levenshtein Distance): edit distance;

[0007] N-gram: N-gram.

[0008] The spelling checker is an important branch and basic link of NLP. It processes natural language into error-free and understandable text, and naturally supports advanced NLP technologies such as machine translation, spe...
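
The abbreviation list defines LD as the Levenshtein (edit) distance. The excerpt does not show how the method applies it, so the sketch below only illustrates the standard definition and one common use, filtering a vocabulary for words close to a misspelling. The helper `suggest` and its `max_distance` parameter are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b` (row-by-row dynamic program)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def suggest(word, vocabulary, max_distance=1):
    """Vocabulary words within `max_distance` edits, closest first."""
    scored = [(levenshtein(word, v), v) for v in vocabulary]
    return [v for d, v in sorted(scored) if d <= max_distance]


print(levenshtein("kitten", "sitting"))
print(suggest("speling", ["spelling", "spell", "spoiling"]))
```

Plain Levenshtein distance counts a transposition of adjacent letters as two edits, so "stnet" is at distance 2 from "stent".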


Application Information

IPC(8): G06F17/27, G06F17/30
Inventors: 杨明 (Yang Ming), 罗军舟 (Luo Junzhou), 倪俊辉 (Ni Junhui), 马成平 (Ma Chengping), 任新才 (Ren Xincai)
Owner: SOUTHEAST UNIV