Method and device for word form restoration

A word form restoration (lemmatization) technique in the field of natural language processing. It addresses the over-reduction problem of existing word form restoration algorithms and reduces the likelihood of ambiguity or meaning shift after restoration.

Active Publication Date: 2016-03-30
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0008] To solve the above technical problems, embodiments of the present invention provide a method and device for word form restoration that address the over-reduction problem in existing word form restoration algorithms.



Detailed Description of Embodiments

[0053] To enable those skilled in the art to better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention shall fall within the protection scope of the present invention.

[0054] First, a word form restoration method provided by an embodiment of the present invention is described. As shown in Figure 1, the method may include the following steps:

[0055] Perform root restoration (stemming) on the entry to be restored, obtain the output of each step of the root restoration algorithm, and add these outputs to the restoration candidate set;...


Abstract

The invention discloses a lemmatization method and device. The lemmatization method comprises the following steps: stemming the vocabulary entry to be lemmatized, acquiring the output of each step of the stemming algorithm, and adding these outputs to a lemmatization candidate set; for each lemmatization candidate, calculating the lemmatization probability of that candidate relative to the vocabulary entry to be lemmatized; and determining the lemmatization result of the vocabulary entry according to the lemmatization probabilities. In the scheme provided by the embodiments of the invention, the output of each stemming step serves as a candidate lemmatization result. Because stemming proceeds step by step, the candidate set contains the various simplified forms of the vocabulary entry. The final result is then determined by calculating the lemmatization probability of each candidate relative to the vocabulary entry, which reduces the probability of ambiguity or meaning shift after lemmatization.
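The three steps in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the three suffix-stripping steps are a toy stand-in for a real multi-step stemmer, `BASE_LEXICON` is a hypothetical word list, and `toy_probability` is an invented scoring rule, since this excerpt does not say how the lemmatization probability is computed.

```python
from typing import Callable, List

# Hypothetical, heavily simplified stemming steps (not the patent's algorithm).
def strip_plural(word: str) -> str:
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word

def strip_ing_ed(word: str) -> str:
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def strip_derivational(word: str) -> str:
    for suffix in ("ization", "ation", "ness"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

STEPS: List[Callable[[str], str]] = [strip_plural, strip_ing_ed, strip_derivational]

# Hypothetical lexicon of known base forms, used only by the toy score below.
BASE_LEXICON = {"organization", "organ", "walk", "pony"}

def collect_candidates(entry: str) -> List[str]:
    """Run the steps in order and keep the output of every step (deduplicated)."""
    candidates: List[str] = []
    current = entry
    for step in STEPS:
        current = step(current)
        if current not in candidates:
            candidates.append(current)
    return candidates

def toy_probability(candidate: str, entry: str) -> float:
    """Assumed score: known base forms win, and fewer removed characters is better."""
    if candidate not in BASE_LEXICON:
        return 0.0
    removed = max(0, len(entry) - len(candidate))
    return 1.0 / (1.0 + removed)

def lemmatize(entry: str) -> str:
    """Pick the candidate with the highest score relative to the entry."""
    candidates = collect_candidates(entry)
    return max(candidates, key=lambda c: toy_probability(c, entry))

print(collect_candidates("organizations"))
print(lemmatize("organizations"))
```

With these toy rules, running the stemmer to completion would reduce "organizations" to "organ", which is the kind of over-reduction the patent targets. Keeping the intermediate output "organization" as a candidate lets the scoring step choose it instead.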

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a method and device for restoring word form.

Background

[0002] According to the relationship between word structure and constituent morphemes, the world's languages are generally divided into four types: isolating languages, agglutinative languages, inflectional languages, and polysynthetic languages. Among them, inflectional languages are characterized by rich inflections that express the grammatical relationships between words. Common inflectional languages include English, French, and Russian.

[0003] English, as an inflectional language, has a series of complex word-form transformations, including singular and plural, tense, comparative, possessive, etc. Therefore, morphological analysis of English is often the basis for various English processing tasks (such as recognition of common phrases, recognition of noun phrases, and analysis of n...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/27
Inventor: 何径舟, 王晓露
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD