Method and device for modeling and named entity recognition based on a maximum entropy model

A maximum entropy model and named entity recognition technology, applied in fields such as instrumentation, computing, and electrical digital data processing, which addresses problems such as information loss and a degraded recognition effect.

Active Publication Date: 2008-10-29
NEW FOUNDER HLDG DEV LLC +2
2 Cites, 27 Cited by

AI Technical Summary

Problems solved by technology

[0005] In the existing technology, named entity recognition is performed on top of word segmentation, so word segmentation errors and the information loss they cause degrade the recognition effect.



Embodiment Construction

[0031] In the technical solution of this specific embodiment of the present invention, a maximum entropy model is adopted and a variety of linguistic information is fully utilized to label characters directly with roles, yielding the role label sequence with the highest probability. Simple pattern matching on the label names then efficiently identifies named entities such as the names of people, places, and organizations.
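As a rough illustration of the final pattern-matching step, the sketch below scans a role label sequence for "first, zero or more middle, last" runs and returns the matching character spans. The label names (`PER-B`, `LOC-E`, `O`, and so on), the one-letter codes, and the function `extract_entities` are assumptions made for this example; the excerpt does not give the actual label inventory or patterns.

```python
import re
from typing import List, Tuple

# Hypothetical role labels mapped to one-letter codes, so that a label
# sequence becomes a string a regular expression can scan.
CODES = {
    "PER-B": "a", "PER-M": "b", "PER-E": "c",   # person name: first/middle/last
    "LOC-B": "d", "LOC-M": "e", "LOC-E": "f",   # place name: first/middle/last
    "O": "o",                                   # character outside any entity
}

# One pattern per entity type: first character, any middle characters, last.
PATTERNS = {
    "PER": re.compile(r"ab*c"),
    "LOC": re.compile(r"de*f"),
}

def extract_entities(text: str, roles: List[str]) -> List[Tuple[str, str]]:
    """Return (surface string, entity type) pairs in order of appearance."""
    encoded = "".join(CODES[r] for r in roles)  # one code per character
    found = []
    for etype, pattern in PATTERNS.items():
        for m in pattern.finditer(encoded):
            found.append((m.start(), text[m.start():m.end()], etype))
    found.sort()                                # order by start position
    return [(surface, etype) for _, surface, etype in found]

sentence = "王小明在北京工作"
roles = ["PER-B", "PER-M", "PER-E", "O", "LOC-B", "LOC-E", "O", "O"]
entities = extract_entities(sentence, roles)
print(entities)
```

Because each character carries exactly one label, positions in the encoded string line up with positions in the sentence, which is what lets a plain regular expression stand in for the pattern matcher here.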

[0032] Each character in a sentence is taken to implicitly carry role information (the role is an attribute of the character itself). In the present invention, a character role is the role played by a single character within a named entity or sentence, and role labeling means labeling the roles of the individual characters in a sentence. A role can be, for example, the first character of a place name (or person's name), the last character of a place name (or person's name), or a middle character of a place name (or person's name). For example, in the recognition of person names a...
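To make the idea of per-character roles concrete, here is a minimal sketch that converts entity annotations on a sentence into one role label per character. The label names (`-B` for first, `-M` for middle, `-E` for last, `-S` for a single-character entity, `O` for everything else), the span format, and the function `label_roles` are hypothetical; the patent's own role set is not shown in this excerpt.

```python
from typing import List, Tuple

def label_roles(text: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Assign one role label to every character of `text`.

    `entities` holds (start, end, type) spans with `end` exclusive,
    for example (0, 3, "PER") for a three-character person name.
    """
    roles = ["O"] * len(text)              # default: outside any entity
    for start, end, etype in entities:
        if end - start == 1:
            roles[start] = f"{etype}-S"    # single-character entity
            continue
        roles[start] = f"{etype}-B"        # first character of the name
        for i in range(start + 1, end - 1):
            roles[i] = f"{etype}-M"        # middle characters
        roles[end - 1] = f"{etype}-E"      # last character of the name
    return roles

sentence = "王小明在北京工作"
spans = [(0, 3, "PER"), (4, 6, "LOC")]      # 王小明 = person, 北京 = place
roles = label_roles(sentence, spans)
for ch, role in zip(sentence, roles):
    print(ch, role)
```

Labels are assigned directly to characters, so no word segmentation step is involved at any point.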


Abstract

The invention discloses a method for modeling and named entity recognition based on a maximum entropy model. The method comprises: inputting training texts annotated with named entities; labeling the roles of the characters in the training texts to obtain character role labels for the training texts; establishing feature items for the characters according to the character role labels; and inputting the feature items into a maximum entropy modeling tool to obtain a data model based on character role labeling. The method requires no word segmentation, and thus solves the problem that, during named entity recognition, word segmentation errors and the information loss they cause degrade the recognition effect.
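The training steps in the abstract can be sketched end to end on a toy example: build feature items for each character, then fit a conditional maximum entropy model to the role labels. Everything specific here is an assumption: the feature templates (current, previous, and next character), the label names, and the small gradient-ascent trainer, which stands in for the unnamed "maximum entropy modeling tool". Decoding below is a per-character argmax, which is simpler than searching for the highest-probability label sequence.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

def char_features(text: str, i: int) -> List[str]:
    """Feature items for character i (hypothetical window templates)."""
    prev_ch = text[i - 1] if i > 0 else "<s>"
    next_ch = text[i + 1] if i + 1 < len(text) else "</s>"
    return [f"cur={text[i]}", f"prev={prev_ch}", f"next={next_ch}"]

def predict_proba(weights: Dict[Tuple[str, str], float],
                  feats: List[str], labels: List[str]) -> Dict[str, float]:
    """p(label | features) as a softmax over summed feature weights."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())                  # subtract max for stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

def train_maxent(samples: List[Tuple[List[str], str]], labels: List[str],
                 epochs: int = 200, lr: float = 0.5):
    """Gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)                # keyed by (feature, label)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in samples:
            probs = predict_proba(weights, feats, labels)
            for f in feats:
                grad[(f, gold)] += 1.0          # empirical feature count
                for y in labels:
                    grad[(f, y)] -= probs[y]    # expected feature count
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights

# Toy annotated sentence with hypothetical role labels, one per character.
text = "王小明在北京工作"
roles = ["PER-B", "PER-M", "PER-E", "O", "LOC-B", "LOC-E", "O", "O"]

samples = [(char_features(text, i), role) for i, role in enumerate(roles)]
labels = sorted(set(roles))
weights = train_maxent(samples, labels)

# Greedy decoding on the training sentence itself, as a sanity check only.
predicted = []
for feats, _ in samples:
    probs = predict_proba(weights, feats, labels)
    predicted.append(max(labels, key=lambda y: probs[y]))
print(predicted)
```

The gradient is the usual maximum entropy one: the observed count of each (feature, label) pair minus its expected count under the current model. A production system would more likely use an established toolkit with regularization and a sequence-level search.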

Description

Technical field

[0001] The invention belongs to the field of natural language processing, and in particular relates to a method and device for modeling and named entity recognition based on a maximum entropy model.

Background technique

[0002] A named entity (NE) is a uniquely determined minimum information unit with a specific meaning, namely a proper name or quantity phrase. There are mainly 7 types of named entities: person name, organization name, place name, date, time, currency value, and percentage. The task of named entity recognition is mainly to identify the named entities in a text and classify them. Named entity recognition was originally proposed as a subtask at MUC-6 (the Message Understanding Conference). Judging from the overall research results on named entity recognition, recognizing dates, times, currency values, and percentages is relatively simple. The design of its rules and statistical training of data are also r...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 王学武, 彭学政, 杨建武, 肖建国
Owner: NEW FOUNDER HLDG DEV LLC