Method and apparatus for identifying amended entity words

A technique for correcting entity word recognition, applied in fields such as special data processing applications, instruments, and electrical digital data processing, that addresses the problem of low accuracy in identifying entity words.

Inactive Publication Date: 2015-11-18
INSPUR GROUP CO LTD
4 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0002] Many enterprises now use big data to obtain valuable information assets. In current big data applications, a basic training corpus is mainly used to identify proper names in text, such as names of people and places, as well as meaningful times and dates. In practice, however, proper names differ across industries and businesses, which results in low accuracy in identifying entity words.

Examples

Embodiments

[0050] In one embodiment of the present invention, to further improve both the accuracy of entity word identification and the accuracy of word segmentation, the method further includes, after step 103 and before step 104: re-segmenting the entity words that were segmented incorrectly. Step 105 is then implemented as follows: perform word segmentation according to each re-segmented entity word and the updated entity word categories.
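The re-segmentation idea in [0050] can be illustrated with a minimal sketch. The text does not name the segmentation algorithm, so forward maximum matching against a word-to-category lexicon is assumed here purely for illustration; `segment`, `resegment`, the lexicon contents, and the sample sentence are all hypothetical.

```python
from typing import Dict, List


def segment(text: str, lexicon: Dict[str, str], max_len: int = 6) -> List[str]:
    """Forward maximum matching (assumed segmenter, not taken from the patent):
    at each position, take the longest lexicon word; fall back to one character."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def resegment(text: str, lexicon: Dict[str, str], corrected_word: str, category: str) -> List[str]:
    """Record a user-corrected entity word with its category, then segment again."""
    lexicon[corrected_word] = category
    return segment(text, lexicon)


# Hypothetical lexicon: 浪潮 (Inspur) -> ORG, 位于 (is located in) -> O, 济南 (Jinan) -> LOC.
lexicon = {"浪潮": "ORG", "位于": "O", "济南": "LOC"}
text = "浪潮集团位于济南"  # "Inspur Group is located in Jinan"

first_pass = segment(text, lexicon)
# The user marks 浪潮集团 (Inspur Group) as a single ORG entity word.
second_pass = resegment(text, lexicon, "浪潮集团", "ORG")
print(first_pass)
print(second_pass)
```

In this sketch the first pass splits the organization name because the lexicon only knows its prefix; once the corrected word is recorded, the same segmenter keeps it whole. A production segmenter would likely be statistical rather than dictionary-based, but the correction loop would look similar.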

[0051] In an embodiment of the present invention, to keep the labeling simple and obvious, and easy for non-professionals to handle, the steps are implemented as follows. Step 101: configure a corresponding display color for each category of entity words. Step 102: assign a display color to each entity word in the segmented text. Step 103: display each entity word in its assigned display color. The specific implementation of step...
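As a rough illustration of the color-based labeling in steps 101 to 103, the sketch below maps categories to colors and renders segmented tokens as HTML. The color table, the category names, the HTML output format, and the function names are assumptions; the text only says that entity words are displayed in the color assigned to their category.

```python
import html
from typing import Dict, List, Optional, Tuple

# Step 101 (hypothetical values): one display color per entity category.
CATEGORY_COLORS: Dict[str, str] = {"PER": "red", "LOC": "green", "ORG": "blue"}


def assign_colors(tokens: List[str], categories: Dict[str, str]) -> List[Tuple[str, Optional[str]]]:
    """Step 102: give each entity word the color of its category (None for other tokens)."""
    colored = []
    for tok in tokens:
        category = categories.get(tok)
        colored.append((tok, CATEGORY_COLORS.get(category) if category else None))
    return colored


def render_html(colored: List[Tuple[str, Optional[str]]]) -> str:
    """Step 103: display each entity word in its assigned color."""
    parts = []
    for tok, color in colored:
        text = html.escape(tok)
        parts.append(f'<span style="color:{color}">{text}</span>' if color else text)
    return " ".join(parts)


categories = {"Inspur": "ORG", "Jinan": "LOC"}
colored = assign_colors(["Inspur", "is", "in", "Jinan"], categories)
print(render_html(colored))
```

Using color as the label means a reviewer can spot a misclassified word at a glance without reading tag names, which matches the stated goal of making the labeling usable by non-professionals.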

Abstract

The present invention provides a method and an apparatus for identifying amended entity words. The method comprises: configuring a corresponding label for each category of entity words in a training corpus; labeling each entity word in a segmented text according to the labels corresponding to the categories in the training corpus; viewing each labeled entity word; and, upon receiving a trigger, amending the labels of entity words, updating the categories of those entity words in the training corpus according to the amended labels, and performing word segmentation according to the updated categories. The accuracy of identifying entity words can thereby be effectively improved.
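The label-and-amend loop described in the abstract might be sketched as follows. This is a toy model under stated assumptions: the training corpus is reduced to a word-to-category dictionary, and `TrainingCorpus`, `LABELS`, `label_tokens`, and `amend_label` are hypothetical names, not part of the patent. Segmentation itself is not modeled; the tokens are taken as already segmented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical label set: one label per entity category.
LABELS: Dict[str, str] = {"PER": "[person]", "LOC": "[place]", "ORG": "[organization]"}


@dataclass
class TrainingCorpus:
    """Toy stand-in for the training corpus: entity word -> category."""
    categories: Dict[str, str] = field(default_factory=dict)


def label_tokens(tokens: List[str], corpus: TrainingCorpus) -> List[Tuple[str, Optional[str]]]:
    """Pair each segmented token with the label of its category (None if not an entity)."""
    labeled = []
    for tok in tokens:
        category = corpus.categories.get(tok)
        labeled.append((tok, LABELS.get(category) if category else None))
    return labeled


def amend_label(corpus: TrainingCorpus, word: str, new_category: str) -> None:
    """Apply a user correction by updating the word's category in the corpus."""
    if new_category not in LABELS:
        raise ValueError(f"unknown category: {new_category}")
    corpus.categories[word] = new_category


# "Jinan" is deliberately mislabeled as a person to show the correction.
corpus = TrainingCorpus({"Inspur": "ORG", "Jinan": "PER"})
tokens = ["Inspur", "is", "in", "Jinan"]

before = label_tokens(tokens, corpus)
amend_label(corpus, "Jinan", "LOC")  # the user's correction (the "trigger")
after = label_tokens(tokens, corpus)
print(before[-1], "->", after[-1])
```

Because the correction is written back to the corpus rather than to the single labeled text, later texts containing the same word pick up the amended category automatically.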

Description

Technical field

[0001] The invention relates to the field of computer language processing, and in particular to a method and device for correcting entity word recognition.

Background

[0002] Many enterprises now use big data to obtain valuable information assets. In current big data applications, a basic training corpus is mainly used to identify proper names in text, such as names of people and places, as well as meaningful times and dates. In practice, however, proper names differ across industries and businesses, which results in low accuracy in identifying entity words.

Summary of the invention

[0003] The invention provides a method and device for correcting entity word recognition, so as to improve the accuracy of identifying entity words.

[0004] A method for correcting entity word recognition, configuring corresponding labels for each category of entity words in the training corpu...

Application Information

IPC(8): G06F17/28, G06F17/27
Inventors: 范莹, 于治楼
Owner: INSPUR GROUP CO LTD