Comment text entity recognition method and device based on character-based model

This entity recognition technology, based on a character-level model and applied in the information field, addresses two problems: inaccurate word segmentation that degrades recognition, and the inability of existing methods to handle massive volumes of text. Its effects are a higher accuracy rate, more efficient and effective training, and a smaller required training sample size.

Active Publication Date: 2017-05-31
INST OF INFORMATION ENG CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

For the above two reasons, rule matching struggles to accurately identify the target entity in comment text.
[0004] In the existing technology, manual methods are relatively accurate but costly and cannot handle massive volumes of text. Rule-based matching recognizes only a very limited range of content, namely text written in a standard form. Word-segmentation-based methods produce inaccurate segmentation because colloquial expression is irregular, which degrades recognition.

Examples

[0035] Example: a core entity recognition method and device for evaluative texts, which finds the core entities in different types of evaluative text. The overall process is shown in Figure 1, and the functional modules are shown in Figure 2. Take a travel review as an example: in "In spring, the scenery of the Summer Palace is very beautiful.", the core entity is "Summer Palace".

[0036] 1) Train the model: use the labeled training data to train the character-based bidirectional LSTM model. For example, in the text "Beijing is very congested.", the core entity is "Beijing".

[0037] i) Each labeled training text is first segmented into characters (a continuous run of English letters and digits is treated as a single character, and each punctuation mark is one character). A fixed-length context centered on each character is then extracted as a training sample; where the available context is shorter than the fixed length, it is padded with 0.

[0038] Assuming that the fixed length of the context intercepted here is 2, the text with...
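
The segmentation and context extraction described in [0037] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation. It assumes that a "fixed length" of `k` means `k` tokens on each side of the center, and that the padding symbol is the string "0". The names `tokenize`, `context_windows`, and `PAD` are hypothetical. The English example sentence from [0036] is used, so each English word becomes one token; in Chinese text, each character would be its own token.

```python
import re

# One token per run of ASCII letters/digits; otherwise one token per
# non-space character (CJK characters and punctuation marks included).
TOKEN_RE = re.compile(r"[A-Za-z0-9]+|\S")
PAD = "0"  # assumed padding symbol; the source only says "0 padding"


def tokenize(text):
    """Split text into tokens following the rule in [0037]."""
    return TOKEN_RE.findall(text)


def context_windows(tokens, k):
    """Return one window per token: k tokens left, the center, k tokens right.

    Assumption: k counts tokens on EACH side of the center token.
    """
    padded = [PAD] * k + tokens + [PAD] * k
    return [padded[i:i + 2 * k + 1] for i in range(len(tokens))]


tokens = tokenize("Beijing is very congested.")
windows = context_windows(tokens, 2)
for window in windows:
    print(window)
```

Padding both ends up front keeps every window the same length, so the samples can be stacked into a fixed-size batch without special-casing tokens near the start or end of the text.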

Abstract

The invention relates to a comment text entity recognition method and device based on a character-based model. The method includes the following steps: first, labeled training data is used to train a character-based bidirectional LSTM model; second, the trained model performs core entity recognition on an input evaluative text; third, non-empty results output by the model undergo character completion, and the completed results are output as the final recognized core entities; fourth, for evaluative texts on which the model outputs an empty result, a candidate entity is generated through text segmentation, part-of-speech tagging, and an entity dictionary, and serves as the core entity. The method and device can accurately and effectively extract entities from large-scale, colloquially expressed evaluative texts.
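
The control flow of the recognition steps in the abstract (model output, completion of non-empty results, dictionary-based fallback for empty results) might look something like the sketch below. Everything here is a hypothetical stand-in: `toy_model_predict` replaces the trained bidirectional LSTM, `toy_complete` is one guess at what character completion could do (widening a partial span to a dictionary entry found in the text), and `toy_fallback` replaces segmentation, part-of-speech tagging, and dictionary lookup with a plain substring check. Picking the first fallback candidate is also an assumption.

```python
ENTITY_DICT = ["Summer Palace", "Beijing"]  # toy entity dictionary (hypothetical)


def toy_model_predict(text):
    """Stand-in for the trained model: returns a partial span or ""."""
    return "Summer" if "Summer" in text else ""


def toy_complete(text, partial):
    """Stand-in completion: widen a partial span to a dictionary entry in the text."""
    for entity in ENTITY_DICT:
        if partial in entity and entity in text:
            return entity
    return partial  # nothing to widen to; keep the model output


def toy_fallback(text):
    """Stand-in for segmentation + POS tagging + entity dictionary lookup."""
    return [entity for entity in ENTITY_DICT if entity in text]


def recognize_core_entity(text, predict, complete, fallback):
    """Return the core entity for one evaluative text, or None if none is found."""
    partial = predict(text)
    if partial:                      # non-empty model output: complete it
        return complete(text, partial)
    candidates = fallback(text)      # empty model output: use candidates
    return candidates[0] if candidates else None


first = recognize_core_entity(
    "In spring, the scenery of the Summer Palace is very beautiful.",
    toy_model_predict, toy_complete, toy_fallback)
second = recognize_core_entity(
    "Beijing is very congested.",
    toy_model_predict, toy_complete, toy_fallback)
print(first, "|", second)
```

Passing the three stages in as functions keeps the branching logic separate from the model, so a real predictor or dictionary could be substituted without changing `recognize_core_entity`.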

Description

Technical field

[0001] The invention belongs to the field of information technology, and in particular relates to a method and device for recognizing entities in comment text based on a character-based model.

Background

[0002] Comment text refers to comments published by users on consumer products or services, including but not limited to commodities, stores, and tourist attractions. Examples are product reviews on shopping websites, attraction reviews on travel websites, and movie reviews on movie websites. Entity recognition of comment text means finding the object of the user's comment in the comment text. As a direct reflection of the consumer's experience, review text can provide an important reference for product or service providers and for other consumers. Entity recognition of this type of text can quickly and conveniently locate the review information of consumer products or services, and provide a strong basis for releva...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/295
Inventor: 李全刚, 柳厅文, 王玉斌, 李柢颖, 时金桥, 亚静, 郭莉
Owner: INST OF INFORMATION ENG CHINESE ACAD OF SCI