Chinese entity relation extraction method based on keyword and verb dependency

A Chinese entity relation extraction technique based on keywords and verb dependencies, applied in special data processing applications, instruments, and electronic digital data processing. It addresses the difficulty of mining deep semantic relationships in texts where named entities occur infrequently, and yields a richer text semantic network.

Active Publication Date: 2019-01-18
SHANGHAI DATATOM INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] The study found that current entity relation extraction in non-specific domains mainly targets relations among people, institutions, and places. In some texts, such as an article introducing big data or a product manual, named entities such as people, institutions, and places appear infrequently. If only the semantic relations among these named entities are extracted, it is difficult to mine the deep semantic relationships of the text.



Examples


Embodiment Construction

[0048] The present invention is further described below with reference to the accompanying drawings.

[0049] The Chinese entity relation extraction method based on keyword and verb dependency of the present invention analyzes the dependency relations of verbs to extract entity relations from large-scale free text, providing data support for building a text semantic network. As shown in Figure 1, the method includes the following steps:

[0050] Step 1: Segment the input text, extract keywords, and generate a keyword thesaurus from the extracted keywords.

[0051] The purpose of extracting global keywords from the text is to expand the traditional entity set. Traditional entity sets cover only named entities such as person names, place names, and organization names, whereas the present invention targets large-scale, domain-independent free text. If a document contains almost no person names, place names, or organization names, then no entity relations can be extracted, so the pres...
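Step 1 can be illustrated with a minimal sketch. The source does not say which keyword extraction algorithm is used, so the frequency ranking below is only a stand-in. The function name build_keyword_thesaurus, the stopword list, and the sample tokens are all hypothetical, and the input is assumed to be already segmented into words.

```python
from collections import Counter

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"的", "是", "了", "在", "和"}


def build_keyword_thesaurus(tokens, top_k=3, min_len=2):
    """Return the top_k most frequent candidate keywords.

    Assumption: `tokens` is text that has already been word-segmented.
    Frequency ranking is a placeholder for whatever extractor is used.
    """
    # Drop stopwords and single-character tokens, which are rarely keywords.
    candidates = [t for t in tokens if len(t) >= min_len and t not in STOPWORDS]
    counts = Counter(candidates)
    # Sort by descending frequency, then by token text for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


# Illustrative pre-segmented input (not taken from the patent).
tokens = ["大数据", "是", "一种", "技术", "大数据", "平台", "存储", "数据", "的", "技术"]
keywords = build_keyword_thesaurus(tokens, top_k=2)
```

The resulting keyword list plays the role of the keyword thesaurus, which extends the entity set beyond person, place, and organization names.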


Abstract

The invention discloses a Chinese entity relation extraction method based on keyword and verb dependency. Taking large-scale unstructured free text as the target, the method first segments the text and extracts keywords to form a text keyword thesaurus. The text then undergoes sentence segmentation, word segmentation, part-of-speech tagging, named entity recognition, and dependency parsing, and an entity corpus is constructed by combining the named entity thesaurus with the keyword thesaurus. Based on the characteristics of Chinese sentence structure, syntactic structure, and the dependencies between words, entity-relation syntactic rules centered on verbs are constructed, and each sentence in the text is matched against these rules. Finally, relation triples are output, yielding the set of relation triples for the text. The invention makes entity relation extraction from large-scale Chinese text more effective and more accurate.
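The rule-matching stage described above can be sketched as follows. This is not the patent's rule set, which is not reproduced in the text; it is a single hypothetical rule (a verb with one subject dependent and one object dependent, both in the entity corpus). The Token structure, the "v" verb tag, and the SBV/VOB dependency labels are assumptions borrowed from a common Chinese dependency annotation convention, and the parsed sentence is hand-written rather than produced by a real parser.

```python
from dataclasses import dataclass


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    word: str
    pos: str     # part-of-speech tag; "v" marks a verb (assumed tag set)
    head: int    # idx of the governing token, 0 for the root
    deprel: str  # dependency label (SBV = subject, VOB = object; assumed)


def extract_triples(sentence, entities):
    """Match one illustrative verb-centered rule: entity -SBV- verb -VOB- entity."""
    triples = []
    for verb in sentence:
        if verb.pos != "v":
            continue
        # Collect the direct dependents of this verb.
        deps = [t for t in sentence if t.head == verb.idx]
        subjects = [t.word for t in deps if t.deprel == "SBV" and t.word in entities]
        objects = [t.word for t in deps if t.deprel == "VOB" and t.word in entities]
        triples.extend((s, verb.word, o) for s in subjects for o in objects)
    return triples


# Entity corpus = named entity thesaurus combined with keyword thesaurus.
named_entities = {"上海"}        # illustrative
keywords = {"平台", "数据"}       # illustrative
entity_corpus = named_entities | keywords

# Hand-written parse of "平台 存储 数据" ("the platform stores data").
sentence = [
    Token(1, "平台", "n", 2, "SBV"),
    Token(2, "存储", "v", 0, "HED"),
    Token(3, "数据", "n", 2, "VOB"),
]
triples = extract_triples(sentence, entity_corpus)
```

Because the entity corpus includes keywords, the sentence yields a triple even though it contains no person, place, or organization name, which is the case the method is designed to handle.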

Description

Technical field

[0001] The invention relates to a Chinese entity relation extraction method, in particular to an extraction method for large-scale free text based on keyword and verb dependency analysis.

Background art

[0002] With the rapid development of Internet information technology, text information on the Internet has grown explosively. How to quickly and accurately extract the information people need from large-scale text has become a research hotspot, and information extraction technology emerged in response. Entity relation extraction is an important part of information extraction. Its purpose is to mine the semantic associations between entities from natural language text. Mining and analyzing these associations makes it possible to better understand a user's search intent, provide more accurate search services, and improve the user's search experience.

[0003] The traditional Chinese entity relationship extraction is oriented to the extraction of dom...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/27
CPC: G06F40/279, G06F40/30
Inventor: 许青青谢赟韩欣卓建飞
Owner: SHANGHAI DATATOM INFORMATION TECH CO LTD