Dependency semantic-based Chinese unsupervised open entity relationship extraction method

An open entity relationship extraction technology, applied in the field of information extraction research. It addresses the problems of traditional extraction methods, such as high requirements for training corpora, poor portability and scalability, and the inability to transplant extraction methods across languages. Its effects are low computational complexity and high extraction efficiency, which meet high real-time requirements.

Active Publication Date: 2017-10-24
TONGJI UNIV
5 Cites, 71 Cited by

AI Technical Summary

Problems solved by technology

The purpose of the present invention is to avoid the disadvantages of traditional extraction methods, such as high requirements for training corpora, poor portability and scalability, and inability to adapt to open network texts. In addition, because Chinese is complex and flexible in its morphology and grammar, extraction methods developed for English corpora cannot be transplanted to Chinese. The invention therefore proposes an open, unsupervised entity relationship extraction method for network text based on the characteristics of the Chinese language.

Examples


Embodiment

[0043] The present invention proposes a Chinese unsupervised open entity relationship extraction method based on dependency semantics. The method is built on dependency semantic normal forms (DSNFs) and extracts relations automatically, without manual intervention. The input is a raw natural language sentence without any prior processing, and the output is a set of entity-relationship triples. As shown in Figure 1, the whole process can be described as follows:

[0044] Step 1: Preprocess the input text. Each sentence undergoes a series of natural language processing operations, including word segmentation, part-of-speech tagging, and dependency parsing, to prepare for the subsequent steps. The proposed method uses the natural language processing tools provided by the Language Technology Platform (LTP), developed by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology, to perform the ab...
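The output of the preprocessing in Step 1 can be pictured as one record per word, holding its part-of-speech tag and its dependency arc. The following is a minimal sketch under stated assumptions: the `Token` class, the `root` and `dependents` helpers, and the example sentence are hypothetical and written for illustration. The tags and labels (`nh`, `ni`, `SBV`, `VOB`, `HED`, `RAD`) are meant to follow LTP's conventions, but the annotation is hand-written and is not actual LTP output.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    index: int   # 1-based position of the word in the sentence
    word: str    # surface form produced by word segmentation
    pos: str     # part-of-speech tag
    head: int    # index of the head word, 0 for the sentence root
    deprel: str  # dependency label on the arc from this word to its head


# Hand-annotated example (an assumption, not real parser output):
# "马云 创办 了 阿里巴巴" ("Jack Ma founded Alibaba").
SENTENCE: List[Token] = [
    Token(1, "马云", "nh", 2, "SBV"),      # person name, subject of the verb
    Token(2, "创办", "v", 0, "HED"),       # main verb, root of the tree
    Token(3, "了", "u", 2, "RAD"),         # aspect particle attached to the verb
    Token(4, "阿里巴巴", "ni", 2, "VOB"),  # organization name, object of the verb
]


def root(tokens: List[Token]) -> Token:
    """Return the single root word; raise if the tree is malformed."""
    roots = [t for t in tokens if t.head == 0]
    if len(roots) != 1:
        raise ValueError("expected exactly one root word")
    return roots[0]


def dependents(tokens: List[Token], head_index: int) -> List[Token]:
    """Return the words whose head is the word at `head_index`."""
    return [t for t in tokens if t.head == head_index]
```

With this representation, later steps only need to follow `head` links, so they do not depend on which parser produced the annotation.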

Abstract

The invention relates to a dependency semantic-based Chinese unsupervised open entity relationship extraction method. The method comprises the following steps: preprocessing an input text by performing Chinese word segmentation, part-of-speech tagging, and dependency parsing on it; performing named entity recognition on the input text; selecting any two of the recognized entities to form a candidate entity pair; searching for the dependency path between the two entities of the candidate pair; and checking whether the syntactic structure mapped by the dependency path matches a normal form in the dependency semantic normal form set. If it matches, words or phrases are extracted from the remaining part of the input text according to the matched normal form to serve as relation words, and the relation words and the candidate entity pair form a relation triple. If it does not match, normal form matching proceeds to the next candidate entity pair. Finally, the relation triples are output. Compared with the prior art, the method has low computational complexity and high extraction efficiency, is not limited by the distance or position between entities, and can also extract relations from simple sentences.
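The steps listed in the abstract can be sketched as a small pipeline. This is an illustrative sketch only, not the patented implementation: the contents of the dependency semantic normal form set are not given in this text, so the single subject-verb-object pattern in `match_svo` is a stand-in assumption. The `Token` class, the `ENTITY_TAGS` set (used here in place of a real named entity recognizer), and the hand-annotated sentence are likewise hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    index: int   # 1-based position in the sentence
    word: str
    pos: str     # part-of-speech tag
    head: int    # index of the head word, 0 for the root
    deprel: str  # dependency label on the arc to the head


# Assumed stand-in for named entity recognition: person, organization, place.
ENTITY_TAGS = {"nh", "ni", "ns"}


def path_to_root(tokens: List[Token], index: int) -> List[int]:
    """Indices from `index` up to the root, inclusive (assumes a valid tree)."""
    path = []
    while index != 0:
        path.append(index)
        index = tokens[index - 1].head
    return path


def dependency_path(tokens: List[Token], a: int, b: int) -> List[int]:
    """Path from word `a` to word `b` through their lowest common ancestor."""
    up_a, up_b = path_to_root(tokens, a), path_to_root(tokens, b)
    common = next(i for i in up_a if i in up_b)
    return up_a[: up_a.index(common) + 1] + up_b[: up_b.index(common)][::-1]


def match_svo(tokens: List[Token], path: List[int]) -> Optional[str]:
    """Illustrative pattern: entity1 <-SBV- verb -VOB-> entity2."""
    if len(path) != 3:
        return None
    first, verb, last = (tokens[i - 1] for i in path)
    if (first.deprel == "SBV" and first.head == verb.index
            and last.deprel == "VOB" and last.head == verb.index
            and verb.pos == "v"):
        return verb.word  # the verb serves as the relation word
    return None


def extract(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Try every candidate entity pair and collect the matching triples."""
    entities = [t for t in tokens if t.pos in ENTITY_TAGS]
    triples = []
    for e1, e2 in combinations(entities, 2):
        relation = match_svo(tokens, dependency_path(tokens, e1.index, e2.index))
        if relation is not None:
            triples.append((e1.word, relation, e2.word))
    return triples


# Hand-annotated parse of "马云 创办 了 阿里巴巴" (not real parser output).
SENTENCE = [
    Token(1, "马云", "nh", 2, "SBV"),
    Token(2, "创办", "v", 0, "HED"),
    Token(3, "了", "u", 2, "RAD"),
    Token(4, "阿里巴巴", "ni", 2, "VOB"),
]
```

Calling `extract(SENTENCE)` on this example yields the single triple (马云, 创办, 阿里巴巴). Because matching works on the dependency path rather than on word order, the two entities do not need to be adjacent in the sentence, which is the property the abstract describes as overcoming distance and position limitations.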

Description

Technical field

[0001] The invention relates to information extraction research in the fields of artificial intelligence and natural language processing, and in particular to a Chinese unsupervised open entity relationship extraction method based on dependency semantics.

Background

[0002] The wave of big data is surging like the tide of the Qiantang River, and the data accumulated on the Internet is growing explosively. Faced with the massive amount of information on the web, users find it very difficult to quickly locate the information they care about. Traditional search engines can only return a large number of webpages related to the user's query, and the user obtains the needed information only after browsing those pages. This single search mode of returning webpages can no longer meet the actual needs of users facing massive network data. The Internet provides people with an inexhaustible source of information, how to quick...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/279, G06F40/30
Inventors: 向阳, 贾圣宾, 鄂世嘉, 吕东东
Owner: TONGJI UNIV