Natural language text entity relationship extraction method and device

An entity relationship extraction technology for natural language text, applied in the field of information processing. It addresses the large amount of calculation required by existing approaches, achieving low computational intensity while solving the long-distance dependence problem.

Pending Publication Date: 2022-07-26
江苏新质信息科技有限公司

AI Technical Summary

Problems solved by technology

[0004] To this end, the present invention provides a method and device for extracting entity relationships from natural language text. It solves the problem of long-distance dependence in entity relationship recognition, ensures recognition accuracy, and at the same time reduces the large amount of calculation.

Examples

Embodiment 1

[0064] Referring to Figure 2, Embodiment 1 of the present invention provides a natural language text entity relationship extraction method, including:

[0065] S1. Separate and arrange the articles in the experimental corpus to obtain sentences;

[0066] S2. Perform word segmentation and part-of-speech tagging on the obtained sentences, and convert natural language sentences into words;

[0067] S3. Mark the association between words by manual means to form a training set;

[0068] S4. Determine whether the similarity between any two words exceeds a given threshold, and decide whether to generate a link between the corresponding words to form a directed acyclic graph;

[0069] S5. Invoke the Similar-Chain CRF algorithm with the training set to perform training calculations and obtain a parameter model;

[0070] S6. Predict and analyze the given natural language text through the parameter model, and output a pair of associated nodes; extracting the entity relationship of ...
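
Step S4 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the source does not say how similarity is measured or how the direction of a link is chosen, so the cosine similarity over toy vectors, the rule that every edge points from the earlier word to the later one (which guarantees the graph is acyclic), and all function names are assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_similarity_dag(words, vectors, threshold):
    """Return edges (i, j) with i < j for word pairs whose similarity exceeds threshold.

    Edges always run from an earlier position to a later one, so no cycle can form.
    """
    edges = []
    for i, j in combinations(range(len(words)), 2):
        if cosine(vectors[words[i]], vectors[words[j]]) > threshold:
            edges.append((i, j))
    return edges


# Toy, hand-made vectors; a real system would use learned word representations.
words = ["Alice", "founded", "Acme", "in", "Paris"]
vectors = {
    "Alice": (1.0, 0.0, 0.2),
    "founded": (0.1, 1.0, 0.0),
    "Acme": (0.9, 0.1, 0.3),
    "in": (0.0, 0.2, 1.0),
    "Paris": (0.2, 0.0, 1.0),
}
edges = build_similarity_dag(words, vectors, threshold=0.8)
print(edges)  # [(0, 2), (3, 4)]
```

With these toy vectors, only "Alice"/"Acme" and "in"/"Paris" exceed the threshold, so the link between positions 0 and 2 skips over the intervening word, which is the kind of non-adjacent dependency a plain chain would not represent.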

Embodiment 2

[0098] Referring to Figure 3, Embodiment 2 of the present invention provides a natural language text entity relationship extraction device, including:

[0099] The sentence processing module 1 is used to separate and arrange the articles in the experimental corpus to obtain sentences;

[0100] The word generation module 2 is used to perform word segmentation and part-of-speech tagging on the obtained sentences, and convert natural language sentences into words;

[0101] The labeling module 3 is used to manually mark the relationships between words to form a training set;

[0102] The similarity analysis module 4 is used to judge whether the similarity between any two words exceeds a given threshold, and decide whether to generate a link between the corresponding words to form a directed acyclic graph;

[0103] The training module 5 is used to call the Similar-Chain CRF algorithm with the training set for training calculation, and obtain the parameter model;

[0104] Prediction ...
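
As a rough illustration of how the first two modules might be wired together, the sketch below composes a sentence splitter and a tagger behind one object. Everything here is hypothetical: the class and function names are invented for illustration, the regular-expression splitter is a stand-in, and the toy tagger merely marks capitalised tokens, whereas a real device would plug in a proper word segmenter and part-of-speech tagger. Modules 3 to 5 and the prediction module are not modelled.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def split_sentences(article: str) -> List[str]:
    """Stand-in for module 1: cut after ., !, ? or their Chinese counterparts."""
    parts = re.findall(r"[^.!?。！？]+[.!?。！？]?", article)
    return [p.strip() for p in parts if p.strip()]


def toy_tag(sentence: str) -> List[Token]:
    """Stand-in for module 2: whitespace-style tokens with a crude two-tag scheme."""
    return [(w, "PROPN" if w[0].isupper() else "X") for w in re.findall(r"\w+", sentence)]


@dataclass
class PreprocessingPipeline:
    """Hypothetical wrapper chaining the sentence and word generation modules."""
    split: Callable[[str], List[str]]
    tag: Callable[[str], List[Token]]

    def run(self, articles: List[str]) -> List[List[Token]]:
        # One list of (word, tag) pairs per sentence, across all articles.
        return [self.tag(s) for a in articles for s in self.split(a)]


pipeline = PreprocessingPipeline(split=split_sentences, tag=toy_tag)
tagged = pipeline.run(["Alice founded Acme. Bob visited Paris!"])
print(tagged)
```

Passing the splitter and tagger in as callables keeps the modules replaceable, which mirrors the way the embodiment lists them as separate units.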

Embodiment 3

[0119] Embodiment 3 of the present invention provides a non-transitory computer-readable storage medium. The computer-readable storage medium stores program code for a method for extracting entity relationships from natural language text, and the program code includes instructions for executing the natural language text entity relationship extraction method of Embodiment 1 or any of its possible implementations.

[0120] A computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The usable media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVD), semiconductor media (e.g., Solid State Disk (SSD)), and the like.

Abstract

The invention discloses a method and device for extracting entity relationships from natural language text. The method comprises: separating and arranging the articles in an experimental corpus to obtain sentences; performing word segmentation and part-of-speech tagging on the obtained sentences to convert the natural language sentences into words; manually marking the association relationships among words to form a training set; determining whether the similarity between any two words exceeds a given threshold, and deciding accordingly whether to generate a link between the corresponding words, so as to generate a directed acyclic graph; and calling the Similar-Chain CRF algorithm with the training set to carry out training calculation and obtain a parameter model. The parameter model can then be used to run prediction on a natural language text and extract the related entity relationships. The method not only determines the dependency relationships between words by calculating the similarity between them, but also clips the original chain using the mutual information of the words, which reduces the amount of calculation and solves the long-distance dependence problem.
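
The idea of clipping links with mutual information can be illustrated with the Python sketch below. The abstract does not specify how mutual information is computed or applied, so this example assumes pointwise mutual information (PMI) estimated from sentence-level co-occurrence counts and a simple cut-off; the function names, the toy corpus, and the threshold are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """PMI for each word pair that co-occurs in at least one sentence.

    Probabilities are estimated as the fraction of sentences containing
    the word (or both words). Keys are alphabetically ordered pairs.
    """
    n = len(sentences)
    word_count, pair_count = Counter(), Counter()
    for sent in sentences:
        vocab = sorted(set(sent))
        word_count.update(vocab)
        pair_count.update(combinations(vocab, 2))
    return {
        (a, b): math.log((c * n) / (word_count[a] * word_count[b]))
        for (a, b), c in pair_count.items()
    }


def prune_edges(edges, pmi, min_pmi):
    """Keep only edges whose word pair has PMI of at least min_pmi."""
    kept = []
    for a, b in edges:
        key = (a, b) if a <= b else (b, a)
        if pmi.get(key, float("-inf")) >= min_pmi:
            kept.append((a, b))
    return kept


# Toy corpus of already-segmented sentences.
corpus = [
    ["alice", "founded", "acme"],
    ["alice", "runs", "acme"],
    ["bob", "visited", "paris"],
    ["alice", "visited", "paris"],
]
pmi = pmi_table(corpus)
candidate_edges = [("alice", "acme"), ("alice", "paris"), ("visited", "paris")]
kept = prune_edges(candidate_edges, pmi, min_pmi=0.0)
print(kept)  # [('alice', 'acme'), ('visited', 'paris')]
```

Here the "alice"/"paris" link is dropped because the two words co-occur less often than chance would predict in this toy corpus, so fewer links remain for the later training step to consider.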

Description

Technical field

[0001] The present invention relates to the technical field of information processing, in particular to a method and device for extracting entity relationships from natural language text.

Background technique

[0002] In real life, application scenarios based on entity relationships are very common. For example, domain-based community correlation discovery extracts entity relationships from published news content to construct potential community relationships in a specific field, providing reliable leads for domain research. Among the various schemes for analyzing entity relationships, conditional random fields are a very important model that can achieve high performance without requiring a large amount of training data.

[0003] The original conditional random field is called Linear-Chain CRF, and its disadvantage is that it cannot solve the problem of long-distance dependence between entities, that is, when two related entities appear too far a...
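
For reference, the standard textbook form of a linear-chain CRF is given below. It is general background rather than text from the patent, and the symbols (label sequence $y$, observation sequence $x$, feature functions $f_k$, weights $\lambda_k$) are the conventional ones, not notation taken from the source.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \right)
```

Each feature function couples only the adjacent labels $y_{t-1}$ and $y_t$, which is why a dependency between two labels that are far apart in the sequence is not modelled directly.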

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/289, G06F40/211, G06K9/62
CPC: G06F40/289, G06F40/211, G06F18/22
Inventor: 朱鸿宇
Owner: 江苏新质信息科技有限公司