Social noise text entity relationship extraction optimization method and system

An entity relationship extraction and optimization technology, applied in reasoning methods, text database clustering/classification, and unstructured text data retrieval. It reduces the spurious correlation problem and improves the extraction effect.

Active Publication Date: 2021-07-06
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

If a model is trained directly on such a data set without modification, then in the former case an object and a relationship may acquire a spurious correlation through model fitting. That is, the object and the relationship have no logical connection, but because they frequently appear together in the data set, the model mistakenly concludes from the co-occurrence statistics that the two are correlated; for the l...

Examples


Embodiment Construction

[0036] The present invention is further described below with reference to the accompanying drawings:

[0037] A social noise text entity relationship extraction optimization method, comprising the following steps:

[0038] S1, construct a semantic counterfactual corpus by taking the subjects and objects that occur under the same relationship in the original dataset and replacing each subject and object with an entity of the same category;

[0039] S2, build a counterfactual checker based on grammatical structure and recognition-result criteria; semantic counterfactuals that pass the check are merged into the original data, while those that fail are deleted and the shortfall is replenished by repeating S1;

[0040] S3, use relative position encoding to extract word position information, and generate syntactic position counterfactuals by exchanging the position codes of the subject and the object;

[0041] S4, the expanded data set uses the BE...
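
Steps S1 to S3 can be illustrated with a small sketch. It is not the patented implementation: the `Example` record, the `entity_pool` dictionary, the single-token entities, and every function name are assumptions made for illustration. In particular, `passes_checker` is only a placeholder for the S2 checker, whose grammatical-structure and recognition-result criteria are not spelled out in the text available here, and `position_counterfactual` reflects one possible reading of "exchanging the position code of subject and object".

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Example:
    """Hypothetical training record; entities are single tokens for simplicity."""
    tokens: tuple      # sentence tokens
    subj: int          # index of the subject token
    obj: int           # index of the object token
    subj_type: str     # entity category of the subject, e.g. "PER"
    obj_type: str      # entity category of the object, e.g. "ORG"
    relation: str      # relation label, left unchanged by the augmentation


def semantic_counterfactual(ex, entity_pool, rng):
    """S1 (sketch): swap subject and object for other entities of the same category."""
    tokens = list(ex.tokens)
    for idx, etype in ((ex.subj, ex.subj_type), (ex.obj, ex.obj_type)):
        candidates = [e for e in entity_pool[etype] if e != tokens[idx]]
        if candidates:
            tokens[idx] = rng.choice(candidates)
    return replace(ex, tokens=tuple(tokens))


def passes_checker(ex, cf):
    """S2 (placeholder only): keep counterfactuals that differ from the original,
    preserve sentence length, and do not collapse subject and object into one entity."""
    return (
        len(cf.tokens) == len(ex.tokens)
        and cf.tokens != ex.tokens
        and cf.tokens[cf.subj] != cf.tokens[cf.obj]
    )


def relative_positions(ex):
    """Offsets of every token relative to the subject and to the object."""
    n = len(ex.tokens)
    return [i - ex.subj for i in range(n)], [i - ex.obj for i in range(n)]


def position_counterfactual(ex):
    """S3 (one reading): exchange the subject and object position codes."""
    to_subj, to_obj = relative_positions(ex)
    return to_obj, to_subj


ex = Example(("Alice", "works", "at", "Acme"), 0, 3, "PER", "ORG", "employee_of")
pool = {"PER": ["Alice", "Bob"], "ORG": ["Acme", "Globex"]}
cf = semantic_counterfactual(ex, pool, random.Random(0))
if passes_checker(ex, cf):
    print(cf.tokens)                    # ('Bob', 'works', 'at', 'Globex')
print(position_counterfactual(ex))      # ([-3, -2, -1, 0], [0, 1, 2, 3])
```

In this sketch the relation label is carried over unchanged, so the augmented sample presents the same relation with different entities. That is the intuition behind weakening entity-specific spurious correlations, though the actual acceptance criteria would come from the S2 checker described in the patent.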


Abstract

The social noise text entity relationship extraction optimization method and system address several problems in the social text field: data labeling is costly, data is updated rapidly, data sets are biased, and an original model may fall into spurious correlation. The method introduces causal inference into the process of generating word vectors in natural language processing and applies the advantages of intervention and counterfactuals in causal inference to achieve data enhancement and weaken the bias caused by the data set. The method can reduce the cost of manual data labeling, effectively simulate the irregular and novel language of text in real scenarios, and improve the robustness of the model for social noise text entity relationship extraction. Because the method operates on word vectors, it adapts well to various existing models and can be applied to them.

Description

Technical field

[0001] The invention belongs to the technical field of entity relationship extraction optimization, and in particular relates to a method and system for entity relationship extraction optimization of social noise text.

Background technique

[0002] Entity relationship extraction technology has become a key part of big data analysis and knowledge graph construction. The goal of this technology is to output all (subject, relationship type, object) triples in a sentence. As the field continues to develop, new methods are constantly being proposed. The earliest pipeline model divides entity extraction and relationship extraction into two successive steps, but this method is prone to cumulative errors. Subsequently, many researchers proposed a variety of joint entity-relationship extraction models to reduce the cumulative error. Existing joint entity-relationship extraction models can be broadly classified into two categories: encoder-based models and task-d...


Application Information

IPC(8): G06F16/35, G06F40/279, G06F40/30, G06F40/253, G06F40/211, G06N5/04, G06F16/36
CPC: G06F16/353, G06F40/279, G06F40/30, G06F40/253, G06F40/211, G06N5/041, G06F16/367
Inventor: 刘晓明, 李承祖, 冯乙洋, 多小川, 贺靖涵
Owner XI AN JIAOTONG UNIV