Method for analyzing sparse semantic relationship by combining BiLSTM-CRF algorithm and R-BERT algorithm

A semantic relationship analysis technology, applied in the computer field, that addresses the high cost of data annotation, the low accuracy of semantic analysis, and noise from irrelevant information.

Inactive Publication Date: 2021-02-26
江苏网谱数据科技有限公司
0 Cites, 10 Cited by

AI Technical Summary

Problems solved by technology

[0002] With the rapid development of the Internet, analyzing and processing massive Internet data has become a crucial task across industries. Massive text data on the Internet, especially unstructured text, contains a great deal of important information but also a great deal of noise and irrelevant information, and the effective entities are sparsely distributed. Extracting information from unstructured text with high precision has therefore become a top priority in industry analysis.
[0003] In addition, manual relationship extraction from text relies on substantial human resources to mine the relationships between entities in massive data. Traditional machine learning methods struggle to achieve high precision in semantic analysis, and insufficient annotated data and high annotation cost also limit the accuracy of semantic entity relationship extraction.
[0004] Market data is sparsely distributed in unstructured Internet articles, and irrelevant information adds considerable noise when judging the relationship between a market data value and its designated product and time.



Embodiment Construction

[0030] Embodiments of the present invention are described below through specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other different specific implementation modes, and various modifications or changes can be made to the details in this specification based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that, in the case of no conflict, the following embodiments and features in the embodiments can be combined with each other.

[0031] It should be noted that the diagrams provided in the following embodiments only schematically illustrate the basic ideas of the present invention, and show only the components related to the present invention rather than the number, shape and size of the compo...


Abstract

The invention provides a method for analyzing sparse semantic relationships by combining a BiLSTM-CRF algorithm and an R-BERT algorithm. The method comprises the following steps: acquiring text data of emerging industries through a web crawler and performing semi-supervised annotation on the text data; preprocessing the annotated text data and constructing a training data set and a validation data set; training a BiLSTM-CRF algorithm model and an R-BERT algorithm model on the training data set and the validation data set; extracting the entities contained in the text data to be predicted through the trained BiLSTM-CRF algorithm model; predicting the relationships between the text data to be predicted and the extracted entities through the trained R-BERT algorithm model, and establishing relationship connections between related entities; and extracting the semantic relationship triples of the text data to be predicted according to the established relationship connections, thereby completing semantic analysis of that text data. The invention provides a high-precision semantic relation extraction method for information extraction from unstructured text: a BiLSTM-CRF algorithm model extracts the required entities from the text, and an R-BERT model predicts the relationships between the text and the extracted entities.
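The prediction-time steps in the abstract (entity extraction, relation prediction, triple assembly) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the BIO tag scheme, the entity types PRODUCT, VALUE and TIME, the relation labels, and the function names are all hypothetical. The `$` and `#` entity markers follow the convention commonly used with R-BERT; the source does not state which markers it uses. The rule-based `stub_classifier` merely stands in for a trained R-BERT model so that the example runs without model weights.

```python
from itertools import permutations
from typing import Callable, List, Tuple

Entity = Tuple[int, int, str]   # (start index, end index exclusive, entity type)
Triple = Tuple[str, str, str]   # (head text, relation label, tail text)

def decode_bio(tags: List[str]) -> List[Entity]:
    """Collapse a BIO tag sequence (as a sequence tagger would emit) into spans."""
    entities: List[Entity] = []
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):      # trailing sentinel flushes the last span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue                             # still inside the current entity
        if start is not None:
            entities.append((start, i, etype))   # close the open span
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]            # open a new span
    return entities

def mark_pair(tokens: List[str], head: Entity, tail: Entity) -> List[str]:
    """Wrap the head entity in '$' and the tail entity in '#' (assumed markers)."""
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if i == head[0]:
            out.append("$")
        if i == tail[0]:
            out.append("#")
        out.append(tok)
        if i == head[1] - 1:
            out.append("$")
        if i == tail[1] - 1:
            out.append("#")
    return out

def extract_triples(
    tokens: List[str],
    tags: List[str],
    classify: Callable[[List[str], str, str], str],
) -> List[Triple]:
    """Score every ordered entity pair and keep those with a predicted relation."""
    triples: List[Triple] = []
    for head, tail in permutations(decode_bio(tags), 2):
        relation = classify(mark_pair(tokens, head, tail), head[2], tail[2])
        if relation != "no_relation":
            head_text = " ".join(tokens[head[0]:head[1]])
            tail_text = " ".join(tokens[tail[0]:tail[1]])
            triples.append((head_text, relation, tail_text))
    return triples

def stub_classifier(marked_tokens: List[str], head_type: str, tail_type: str) -> str:
    """Hypothetical stand-in for a trained relation model; uses entity types only."""
    rules = {("VALUE", "PRODUCT"): "value_of", ("VALUE", "TIME"): "measured_in"}
    return rules.get((head_type, tail_type), "no_relation")

# Hypothetical sentence and tags; a real tagger would produce the tags.
tokens = ["Acme", "phones", "shipped", "5", "million", "in", "2020"]
tags = ["B-PRODUCT", "I-PRODUCT", "O", "B-VALUE", "I-VALUE", "O", "B-TIME"]
triples = extract_triples(tokens, tags, stub_classifier)
```

Scoring every ordered pair keeps the sketch simple, but the number of pairs grows quadratically with the number of entities; a real system would likely filter candidate pairs by entity type before calling the relation model.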

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a semantic relationship analysis method based on deep learning and pre-trained models and an efficient semi-supervised method for constructing semantic entity relationship annotations, and more specifically to a method for parsing sparse semantic relations by combining the BiLSTM-CRF algorithm with the R-BERT algorithm.

Background art

[0002] With the rapid development of the Internet, analyzing and processing massive Internet data has become a crucial task across industries. Massive text data on the Internet, especially unstructured text, contains a great deal of important information but also a great deal of noise and irrelevant information, and the effective entities are sparsely distributed. Extracting information from unstructured text with high precision has therefore become a top priority in industry analysis.

[0003] In addition, manual text data re...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F40/279, G06N3/04, G06N3/08
CPC: G06F40/30, G06F40/279, G06N3/049, G06N3/08, G06N3/045
Inventors: 陆佃龙, 王增林
Owner: 江苏网谱数据科技有限公司