
NLP knowledge graph construction method combining information amount and BERT-BiLSTM-CRF

A knowledge graph construction method, applicable to unstructured text data retrieval, semantic tool creation, machine learning, and related areas. It addresses inadequate handling of missing values, poor classification results, and underuse of the available data, and it achieves a strong ability to extract contextual information from text along with improved performance.

Pending Publication Date: 2022-07-29
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0025] At present, common machine learning algorithms such as logistic regression, naive Bayes, random forest, and LinearSVC do not handle missing values in the data well. Moreover, existing approaches use only the abstract features of the paper data for multi-class text classification and do not make full use of the acquired data, so the final classification results are poor.

Examples


Embodiment Construction

[0073] The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

[0074] The implementation of the technical solution of the present invention comprises the following five steps: defining the NLP knowledge graph schema layer, acquiring NLP paper data, new word discovery, multi-class text classification, and completing the construction of the knowledge graph. The overall structure is shown in Figure 3.

[0075] Step 1), define the NLP knowledge graph schema layer

[0076] The schema layer describes the entities, relationships and attributes in the graph and is the framework of the knowledge graph. In the domain knowledge graph, it is usually necessary to deeply understand the domain knowledge and define the schema layer in combination with the domain data schema. The present invention defines the schema layer of the NLP knowledge graph through a seven-step method.

[0077] (1) Fir...
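
As an illustration of what paragraph [0076] describes, a schema layer can be represented as a set of entity types (with attributes) and the relation types allowed between them. The following is a minimal sketch, not the patent's own data model: the class names, the `Paper` entity type, and the relation names `mentions` and `addresses` are hypothetical, while `KeyTerm` and `ResearchTask` loosely follow the entity classes named in the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EntityType:
    name: str
    attributes: Tuple[str, ...] = ()


@dataclass(frozen=True)
class RelationType:
    name: str
    head: str  # name of the source entity type
    tail: str  # name of the target entity type


@dataclass
class Schema:
    entity_types: Dict[str, EntityType] = field(default_factory=dict)
    relation_types: List[RelationType] = field(default_factory=list)

    def add_entity_type(self, name, attributes=()):
        self.entity_types[name] = EntityType(name, tuple(attributes))

    def add_relation_type(self, name, head, tail):
        # A relation may only connect entity types that are already declared.
        for endpoint in (head, tail):
            if endpoint not in self.entity_types:
                raise ValueError(f"unknown entity type: {endpoint}")
        self.relation_types.append(RelationType(name, head, tail))

    def allows(self, head_type, relation, tail_type):
        # True if the schema declares this (head, relation, tail) pattern.
        return RelationType(relation, head_type, tail_type) in self.relation_types


# Hypothetical schema for illustration only.
schema = Schema()
schema.add_entity_type("Paper", ["title", "abstract"])
schema.add_entity_type("KeyTerm")
schema.add_entity_type("ResearchTask")
schema.add_relation_type("mentions", "Paper", "KeyTerm")
schema.add_relation_type("addresses", "Paper", "ResearchTask")
```

With a schema like this, a candidate triple can be checked via `schema.allows(...)` before it is added to the graph, which keeps the data layer consistent with the framework the schema layer defines.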


Abstract

The invention discloses an NLP knowledge graph construction method combining information amount and BERT-BiLSTM-CRF. The method comprises the following steps: a schema layer structure for a knowledge graph in the field of natural language processing is defined by analyzing the structure of CNKI journal paper data in combination with the research tasks of natural language processing; the key term entity class in the paper data is obtained through a proposed new word discovery algorithm, and the fine-grained NLP research task entity class of each paper is obtained through a proposed feature fusion multi-classification algorithm; the knowledge extraction module then obtains triples, and finally the natural language processing knowledge graph is constructed. The new word discovery algorithm overcomes the defect of the prior art, which uses only information amount to obtain new words, and greatly improves the new word discovery effect. Compared with other machine learning models, the XGBoost model classifies the fine-grained research tasks of papers with higher accuracy; in addition, feature fusion improves the accuracy of the classification model by about five percent relative to a model without feature fusion.
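
To make the "information amount" side of new word discovery concrete, the sketch below scores candidate n-grams with two statistics commonly used for this purpose: pointwise mutual information (how strongly the parts co-occur) and boundary entropy (how varied the neighboring tokens are). This is an assumption-laden illustration: the excerpt does not give the patent's actual formula, the function name `score_candidates` and the sample text are hypothetical, and the BERT-BiLSTM-CRF component is not shown.

```python
import math
from collections import Counter, defaultdict


def score_candidates(tokens, n=2):
    """Return {ngram: (pmi, boundary_entropy)} for every n-gram in tokens."""
    total = len(tokens)
    unigram = Counter(tokens)
    ngrams = Counter()
    left = defaultdict(Counter)   # tokens seen immediately before each n-gram
    right = defaultdict(Counter)  # tokens seen immediately after each n-gram
    for i in range(total - n + 1):
        gram = tuple(tokens[i:i + n])
        ngrams[gram] += 1
        if i > 0:
            left[gram][tokens[i - 1]] += 1
        if i + n < total:
            right[gram][tokens[i + n]] += 1

    def entropy(counter):
        count = sum(counter.values())
        if count == 0:
            return 0.0
        return -sum(c / count * math.log2(c / count) for c in counter.values())

    n_ngrams = sum(ngrams.values())
    scores = {}
    for gram, freq in ngrams.items():
        p_gram = freq / n_ngrams
        # Probability of the n-gram if its tokens were independent.
        p_indep = math.prod(unigram[t] / total for t in gram)
        pmi = math.log2(p_gram / p_indep)
        # A real term should have varied context on BOTH sides, so take the min.
        scores[gram] = (pmi, min(entropy(left[gram]), entropy(right[gram])))
    return scores


# Hypothetical sample text for illustration only.
tokens = ("knowledge graph is built ; the knowledge graph stores triples ; "
          "a knowledge graph helps").split()
scores = score_candidates(tokens)
ranked = sorted(scores, key=lambda g: scores[g][1], reverse=True)
```

In this toy input, several bigrams share the same mutual information, and only the boundary entropy separates a recurring term from an incidental pair. Purely statistical scores of this kind ignore context semantics, which is the limitation the abstract attributes to prior art.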

Description

Technical field

[0001] The present application relates to the field of computer technology, and in particular to a method for constructing an NLP knowledge graph combining information amount and BERT-BiLSTM-CRF.

Background

[0002] In recent decades, natural language processing has developed rapidly, and the volume of academic research papers related to natural language processing has increased sharply. In both academia and industry, the demand for consulting papers in the field of natural language processing keeps growing, but the diverse research content and complex conceptual relationships in the field pose great challenges to people reading these papers.

[0003] When constructing vertical domain knowledge graphs, the two most important subtasks are the construction of the knowledge graph schema layer and knowledge extraction. The knowledge graph can be lo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F16/36, G06N20/00
CPC: G06F16/367, G06F16/3344, G06F16/353, G06N20/00
Inventor: 范春晓, 吴岳辛, 孙娟娟, 蔡婷婷, 王艺潼
Owner: BEIJING UNIV OF POSTS & TELECOMM