A Deepdive-based domain text knowledge extraction method

A knowledge extraction and text processing technology, applied in text database querying, unstructured text data retrieval, instruments, and related fields. It addresses the difficulty of using unstructured data and the resulting low data utilization, offers strong practicability and flexibility, and reduces costs.

Active Publication Date: 2019-09-20

ZHEJIANG UNIV

Cites: 3; Cited by: 0

Problems solved by technology

For structured and semi-structured data, many tools already exist that can transform the data into knowledge in a knowledge base. Most data sources, however, are currently unstructured, including document data, dialogue data, and so on. Automatic knowledge extraction methods for this class of Chinese data are lacking, which makes the data very difficult to use. A domain text knowledge extraction method is urgently needed to fill this gap.

Embodiment Construction

[0033] To describe the present invention more specifically, its technical solutions are described in detail below with reference to the accompanying drawings and specific embodiments.

[0034] This example analyzes financial announcement data to extract knowledge about equity changes in the financial sector, in order to build a corresponding company equity knowledge base. The overall method for constructing the company equity knowledge base is shown in Figure 1:

[0035] S01: obtain the relevant financial announcement data and convert it into plain-text (txt) content through a series of tools. Segment the announcement text into words with the jieba tool, then use Stanford's CoreNLP tool to perform part-of-speech tagging, named entity tagging, and dependency parsing on the segmented text to obtain the preprocessed announcement data. Figure 2 shows a schematic diagram of the res...
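The word segmentation part of step S01 might look roughly like the sketch below. It is only an illustration under stated assumptions: the `Sentence` record, the `preprocess` function, and the sentence-splitting rule are hypothetical names and choices, not taken from the patent. The sketch calls `jieba.lcut` when jieba is installed and otherwise falls back to a naive per-character split; the part-of-speech, named entity, and dependency annotations produced by CoreNLP are not reproduced here.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sentence:
    """One preprocessed sentence (hypothetical record layout)."""
    doc_id: str
    index: int
    text: str
    tokens: List[str]


def default_segment(text: str) -> List[str]:
    """Segment with jieba if it is installed, else split into characters."""
    try:
        import jieba  # third-party package, assumed present in a real pipeline
        return [tok for tok in jieba.lcut(text) if tok.strip()]
    except ImportError:
        # Crude fallback so the sketch still runs without jieba.
        return [ch for ch in text if not ch.isspace()]


def split_sentences(text: str) -> List[str]:
    """Split text after Chinese or Western sentence-ending punctuation."""
    parts = re.findall(r"[^。！？!?]+[。！？!?]*", text)
    return [p.strip() for p in parts if p.strip()]


def preprocess(doc_id: str, text: str,
               segment: Callable[[str], List[str]] = default_segment) -> List[Sentence]:
    """Turn one announcement's plain text into segmented sentence records."""
    return [
        Sentence(doc_id, i, sent, segment(sent))
        for i, sent in enumerate(split_sentences(text))
    ]


if __name__ == "__main__":
    for s in preprocess("demo", "甲公司收购乙公司股权。丙公司减持股份。"):
        print(s.index, s.tokens)
```

Passing the segmenter as a parameter keeps the record-building logic independent of any particular tokenizer, so a different segmenter can be swapped in for testing.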

Abstract

The invention provides a Deepdive-based domain text knowledge extraction method comprising the steps of: (1) acquiring the original texts required by a knowledge base construction system and preprocessing them; (2) performing entity linking on the preprocessed texts, finding the target entities corresponding to a preset specific relation, generating entity-relation-entity triples, and forming a set of candidate relation-entity pairs; (3) labeling a number of the candidate relation-entity pairs using a weak supervision method to generate training samples for the Deepdive tool; (4) inputting the training samples into the Deepdive tool for training, and outputting the candidate relation-entity pairs whose probability values are greater than a threshold to form the extracted knowledge base. The method can complete the construction of a domain knowledge base, is highly extensible, and has high practical value for the utilization and extraction of unstructured data.
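Steps (2) to (4) can be illustrated with a small plain-Python sketch. It does not use or imitate the Deepdive tool's own interface: the function names, the negative-cue rule, the seed set of known pairs, and the threshold value of 0.9 are all assumptions made for illustration, and the probabilities passed to `extract` stand in for whatever the trained model would output.

```python
from itertools import permutations
from typing import Dict, Iterable, List, Optional, Set, Tuple

Pair = Tuple[str, str]


def candidate_pairs(entities: Iterable[str]) -> List[Pair]:
    """Step (2): every ordered pair of distinct entities found in one sentence."""
    unique = list(dict.fromkeys(entities))  # drop repeats, keep first-seen order
    return list(permutations(unique, 2))


def weak_label(pair: Pair, sentence: str, known: Set[Pair],
               negative_cues: Iterable[str]) -> Optional[bool]:
    """Step (3): weak supervision with two hypothetical rules.

    Returns True if the pair is in a seed set of known facts, False if the
    sentence contains a negative cue word, and None if no rule applies.
    """
    if pair in known:
        return True
    if any(cue in sentence for cue in negative_cues):
        return False
    return None


def extract(probabilities: Dict[Pair, float], threshold: float = 0.9) -> List[Pair]:
    """Step (4): keep pairs whose probability is strictly above the threshold."""
    return sorted(p for p, prob in probabilities.items() if prob > threshold)


if __name__ == "__main__":
    pairs = candidate_pairs(["CompanyA", "CompanyB"])
    seed = {("CompanyA", "CompanyB")}
    text = "CompanyA acquired shares of CompanyB"
    print([(p, weak_label(p, text, seed, ["denied"])) for p in pairs])
    # Stand-in probabilities; a real run would take them from the trained model.
    print(extract({("CompanyA", "CompanyB"): 0.95, ("CompanyB", "CompanyA"): 0.2}))
```

Pairs labeled `None` are left unlabeled rather than treated as negatives, which is a common choice when the rules cover only part of the candidate set.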

Description

Technical field

[0001] The invention relates to computer natural language processing technology, and specifically to a method for extracting domain text knowledge based on Deepdive.

Background

[0002] Knowledge base construction has real practical significance and broad application prospects. The daily operation of Apple's Siri and Microsoft's Cortana relies on a large knowledge base to quickly return correct answers to users' questions. However, in some vertical fields, such as customer service, finance, and chat robots, knowledge bases for specific relationships are lacking, or the available knowledge bases are incomplete or not updated in a timely manner. If a knowledge base can be constructed automatically for a specific field and certain specific relationships, with high accuracy, it can effectively reduce the manpower and time costs of knowledge base construction and provide better service to downstream applications. [0003...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06F16/33, G06F16/36, G06F17/27, G06N99/00

CPC: G06F16/3344, G06F16/367, G06F40/253, G06N20/00

Inventor: 陈华钧, 陈曦, 张宁豫, 吴朝晖

Owner: ZHEJIANG UNIV
