Deep learning-based structured information extraction method

The technology relates to structured information extraction and deep learning. It is applied in special data processing applications, instruments, and electronic digital data processing, and it addresses problems such as model complexity and learner confusion.

Active Publication Date: 2017-07-07
上海数眼科技发展有限公司
Cites: 3. Cited by: 28.

AI Technical Summary

Problems solved by technology

Second, solutions based on supervised learning require manually labeled data.
[0007] 2. A sentence about a subject can contain multiple objects for one predicate, which further complicates the extraction pattern and confuses the learner.
[0008] 3. The objects to be extracted do not necessarily appear in a single sentence.



Examples


Embodiment Construction

[0055] The implementation of the present invention is described in detail below with reference to the accompanying drawings and examples, so that the way technical means are applied to solve the technical problems and achieve the technical effects can be fully understood and reproduced. Provided there is no conflict, the embodiments of the present invention and the features within each embodiment may be combined with one another, and the resulting technical solutions all fall within the protection scope of the present invention.

[0056] In addition, the steps shown in the flow diagrams of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Although a logical order is shown in the flow diagrams, in some cases the steps shown or described may be performed in an order different from the one given here.

[0057] Specifically, in recent years, deep learning has proven to have stron...



Abstract

The invention discloses a deep learning-based structured information extraction method. The method comprises the following steps: 1) constructing large-scale labeled data by distant supervision: an extractor is built using the existing Wikipedia as the source of distant supervision, wherein a signature and a Wikipedia infobox contain structured facts about an entity, these facts are mentioned in the free-text part of the entity's web page, and the sentences that express the infobox facts are used as training data; 2) integrating prior knowledge into the structured information extraction model, wherein the prior knowledge comprises type and phrase information, and the label of each word in a natural language sentence depends on the words that precede and follow it; 3) using a bidirectional LSTM hidden layer to exploit both past and future input features, with each sentence fed into the bidirectional LSTM model as a sequence; and 4) finally outputting a label sequence, wherein the label sequence consists of true or false labels and has the same length as the input word sequence.
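Steps 1 and 4 of the abstract can be illustrated with a minimal sketch, assuming exact token matching as the way a sentence is aligned with an infobox value. The function name `label_tokens` and the example sentence are hypothetical and are not taken from the patent; the patent does not state how mentions are matched.

```python
from typing import List, Sequence


def label_tokens(tokens: Sequence[str],
                 object_values: Sequence[Sequence[str]]) -> List[bool]:
    """Return one true/false label per token.

    A token is labeled True when it lies inside an exact match of any
    object value (assumption: values come from an infobox and are
    already tokenized the same way as the sentence).
    """
    labels = [False] * len(tokens)
    for value in object_values:
        n = len(value)
        if n == 0:
            continue
        # Slide a window of the value's length over the sentence.
        for start in range(len(tokens) - n + 1):
            if list(tokens[start:start + n]) == list(value):
                for i in range(start, start + n):
                    labels[i] = True
    return labels


# Hypothetical training sentence for the fact (Barack Obama, birthplace, Honolulu).
sentence = "Barack Obama was born in Honolulu , Hawaii .".split()
labels = label_tokens(sentence, [["Honolulu"]])
```

The resulting label sequence has the same length as the input word sequence, which is the output shape the abstract describes. Because `object_values` is a list, a sentence that mentions several objects for the same predicate yields several True spans.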

Description

Technical field

[0001] The invention belongs to the field of information processing, and in particular relates to a method and system for extracting structured information based on deep learning.

Background

[0002] Much research on structured information extraction has gone into collecting structured knowledge about entities from corpora, for example Kylin and DBpedia. The resulting knowledge bases, also known as knowledge graphs, contain rich facts about entities, such as the fact that Barack Obama's birthplace is Honolulu. Entities are usually called subjects (s), properties or aspects are called predicates (p), and values are called objects (o). Because knowledge graphs are so widely applied, extracting structured facts (in the form of SPO triples) from corpora has attracted growing research interest. Here, we focus on the problem of simultaneously extracting structured facts for a large number of predicates across tens of millions of entities. [00...
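The SPO terminology above can be made concrete with a small sketch. The `Triple` type and the `group_objects` helper are hypothetical names chosen for illustration, and the placeholder facts about "Entity A" are invented purely to show one predicate carrying several objects.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One structured fact: subject (s), predicate (p), object (o)."""
    subject: str
    predicate: str
    obj: str


def group_objects(triples: Iterable[Triple]) -> Dict[Tuple[str, str], List[str]]:
    """Collect every object that shares the same (subject, predicate) pair."""
    grouped: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for t in triples:
        grouped[(t.subject, t.predicate)].append(t.obj)
    return dict(grouped)


facts = [
    # Example from the text: Barack Obama's birthplace is Honolulu.
    Triple("Barack Obama", "birthplace", "Honolulu"),
    # Hypothetical placeholders: one predicate with two objects.
    Triple("Entity A", "member", "X"),
    Triple("Entity A", "member", "Y"),
]
grouped = group_objects(facts)
```

Keying on the (subject, predicate) pair keeps multi-valued predicates together, which is the situation the problem statement describes when one sentence contains several objects for a predicate.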


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27
CPC: G06F40/279
Inventor: 谢晨昊, 梁家卿, 肖仰华
Owner: 上海数眼科技发展有限公司