Information extraction method based on bi-directional recurrent neural network

Recurrent neural network technology, applied to natural language processing, addresses problems of conventional approaches such as poor generality, a complex prediction process, and time-consuming, labor-intensive feature construction.

Status: Inactive. Publication Date: 2016-09-21
成都数联铭品科技有限公司

AI Technical Summary

Problems solved by technology

Conventional approaches rely on feature templates, which include single words (first-order) or multi-word phrases (higher-order) within a context window of a specified size, word prefixes and suffixes, part-of-speech tags, and other state features. Constructing these templates is time-consuming and labor-intensive, yet recognition results depend heavily on them. Manually designed templates are often based on the characteristics of only some samples, so they generalize poorly. They usually capture only local context, and each template is applied independently of the others. Prediction can neither rely on longer historical state information nor use longer future information to correct possible earlier errors. The prediction process is complicated, and the results are unlikely to be globally optimal.
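To make the idea of a hand-written feature template concrete, the following is a minimal sketch of the kind of windowed context and affix features described above. The function name `template_features`, the window size, the affix length, and the `<PAD>` marker are all illustrative assumptions; they do not come from the patent.

```python
from typing import Dict, List


def template_features(
    tokens: List[str], i: int, window: int = 2, affix_len: int = 2
) -> Dict[str, str]:
    """Hypothetical hand-written feature template for the token at position i."""
    feats = {"w0": tokens[i]}
    # Context words inside a fixed window; positions outside the text get a pad marker.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        feats[f"w{off:+d}"] = tokens[j] if 0 <= j < len(tokens) else "<PAD>"
    # Prefix and suffix of the current word.
    feats["prefix"] = tokens[i][:affix_len]
    feats["suffix"] = tokens[i][-affix_len:]
    return feats


tokens = ["Chengdu", "A", "B", "Electronics", "Co., Ltd."]
features = template_features(tokens, 3)
```

Every feature here looks at most two tokens away from the current word, which illustrates the locality limitation: nothing outside the window can influence the prediction.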


Examples

Embodiment 1

[0053] For example, the following news text was obtained from the Internet: "Chengdu AB Electronics Co., Ltd., a wholly-owned subsidiary of Chengdu AB Holding Group Co., Ltd., intends to jointly invest with Chengdu CDEF Technology Co., Ltd. and two natural persons to establish Chengdu ABEF Big Data Financial Services Co., Ltd., which will provide commercial big data solutions for financial services to financial institutions, mainly banks." Segmenting this text with a tokenizer gives the following result: "Chengdu / A / B / Holdings / Group / Shares / Co., Ltd. / of / Wholly-owned / Subsidiary / Chengdu / A / B / Electronics / Co., Ltd. / Proposed / United / Chengdu / C / D / E / F / Technology / Co., Ltd. / and / 2 / Name / Natural Person / Investment / Establishment / Chengdu / A / B / E / F / big data / gold / service / limited company / , / for / based on / bank / based / financial / institution / provide / financial / service / of / commercial / big data / solution / ." After word segmentation, a sequence with a length of 55 is formed. After the above-m...
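A segmented sequence like the one above is typically converted to integer indices before it is fed to a neural network. The sketch below shows one common way to do this on a short excerpt of the segmented text. The first-seen indexing scheme and the variable names are assumptions for illustration, not the encoding described in the patent.

```python
# Short excerpt of the segmented text, using the same " / " separator.
segmented = (
    "Chengdu / A / B / Electronics / Co., Ltd. / Proposed / United / "
    "Chengdu / C / D / E / F / Technology / Co., Ltd."
)
tokens = segmented.split(" / ")

# Assign each distinct token the next free index, in order of first appearance.
vocab = {}
for tok in tokens:
    vocab.setdefault(tok, len(vocab))

# The text becomes a sequence of integers, one per token.
ids = [vocab[tok] for tok in tokens]
```

Repeated tokens such as "Chengdu" and "Co., Ltd." map to the same index, so the network sees them as the same basic element wherever they occur.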


Abstract

The invention relates to the field of natural language processing, in particular to an information extraction method based on a bidirectional recurrent neural network (RNN). The method learns features automatically from the basic elements of a text, including characters, words, and punctuation, and models the resulting sequence with an RNN, which removes the need to design features manually as in traditional approaches. In addition, a bidirectional RNN overcomes the information asymmetry of a unidirectional RNN during prediction: the classification of each element in the natural language sequence depends on both the preceding and the following context, so extraction and classification are more accurate. The method is especially suitable for entity name extraction in big data analysis, where it has important application value.
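The bidirectional idea can be sketched in a few lines of pure Python: one simple recurrent pass reads the sequence left to right, a second pass reads it right to left, and the two hidden states at each position are concatenated. This is an illustrative sketch with random, untrained weights and no output layer; the dimensions, the `tanh` cell, and all function names are assumptions rather than the architecture claimed in the patent.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_pass(inputs: List[Vector], w_in: Matrix, w_rec: Matrix) -> List[Vector]:
    """Simple recurrent pass: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def bidirectional_states(
    inputs: List[Vector],
    fwd: Tuple[Matrix, Matrix],
    bwd: Tuple[Matrix, Matrix],
) -> List[Vector]:
    """Concatenate forward and backward hidden states at each position."""
    forward = rnn_pass(inputs, *fwd)
    # Run the second RNN on the reversed sequence, then restore the original order.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


rng = random.Random(0)
embed_dim, hidden_dim, seq_len = 3, 4, 5
inputs = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in range(seq_len)]
fwd = (rand_matrix(hidden_dim, embed_dim, rng), rand_matrix(hidden_dim, hidden_dim, rng))
bwd = (rand_matrix(hidden_dim, embed_dim, rng), rand_matrix(hidden_dim, hidden_dim, rng))
states = bidirectional_states(inputs, fwd, bwd)
```

In this sketch the first half of each state summarizes everything up to that position, and the second half summarizes everything from that position to the end, so a classifier placed on top of `states` would see both directions of context for every element.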

Description

Technical field

[0001] The invention relates to the field of natural language processing, in particular to an information extraction method based on a bidirectional recurrent neural network.

Background

[0002] The rapid development of the Internet has generated a large amount of public web page data and has spurred various emerging industries based on big data technology, such as Internet medical care, Internet education, and corporate or personal credit investigation. The rise and prosperity of these Internet industries is inseparable from the analysis of large amounts of data. However, most of the data obtained directly from web pages is unstructured, and before it can be used, it must be cleaned. Data cleaning has therefore become the most time-consuming and labor-intensive task for major companies. Within data cleaning, the extraction of specific information, especially named entities, is a frequent requirement. For example, when doing...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06N3/02
CPC: G06F40/258, G06F40/279, G06F40/30, G06N3/02
Inventors: 刘世林, 何宏靖
Owner: 成都数联铭品科技有限公司