Event extraction method based on maximum entropy

An event extraction method using maximum entropy technology, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as data sparseness, degraded extraction results, and incomplete event elements, so as to ensure accuracy and completeness, yielding complete element information and accurate event categories.

Active Publication Date: 2015-05-06
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI
Cites: 4; Cited by: 25

AI Technical Summary

Problems solved by technology

However, this method suffers from data sparseness caused by the corpus itself. At the same time, the feature selection and the complexity of the Chi



Embodiment Construction

[0055] The method will be described in detail below in conjunction with the accompanying drawings.

[0056] Figure 1 shows the implementation process of corpus preprocessing and building the trigger word dictionary. The specific steps include:

[0057] Step 1: Corpus preprocessing. Manually collect an event corpus related to food safety and annotate the collected training corpus. Each sentence in the corpus is labeled to mark the event, together with the event's trigger words, its event type information, and the role information of its event elements.

[0058] The quality and scale of the corpus greatly affect the results of machine learning. The corpus used in this method consists of manually collected and screened text that highlights representative events and covers all event types to be processed. Labeling the corpus makes it possible to identify whether a sentence contains event information, as well as the type of event and the role of each element in the event during program processi...
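The labeling described in Step 1 can be pictured as one record per sentence. The following is a minimal sketch only: the `AnnotatedSentence` schema, the field names, the role labels, and the example sentences are all hypothetical and are not taken from the patent text. It also shows one plausible way to collect a trigger word dictionary from such records.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class AnnotatedSentence:
    """One labeled sentence of the training corpus (hypothetical schema)."""
    text: str
    is_event: bool                      # does the sentence describe an event?
    trigger: Optional[str] = None       # trigger word, if it is an event
    event_type: Optional[str] = None    # illustrative labels such as "recall"
    roles: Dict[str, str] = field(default_factory=dict)  # element word -> role


def build_trigger_dictionary(corpus: List[AnnotatedSentence]) -> Dict[str, Set[str]]:
    """Map each labeled trigger word to the event types it was seen with."""
    triggers: Dict[str, Set[str]] = defaultdict(set)
    for sent in corpus:
        if sent.is_event and sent.trigger and sent.event_type:
            triggers[sent.trigger].add(sent.event_type)
    return dict(triggers)


# Invented food-safety examples, for illustration only.
corpus = [
    AnnotatedSentence(
        "The company recalled the milk powder on Monday.",
        True, "recalled", "recall",
        {"company": "agent", "milk powder": "product", "Monday": "time"},
    ),
    AnnotatedSentence(
        "Twelve students were poisoned at the canteen.",
        True, "poisoned", "poisoning",
        {"students": "victim", "canteen": "place"},
    ),
    AnnotatedSentence("The weather was pleasant all week.", False),
]

trigger_dict = build_trigger_dictionary(corpus)
```

Keeping the event type next to each trigger word is a design choice of this sketch; a plain set of trigger words would be enough for filtering candidate sentences.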


Abstract

The invention discloses an event extraction method based on maximum entropy. The method comprises the following steps: (1) constructing a trigger word dictionary and an event element role dictionary; (2) training on the labeled training corpus with a machine learning method to obtain a maximum entropy model MT for judging event types and a maximum entropy model MR for extracting event elements from event sentences; (3) filtering the corpus on which event extraction is to be performed according to the trigger words, and taking the sentences that match the set trigger words as candidate events; (4) classifying the candidate events with the maximum entropy model MT to obtain the event sentences that belong to a set event type; (5) extracting each element word of the events from the event sentences obtained in step (4) according to the event element role dictionary and the maximum entropy model MR, thereby completing event extraction. The disclosed method is widely applicable and highly accurate, and it greatly improves the event extraction effect.
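Steps (3) and (4) can be sketched in a few lines. This is an illustrative stand-in, not the patented implementation: the bag-of-words features, the gradient-ascent trainer, the hyperparameters, the function names (`train_maxent`, `predict_proba`, `extract_event_types`), and the toy sentences are all assumptions. The classifier uses the standard conditional maximum entropy form, p(c | x) proportional to exp of the summed weights of the active features.

```python
import math
import re
from collections import defaultdict


def tokenize(sentence):
    """Lowercase word set used as binary features (an assumed feature choice)."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def predict_proba(w, classes, feats):
    """p(c | x) = exp(sum_f w[f, c]) / Z(x)."""
    scores = {c: sum(w.get((f, c), 0.0) for f in feats) for c in classes}
    m = max(scores.values())            # subtract the max for numerical stability
    exps = {c: math.exp(s - m) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Gradient ascent on the conditional log-likelihood of the training data."""
    classes = sorted(set(labels))
    w = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in zip(samples, labels):
            probs = predict_proba(w, classes, feats)
            for f in feats:
                grad[(f, gold)] += 1.0          # empirical feature count
                for c in classes:
                    grad[(f, c)] -= probs[c]    # expected count under the model
        for key, g in grad.items():
            w[key] += lr * g / len(samples)
    return w, classes


def extract_event_types(sentences, trigger_words, w, classes):
    """Step (3): keep sentences with a trigger word. Step (4): classify them."""
    results = []
    for s in sentences:
        feats = tokenize(s)
        if not (trigger_words & feats):
            continue                            # not a candidate event
        probs = predict_proba(w, classes, feats)
        results.append((s, max(probs, key=probs.get)))
    return results


# Toy training data standing in for the labeled corpus (invented examples).
train = [
    ("the company recalled the contaminated milk powder", "recall"),
    ("the supermarket recalled the frozen dumplings", "recall"),
    ("ten students were poisoned by the canteen food", "poisoning"),
    ("a family was poisoned after eating wild mushrooms", "poisoning"),
]
triggers = {"recalled", "poisoned"}
w_mt, classes_mt = train_maxent([tokenize(s) for s, _ in train],
                                [t for _, t in train])

docs = [
    "The factory recalled the tainted juice.",
    "The weather was fine.",
    "Three diners were poisoned by seafood.",
]
events = extract_event_types(docs, triggers, w_mt, classes_mt)
```

The weather sentence contains no trigger word, so it is dropped before classification. A model playing the part of MR in step (5) could be trained with the same routine on per-word features labeled with roles, though this excerpt does not specify the features used.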

Description

Technical field

[0001] The invention belongs to the field of information technology and relates to an event extraction method, which is mainly applied in the fields of natural language processing, data mining, information retrieval, food safety and the like.

Background

[0002] With the rapid popularization and development of the Internet, a large amount of data is generated and disseminated over the network, and the total amount of information is growing exponentially. This information is characterized by large volume, non-uniform structure, and high redundancy. Traditional information acquisition methods can hardly meet these requirements. How to quickly select the information of interest from this vast ocean of data has become an urgent problem. Research on information extraction arose against this background.

[0003] The purpose of information extraction is to identify and extract informati...


Application Information

IPC(8): G06F17/30
CPC: G06F40/247, G06F40/30
Inventor: 崔现鹏, 黎建辉, 杨风雷, 王鹏尧, 汪海燕, 周昊
Owner: COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI