
Text event extraction method combining sparse coding and a structured perceptron

A sparse coding and event extraction technology, applied in unstructured text data retrieval, text database clustering/classification, special data processing applications, etc. It addresses problems such as long extraction time and lack of universality, and improves convenience and interpretability.

Active Publication Date: 2017-04-26
ZHEJIANG UNIV
Cites: 3; Cited by: 19

AI Technical Summary

Problems solved by technology

However, most of the text features used by existing event extraction methods require manual intervention, and the extraction process is time-consuming and not universal.
In recent years, neural networks and deep learning have become research hotspots, and unsupervised methods for extracting distributed word vectors are increasingly common in the NLP field. This way of learning distributed word vectors is simple and general, and it does not require manually labeled data. Its drawback is that it lacks the interpretability and flexibility of common sparse feature representations, so some studies have converted distributed word vectors into a sparse representation that is easy to use in traditional NLP problems.
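As an illustration of turning a dense distributed word vector into a sparse representation, here is a minimal Python sketch that solves a lasso problem with iterative soft-thresholding (ISTA). The text does not say which sparse coding algorithm is used, so the choice of ISTA, the fixed dictionary `atoms`, the penalty `lam`, the step size, and the toy 3-dimensional vector are all assumptions made for this example.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, atoms, lam=0.1, step=0.1, iters=500):
    """ISTA for min_a 0.5 * ||x - sum_k a[k] * atoms[k]||^2 + lam * ||a||_1.

    `step` must stay below the reciprocal of the largest eigenvalue of the
    dictionary's Gram matrix for the iteration to converge.
    """
    a = [0.0] * len(atoms)
    for _ in range(iters):
        # Reconstruct x from the current code and take the residual.
        recon = [sum(a[k] * atoms[k][j] for k in range(len(atoms)))
                 for j in range(len(x))]
        resid = [recon[j] - x[j] for j in range(len(x))]
        # Gradient step on the squared error, then L1 shrinkage.
        for k in range(len(atoms)):
            grad = sum(atoms[k][j] * resid[j] for j in range(len(x)))
            a[k] = soft_threshold(a[k] - step * grad, step * lam)
    return a


# Hypothetical overcomplete dictionary: 4 atoms for 3-dimensional vectors.
s = 2 ** -0.5
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
dense_vector = [1.0, 0.0, 0.0]   # stand-in for a dense word vector
code = sparse_code(dense_vector, atoms)
print(code)  # most coefficients end up at (or very near) zero
```

Each non-zero coefficient in `code` ties the word to a single dictionary atom, which is what makes such a representation easier to inspect than the dense vector. In practice the dictionary would be learned from the word vectors rather than fixed by hand.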


Examples


Embodiment Construction

[0044] The present invention will be further elaborated below in conjunction with the accompanying drawings and specific embodiments.

[0045] The present invention uses a sparse coding representation of neural-network-based distributed word vector features to strengthen the text features, and uses a structured perceptron model to learn the recognition of event trigger words and event participants jointly, thereby realizing event extraction. The text event extraction method combining sparse coding and a structured perceptron includes the following steps:

[0046] 1) Annotate the text data according to the Automatic Content Extraction (ACE) or Rich Entities, Relations, and Events (Rich ERE) specification to build training samples;

[0047] 2) Take the extracted entities as candidates for event trigger words and event arguments, and extract text features (part-of-speech tags, dependency parses, etc.);

[0048] 3) Further extract text distributed word vector...
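The text feature extraction in step 2) can be pictured with a short Python sketch. It assumes that part-of-speech tags and dependency parses have already been produced by an external tool; the `Token` fields, the feature names, and the example sentence are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    pos: str      # part-of-speech tag from an external tagger (assumed)
    head: int     # index of the dependency head, -1 for the root
    deprel: str   # dependency relation to the head


def token_features(tokens, i):
    """Return sparse indicator features for the candidate at position i."""
    tok = tokens[i]
    feats = {
        f"word={tok.word.lower()}": 1.0,
        f"pos={tok.pos}": 1.0,
        f"deprel={tok.deprel}": 1.0,
    }
    if tok.head >= 0:
        # Lexical feature taken from the dependency head.
        feats[f"head_word={tokens[tok.head].word.lower()}"] = 1.0
    if i > 0:
        # Context feature: tag of the preceding token.
        feats[f"prev_pos={tokens[i - 1].pos}"] = 1.0
    return feats


# Hand-annotated toy sentence: "Troops attacked the village"
sentence = [
    Token("Troops", "NNS", 1, "nsubj"),
    Token("attacked", "VBD", -1, "root"),
    Token("the", "DT", 3, "det"),
    Token("village", "NN", 1, "obj"),
]
features = token_features(sentence, 1)
print(sorted(features))
```

Features of this kind are discrete and sparse by construction, which is why a sparse encoding of the word vectors can be appended to the same feature dictionary without changing the classifier.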

Abstract

The invention discloses a text event extraction method combining sparse coding and a structured perceptron. The method comprises the following steps: 1) annotating text data according to the ACE or RichERE specification to build training samples; 2) taking the extracted entities as candidates for event trigger words and event arguments, and extracting text features; 3) further extracting distributed word vector features of the text and learning sparse coding features from them; 4) using the training samples and the extracted text features to train a structured perceptron classifier that recognizes event trigger words and event arguments in the text at the same time; 5) for new text data, processing it as in step 1) and feeding it to the trained structured perceptron classifier to extract text event information. The method has the following beneficial effects: on the one hand, sparse coding representations of neural-network-based distributed word vector features enhance the text features; on the other hand, the structured perceptron model learns event trigger word recognition and event participant recognition jointly. A better event extraction result is therefore obtained.
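The joint recognition of trigger words and event arguments (the "event parameters") can be sketched as a small structured perceptron in Python. This is a simplified illustration and not the patented implementation: it decodes by exhaustively enumerating every joint labeling, which only works for tiny label sets, and the label inventories, feature templates, and training examples are all hypothetical.

```python
from collections import defaultdict
from itertools import product

EVENT_TYPES = ["None", "Attack"]          # hypothetical trigger labels
ROLES = ["None", "Attacker", "Target"]    # hypothetical argument roles


def phi(x, y):
    """Joint feature map over one trigger label and one role per entity."""
    trigger_word, entities = x            # entities: [(entity_type, side), ...]
    event_type, roles = y
    feats = defaultdict(float)
    feats[f"trig={trigger_word}|type={event_type}"] += 1.0
    for (etype, side), role in zip(entities, roles):
        feats[f"type={event_type}|ent={etype}|side={side}|role={role}"] += 1.0
    return feats


def candidates(x):
    """Enumerate every joint labeling; a non-event carries no roles."""
    n = len(x[1])
    yield ("None", ("None",) * n)
    for event_type in EVENT_TYPES[1:]:
        for roles in product(ROLES, repeat=n):
            yield (event_type, roles)


def score(w, feats):
    return sum(w[f] * v for f, v in feats.items())


def predict(w, x):
    """Return the highest-scoring joint labeling under weights w."""
    return max(candidates(x), key=lambda y: score(w, phi(x, y)))


def train(data, epochs=5):
    """Perceptron update: reward the gold structure, penalize the prediction."""
    w = defaultdict(float)
    for _ in range(epochs):
        for x, gold in data:
            pred = predict(w, x)
            if pred != gold:
                for f, v in phi(x, gold).items():
                    w[f] += v
                for f, v in phi(x, pred).items():
                    w[f] -= v
    return w


data = [
    (("attacked", [("PER", "left"), ("GPE", "right")]),
     ("Attack", ("Attacker", "Target"))),
    (("visited", [("PER", "left"), ("FAC", "right")]),
     ("None", ("None", "None"))),
]
weights = train(data)
print(predict(weights, data[0][0]))
```

Because the trigger label and the argument roles are scored together, a mistake in either part of the structure triggers a single update over the whole labeling, which is the sense in which the two recognition tasks are learned at the same time. The sparse coding features from step 3) would enter through `phi` as additional feature entries.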

Description

Technical field

[0001] The invention relates to event extraction, in particular to a text event extraction method combining sparse coding and a structured perceptron.

Background technique

[0002] An event is a thing that occurs or appears. An event involves the entities (people, objects, etc.) that participate in it or are affected by it, as well as aspects of space and time. Understanding events and their descriptions in text data is very important, and event extraction is often a key component of applications such as machine reading, news summarization, information retrieval, and knowledge base construction.

[0003] Usually, the goal of the event extraction task is to extract event-related trigger words and participants (persons or objects) from the text. Current state-of-the-art methods for event extraction generally include three steps: first, use pre-trained named entity recognition tools to extract entities such as people, institutions...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/313, G06F16/35, G06F40/211, G06F40/289
Inventor: 汤斯亮, 吴飞, 杨启凡, 邵健, 郝雷光, 庄越挺
Owner: ZHEJIANG UNIV