Event extraction based sensitive information monitoring method

Technology for event extraction and sensitive information monitoring, applied in special data processing applications, instruments, electrical digital data processing, and similar fields. It addresses problems such as incomplete event elements, sparse data, and degraded extraction results, with the effect of ensuring accuracy and integrity, producing accurate event categories, and improving monitoring efficiency.

Active Publication Date: 2015-04-29
COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI

AI Technical Summary

Problems solved by technology

However, this method suffers from data sparseness caused by the corpus itself. At the same time, the feature selection and the complexity of the Chi



Examples


Embodiment Construction

[0056] The method will be described in detail below with reference to the accompanying drawings.

[0057] Figure 1 shows the process of corpus preprocessing and trigger word dictionary construction. The specific steps include:

[0058] Step 1: Corpus preprocessing. Manually collect a corpus of food-safety-related events and annotate the collected training corpus: mark each sentence that describes an event and, for each event, label the trigger word, the event type, and the role of each event element.
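The annotation described in Step 1 can be pictured as one record per sentence, from which a trigger word dictionary is then collected. The following is only a minimal sketch: the `AnnotatedSentence` schema, the English example sentences, the role names, and the counting scheme in `build_trigger_dictionary` are all assumptions made for illustration and are not taken from the patent.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class AnnotatedSentence:
    """One labeled training sentence (hypothetical schema, not from the source)."""
    text: str
    is_event: bool
    trigger: str = ""      # trigger word; empty for non-event sentences
    event_type: str = ""   # event type label; the label set is an assumption
    roles: dict = field(default_factory=dict)  # role name -> element text


def build_trigger_dictionary(corpus):
    """Map each labeled trigger word to the event types it occurs with, with counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        if sentence.is_event and sentence.trigger:
            counts[sentence.trigger][sentence.event_type] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {trigger: dict(types) for trigger, types in counts.items()}


# Invented example corpus: two event sentences and one non-event sentence.
corpus = [
    AnnotatedSentence(
        text="The regulator recalled the contaminated milk powder on Monday.",
        is_event=True,
        trigger="recalled",
        event_type="recall",
        roles={"agent": "The regulator", "object": "milk powder", "time": "Monday"},
    ),
    AnnotatedSentence(
        text="The company recalled two batches of infant formula.",
        is_event=True,
        trigger="recalled",
        event_type="recall",
        roles={"agent": "The company", "object": "infant formula"},
    ),
    AnnotatedSentence(text="Milk powder is sold in most supermarkets.", is_event=False),
]

trigger_dictionary = build_trigger_dictionary(corpus)
```

Keeping the per-type counts alongside each trigger word is a design choice of this sketch; a plain set of trigger words would be enough for simple filtering.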

[0059] The quality and scale of the corpus greatly affect machine learning results. The corpus used in this method consists entirely of manually collected and screened texts, chosen so that the events are representative and all event types to be processed are covered. Because the corpus is annotated, the program can identify during processing whether a sentence contains event information, as well as the type of the event and the roles of each element in t...


Abstract

The invention discloses a sensitive information monitoring method based on event extraction. The method comprises the steps of: 1) creating a trigger word dictionary and an event element role dictionary; 2) training on the annotated training corpus with a machine learning method to obtain a maximum entropy model MT for determining the type of an event and a maximum entropy model MR for extracting event elements from event sentences; 3) filtering the corpus from which events are to be extracted according to the trigger words, and treating the sentences that match the set trigger words as candidate events; 4) classifying the candidate events with the maximum entropy model MT to obtain the event sentences of the set event types; 5) extracting each element of the event from the event sentences obtained in step 4) according to the event element role dictionary and the maximum entropy model MR, which completes the event extraction; then matching the extracted event against the monitored events and, if the match succeeds, determining that the extracted event is sensitive information. The method greatly improves the efficiency of sensitive information monitoring.
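Steps 3) and 4) and the final matching step can be sketched as follows. This is a hedged illustration, not the patented implementation: `TinyMaxEnt` is a minimal multinomial logistic (maximum entropy) classifier written for this example and stands in for the model MT, the bag-of-words features, the English sentences, the `TRIGGERS` set, and the `MONITORED_TYPES` set are invented, and matching by event type alone is an assumption. Element extraction with the model MR (step 5) is left out.

```python
import math
from collections import defaultdict


class TinyMaxEnt:
    """Minimal maximum entropy (multinomial logistic) classifier over word features."""

    def __init__(self, labels):
        self.labels = list(labels)
        self.weights = defaultdict(float)  # (feature, label) -> weight

    def probabilities(self, features):
        scores = [sum(self.weights[(f, y)] for f in features) for y in self.labels]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return {y: e / total for y, e in zip(self.labels, exps)}

    def train(self, samples, epochs=200, rate=0.5):
        # Stochastic gradient ascent on the conditional log-likelihood.
        for _ in range(epochs):
            for features, gold in samples:
                probs = self.probabilities(features)
                for y in self.labels:
                    gradient = (1.0 if y == gold else 0.0) - probs[y]
                    for f in features:
                        self.weights[(f, y)] += rate * gradient

    def predict(self, features):
        probs = self.probabilities(features)
        return max(probs, key=probs.get)


def tokens(sentence):
    """Very rough tokenizer, sufficient for the invented English examples."""
    return sentence.lower().replace(".", "").split()


TRIGGERS = {"recalled", "poisoned"}  # hypothetical trigger word dictionary
MONITORED_TYPES = {"poisoning"}      # hypothetical monitored event types

training = [
    (tokens("The regulator recalled the contaminated milk"), "recall"),
    (tokens("The factory recalled two batches of formula"), "recall"),
    (tokens("Ten students were poisoned by spoiled rice"), "poisoning"),
    (tokens("Guests were poisoned at the banquet"), "poisoning"),
]
mt = TinyMaxEnt(labels=["recall", "poisoning"])  # stand-in for model MT
mt.train(training)


def extract_events(sentences):
    events = []
    for sentence in sentences:
        words = tokens(sentence)
        if not TRIGGERS.intersection(words):  # step 3: trigger word filtering
            continue
        events.append((mt.predict(words), sentence))  # step 4: event typing
    return events


incoming = [
    "Several workers were poisoned by canned fish.",
    "The store recalled the canned fish.",
    "Canned fish is popular in summer.",
]
events = extract_events(incoming)
# Final step: an extracted event that matches a monitored type is flagged.
sensitive = [sentence for event_type, sentence in events if event_type in MONITORED_TYPES]
```

Filtering by trigger words before classification means the classifier only runs on candidate sentences, which is where an efficiency gain in such a pipeline would come from.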

Description

Technical field

[0001] The invention belongs to the field of information technology and relates to a sensitive information monitoring method based on event extraction, mainly applied in natural language processing, data mining, information retrieval, food safety, and related fields.

Background

[0002] With the rapid spread and development of the Internet, a large amount of data is generated and disseminated on the network, and the total amount of information grows at an exponential rate. This information is characterized by large volume, non-uniform structure, and high redundancy. Traditional information acquisition methods can no longer meet the requirements, and quickly selecting the information of interest from this vast ocean of data has become an urgent problem. Research on information extraction arose in this context.

[0003] The purpose of information extraction is to identify and e...


Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/353, G06F16/374
Inventor: 杨风雷, 崔现鹏, 黎建辉, 王鹏尧, 汪海燕, 周昊
Owner COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI