BERT-based multi-model fusion event subject extraction method
A multi-model fusion technique for event subject extraction, applied in the field of data processing. It addresses the poor recall, low portability, and incomplete event-type coverage of existing methods, improving extraction accuracy while keeping the fused models diverse.
Pending Publication Date: 2020-06-09
民生科技有限责任公司
AI Technical Summary
Problems solved by technology
Pattern-matching methods depend on the specific form of the text (language, domain, document format, etc.), and obtaining the templates is time-consuming, laborious, and requires considerable expertise. Moreover, it is difficult for the formulated patterns to cover all event types, and when the corpus changes the patterns must be obtained again.
Because pattern-matching methods have low portability and poor recall, event subject extraction based on machine learning has become the mainstream approach.
Embodiment
[0079] 1. Data preprocessing.
[0080] Clean and process the data, including the following steps:
[0081] Remove special symbols that are not useful for training the language model, such as ▽[[+_+]];
[0082] Replace consecutive spaces with commas;
[0083] For samples with multiple event subjects, reorder the event subjects to match the order in which they appear in the original text;
[0084] Divide the data into training samples and prediction samples.
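The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the noise strings in `NOISE`, the reading of "consecutive spaces" as runs of two or more whitespace characters, the ratio-based split, and all function names are assumptions introduced here.

```python
import random
import re
from typing import List, Tuple

# Assumption: the source names only ▽[[+_+]] as an example of a useless symbol,
# so this list is illustrative and would be extended for a real corpus.
NOISE = ["▽", "[[+_+]]"]


def clean_text(text: str) -> str:
    """Remove noise symbols, then replace each run of spaces with a comma."""
    for symbol in NOISE:
        text = text.replace(symbol, "")
    # Assumption: "consecutive spaces" means two or more whitespace characters.
    return re.sub(r"\s{2,}", ",", text.strip())


def order_subjects(text: str, subjects: List[str]) -> List[str]:
    """Sort event subjects by first appearance in the text.

    Subjects that do not occur in the text are placed last.
    """
    def position(subject: str) -> Tuple[bool, int]:
        idx = text.find(subject)
        return (idx < 0, idx)

    return sorted(subjects, key=position)


def split_samples(samples: list, train_ratio: float = 0.8, seed: int = 0):
    """Shuffle a copy of the samples and split it into training and prediction sets.

    Assumption: a random ratio split; the source does not say how the data is divided.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Sorting subjects by their first position keeps the label order aligned with the text, which is what step [0083] asks for.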
[0085] 2. Embed the training samples as vectors.
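As a rough illustration of what turning a sample into a BERT input sequence involves, the sketch below builds the three standard BERT inputs (token ids, attention mask, segment ids) for a single sentence. The toy character-level vocabulary, the function names, and the `max_len` parameter are assumptions made for this example; a real system would use the vocabulary shipped with the pre-trained BERT model.

```python
from typing import Dict, List

PAD, UNK, CLS, SEP = "[PAD]", "[UNK]", "[CLS]", "[SEP]"


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Build a toy character-level vocabulary (stand-in for the real BERT vocab)."""
    vocab = {PAD: 0, UNK: 1, CLS: 2, SEP: 3}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def encode(text: str, vocab: Dict[str, int], max_len: int) -> Dict[str, List[int]]:
    """Encode one sentence as [CLS] tokens [SEP], truncated and padded to max_len."""
    # Reserve two positions for the [CLS] and [SEP] markers.
    tokens = [CLS] + list(text)[: max_len - 2] + [SEP]
    input_ids = [vocab.get(tok, vocab[UNK]) for tok in tokens]
    attention_mask = [1] * len(input_ids)

    # Pad to a fixed length; padded positions are masked out.
    padding = max_len - len(input_ids)
    input_ids += [vocab[PAD]] * padding
    attention_mask += [0] * padding

    # A single-sentence input uses segment id 0 everywhere.
    token_type_ids = [0] * max_len
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "token_type_ids": token_type_ids,
    }
```

The same encoding would be applied to the prediction samples so that both sets share one input format.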
Abstract
The invention relates to a BERT-based multi-model fusion event subject extraction method and belongs to the technical field of data processing. The method comprises the following steps: preprocessing crawled data to obtain training samples and prediction samples; performing an embedding operation on the training samples and the prediction samples to obtain a training sample input sequence and a prediction sample input sequence for the BERT pre-trained network; adopting a plurality of single models of different complexity, each based on the BERT pre-trained network, training the single models with the training sample input sequence, and optimizing the network parameters; inputting the prediction sample input sequence into the trained single models and outputting a plurality of model results; and fusing the model results to obtain the final prediction result for each prediction sample. Because models of different complexity are adopted, the diversity of the models is guaranteed; the parameters are tuned during training, the detection results of the multiple models are fused, and the detection accuracy is further improved.
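The abstract does not state how the model results are fused. As one possible reading, the sketch below fuses the outputs of several single models by majority vote. The voting rule, the tie-breaking by model order, and the function names are all assumptions for illustration, not the method claimed by the invention.

```python
from collections import Counter
from typing import List


def fuse_predictions(model_outputs: List[str]) -> str:
    """Return the event subject predicted by the most models.

    Assumption: plain majority voting; ties go to the earliest model in the list.
    """
    if not model_outputs:
        raise ValueError("at least one model output is required")
    counts = Counter(model_outputs)
    best = max(counts.values())
    for prediction in model_outputs:
        if counts[prediction] == best:
            return prediction
    raise AssertionError("unreachable: some prediction always has the best count")


def fuse_batch(per_model_outputs: List[List[str]]) -> List[str]:
    """Fuse predictions sample by sample.

    per_model_outputs[m][i] is the prediction of model m for sample i.
    """
    return [fuse_predictions(list(votes)) for votes in zip(*per_model_outputs)]
```

Voting only helps when the single models make different mistakes, which is consistent with the abstract's emphasis on using models of different complexity.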
Description
Technical field
[0001] The invention relates to the technical field of data processing, in particular to a BERT-based multi-model fusion method for extracting event subjects.
Background art
[0002] Event identification is one of the important tasks in public opinion monitoring and in the financial field. In the financial field, "events" are an important decision-making reference for investment analysis and asset management. The event subject extraction task for the financial field is a limited-domain event extraction task, and is one of the important links in information extraction and knowledge graph construction. The complexity of "event recognition" lies in judging both the event type and the event subject: only a subject involved in a specific event type is an extraction target.
[0003] There are currently two main types of methods: pattern-matching-based methods and machine-learning-based methods. [...
Application Information
IPC(8): G06K9/62
CPC: G06F18/214, G06F18/25
Inventor: 李振, 刘恒, 赵兴莹, 秦培歌, 李勇辉
Owner: 民生科技有限责任公司