Named entity recognition method and system for legal documents based on multi-strategy fusion

A named entity recognition and multi-strategy technology, applied in the fields of instruments, electrical digital data processing, computing, etc. It addresses the problem that existing approaches struggle to meet the requirements of named entity recognition, with the effect of improving accuracy and recall, reducing dependence, and reducing the burden.
CN110807328A · Active · Publication Date: 2020-02-18 · SOUTH CHINA NORMAL UNIVERSITY

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: SOUTH CHINA NORMAL UNIVERSITY
Publication Date: 2020-02-18


Abstract

The invention discloses a named entity recognition method and system for legal documents based on multi-strategy fusion. The method comprises the following steps: building a source data corpus, performing part-of-speech tagging and sequence tagging on the source data corpus, and pre-training the model; training on the labeled data with a BiLSTM-Attention-CRF (Bidirectional Long Short-Term Memory, Attention, Conditional Random Field) model to obtain a trained first model; improving the trained first model; establishing a target data corpus, randomly extracting data from the legal document target data, and generating a plurality of training sets; performing transfer learning on the plurality of training sets by training the improved first model on each of them, to obtain one trained model per training set; and integrating these models with a voting mechanism from ensemble learning to obtain a second model, which performs named entity recognition on legal documents to produce the final recognition result. The method improves the accuracy and recall of named entity recognition when annotated corpora are insufficient.
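The random extraction and voting steps described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `make_training_sets` and `vote`, the choice of sampling with replacement, the tie-breaking rule, and the BIO-style labels (`B-PER`, `I-PER`, `B-ORG`, `O`) are all assumptions made for illustration, and the BiLSTM-Attention-CRF models themselves are replaced by hard-coded predictions.

```python
import random
from collections import Counter


def make_training_sets(corpus, n_sets, set_size, seed=0):
    """Draw n_sets random training sets from the target corpus.

    Assumption: sampling is done with replacement (bootstrap style);
    the source only says the data is "randomly extracted".
    """
    rng = random.Random(seed)
    return [[rng.choice(corpus) for _ in range(set_size)] for _ in range(n_sets)]


def vote(predictions):
    """Majority vote per token over label sequences from several models.

    predictions: one label sequence per model, all of the same length.
    Assumption: ties go to the earliest model in the list.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all label sequences must have the same length")
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # Pick the first label, in model order, that reaches the top count.
        voted.append(next(lab for lab in labels if counts[lab] == best))
    return voted


# Hypothetical per-token outputs of three models trained on different sets.
model_a = ["B-PER", "I-PER", "O", "B-ORG"]
model_b = ["B-PER", "O", "O", "B-ORG"]
model_c = ["B-PER", "I-PER", "O", "O"]

final_labels = vote([model_a, model_b, model_c])
# final_labels == ["B-PER", "I-PER", "O", "B-ORG"]
```

Voting per token is the simplest way to combine sequence labelers; it can yield label sequences that no single model would produce, so a real system might vote on whole entity spans instead.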

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, in particular to a named entity recognition method and system for multi-strategy fusion of legal documents.

Background

[0002] Named entities are person names, organization names, place names, and all other entities identified by name. They are the basic information elements of a text, an important carrier of information, and the basis for correctly understanding and processing text. Chinese named entity recognition is one of the basic tasks in natural language processing. Its main task is to identify and classify the named entities and meaningful phrases that appear in a text, mainly person names, place names, organization names, time expressions, dates, numeric expressions, etc. The accuracy and recall of named entity recognition directly determine the performance of the whole process of language unders...

Claims
