Model building method and system, paragraph label obtaining method and medium

A model-building technology applied in the field of natural language processing transfer learning, which addresses problems such as differing label definitions, high complexity, and limited classification capability.

Pending Publication Date: 2021-04-23
CHENGDU UNION BIG DATA TECH CO LTD

AI Technical Summary

Problems solved by technology

Although a model trained on civil judgment documents learns the semantic information of their paragraphs, part of its effectiveness comes from learned correlations between the context-specific paragraph labels of civil judgment documents. Because paragraph labels are defined differently for civil judgment documents and for other types of judgment documents, the model cannot be used directly to predict paragraph classes for other types of judgment documents.
[0006] Traditional "convolutional/recurrent neural network + fully connected classification layer" models, represented by TextCNN and LSTM and their derivatives, have limited ability to model long texts and weak ability to represent complex texts, which directly limits their ability to classify the structure of judgment documents.
Although "pre-trained language model + fully connected classification layer" models, represented by BERT and XLNet, greatly improve text representation, they require a completely separate fine-tuning process for each type of classification task, which wastes the information shared across multiple types of judgment documents and also brings high complexity.

Examples

Embodiment 1

[0058] Figure 1 is a schematic flow chart of a method for establishing a structured model of judgment documents. Embodiment 1 of the present invention provides such a method, which includes:

[0059] Collecting all judgment document data from the database to obtain pre-training data;

[0060] Defining paragraph labels for different types of judgment documents;

[0061] Marking the paragraphs of different types of judgment documents to obtain training data;

[0062] Constructing a judgment document structured model;

[0063] Constructing a pre-training task, and converting the input sequence of the model into word-vector input through the pre-trained judgment document structured model;

[0064] Training the pre-trained judgment document structured model with the training data to obtain the trained judgment document structured model;

[0065] Debugging the trained judgment document structured model to obtain the final str...
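
As an illustration of the labeling steps above, the following is a minimal sketch of how marked paragraphs might be turned into text-to-text training pairs, with the label written out as target text. It assumes details the source does not give: the label names in `LABELS_BY_TYPE`, the prefix wording in `task_prefix`, and the `TextPair` container are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical label sets: the source says labels are defined per document
# type but does not list them, so these names are placeholders.
LABELS_BY_TYPE: Dict[str, List[str]] = {
    "civil": ["parties", "claims", "facts", "reasoning", "ruling"],
    "criminal": ["parties", "charges", "facts", "reasoning", "sentence"],
}


@dataclass(frozen=True)
class TextPair:
    source: str  # task prefix + paragraph text (model input)
    target: str  # paragraph label written out as text (model output)


def task_prefix(doc_type: str) -> str:
    # Assumed prefix wording; the source only says that a prefix is added.
    return f"label {doc_type} judgment paragraph: "


def make_pair(doc_type: str, paragraph: str, label: str) -> TextPair:
    """Build one training pair, rejecting labels not defined for the type."""
    if doc_type not in LABELS_BY_TYPE:
        raise KeyError(f"unknown document type: {doc_type}")
    if label not in LABELS_BY_TYPE[doc_type]:
        raise ValueError(f"label {label!r} is not defined for {doc_type!r}")
    return TextPair(task_prefix(doc_type) + paragraph.strip(), label)


def build_training_data(
    annotated: Iterable[Tuple[str, str, str]]
) -> List[TextPair]:
    """Convert (doc_type, paragraph, label) triples into training pairs."""
    return [make_pair(t, p, l) for t, p, l in annotated]


if __name__ == "__main__":
    pairs = build_training_data([
        ("civil", "The plaintiff requests repayment of the loan.", "claims"),
        ("criminal", "The defendant is sentenced to two years.", "sentence"),
    ])
    for pair in pairs:
        print(pair.source, "->", pair.target)
```

Because each document type keeps its own label set and the label is emitted as text, one model can in principle be trained on pairs from several document types at once, which matches the motivation stated in the technical summary.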

Embodiment 2

[0081] Figure 2 is a schematic diagram of the composition of a system for establishing a structured model of judgment documents. Embodiment 2 of the present invention provides such a system, which includes:

[0082] A collection unit, used to collect all judgment document data from the database to obtain pre-training data;

[0083] A definition unit, used to define the paragraph labels of different types of judgment documents;

[0084] A marking unit, used to mark the paragraph labels of different types of judgment documents to obtain training data;

[0085] A construction unit, used to construct a judgment document structured model;

[0086] A pre-training unit, used to construct the pre-training task and convert the input sequence of the model into word-vector input through the pre-trained judgment document structured model;

[0087] A training unit, used to apply the training data to train the pre-...
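
The units listed above could be wired together as an ordered pipeline. The sketch below is only one possible arrangement: the `ModelBuildingSystem` class, the shared state dictionary, and every unit body are hypothetical stand-ins that record what each unit would produce, not the system described in the source. Only the units named in the visible text are included.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Unit = Callable[[State], State]


class ModelBuildingSystem:
    """Runs registered units in order, passing a shared state dict along."""

    def __init__(self) -> None:
        self._units: List[Tuple[str, Unit]] = []

    def register(self, name: str, unit: Unit) -> "ModelBuildingSystem":
        self._units.append((name, unit))
        return self

    def run(self) -> State:
        state: State = {"log": []}
        for name, unit in self._units:
            state = unit(state)
            state["log"].append(name)
        return state


# Placeholder units: each one only records a dummy result in the state.
def collection_unit(state: State) -> State:
    state["pretraining_data"] = ["<judgment document text>"]
    return state


def definition_unit(state: State) -> State:
    state["labels"] = {"civil": ["facts", "ruling"]}  # hypothetical labels
    return state


def marking_unit(state: State) -> State:
    state["training_data"] = [("civil", "<paragraph>", "facts")]
    return state


def construction_unit(state: State) -> State:
    state["model"] = "untrained"
    return state


def pretraining_unit(state: State) -> State:
    assert state["model"] == "untrained" and state["pretraining_data"]
    state["model"] = "pre-trained"
    return state


def training_unit(state: State) -> State:
    assert state["model"] == "pre-trained" and state["training_data"]
    state["model"] = "trained"
    return state


def build_system() -> ModelBuildingSystem:
    return (
        ModelBuildingSystem()
        .register("collection", collection_unit)
        .register("definition", definition_unit)
        .register("marking", marking_unit)
        .register("construction", construction_unit)
        .register("pre-training", pretraining_unit)
        .register("training", training_unit)
    )


if __name__ == "__main__":
    print(build_system().run()["log"])
```

Keeping each unit as a separate callable mirrors the one-unit-per-step layout of the embodiment and makes the ordering dependency explicit: the pre-training unit fails if the construction unit has not run first.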

Embodiment 3

[0091] Embodiment 3 of the present invention provides a method for obtaining paragraph labels of judgment documents, the method comprising:

[0092] Obtaining the judgment document data to be processed;

[0093] Adding the task prefix to the paragraphs of the judgment document data to be processed to obtain the input data;

[0094] Inputting the input data into the judgment document structured model established by the method for establishing the judgment document structured model; the model then outputs the paragraph labels of the judgment documents to be processed.
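
The three steps above can be sketched as a small inference routine. This is a minimal illustration under stated assumptions: the model is represented as any callable that maps input text to label text, the prefix string is made up, and `toy_model` is a keyword rule standing in for the trained model so that the example runs on its own.

```python
from typing import Callable, List

# A text-to-text model is assumed to be any callable: input text -> label text.
TextToTextModel = Callable[[str], str]


def add_task_prefix(paragraphs: List[str], prefix: str) -> List[str]:
    """Prepend the task prefix to every paragraph to form the input data."""
    return [prefix + paragraph for paragraph in paragraphs]


def get_paragraph_labels(
    paragraphs: List[str], model: TextToTextModel, prefix: str
) -> List[str]:
    """Return one label string per paragraph, in the original order."""
    inputs = add_task_prefix(paragraphs, prefix)
    return [model(text).strip() for text in inputs]


def toy_model(text: str) -> str:
    # Stand-in only: a keyword rule, NOT the trained model from the source.
    return "ruling" if "ordered" in text.lower() else "facts"


if __name__ == "__main__":
    document = [
        "The court finds that the loan was made.",
        "It is ordered that the defendant repay.",
    ]
    # Hypothetical prefix wording; the source does not give the actual text.
    labels = get_paragraph_labels(
        document, toy_model, "label civil judgment paragraph: "
    )
    for paragraph, label in zip(document, labels):
        print(label, "|", paragraph)
```

In practice `toy_model` would be replaced by a wrapper around the trained judgment document structured model; the rest of the routine does not depend on how that model is implemented.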

Abstract

The invention discloses a model building method and system, a paragraph label obtaining method, and a medium, and relates to the field of natural language processing transfer learning. The method comprises: collecting all judgment document data from a database to obtain pre-training data; defining paragraph labels for different types of judgment documents; marking the paragraph labels of different types of judgment documents to obtain training data; constructing a judgment document structured model; pre-training the model; training the pre-trained judgment document structured model with the training data; and debugging the trained judgment document structured model to obtain a final judgment document structured model. The input of the model is judgment document text data with a task prefix added to the paragraphs of the input judgment document, and the output of the model is the paragraph label text data of the judgment document. Once trained, a model established by this method can predict paragraph labels for any type of judgment document.

Description

Technical field

[0001] The present invention relates to the field of natural language processing transfer learning, and in particular to a method and system for establishing a structured model of judgment documents, and a method and medium for obtaining paragraph labels of judgment documents.

Background technique

[0002] As of December 2019, more than 80 million judgments had been published online, providing massive data resources for the practice and research of legal artificial intelligence.

[0003] Judgment documents are judicial products that record the process of judicial trial activities and clarify the rights and obligations of the parties. They are important resources for studying legal text information and provide important data for legal artificial intelligence application research such as recommendation of similar cases based on judgment documents, prediction of judgment results, intelligent question answering, and element indexing. However, the judgment docu...

Application Information

IPC(8): G06F16/33, G06F16/35, G06F40/126, G06F40/151, G06F40/279, G06F40/30
CPC: G06F16/3344, G06F16/355, G06F40/126, G06F40/151, G06F40/279, G06F40/30
Inventor: 翁洋, 李鑫, 王竹 (other inventors requested that their names not be disclosed)
Owner: CHENGDU UNION BIG DATA TECH CO LTD