Legal field event extraction method based on pre-training model and convolutional neural network algorithm

Convolutional neural network and event extraction technology is applied to the field of legal intelligence. It addresses the reliance on manually labeled training data and the high cost of data labeling, improving extraction results while reducing time and labor costs.

Active Publication Date: 2021-06-15
SHANGHAI UNIV
4 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

Supervised learning-based event extraction methods rely heavily on manually labeled training data, and most reported experimental results are based on the ACE2005 dataset.
In the specific field of law, the high cost of data labeling means that no large-scale Chinese corpus of legal events is available.

Examples

Embodiment 1

[0070] In this embodiment (see Figure 1), a method for extracting events in the legal field based on a pre-trained model and a convolutional neural network algorithm comprises the following steps:

[0071] A. Data acquisition and preprocessing:

[0072] Use web crawlers to crawl a public legal text corpus, then preprocess the original corpus by performing sentence segmentation, word segmentation, and denoising in sequence to obtain usable legal text corpus data;

[0073] B. Legal event template definition:

[0074] Obtain high-frequency verbs and key nouns in the legal field, group similar words among them by distance-based clustering, and, based on the clustering results and with reference to the relevant legal clauses, manually define legal event types and templates;

[0075] C. Large-scale legal event data annotation based on distant supervision:

[0076] Use the method of rules or patterns to obtain seed legal events from semi-str...
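The labeling idea in step C can be illustrated with a minimal sketch. The paragraph above is truncated, so the details here are assumptions rather than the patent's procedure: the `SeedEvent` layout, the role names, and the matching rule (a sentence that contains a seed's trigger and at least one of its arguments is taken to express that event) are hypothetical, chosen because this is a common distant-supervision heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedEvent:
    """Hypothetical knowledge-base entry: event type, trigger word, argument strings."""
    event_type: str
    trigger: str
    arguments: tuple  # (role, text) pairs


def label_sentence(tokens, seeds):
    """Return one label per token, or None if no seed event matches.

    Assumed heuristic: a sentence is labeled with a seed's event type when it
    contains the seed's trigger and at least one of the seed's arguments.
    """
    for seed in seeds:
        if seed.trigger not in tokens:
            continue
        matched = [(role, text) for role, text in seed.arguments if text in tokens]
        if not matched:
            continue
        labels = ["O"] * len(tokens)
        labels[tokens.index(seed.trigger)] = "TRIGGER-" + seed.event_type
        for role, text in matched:
            labels[tokens.index(text)] = "ARG-" + role
        return labels
    return None


# Illustrative data only; not taken from the patent.
seeds = [SeedEvent("Theft", "stole", (("Perpetrator", "Zhang"), ("Object", "phone")))]
tokens = ["Zhang", "stole", "a", "phone"]
labels = label_sentence(tokens, seeds)
```

Sentences labeled this way are noisy, which is why such schemes usually start from high-precision seed events obtained with rules or patterns.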

Embodiment 2

[0080] This embodiment is essentially the same as Embodiment 1, with the following particular features:

[0081] In this embodiment, in the step A, the specific steps for obtaining available legal text corpus data are:

[0082] A1. Use crawlers to crawl public legal document data from legal document websites;

[0083] A2. Manually classify part of the obtained legal document data by the crime for which sentence was passed, use the neural network model RCNN to train a crime classification model on this data, and classify the remaining data with it to obtain legal document data classified by crime;

[0084] A3. Unify the punctuation marks in the legal document data into Chinese format, then split the document data into sentences at Chinese sentence-ending punctuation marks, including ？ and ！, to form a sentence set;

[0085] A4. Use an open source word segmentation tool to segment each sentence in the sentence set to obtain the word segmentation result;

[0086] A5. Build a ...
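Steps A3 and A4 above can be sketched as follows. This is only an illustration under stated assumptions: the punctuation mapping and the set of sentence-ending marks (。？！) are guesses at what the step intends, and `split_sentences` is a hypothetical helper name. Word segmentation (A4) is not shown because the source does not name the open source tool it uses.

```python
import re

# Assumed mapping from ASCII punctuation to full-width Chinese forms.
# The ASCII period is deliberately left out so that decimals are untouched.
ASCII_TO_CHINESE = str.maketrans({"?": "？", "!": "！", ",": "，", ";": "；", ":": "："})

# Assumed sentence-ending marks; a run of text followed by any closing marks
# is treated as one sentence.
SENTENCE = re.compile(r"[^。？！]+[。？！]*")


def split_sentences(text):
    """Normalize punctuation to Chinese format, then split into sentences."""
    normalized = text.translate(ASCII_TO_CHINESE)
    return [s.strip() for s in SENTENCE.findall(normalized) if s.strip()]


# Illustrative input only; not taken from the patent.
sentences = split_sentences("被告人盗窃手机。是否构成犯罪?应当处罚!")
```

Each element of `sentences` would then be passed to the word segmentation tool in step A4.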

Embodiment 3

[0114] This embodiment is essentially the same as the above embodiments, with the following particular features:

[0115] In this embodiment, a method for extracting events in the legal field based on a pre-trained model and a convolutional neural network algorithm comprises the following steps:

[0116] A. Data acquisition and preprocessing: use web crawlers to crawl a public legal text corpus, drawing on public information from legal document websites; preprocess the original legal text corpus by performing sentence segmentation, word segmentation, and denoising in sequence to obtain usable legal text corpus data;

[0117] A1. Use crawlers to crawl public legal document data from legal document websites;

[0118] A2. Manually classify part of the obtained legal document data according to the crimes punished. On this basis, use the neural network model RCNN to train the crime classification model of the legal document data, classify the remaining data, and obtain the legal d...

Abstract

The invention discloses a legal field event extraction method based on a pre-trained model and a convolutional neural network algorithm. The method comprises the following steps: crawling a public legal text corpus with a web crawler and preprocessing the original corpus to obtain usable legal text corpus data; obtaining high-frequency verbs and key nouns in the legal field and clustering these words; constructing an original legal event knowledge base IE and, on this basis, automatically labeling legal event corpus data at large scale by distant supervision; and building a legal event extraction system based on the NEZHA pre-trained language model and the DMCNN convolutional neural network model using the resulting large-scale labeled legal event data. Because the legal event corpus is labeled automatically by distant supervision, and the pre-trained language model and the convolutional neural network mine deep semantic information from legal text data, the method achieves better results on the legal event extraction task.
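DMCNN stands for dynamic multi-pooling convolutional neural network. Its distinguishing step is to max-pool the convolution output separately over the sentence segments delimited by the candidate trigger (and, for argument classification, the candidate argument) instead of over the whole sentence. A minimal pure-Python sketch of that pooling step follows. The function name, the inclusive-split convention, and the 0.0 fill for an empty segment are assumptions made for illustration; the patent's actual network configuration is not shown in this excerpt.

```python
def dynamic_multi_pool(feature_map, split_positions):
    """Max-pool each filter separately over segments of the sentence.

    feature_map: one feature vector per token (all of length n_filters).
    split_positions: sorted token indices, e.g. the candidate trigger and a
        candidate argument; each index closes a segment (inclusive, assumed).
    Returns a flat list of n_filters * (len(split_positions) + 1) values.
    """
    n_filters = len(feature_map[0])
    bounds = [0] + [p + 1 for p in split_positions] + [len(feature_map)]
    pooled = []
    for start, end in zip(bounds, bounds[1:]):
        segment = feature_map[start:end]
        for f in range(n_filters):
            # An empty segment (split at the last token) contributes 0.0 (assumption).
            pooled.append(max((row[f] for row in segment), default=0.0))
    return pooled


# Illustrative values: 5 tokens, 2 filters, splits at token 1 and token 3.
features = [[0.1, 0.9], [0.4, 0.2], [0.3, 0.5], [0.8, 0.1], [0.2, 0.7]]
pooled = dynamic_multi_pool(features, [1, 3])
```

Pooling per segment keeps a separate strongest activation for each part of the sentence, so information about where a feature occurs relative to the trigger is not collapsed into a single maximum.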

Description

Technical field

[0001] The invention belongs to the field of legal intelligence, in particular to a method for extracting events in the legal field based on a pre-training model and a convolutional neural network algorithm.

Background technique

[0002] With the application of artificial intelligence in more and more fields and scenarios, legal intelligence has also become a popular research direction. A judicial case contains many elements, such as entities, relationships, and events. Describing judicial cases through events can not only disassemble a complex case, reconstruct and express it, but also evaluate the sentencing of the case by extracting some key events.

[0003] At present, the methods of event extraction can be roughly divided into two categories: pattern-matching-based methods and machine-learning-based methods. Most of the early event extraction methods were based on pattern matching, and a large number of rules or patterns were manually formulated based o...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/30, G06F16/906, G06F16/951, G06F40/211, G06F40/284, G06N3/04, G06N3/08, G06Q50/18
CPC: G06F40/30, G06F40/211, G06F40/284, G06F16/906, G06F16/951, G06Q50/18, G06N3/08, G06N3/045
Inventor: 魏晓, 谢伟
Owner SHANGHAI UNIV