Legal text case retrieval method and system based on a pre-trained language model

The technology applies language models and pre-training to similar-case retrieval for legal texts. It addresses poor retrieval performance and the limit that text length places on model performance, with the aim of retaining sufficient features and improving retrieval performance, accuracy, and reasoning ability.

Active Publication Date: 2022-02-18
CENT SOUTH UNIV

AI Technical Summary

Problems solved by technology

However, these two types of methods both have shortcomings. Classical algorithms based on BM25 and Jaccard similarity place no limit on text length, but their retrieval performance is far below that of deep neural networks.
Methods based on deep neural networks often need a large amount of training data to support subsequent retrieval, and the model's performance is strongly limited by text length.
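As a rough illustration of the classical baselines named above, the following sketch computes Jaccard similarity and BM25 scores over pre-tokenized texts. It is not taken from the patent: the function names, the `k1` and `b` defaults, and the particular smoothed IDF variant are assumptions chosen for illustration.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two token lists, treated as sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every tokenized document in `docs` against `query` with BM25.

    k1 and b are commonly used defaults, assumed here; the source gives none.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that is never negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Neither function truncates its input, which matches the observation that these methods place no limit on text length; both rely only on term overlap rather than semantics.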

Embodiment Construction

[0054] To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description follows with reference to the drawings and specific embodiments. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

[0055] In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. indicate orientations or positional relationships based on those shown in the drawings, and are used only for the convenience of describing the present invention and simplifying the description, rather than indicating or implying that the...

Abstract

The invention provides a method and system for retrieving legal-text cases based on a pre-trained language model. The method comprises the following steps: using the text data of the original legal main sentence and the text data of the retrieval pool, arranging the information of the legal-text case to be retrieved into data comprising a main sentence and a retrieved sentence, and taking this data as the input for model training; performing word segmentation and invalid part-of-speech screening on the main sentence and the retrieved sentence in the input data, and obtaining final data carrying the key information by means of a positioning function built on a manually constructed table of crime names; calculating position vectors for the data carrying the key information to determine the positional relationship between the data; and using the trained pre-trained language model to retrieve legal-text cases related to the case of the query main sentence. Effective text features are retained to the greatest extent and the text length is reduced, while the semantic information of the text is preserved and the proportion of key features is increased. At the data level, this fundamentally improves the precision and performance of the model.
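The preprocessing steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the part-of-speech tag set in `INVALID_POS`, the contents of `CRIME_NAMES`, the window-based `keep_key_spans` rule, and the BERT-style `[CLS]`/`[SEP]` pair layout with sequential position indices are all hypothetical, since the text shown here does not specify them. Input is assumed to be already segmented into (token, part-of-speech) pairs.

```python
# Hypothetical tag set and crime-name table; the source lists neither.
INVALID_POS = {"u", "p", "c", "w"}  # particles, prepositions, conjunctions, punctuation
CRIME_NAMES = {"theft", "fraud", "robbery"}


def filter_pos(tagged):
    """Drop tokens whose part-of-speech tag is in INVALID_POS."""
    return [tok for tok, pos in tagged if pos not in INVALID_POS]


def keep_key_spans(tokens, window=3):
    """Keep tokens within `window` positions of any crime-name hit.

    If no crime name occurs, the tokens are returned unchanged.
    """
    hits = [i for i, t in enumerate(tokens) if t in CRIME_NAMES]
    if not hits:
        return list(tokens)
    keep = set()
    for h in hits:
        keep.update(range(max(0, h - window), min(len(tokens), h + window + 1)))
    return [t for i, t in enumerate(tokens) if i in keep]


def build_pair(main_tagged, cand_tagged, window=3):
    """Build one (main sentence, retrieved sentence) model input.

    Returns the tokens, sequential position indices, and segment ids
    (0 for the main sentence, 1 for the retrieved sentence).
    """
    main = keep_key_spans(filter_pos(main_tagged), window)
    cand = keep_key_spans(filter_pos(cand_tagged), window)
    tokens = ["[CLS]"] + main + ["[SEP]"] + cand + ["[SEP]"]
    position_ids = list(range(len(tokens)))
    segment_ids = [0] * (len(main) + 2) + [1] * (len(cand) + 1)
    return tokens, position_ids, segment_ids
```

Filtering before pairing shortens both sentences, which is how the sketch reflects the stated goal of reducing text length while keeping the tokens nearest the crime names.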

Description

Technical field

[0001] The invention relates to the technical field of similar case retrieval, and in particular to a method and system for retrieving similar cases of legal texts based on a pre-trained language model.

Background

[0002] Similar case retrieval is an important mechanism for implementing the requirements of the judicial accountability system, promoting judicial restriction and supervision, and promoting the uniform application of laws. Similar cases generally refer to cases with the same or similar elements and facts, or cases with similar facts, circumstances, criminal subjects, criminal methods, criminal purposes, and criminal results. Similar case retrieval draws on cases that have already been decided and can give judges a reference when they encounter similar cases. Although several similar case retrieval platforms have been formed at present, they have certain deficiencies in the intelligent judgment of similar cases, retrieva...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/289, G06F40/194, G06F16/33, G06K9/62
CPC: G06F40/289, G06F40/194, G06F16/334, G06F18/214
Inventors: 李芳芳, 苏朴真, 邓晓衡, 张健
Owner: CENT SOUTH UNIV