A method and system for searching and sorting judicial texts

A search and sorting method for judicial texts, applied in digital data information retrieval, instruments, biological neural network models, etc., that addresses problems such as poor ranking results caused by the mismatched lengths of Query and Doc.

Active Publication Date: 2021-09-10
ENJOYOR COMPANY LIMITED

AI Technical Summary

Problems solved by technology

Judicial texts such as adjudication documents often run to thousands of words, and at most to several million words, whereas a search query is often only dozens of words or even a few words. When the lengths of the Query and the Doc differ this much, retrieving and ranking results with a similarity method may not give the user good results.

Examples


Embodiment 1

[0055] Referring to Figure 1, this embodiment provides a search and sorting method for judicial texts, with the following steps:

[0056] Step 1: Data Preprocessing

[0057] (1) Data acquisition

[0058] Collect judicial text data such as judgment documents, mediation cases, and legal provisions, and preprocess it, for example by deduplication.
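The deduplication mentioned above could be sketched as follows. The source does not say how duplicates are detected, so exact matching on whitespace-normalized text is an assumption, and `deduplicate` is a hypothetical helper name.

```python
import hashlib


def deduplicate(docs):
    """Keep the first copy of each document, comparing whitespace-normalized text.

    Assumption: two documents are duplicates when they are identical after
    all whitespace is removed. The source does not specify the criterion.
    """
    seen = set()
    unique = []
    for text in docs:
        normalized = "".join(text.split())
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique
```

Hashing the normalized text keeps memory use bounded by the number of distinct documents rather than their total length, which matters when individual documents are very long.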

[0059] (2) Word segmentation

[0060] Construct a judicial-domain word segmentation dictionary from the collected judicial text data, and segment the judicial text data with jieba.
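To illustrate what a domain dictionary contributes to segmentation, here is a minimal sketch that uses forward maximum matching. This is a deliberately simpler stand-in and not jieba's algorithm. In practice one would more likely load the judicial dictionary into jieba itself (for example with `jieba.load_userdict` followed by `jieba.lcut`). The function name `segment` and the sample dictionary are hypothetical.

```python
def segment(text, dictionary):
    """Forward maximum matching: at each position take the longest dictionary word.

    This is a stand-in for a real segmenter such as jieba, used only to show
    how a domain dictionary keeps legal terms together as single tokens.
    """
    max_len = max([1] + [len(word) for word in dictionary])
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical judicial-domain dictionary entries.
legal_terms = {"民间借贷", "纠纷", "案件"}
tokens = segment("民间借贷纠纷案件", legal_terms)
```

Without the entry for "民间借贷" the term would be split into single characters, which is the kind of loss a domain dictionary is meant to prevent.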

[0061] (3) Train word vectors with judicial text data

[0062] Existing word vectors are mostly trained on data such as encyclopedias and news, whose context differs considerably from that of judicial text. Because large numbers of unsupervised training samples in the judicial field are easy to obtain, judicial text data is used. Retraining the w...

Embodiment 2

[0114] Referring to Figure 6, to implement the judicial text search and sorting method of Embodiment 1, this embodiment of the present invention also provides a search and sorting system, including:

[0115] The first acquisition module acquires the judicial text data Doc, performs word segmentation on it, and pre-trains word vectors;

[0116] The second acquisition module acquires the legal consultation question Query input by the user;

[0117] The correlation calculation module calculates the matching score between the judicial text data Doc and the legal consultation question Query: it constructs a matching matrix of Doc and Query, intercepts the relevant text according to the matching matrix, calculates the statistical information of words and word c...
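The matching matrix built by the correlation calculation module might look like the sketch below. The visible text does not name the similarity measure, so cosine similarity between word vectors is an assumption, as are the function names and the rule that out-of-vocabulary words score 0.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def matching_matrix(query_tokens, doc_tokens, vectors):
    """Return M where M[i][j] is the similarity of Query word i and Doc word j.

    Assumption: words missing from `vectors` get similarity 0.0.
    """
    matrix = []
    for q in query_tokens:
        row = []
        for d in doc_tokens:
            if q in vectors and d in vectors:
                row.append(cosine(vectors[q], vectors[d]))
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix
```

The matrix has one row per Query word and one column per Doc word, so its width grows with document length while its height stays small. That shape is what makes the length mismatch between Query and Doc visible.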


Abstract

A search and sorting method and system for judicial texts. The method includes: (1) data preprocessing: collecting judicial text data Doc and legal consultation questions Query, performing word segmentation on the collected judicial text data Doc, and pre-training judicial text word vectors on the segmented data; (2) building the similarity matrix: using the pre-trained word vectors to construct the similarity matching matrix M of Query and Doc; (3) intercepting relevant text fragments: extracting locally relevant text fragments according to the matching matrix M of Query and Doc, splicing the multiple locally relevant text fragments together to obtain the relevant text Ds, and splicing the matching matrices of the corresponding fragments together to obtain the matrix Ms; (4) constructing the feature vector: calculating the global relevance of Query and the relevant text Ds and constructing the feature vector F; (5) calculating the matching value and sorting: inputting the feature vector F into the neural network model to obtain the matching score of Query and Doc, and sorting by matching score.
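Step (3) of the abstract can be illustrated with a rough sketch. The abstract does not say how locally relevant fragments are chosen, so the rule used here is an assumption: take a fixed window around every Doc position whose best similarity to any Query word reaches a threshold. The names `extract_fragments`, `threshold`, and `window` are hypothetical.

```python
def extract_fragments(M, doc_tokens, threshold=0.8, window=2):
    """Cut locally relevant fragments out of Doc and splice them together.

    M[i][j] is the similarity of Query word i and Doc word j.
    Returns (Ds, Ms): the spliced relevant tokens and the matching matrix
    restricted to the same columns.
    Assumption: a fragment is a fixed window around each strong match.
    """
    if not M:
        return [], []
    n = len(doc_tokens)
    # Best similarity of each Doc position to any Query word.
    col_max = [max(row[j] for row in M) for j in range(n)]
    # Turn each strong match into a [start, end) span, merging overlaps.
    spans = []
    for j, score in enumerate(col_max):
        if score >= threshold:
            start, end = max(0, j - window), min(n, j + window + 1)
            if spans and start <= spans[-1][1]:
                spans[-1][1] = max(spans[-1][1], end)
            else:
                spans.append([start, end])
    keep = [j for start, end in spans for j in range(start, end)]
    Ds = [doc_tokens[j] for j in keep]
    Ms = [[row[j] for j in keep] for row in M]
    return Ds, Ms
```

Because Ds and Ms are built from the same column indices, they stay aligned: column k of Ms always describes token k of Ds, which is what a later feature-construction step would rely on.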

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to a search and sorting method and system for judicial texts.

Background

[0002] The core of a search sorting algorithm is how to calculate the relationship between the search input (Query) and the target document (Doc) and sort the Docs accordingly. Patent CN201710263575.6 sorts retrieved documents with preset sorting rules. Analyzing the feasibility of the rule templates takes a lot of time, and because the user's input and intent are highly uncertain, it is difficult to enumerate all sorting rules, so the sorting results are somewhat unpredictable. Patent CN201710348412.8 extracts keywords and builds a thesaurus to expand the search results, then sorts with preset sorting rules. Patent CN201710298924.8 extracts the topics of Query and Doc and uses the similarity between topics as the sorting criterion. At present...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/332, G06F16/338, G06F40/284, G06N3/04
CPC: G06F16/3329, G06F16/338, G06F40/284, G06N3/045
Inventor: 王开红, 陈涛, 张云云, 丁锴, 李建元
Owner: ENJOYOR COMPANY LIMITED