Judicial text-oriented search sorting method and system

A sorting method and text technology, applied in the fields of digital data information retrieval, special data processing applications, instruments, etc. It addresses the length mismatch between Query and Doc and the poor retrieval results that follow from it, and achieves faster algorithm execution, reliable matching results, and accurate sorting results.

Active Publication Date: 2019-10-18
ENJOYOR COMPANY LIMITED

AI Technical Summary

Problems solved by technology

Judicial text data such as adjudication documents often runs to thousands of words, and up to several million words, whereas a search query is often dozens of words or even just a few. When the lengths of the query and the doc are this mismatched, retrieving and sorting results with a similarity-based method may not give the user good results.



Examples


Embodiment 1

[0055] Referring to Figure 1, this embodiment provides a judicial text-oriented search and sorting method with the following steps:

[0056] Step 1: Data Preprocessing

[0057] (1) Data acquisition

[0058] Collect judicial text data such as judgment document data, mediation case data, and legal text data, and perform preprocessing such as deduplication.
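The text does not say how deduplication is performed. As a minimal sketch, assuming that exact duplicates (ignoring whitespace differences) are what needs removing, documents can be compared by a hash of their normalized text. The function name `deduplicate` and the sample documents are hypothetical and are not taken from the patent.

```python
import hashlib

def deduplicate(documents):
    """Keep the first occurrence of each document, ignoring whitespace differences."""
    seen = set()
    unique = []
    for text in documents:
        # Assumption: two documents are duplicates if they are identical
        # once all whitespace is removed.
        normalized = "".join(text.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique

# Hypothetical sample data: the first two entries differ only in spacing.
docs = ["原告 要求 被告 偿还 借款", "原告要求被告偿还借款", "双方达成调解协议"]
unique_docs = deduplicate(docs)
```

Near-duplicate detection (for example, documents that differ by a few characters) would need a different technique and is outside this sketch.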

[0059] (2) Word segmentation

[0060] Build a judicial-domain word segmentation dictionary from the collected judicial text data, and segment the judicial text data with jieba.
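To illustrate why a domain dictionary matters for segmentation, the sketch below uses forward maximum matching, a simple dictionary-based technique. This is a stand-in chosen so the example runs without third-party packages; it is not jieba's algorithm and not the method of the patent. With jieba itself, a custom dictionary file would typically be loaded through `jieba.load_userdict` before segmenting. The dictionary entries, the `segment` function, and the `max_len` parameter are all hypothetical.

```python
def segment(text, dictionary, max_len=6):
    """Forward maximum matching: at each position take the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # fall back to a single character if nothing matches
        # Try the longest candidate first, down to two characters.
        for size in range(min(max_len, len(text) - i), 1, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                match = candidate
                break
        tokens.append(match)
        i += len(match)
    return tokens

# Hypothetical judicial-domain dictionary entries (illustrative only).
judicial_dict = {"民间借贷", "纠纷", "原告", "被告", "要求", "偿还", "借款"}

sentence = "原告要求被告偿还借款"
tokens = segment(sentence, judicial_dict)
```

Without the domain entries, the same function falls back to single characters, which is the failure mode a judicial-domain dictionary is meant to avoid.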

[0061] (3) Training word vectors on judicial text data

[0062] Most existing word vectors are trained on data such as encyclopedias and news, but the context of judicial texts differs considerably from that of news and encyclopedias, and large numbers of unsupervised training samples are easy to obtain in the judicial field. Therefore, using judicial text data...

Embodiment 2

[0114] Referring to Figure 6, to implement the judicial text-oriented search and sorting method described in Embodiment 1, an embodiment of the present invention also provides a search and sorting system, including:

[0115] A first obtaining module, used to obtain the judicial text data Doc, perform word segmentation on it, and pre-train word vectors;

[0116] A second obtaining module, used to obtain the legal consultation question Query input by the user;

[0117] A correlation calculation module, used to calculate the matching score between the judicial text data Doc and the legal consultation question Query: constructing the matching matrix of Doc and Query, intercepting the relevant text according to the matching matrix, calculating the statistical information of word and word co-occurrence in the r...


Abstract

The invention discloses a judicial text-oriented search sorting method and system. The method comprises the following steps: (1) data preprocessing: collecting judicial text data Doc and a legal consultation question Query, performing word segmentation on the collected judicial text data Doc, and pre-training judicial text word vectors on the segmented data; (2) constructing a similarity matrix: constructing a similarity matching matrix M of Query and Doc using the pre-trained word vectors; (3) intercepting relevant text fragments: extracting local relevant text fragments according to the matching matrix M of Query and Doc, splicing the local relevant text fragments together to obtain a relevant text Ds, and splicing the matching matrices of the corresponding fragments together to obtain a matrix Ms; (4) constructing a feature vector: calculating the global correlation between Query and the relevant text Ds, and constructing a feature vector F; and (5) calculating a matching value and sorting: inputting the feature vector F into a neural network model to obtain a matching score of Query and Doc, and sorting according to the size of the matching score.
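Steps (2) and (3) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the abstract does not say which similarity measure fills M or how fragments are selected, so the sketch assumes cosine similarity, a similarity threshold, and a fixed window around each matching Doc position. The function names, `threshold`, `window`, and the toy three-dimensional vectors are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def matching_matrix(query, doc, vectors):
    """M[i][j] is the similarity between query token i and doc token j."""
    return [[cosine(vectors[q], vectors[d]) for d in doc] for q in query]

def relevant_spans(M, threshold=0.9, window=1):
    """Merged [start, end) spans around doc positions that match some query token."""
    n_doc = len(M[0])
    spans = []
    for j in range(n_doc):
        if max(row[j] for row in M) >= threshold:
            start, end = max(0, j - window), min(n_doc, j + window + 1)
            if spans and start <= spans[-1][1]:
                # Overlapping or adjacent windows are merged into one fragment.
                spans[-1][1] = max(spans[-1][1], end)
            else:
                spans.append([start, end])
    return spans

def splice(doc, M, spans):
    """Concatenate the fragments into Ds and the matching columns of M into Ms."""
    keep = [j for start, end in spans for j in range(start, end)]
    Ds = [doc[j] for j in keep]
    Ms = [[row[j] for j in keep] for row in M]
    return Ds, Ms

# Toy data: every token gets a filler vector except the two query terms.
query = ["loan", "repay"]
doc = ["court", "held", "defendant", "repay", "loan",
       "within", "ten", "days", "costs", "waived"]
vectors = {tok: (0.0, 0.0, 1.0) for tok in doc}
vectors["loan"] = (1.0, 0.0, 0.0)
vectors["repay"] = (0.0, 1.0, 0.0)

M = matching_matrix(query, doc, vectors)
spans = relevant_spans(M)
Ds, Ms = splice(doc, M, spans)
```

In this toy case the ten-token Doc is reduced to a four-token fragment around the matched terms, so Ms has one row per Query token and one column per retained Doc token. Shortening Doc in this way is how the method addresses the length mismatch between Query and Doc before the feature vector F is built.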

Description

Technical field

[0001] The invention belongs to the field of natural language processing and relates to a search and sorting method and system for judicial texts.

Background

[0002] The core of a search sorting algorithm is how to calculate the relationship between the search input (Query) and the target document (Doc) and to sort the Docs accordingly. Patent CN201710263575.6 sorts the retrieved documents using preset sorting rules. Analyzing the feasibility of rule templates for such preset rules takes a lot of time; moreover, user input and intent are highly uncertain, so it is difficult to enumerate all sorting rules, and the sorting results are somewhat unpredictable. Patent CN201710348412.8 extracts keywords and creates a synonym dictionary to expand the search results, and then sorts them using preset sorting rules. Patent CN201710298924.8 extracts the topics of Query and Doc and uses the similarity between the topics as the sorting criterion. Currently comm...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/332, G06F16/338, G06F17/27, G06N3/04
CPC: G06F16/3329, G06F16/338, G06F40/284, G06N3/045
Inventor: 王开红, 陈涛, 张云云, 丁锴, 李建元
Owner: ENJOYOR COMPANY LIMITED