Subject similarity based case retrieving method

A topic-similarity technique applied to case retrieval. It addresses the inaccurate retrieval results, poor professionalism, and low calculation efficiency of existing methods, and improves retrieval efficiency and professionalism as well as accuracy and recall.

Status: Inactive. Publication date: 2017-09-12
安徽富驰信息技术有限公司
5 Cites, 20 Cited by

AI Technical Summary

Problems solved by technology

However, existing keyword-based retrieval has the following problems:
1. A judicial document data set contains a large number of both terms and documents. The word frequency vector model must represent the data set as a term-by-document matrix, so the feature dimension is very high.
2. The feature matrix is extremely sparse, so calculation efficiency is low.
3. Irrelevant terms take part in the similarity calculation, which causes interference and degrades retrieval quality.
[0004] Therefore, existing keyword-based full-text retrieval methods suffer from low retrieval efficiency, inaccurate retrieval results, and poor professionalism.
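
The dimensionality and sparsity problem described above can be illustrated with a small sketch. It builds a term-by-document count matrix for three toy documents and measures the fraction of non-zero cells. The helper names `term_document_matrix` and `density` and the sample documents are assumptions made for illustration; they do not come from the patent.

```python
from collections import Counter

def term_document_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term i in document j."""
    counts = [Counter(doc.split()) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix

def density(matrix):
    """Fraction of non-zero cells in the matrix."""
    cells = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for value in row if value)
    return nonzero / cells if cells else 0.0

# Toy documents (hypothetical); real judicial data sets are far larger.
docs = [
    "loan contract dispute repayment",
    "traffic accident compensation injury",
    "divorce custody property division",
]
vocab, matrix = term_document_matrix(docs)
sparsity = 1.0 - density(matrix)
```

With no shared vocabulary, every added document adds new rows that are zero for all other documents, so the matrix grows in both dimensions while most cells stay empty.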

Examples

Embodiment Construction

[0036] In order to further illustrate the features of the present invention, please refer to the following detailed description and accompanying drawings of the present invention. The accompanying drawings are for reference and description only, and are not intended to limit the protection scope of the present invention.

[0037] As shown in Figure 1, this embodiment discloses a case retrieval method based on subject similarity, which includes the following steps S1 to S5:

[0038] S1. Using the layout and key words of the document as constraints, an automatic extraction algorithm extracts three segments from the document: the case facts, the focus of dispute, and the judgment result;

[0039] Here, the layout of a document refers to the fixed components of a judicial document as arranged, which generally include the facts of the case, the focus of dispute, and the judgment result; the key words refer to the facts of the case, the focus of ...
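
A minimal sketch of step S1 follows, assuming each segment is introduced by a fixed marker phrase. The source does not list the actual key words or describe the extraction algorithm, so the marker strings in `SECTION_MARKERS`, the function name `extract_segments`, and the sample document are all hypothetical.

```python
# Hypothetical marker phrases; the patent text does not list the real key words.
SECTION_MARKERS = {
    "case_facts": "Facts of the case:",
    "dispute_focus": "Focus of dispute:",
    "judgment_result": "Judgment result:",
}

def extract_segments(document, markers=SECTION_MARKERS):
    """Split a document into named segments, using marker phrases as boundaries."""
    # Record where each marker first occurs, skipping markers that are absent.
    found = []
    for name, marker in markers.items():
        pos = document.find(marker)
        if pos != -1:
            found.append((pos, name, marker))
    found.sort()

    # Each segment runs from the end of its marker to the start of the next one.
    segments = {}
    for i, (pos, name, marker) in enumerate(found):
        start = pos + len(marker)
        end = found[i + 1][0] if i + 1 < len(found) else len(document)
        segments[name] = document[start:end].strip()
    return segments

sample = (
    "Facts of the case: A lent B money. "
    "Focus of dispute: Whether interest is owed. "
    "Judgment result: B repays the principal."
)
segments = extract_segments(sample)
```

Relying on fixed markers only works because judicial documents follow a fixed layout; a document that omits a marker simply yields no entry for that segment.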

Abstract

The invention discloses a case retrieval method based on subject similarity and belongs to the technical field of data retrieval. Using the document layout and key words as constraints, an automatic extraction algorithm extracts three segments from each document: the case facts, the focus of dispute, and the judgment result. A subject model based on a domain word list then extracts the subject terms of each segment, yielding subject word blocks and non-subject word blocks for each segment. A feature inverted index is built from the feature words in the subject and non-subject word blocks of each segment, and the inverted index is mapped to a feature vector. A subject similarity model calculates the similarity between the query statement and each document in the document data set; the similarities are sorted and the ranked result is output, which completes the retrieval. The method describes a document along two dimensions, judicial feature words and judicial subjects, and can improve the efficiency and accuracy of similar-case retrieval.
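
The indexing and ranking stages in the abstract can be sketched roughly as below. This is only an illustration under stated assumptions: the patent's subject similarity model is not described in the text available here, so plain cosine similarity is swapped in as a stand-in, and the per-segment structure is flattened to one subject block and one non-subject block per document. All function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def build_inverted_index(docs):
    """docs: {doc_id: {block_name: [terms]}} -> {(block_name, term): {doc_id: count}}."""
    index = defaultdict(dict)
    for doc_id, blocks in docs.items():
        for block, terms in blocks.items():
            for term in terms:
                postings = index[(block, term)]
                postings[doc_id] = postings.get(doc_id, 0) + 1
    return index

def to_vectors(index, doc_ids):
    """Map the inverted index to one count vector per document."""
    features = sorted(index)
    vectors = {d: [index[f].get(d, 0) for f in features] for d in doc_ids}
    return features, vectors

def cosine(u, v):
    """Cosine similarity; a stand-in for the patent's subject similarity model."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query_blocks, docs):
    """Return document ids sorted by descending similarity to the query."""
    index = build_inverted_index(docs)
    features, vectors = to_vectors(index, list(docs))
    query = [query_blocks.get(block, []).count(term) for block, term in features]
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in vectors.items()]
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Hypothetical documents, already split into subject and non-subject word blocks.
docs = {
    "d1": {"subject": ["loan", "interest"], "non_subject": ["bank"]},
    "d2": {"subject": ["divorce", "custody"], "non_subject": ["child"]},
}
ranking = rank({"subject": ["loan"], "non_subject": []}, docs)
```

Keying the index on the (block, term) pair keeps a word that appears as a subject term distinct from the same word in a non-subject block, which is one simple way to reflect the two-dimensional description the abstract mentions.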

Description

Technical field

[0001] The invention relates to the technical field of data retrieval, in particular to a case retrieval method based on subject similarity.

Background

[0002] As social information becomes more open and transparent, the trial results of cases attract more and more public attention. For the same case, different judges often apply different standards of judgment. Recommending similar previous cases in a timely manner before a case is judged would undoubtedly serve as a good reference.

[0003] Current judicial case retrieval generally adopts a vector space model similarity calculation based on tf-idf. This method weights each word by its frequency in the text and its inverse document frequency across the text set, calculates the cosine similarity between the resulting vectors as the text similarity, and then performs retrieval accordi...
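
The tf-idf baseline described in the background can be sketched as follows. The sketch assumes whitespace tokenization and uses one common weighting variant, term frequency times log(N / df); the patent does not state which variant existing systems use, and the function names and sample documents are illustrative only.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per document."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy corpus.
docs = [
    "loan contract dispute",
    "loan repayment dispute",
    "traffic accident injury",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # documents sharing "loan" and "dispute"
sim_unrelated = cosine(vecs[0], vecs[2])  # documents sharing no terms
```

Every term in the vocabulary contributes a dimension here, including terms unrelated to the legal subject of a case.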

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/3331, G06F40/242
Inventor: 耿伟周宇司华建贾真
Owner: 安徽富驰信息技术有限公司