Similar case recommendation method based on text content

A similar case recommendation technique, applied in text database clustering/classification, neural learning methods, text database querying, etc. It addresses the neglect of information other than labels, mediocre model performance, and inapplicability to actual work, and improves the quality of the recommended cases.

Active Publication Date: 2019-11-12
SHANDONG UNIV +1

AI Technical Summary

Problems solved by technology

However, traditional artificial intelligence methods train models on criminal fact descriptions using distant labels. They use only the information contained in the labels and ignore information outside the labels, such as the circumstances of the crime.
In addition, processing long texts is inherently difficult, and the long-distance dependency problem has not been addressed.
As a result, model performance is mediocre and the quality of the recommended content is uneven, so these methods cannot be applied to actual work.



Examples


Embodiment 1

[0080] A method for recommending similar cases based on content includes the following steps:

[0081] (1) Construct unstructured data into structured data:

[0082] Use rule matching to extract the required information and build a structured data set. The required information includes the description of criminal facts and the basic information of the criminal suspect; the basic information of the suspect includes age, gender, and pre-arrest occupation;
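The rule-matching step in [0082] can be sketched roughly as follows. This is a minimal illustration, not the rules used in the invention: the field patterns in `PATTERNS`, the function name `extract_record`, and the English sample wording are all assumptions made for the example. The actual judicial documents are Chinese, and their rules would need to be written against that wording.

```python
import re

# Hypothetical patterns for an English-style sample document (assumption:
# the source does not disclose the real matching rules).
PATTERNS = {
    "age": re.compile(r"aged (\d+)"),
    "gender": re.compile(r"\b(male|female)\b"),
    "occupation": re.compile(r"occupation before arrest: ([^,.;]+)"),
    "facts": re.compile(r"Facts: (.+)$", re.S),
}


def extract_record(document):
    """Turn one unstructured document into a dict of structured fields.

    Fields whose pattern does not match are set to None.
    """
    record = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(document)
        record[field] = match.group(1).strip() if match else None
    return record


sample = (
    "Defendant Zhang, male, aged 34, occupation before arrest: farmer. "
    "Facts: On 3 May the defendant took a mobile phone from the victim."
)
record = extract_record(sample)
```

Returning `None` for a missing field keeps every record on the same schema, which makes it easy to filter out incomplete documents before training.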

[0083] The structured data set is divided into a training data set and a test data set that do not overlap. The ratio of the training data set to the test data set is 7:3, that is, the training data set accounts for 70% of the structured data set and the test data set accounts for the remaining 30%;
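A non-overlapping 7:3 split such as the one in [0083] might look like the sketch below. The function name `split_dataset`, the shuffling step, and the fixed seed are assumptions for illustration; the source states only the ratio and that the two sets do not overlap.

```python
import random


def split_dataset(records, train_ratio=0.7, seed=0):
    """Shuffle the records and cut them into disjoint train and test lists."""
    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


records = [f"case-{i}" for i in range(10)]
train, test = split_dataset(records)
```

Because both parts are slices of one shuffled list, every record ends up in exactly one of the two sets.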

[0084] (2) Model pre-training:

[0085] The model includes sequentially connected word map...

Embodiment 2

[0103] The method for recommending similar cases based on content is as described in Embodiment 1, with the following difference:

[0104] In step (2), the basic structure for the vector compression layer is a self-attention structure, as shown in formulas (I) and (II):

[0105] $A = \mathrm{Attention}(Q, K, V) = \mathrm{sigmoid}(Q^{T} K V^{T})$ (I)

[0106] $R = \mathrm{Reduce}(A, \mathrm{axis} = -2)$ (II)

[0107] Formula (I) represents the attention structure. Q, K, and V are the output of the bidirectional transformer layer, that is, the input of the vector compression layer. Q, K, and V are abbreviations of query, key, and value, and refer to the query matrix, the key matrix, and the value matrix respectively; in the present invention, all three are the same matrix. When Q, K, and V are the same input, the structure is called self-attention. A represents the result of the self-attention structure, that is, the attention matrix of each column vector (that is, word vector) with respect to all other column vectors in the input matrix (input is a two-dimensional ...
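Formulas (I) and (II) can be traced with a small pure-Python sketch. It assumes the input is a d x n matrix whose columns are word vectors, so that with Q = K = V = X the product Q^T K V^T has shape n x d. The source excerpt does not say which reduction Reduce performs, so taking the mean along axis -2 is an assumption, as are the helper names.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def compress(x):
    """Map a d x n matrix of word vectors (as columns) to one d-dim vector."""
    xt = transpose(x)                    # n x d
    scores = matmul(matmul(xt, x), xt)   # (n x n) @ (n x d) -> n x d
    a = [[sigmoid(v) for v in row] for row in scores]   # formula (I)
    n = len(a)
    # Formula (II): reduce along axis -2 (the word axis).
    # Using the mean here is an assumption.
    return [sum(a[i][j] for i in range(n)) / n for j in range(len(a[0]))]


x = [[0.1, 0.2, 0.3],
     [0.0, -0.1, 0.4]]   # d = 2 features, n = 3 words
r = compress(x)
```

The output length depends only on d, not on the number of words n, which is what lets the layer compress texts of different lengths into a fixed-size vector.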


Abstract

The invention relates to a similar case recommendation method based on text content. The method is divided into a pre-training part and a fine-tuning part. The pre-training part adopts a transformer encoder as its main structure and trains a Chinese language model, learning Chinese language knowledge from other corpora to obtain a high-quality language model. The fine-tuning part uses a triad model as its framework and preprocessed judicial documents as training data, learning more knowledge about judgments from the judicial field to obtain a better text vector representation. Compared with traditional keyword-based similar case recommendation methods and single-task neural-network-based similar case recommendation methods, the content-based similar case recommendation method provided by the invention performs better, and because the model is trained on semantics it is more robust, which indicates that the method is effective and practical.

Description

Technical field

[0001] The invention relates to a method for recommending similar cases based on text content, and belongs to the interdisciplinary technical field of justice and natural language processing.

Background

[0002] The combination of law and artificial intelligence saves manpower to a certain extent, and the recommendation of similar cases is an important topic in this field. Its goal is to recommend several similar documents based on a given description of criminal facts. Its purpose is to provide judicial personnel with similar previous cases, so that they can more quickly and accurately determine the crimes committed in the case and the laws that apply, and can also refer to the judgments in those previous cases. In recent years, there have been many achievements at home and abroad in combining artificial intelligence with the judicial field, which have greatly improved the efficiency of ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/335, G06F16/35, G06F17/27, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/335, G06F16/35, G06F16/3344, G06N3/084, G06N3/045, G06F18/22
Inventor: 李玉军, 韩均雷, 王泽强, 马宝森, 张文真, 邓媛洁
Owner: SHANDONG UNIV