News and case similarity calculation method based on asymmetric twin network

Similarity calculation and twin network technology, applied in the field of natural language processing, addresses the problems of low accuracy, difficulty learning from unbalanced corpora, and redundant news text, and reduces content redundancy.

Active Publication Date: 2020-01-21
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The invention provides a news and case similarity calculation method based on an asymmetric twin network, which addresses the difficulty of learning from an unbalanced corpus.

Examples

Embodiment 1

[0026] Embodiment 1: As shown in Figures 1-2, the news and case similarity calculation method based on an asymmetric twin network proceeds through the following steps:

[0027] Step 1. By analyzing popular news from recent years, this embodiment selects a number of high-profile cases, such as the "Kunshan anti-murder case", and crawls 4513 news items related to them. Establishing the relationship between news and cases yields 4607 news-case pairs. After manual annotation, 3374 valid pairs were retained, comprising 1630 relevant case-news pairs and 1744 irrelevant pairs. Of these, 675 pairs were held out as a validation set, comprising 326 relevant pairs and 349 irrelevant pairs;
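The validation counts in Step 1 are consistent with holding out about 20% of each class (326 of 1630 relevant pairs, 349 of 1744 irrelevant pairs). The source does not say how the split was drawn, so the following is only a minimal sketch under the assumption of a stratified random split. The function name `stratified_holdout`, the `fraction` and `seed` parameters, and the placeholder pair names are hypothetical.

```python
import random


def stratified_holdout(pairs, labels, fraction=0.2, seed=0):
    """Hold out roughly `fraction` of the pairs from each label group.

    Assumption: the patent text does not describe the split procedure;
    a per-class random split is used here purely for illustration.
    """
    rng = random.Random(seed)

    # Group the pairs by their relevance label.
    by_label = {}
    for pair, label in zip(pairs, labels):
        by_label.setdefault(label, []).append(pair)

    train, held_out = [], []
    for label, items in by_label.items():
        items = items[:]          # copy so the caller's list is untouched
        rng.shuffle(items)
        k = round(len(items) * fraction)
        held_out += [(p, label) for p in items[:k]]
        train += [(p, label) for p in items[k:]]
    return train, held_out


if __name__ == "__main__":
    # Placeholder data with the class sizes reported in Step 1.
    pairs = [f"pair-{i}" for i in range(3374)]
    labels = [1] * 1630 + [0] * 1744   # 1 = relevant, 0 = irrelevant
    train, held_out = stratified_holdout(pairs, labels)
    print(len(train), len(held_out))
```

With the class sizes from Step 1 and `fraction=0.2`, this holds out 326 relevant and 349 irrelevant pairs, matching the reported validation set of 675 pairs.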

[0028] The news title is then used to compress the news document and obtain a compressed news document: first calculate the cor...

Abstract

The invention relates to a news and case similarity calculation method based on an asymmetric twin network, and belongs to the technical field of natural language processing. The method comprises the following steps: first, the similarity between each sentence in the news text and the news title is calculated, and the sentences most relevant to the title are selected to represent the document, which removes redundant sentences from the news text. An asymmetric twin network is then used to model the news document and the case description. Because the case elements carry the key semantic information of the case, they are fused into the asymmetric twin network as supervision information when encoding the news document and the case description. Finally, the correlation between the news and the case is judged by calculating document similarity. Because the similarity between the news text and the case description is calculated with an asymmetric twin network, the method can perform semantic encoding of unbalanced news texts and case descriptions and can improve the accuracy of the similarity calculation.
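The first step of the abstract, keeping only the sentences most relevant to the title, can be illustrated with a short sketch. The patent's actual similarity measure is not visible in this excerpt, so the bag-of-words cosine similarity, the regex tokenizer, the `top_k` cutoff, and all function names below are assumptions chosen for illustration. Chinese text would also need a proper word segmenter in place of the regex.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase word tokens; not suitable for unsegmented Chinese.
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def compress_document(title, sentences, top_k=3):
    """Keep the top_k sentences most similar to the title, in original order."""
    title_vec = Counter(tokenize(title))
    scored = [
        (cosine(title_vec, Counter(tokenize(s))), i)
        for i, s in enumerate(sentences)
    ]
    # Highest score first; ties are broken by earlier position in the document.
    best = sorted(scored, key=lambda x: (-x[0], x[1]))[:top_k]
    keep = sorted(i for _, i in best)
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    title = "Court hears Kunshan case"
    sentences = [
        "The court in Kunshan hears the case today.",
        "The weather was sunny.",
        "Lawyers discussed the Kunshan case in court.",
    ]
    print(compress_document(title, sentences, top_k=2))
```

In this toy input the weather sentence shares no tokens with the title, so it scores zero and is dropped, while the two case-related sentences are kept in their original order. The compressed document would then be passed to the encoder, which this sketch does not cover.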

Description

Technical field

[0001] The invention relates to a method for calculating the similarity between news and cases based on an asymmetric twin network, and belongs to the technical field of natural language processing.

Background technique

[0002] The analysis of news public opinion in the legal field is a hot issue in current natural language processing research. Correlation analysis between news and cases is an important part of this analysis. It is the basis and premise of subsequent case-related public opinion analysis, and it underpins the accuracy of multiple follow-up tasks such as sentiment classification, topic analysis, and summary generation. To ensure the quality and performance of this follow-up work, it is necessary to construct a high-accuracy method for analyzing the correlation between news and cases. Using a Siamese network to calculate text correlation is an effective way, which has a good learning ability ...

Application Information

IPC(8): G06F40/30, G06K9/62, G06F40/12
CPC: G06F18/22
Inventors: 余正涛, 赵承鼎, 郭军军, 线岩团, 黄于欣, 相艳
Owner: KUNMING UNIV OF SCI & TECH