Calculation method of news and case similarity based on asymmetric twin network

A similarity calculation method using twin (Siamese) network technology, applied in the field of natural language processing. It addresses the difficulty of learning from unbalanced corpora and the redundancy of news text, removing redundant content while achieving high accuracy.

Active Publication Date: 2020-09-08
KUNMING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] The invention provides a method for calculating the similarity between news and cases based on an asymmetric twin network. It addresses the difficulty that traditional text correlation analysis methods have in learning from unbalanced corpora, removes redundancy in news text, and calculates the similarity between news text and case descriptions with high accuracy.

Examples

Embodiment 1

[0026] Embodiment 1: As shown in Figures 1-2, a method for calculating the similarity between news and cases based on an asymmetric twin network proceeds through the following specific steps:

[0027] Step 1. By analyzing popular news from recent years, this embodiment selects a number of high-profile cases, such as the "Kunshan anti-murder case", and crawls 4513 news items related to them. Establishing the relationship between news and cases yields 4607 news-case pairs. After manual annotation, 3374 valid pairs are retained: 1630 relevant case-news pairs and 1744 irrelevant pairs. Of these, 675 pairs are held out as a validation set, comprising 326 relevant pairs and 349 irrelevant pairs;
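The validation counts above correspond to holding out roughly 20% of each class (326 of 1630 relevant pairs, 349 of 1744 irrelevant pairs). The text does not say how the split was drawn, so the following is only a minimal sketch that assumes a seeded, per-class random hold-out. The function name `stratified_holdout`, the `holdout_frac` parameter, and the record layout are hypothetical, chosen for illustration.

```python
import random


def stratified_holdout(pairs, holdout_frac=0.2, seed=0):
    """Split labeled pairs into (train, validation), sampling each label separately.

    Assumption: the source only reports the resulting counts; the per-class
    random sampling used here is an illustrative choice.
    """
    rng = random.Random(seed)
    by_label = {}
    for pair in pairs:
        by_label.setdefault(pair["label"], []).append(pair)

    train, validation = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Hold out the same fraction of every class
        n_val = round(len(group) * holdout_frac)
        validation.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, validation


# Placeholder records carrying only the class counts reported in the text:
# label 1 = relevant news-case pair, label 0 = irrelevant pair
pairs = ([{"id": i, "label": 1} for i in range(1630)]
         + [{"id": 1630 + i, "label": 0} for i in range(1744)])
train, validation = stratified_holdout(pairs)
```

Sampling each class separately keeps the relevant-to-irrelevant ratio of the validation set close to that of the full annotated data.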

[0028] The news title is then used to compress the news document, yielding a compressed news document: first calculate the cor...

Abstract

The invention relates to a method for calculating the similarity between news and cases based on an asymmetric twin network, and belongs to the technical field of natural language processing. The invention first computes the similarity between each sentence in the news text and the news title, and selects the sentences most relevant to the title to represent the document, thereby removing redundant sentences from the news text. It then uses an asymmetric twin network to model the documents and the case descriptions. Because the case elements carry the key semantic information of the case, they are integrated into the asymmetric twin network as supervisory information for encoding the news documents and case descriptions. Finally, the relevance of a news item to a case is judged by calculating the similarity of the encoded documents. By calculating the similarity of news texts and case descriptions with the asymmetric twin network, the invention can perform semantic encoding of unbalanced news texts and case descriptions, which helps improve the accuracy of the similarity calculation.
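The title-based compression step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract does not state which similarity measure or how many sentences are kept, so the bag-of-words cosine similarity, the `top_k` parameter, and the names `tokenize`, `cosine`, and `compress_by_title` are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; Chinese news text would need a word segmenter instead
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters (assumed measure)
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def compress_by_title(title, sentences, top_k=2):
    """Keep the top_k sentences most similar to the title, in original order."""
    title_vec = Counter(tokenize(title))
    scored = [(cosine(title_vec, Counter(tokenize(s))), i)
              for i, s in enumerate(sentences)]
    # Highest similarity first; earlier sentences win ties
    best = sorted(scored, key=lambda x: (-x[0], x[1]))[:top_k]
    return [sentences[i] for i in sorted(i for _, i in best)]


title = "Court hears Kunshan case"
sentences = [
    "The court in Kunshan will hear the case next week.",
    "Weather in the region was mild on Monday.",
    "Lawyers said the Kunshan case raises self-defence questions.",
    "Subscribe to our newsletter for daily updates.",
]
compressed = compress_by_title(title, sentences, top_k=2)
```

In this toy input the two sentences that share words with the title are kept and the unrelated ones are dropped, which is the redundancy-removal effect the abstract describes. A learned sentence representation could replace the bag-of-words vectors without changing the selection logic.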

Description

Technical field

[0001] The invention relates to a method for calculating the similarity between news and cases based on an asymmetric twin network, and belongs to the technical field of natural language processing.

Background technique

[0002] The analysis of news public opinion in the legal field is a hot topic in current natural language processing research. Correlation analysis between news and cases is an important part of it: it is the basis and prerequisite for subsequent case-related news public opinion analysis, and it affects the accuracy of multiple follow-up tasks such as sentiment classification, topic analysis, and summary generation. To ensure the quality and performance of this follow-up work, it is necessary to construct a high-accuracy method for analyzing the correlation between news and cases. Using a Siamese network to calculate text correlation is an effective approach, which has a good learning ability ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/30, G06K9/62, G06F40/12
CPC: G06F18/22
Inventor: 余正涛, 赵承鼎, 郭军军, 线岩团, 黄于欣, 相艳
Owner: KUNMING UNIV OF SCI & TECH