
Text similarity calculation method and device

A text similarity calculation technology, applied in the field of data analysis, that addresses the limited efficiency and the difficulty of identifying similar comment texts, and achieves a small computational load, a reduced amount of subsequent calculation, and improved processing efficiency.

Active Publication Date: 2020-04-17
BEIJING QIYI CENTURY SCI & TECH CO LTD
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] At present, the similarity of online comment texts is generally identified based on social account attribute information. Because this approach must analyze and process user account attribute information in addition to the comment text itself, its identification efficiency is limited. Moreover, comments posted by trolls (the "water army") are difficult to identify this way, because the individual accounts of the trolls differ from one another.



Examples


Detailed Description of Embodiments

[0020] In order to make the above objects, features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0021] Embodiments of the present invention provide a method and device for calculating text similarity, aiming to quickly calculate the similarity of comment texts in the current context of large-scale network data, so as to provide a basis for identifying certain types of comments (such as troll comments).

[0022] Referring to Figure 1, which is a flowchart of a method for calculating text similarity provided by an embodiment of the present invention, the method includes:

[0023] S101: Determine multiple pieces of pending text.

[0024] Wherein, the pending text may refer to a network comment text, for example, a news comment text, a social platform comment text, a film and television resource comment text, and so on. The text can be commen...


Abstract

The invention provides a text similarity calculation method and device. The method comprises the following steps: determining a plurality of pending texts; converting each pending text into its corresponding character string list; calculating a text signature for each pending text on the basis of its character string list; finding all pending texts that share the same text signature to form a candidate text set, wherein any two texts in the candidate text set form a candidate pair; and calculating the similarity between the two texts of each candidate pair. The method and device can process massive volumes of network comment text while maintaining processing efficiency.
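The steps listed in the abstract can be illustrated with a minimal sketch. The abstract does not say how the character string list, the text signature, or the final similarity are computed, so everything specific here is an assumption made for illustration: character bigrams stand in for the character string list, the smallest hash over that list (a single-value MinHash) stands in for the text signature, and Jaccard similarity stands in for the similarity measure. All function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def to_string_list(text, n=2):
    """Assumed conversion: whitespace-stripped, lowercased character n-grams."""
    cleaned = "".join(text.split()).lower()
    if len(cleaned) <= n:
        return [cleaned] if cleaned else []
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


def stable_hash(s):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per run)."""
    return int.from_bytes(hashlib.md5(s.encode("utf-8")).digest()[:8], "big")


def text_signature(strings):
    """Assumed signature: the minimum hash over the string list."""
    return min(map(stable_hash, strings)) if strings else None


def jaccard(a, b):
    """Assumed similarity: |A & B| / |A | B| over the two string sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def candidate_similarities(texts):
    """Group texts by signature, then score only pairs inside each group."""
    lists = [to_string_list(t) for t in texts]

    # Texts with the same signature form one candidate text set.
    buckets = defaultdict(list)
    for idx, strings in enumerate(lists):
        sig = text_signature(strings)
        if sig is not None:  # skip empty texts
            buckets[sig].append(idx)

    # Any two texts in a candidate set form a candidate pair.
    results = {}
    for members in buckets.values():
        for i, j in combinations(members, 2):
            results[(i, j)] = jaccard(lists[i], lists[j])
    return results


if __name__ == "__main__":
    comments = [
        "great movie, must watch",
        "great movie, must watch",
        "xyz",
    ]
    for (i, j), score in candidate_similarities(comments).items():
        print(i, j, score)
```

Under these assumptions, only texts that land in the same signature bucket are compared, so the number of pairwise similarity calculations can be far smaller than the n(n-1)/2 pairs of an exhaustive comparison. With a single min-hash signature, near-duplicates can still fall into different buckets; a practical system would likely combine several signatures, which the abstract neither confirms nor rules out.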

Description

Technical field

[0001] The invention relates to the technical field of data analysis, and in particular to a text similarity calculation method and device.

Background

[0002] With the development of the Internet, news websites and social networking sites have become the main platforms through which users obtain information, and comment texts largely shape the direction of public opinion. Identifying text similarity therefore helps improve the credibility of Internet information. For example, groups of trolls known as the "water army" have emerged for commercial purposes. Acting on a unified purpose, the water army publishes comments with characteristic features on the Internet. Because these groups share a consistent purpose, their comment texts are highly similar.

[0003] At present, the similarity of online comment texts is generally identified based on social account attribute information. This method not only needs to analyze and p...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/35
CPC: G06F16/35
Inventor: 唐文韬
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD