Similar post determination method and device, storage medium and terminal

A technology for determining similar posts, applied in the fields of special data processing applications, instruments, and unstructured text data retrieval. It addresses repeated recommendations that degrade user experience, with the effects of increased query speed, improved deduplication efficiency, and improved user experience.

Pending Publication Date: 2019-04-23
BEIJING CHENGSHI WANGLIN INFORMATION TECH CO LTD
4 Cites, 3 Cited by

AI Technical Summary

Problems solved by technology

[0004] The present invention provides a method, device, storage medium and terminal for determining similar posts, which solve the problem in the prior art that repeated or similar posts are recommended to users at the same time, thereby affecting the user experience.

Embodiment Construction

[0023] The existing technology cannot judge in real time whether a new post is similar to existing content; this can only be done offline. After a new post has been stored in the database and before the repeated or similar posts have been deleted offline, there is a certain probability that repeated or similar posts will be recommended to users at the same time, which affects user experience. To solve this problem, the present invention provides a method, device, storage medium, and terminal for determining similar posts. The present invention is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.

[0024] The first embodiment of the present invention provides a method for determining similar posts, the flowchart of which is shown in Figure 1...

Abstract

The invention discloses a similar post determination method and device, a storage medium and a terminal. The method comprises the following steps: calculating a simhash value of a newly added post; dividing the simhash value into a plurality of equal parts; obtaining the simhash values of other posts that share an identical part with the newly added post; and calculating the Hamming distance between each of those simhash values and the simhash value of the newly added post. When the Hamming distance is smaller than a preset threshold value, the newly added post is determined to have similar posts, and the similar posts are then deleted. This method reduces the number of Hamming distance calculations between posts and increases query speed, so that whether a post has similar posts can be determined rapidly when the post is put in storage. Deduplication efficiency is improved, similar posts are determined and deleted rapidly, the same or similar posts are prevented from being repeatedly recommended to a user, and the user experience is improved.
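The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the 64-bit fingerprint width, the split into 4 segments of 16 bits, the threshold of 4, the unweighted MD5-based token hashing, and the names `simhash`, `segments`, and `SimilarPostIndex` are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict

BITS = 64                  # fingerprint width (assumed)
PARTS = 4                  # number of equal segments (assumed)
PART_BITS = BITS // PARTS  # 16 bits per segment
THRESHOLD = 4              # preset Hamming threshold (assumed)


def simhash(text: str) -> int:
    """Unweighted simhash over whitespace tokens (illustrative only)."""
    counts = [0] * BITS
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def segments(value: int):
    """Split a fingerprint into (position, bits) pairs of equal width."""
    mask = (1 << PART_BITS) - 1
    return [(i, (value >> (i * PART_BITS)) & mask) for i in range(PARTS)]


class SimilarPostIndex:
    """Hypothetical in-memory index keyed by fingerprint segment."""

    def __init__(self):
        self.buckets = defaultdict(set)  # (position, bits) -> post ids
        self.fingerprints = {}           # post id -> fingerprint

    def find_similar(self, fingerprint: int):
        # Only posts sharing at least one identical segment are compared.
        candidates = set()
        for key in segments(fingerprint):
            candidates |= self.buckets.get(key, set())
        return sorted(
            pid for pid in candidates
            if hamming(self.fingerprints[pid], fingerprint) < THRESHOLD
        )

    def add(self, post_id, fingerprint: int):
        self.fingerprints[post_id] = fingerprint
        for key in segments(fingerprint):
            self.buckets[key].add(post_id)
```

With these assumed parameters, two fingerprints whose Hamming distance is below 4 differ in at most 3 bits, so at least one of the 4 segments must be identical. Looking up candidates by shared segment therefore cannot miss a post that is within the threshold.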

Description

Technical field

[0001] The invention relates to the technical field of text mining, in particular to a method, device, storage medium and terminal for determining similar posts.

Background technique

[0002] When an application recommends relevant content to a user, it pushes existing posts in the content pool to the user. After a new post enters the content pool, it may be recommended to the user during content recommendation. However, the new post may duplicate the content of a post that already exists in the content pool, and pushing it to the user may cause repeated recommendations.

[0003] The existing technology determines whether a new post duplicates an existing post by calculating the simhash signature values of all posts in the content pool and of the new post, and then calculating the Hamming distance between the new post and each post in the content pool. However, the existing technolo...
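The pairwise comparison described in [0003] can be sketched as a linear scan over the content pool. This is only an illustration of the baseline: the function names, the dictionary layout of the pool, and the default threshold are assumptions, not details taken from the patent text.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_duplicates_linear(new_fp: int, pool: dict, threshold: int = 4):
    """Compare the new fingerprint against every post in the pool.

    `pool` maps a post id to its precomputed simhash value (assumed layout).
    One Hamming distance is computed per existing post.
    """
    return [
        post_id
        for post_id, fp in pool.items()
        if hamming_distance(new_fp, fp) < threshold
    ]
```

The scan performs one distance calculation per post in the pool, so the cost of checking each new post grows linearly with the pool size.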

Application Information

IPC(8): G06F17/22, G06F16/31
CPC: G06F40/194
Inventor: 王硕硕
Owner: BEIJING CHENGSHI WANGLIN INFORMATION TECH CO LTD