Text data filtering method and device and medium

A text data filtering technology, applied in the field of text processing

Pending Publication Date: 2019-10-18
TENCENT TECH CHENGDU
14 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

The prior art offers no method for separately filtering the different types of junk data found in user-generated content (UGC).



Embodiment Construction

[0072] The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the protection scope of the present invention.

[0073] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and in the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It is to be understood that terms used in this way are interchangeable under appropriate circumstances, so that the embodiments described herein can be practiced in sequences other than those illustrated or described herein. F...


Abstract

The invention provides a text data filtering method for filtering user-generated content, such as comments published by users in a post bar, a forum, or an application store. The method comprises the following steps: obtaining first text data to be filtered; filtering junk data out of the first text data with heuristic rules to obtain second text data; and filtering abnormal statements out of the second text data with a first language model to obtain third text data. The invention further provides a text data filtering device, computer equipment, and a medium. Different types of junk data in user-generated content can thus be filtered separately.
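The two filtering stages named in the abstract can be illustrated with a minimal Python sketch. The excerpt does not disclose the patent's actual heuristic rules or the form of its first language model, so everything concrete here is an assumption made for illustration: the rules in `passes_heuristics`, the add-one-smoothed word bigram model `BigramModel`, and the `max_perplexity` threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def passes_heuristics(text, min_len=4, max_repeat=4):
    """Stage 1 (hypothetical rules): reject obvious junk."""
    stripped = text.strip()
    if len(stripped) < min_len:
        return False  # too short to be a meaningful comment
    if re.search(r"https?://|www\.", stripped):
        return False  # contains a link, treated here as spam
    if re.search(r"(.)\1{%d,}" % max_repeat, stripped):
        return False  # one character repeated more than max_repeat times in a row
    return True


class BigramModel:
    """Stage 2 (assumed model): word bigrams with add-one smoothing."""

    def __init__(self, corpus):
        self.context_counts = Counter()
        self.bigram_counts = Counter()
        self.vocab = set()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            self.vocab.update(tokens)
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))

    def perplexity(self, sentence):
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen words
        log_prob = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            log_prob += math.log(numerator / denominator)
        return math.exp(-log_prob / (len(tokens) - 1))


def filter_text(first_text, model, max_perplexity):
    """Return (second_text, third_text) as described in the abstract."""
    # First text data -> second text data: heuristic rules.
    second_text = [t for t in first_text if passes_heuristics(t)]
    # Second text data -> third text data: drop statements the model finds abnormal.
    third_text = [t for t in second_text if model.perplexity(t) <= max_perplexity]
    return second_text, third_text
```

In this sketch, a comment that survives the heuristic stage is still dropped if the model, trained on a small set of comments assumed to be clean, assigns it a perplexity above the chosen threshold. Keeping the stages separate mirrors the abstract's point that different types of junk data are filtered by different mechanisms.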

Description

Technical field

[0001] The present invention relates to the technical field of text processing, and more specifically to a text data filtering method, device, and medium.

Background

[0002] User-generated content (UGC) includes, for example, comments posted by users on Baidu Tieba, in major forums, or in app stores. Website administrators need to manage UGC to keep large amounts of junk data from flooding the comment area and lowering the quality of comments.

[0003] Junk data in UGC comes in many different types. The prior art offers no method that filters these different types of junk data separately.

[0004] Therefore, the above problem remains to be solved.

Summary of the invention

[0005] In view of this, to solve the above problem, the present invention provides a text data filtering method. The technical solution is as follows:

[0006] A method for filtering text data, c...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/9536, G06F16/35
CPC: G06F16/9536, G06F16/35
Inventor: 徐灿
Owner: TENCENT TECH CHENGDU