Automatic discovery method of sensitive words and its device and application

A technology for the automatic discovery of sensitive words, applied in the field of data analysis. It addresses problems such as the difficulty of manual discovery, uneven report quality, and unstable reporting cycles, and achieves improved filtering efficiency, improved accuracy, and reduced online risk.

Active Publication Date: 2019-09-10
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

[0009] 2) Sensitive words mutate very quickly. Even for professional information security personnel, it is difficult and time-consuming to promptly discover the variant words that malicious accounts use to deliberately evade website rules, and the long-term presence of such harmful information also poses a high risk to the site.
Relying on a large number of Internet accounts to assist in the investigation can alleviate the above-mentioned problem 2) to a certain extent, but it has problems of its own, such as unstable reporting cycles and uneven report quality.

Examples

Embodiment Construction

[0037] In the following description, many technical details are set out to help readers better understand the application. However, those skilled in the art will understand that the technical solutions claimed in the claims of the present application can be realized even without these technical details, and with various changes and modifications based on the following embodiments.

[0038] In order to make the purpose, technical solution and advantages of the present invention clearer, the following will further describe the implementation of the present invention in detail in conjunction with the accompanying drawings.

[0039] The first embodiment of the present invention relates to a computer-implemented method for the automatic discovery of sensitive words. Figure 1 is a flow diagram of this method.

[0040] Specifically, as shown in Figure 1, the method for the automatic discovery of sensitive words comprises the follo...

Abstract

The invention relates to the field of data analysis and discloses a method, device, and application for the automatic discovery of sensitive words. The method comprises the following steps: obtaining reporting accounts whose reports have been established (upheld); obtaining the search keywords those accounts used before the established reports were made; and judging whether each search keyword is a sensitive word on the basis of the established reported information found in that keyword's search results. Sensitive words can thus be determined from the search behavior of reporting accounts, and the sensitive word bank is expanded effectively in real time.
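
The three steps of the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Report` and `Search` record types, the `search_results` mapping, and the `min_ratio` threshold are all hypothetical, and the abstract does not say how the established reported information in the search results is turned into a decision. The sketch assumes a simple ratio test and treats a search as "before" a report if it precedes the account's latest established report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    account: str       # account that filed the report
    item_id: str       # reported piece of content
    time: int          # when the report was filed
    established: bool  # True if the report was upheld


@dataclass(frozen=True)
class Search:
    account: str
    keyword: str
    time: int


def discover_sensitive_words(reports, searches, search_results, min_ratio=0.5):
    """Return keywords whose results are dominated by established reported items.

    search_results maps keyword -> list of item ids (hypothetical input).
    min_ratio is an assumed decision threshold, not taken from the source.
    """
    # Step 1: keep only reports that were established.
    established = [r for r in reports if r.established]
    reported_items = {r.item_id for r in established}

    # Latest established report time per reporting account.
    latest_report = {}
    for r in established:
        latest_report[r.account] = max(latest_report.get(r.account, r.time), r.time)

    # Step 2: keywords those accounts searched before an established report.
    candidates = {
        s.keyword
        for s in searches
        if s.account in latest_report and s.time < latest_report[s.account]
    }

    # Step 3: judge each candidate by the share of reported items in its results.
    sensitive = set()
    for keyword in candidates:
        results = search_results.get(keyword, [])
        if not results:
            continue
        hits = sum(1 for item in results if item in reported_items)
        if hits / len(results) >= min_ratio:
            sensitive.add(keyword)
    return sensitive
```

A ratio test is only one plausible reading of "on the basis of the established reported information"; an absolute count or a weighted score would fit the same three-step structure.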

Description

Technical field

[0001] The invention relates to the field of data analysis, in particular to an automatic discovery method of sensitive words and its device and application.

Background

[0002] Any UGC (user-generated content) website faces content security issues, including politically sensitive, pornographic, counterfeit, fraudulent, and advertising spam content. A text-based sensitive word filtering system is therefore indispensable. Such a system mainly includes the following modules:

[0003] 1) Creation and update of the word bank: this part mainly depends on manual collection.

[0004] 2) Preprocessing and index creation: this step mainly enables fast lookup in the subsequent steps, and very mature solutions exist, such as using a Trie (prefix tree) data structure.

[0005] 3) Content acquisition: depending on the specific business model, there are mainly two implementation methods, one is th...
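
The Trie-based index mentioned in [0004] can be illustrated with a small sketch. The class and method names below (`SensitiveWordTrie`, `add`, `find_all`) are hypothetical and chosen only for illustration; the source names the data structure but does not describe an implementation.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_word = False  # True if a sensitive word ends at this node


class SensitiveWordTrie:
    """Prefix tree over a word bank, used to scan text for sensitive words."""

    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.add(word)

    def add(self, word):
        if not word:
            return  # ignore empty entries
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def find_all(self, text):
        """Return (start_index, word) for every sensitive word found in text."""
        matches = []
        for start in range(len(text)):
            node = self.root
            for end in range(start, len(text)):
                node = node.children.get(text[end])
                if node is None:
                    break  # no sensitive word continues with this character
                if node.is_word:
                    matches.append((start, text[start:end + 1]))
        return matches
```

For example, `SensitiveWordTrie(["spam", "ham"]).find_all("no spam or ham")` returns `[(3, "spam"), (11, "ham")]`. Each scan position walks the tree only as far as some word in the bank shares a prefix with the text, which is what makes lookup fast for large word banks.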

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/9035, G06F16/9535, G06F17/27
Inventor: 薛晖
Owner: ALIBABA GRP HLDG LTD