A Method for Discovering Occasional Sensitive Words Based on Word Network

A method for discovering sensitive words, applied in the computer field, that offers a simple and direct measure of sensitivity, a simple and fast discovery procedure, and convenient data collection.

Active Publication Date: 2022-04-08
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

There is currently no complete and fast method for discovering occasional sensitive words.




Embodiment Construction

[0028] In order to understand the characteristics and technical contents of the embodiments of the present invention in more detail, the implementation of the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. The attached drawings are only for reference and description, and are not intended to limit the embodiments of the present invention.

[0029] In order to clearly illustrate the design idea of the present invention, the present invention will be described below in conjunction with embodiments.

[0030] FIG. 1 is a flowchart of a method for discovering occasional sensitive words based on a word network according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

[0031] Step 1. Use Internet public text data or other social platform text information collection channels to collect Internet text data inc...


Abstract

A method for discovering occasional sensitive words based on a word network. The method collects public Internet text data, or Internet text obtained through other channels, together with each text's language setting and posting time. The texts are extracted and divided into time slices by a time granularity (generally 1 day). An existing dictionary of common sensitive words is used to screen sensitive texts in a specific language; each text is cut into short texts at its punctuation marks, and the short texts are segmented into words. A word network is built from the short texts, and the maximum K-core value of the network is calculated, along with the K-core value and core coefficient of each word. For each selected core word, the method extracts the number of times the word was in the core position within a specified historical period (generally 30 days) and its average core coefficient over the time slices of that period in which it was not in the core position. Finally, a detection formula is applied to find the occasional sensitive words in the word network.
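The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the punctuation set, the whitespace `segment` placeholder (a real system would use a proper word segmenter), the rule that words co-occurring in one short text are linked, and the definition of the core coefficient as a word's K-core value divided by the network's maximum K-core value are all assumptions made here. The detection formula itself is not given in this excerpt, so it is left out.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumption: short texts are delimited by common Chinese and English punctuation.
PUNCT = re.compile(r"[,.!?;:，。！？；：、\n]+")

def split_short_texts(text):
    """Cut a text into short texts at punctuation marks."""
    return [s.strip() for s in PUNCT.split(text) if s.strip()]

def segment(short_text):
    """Hypothetical placeholder segmenter: splits on whitespace."""
    return short_text.split()

def build_word_network(texts):
    """Assumption: link every pair of distinct words sharing a short text."""
    adj = defaultdict(set)
    for text in texts:
        for short in split_short_texts(text):
            words = sorted(set(segment(short)))
            for w in words:
                adj[w]  # touching the key registers isolated words too
            for a, b in combinations(words, 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj

def core_numbers(adj):
    """K-core value of each word, by repeatedly peeling a minimum-degree node."""
    degree = {n: len(nb) for n, nb in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        n = min(remaining, key=degree.__getitem__)
        k = max(k, degree[n])  # core value never decreases during peeling
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core

def history_features(word, daily_cores):
    """daily_cores: one {word: K-core value} dict per time slice in the window.

    Returns (times in core position, average core coefficient on the slices
    where the word was not in the core position). The core coefficient is
    assumed here to be k / k_max.
    """
    times_in_core, non_core = 0, []
    for core in daily_cores:
        if not core:
            continue
        k_max = max(core.values())
        k = core.get(word, 0)
        if k_max > 0 and k == k_max:
            times_in_core += 1
        else:
            non_core.append(k / k_max if k_max else 0.0)
    avg = sum(non_core) / len(non_core) if non_core else 0.0
    return times_in_core, avg
```

In use, one word network would be built per time slice (for example, per day), `core_numbers` applied to each, and the resulting dictionaries for the preceding 30 slices passed to `history_features` for each candidate core word.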

Description

Technical Field

[0001] The invention relates to the field of computer technology, and in particular to a word-network-based method for discovering occasional sensitive words.

Background Art

[0002] Internet sensitive words are content, such as uncivilized language, that is blocked by network technology or tracked by topic in real time. In today's network environment, sensitive words that have long been at the core of discussion can usually be detected and blocked by network technology, and these high-frequency sensitive words form a fixed sensitive-word dictionary. In sensitive-word management, however, occasional sensitive words have received relatively little study. These occasional sensitive words have never been in the fixed sensitive-word dictionary, but as certain emergencies or popular topics evolve, they often become closely related to high-frequency sensitive words within a certain period of time, and after a certain p...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F40/242, G06F40/284, G06F40/289, G06F16/951
CPC: G06F16/951
Inventor: 赵吉昌, 赵怡雯, 杨阳, 盛浩
Owner: BEIHANG UNIV