
Accidental sensitive word discovery method based on word network

A sensitive-word discovery method applied in the computer field. It makes data collection convenient, measures sensitivity simply and directly, and provides a simple, fast discovery procedure.

Active Publication Date: 2020-07-28
BEIHANG UNIV

AI Technical Summary

Problems solved by technology

However, a complete and fast method for discovering occasional sensitive words is currently lacking.




Embodiment Construction

[0030] In order to understand the characteristics and technical contents of the embodiments of the present invention in more detail, the implementation of the embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings. The attached drawings are only for reference and description, and are not intended to limit the embodiments of the present invention.

[0031] In order to clearly illustrate the design idea of the present invention, the present invention will be described below in conjunction with embodiments.

[0032] Figure 1 is a flowchart of a method for discovering occasional sensitive words based on a word network according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

[0033] Step 1. Use Internet public text data or other social platform text information collection channels to collect Internet text data inc...


Abstract

The invention discloses an accidental sensitive word discovery method based on a word network. The method comprises the following steps: obtaining Internet text information from public Internet text data or other channels, together with the language of each text and the specific time at which each message was sent; extracting the texts, dividing them by time with a time granularity (generally set to one day) as the unit, screening the sensitive texts of a specific language by means of an existing dictionary of common sensitive words, cutting each text at its punctuation marks to obtain a plurality of short texts, and performing word segmentation on the short texts; constructing a word network on the basis of the short texts, and calculating the maximum K-core value of the word network as well as the K-core value and core coefficient of each word in the network; and, for each selected core word, extracting the number of times the word was at the core position within a specified historical period (generally set to 30 days) and its average core coefficient over the parts of that period in which it was not at the core position, and finally discovering accidental sensitive words in the word network by means of a detection formula.
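The network-construction steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, whitespace splitting stands in for a real word segmenter, and the rule that two words are linked when they appear in the same short text is an assumed edge definition. The core coefficient and the detection formula are not defined in this excerpt, so the sketch stops at the K-core values.

```python
import re
from collections import defaultdict
from datetime import datetime
from itertools import combinations

# Assumption: short texts are cut at common Chinese and English punctuation marks.
PUNCT = re.compile(r"[,.!?;:，。！？；：、]+")


def group_by_day(messages):
    """Bucket (timestamp, text) pairs by calendar day (time granularity of one day)."""
    buckets = defaultdict(list)
    for timestamp, text in messages:
        buckets[timestamp.date()].append(text)
    return buckets


def short_texts(text):
    """Cut a text at punctuation marks and drop empty fragments."""
    return [part.strip() for part in PUNCT.split(text) if part.strip()]


def tokenize(fragment):
    """Placeholder segmenter (assumption): split on whitespace."""
    return fragment.split()


def build_word_network(texts):
    """Link every pair of distinct words that co-occur in one short text (assumed rule)."""
    graph = {}
    for text in texts:
        for fragment in short_texts(text):
            words = sorted(set(tokenize(fragment)))
            for word in words:
                graph.setdefault(word, set())
            for a, b in combinations(words, 2):
                graph[a].add(b)
                graph[b].add(a)
    return graph


def core_numbers(graph):
    """Standard k-core decomposition: repeatedly peel the node of minimum degree."""
    degree = {node: len(neighbours) for node, neighbours in graph.items()}
    remaining = set(graph)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=lambda n: (degree[n], n))
        k = max(k, degree[node])  # the core level never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in graph[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core
```

For one day's texts, `core_numbers(build_word_network(texts))` gives the K-core value of each word, and the maximum of those values is the maximum K-core value of that day's network. Words whose value equals the maximum are natural candidates for the "core position" mentioned in the abstract, although the excerpt does not spell out that criterion.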

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a word network-based discovery method for occasional sensitive words.

Background technique

[0002] Internet sensitive words are words with politically sensitive tendencies, violent tendencies, or unhealthy connotations, as well as uncivilized terms, that Internet technology blocks or tracks in real time. In today's network environment, sensitive words that have been at the core of discussions for a long time can often be detected and blocked by network technology, and these high-frequency sensitive words form a fixed dictionary of sensitive words. However, in the management of sensitive words, there are relatively few studies on occasional sensitive words. These occasional sensitive words have never been in the fixed sensitive word dictionary, but with the evolution of some emergencies or popular topics, they will often be closely related to high-frequency sensitive words within a certain...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/242, G06F40/284, G06F40/289, G06F16/951
CPC: G06F16/951
Inventor: 赵吉昌, 赵怡雯, 杨阳, 盛浩
Owner: BEIHANG UNIV