Sensitive word filtering method and device

A sensitive word filtering method and device, applied in the field of information processing, address the problem that filtering accuracy falls short of expectations. The effects are improved accuracy, fewer false positives (legitimate text mistakenly filtered), and reduced labor costs.

Pending Publication Date: 2021-12-07
BEIJING WODONG TIANJUN INFORMATION TECH CO LTD +1

AI Technical Summary

Problems solved by technology

[0003] In the process of implementing this application, the inventors found that the accuracy of filtering with existing sensitive word filtering methods falls short of expectations.

Examples

Embodiment 1

[0038] Referring to Figure 1, which is a schematic diagram of the sensitive word filtering process in Embodiment 1 of the present application, the specific steps are as follows:

[0039] Step 101: acquire the target text.

[0040] The target text is the text to be filtered for sensitive words.

[0041] The target text may be obtained by receiving a text request sent by a client;

[0042] it may also be obtained by copying, transfer, and so on.

[0043] Before word segmentation is performed on the target text, the target text may be preprocessed.

[0044] In this step, preprocessing the target text specifically includes:

[0045] filtering out special symbols in the target text;

[0046] special symbols include, for example, quotation marks (""), %, #, and @;

[0047] converting traditional Chinese characters in the target text into simplified characters;

[0048] filtering out stop words in the target text.

[0049] Stop words include modal particles...
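The three preprocessing operations listed above can be sketched in Python as follows. This is only an illustrative sketch, not the patent's implementation: the symbol set, the tiny traditional-to-simplified map `TRAD_TO_SIMP`, and the stop-word list `STOP_WORDS` are all hypothetical placeholders, and stop words are treated as single characters for simplicity.

```python
import re

# Assumption: symbols to strip, based on the examples given in the text.
SPECIAL_SYMBOLS = re.compile(r"[\"“”%#@]")

# Hypothetical, deliberately tiny traditional-to-simplified map;
# a real system would use a complete conversion table.
TRAD_TO_SIMP = {"發": "发", "體": "体", "國": "国"}

# Hypothetical stop-word list (single-character modal particles).
STOP_WORDS = {"啊", "吧", "呢", "吗"}


def preprocess(text: str) -> str:
    """Strip special symbols, convert to simplified characters, drop stop words."""
    text = SPECIAL_SYMBOLS.sub("", text)
    text = "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)
    return "".join(ch for ch in text if ch not in STOP_WORDS)


if __name__ == "__main__":
    print(preprocess("中國@好啊#"))
```

Removing stop words character by character is a simplification; with multi-character stop words, removal would more naturally happen after word segmentation.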

Embodiment 2

[0062] Referring to Figure 2, which is a schematic diagram of the sensitive word filtering process in Embodiment 2 of the present application, the specific steps are as follows:

[0063] Step 201: acquire the target text.

[0064] The target text is the text to be filtered for sensitive words.

[0065] The target text may be obtained by receiving a text request sent by a client;

[0066] it may also be obtained by copying, transfer, and so on.

[0067] In this embodiment of the present application, before word segmentation is performed on the target text, the target text may first be preprocessed.

[0068] Specifically, preprocessing the target text includes:

[0069] filtering out special symbols in the target text;

[0070] converting traditional Chinese characters in the target text into simplified characters;

[0071] filtering out stop words in the target text.

Abstract

The invention provides a sensitive word filtering method and device. The method comprises: obtaining a target text; performing word segmentation on the target text; matching sensitive words in the segmented target text and storing the matched sensitive words in a first sensitive word set; determining, based on a preset Markov logic network model, whether the emotion type of the context corresponding to each sensitive word in the target text is a specified emotion type; deleting from the first sensitive word set the sensitive words whose context belongs to the specified emotion type, to obtain a second sensitive word set; and filtering the target text using the sensitive words in the second sensitive word set. The method improves sensitive word filtering accuracy and avoids mistakenly filtering legitimate text, while reducing labor costs.
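The flow described in the abstract can be sketched roughly as below. Everything here is an assumption made for illustration: whitespace splitting stands in for word segmentation, and the hypothetical `benign_context` keyword check stands in for the preset Markov logic network model, which the source does not describe in detail. The decision is also made per word, not per occurrence, which is a simplification.

```python
from typing import Callable, List, Set


def context_of(tokens: List[str], word: str, window: int = 2) -> List[str]:
    """Collect tokens within `window` positions of each occurrence of `word`."""
    ctx: List[str] = []
    for i, tok in enumerate(tokens):
        if tok == word:
            ctx.extend(tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window])
    return ctx


def filter_text(
    text: str,
    lexicon: Set[str],
    is_specified_emotion: Callable[[List[str]], bool],
    window: int = 2,
    mask: str = "*",
) -> str:
    tokens = text.split()  # stand-in for word segmentation
    # First sensitive word set: every token found in the lexicon.
    first_set = {t for t in tokens if t in lexicon}
    # Second set: drop words whose context belongs to the specified emotion type.
    second_set = {
        w for w in first_set
        if not is_specified_emotion(context_of(tokens, w, window))
    }
    # Filter the text using only the second set.
    return " ".join(mask * len(t) if t in second_set else t for t in tokens)


def benign_context(ctx: List[str]) -> bool:
    """Hypothetical stand-in classifier; NOT a Markov logic network."""
    return any(tok in {"movie", "game"} for tok in ctx)


if __name__ == "__main__":
    lexicon = {"kill"}
    print(filter_text("that movie kill scene was great", lexicon, benign_context))
    print(filter_text("I will kill you", lexicon, benign_context))
```

In this sketch the first sentence passes through unchanged because its context is classified as the specified (benign) type, while the second has the matched word masked.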

Description

Technical field

[0001] The present invention relates to the technical field of information processing, and in particular to a method and device for filtering sensitive words.

Background

[0002] At present, sensitive word filtering is mainly carried out using a sensitive word database and sensitive word dictionary trees. Implementations based on sensitive word dictionary trees mainly use support vector machine classification and naive Bayesian classification.

[0003] In the process of implementing the present application, the inventors found that the accuracy of filtering with the above sensitive word filtering methods falls short of expectations.

Summary of the invention

[0004] In view of this, the present application provides a method and device for filtering sensitive words, which can improve the accuracy of sensitive word filtering and avoid mistakenly filtering legitimate text, while reducing labor costs.

[0005] In order to solve the pr...
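For readers unfamiliar with the dictionary-tree approach mentioned in the background, the following is a minimal sketch of a trie-based sensitive word lookup. It illustrates the general technique only and is not taken from the patent; the function names and the `"$end"` marker key are hypothetical.

```python
from typing import Dict, List, Tuple

END = "$end"  # hypothetical marker key; cannot collide with a single-character key


def build_trie(words: List[str]) -> Dict:
    """Build a nested-dict trie from a list of sensitive words."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def find_matches(text: str, trie: Dict) -> List[Tuple[int, str]]:
    """Return (start_index, word) for every dictionary word found in `text`."""
    matches: List[Tuple[int, str]] = []
    for start in range(len(text)):
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:
                break  # no dictionary word continues along this path
            if END in node:
                matches.append((start, text[start:end + 1]))
    return matches


if __name__ == "__main__":
    trie = build_trie(["bad", "badge"])
    print(find_matches("a badge", trie))
```

A lookup of this kind matches on surface form alone, which is why context-insensitive matching can flag legitimate text.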

Application Information

IPC(8): G06F16/33, G06F16/335, G06F40/289, G06K9/62
CPC: G06F16/335, G06F16/3334, G06F40/289, G06F18/24323, G06F18/295, G06F18/214
Inventor: 李雨航, 余欢
Owner: BEIJING WODONG TIANJUN INFORMATION TECH CO LTD