A noisy illegal short text recognition method based on a dual-channel text convolutional neural network

This convolutional neural network recognition method, applied in the field of computer natural language processing, addresses the difficulty of identifying and constructing the variant features that illegal users introduce into text, and improves recognition accuracy and robustness.
CN109670041A · Inactive · Publication Date: 2019-04-23 · TIANGE TECH HANGZHOU

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
TIANGE TECH HANGZHOU
Publication Date
2019-04-23
Estimated Expiration
Not applicable · inactive patent

Abstract

The invention relates to a method for recognizing noisy illegal short texts based on a dual-channel text convolutional neural network. The method comprises three steps: preprocessing the noisy short texts, constructing a dual-channel text convolutional neural network model, and training the model and using it for real-time recognition. Preprocessing standardizes the noise characters, which eliminates the influence of noise and improves the learning ability of the convolutional neural network model. The dual-channel model is a text convolutional neural network that takes a preprocessed character sequence and a preprocessed pinyin sequence as simultaneous inputs. Because the model adds the capacity to input and model the pinyin sequence, it can eliminate the influence of homophone character replacement on classification performance. The method handles homophone character replacement, replacement with similarly shaped English characters, replacement with numeric symbols of the same meaning, and similar noise. Experimental results show that the method achieves higher recognition accuracy and a lower false detection rate on noisy illegal short texts.
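The preprocessing idea in the abstract can be illustrated with a minimal sketch. It is not the patent's actual procedure: the lookup tables `TOY_PINYIN` and `SAME_MEANING`, the function names, and the choice of Unicode NFKC normalization are all assumptions made for illustration. The sketch standardizes a noisy string and then derives the two inputs the abstract describes, a character sequence and a pinyin sequence.

```python
import unicodedata

# Hypothetical toy tables (assumptions, not from the patent). A real system
# would need a full pinyin dictionary and a curated substitution list.
TOY_PINYIN = {
    "加": "jia", "佳": "jia",
    "微": "wei", "威": "wei", "薇": "wei",
    "信": "xin", "芯": "xin",
}
# Blunt rule for illustration: map Chinese numerals to ASCII digits.
SAME_MEANING = {
    "零": "0", "一": "1", "二": "2", "三": "3", "四": "4",
    "五": "5", "六": "6", "七": "7", "八": "8", "九": "9",
}


def normalize(text):
    """Standardize noise characters in a short text."""
    # NFKC folds full-width letters and circled digits to plain ASCII forms.
    text = unicodedata.normalize("NFKC", text).lower()
    out = []
    for ch in text:
        if ch.isspace():
            continue  # drop whitespace inserted to break up keywords
        out.append(SAME_MEANING.get(ch, ch))
    return "".join(out)


def to_channels(text):
    """Return (character sequence, pinyin sequence) for the two channels."""
    chars = list(normalize(text))
    # Characters missing from the toy table pass through unchanged.
    pinyin = [TOY_PINYIN.get(ch, ch) for ch in chars]
    return chars, pinyin


chars_a, pinyin_a = to_channels("加威信")
chars_b, pinyin_b = to_channels("佳薇芯")
print(chars_a != chars_b)    # the character channels differ
print(pinyin_a == pinyin_b)  # the pinyin channels coincide
```

In this toy example the two strings are homophone variants of each other: their character sequences differ, but their pinyin sequences are identical, which is the property that lets a pinyin channel see through homophone replacement.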

Description

Technical field

[0001] The invention belongs to the field of computer natural language processing and relates to a method for recognizing noisy illegal short texts based on a dual-channel text convolutional neural network.

Background art

[0002] With the rapid development of the Internet, sharing and exchanging information and opinions online has become an important class of network applications. For example, users discuss issues on BBS forums; post views, news, and comments on Weibo; chat through instant messaging tools; comment on the comment pages of news websites; communicate through live video services; and comment on video content through bullet comments (barrage) during playback. This mode of user-generated content facilitates information sharing and communication among users. However, this way of publishing Internet content is also easily exploited by criminals to release some illegal advertising informati...
