An unsupervised method for long-text recognition of Internet public opinion spam
A method for identifying spam in Internet public opinion text, applied in the field of information processing. It addresses problems of existing approaches such as being unusable, low accuracy, and the high cost of monitoring data, and achieves the effect of reducing costs.
Examples
Embodiment 1
[0052] An unsupervised long-text recognition method for Internet public opinion spam. For a long Internet public opinion text to be predicted, the method includes the following steps:
[0053] (1) Corpus acquisition: obtain the labeled public opinion spam text and normal text from the existing internal system;
[0054] (2) Model training: build two models, namely a language model trained on Internet public opinion text and a BERT next sentence prediction model based on Internet public opinion text, then input the long text to be predicted into the language model and into the BERT next sentence prediction model respectively;
[0055] The judgment process of the language model is as follows:
[0056] (X1) Statistical language model;
[0057] The statistical language model is used to calculate the probability that a sentence S is a normal sentence, formalized as $p(S) = p(w_1, w_2, \ldots, w_n)$, where ...
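The source text is truncated at this point, so the following is only a generic sketch of how a statistical language model can score a sentence, not the patent's own formulation. It assumes a bigram approximation of $p(w_1, w_2, \ldots, w_n)$ with add-one smoothing, and it assumes sentences are already tokenized. The function names `train_bigram` and `log_prob` and the sentence boundary markers `<s>` and `</s>` are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences.

    Assumption: each sentence is a list of tokens; "<s>" and "</s>"
    are illustrative boundary markers, not taken from the source.
    """
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Return log p(S) under a bigram model with add-one smoothing.

    Assumption: p(S) is approximated as the product of
    p(w_i | w_{i-1}); the source does not state the model order.
    """
    vocab_size = len(unigrams)
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


# Toy corpus standing in for "normal" public opinion text.
corpus = [
    ["the", "market", "rose", "today"],
    ["the", "market", "fell", "today"],
]
unigrams, bigrams = train_bigram(corpus)

fluent = log_prob(["the", "market", "rose", "today"], unigrams, bigrams)
scrambled = log_prob(["today", "rose", "market", "the"], unigrams, bigrams)
print(fluent > scrambled)
```

A sentence whose word order resembles the training text receives a higher log probability than a scrambled one, which is the property a spam filter of this kind would rely on. How the score is thresholded or combined with the BERT next sentence prediction output is not stated in the visible text and is left open here.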
Embodiment 2
[0068] An unsupervised long-text recognition method for Internet public opinion spam. For a long Internet public opinion text to be predicted, the method includes the following steps:
[0069] (1) Corpus acquisition: obtain the labeled public opinion spam text and normal text from the existing internal system;
[0070] (2) Model training: build two models, namely a language model trained on Internet public opinion text and a BERT next sentence prediction model based on Internet public opinion text, then input the long text to be predicted into the language model and into the BERT next sentence prediction model respectively;
[0071] The judgment process of the language model is as follows:
[0072] (X1) Statistical language model;
[0073] The statistical language model is used to calculate the probability that a sentence S is a normal sentence, formalized as $p(S) = p(w_1, w_2, \ldots, w_n)$, where ...
Embodiment 3
[0091] An unsupervised long-text recognition method for Internet public opinion spam. For a long Internet public opinion text to be predicted, the method includes the following steps:
[0092] (1) Corpus acquisition: obtain the labeled public opinion spam text and normal text from the existing internal system;
[0093] (2) Model training: build two models, namely a language model trained on Internet public opinion text and a BERT next sentence prediction model based on Internet public opinion text, then input the long text to be predicted into the language model and into the BERT next sentence prediction model respectively;
[0094] The judgment process of the language model is as follows:
[0095] (X1) Statistical language model;
[0096] The statistical language model is used to calculate the probability that a sentence S is a normal sentence, formalized as $p(S) = p(w_1, w_2, \ldots, w_n)$, where ...