Method and device for doubly judging bad short messages by pre-training model and short message address

A pre-training-model and SMS technology, applied in the fields of information technology and information security, that addresses problems such as the discrepancy between pre-training and fine-tuning and the ignored dependencies between masked positions, and achieves better interpretability and a more intuitive analysis.

Active Publication Date: 2020-08-28

BEIJING ACT TECH DEV CO LTD +1

11 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

BERT requires part of its input to be masked, and it predicts each masked token without modeling the dependencies between the masked positions. Because the masking is applied to the pre-training input, the conditions of pre-training and fine-tuning differ (the pretrain-finetune discrepancy).
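For context, the contrast can be written out with the training objectives as they are commonly stated for BERT and XLNet. The notation is an assumption added for illustration and does not appear in the patent text: \hat{x} is the corrupted (masked) input, \bar{x} the masked tokens, m_t = 1 when position t is masked, and \mathcal{Z}_T the set of permutations of the positions 1..T.

```latex
\begin{align}
\text{BERT:}\quad & \max_\theta \; \log p_\theta(\bar{x} \mid \hat{x})
  \approx \sum_{t=1}^{T} m_t \, \log p_\theta(x_t \mid \hat{x}) \\
\text{XLNet:}\quad & \max_\theta \; \mathbb{E}_{z \sim \mathcal{Z}_T}
  \left[ \sum_{t=1}^{T} \log p_\theta\!\left(x_{z_t} \mid x_{z_{<t}}\right) \right]
\end{align}
```

The approximation sign in the first line is where the independence assumption enters: each masked token is predicted from \hat{x} alone, so dependencies among masked positions are dropped. The second line is an exact autoregressive factorization over a sampled order, so no such assumption is needed.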

Embodiment Construction

[0035] Referring to Figure 1, the method and device of the present invention for doubly judging bad short messages by a pre-training model plus short message address is composed of a classified short message sample set 1, a pre-training model module 2, a short message collector 3, a text processor 4, an address extractor 5, a web crawler 6, and a bad short message judger 7;
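As an illustration of what the address extractor 5 might do, the following is a minimal sketch that pulls URLs out of an SMS body. The regular expression, the function name `extract_addresses`, and the trailing-punctuation rule are all assumptions for illustration; the visible patent text does not specify how addresses are extracted.

```python
import re

# Hypothetical pattern (not from the patent): an http(s):// or www. prefix
# followed by ASCII characters that are legal in a URL. Restricting the match
# to ASCII stops it at adjacent Chinese text that has no separating space.
URL_PATTERN = re.compile(
    r"(?:https?://|www\.)[A-Za-z0-9\-._~:/?#@!$&'()*+,;=%]+",
    re.IGNORECASE,
)


def extract_addresses(sms_text):
    """Return the URLs found in an SMS body, in order, without duplicates."""
    found = []
    for match in URL_PATTERN.findall(sms_text):
        # Drop sentence punctuation that was swallowed at the end of the URL.
        url = match.rstrip(".,;:!?)")
        if url and url not in found:
            found.append(url)
    return found
```

In a pipeline like the one described, each extracted address would then be handed to the web crawler 6.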

[0036] Classified SMS sample set 1 stores classified SMS samples, and the number of classified SMS samples is greater than 150 and less than 1000;
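The size constraint on the sample set can be expressed as a simple check. This is a sketch only: the function name and the representation of a sample as a `(text, label)` pair are assumptions, while the bounds (more than 150, fewer than 1000) come from paragraph [0036].

```python
# Bounds from paragraph [0036]; both are exclusive.
MIN_SAMPLES_EXCLUSIVE = 150
MAX_SAMPLES_EXCLUSIVE = 1000


def sample_set_size_ok(samples):
    """Return True when the sample count is strictly between 150 and 1000.

    `samples` is assumed to be a list of (text, label) pairs.
    """
    return MIN_SAMPLES_EXCLUSIVE < len(samples) < MAX_SAMPLES_EXCLUSIVE
```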

[0037] The pre-training model module 2 uses the XLNet pre-training model to perform classification computation on the classified short message sample set 1; after processing the classified short message sample set 1, the pre-training model module 2 can output a classification label for any input text;

[0038] To give classification labels to the input text, the pre-training model module 2 programmatically loads the XLNet pre-training model, first formats a tf_record file, then performs...

Abstract

The invention discloses a method and device for doubly judging bad short messages by means of a pre-training model and the short message address, and relates to the technical field of information. The device is composed of a classified short message sample set, a pre-training model module, a short message collector, a text processor, an address extractor, a web crawler and a bad short message judger. The invention solves the problem that traditional machine learning depends on features in bad short message recognition. Compared with deep learning, it does not need a large training set. It can also judge through the URL short links in short messages, so short messages with sparse semantic information are still recognized well. In addition, judging the nature of a short message by combining its text information with its short message address gives better interpretability and a more intuitive analysis than judging only by the IP of the short message address.
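The "double judgment" idea can be sketched as a small combining step. Everything here is an assumption for illustration: the label strings, the `Verdict` type, the function name `judge_sms`, and in particular the rule that an SMS is bad if either the text check or any address check says so. The visible patent text does not state the actual combination rule. The inputs are assumed to come from the pre-training model module (a label for the SMS text) and from classifying the content the web crawler retrieved for each address.

```python
from dataclasses import dataclass, field

# Hypothetical label names; the patent text does not list its label set.
BAD = "bad"
NORMAL = "normal"


@dataclass
class Verdict:
    is_bad: bool
    reasons: list = field(default_factory=list)


def judge_sms(text_label, address_labels):
    """Combine the SMS text label with the labels of its addresses.

    Assumed rule (not from the source): the SMS is judged bad when the
    text label is bad or when any address's content is labeled bad.
    `address_labels` maps each URL to the label of its crawled content.
    """
    reasons = []
    if text_label == BAD:
        reasons.append("SMS text classified as bad")
    for url, label in address_labels.items():
        if label == BAD:
            reasons.append(f"content at {url} classified as bad")
    return Verdict(is_bad=bool(reasons), reasons=reasons)
```

Recording a reason for each contributing check is one way to get the interpretability the abstract mentions: the verdict shows whether the text, the address, or both led to the judgment.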

Description

Technical field

[0001] The present invention relates to the field of information technology, in particular to the field of information security technology.

Background technique

[0002] At present, the information security of mobile phone text messages receives attention from society as a whole. Research on identifying bad text messages mainly follows two approaches: analysis based on text classification and analysis based on the URL in the text message.

[0003] Text-based analysis of bad information relies mainly on traditional machine learning algorithms and on deep learning methods. Traditional machine learning, as in invention patent CN110147448A, selects features by constructing dual feature engineering. The text features of bad text messages are ever-changing, and the text feature extraction method in traditional machine learning cannot be fully applied to the actual situation of bad text message cl...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): H04W12/12, H04W4/14, G06K9/62, G06F16/955, G06F16/951, G06F16/35

CPC: H04W12/12, G06F16/35, H04W4/14, G06F16/951, G06F16/9566, G06F18/214, Y02D30/70

Inventor: 林飞, 潘练, 王森, 蒋天翔, 古元

Owner: BEIJING ACT TECH DEV CO LTD
