Method for identifying and disposing junk call content

A method for identifying and handling spam call content, applied in the field of telecommunications. It addresses the narrow coverage of database-stored information and the resulting inability to defend comprehensively against harassing calls, and achieves high accuracy.

Status: Inactive
Publication Date: 2018-06-05
Owner: ZHEJIANG PONSHINE INFORMATION TECH CO LTD
7 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

The existing scheme focuses on matching call content against database information; a successful match is identified as fraudulent. Because the information stored in the database is relatively narrow, the scheme cannot defend comprehensively against the various kinds of harassing calls.



Examples


Embodiment 1

[0033] Step A1: Use a recording device to collect the call content; a speech recognition device then recognizes the recorded data and converts it into text data;

[0034] Step A2: Use regular expressions to remove the non-text parts of the text data;

[0035] Step A3: Divide the samples into training samples and test samples at a ratio of 3:1;

[0036] Step A4: Text word segmentation: use the jieba ("stammer") word segmentation tool to segment the text data;

[0037] Step A5: Preset a list of invalid words (stop words) and remove from the text any words that match it;

[0038] Step A6: Use Word2Vec to convert the segmented words into word vectors;

[0039] Step A7: Convert the word vectors into a sentence vector using the LSTM algorithm;

[0040] Step A8: Use the sentence vector as the input vector of the DNN classification model;

[0041] Step A9: Select the result with the largest...
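
The text-preparation steps A2 to A5 can be illustrated with a minimal sketch using only the Python standard library. Everything named here is an assumption made for illustration: the function names, the `STOP_WORDS` set, and the whitespace tokenizer, which merely stands in for the word segmentation tool of Step A4 (a real Chinese corpus would need a proper segmenter). The patent does not specify the regular expression or the invalid-word list.

```python
import random
import re

# Hypothetical invalid-word list for Step A5; the patent does not give one.
STOP_WORDS = {"the", "a", "um", "uh"}


def clean_text(text):
    """Step A2: replace every character that is not a word character or
    whitespace with a space. In Python 3, \\w is Unicode-aware, so Chinese
    characters are kept."""
    return re.sub(r"[^\w\s]", " ", text)


def tokenize(text):
    """Stand-in for Step A4: lowercase and split on whitespace.
    This is NOT the segmentation tool named in the patent."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Step A5: drop tokens that match the preset invalid-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def split_samples(samples, ratio=(3, 1), seed=0):
    """Step A3: shuffle, then split into training and test samples at 3:1."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * ratio[0] // (ratio[0] + ratio[1])
    return shuffled[:cut], shuffled[cut:]


tokens = remove_stop_words(tokenize(clean_text("Hello!!! Your account, um, is frozen...")))
train, test = split_samples(range(8))
```

With eight samples, the split yields six training samples and two test samples. The fixed seed only makes the shuffle reproducible; it is not part of the described method.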

Embodiment 2

[0049] As shown in Figure 2, this embodiment takes "I am from the Public Security Bureau, your account is suspected of money laundering, please cooperate with the investigation" as an example. The design of the present invention is based on a hybrid neural network model combining LSTM and DNN.

[0050] In Figure 2, the model is divided into three layers. The first layer uses Word2Vec to convert the words in the text into word vectors. The second layer is the LSTM layer: the word vectors generated by the first layer are fed into it, the LSTM structure computes the influence of the preceding and following words on the current word, and the individual word vectors are finally converted into a sentence vector. The third layer is the DNN layer: the sentence vector generated by the second layer serves as its input layer and, after the hidden layer, the softmax activation function is applied. The output layer output...
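
The data flow through the second and third layers can be sketched in plain Python. This is an untrained toy under stated assumptions, not the patented model: the weights are random, the LSTM runs left to right only, its final hidden state is taken as the sentence vector, the hidden layer uses ReLU, and all names and dimensions (`TinyLSTM`, `dnn_classify`, 4-dimensional word vectors, two output classes) are hypothetical. Random vectors stand in for the Word2Vec output of the first layer.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Single-layer, unidirectional LSTM with random (untrained) weights."""

    GATES = ("i", "f", "o", "c")  # input, forget, output, candidate

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.W = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng) for k in self.GATES}
        self.b = {k: [0.0] * hidden_dim for k in self.GATES}

    def encode(self, word_vectors):
        """Fold a sequence of word vectors into one sentence vector
        (the final hidden state, an assumption of this sketch)."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in word_vectors:
            z = list(x) + h  # concatenate current input and previous hidden state
            pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
                   for k in self.GATES}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dnn_classify(sentence_vec, W1, b1, W2, b2):
    """One hidden layer (ReLU, assumed) followed by a softmax output layer."""
    hidden = [max(0.0, v + b) for v, b in zip(matvec(W1, sentence_vec), b1)]
    logits = [v + b for v, b in zip(matvec(W2, hidden), b2)]
    return softmax(logits)


rng = random.Random(0)
lstm = TinyLSTM(input_dim=4, hidden_dim=3, rng=rng)
words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]  # stand-in word vectors
sentence_vec = lstm.encode(words)
probs = dnn_classify(sentence_vec,
                     rand_matrix(5, 3, rng), [0.0] * 5,
                     rand_matrix(2, 5, rng), [0.0] * 2)
label = max(range(len(probs)), key=probs.__getitem__)  # class with the largest probability
```

The last line picks the category with the largest softmax probability. Which index corresponds to a junk call is a labeling choice this sketch leaves open.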

Embodiment 3

[0052] As shown in Figure 3, this embodiment describes the role of the Word2Vec algorithm in the present invention.

[0053] To turn natural language understanding into a problem that a machine can handle, the first step is to digitize the symbols of the language, that is, to map the text into a k-dimensional vector space. The Word2Vec algorithm converts the Chinese words in the word-segmented corpus into word vectors. A word vector trained by Word2Vec has the following form:

[0054] $v_i = (a_0, a_1, \ldots, a_d)$ (1)

[0055] In formula (1), d is the dimension of the word vector.

[0056] The specific implementation process of the Word2Vec algorithm is as follows:

[0057] Step A61: Count the keywords in the phone text feature library; assume there are m keywords;

[0058] Step A62: First use one-hot encoding to convert a word into an n-dimensional vector x, taking "arrears" as an example:

[0059] "Arrears" → ...
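
The one-hot encoding of Step A62 can be illustrated with a minimal sketch. The keyword list is hypothetical and chosen only for the example; the actual contents and size of the phone text feature library are not given in the text above, and the vector length here simply equals the number of distinct keywords.

```python
def build_index(keywords):
    """Assign each distinct keyword a position, preserving first-seen order."""
    return {word: i for i, word in enumerate(dict.fromkeys(keywords))}


def one_hot(word, index):
    """Return a vector with a single 1 at the keyword's position."""
    vec = [0] * len(index)
    vec[index[word]] = 1
    return vec


# Hypothetical keyword library, for illustration only.
keywords = ["arrears", "account", "bank", "police"]
index = build_index(keywords)
x = one_hot("arrears", index)  # [1, 0, 0, 0]
```

Each vector has exactly one non-zero component, so its length grows with the keyword library. That sparsity is the usual motivation for mapping words to dense vectors such as those Word2Vec produces.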


Abstract

The invention discloses a method for identifying and disposing of junk call content. The method comprises the following steps: S1, call content is collected; S2, the call content is converted into text information; S3, word segmentation is performed on the text information using a word segmentation tool; S4, a telephone discrimination model is obtained from the segmented words according to an LSTM algorithm and a DNN algorithm; S5, a softmax classifier outputs the call category obtained from the telephone discrimination model; S6, any call that the softmax classifier outputs as a junk call is interrupted. With this method, the content of a call can be analyzed in real time and blocked at the same time, so that junk calls are intercepted in real time.

Description

Technical field

[0001] The invention belongs to the technical field of telecommunications, and in particular relates to a method for identifying and disposing of junk call content.

Background

[0002] With the continuous development of communication technology worldwide, people are increasingly dependent on mobile communication. While the rapid development of mobile communication brings convenience, it has also led some people to use mobile communication for commercial advertising, product sales, or telecommunications fraud. Effective monitoring and control methods should be used to detect and filter such calls in time.

[0003] Therefore, the purpose of the present invention is to identify junk calls automatically and interrupt them in time, protecting people's lives and property.

[0004] For example, the invention patent with publication number CN103731832A discloses a system and method for preventing telephone an...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04M3/436, G10L25/51, G10L25/30, G10L15/26, G06N3/08, G06N3/04, G06K9/62, G06F17/27
CPC: H04M3/436, G06N3/08, G10L25/30, G10L25/51, G06F40/30, G10L15/26, G06N3/045, G06F18/241
Inventor: 陈晓莉刘亭丁一帆徐菁林建洪徐佳丽
Owner: ZHEJIANG PONSHINE INFORMATION TECH CO LTD