Method for identifying and handling spam call content
A technology for spam calls and call content, applied in the field of telecommunications. It addresses two problems: the database stores only a narrow range of information, and harassing calls cannot be comprehensively defended against. It achieves high accuracy.
Examples
Embodiment 1
[0033] Step A1: Collect the call content with a recording device; a speech recognition device then recognizes the recorded data and converts it into text data;
[0034] Step A2: Use regular expressions to remove the non-text parts of the text data;
[0035] Step A3: Divide the samples into training samples and test samples at a ratio of 3:1;
[0036] Step A4: Text word segmentation: use the Jieba ("stammer") word segmentation tool to segment the text;
[0037] Step A5: Preset a list of invalid words, and remove from the text any words that match the list;
[0038] Step A6: Use Word2Vec to convert the segmented words into word vectors;
[0039] Step A7: Convert the word vectors into a sentence vector using the LSTM algorithm;
[0040] Step A8: Use the sentence vector as the input vector of the DNN classification model;
[0041] Step A9: Select the result with the largest...
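The text preparation in steps A2 to A5 can be sketched as follows. This is a minimal illustration, not the patent's implementation: step A2 is read here as regular-expression cleaning, a whitespace split stands in for the Chinese word segmentation tool of step A4, and the invalid-word list and all function names are hypothetical.

```python
import random
import re

# Hypothetical invalid-word list for step A5; the patent does not list its words.
INVALID_WORDS = {"the", "a", "is", "of", "please"}


def clean_text(text):
    """Step A2 (as read here): drop characters that are not word characters or whitespace."""
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Stand-in for step A4: whitespace splitting instead of a Chinese segmenter."""
    return text.lower().split()


def remove_invalid_words(tokens):
    """Step A5: remove tokens that match the preset invalid-word list."""
    return [t for t in tokens if t not in INVALID_WORDS]


def split_samples(samples, ratio=3, seed=0):
    """Step A3: shuffle a copy, then split train:test at ratio:1."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * ratio // (ratio + 1)
    return shuffled[:cut], shuffled[cut:]


def preprocess(text):
    """Apply cleaning, tokenization, and invalid-word removal in sequence."""
    return remove_invalid_words(tokenize(clean_text(text)))
```

With eight samples, `split_samples` returns six for training and two for testing. A real pipeline for Chinese call transcripts would replace `tokenize` with an actual segmenter.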
Embodiment 2
[0049] As shown in Figure 2, this embodiment takes the sentence "I am from the Public Security Bureau, your account is suspected of money laundering, please cooperate with the investigation" as an example. The design of the present invention is based on a hybrid neural network model combining LSTM and DNN.
[0050] In Figure 2, the model is divided into three layers. The first layer uses Word2Vec to convert the words in the text into word vectors. The second layer is the LSTM layer: it takes the word vectors generated by the first layer as input, uses the LSTM structure to calculate the influence of the preceding and following words on the current word, and finally converts the individual word vectors into a sentence vector. The third layer is the DNN layer: the sentence vector generated by the second layer serves as its input layer, and after the hidden layer the softmax activation function is applied. The output layer output...
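The data flow through the LSTM and DNN layers can be illustrated with a small forward pass. This is an untrained sketch under several assumptions that the excerpt does not confirm: the weights are random, biases are omitted, the hidden layer uses ReLU, all sizes are arbitrary, and the LSTM runs in one direction only, so each step sees only the preceding words. The class name and parameters are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLSTMClassifier:
    """Toy model: LSTM layer -> one hidden layer -> softmax output."""

    def __init__(self, embed_dim, hidden_dim, dense_dim, n_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each applied to the concatenation [h_prev ; x_t].
        self.gates = {g: rand_matrix(hidden_dim, hidden_dim + embed_dim, rng)
                      for g in "ifog"}
        self.W_hidden = rand_matrix(dense_dim, hidden_dim, rng)
        self.W_out = rand_matrix(n_classes, dense_dim, rng)

    def sentence_vector(self, word_vectors):
        """Run the LSTM over the word vectors; the last hidden state is the sentence vector."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in word_vectors:
            hx = h + x  # list concatenation
            i = [sigmoid(v) for v in matvec(self.gates["i"], hx)]
            f = [sigmoid(v) for v in matvec(self.gates["f"], hx)]
            o = [sigmoid(v) for v in matvec(self.gates["o"], hx)]
            g = [math.tanh(v) for v in matvec(self.gates["g"], hx)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h

    def predict_proba(self, word_vectors):
        """Feed the sentence vector through the hidden layer and softmax output."""
        s = self.sentence_vector(word_vectors)
        hidden = [max(0.0, v) for v in matvec(self.W_hidden, s)]  # ReLU (assumed)
        return softmax(matvec(self.W_out, hidden))


# Five random 4-dimensional vectors stand in for the Word2Vec output of a sentence.
rng = random.Random(1)
words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
model = TinyLSTMClassifier(embed_dim=4, hidden_dim=3, dense_dim=6, n_classes=2)
probs = model.predict_proba(words)
```

Here `probs` holds one probability per class and sums to 1. Because the weights are untrained, the values carry no meaning beyond showing the shapes at each layer.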
Embodiment 3
[0052] As shown in Figure 3, this embodiment describes the role of the Word2Vec algorithm in the present invention.
[0053] To turn natural language understanding into a problem that a machine can handle, the first step is to digitize the symbols, that is, to map the text into a k-dimensional vector space. The Word2Vec algorithm converts the Chinese words in the word-segmented corpus into word vectors. A word vector trained by Word2Vec has the following form:
[0054] $v_i = (a_0, a_1, \ldots, a_d)$ (1)
[0055] In formula (1), d is the dimension of the word vector.
[0056] The specific implementation process of the Word2Vec algorithm is as follows:
[0057] Step A61: Count the keywords in the phone text feature library; assume there are m keywords;
[0058] Step A62: First use one-hot encoding to convert a word into an n-dimensional vector x, taking "arrears" as an example:
[0059] "Arrears" → ...
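One-hot encoding as described in step A62 can be sketched as below. The four-word keyword list is hypothetical and only fixes an arbitrary position for "arrears"; the patent's actual feature library and the resulting vector are not shown in this excerpt.

```python
def one_hot(word, vocabulary):
    """Return a vector of len(vocabulary) with a single 1 at the word's index."""
    index = vocabulary.index(word)  # raises ValueError for unknown words
    vec = [0] * len(vocabulary)
    vec[index] = 1
    return vec


# Hypothetical keyword list standing in for the phone text feature library.
keywords = ["account", "arrears", "investigation", "transfer"]
vector = one_hot("arrears", keywords)
```

With this list, `vector` has length four with its single 1 in the second position. The vector length always equals the number of keywords, which is why one-hot vectors grow large and sparse for real vocabularies.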