Method for extracting text sentence features
A text-sentence technology, applied in the field of text sentence feature extraction, that can solve the problem of low deduplication accuracy.
Examples
Embodiment 1
[0057] As shown in Figure 1, a method for extracting text sentence features includes the following steps:
[0058] S110. Apply the Self-Attention algorithm to the word vectors to obtain an attention sequence;
[0059] S120. Perform a convolution operation on the attention sequence to obtain a feature matrix;
[0060] S130. Output text sentence features using a maximum pooling algorithm on the feature matrix.
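Step S130 can be illustrated with a minimal max-pooling sketch. It assumes, which the text does not state, that the feature matrix holds one feature map per convolution kernel and that pooling keeps the largest value of each map. The function name `max_pool` and the sample values are hypothetical.

```python
from typing import List


def max_pool(feature_matrix: List[List[float]]) -> List[float]:
    """Keep the largest activation of each feature map.

    Assumed layout: one row per convolution kernel, each row being that
    kernel's feature map. Rows may differ in length when kernels differ
    in length.
    """
    if not feature_matrix or any(len(row) == 0 for row in feature_matrix):
        raise ValueError("every feature map must contain at least one value")
    return [max(row) for row in feature_matrix]


# Two made-up feature maps, e.g. from kernels of different lengths.
feature_matrix = [
    [0.1, 0.7, 0.3, 0.2],
    [0.5, 0.4, 0.9],
]
sentence_features = max_pool(feature_matrix)
print(sentence_features)  # [0.7, 0.9]
```

Under this assumption the output has one entry per kernel, so sentences of different lengths map to feature vectors of the same size.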
[0061] In Embodiment 1, a balancing operation is first performed on the unbalanced samples. The balancing technique in this method combines undersampling and oversampling to obtain the word vectors of the balanced samples. The Self-Attention algorithm is then applied to these word vectors, a convolution operation is performed on the resulting attention sequence to obtain the feature matrix, and finally the max-pooling algorithm is applied to the feature matrix to output text sentence featu...
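The text does not say how undersampling and oversampling are combined, so the following is only one plausible sketch. It assumes random undersampling of larger classes and random oversampling (with replacement) of smaller classes toward a common target size, here the mean class size. The function name `balance_samples`, the target choice, and the labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (sentence, label)


def balance_samples(samples: List[Sample], seed: int = 0) -> List[Sample]:
    """Resample every class to the same size.

    Assumption (not from the source): the target size is the mean class
    size, rounded down. Larger classes are undersampled without
    replacement; smaller classes are oversampled with replacement.
    """
    if not samples:
        return []
    rng = random.Random(seed)
    by_label: Dict[str, List[str]] = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    target = max(1, len(samples) // len(by_label))
    balanced: List[Sample] = []
    for label, texts in by_label.items():
        if len(texts) > target:
            chosen = rng.sample(texts, target)  # undersample
        else:
            extra = [rng.choice(texts) for _ in range(target - len(texts))]
            chosen = texts + extra  # oversample
        balanced.extend((text, label) for text in chosen)
    rng.shuffle(balanced)
    return balanced


# Made-up unbalanced data: 6 "duplicate" sentences, 2 "unique" sentences.
samples = [(f"dup {i}", "duplicate") for i in range(6)]
samples += [("unique 0", "unique"), ("unique 1", "unique")]
balanced = balance_samples(samples)
print(Counter(label for _, label in balanced))
```

In this sketch the balanced sentences would then be converted to word vectors before the Self-Attention step.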
Embodiment 2
[0063] As shown in Figure 2, a method for extracting text sentence features includes:
[0064] S210. Apply the Self-Attention algorithm to the word vectors to obtain an attention sequence;
[0065] S220. Obtain the weights of the word vectors by using a dot-product algorithm;
[0066] S230. Perform normalization processing on the weights;
[0067] S240. Output the attention sequence as the weighted sum of the values (Value), using the normalized weights obtained from the keys (Key), and the calculation formula is:
[0068] where $Q \in \mathbb{R}^{n \times d_k}$, $K \in \mathbb{R}^{m \times d_k}$, $V \in \mathbb{R}^{m \times d_v}$.
[0069] In Embodiment 2, the similarity between the Query and each Key is computed with the dot-product similarity function to obtain the weights of the word vectors. The weights are then normalized with the Softmax function. Finally, the weights and the corresponding Values are combined by weighted summation to obtain the attention sequence. The calculation formula is where Q∈R n×dk ,K∈R m×dk , V ∈ R ...
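The three steps of Embodiment 2 (dot-product weights, Softmax normalization, weighted sum of the Values) can be sketched in plain Python. The formula itself is not reproduced in this text, so the sketch assumes the unscaled form $\mathrm{softmax}(QK^{\top})V$ that the steps describe. The widely used scaled variant divides the scores by $\sqrt{d_k}$ first, and it is not known which one the original formula uses. Setting Q = K = V to the word vectors, with no learned projections, is also an assumption, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Normalize scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(Q: Matrix, K: Matrix, V: Matrix) -> Matrix:
    """Return softmax(Q K^T) V for Q (n x d_k), K (m x d_k), V (m x d_v)."""
    if len(K) != len(V):
        raise ValueError("K and V must have the same number of rows (m)")
    d_v = len(V[0])
    output: Matrix = []
    for q in Q:
        # Dot-product similarity between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in K]
        # Softmax normalization of the weights.
        weights = softmax(scores)
        # Weighted sum of the value rows.
        output.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(d_v)]
        )
    return output


# Made-up word vectors for a three-word sentence (n = m = 3, d_k = d_v = 2).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention_sequence = dot_product_attention(X, X, X)  # self-attention
for row in attention_sequence:
    print([round(value, 3) for value in row])
```

The output has one row per query, so the attention sequence keeps the sentence length n while each row mixes information from all m positions.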
Embodiment 3
[0071] As shown in Figure 3, a method for extracting text sentence features includes:
[0072] S310. Apply the Self-Attention algorithm to the word vectors to obtain an attention sequence;
[0073] S320. Perform a convolution operation on the attention sequence to obtain a feature matrix;
[0074] S330. Process the attention sequence into an attention component according to the length of the convolution kernel;
[0075] S340. Perform a convolution operation on the attention components to output a feature matrix, and the calculation formula is: $C = (c_1, c_2, \ldots, c_{n-h+1})$, where $c_i$ is the feature extracted from the attention component $X_{i:i+h-1}$ by the convolution operation.
[0076] In Embodiment 3, the attention sequence is split into attention components according to the length of the convolution kernel. For example, a convolution kernel of length h divides the attention sequence into $\{X_{1:h}, X_{2:h+1}, \ldots, X_{i:i+h-1}, \ldots, X_{n-h+1:n}\}$-style attention co...
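The windowing and convolution of Embodiment 3 can be sketched as follows. The sketch assumes a single kernel given as an h x d weight matrix and omits any bias or nonlinearity, since the text does not specify them. The function names and sample values are hypothetical. Python slices are 0-based, so the 1-indexed component $X_{i:i+h-1}$ corresponds to `X[i-1:i-1+h]`.

```python
from typing import List

Matrix = List[List[float]]


def attention_components(X: Matrix, h: int) -> List[Matrix]:
    """Split an n-row attention sequence into n-h+1 windows of h rows."""
    n = len(X)
    if not 1 <= h <= n:
        raise ValueError("kernel length h must satisfy 1 <= h <= n")
    return [X[start:start + h] for start in range(n - h + 1)]


def convolve(X: Matrix, kernel: Matrix) -> List[float]:
    """Return C = (c_1, ..., c_{n-h+1}) for one kernel of shape h x d.

    Each c_i is the sum of element-wise products between the kernel and
    one attention component (no bias or activation in this sketch).
    """
    h = len(kernel)
    features: List[float] = []
    for window in attention_components(X, h):
        c_i = sum(
            x * w
            for row, kernel_row in zip(window, kernel)
            for x, w in zip(row, kernel_row)
        )
        features.append(c_i)
    return features


# Made-up attention sequence with n = 4 positions and d = 2 dimensions.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # h = 2
C = convolve(X, kernel)
print(C)  # [5.0, 9.0, 13.0]
```

With n = 4 and h = 2 the result has n - h + 1 = 3 entries, matching the length of C in step S340.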