Short text feature extraction method based on multi-feature factor fusion

A short text feature extraction method based on the fusion of multiple feature factors. It addresses the problem that traditional TF-IDF weighting does not consider a feature word's front or rear position in the text or its own part-of-speech features.
CN109977206A · Pending · Publication Date: 2019-07-05 · NORTHWEST UNIV

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
NORTHWEST UNIV
Publication Date
2019-07-05


Abstract

The invention relates to a short text feature extraction method based on multi-feature factor fusion, comprising the following steps: performing word segmentation and stop-word removal on short text comments with a conjunctive word segmentation tool to construct a preliminary text feature word vector matrix; applying the traditional TF-IDF algorithm to calculate weights for the constructed feature word vector matrix, using the IDF to obtain a weight vector matrix; introducing a feature word position influence factor and a part-of-speech feature factor, tagging the part of speech of each preliminary text feature word one by one, and calculating the sum of the two factors for each feature word; and multiplying each sum by the corresponding weight from the traditional TF-IDF algorithm to obtain the weight vector matrix of the optimized TF-IDF algorithm. The technical scheme can, to a certain extent, alleviate the word weight imbalance problem of the traditional TF-IDF algorithm, improving the accuracy of text feature extraction and supporting emotion classification tasks.
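The weighting described in the abstract can be sketched as follows. This is a minimal illustration, not the patent's implementation: the abstract gives no numeric values, so the part-of-speech factors in `POS_FACTOR`, the shape of `position_factor`, the use of a word's first occurrence as its position, and the smoothed IDF variant are all assumptions made for the example. Input documents are assumed to be already segmented, stop-word filtered, and tagged as `(word, pos_tag)` pairs.

```python
import math
from collections import Counter

# Hypothetical part-of-speech factors (a = adjective, v = verb, n = noun).
# The source does not state the actual values.
POS_FACTOR = {"a": 0.5, "v": 0.3, "n": 0.2}
DEFAULT_POS_FACTOR = 0.1


def position_factor(index, length):
    """Hypothetical position influence: 1.0 at either end, 0.5 in the middle."""
    if length <= 1:
        return 1.0
    rel = index / (length - 1)
    return 0.5 + abs(rel - 0.5)


def tf_idf(docs):
    """Plain TF-IDF per document, using a smoothed IDF (an assumed variant)."""
    n = len(docs)
    df = Counter(w for doc in docs for w in {word for word, _ in doc})
    weights = []
    for doc in docs:
        if not doc:
            weights.append({})
            continue
        tf = Counter(word for word, _ in doc)
        weights.append({
            w: (c / len(doc)) * (math.log((1 + n) / (1 + df[w])) + 1)
            for w, c in tf.items()
        })
    return weights


def fused_weights(docs):
    """Multiply each TF-IDF weight by (position factor + part-of-speech factor)."""
    base = tf_idf(docs)
    fused = []
    for doc, doc_weights in zip(docs, base):
        out = {}
        for i, (word, tag) in enumerate(doc):
            if word in out:
                continue  # assumption: use the first occurrence's position and tag
            factor_sum = position_factor(i, len(doc)) + POS_FACTOR.get(tag, DEFAULT_POS_FACTOR)
            out[word] = doc_weights[word] * factor_sum
        fused.append(out)
    return fused
```

For example, calling `fused_weights([[("screen", "n"), ("great", "a")]])` gives "great" a higher weight than "screen" under these assumed factors, because both words share the same TF-IDF value and position factor while the adjective factor is larger.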

Description

Technical field

[0001] The invention relates to the technical field of text mining, in particular to a short text feature extraction method based on the fusion of multiple feature factors.

Background art

[0002] With the advancement of the Web 3.0 era, Internet information has become increasingly integrated into people's lives. A large number of users express their opinions on events or products on the Internet, and over time these comments greatly affect people's thinking and behavior. The comments also carry people's emotional attitudes and emotional information, such as happiness, anger, sadness, and joy, or positive, neutral, and negative sentiment. Through network platforms, other users can learn about group users' comments and opinions on an event or product, so this information has great potential mining value. In addition, during the rapid development of the Interne...
