Order-preserving submatrix (OPSM) and frequent sequence mining based emotion classification method for e-commerce comments

A sentiment classification technology based on frequent sequence mining, applied in semantic analysis, electronic digital data processing, marketing, etc. It addresses feature-vector sparseness and weight differences that reduce the accuracy of sentiment analysis, with the effect of reducing scale and reducing time and space complexity.

Active Publication Date: 2017-11-17
山东云从软件科技有限公司

AI Technical Summary

Problems solved by technology

First, many different words in online comments express similar semantics, so the number of feature words in the corpus is very large and the feature vectors calculated by TF-IDF are very sparse, which affects the accuracy of sentiment analysis. Second, the feature weights calculated by TF-IDF are affected by sentence length; because online comments vary in length, sentences that express similar emotions but differ in length receive weights of different magnitudes in their feature vectors. Finally, the TF-IDF algorithm is similar in spirit to the bag-of-words model (Bag-of-Words): it does not consider word order within a sentence, although word order has an important impact on the semantics and emotional expression of review text.
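The second and third problems can be illustrated with a minimal sketch. It assumes a textbook TF-IDF (relative term frequency times log(N / df)), which may differ from the exact weighting used in the patent; the toy comments and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: weight} dict per tokenized document.

    Assumed textbook form: tf = count / document length, idf = log(N / df).
    """
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {w: (c / len(doc)) * math.log(n / df[w]) for w, c in counts.items()}
        )
    return vectors

# Hypothetical toy comments, already segmented into words.
docs = [
    ["not", "good", "but", "cheap"],
    ["cheap", "but", "not", "good"],  # same words as above, different order
    ["good"],                         # a very short comment
    ["arrived", "late"],
]
vectors = tf_idf(docs)

# Word order is lost: the first two comments get identical vectors.
same_vector = vectors[0] == vectors[1]

# Length changes the magnitude: "good" weighs 4x more in the one-word comment.
ratio = vectors[2]["good"] / vectors[0]["good"]
```

Here `same_vector` is true even though the two comments order their words differently, and the weight of "good" depends on how long the comment is rather than on the sentiment it carries.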



Embodiment Construction

[0060] The embodiments of the present invention are further described below with reference to the accompanying drawings, but the implementation of the present invention is not limited to them.

[0061] This example first preprocesses the e-commerce comment data, removing blank lines and duplicate lines, and divides it into a training set, a validation set, and a test set. Word segmentation is then performed on each set to obtain comment text represented as word sequences. Next, using a sentiment dictionary and the semantic similarity computed from word vectors, a TF-IDF vector representation over synonyms is calculated, which overcomes the sparsity problem of traditional TF-IDF. Order-preserving submatrix patterns (OPSM features) are then mined from the feature vectors corresponding to different comments, and the corresponding 0/1 vector is obtained, so ...
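As a rough illustration of two ideas in this paragraph, the sketch below merges synonyms before weighting and checks which rows of a weight matrix follow a given column order, which is the defining property of an order-preserving submatrix. The synonym table, the matrix values, and all function names are hypothetical; the patent derives synonym groups from a sentiment dictionary and word vectors and mines the column orders with a biclustering algorithm, neither of which is reproduced here.

```python
# Hypothetical synonym groups (word -> group representative).
SYNONYMS = {"great": "good", "nice": "good", "awful": "bad", "terrible": "bad"}

def merge_synonyms(tokens):
    """Replace each token by its group representative, shrinking the vocabulary."""
    return [SYNONYMS.get(t, t) for t in tokens]

def follows_order(row, cols):
    """True if the row's values strictly increase along the given column order."""
    values = [row[c] for c in cols]
    return all(a < b for a, b in zip(values, values[1:]))

def opsm_support(matrix, cols):
    """Indices of the rows that follow the column order (the rows of the OPSM)."""
    return [i for i, row in enumerate(matrix) if follows_order(row, cols)]

# Hypothetical weight matrix: one row per comment, one column per feature word.
matrix = [
    [0.1, 0.5, 0.3],
    [0.2, 0.9, 0.4],
    [0.7, 0.1, 0.2],
]
order = (0, 2, 1)  # candidate pattern: column 0 < column 2 < column 1

rows = opsm_support(matrix, order)
# 0/1 feature per comment: does the comment take part in this OPSM pattern?
feature = [1 if i in rows else 0 for i in range(len(matrix))]
merged = merge_synonyms(["great", "price", "awful"])
```

Rows 0 and 1 have different magnitudes but the same relative ordering of weights, so both support the pattern; this is why an order-based feature is less sensitive to comment length than the raw weights.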


Abstract

The invention discloses an emotion classification method for e-commerce comments based on order-preserving submatrix (OPSM) and frequent sequence mining. The method comprises the following steps: (1) performing preprocessing and Chinese word segmentation on the e-commerce comments; (2) calculating a TF-IDF weight vector over synonyms, and then mining local patterns in the weight vectors through an OPSM-based biclustering algorithm; (3) mining frequent phrase features for classification through an improved PrefixSpan algorithm, while improving the ability of the frequent phrases to distinguish emotional tendency through constraints such as word intervals; and (4) converting the features mined in steps (2) and (3) into a 0/1 vector used as the input of a classifier, thereby obtaining the emotion classification result of the e-commerce comments. The method can accurately mine the emotion classification features of e-commerce comments, so that potential customers can learn how goods have been evaluated before buying, and merchants can fully understand customer suggestions and improve service quality accordingly.
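Steps (3) and (4) can be sketched as follows. This is not the improved PrefixSpan algorithm of the invention: it is a brute-force stand-in that only enumerates ordered word pairs, assumes a simple "at most `max_gap` skipped words" interval constraint, and uses hypothetical toy comments and function names.

```python
from collections import Counter

def occurs(tokens, pattern, max_gap):
    """True if pattern appears in order in tokens, with at most max_gap
    words skipped between consecutive pattern words."""
    def match(start, k):
        if k == len(pattern):
            return True
        # The first word may start anywhere; later words must fall in the gap window.
        stop = len(tokens) if k == 0 else min(len(tokens), start + max_gap + 1)
        return any(
            tokens[i] == pattern[k] and match(i + 1, k + 1)
            for i in range(start, stop)
        )
    return match(0, 0)

def frequent_pairs(comments, min_support, max_gap):
    """Ordered word pairs within the gap window that occur in >= min_support comments."""
    counts = Counter()
    for tokens in comments:
        seen = set()
        for i, first in enumerate(tokens):
            for second in tokens[i + 1 : i + 2 + max_gap]:
                seen.add((first, second))
        counts.update(seen)  # each comment counts once per pair
    return sorted(p for p, c in counts.items() if c >= min_support)

# Hypothetical segmented comments.
comments = [
    ["quality", "very", "good"],
    ["quality", "good"],
    ["quality", "is", "not", "at", "all", "good"],
]
patterns = frequent_pairs(comments, min_support=2, max_gap=1)

# Step (4): one 0/1 entry per mined pattern, usable as classifier input.
features = [[int(occurs(t, p, 1)) for p in patterns] for t in comments]
```

With the gap limited to one word, the pair ("quality", "good") matches the two positive comments but not the third, where the words are separated by a negation; this is the kind of discrimination the word-interval constraint is meant to provide.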

Description

Technical field

[0001] The invention belongs to the field of natural language processing and affective computing, and in particular relates to an e-commerce comment emotion classification method based on order-preserving submatrix and frequent sequence mining.

Background

[0002] With the development of e-commerce, evaluating products purchased online on e-commerce platforms has become part of users' daily lives. How to use machine learning and natural language processing to analyze comment text and obtain opinion tendency and emotional polarity has become an important research issue in the field of artificial intelligence. The techniques commonly used in text sentiment analysis are divided into rule-based methods and statistics-based methods. Rule-based methods mainly start from the perspective of linguistics, using manually built dictionaries and templates for sentiment analysis (Xu et al., 2008). Statistics-based methods start from the pers...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/27, G06Q30/02
CPC: G06F16/3344, G06F40/284, G06F40/30, G06Q30/0201
Inventor: 黄佳锋, 马志豪, 陈鑫, 卢昕, 薛云, 胡晓晖
Owner: 山东云从软件科技有限公司