Method for identifying substance features of customer reviews

A technology for identifying entity features in user comments, applied in the computer field. It addresses the low accuracy of automatic extraction and the inability of manually defined entity features to update in real time, achieving high accuracy and a wide application range.

Status: Inactive
Publication Date: 2013-03-13
XIDIAN UNIV
3 Cites, 22 Cited by

AI Technical Summary

Problems solved by technology

Manually defining entity features requires domain experts for the entities in each field, so the approach is not portable.
Manually defined entity features also cannot be updated in real time as new product features appear.
Automatic recognition mainly uses natural language techniques such as part-of-speech tagging, syntactic analysis, and text patterns to analyze the sentences in entity comments and discover entity features from them automatically. It is versatile and portable, but its accuracy is not high. Moreover, most current automatic extraction techniques have been studied on English user comments, and methods for identifying the entity features of Chinese online comments are still lacking.


Examples

Embodiment 1

[0045] Embodiment 1: The method of the present invention for identifying entity features of user comments is tested on real online review data for a mobile phone.

[0046] Referring to Figure 1, the implementation steps of this example are as follows:

[0047] Step 1. Select 15,000 user comments on a hot-selling mobile phone, posted on Dangdang.com and Joyo.com between September 2011 and September 2012, as the training set. Use the Chinese word segmentation tool ICTCLAS to segment each of the 15,000 comments into individual words, then apply second-level part-of-speech tagging to the segmented data to form the comment corpus. The words covered by the second-level part-of-speech tagging include adjectives, verbs, proper nouns, and emotional words.
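The corpus-building part of Step 1 can be illustrated with a minimal Python sketch. It assumes the comments have already been segmented and tagged (the ICTCLAS calls themselves are not shown), and it assumes that only words in the four listed categories are retained. The tag labels and the `build_corpus` helper are hypothetical names chosen for illustration, not the ICTCLAS tagset or the patent's implementation.

```python
# Hypothetical tag labels; the actual second-level tagset is not given in the text.
KEPT_TAGS = {"adj", "verb", "proper_noun", "emotion"}


def build_corpus(tagged_comments, kept_tags=KEPT_TAGS):
    """Keep only words whose tag is in kept_tags, one word list per comment."""
    corpus = []
    for tagged in tagged_comments:
        words = [word for word, tag in tagged if tag in kept_tags]
        if words:  # drop comments in which no word was retained
            corpus.append(words)
    return corpus


# Toy input: each comment is a list of (word, tag) pairs, already segmented.
tagged_comments = [
    [("screen", "proper_noun"), ("very", "adverb"), ("clear", "adj")],
    [("battery", "proper_noun"), ("not", "adverb"), ("durable", "adj")],
    [("the", "particle")],
]
corpus = build_corpus(tagged_comments)
```

Each entry of `corpus` can then be treated as one transaction for the frequent itemset mining in Step 2.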

[0048] Step 2. Extract frequent itemsets from the above review corpus using the association-rule classification method CBA.

[0049] (2a) Use...
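Since the sub-steps of Step 2 are truncated here, the following is only a generic illustration of frequent itemset mining, not the patent's procedure. It is a minimal level-wise Apriori sketch in Python that treats each comment's word list as a transaction. The function name `frequent_itemsets`, the `min_support` parameter, and the toy data are assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in at
    least min_support transactions, using level-wise Apriori search."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count how many transactions contain each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Toy corpus: one word list per comment.
comments = [
    ["screen", "clear", "battery"],
    ["screen", "clear"],
    ["battery", "durable"],
    ["screen", "battery"],
]
itemsets = frequent_itemsets(comments, min_support=2)
```

On a real corpus the support threshold would normally be expressed as a fraction of the number of comments; an absolute count is used here only to keep the example small.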

Embodiment 2

[0086] Embodiment 2: The method of the present invention for identifying entity features of user comments is tested on real online review data for a tablet computer.

[0087] Referring to Figure 1, the implementation steps of this example are as follows:

[0088] Step 1. Select 20,000 user comments on a hot-selling tablet computer, posted on Dangdang and Joyo between September 2011 and September 2012, as the training set. Use the Chinese word segmentation tool ICTCLAS to segment each of the 20,000 comments into individual words, then apply second-level part-of-speech tagging to the segmented data to form the comment corpus. The words covered by the second-level part-of-speech tagging include adjectives, verbs, proper nouns, and emotional words.

[0089] Step 2. Extract frequent itemsets from the above review corpus using the association-rule classification method CBA.

[0090...

Abstract

The invention discloses a method for identifying entity features of customer reviews, mainly intended to address the poor accuracy of conventional automatic entity feature extraction on Chinese customer reviews. The method comprises the following steps: sampling an appropriate number of reviews as a training set and performing second-level part-of-speech tagging on it; mining frequent itemsets with the Classification Based on Association (CBA) method; correcting the entity features in the frequent itemsets according to dependency relations to form a candidate entity feature set; judging reliability and removing unreliable entity features; merging synonymous entity features through word-sense similarity computation to form an entity feature adjacency set; and screening valuable customer reviews by means of the entity feature adjacency set, so that customers avoid reading large amounts of worthless information and can look up and read reviews about specific entity features. The method is suited to identifying the entity features of Chinese online reviews and offers high accuracy, portability, a simple structure, and ease of implementation.
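The last two steps of the abstract, merging synonymous features and then looking up reviews by feature, can be sketched in Python as follows. This is a rough illustration under stated assumptions: the patent's word-sense similarity measure is not described in this text, so a hypothetical lookup table of scores (`SCORES`) stands in for it, and the single-link grouping via union-find, the `threshold` value, and the helper names are all choices made for the example.

```python
def merge_synonyms(features, similarity, threshold):
    """Group features whose pairwise similarity reaches threshold
    (single-link grouping implemented with union-find)."""
    parent = {f: f for f in features}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path halving
            f = parent[f]
        return f

    for i, a in enumerate(features):
        for b in features[i + 1:]:
            if similarity(a, b) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for f in features:
        groups.setdefault(find(f), []).append(f)
    return sorted(sorted(g) for g in groups.values())


def comments_for(group, comments):
    """Return the comments that mention at least one feature in group."""
    members = set(group)
    return [c for c in comments if members & set(c)]


# Hypothetical similarity scores standing in for a real word-sense measure.
SCORES = {
    frozenset(["screen", "display"]): 0.9,
    frozenset(["battery", "power"]): 0.8,
    frozenset(["screen", "battery"]): 0.1,
}


def table_similarity(a, b):
    return SCORES.get(frozenset([a, b]), 0.0)


features = ["screen", "display", "battery", "power", "price"]
groups = merge_synonyms(features, table_similarity, threshold=0.7)

reviews = [["screen", "clear"], ["price", "high"], ["display", "bright"]]
screen_reviews = comments_for(["display", "screen"], reviews)
```

Grouping synonyms before the lookup means a reader interested in one feature also sees reviews that use a different word for it.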

Description

Technical field

[0001] The invention belongs to the field of computer technology, relates to data mining and natural language processing, and can be used to identify the main entity features of user comments.

Background

[0002] With the rapid development of information technology, the Internet, as the fourth type of digital media, has become the largest communication medium in the world today. People use the Internet for distance learning and to follow current social events, and they also express their views on particular topics through it, such as feedback on online teaching or opinions on current events.

[0003] More and more online public opinion hotspots show that the Internet has become an important channel for Chinese netizens to participate in society. According to data recently released by the China Internet Network Information Center, 56% of Internet users often post comments on the I...


Application Information

IPC(8): G06F17/27
Inventor 黄健斌康剑梅慕鹏赵贝贝耿霄孙鹤立
Owner XIDIAN UNIV