Semi-supervised automatic aspect extraction method and system based on domain information

A technology for automatic extraction of domain information, applied in special data processing applications, instruments, electrical digital data processing, etc.

Active Publication Date: 2014-07-02
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

Third, an opinion mining system can process documents in multiple languages, whereas traditional manual methods require people to master several languages, which is difficult for most.
The main deficiency of the former is that it cannot group semantically related terms that describe the same aspect of a product, so this type of method cannot help users understand the characteristics of each product aspect quickly and intuitively in a structured way. As for the latter, most methods use unsupervised learning, which leads to the following sh...

Examples

Embodiment

[0125] As shown in the general structure diagram of figure 1 and the overall data flow diagram of figure 2, a semi-supervised automatic aspect extraction method based on domain information includes:

[0126] Network information crawling: crawling consumer comments on the products of interest from e-commerce websites, together with the semi-structured product detail descriptions that those websites publish for the products.

[0127] Information preprocessing: performing word segmentation, part-of-speech tagging, and stop-word removal on the crawled comments, and extracting the feature words they contain.

[0128] Keyword extraction: extracting the keywords of each aspect from the semi-structured product description information on the e-commerce website as the seed word set of the semi-supervised topic model. This yields the product aspect classification that the website's domain experts have defined and that conforms to human cognitive habits, which serves as prior knowledge for the semi-supervised method.

[0129] where aspect cl...
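
The preprocessing and keyword extraction steps above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the stop-word list, the `spec` structure, and the function names `preprocess` and `build_seed_words` are hypothetical, and part-of-speech tagging and language-specific word segmentation are omitted for brevity (a simple regular expression stands in for the segmenter).

```python
import re

# Hypothetical stop-word list; a real system would use a fuller, language-specific one.
STOP_WORDS = {"the", "a", "an", "is", "and", "but", "of", "very", "it", "this"}


def preprocess(comment):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", comment.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def build_seed_words(spec):
    """Map each aspect heading of a semi-structured spec to a seed word set.

    Seed words are taken from the attribute names listed under each aspect.
    """
    seeds = {}
    for aspect, attributes in spec.items():
        words = set()
        for attribute_name in attributes:
            words.update(preprocess(attribute_name))
        seeds[aspect] = words
    return seeds


# Hypothetical product detail block, grouped by aspect as a site might present it.
spec = {
    "screen": {"Screen size": "5.0 inch", "Resolution": "1920x1080"},
    "battery": {"Battery capacity": "3000 mAh"},
}

seed_words = build_seed_words(spec)
tokens = preprocess("The screen is very sharp, but the battery drains fast.")
```

Here `seed_words` holds one seed set per predefined aspect, and `tokens` is the cleaned word list of a single comment; both would feed the later topic-model stage.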

Abstract

The invention discloses a semi-supervised automatic aspect extraction method based on domain information. The method comprises the steps of network information crawling, information preprocessing, keyword extraction, comment document recombination, and fine-grained labeled LDA learning. The invention further discloses a semi-supervised automatic aspect extraction system based on domain information, comprising a network information crawling module, an information preprocessing module, a keyword extraction module, a comment document recombination module, and a fine-grained labeled LDA learning module. With this method and system, the extracted aspects of a commodity are clearer and more definite, and the differences between aspects are more distinct. The generated aspect structure (order and content) can be kept consistent with the commodity aspect structure predefined in the seed word set. As a result, the different expressions that consumers use to describe the same commodity aspect can be semantically clustered, and human intervention in commodity opinion mining can be reduced.
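
One generic way to keep topic order and content aligned with a predefined seed word set is to give each topic an asymmetric topic-word prior that favors its own seed words. The sketch below shows only that idea; it is an assumption for illustration and is not claimed to be how the fine-grained labeled LDA of the invention works. The function name `seeded_topic_word_prior` and the `base` and `boost` values are hypothetical.

```python
def seeded_topic_word_prior(aspects, seed_words, vocabulary, base=0.01, boost=1.0):
    """Return one prior row per aspect, in the given aspect order.

    Every word receives the `base` pseudo-count; the seed words of an
    aspect receive an extra `boost` in that aspect's row only.
    """
    prior = []
    for aspect in aspects:
        seeds = seed_words.get(aspect, set())
        prior.append([base + (boost if word in seeds else 0.0) for word in vocabulary])
    return prior


# Hypothetical aspects, seed words, and vocabulary.
aspects = ["screen", "battery"]
seed_words = {"screen": {"screen", "resolution"}, "battery": {"battery", "capacity"}}
vocabulary = ["screen", "resolution", "battery", "capacity", "fast"]

prior = seeded_topic_word_prior(aspects, seed_words, vocabulary)
```

Because row `k` of `prior` is built from aspect `k`, a topic model initialized with this matrix starts with topic `k` biased toward the vocabulary of aspect `k`, which is one way the output order could follow the predefined aspect order.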

Description

Technical field

[0001] The present invention relates to commodity opinion mining technology, in particular to a semi-supervised automatic aspect extraction method and system based on domain information.

Background

[0002] With the increasing popularity of e-commerce, more and more consumers choose to buy goods and services online. In recent years especially, with the vigorous development of different business models (B2B, B2C, C2C, etc.), e-commerce websites of all types and in different fields continue to emerge, competition continues to intensify, and user demands continue to rise. Manufacturers and sellers try to obtain the evaluations of the public or of consumers on their products and services in a timely manner in order to improve product quality and sales. Potential consumers, in turn, want to know the opinions of existing consumers before enjoying a service or buying a product, so that they can choose the product that really suits them. Automated opinion minin...

Application Information

IPC(8): G06Q30/02, G06F17/30
Inventor: 蔡毅, 王涛, 梁浩锋, 闵华清
Owner SOUTH CHINA UNIV OF TECH