
Emotion dictionary construction method in field of automobile product based on word2vec

A sentiment dictionary construction method, applicable to semantic tool creation, unstructured text data retrieval, and special-purpose data processing. It addresses the difficulty of recognizing new sentiment words, the low efficiency of building sentiment dictionaries, and poor domain applicability, and makes it easier to determine sentiment orientation.

Pending Publication Date: 2019-11-12
TIANJIN UNIV

AI Technical Summary

Problems solved by technology

[0019] The present invention targets automobile vertical websites on the Internet. It combines text mining, data processing, and word2vec to provide a word2vec-based sentiment dictionary construction method for the automobile product domain. The method addresses the low efficiency of manually constructed sentiment dictionaries, as well as over-reliance on semantic knowledge bases such as WordNet and HowNet, narrow coverage, poor domain applicability, and difficulty recognizing new sentiment words. See the description below for details:



Examples


Embodiment 1

[0051] This embodiment of the present invention provides a word2vec-based sentiment dictionary construction method for the automobile product domain. Referring to Figure 1, the method includes the following steps:

[0052] 101: Use the distributed data collection mechanism of MRQ (a distributed task queue for Python based on Redis, Mongo, and gevent) to crawl the user comment modules of multiple automobile vertical websites, and store the results in a PostgreSQL database in a "website-model-word-of-mouth" tree structure;

[0053] 102: Extract the word-of-mouth data from the database. The word-of-mouth data comprises ten parts: "most satisfactory point", "least satisfied point", "space", "power", "handling", "fuel consumption", "comfort", "appearance", "interior", and "cost-effectiveness". Select the "most satisfactory point" and "least satisfied point" parts and preprocess the selected data: remove abnormal comments, and use punctuation marks as cutting points. Cutting and turning lo...
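The selection and clause-cutting part of step 102 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the record layout (a dict with `website`, `model`, and `word_of_mouth` keys), the set of punctuation marks, the `MIN_LENGTH` noise threshold, and the function names are all assumptions introduced here.

```python
import re

# The two parts of the word-of-mouth data that step 102 selects.
SELECTED_FIELDS = ("most satisfactory point", "least satisfied point")

# Assumed cutting points: common Chinese and Western punctuation plus whitespace.
# The ASCII period is left out so decimals such as "8.5" are not split.
PUNCTUATION = re.compile(r"[，。！？；、,!?;\s]+")

MIN_LENGTH = 2  # assumption: shorter fragments are treated as noise


def split_clauses(text):
    """Cut a comment into clauses at punctuation marks, dropping empty or very short pieces."""
    return [c for c in PUNCTUATION.split(text) if len(c) >= MIN_LENGTH]


def preprocess(records):
    """Yield (website, model, field, clause) tuples for the two selected fields."""
    for rec in records:
        for field in SELECTED_FIELDS:
            text = rec["word_of_mouth"].get(field)
            if not text:  # missing or empty comment: skip it
                continue
            for clause in split_clauses(text):
                yield rec["website"], rec["model"], field, clause
```

Keeping the website and model alongside each clause preserves the "website-model-word-of-mouth" hierarchy after the text has been cut, so each clause can still be traced back to its source record.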

Embodiment 2

[0060] The scheme of Embodiment 1 is further described below with specific calculation formulas and examples:

[0061] 201: Obtain user word-of-mouth data from multiple automobile vertical websites through data crawling and store it in the database;

[0062] Step 201 is specifically:

[0063] 1) Using Python with the MRQ distributed data collection mechanism, write a crawler for each automobile vertical website to be crawled, fetch the source code of the web pages containing the required information, and parse that source code with regular expressions to obtain the specific car model information and the comment data under each model.

[0064] The crawled data includes website links, web page source code, user information, car model information, posting time, and so on, and consists mainly of user word-of-mouth data. User word-of-mouth data is further classified into: "Most ...
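The regular-expression parsing described in step 201 might look something like the sketch below. The HTML markup, the class and attribute names, and the `parse_page` function are hypothetical placeholders; the patent does not give the page structure of any site, and real pages would need their own patterns.

```python
import re

# Hypothetical markup: these patterns are placeholders, not any real site's layout.
MODEL_RE = re.compile(r'<h1 class="model">(?P<model>.*?)</h1>', re.S)
COMMENT_RE = re.compile(
    r'<div class="comment" data-section="(?P<section>[^"]+)">(?P<text>.*?)</div>',
    re.S,
)


def parse_page(html):
    """Extract the car model and its per-section comments from page source."""
    model_match = MODEL_RE.search(html)
    model = model_match.group("model").strip() if model_match else None
    comments = {
        m.group("section"): m.group("text").strip()
        for m in COMMENT_RE.finditer(html)
    }
    return {"model": model, "word_of_mouth": comments}
```

Returning `None` for a missing model, rather than raising, lets a crawler record the page and move on when a site changes its layout.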


Abstract

The invention discloses a word2vec-based sentiment dictionary construction method for the automobile product domain. The method comprises: using a targeted Python crawler to collect user comment data from automobile vertical websites, parsing it into structured data organized as website-vehicle model-word of mouth, and storing it in a PostgreSQL database; cleaning the word-of-mouth data in the database, including missing-value handling, noise removal, and conversion between Simplified and Traditional Chinese; selecting part of the texts for sentiment labeling as a training set, with 1 denoting positive and 0 denoting negative; training a model with word2vec and feeding a large batch of text data into the trained model for similarity calculation; further expanding the initial sentiment dictionary on the basis of an existing sentiment dictionary; and outputting the result as a sentiment dictionary, which is finally supplemented manually with reverse sentiment words. The invention addresses the inaccurate analysis that results when sentiment dictionaries built manually or from knowledge bases are applied to sentiment analysis in the automobile domain.
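One way the similarity-based expansion could work is sketched below, using plain Python and toy vectors. The abstract does not say how similarity scores are turned into polarity labels, so the nearest-seed rule, the `threshold` parameter, and the function names are assumptions made for illustration. In practice the vectors would come from a trained word2vec model (for example, the `wv` attribute of a gensim `Word2Vec` model) rather than a hand-written dict.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_dictionary(seeds, vectors, threshold=0.8):
    """Give each unlabeled word the polarity of its most similar seed word.

    seeds:   {word: 1 or 0}  (1 = positive, 0 = negative, as in the abstract)
    vectors: {word: list of floats}, e.g. taken from a trained word2vec model
    Words whose best similarity is below `threshold` (an assumed cutoff) are left out.
    """
    lexicon = dict(seeds)
    for word, vec in vectors.items():
        if word in seeds:
            continue
        best_seed, best_sim = None, threshold
        for seed in seeds:
            if seed not in vectors:
                continue
            sim = cosine(vec, vectors[seed])
            if sim >= best_sim:
                best_seed, best_sim = seed, sim
        if best_seed is not None:
            lexicon[word] = seeds[best_seed]
    return lexicon
```

A cutoff like this keeps neutral domain terms (part names, for instance) out of the dictionary, at the cost of missing sentiment words that sit far from every seed; the manual supplementation step in the abstract would cover such gaps.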

Description

Technical field

[0001] The invention relates to the recognition field, in particular to a word2vec-based sentiment dictionary construction method for the automobile product domain.

Background

[0002] On February 28, 2019, the China Internet Network Information Center (CNNIC) released the 43rd "Statistical Report on Internet Development in China" in Beijing [1]. As of December 2018, the number of Internet users in China had reached 829 million, with 56.53 million new users added during the year. The Internet penetration rate was 59.6%, an increase of 3.8 percentage points over the end of 2017. As the number of Internet users has grown rapidly, data has also been produced and accumulated at an explosive rate. Kevin Kelly, often called a "technology prophet", has said that by 2050 the total amount of global data will reach the astronomical level of 1 million ZB.

[0003] With the popularity of the Internet, users can express their emotions and attitudes through the ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/36, G06F16/33, G06F17/27
CPC: G06F16/374, G06F16/3344
Inventors: 汪金亮, 郭伟, 邱泽成
Owner: TIANJIN UNIV