An Implicit Feature Recognition Method Based on Word Vector Model

An implicit feature recognition technology based on word vectors, applied in the field of data mining. It addresses the inability of earlier methods to express semantic information and improves recognition accuracy.

Active Publication Date: 2020-08-04
康旭科技有限公司

AI Technical Summary

Problems solved by technology

The method proposed by Wei Wang et al. cannot express semantic information.



Examples


Embodiment 1

[0044] The implicit feature recognition method based on the word vector model of the present invention is mainly used to identify the implicit features in product review sentences. In this embodiment, mobile phone product reviews crawled from Taobao are used as an example.

[0045] As shown in Figure 1, the implicit feature recognition method based on a word vector model in this embodiment includes the following steps:

[0046] (1) Crawl review data for mobile phone products from a website (Taobao in this embodiment) to form a training corpus S, and preprocess the training corpus S.

[0047] Preprocessing of the training corpus S includes splitting reviews into sentences, Chinese word segmentation (including part-of-speech tagging), stop word filtering, and deletion of unpunctuated sentences. A preprocessed review sentence looks like this:

[0048] Very/d satisfied/v ./w first/c say/v once/m mobile phone/n itself/r ,/w and/c describe/v ar...
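The preprocessing in step (1) can be illustrated with a minimal Python sketch. It assumes the word segmenter has already produced whitespace-separated `word/tag` tokens in which the tag `w` marks punctuation, as in the sample above. The stop word list, the function names, and the sample review are hypothetical and are not taken from the patent.

```python
# Hypothetical stop word list; a real system would load a full list.
STOP_WORDS = {"the", "is", "of"}
PUNCT_TAG = "w"  # assumed tag for punctuation tokens


def parse_tagged(review):
    """Split 'word/tag' tokens into (word, tag) pairs."""
    pairs = []
    for token in review.split():
        word, sep, tag = token.rpartition("/")
        if sep:  # skip tokens that carry no tag
            pairs.append((word, tag))
    return pairs


def preprocess(review):
    """Return the clauses of a tagged review as lists of (word, tag) pairs.

    Reviews without any punctuation token are dropped (empty result),
    clauses are split at punctuation, and stop words are filtered out.
    """
    tokens = parse_tagged(review)
    if not any(tag == PUNCT_TAG for _, tag in tokens):
        return []  # unpunctuated review is deleted
    clauses, current = [], []
    for word, tag in tokens:
        if tag == PUNCT_TAG:
            if current:
                clauses.append(current)
            current = []
        elif word not in STOP_WORDS:
            current.append((word, tag))
    if current:
        clauses.append(current)
    return clauses


if __name__ == "__main__":
    sample = "very/d satisfied/v ./w the/u screen/n is/v clear/a ,/w"
    for clause in preprocess(sample):
        print(clause)
```

Splitting at every punctuation token is one simple reading of "segmentation of comment sentences"; the patent text shown here does not say which punctuation marks delimit clauses.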

Embodiment 2

[0108] To improve recognition accuracy, this embodiment proposes a method for correcting implicit feature recognition based on the specific context. The implementation process is the same as that of Embodiment 1, with one difference in step (5-3): for any implicit feature clause, if the preceding clause is an explicit feature clause, then when the mapping vector of the implicit feature clause is calculated in operation (a), the attribute word of that preceding explicit feature clause is added to the implicit feature clause as one of its words.

[0109] During recognition, the clauses of each sentence are processed one by one in order, as shown in Figure 3. Assume that the review contains the following clauses in sequence: explicit feature clause i, implicit feature clause i+1, ..., explicit feature clause n, where the explicit feature clause i corresponds to Based on th...
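The context correction of this embodiment can be sketched in Python as follows. This is only an illustration under stated assumptions: a clause is treated as explicit when it contains a word from the attribute word dictionary, and the recognizer is passed in as a hypothetical callable `recognize`. In the patent the recognition is done by the trained word vector model; the `toy_recognize` function and its `ASSOCIATIONS` table below are invented stand-ins used only to make the example runnable.

```python
def recognize_with_context(clauses, attribute_words, recognize):
    """Process clauses in order and return (clause, attribute, kind) tuples.

    If an implicit clause directly follows an explicit clause, the
    attribute word of that explicit clause is appended to the implicit
    clause before it is handed to the recognizer.
    """
    results = []
    prev_attribute = None  # set only when the previous clause was explicit
    for clause in clauses:
        explicit = [w for w in clause if w in attribute_words]
        if explicit:
            results.append((clause, explicit[0], "explicit"))
            prev_attribute = explicit[0]
        else:
            words = list(clause)
            if prev_attribute is not None:
                words.append(prev_attribute)
            results.append((clause, recognize(words), "implicit"))
            prev_attribute = None  # an implicit clause provides no context
    return results


# Hypothetical stand-in for the trained model: fixed association weights.
ASSOCIATIONS = {
    "screen": {"screen": 2, "clear": 1, "big": 1},
    "battery": {"battery": 2, "durable": 1, "big": 1},
}


def toy_recognize(words):
    """Return the attribute whose associated words best match the clause."""
    def score(attribute):
        return sum(ASSOCIATIONS[attribute].get(w, 0) for w in words)
    return max(ASSOCIATIONS, key=score)


if __name__ == "__main__":
    clauses = [["screen", "clear"], ["big"], ["battery", "durable"], ["big"]]
    for clause, attribute, kind in recognize_with_context(
        clauses, set(ASSOCIATIONS), toy_recognize
    ):
        print(clause, attribute, kind)
```

In this toy data the word "big" is ambiguous on its own. Adding the attribute word of the preceding explicit clause tips the score toward that attribute, which is the effect the embodiment aims for.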


Abstract

The present invention discloses an implicit feature recognition method based on a word vector model. The method comprises the following steps: obtaining a training corpus, preprocessing it, and constructing a corresponding emotion word dictionary and attribute word dictionary; for the preprocessed training corpus, using the word vector model to form a total dictionary and solving the word vector of each word in the total dictionary to form a word vector matrix, in which each row is the word vector of one word in the total dictionary; according to the word vector matrix, setting the parameter matrices from the input layer to the mapping layer and from the mapping layer to the output layer of the word vector model to obtain a trained word vector model; and using the trained word vector model to perform implicit feature recognition on each implicit feature clause in the corpus to be analyzed. Starting from an understanding of sentence semantics, the method uses word vectors to represent the semantic information of words and uses the word vector model to recognize the attribute words of implicit feature clauses, which improves recognition accuracy.
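The data flow summarized in the abstract can be sketched in plain Python. The sketch rests on several assumptions that the abstract does not confirm: the mapping vector of a clause is taken as the average of its word vectors (as in a CBOW-style model), the same word vector matrix is reused to score output words, and the recognized attribute is the attribute dictionary word with the highest score. All function names and the hand-set toy vectors are hypothetical; no training is performed here.

```python
def build_dictionary(clauses):
    """Total dictionary: map each distinct word to a row index."""
    index = {}
    for clause in clauses:
        for word in clause:
            index.setdefault(word, len(index))
    return index


def mapping_vector(clause, index, vectors):
    """Average the word vectors of the clause words found in the dictionary."""
    rows = [vectors[index[w]] for w in clause if w in index]
    if not rows:
        return None
    dim = len(rows[0])
    return [sum(row[k] for row in rows) / len(rows) for k in range(dim)]


def recognize_attribute(clause, index, vectors, attribute_words):
    """Return the attribute word whose vector scores highest for the clause."""
    hidden = mapping_vector(clause, index, vectors)
    candidates = [w for w in attribute_words if w in index]
    if hidden is None or not candidates:
        return None

    def score(word):
        # Dot product between the mapping vector and the candidate's vector.
        return sum(a * b for a, b in zip(hidden, vectors[index[word]]))

    return max(candidates, key=score)


if __name__ == "__main__":
    corpus = [["screen", "clear"], ["price", "expensive"]]
    index = build_dictionary(corpus)
    # Toy word vector matrix, one row per dictionary word (hand-set values).
    vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(recognize_attribute(["clear"], index, vectors, ["screen", "price"]))
    print(recognize_attribute(["expensive"], index, vectors, ["screen", "price"]))
```

Restricting the candidates to the attribute word dictionary keeps opinion words from being returned as features. Because only the ranking matters for choosing the best attribute, the sketch skips a softmax over the scores.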

Description

Technical field

[0001] The invention relates to the field of data mining, in particular to an implicit feature recognition method based on a word vector model.

Background technique

[0002] Most of the current research focuses on the identification of explicit evaluation features, but there is little research on implicit evaluation features, especially in the Chinese language environment. Implicit feature recognition was proposed by Hu and Liu in the paper "Mining and summarizing customer reviews". At present, there are mainly the following two methods: the first method uses word co-occurrence to calculate the weight of feature words-opinion word phrases to obtain a rule set, and then uses the rule set to identify implicit features. The second method seeks clues of implicit features, and identifies implicit features by establishing a mapping relationship between clues and features. There are mainly two types of implicit feature clues: the first is the traditional method, u...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/289, G06F40/247, G06F40/284, G06F16/953
CPC: G06F16/951, G06F40/247, G06F40/284, G06F40/289, G06F2216/03
Inventor: 张宇姚奥
Owner: 康旭科技有限公司