
Method and device for identifying data

A technology for identifying data, applied in the field of data identification. It addresses the problems of low data recognition accuracy and the inability to determine the degree of emotion, and achieves the effect of improving accuracy.

Pending Publication Date: 2020-12-08
BEIJING JINGDONG ZHENSHI INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

The words in a word pair may not belong to the same topic, or they may have opposite emotional polarities. Moreover, word pairs allow only the emotional polarity to be identified, not the degree of emotion.
This noise in the word pairs therefore causes the prior art to suffer from low data recognition accuracy.



Examples


Example 2

[0152] In a specific implementation of this step, as shown in Figure 4, the positive word set includes positive evaluation words (Chinese), positive evaluation words (English), positive emotional words (Chinese) and positive emotional words (English); the negative word set includes negative evaluation words (Chinese), negative evaluation words (English), negative emotional words (Chinese) and negative emotional words (English). The positive and negative word sets may also include degree-level words. Both sets can be obtained by downloading HowNet (a sentiment dictionary) from the Internet and decompressing it. The following two specific examples illustrate this step. Example 1: there are 10 words in the target data to be recognized, of which 8 appear in the positive word set and 2 appear in the negative word set; 8 - 2 = 6, and the sentiment value of the target data to be recognized...
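The arithmetic in Example 1 can be sketched in Python as follows. This is a minimal illustration, assuming the tendency value is simply the number of words found in the positive set minus the number found in the negative set; the source text is truncated, so the patent's full formula (for instance, any weighting by degree-level words) may differ. The function name `sentiment_tendency` and the tiny English word sets are hypothetical stand-ins for the HowNet-derived sets.

```python
def sentiment_tendency(words, positive_words, negative_words):
    """Count of words in the positive set minus count in the negative set.

    Assumption: a plain difference of counts, as the 8 - 2 = 6 arithmetic
    in Example 1 suggests. Degree-level words are not modeled here.
    """
    positive_hits = sum(1 for w in words if w in positive_words)
    negative_hits = sum(1 for w in words if w in negative_words)
    return positive_hits - negative_hits


# Hypothetical word sets standing in for the HowNet-derived sets.
positive_words = {"good", "great", "love", "excellent"}
negative_words = {"bad", "poor"}

# Ten words: eight in the positive set, two in the negative set.
target_words = [
    "good", "great", "love", "excellent", "good",
    "great", "love", "good", "bad", "poor",
]

score = sentiment_tendency(target_words, positive_words, negative_words)
print(score)  # 6
```

Under this assumption, a positive result indicates a positive tendency and a larger magnitude indicates a stronger one, which is what lets the method express a degree of emotion rather than only a polarity.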


Abstract

The invention discloses a method and device for identifying data, and relates to the technical field of computers. One specific embodiment of the method comprises the steps of: performing word segmentation on multiple pieces of to-be-identified data to obtain to-be-identified words, and generating a word vector set and a word frequency set of the to-be-identified words; matching a word vector and a word frequency of the target to-be-identified data from the word vector set and the word frequency set respectively, wherein the target to-be-identified data is any one of the plurality of pieces of to-be-identified data; inputting the word vector of the target to-be-identified data into a pre-trained recognition model to obtain a topic and emotion of the target to-be-identified data; and obtaining an emotional tendency value of the target to-be-identified data according to the word frequency of the target to-be-identified data, the positive word set and the negative word set. According to the embodiment, the accuracy of data identification is improved.
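The segmentation and word frequency steps of the abstract can be sketched roughly as below. This is only an illustration under stated assumptions: whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated one), the word frequency set is assumed to be a count over all to-be-identified data, and the helper names `segment`, `build_word_frequency_set`, and `match_word_frequencies` are hypothetical. The word vector set and the pre-trained recognition model are left out because the excerpt does not specify them.

```python
from collections import Counter


def segment(text):
    # Assumption: whitespace tokenization in place of a real word segmenter.
    return text.lower().split()


def build_word_frequency_set(documents):
    """Count every segmented word across all to-be-identified data."""
    counts = Counter()
    for doc in documents:
        counts.update(segment(doc))
    return counts


def match_word_frequencies(target, frequency_set):
    """Look up the frequency of each distinct word of the target data."""
    return {word: frequency_set[word] for word in set(segment(target))}


# Hypothetical to-be-identified data.
documents = [
    "fast delivery good price",
    "good quality",
    "slow delivery",
]

frequency_set = build_word_frequency_set(documents)
target_frequencies = match_word_frequencies(documents[0], frequency_set)
print(sorted(target_frequencies.items()))
```

Building the frequency set once over all the data and then looking up each target's words mirrors the abstract's "matching" step, so per-target work is a dictionary lookup rather than a fresh count.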

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a method and device for identifying data.

Background

[0002] Existing techniques for identifying data include unsupervised Bayesian models, short text topic models, and short text sentiment topic models. Unsupervised Bayesian models and short text topic models cannot recognize the sentiment of data, whereas short text sentiment topic models can.

[0003] In the course of realizing the present invention, the inventor found at least the following problems in the prior art:

[0004] The short text sentiment topic model obtains word pairs from training data and then builds a model from those word pairs, which is used for recognition. The words in a word pair may not belong to the same topic, or they may be words with opposite emotional polarities. Moreover, only the emotional polarity can be identified accord...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35
CPC: G06F16/3335, G06F16/35, Y02D10/00
Inventor: 程翔
Owner: BEIJING JINGDONG ZHENSHI INFORMATION TECH CO LTD