Method and device for text classification

A text classification technology applied in the network field. It addresses errors in judging the emotional tendency of words that express different tendencies in different contexts, errors that arise when the related-concept relations of indicator words and the syntactic relations between words are missing, and it improves the accuracy of that judgment.

Inactive Publication Date: 2010-01-27
HUAWEI TECH CO LTD
0 Cites, 74 Cited by

AI Technical Summary

Problems solved by technology

[0012] The existing approach does not capture the related-concept relationships of indicator words within the specific topic area of the text. It also lacks syntactic information about the words and the relationships between words of different parts of speech. As a result, it makes errors when judging words that express different emotional tendencies in different contexts.



Examples


Embodiment Construction

[0031] An embodiment of the present invention provides a text classification method which, as shown in Figure 1, includes the following steps:

[0032] Step s101: obtain emotional feature words from the input text.

[0033] Step s102: obtain the emotional tendency of the emotional feature words according to a pre-constructed thesaurus.

[0034] Step s103: classify the text according to the emotional tendency of the emotional feature words.
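
Steps s101 to s103 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the thesaurus contents, the whitespace tokenizer (the embodiment uses a Chinese tokenizer), the averaging rule, the zero threshold, and the "neutral" fallback are all hypothetical choices made for the example.

```python
from typing import Dict, List

# Hypothetical pre-constructed thesaurus: word -> emotional tendency in [-1, 1].
THESAURUS: Dict[str, float] = {
    "excellent": 0.9,
    "good": 0.6,
    "poor": -0.6,
    "terrible": -0.9,
}


def get_emotional_feature_words(text: str, thesaurus: Dict[str, float]) -> List[str]:
    """Step s101: keep the tokens that appear in the thesaurus.

    Whitespace splitting is a stand-in for a real tokenizer (assumption).
    """
    tokens = [t.strip(".,!?;:") for t in text.lower().split()]
    return [t for t in tokens if t in thesaurus]


def get_emotional_tendencies(words: List[str], thesaurus: Dict[str, float]) -> List[float]:
    """Step s102: look up the emotional tendency of each feature word."""
    return [thesaurus[w] for w in words]


def classify_text(text: str,
                  thesaurus: Dict[str, float] = THESAURUS,
                  threshold: float = 0.0) -> str:
    """Step s103: classify the text from the tendencies of its feature words.

    Averaging and the 0.0 threshold are assumptions made for this sketch.
    """
    words = get_emotional_feature_words(text, thesaurus)
    if not words:
        return "neutral"  # assumed fallback when no feature word is found
    tendencies = get_emotional_tendencies(words, thesaurus)
    score = sum(tendencies) / len(tendencies)
    return "positive" if score > threshold else "negative"
```

Calling `classify_text` on a sentence returns one of the three labels; swapping in a real tokenizer or a different aggregation rule only requires changing the corresponding helper.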

[0035] Each step of the text classification method of Figure 1 is described in further detail below with reference to specific embodiments.

[0036] Figure 2 is a flow chart, in an embodiment of the present invention, of obtaining the emotional feature words from the input text and obtaining the emotional tendency of those words. It includes:

[0037] Step s201: given an arbitrary text d, first use a Chinese tokenizer to analyze and process the document ...


Abstract

An embodiment of the invention discloses a method and a device for text classification. The method comprises: acquiring emotional feature words from an input text; acquiring the emotional tendency of each emotional feature word according to a pre-constructed thesaurus; and classifying the text according to the emotional tendency of the emotional feature words. Because the emotional tendency of the feature words in the text is obtained from a pre-constructed thesaurus, the embodiment improves the accuracy of judging the emotional tendency of words.

Description

Technical field

[0001] The invention relates to the field of network technology, and in particular to a text classification method and device.

Background

[0002] With the rapid development of communication technology and the popularization of the Internet, effective processing and filtering of Internet information has become an important research topic.

[0003] The study of semantic orientation arose against this background. The semantic orientation of a word is a measurement value calculated for the degree to which the word is commendatory or derogatory. For convenience of statistics and comparison, the common practice is to define the measurement value as a real number in the interval [-1, 1]. If the measurement value is higher than a certain threshold, the word is judged to have a positive tendency; otherwise, it is judged to have a derogatory tendency. In addition, the semantic orientation of a text can be obtained by averaging the semantic orientation values of w...
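
The thresholding and averaging described in the background can be sketched as follows. The function names and the default threshold of 0.0 are assumptions for illustration; the source only says "a certain threshold" and does not fix its value.

```python
from statistics import mean
from typing import Iterable


def word_polarity(value: float, threshold: float = 0.0) -> str:
    """Judge a word's tendency from its semantic orientation value.

    The value must lie in [-1, 1]; the threshold default is an assumption.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("semantic orientation must lie in [-1, 1]")
    return "positive" if value > threshold else "derogatory"


def text_orientation(values: Iterable[float]) -> float:
    """Semantic orientation of a text as the average of its values."""
    return mean(values)
```

A value exactly equal to the threshold falls on the derogatory side here, which follows the "higher than ... otherwise" wording of the passage.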


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/27, G06F17/30
Inventor: 佘莉, 张翼
Owner: HUAWEI TECH CO LTD