System and method for analyzing tendency of short text

A tendency analysis technology for short text, applied in the field of information processing and analysis, that addresses the poor results statistical methods give on short texts.

Active Publication Date: 2012-07-04
ZHONGKE DINGFU BEIJING TECH DEV

AI Technical Summary

Problems solved by technology

A classic algorithm is PMI-IR (Pointwise Mutual Information and Information Retrieval). This kind of statistical processing is effective for long texts such as news. In short texts, however, tendency words appear only rarely because the text is so short, so statistical methods often fail to yield good results.
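To make the limitation concrete, here is a minimal sketch of the semantic orientation score commonly associated with PMI-IR: the PMI of a phrase with a positive reference word minus its PMI with a negative reference word, estimated from hit counts. The function name `so_pmi`, its parameters, and the smoothing constant are assumptions made for illustration; none of them come from the patent text.

```python
import math


def so_pmi(hits_near_pos, hits_near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Semantic orientation = PMI(phrase, pos word) - PMI(phrase, neg word).

    hits_near_pos / hits_near_neg: co-occurrence counts of the phrase with the
    positive / negative reference word.
    hits_pos / hits_neg: total counts of the reference words (must be > 0).
    smoothing: small constant added to the co-occurrence counts so that a zero
    count does not cause a division by zero (an assumption of this sketch).
    """
    numerator = (hits_near_pos + smoothing) * hits_neg
    denominator = (hits_near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)


# Long text: plenty of co-occurrence evidence, so the sign is informative.
long_text_score = so_pmi(80, 20, 1000, 1000)

# Short text: the phrase never co-occurs with either reference word, so the
# score is determined by the smoothing term alone.
short_text_score = so_pmi(0, 0, 1000, 1000)
```

With no co-occurrence evidence the score collapses to the ratio of the reference-word counts, which is the kind of sparse-data failure the passage describes for short texts.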

Embodiment Construction

[0044] Figure 1 is a schematic diagram of the short-text tendency analysis system.

[0045] The system includes three modules: a user input module 101, a tendency recognition module 102, and a tendency output module 103.

[0046] The user enters the object of interest through module 101. The input may come from an input box on a web page or from a dialog box in a client application.

[0047] The user obtains the tendency result through module 103. The result can be displayed as a web page, sent to the client for display, or used as input to other analysis systems.

[0048] Module 102 has two parts:

[0049] The analysis and identification part includes object retrieval 111, tendency feature identification 112, sentence tendency identification 113, and text tendency identification 114.

[0050] The knowledge base part includes a word tendency database 122 and a field tendency pattern database 123.
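The four steps of the analysis and identification part can be sketched as a small pipeline. This is a simplified illustration, not the patented method: the dictionary `WORD_TENDENCY` is a hypothetical stand-in for the word tendency database 122, all function names are invented, English sample text is used for readability, and sentence tendency is reduced to a word lookup rather than the semantic structure analysis the patent describes.

```python
import re

# Hypothetical stand-in for the word tendency database 122.
WORD_TENDENCY = {"great": 1.0, "love": 1.0, "slow": -1.0, "terrible": -1.0}


def retrieve_objects(texts, target):
    """Step 111: keep only the short texts that mention the object of interest."""
    return [t for t in texts if target.lower() in t.lower()]


def split_sentences(text):
    """Split a short text on Western and Chinese sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"[.!?。！？]+", text) if s.strip()]


def identify_features(sentence):
    """Step 112: find the words that carry a known tendency."""
    words = re.findall(r"\w+", sentence.lower())
    return [w for w in words if w in WORD_TENDENCY]


def sentence_tendency(sentence):
    """Step 113 (simplified): sum the tendencies of the features found."""
    return sum(WORD_TENDENCY[w] for w in identify_features(sentence))


def text_tendency(text):
    """Step 114: accumulate the sentence results into one value for the text."""
    return sum(sentence_tendency(s) for s in split_sentences(text))


def analyze(texts, target):
    """Module 102 as a whole: retrieval followed by per-text scoring."""
    return {t: text_tendency(t) for t in retrieve_objects(texts, target)}


posts = [
    "The PhoneX camera is great. The battery is slow!",
    "I love my PhoneX",
    "Nice weather today",
]
result = analyze(posts, "phonex")
```

In this sketch `result` maps each retrieved post to its accumulated value; the third post is dropped at step 111 because it does not mention the object of interest.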

[0051]

[0052] Figure 2 ...

Abstract

Disclosed are a system and a method for analyzing the tendency of short text using Chinese semantic information processing technology. The system comprises a user input module 101, a tendency identification module 102, and a tendency output module 103. The method has two parts: an identification part and a tendency knowledge base. The identification part involves four steps: object retrieval 111, tendency feature identification 112, sentence tendency identification 113, and text tendency identification 114. The tendency knowledge base includes a word tendency database 122 and a field tendency pattern database 123. The field tendency pattern database 123 stores, field by field, the semantic patterns of tendency expressions across each whole field; each pattern has the form "a semantic attribute plus an attribute value leads to a tendency". The sentence tendency identification module 113 performs semantic structure analysis on the input sentence, produces the sentence's semantic structure, and from it derives a set of tendency values. The text tendency identification 114 accumulates the results of all sentences and provides a final tendency value.
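One way to read the "semantic attribute plus attribute value leads to a tendency" format is as a lookup keyed by field, attribute, and value. The sketch below assumes that reading. The table `FIELD_PATTERNS`, its entries, and both function names are hypothetical, and the semantic structure of each sentence is supplied by hand because the patent's analyzer is not described in this excerpt.

```python
# Hypothetical entries for the field tendency pattern database 123:
# (field, semantic attribute, attribute value) -> tendency.
FIELD_PATTERNS = {
    ("phone", "battery life", "long"): 1,
    ("phone", "battery life", "short"): -1,
    ("phone", "price", "high"): -1,
    ("phone", "price", "low"): 1,
}


def sentence_tendencies(field, semantic_structure):
    """Return the set of tendency values for one sentence.

    semantic_structure is a list of (attribute, value) pairs, assumed to be
    the output of a semantic structure analyzer that is not shown here.
    Pairs with no matching pattern contribute nothing.
    """
    return [
        FIELD_PATTERNS[(field, attribute, value)]
        for attribute, value in semantic_structure
        if (field, attribute, value) in FIELD_PATTERNS
    ]


def text_tendency(field, sentences):
    """Accumulate the tendency values of all sentences into a final value."""
    return sum(sum(sentence_tendencies(field, s)) for s in sentences)


# Two sentences from one short text, already reduced to (attribute, value) pairs.
sentences = [
    [("battery life", "long")],
    [("price", "high"), ("screen", "large")],
]
final_value = text_tendency("phone", sentences)
```

Keying the pattern on the attribute as well as the value lets a word such as "high" or "long" carry a different tendency depending on what it describes, which a plain word-level lexicon cannot express.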

Description

Technical field

[0001] The present invention relates to information processing and analysis technology, and more specifically to a system and method that use Chinese information processing technology to analyze the tendencies expressed in short text content.

Background

[0002] With the development of the Internet, more and more user-generated content (UGC) has appeared online. Since the emergence of BBS forums, and especially Weibo, a large share of UGC on the Internet consists of short texts (a Weibo post is limited to 140 characters). When users write short texts, they often state their own tendencies clearly (liking for a product, attitude toward an event, and so on), which is of great significance to Internet information monitoring, information processing, and analysis.

[0003] The more commonly used method of text tendency analysis is to use statistical methods to conduct statistical analysis on the tendency words that appear...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27, G06F17/30
Inventor: not disclosed (不公告发明人)
Owner: ZHONGKE DINGFU BEIJING TECH DEV