Method for extracting attributes and comment words using Internet-based templates

An Internet-based attribute extraction technology, applied in the field of concept attribute and comment word extraction, that addresses the problem that search engines cannot understand words intuitively or equate a word with its descriptive attributes.

Status: Inactive. Publication date: 2010-05-05
SHANGHAI SECOND POLYTECHNIC UNIVERSITY
0 cites, 23 cited by

AI Technical Summary

Problems solved by technology

When people see the word "apple", they know that it refers to a round, juicy, delicious food. Search engines, however, cannot understand the word in this intuitive way: they cannot treat "apple" as equivalent to "round", "juicy", and "very delicious thing".
Therefore, given the massive amount of information on the Internet, it is difficult to find the desired answer directly with a search engine.

Examples

Embodiment Construction

[0030] Using Visual C++ 6.0 as the experimental environment, the Internet-based method of extracting concept attributes was tested experimentally to verify its feasibility and effectiveness.

[0031] First, two language experts provide three text files: (1) several templates, drawn from several websites and reviewed by the language experts; (2) candidate attributes labeled for each product entity, used as the training data set (train); and (3) a selection of 17 attribute-marker affixes. Four experimental modules are then created:

[0032] The first module computes the feature values of the classifier. Using the manually supplied templates (Figure 1) and the manually labeled attribute dictionary, it calculates the Internet-based PMI feature values and displays them in the format "f_1:0.000319". Hits(D+I) and Hits(I) are saved so that next time they can be read from the cache without querying the Internet again. Obtain another characteristic va...
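The cached PMI computation described in [0032] can be sketched roughly as follows. This is an illustrative sketch, not the patent's Visual C++ implementation. It assumes the PMI feature is the ratio Hits(D+I) / Hits(I), a common form of Web-based PMI that the source does not spell out, and it assumes the joint query is simply the discriminator phrase followed by the instance. The names `make_cached_hits`, `pmi_feature`, `format_feature`, and the `search_hits` callback are hypothetical.

```python
from typing import Callable, Dict


def make_cached_hits(search_hits: Callable[[str], int]) -> Callable[[str], int]:
    """Wrap a hit-count lookup so each query is sent to the backend only once.

    `search_hits` is a hypothetical callback returning the number of search
    results for a query string; any real search backend would go here.
    """
    cache: Dict[str, int] = {}

    def hits(query: str) -> int:
        if query not in cache:
            cache[query] = search_hits(query)
        return cache[query]

    return hits


def pmi_feature(hits: Callable[[str], int], discriminator: str, instance: str) -> float:
    """Assumed PMI form: Hits(D+I) / Hits(I); returns 0.0 when Hits(I) is zero."""
    joint = hits(f"{discriminator} {instance}")  # Hits(D+I)
    alone = hits(instance)                       # Hits(I)
    return joint / alone if alone else 0.0


def format_feature(index: int, value: float) -> str:
    """Render a feature in the "f_1:0.000319" style quoted in the text."""
    return f"f_{index}:{value:.6f}"


if __name__ == "__main__":
    # Made-up hit counts standing in for a live search engine.
    fake_counts = {"camera battery": 319, "battery": 1_000_000}
    hits = make_cached_hits(lambda q: fake_counts.get(q, 0))
    print(format_feature(1, pmi_feature(hits, "camera", "battery")))
```

Keeping the cache inside a closure means repeated feature computations for the same discriminator and instance reuse the stored counts, which matches the stated goal of not traversing the Internet a second time.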

Abstract

The invention provides an Internet-based method for extracting attributes and comment words with templates. The method comprises: storing, in a machine-readable dictionary, attribute templates for concepts whose categories have been manually labeled; comparing classification algorithms by their evaluation metrics and selecting maximum entropy, and determining the PMI values and attribute words used to train the classification model; extracting an initial attribute set by matching templates against the Internet, filtering the attributes with the classification rules, expanding the attribute set with conjunction-phrase templates based on Resnik's hypothesis, and filtering again, so that extraction and filtering form an iterative cycle; evaluating the attribute extraction method by precision, recall, and the combined F-measure, and plotting the precision-recall curve; and, for the product attributes extracted from the Internet, extracting the individual comment words attached to each attribute to form valid attribute-evaluation pairs, judging each evaluation as positive or negative according to the lexical characteristics of its evaluative adjective, and generating a composite market-feedback value.
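The evaluation step in the abstract (precision, recall, and the combined F index) can be illustrated with a short sketch. It assumes the F index is the balanced F1 measure, which the source does not state explicitly. The function name and the sample attribute sets are hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall_f(extracted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float, float]:
    """Compare extracted attributes with a manually labeled gold set.

    Returns (precision, recall, F), where F is assumed to be the balanced
    F1 measure: 2 * P * R / (P + R).
    """
    extracted_set, gold_set = set(extracted), set(gold)
    correct = len(extracted_set & gold_set)
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Made-up example: attributes extracted for a camera versus a gold list.
    extracted = ["price", "battery", "lens", "weather"]
    gold = ["price", "battery", "lens", "screen", "zoom"]
    p, r, f = precision_recall_f(extracted, gold)
    print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Recomputing these three values after each filtering pass of the iterative cycle yields the points needed to plot a precision-recall curve.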

Description

Technical field

[0001] The invention relates to the field of information retrieval, and in particular to a method for extracting concept attributes and comment words.

Background technique

[0002] Surveys worldwide show that search engines are the most used service on the Internet after e-mail. They are so popular because they remove the bottleneck users face in quickly locating information on the vast Internet. However, searching for information in the traditional way still costs users a great deal of time and effort, because the traditional approach is only simple symbol processing. Computers do not think as the human brain does: humans can directly understand the meaning of words and the ideas of an article, while machines and algorithms cannot. When people see the word "apple", they know that it refers to a round, juicy, delicious food. However, search engines can't understand it in this intuitive way, and they can't compare apples with "roun...

Application Information

IPC(8): G06F17/30, G06F17/27
Inventor: 吴月萍
Owner: SHANGHAI SECOND POLYTECHNIC UNIVERSITY