Method and equipment for searching name related to thematic word from network

A technique, with a corresponding device, for finding names related to a subject word on the Internet. It addresses the following problems: web pages are difficult for machines to read and analyze, unregistered products are poorly supported, and labeling product data types is time-consuming.

Active Publication Date: 2013-06-12
RICOH KK

AI Technical Summary

Problems solved by technology

A web page is a form of semi-structured data that contains a great deal of irrelevant information, which makes it very difficult for machines to read and analyze.
Patent document 2 relies on general keywords within specific categories, such as well-known trademarks, and therefore cannot support unregistered products well. It also takes a lot of time to label the different product data types.



Examples


Embodiment Construction

[0037] Embodiments of the present invention are described below in conjunction with the accompanying drawings.

[0038] Figure 1 shows an example application in which an embodiment of the present invention finds names related to a subject word on the web and outputs them in sorted order. As shown in Figure 1, if the subject word "digital camera" is entered in the area marked by rectangular box Q1, the embodiment finds the related product names and, after sorting them, outputs them in the area marked by rectangular box Q2.

[0039] Figure 2 schematically shows an application in which an embodiment of the present invention finds names related to a subject word from a network and outputs them in sorted order. The input is a topic, that is, a category name. The embodiment of the present invention carries out the process of searching for related names, such as obtaining web pages from the Internet, fin...


Abstract

The invention provides a method for searching a network for names related to thematic words. The method comprises the following steps: searching the network for web pages related to the thematic words, then filtering and analyzing those web pages; extracting image names from the image nodes in the DOM (Document Object Model) tree of each web page and matching the image names with the surrounding text to form a first set of candidate names; converting the DOM tree of each web page into a code sequence, determining the repeated subsequences in that code sequence, and extracting from the first set the candidate names that correspond to the repeated subsequences to form a second set of candidate names; filtering the names in the second set according to preset rules and preset templates to determine the candidate names of the web page; filtering the candidate names gathered from multiple web pages according to the relationships among the candidate names, or between the candidate names and the thematic words, to obtain the names related to the thematic words; and calculating a score for each name and ranking the names by score. The invention also provides corresponding equipment for searching the network for names related to the thematic words.
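The two candidate-set steps can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that an image's `alt` attribute serves as the "image name", that matching against "surrounding text" can be approximated by checking whether the name appears anywhere in the page text, and that the "code sequence" is simply the list of opening-tag names in document order. The names `PageEncoder`, `first_candidate_set`, `repeated_windows`, `candidate_names`, and `SAMPLE_PAGE` are all hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical sample page: two list items share a structure, one stray image does not.
SAMPLE_PAGE = """
<div>
<ul>
<li><img src="a.jpg" alt="Camera A"><span>Camera A</span></li>
<li><img src="b.jpg" alt="Camera B"><span>Camera B</span></li>
</ul>
<p><img src="logo.png" alt="Camera shop"> Camera shop</p>
</div>
"""


class PageEncoder(HTMLParser):
    """Flattens a page into a sequence of opening-tag names (assumed 'code sequence')."""

    def __init__(self):
        super().__init__()
        self.codes = []        # tag names in document order
        self.image_names = {}  # position in codes -> alt text of that <img>
        self.text_parts = []   # visible text fragments

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.image_names[len(self.codes)] = alt.strip()
        self.codes.append(tag)

    def handle_data(self, data):
        self.text_parts.append(data)


def first_candidate_set(parser):
    """Keep image names that also occur in the page text (a loose stand-in for 'surrounding text')."""
    page_text = " ".join(parser.text_parts)
    return {pos: name for pos, name in parser.image_names.items() if name in page_text}


def repeated_windows(codes, length):
    """Return every window of `length` consecutive codes that occurs at least twice."""
    counts = Counter(tuple(codes[i:i + length]) for i in range(len(codes) - length + 1))
    return {window for window, n in counts.items() if n >= 2}


def candidate_names(html, length=3):
    """Second candidate set: first-set names located inside a repeated subsequence."""
    parser = PageEncoder()
    parser.feed(html)
    parser.close()
    first = first_candidate_set(parser)
    repeats = repeated_windows(parser.codes, length)
    second = []
    for i in range(len(parser.codes) - length + 1):
        if tuple(parser.codes[i:i + length]) in repeats:
            for j in range(i, i + length):
                name = first.get(j)
                if name is not None and name not in second:
                    second.append(name)
    return second
```

Calling `candidate_names(SAMPLE_PAGE)` keeps "Camera A" and "Camera B", which sit inside the repeated `li, img, span` pattern, and drops "Camera shop", which passes the text match but appears only once structurally. The fixed window length is a simplification; a fuller version would search for repeats of varying length.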

Description

Technical field

[0001] The invention relates to a method for finding names related to a subject word from a network and a device for finding names related to a subject word from the network.

Background technique

[0002] With the development of computer and network technology, the demand for finding useful information from network resources is also increasing. Product review, ranking, and description pages abound on the Internet. In many cases (product research, market analysis, and strategy formulation), it is desirable to find the exact relevant product names for a particular topic, such as automatically finding all results for a category online. Such names exist on a large scale and change dynamically on the Internet. From a human point of view, identifying and categorizing these names from web pages is not a big problem, but it is very time consuming. In addition, users searching for a name often want to know product ranking information, such as which product is the m...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 谢宣松, 姜珊珊, 孙军, 郑继川
Owner: RICOH KK