Network data mining method based on vertical search

A vertical search and network data mining technology, applied in the fields of electronic digital data processing and special data processing applications. It addresses the problems that network data is hard to organize effectively, contains little useful information, and carries a large amount of repeated and junk information. The method yields less repeated and junk information, more comprehensive and in-depth data, and more timely content.

Status: Inactive
Publication Date: 2008-03-12
NANJING UNIV OF FINANCE & ECONOMICS
Cites: 0; Cited by: 18

AI Technical Summary

Problems solved by technology

[0005] (1) Disordered information leads to little useful information.
Most network data exists as unstructured web page text, so it is difficult to organize effectively. It also contains a great deal of repeated information and spam, which forms strong noise and makes querying like looking for a needle in a haystack.
[0006] (2) Information search lacks professional specialization, industry boundaries, and any distinction between user groups and levels, so it cannot satisfy users' customized, professional queries.



Examples


Embodiment Construction

[0018] The present invention will be further described below in conjunction with drawings and embodiments.

[0019] To let users search for specific professional information conveniently and quickly, the network data mining method based on vertical search described in the present invention is adopted. It includes the following steps:

[0020] (1) Data sampling based on vertical search, that is, collecting data from the Internet. A vertical search engine is a specialized search tool built for querying information on a particular industry or subject; it consists mainly of a component that identifies the subject of web pages on the Internet and a web spider crawling program. Compared with general search, a vertical search engine classifies content in finer detail, its data is more comprehensive and in-depth, and its content is more timely. It integrates a particular type of specialized information from the web page library, extracts the required data in a targeted manner, and then returns it to the user in some form. The important thing ...
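The sampling step in [0020] combines subject identification with a web spider. The following is a minimal sketch of that idea, not the patent's actual implementation. It assumes a simple keyword-overlap score as the subject test, and it uses an in-memory dictionary in place of real HTTP fetches. The names `topic_score`, `focused_crawl`, `pages`, and `threshold` are hypothetical, and `topic_terms` is assumed to be a non-empty list of lowercase words.

```python
from collections import deque


def topic_score(text, topic_terms):
    """Fraction of topic terms that occur in the page text (case-insensitive)."""
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)


def focused_crawl(pages, seed, topic_terms, threshold=0.5):
    """Breadth-first crawl that keeps, and follows links from, on-topic pages only.

    `pages` maps a URL to (text, outgoing_links); it stands in for real fetches.
    """
    queue, seen, kept = deque([seed]), {seed}, []
    while queue:
        url = queue.popleft()
        text, links = pages.get(url, ("", []))
        if topic_score(text, topic_terms) < threshold:
            continue  # off-topic page: discard it and do not expand its links
        kept.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


# Toy page graph used only for illustration.
pages = {
    "a": ("bank loan interest rates", ["b", "c"]),
    "b": ("football match results", ["d"]),
    "c": ("loan interest news", ["d"]),
    "d": ("interest on a bank loan", []),
}
on_topic = focused_crawl(pages, "a", ["loan", "interest"])
```

Dropping off-topic pages before their links are expanded is what keeps the crawl within one subject area; a real crawler would likely use a stronger subject classifier than keyword overlap.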



Abstract

The present invention discloses a network data mining method based on vertical search. First, vertical search is used to collect data from the network. The collected information is preprocessed, and the structured data obtained after data cleaning is saved in a database. The data in the database is analyzed to find rules and build a model, and the feature vector of the collected set is matched against the feature vector of a target sample to obtain the degree of association of the collected information. The model is then used to predict unknown data, the prediction is evaluated against the actual result, and the original model parameters are revised accordingly, so that authoritative information is supplied to the user. By using vertical search to gather relevant information, the invention obtains relevant specialized information effectively, with little repeated information or spam, and can meet users' specialized queries.
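The abstract does not say how the feature vectors are built or how the degree of association is computed. As one possible reading, the sketch below assumes term-frequency vectors and cosine similarity; both choices are assumptions, and the names `feature_vector` and `association_degree` are hypothetical.

```python
import math
from collections import Counter


def feature_vector(text):
    """Term-frequency vector of a text (assumed representation)."""
    return Counter(text.lower().split())


def association_degree(collected, target):
    """Cosine similarity between two term-frequency vectors, in [0, 1]."""
    shared = collected.keys() & target.keys()
    dot = sum(collected[t] * target[t] for t in shared)
    norm = math.sqrt(sum(v * v for v in collected.values())) * math.sqrt(
        sum(v * v for v in target.values())
    )
    return dot / norm if norm else 0.0


target = feature_vector("bank loan interest rate")
related = association_degree(feature_vector("loan interest rate news"), target)
unrelated = association_degree(feature_vector("football match results"), target)
```

Under these assumptions, collected items whose score against the target sample falls below some cutoff could be treated as noise, which matches the stated goal of reducing repeated information and spam.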

Description

Technical field

[0001] The invention relates to a method for collecting network data, in particular to a method for mining network data based on vertical search.

Background technique

[0002] With the rapid development of network communication technology, the World Wide Web has become a huge distributed information space containing potentially valuable knowledge. Network data contains a great deal of useful, latent knowledge and patterns that are not easy to find, and people urgently need methods and tools for discovering and accessing them. Search engines collect and discover information on the Internet, understand, extract, organize, and process it, and provide retrieval services to users, while network data mining uses data mining techniques to extract useful patterns and hidden information from network data. Search engines therefore provide data preparation for web data mining, which is an advanced application of search engines. [000...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/06, G06F17/30
Inventor: 曹杰, 章舜仲, 刘军
Owner: NANJING UNIV OF FINANCE & ECONOMICS