
Deep Web data source classification management method based on a query interface connection graph

A technology for query interfaces and web page data, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problem that existing research introduces no new methods, and achieves the effect of improving performance.

Inactive. Publication Date: 2011-06-01
束兰

AI Technical Summary

Problems solved by technology

Classification and clustering are important methods for managing data sources in data integration. At present, most research on query interfaces is limited to mining Web features and text information for use in classification and clustering, and introduces no new methods. Introducing graph models provides a new avenue for research on the classification management of Deep Web data sources, an area that researchers have not yet explored.



Examples


Embodiment 1

[0038] Embodiment 1: As shown in Figures 1 to 6, a Deep Web data source classification management method based on a query interface connection graph includes the following steps:

[0039] (1) Obtain a collection of Deep Web query interface forms;

[0040] (2) Automatically extract the feature values of the query interface forms obtained in step (1); the feature values include the names and attribute values of the form tags;

[0041] (3) Construct form feature vectors: build the feature spaces LS and VS from the names and attribute values of the extracted tags, respectively, then construct a corresponding feature vector for the feature set that each form has in LS and in VS, thereby obtaining a collection of vectors;

[0042] (4) For each vector in the set obtained in step (3), obtain the query interface connection graph for labels, attribute values, and combinations of labels and attribute values through similarity ...
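
Steps (1) to (3) can be illustrated with a minimal Python sketch. It rests on assumptions the text does not confirm: "tag names" are read as HTML tag names inside each `<form>`, attribute values are lowercased, and each feature vector is a binary presence vector over LS or VS. The names `FormFeatureParser`, `extract_features`, and `build_vectors` are hypothetical.

```python
from html.parser import HTMLParser


class FormFeatureParser(HTMLParser):
    """Collects tag names and attribute values found inside <form> elements."""

    def __init__(self):
        super().__init__()
        self.in_form = False
        self.labels = set()  # tag names seen inside a form
        self.values = set()  # attribute values seen inside a form

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.in_form = True
        if self.in_form:
            self.labels.add(tag)
            for _, value in attrs:
                if value:  # skip valueless attributes such as "checked"
                    self.values.add(value.lower())

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def extract_features(form_html):
    """Step (2): return (tag names, attribute values) for one form."""
    parser = FormFeatureParser()
    parser.feed(form_html)
    return parser.labels, parser.values


def build_vectors(forms):
    """Step (3): build feature spaces LS and VS and one vector per form in each.

    Assumption: binary presence vectors; the source does not state a weighting.
    """
    features = [extract_features(f) for f in forms]
    LS = sorted(set().union(*(labels for labels, _ in features)))
    VS = sorted(set().union(*(values for _, values in features)))
    label_vecs = [[1 if t in labels else 0 for t in LS] for labels, _ in features]
    value_vecs = [[1 if t in values else 0 for t in VS] for _, values in features]
    return LS, VS, label_vecs, value_vecs
```

Calling `build_vectors` on a list of form HTML strings returns the two feature spaces together with one label vector and one value vector per form, which is the input that the similarity comparison in step (4) needs.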


Abstract

The present invention discloses a Deep Web data source classification management method based on a query interface connection graph. The method includes the following steps: (1) acquiring a collection of Deep Web query interface forms; (2) automatically extracting the feature values of the query interface forms acquired in step (1), the feature values comprising form label names and attribute values; (3) constructing feature vectors for the forms; (4) acquiring association adjacency matrices for labels, attribute values, and the combination of labels and attribute values through similarity comparison among the vectors; (5) constructing a connection graph of the query interface form collection, which can be expressed by the association adjacency matrix; (6) clustering the weighted undirected graph with a clustering method; (7) acquiring the clustering result for the Deep Web data sources. By effectively constructing the query interface connection graph of Deep Web data sources and combining it with graph mining, the invention improves the performance of automatic classification management for large-scale Deep Web data sources.
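
Steps (4) to (7) might look roughly like the Python sketch below. The abstract names neither the similarity measure nor the clustering method, so both are assumptions here: cosine similarity fills the adjacency matrix, and a threshold followed by connected components stands in for the unspecified graph clustering. The function names and the `threshold` parameter are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def adjacency_matrix(vectors):
    """Steps (4)-(5): symmetric weight matrix of the weighted undirected graph."""
    n = len(vectors)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = cosine(vectors[i], vectors[j])
    return A


def cluster_by_threshold(A, threshold=0.5):
    """Steps (6)-(7), stand-in method: keep edges with weight >= threshold and
    return one cluster id per form, one id per connected component."""
    n = len(A)
    cluster = [-1] * n
    next_id = 0
    for start in range(n):
        if cluster[start] != -1:
            continue
        cluster[start] = next_id
        stack = [start]
        while stack:  # depth-first traversal of the thresholded graph
            i = stack.pop()
            for j in range(n):
                if cluster[j] == -1 and A[i][j] >= threshold:
                    cluster[j] = next_id
                    stack.append(j)
        next_id += 1
    return cluster
```

One way to obtain the combined label-and-attribute-value matrix would be to concatenate each form's two vectors before calling `adjacency_matrix`; that too is an assumption rather than something the abstract specifies.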

Description

Technical field

[0001] The invention relates to a method for automatic classification management of information, in particular to a classification management method applied to Deep Web data sources.

Background

[0002] With the widespread adoption of network databases, the Web is "deepening" at an accelerating rate. A large number of pages on the Internet are generated dynamically from back-end databases. This information cannot be reached directly through static links; it can only be obtained by filling out forms and submitting queries. Because traditional web crawlers cannot fill out forms, they cannot obtain these pages. Existing search engines therefore cannot index this page information, which leaves it hidden and invisible to users. We call it the Deep Web (also known as the Invisible Web or Hidden Web). Deep Web is a concept correspondin...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 崔志明, 赵朋朋, 方巍
Owner: 束兰