Distributed search method and system

A distributed retrieval and indexing technology, applied in the field of computer communication, that addresses slow index-file querying and the limited number of index files a single machine can store, improving retrieval speed and efficiency.

Active Publication Date: 2012-05-02
NO 15 INST OF CHINA ELECTRONICS TECH GRP
6 cites; cited by 28

AI Technical Summary

Problems solved by technology

In massive-data applications, a single machine can manage and store only a limited number of index files; if the number of saved index files ...

Example Embodiment

[0042] The core of the present invention is a distributed computing framework that uses the CPU resources of the cluster in parallel to build and query the distributed index. In addition, the technical solution of the embodiment adopts a step-by-step webpage crawling method to increase the webpage crawling speed.
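
As an illustration of building an index in parallel, the following minimal Python sketch partitions documents into shards and builds one inverted index per shard concurrently. It is not the patent's implementation: the function names, the round-robin partitioning, the whitespace tokenizer, and the use of a thread pool as a stand-in for cluster nodes are all assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_shard_index(docs):
    """Build an inverted index (term -> sorted doc ids) for one shard."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenizer (assumption)
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def partition(docs, n_shards):
    """Assign documents to shards round-robin (hypothetical policy)."""
    shards = [{} for _ in range(n_shards)]
    for i, (doc_id, text) in enumerate(docs.items()):
        shards[i % n_shards][doc_id] = text
    return shards


def build_distributed_index(docs, n_shards=2):
    """Build one index per shard; each worker stands in for an index node."""
    shards = partition(docs, n_shards)
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        return list(pool.map(build_shard_index, shards))


DOCS = {
    1: "distributed search system",
    2: "index node cluster",
    3: "distributed index",
}
SHARD_INDEXES = build_distributed_index(DOCS, n_shards=2)
```

Each shard's index is independent of the others, which is what allows the build step to be spread across machines; in a real cluster the thread pool would be replaced by remote workers.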

[0043] The technical solutions of the embodiments of the present invention are described in detail below with reference to the accompanying drawings. The distributed retrieval system shown in Figure 1 includes a collection node cluster, an index node cluster, and a retrieval node 105.

[0044] The collection node cluster includes a plurality of collection nodes 101. Each collection node 101 has a web crawler module, which fetches webpages and then performs structural processing on them, such as extracting each webpage's time, title, content, host, and other information ...
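
The structural processing step described above could be sketched as follows, using only the Python standard library. This is an assumption-laden illustration: the class and function names are hypothetical, and reading the page time from a `<meta name="date">` tag is a convention chosen for the example, not something the source specifies.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageExtractor(HTMLParser):
    """Collects the title, a date meta tag, and visible text from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.time = None
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and attrs.get("name") == "date":
            self.time = attrs.get("content")  # hypothetical source of the time

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def structure_page(url, html):
    """Turn a fetched page into a structured record ready for indexing."""
    parser = PageExtractor()
    parser.feed(html)
    parser.close()
    return {
        "host": urlparse(url).netloc,
        "title": parser.title.strip(),
        "time": parser.time,
        "content": " ".join(parser.text_parts),
    }


SAMPLE_HTML = (
    "<html><head><title>Demo page</title>"
    '<meta name="date" content="2020-01-01"></head>'
    "<body><p>Hello</p><script>var x = 1;</script><p>world</p></body></html>"
)
RECORD = structure_page("http://example.com/a.html", SAMPLE_HTML)
```

Here `RECORD` holds the host, title, time, and content fields; script and style bodies are skipped so that only visible text reaches the index.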

Abstract

The invention discloses a distributed search method and a distributed search system. In the method, a search node receives search conditions entered by a user through a client browser, processes them to generate query tasks, and sends the query tasks to an index control node; the index control node sends the query tasks to the index nodes in an index node cluster; each index node queries the index files it stores according to the received query tasks and returns its query results to the index control node; the index control node returns the received query results to the search node; and the search node merges the received query results and sends the merged result to the client. Because the index node cluster adopts a distributed structure, each index node searches its own index files during a search, so the search and query run in parallel. This greatly improves search speed and efficiency, and the search result is returned to the user promptly.
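
The query flow in the abstract can be sketched as a scatter-gather pattern. The Python example below is a minimal illustration under stated assumptions: the class names mirror the roles in the abstract but are otherwise hypothetical, threads stand in for network calls to index nodes, and the scoring and merge rule (sum of per-term scores, sorted by score, each document held by exactly one node) is invented for the example because the source does not specify how results are ranked or merged.

```python
from concurrent.futures import ThreadPoolExecutor


class IndexNode:
    """Holds one local index: term -> {doc_id: score} (hypothetical layout)."""

    def __init__(self, index):
        self.index = index

    def query(self, task):
        hits = {}
        for term in task["terms"]:
            for doc_id, score in self.index.get(term, {}).items():
                hits[doc_id] = hits.get(doc_id, 0.0) + score
        return list(hits.items())


class IndexControlNode:
    """Fans a query task out to every index node and collects the results."""

    def __init__(self, nodes):
        self.nodes = nodes

    def dispatch(self, task):
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            return list(pool.map(lambda node: node.query(task), self.nodes))


class SearchNode:
    """Turns search conditions into a query task and merges partial results."""

    def __init__(self, control):
        self.control = control

    def search(self, conditions, top_k=10):
        task = {"terms": conditions.lower().split()}
        partials = self.control.dispatch(task)
        merged = [hit for part in partials for hit in part]
        # Assumed merge rule: highest score first, doc id breaks ties.
        merged.sort(key=lambda hit: (-hit[1], hit[0]))
        return merged[:top_k]


NODES = [
    IndexNode({"distributed": {1: 2.0}, "search": {1: 1.0, 3: 0.5}}),
    IndexNode({"search": {2: 3.0}, "index": {4: 1.0}}),
]
RESULTS = SearchNode(IndexControlNode(NODES)).search("distributed search")
```

Because every index node evaluates the same task against its own index files, the per-node work proceeds independently and only the small result lists travel back to the search node for merging.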

Description

Technical field

[0001] The invention relates to computer communication technology, and in particular to distributed retrieval technology.

Background art

[0002] Information retrieval is one of the key technologies of the modern information society. It refers to the process and technology of organizing and storing information in a certain way and finding the required information according to users' information needs; hence its full name is "information storage and retrieval". With the rapid development of the Internet worldwide, the volume of digital information has exploded. At present, the main data source of retrieval systems is the web, and the retrieval of network information has become a development trend. Network information retrieval can be divided into the following parts:

[0003] Data preprocessing: The main data source of network information is the web, and the formats ...

Application Information

IPC(8): G06F17/30
Inventor 吴卫荣, 刘玉龙, 仪新宇, 徐华, 王团伟, 陈正中, 李志雄, 耿庆斌, 袁平, 杜善姗
Owner NO 15 INST OF CHINA ELECTRONICS TECH GRP