Distributed real-time search engine based on p2p

A distributed real-time search engine technology, applied in the field of distributed real-time search engines. It addresses the problems that the client cannot obtain the latest data immediately, that real-time data cannot be guaranteed, and that running speed is reduced. Its effects are a shorter search path and a higher retrieval speed.

Inactive Publication Date: 2013-08-21
广州市一呼百应网络技术股份有限公司

AI Technical Summary

Problems solved by technology

[0003] Existing distributed search methods are mainly centralized, with a master-slave structure. A centralized server acts as the master node and manages all child nodes; the master node distributes requests to the child nodes to realize distributed search, and it uniformly provides the retrieval service. This centralized search engine has the following disadvantages: (1) Real-time data cannot be guaranteed. Data is written to the master node first and then synchronized from the master node to the child nodes, so there is a time interval between the update on the master node and the update on a child node. If a client issues a query immediately after the master node is updated, and the query is served by a child node that has not yet synchronized, the newly updated data cannot be found, so the client cannot obtain the latest data immediately. (2) Since all indexing and searching pass through the master node, once the master node fails the entire cluster becomes unsearchable and the whole system collapses. In addition, when the network is busy and a large number of clients visit frequently, the master node comes under excessive pressure and can easily fail from oversaturation. Even if no failure occurs, the master node runs at reduced speed under that pressure.


Embodiment Construction

[0021] The present invention is further described below with reference to the accompanying drawings:

[0022] As shown in Figure 1, in the p2p-based distributed real-time search engine the cluster contains several nodes, each node contains one or more indexes, and each index is divided into one or more index shards. An index shard either contains only a primary shard or contains a primary shard together with one or more replicas. Figure 1 shows a cluster of three nodes: node 1, node 2, and node 3. Each of the three nodes contains two indexes, index 1 and index 2. Index 1 is divided into three shards (shard 1, shard 2, shard 3), and each shard of index 1 has one replica. Index 2 is divided into three shards (shard 1', shard 2', shard 3'), and each shard of index 2 also has one replica. Index 1 on node 1 contains shard 1 (replica) and shard 3, and index 1 on node 2 contains shard 2, sha...
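The layout described in this paragraph (three nodes, two indexes, three shards per index, one replica per shard) can be modelled with a small data structure. The following is only a sketch under stated assumptions: the names `ShardCopy` and `place_shards` and the round-robin placement rule are hypothetical and are not taken from the patent, whose exact shard placement is not fully visible here. The only property the sketch enforces is that a primary shard and its replica never land on the same node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCopy:
    index: str     # index name, e.g. "index1"
    shard: int     # shard number within the index (1-based)
    primary: bool  # True for the primary shard, False for a replica


def place_shards(nodes, indexes, shards_per_index, replicas):
    """Return {node: [ShardCopy, ...]}.

    Placement rule is a hypothetical round-robin, not the patent's layout:
    copy k of shard s goes to node (s + k) mod len(nodes).
    """
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    layout = {node: [] for node in nodes}
    for index in indexes:
        for shard in range(shards_per_index):
            # copy 0 is the primary; copies 1..replicas are replicas
            for copy in range(replicas + 1):
                node = nodes[(shard + copy) % len(nodes)]
                layout[node].append(ShardCopy(index, shard + 1, copy == 0))
    return layout


layout = place_shards(["node1", "node2", "node3"], ["index1", "index2"], 3, 1)
for node, copies in layout.items():
    print(node, [(c.index, c.shard, "primary" if c.primary else "replica")
                 for c in copies])
```

With these parameters there are 12 shard copies in total (2 indexes, 3 shards each, a primary and one replica per shard), spread evenly so that each node holds four.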


Abstract

The invention discloses a distributed real-time search engine based on p2p. The cluster of the search engine comprises a plurality of nodes; each node comprises one or more indexes, each index is divided into one or more index shards, and each index shard either comprises only a primary shard or comprises a primary shard together with one or more replicas. The nodes are independent of one another, are connected through a peer-to-peer network, and communicate with one another by broadcast or multicast. Each independent node stores a cluster index metadata table that reflects the index information of all nodes in the whole cluster. When indexes are updated or searched, a node reads its cluster index metadata table and, according to the attributes of the request, sends the request to the corresponding nodes for indexing or searching. With this method no master node is needed, the search path is shortened, and the search speed is improved. The search engine also has high fault tolerance: even if one node breaks down, the other nodes continue to work normally, and no additional burden is placed on the system.
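The routing step in the abstract, where a node consults its local copy of the cluster index metadata table instead of asking a master node, can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the table layout, the node names in `METADATA`, the CRC32-based choice of shard, and the rule that writes go to the primary while searches may use any live copy are all hypothetical details added for the example.

```python
import zlib

# Hypothetical cluster index metadata table; in the described design every
# node would hold its own copy of a table like this.
METADATA = {
    "index1": {
        1: {"primary": "node2", "replicas": ["node1"]},
        2: {"primary": "node3", "replicas": ["node2"]},
        3: {"primary": "node1", "replicas": ["node3"]},
    },
}


def route_index_request(metadata, index, doc_id):
    """Pick the shard for a document and return (shard, primary node).

    Assumption: the shard is chosen by a stable hash of the document id.
    """
    shards = metadata[index]
    shard_ids = sorted(shards)
    shard = shard_ids[zlib.crc32(doc_id.encode("utf-8")) % len(shard_ids)]
    return shard, shards[shard]["primary"]


def route_search_request(metadata, index, live_nodes):
    """Return {shard: node}, choosing one live copy of every shard."""
    targets = {}
    for shard, copies in metadata[index].items():
        candidates = [copies["primary"], *copies["replicas"]]
        alive = [node for node in candidates if node in live_nodes]
        if not alive:
            raise RuntimeError(f"no live copy of {index} shard {shard}")
        targets[shard] = alive[0]
    return targets


# Any node can answer this locally. Here node2 is assumed to be down,
# so shard 1 is served from its replica on node1.
print(route_search_request(METADATA, "index1", {"node1", "node3"}))
```

Because every node can compute the targets from its own table, a search is sent directly to the nodes holding the shards, and the loss of a single node only redirects requests to the surviving copies.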

Description

Technical field

[0001] The invention relates to a search engine, in particular to a distributed real-time search engine based on p2p.

Background art

[0002] Information on the Internet is growing explosively and is updated in real time at high speed. How to obtain information accurately, quickly, and in a timely manner has therefore become the primary problem that search engines need to solve.

[0003] Existing distributed search methods are mainly centralized, with a master-slave structure. A centralized server acts as the master node and manages all child nodes; the master node distributes requests to the child nodes to realize distributed search, and it uniformly provides the retrieval service. This centralized search engine has the following disadvantages: (1) Real-time data cannot be guaranteed. Since the data is updated to the master node first, and then the master node is updated t...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 戴森
Owner: 广州市一呼百应网络技术股份有限公司