Index segmenting equalization based big data cloud search platform and method thereof

Technology for index sharding and cloud search, applied in the field of big data information search, that addresses the difficulty of allocating index shards.

Active Publication Date: 2017-03-22
深圳市盛凯信息科技有限公司

AI Technical Summary

Problems solved by technology

The present invention aims to solve the problem that in the big data cloud search platform based on the Apache Luc

Embodiment Construction

[0051] The technical solutions of the present invention will be further specifically described below through examples.

[0052] Figure 2 is a schematic diagram of the architecture of the big data cloud search platform based on index shard balancing according to the present invention. The platform includes:

[0053] Apache Lucene engine unit 1. This unit is built on the Apache Lucene search engine architecture and includes an analyzer, an index writer, and a query engine module. The Apache Lucene engine unit converts data source files of various types, such as web pages, Word documents, and PDF documents, into source text data and passes that text to the analyzer. The analyzer converts the source text data into tokens, which are subsequently added to the index as "terms". The index writer is responsible for generating and managing the index, and saves the tokens produced by the analyzer in the index data structure; the inde...
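The analyzer, index writer, and query flow described in [0053] can be illustrated with a small pure-Python sketch. This is not the Apache Lucene API and not the patent's implementation: `analyze`, `ToyIndexWriter`, and `search` are hypothetical names chosen for illustration, the tokenizer is a simple lowercase split, and the index is a plain in-memory mapping from terms to document ids.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndexWriter:
    """Toy index writer: stores each token as a term in an inverted index."""

    def __init__(self):
        # term -> set of ids of the documents that contain the term
        self.postings = defaultdict(set)

    def add_document(self, doc_id, text):
        for term in analyze(text):
            self.postings[term].add(doc_id)


def search(writer, query):
    """Toy query engine: return ids of documents containing every query term."""
    terms = analyze(query)
    if not terms:
        return set()
    result = set(writer.postings.get(terms[0], set()))
    for term in terms[1:]:
        result &= writer.postings.get(term, set())
    return result


writer = ToyIndexWriter()
writer.add_document(1, "Cloud search platform for big data")
writer.add_document(2, "Index shards are spread over cluster nodes")
print(search(writer, "cloud search"))
```

The query is passed through the same analyzer as the documents so that query terms and index terms are normalized in the same way.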

Abstract

The invention provides a big data cloud search platform based on index shard balancing, and a method thereof, which address the difficulty of allocating index shards reasonably and efficiently among the nodes of a cluster in a big data cloud search platform based on the Apache Lucene engine. The platform and method allocate index shards according to the load balancing principle and, based on the content relevancy of the shards already allocated, place index shards with high content relevancy on different nodes. As a result, the computing load caused by querying and invoking index shards is distributed evenly among all nodes, which avoids delays caused by overloading some nodes in the cluster and reduces the number of nodes left largely idle.
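One way to picture the allocation idea in the abstract is a greedy placement that prefers lightly loaded nodes and penalizes putting a shard next to a highly related one. The sketch below is an assumption for illustration only: the excerpt does not give the patent's scoring formula, so the Jaccard similarity over term sets, the `relevancy_weight` parameter, and the names `jaccard` and `allocate` are all hypothetical.

```python
def jaccard(a, b):
    """Content relevancy stand-in: Jaccard similarity of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def allocate(shards, node_names, relevancy_weight=1.0):
    """Greedily place shards on nodes.

    shards: dict mapping shard name -> (estimated load, set of terms).
    Returns (placement, load), where placement maps node -> list of shard
    names and load maps node -> total estimated load.
    """
    placement = {node: [] for node in node_names}
    load = {node: 0.0 for node in node_names}
    # Place the heaviest shards first so later, lighter shards can even things out.
    ordered = sorted(shards.items(), key=lambda item: -item[1][0])
    for name, (shard_load, terms) in ordered:

        def cost(node):
            # Highest relevancy between this shard and any shard already on the node.
            overlap = max(
                (jaccard(terms, shards[other][1]) for other in placement[node]),
                default=0.0,
            )
            return load[node] + shard_load + relevancy_weight * shard_load * overlap

        best = min(node_names, key=cost)
        placement[best].append(name)
        load[best] += shard_load
    return placement, load


shards = {
    "a": (10.0, {"cloud", "search"}),
    "b": (10.0, {"cloud", "search"}),
    "c": (10.0, {"yarn", "fabric"}),
    "d": (10.0, {"yarn", "fabric"}),
}
placement, load = allocate(shards, ["node1", "node2"])
print(placement, load)
```

The intuition behind the penalty is that shards with similar content tend to be hit by the same queries, so placing them on different nodes spreads the work of a single query across the cluster.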

Description

Technical field

[0001] The invention relates to big data information search technology implemented on a cloud computing platform, and in particular to a big data cloud search platform and method based on index shard balancing.

Background technique

[0002] In the big data era of exploding network information, building efficient, easy-to-use, and accurate search functions and platforms is a common need. Professional search sites such as Google and Baidu must be continuously optimized and upgraded, and ordinary portals, forums, social networks, and business websites also want to embed powerful, resource-saving, and easy-to-implement intranet and network-wide search tools for the convenience of their target customers.

[0003] Apache Lucene is an open source, highly scalable search engine architecture focused on indexing and searching network information, and can build search functions for various websites and application...

Application Information

IPC(8): G06F17/30
CPC: G06F16/316, G06F16/328, G06F16/3331, G06F16/84, G06F16/951
Inventor: 蔡叙明
Owner: 深圳市盛凯信息科技有限公司