A method and system for improving the efficiency of a big data comprehensive inquiry engine based on Hadoop

A query engine and data query technology, applied in the field of search engines, that addresses the problem that the advantages of different computing engines cannot be integrated and applied together. It improves the efficiency of big data queries, reduces business code refactoring, and reduces the mutual exclusion between computing engines.

Inactive Publication Date: 2019-03-08
SHANGHAI PAIBO SOFTWARE CO LTD
3 Cites, 12 Cited by

AI Technical Summary

Problems solved by technology

At present, the Hadoop-based big data ecosystem is growing rapidly, and query computing engines in particular are continuously updated and iterated. Computing engines designed for different scenarios and businesses differ in many ways, so their respective advantages cannot be combined on one platform or applied together across multiple converged applications.



Examples


Embodiment example

[0042] First, prepare 14 CentOS 6.5 machines, each configured with 8 cores, 32 GB of memory, and a 4 TB hard drive. On each machine, first check the host mapping file for all nodes in the Linux system: comment out the existing 127.0.0.1 and ::1 entries and add below them: 127.0.0.1 localhost. The HDP resources are uploaded to the internal cloud resource machine (by default, the machine where the ambari-server is located).
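The host mapping edit described in [0042] can be sketched as a small text transformation. This is a minimal illustration, not the patent's own tooling: it assumes the "mapping file" is the usual Linux hosts file (typically /etc/hosts), and the function name `rewrite_hosts` is hypothetical. The function works on the file's text only, so it can be tried without touching a real system.

```python
def rewrite_hosts(text: str) -> str:
    """Comment out active 127.0.0.1 and ::1 entries, then add
    '127.0.0.1 localhost' directly below the last of them."""
    loopbacks = ("127.0.0.1", "::1")
    out = []
    last_loopback = -1  # index in `out` of the last commented-out entry
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] in loopbacks:
            out.append("#" + line)  # disable the existing loopback entry
            last_loopback = len(out) - 1
        else:
            out.append(line)  # keep node mappings, comments, blank lines
    # If no loopback entry was found, the new line goes at the top.
    out.insert(last_loopback + 1, "127.0.0.1 localhost")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "127.0.0.1   localhost localhost.localdomain\n::1   localhost6\n"
    print(rewrite_hosts(sample), end="")
```

Running the function a second time on its own output would comment out the line it added and add a fresh one, so in practice it would be applied once per machine.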

[0043] The NameNode in an HDFS cluster is a single point of failure (SPOF): in a cluster with only one NameNode, if the NameNode machine goes down unexpectedly, the entire cluster is unavailable until the NameNode is restarted. The HA feature of HDFS solves this problem by configuring two NameNodes, Active and Standby, to provide a hot backup of the NameNode within the cluster. If the Active NameNode goes down, the cluster switches to the Standby so that the NameNode service is not interrupted. HDFS HA relies on ZooKeeper, so ZooKeeper must be configured and the Hadoop configuration modified.
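As a rough illustration of the Active/Standby setup in [0043], the sketch below assembles a few of the standard HDFS HA properties and renders them in Hadoop's `<configuration>` XML layout. The nameservice name, NameNode IDs, host names, and ports are all hypothetical placeholders, and the helper functions are not from the source. A real deployment needs further settings (for example shared edit log storage and fencing), which are omitted here.

```python
import xml.etree.ElementTree as ET


def ha_properties(nameservice, namenodes, zk_quorum):
    """Build a minimal set of HA properties.

    namenodes maps a NameNode ID to its RPC address (host:port);
    zk_quorum is a list of ZooKeeper host:port strings.
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        "dfs.ha.automatic-failover.enabled": "true",
        # In Hadoop this property normally lives in core-site.xml,
        # while the dfs.* properties live in hdfs-site.xml.
        "ha.zookeeper.quorum": ",".join(zk_quorum),
    }
    for nn_id, addr in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = addr
    return props


def to_hadoop_xml(props):
    """Render properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical two-NameNode cluster with a three-node ZooKeeper quorum.
    props = ha_properties(
        "mycluster",
        {"nn1": "node1:8020", "nn2": "node2:8020"},
        ["node1:2181", "node2:2181", "node3:2181"],
    )
    print(to_hadoop_xml(props))
```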

[0044] The Hadoop core-s...



Abstract

The invention discloses a method and system for improving the efficiency of a Hadoop-based big data comprehensive query engine. It exploits the advantages of each computing engine and uses technical means to avoid each engine's disadvantages, thereby improving the efficiency of big data queries. For interfaces that require real-time stream processing, the invention switches the engine to Spark, and switches back to batch processing after the task finishes. This greatly reduces the mutually exclusive influence of the new computing engine on the previous computing engine and reduces the amount of service code developers must refactor, particularly for complex service calculations on the old computing engine. The intelligent switching of computing engines improves the comprehensive query efficiency of big data and improves adaptability to business scenarios.
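The switching behavior described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the class `EngineSwitcher`, the flag `needs_realtime_stream`, and the engine label "batch" are hypothetical, and the sketch tracks engine names rather than connecting to a real cluster. It shows one way to guarantee that the engine returns to batch processing once a streaming task finishes, even if the task raises an error.

```python
from contextlib import contextmanager


class EngineSwitcher:
    """Tracks which computing engine is active (names only, no real cluster)."""

    def __init__(self, batch_engine="batch"):
        self.batch_engine = batch_engine  # hypothetical label for the old engine
        self.current = batch_engine
        self.history = [batch_engine]     # record of every switch, for inspection

    def _switch(self, engine):
        if engine != self.current:
            self.current = engine
            self.history.append(engine)

    @contextmanager
    def for_task(self, needs_realtime_stream):
        """Use Spark for interfaces needing real-time stream processing,
        then switch back to batch processing when the task ends."""
        if needs_realtime_stream:
            self._switch("spark")
        try:
            yield self.current
        finally:
            self._switch(self.batch_engine)


if __name__ == "__main__":
    switcher = EngineSwitcher()
    with switcher.for_task(needs_realtime_stream=True) as engine:
        print("running on", engine)
    print("after task:", switcher.current)
```

Existing batch code paths call `for_task(False)` and never leave the batch engine, which mirrors the abstract's point that code written for the old engine does not need to be refactored.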

Description

Technical field

[0001] The invention belongs to the technical field of search engines, and in particular relates to a method and system for improving the efficiency of a Hadoop-based big data comprehensive query engine.

Background technique

[0002] With the rapid development of the Internet, people have become increasingly dependent on it to obtain information, and search engines have built a bridge between people and massive amounts of network information. However, with the rapid increase in network users and the exponential growth of network information, network traffic has increased sharply, and the traditional centralized search engine has become a bottleneck. Take the data generated on the Internet as an example: at Facebook, the amount of new data processed every day exceeds 20 TB, and as the number of Facebook users continues to grow, the data to be processed will become even larger. Facing such a large amount of traditional storage data, distributed storage is to...


Application Information

IPC(8): G06F16/951
Inventor: 欧阳涛
Owner: SHANGHAI PAIBO SOFTWARE CO LTD