Method for automatically creating Solr index files from HBase data

A technique for creating indexes and index files, applied in the field of big data. It addresses problems such as the inability to quickly search and retrieve fields other than the rowkey and the inability to page through query results.

Inactive Publication Date: 2015-04-08
LANGCHAO ELECTRONIC INFORMATION IND CO LTD
2 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

Because HBase sorts data only by rowkey, it cannot support fast search and retrieval on fields other than the rowkey.
HBase also cannot display query results in pages or fetch them page by page.
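
As an illustration of the paging that Solr adds on top of HBase, the sketch below computes the standard Solr `start` and `rows` query parameters for a given page. The helper name `solr_page_params` and the `title` field are hypothetical and do not come from the patent text.

```python
from urllib.parse import urlencode


def solr_page_params(query, page, page_size=10):
    """Build Solr select parameters for a 1-based page number (illustrative helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "q": query,
        "start": (page - 1) * page_size,  # offset of the first hit on this page
        "rows": page_size,                # number of hits returned per page
        "wt": "json",
    }


# Hypothetical field name "title"; the resulting string is the query part of a select URL.
params = solr_page_params("title:hbase", page=3, page_size=20)
print(urlencode(params))
```

Because each page is addressed by an offset and a count, the client can request any page directly once the HBase fields have been indexed in Solr.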



Examples


Embodiment Construction

[0019] The content of the present invention is described in more detail below:

[0020] A Hive external table is created and associated with the HBase table. Using the DIH (DataImportHandler) tool provided by Solr over Hive's JDBC connection, HBase data can then be indexed automatically through configuration alone, with no separate coding or development.
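
The association between a Hive external table and an HBase table can be sketched as follows. This is a minimal illustration, not the patent's own configuration: it generates a HiveQL statement that uses Hive's `HBaseStorageHandler`, and the table name `hbase_docs`, the HBase table `docs`, and the column family `cf` are assumptions chosen for the example.

```python
def hive_external_table_ddl(hive_table, hbase_table, columns):
    """Return HiveQL that maps an existing HBase table to a Hive external table.

    columns is a list of (hive_column, hive_type, hbase_mapping) tuples;
    the HBase row key is mapped with the special entry ':key'.
    """
    col_defs = ", ".join(f"{name} {htype}" for name, htype, _ in columns)
    mapping = ",".join(m for _, _, m in columns)
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({col_defs})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


# Hypothetical schema: row key plus two columns in column family "cf".
ddl = hive_external_table_ddl(
    "hbase_docs",
    "docs",
    [
        ("rowkey", "string", ":key"),
        ("title", "string", "cf:title"),
        ("body", "string", "cf:body"),
    ],
)
print(ddl)
```

Running the generated statement in Hive leaves the data in HBase; Hive only stores the mapping, so queries against `hbase_docs` read the HBase rows directly.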

[0021] The present invention adopts a method based on Solr + HBase + Hive that completes automatic index creation for data in HBase through configuration. Creating a Hive external table and associating it with an HBase table makes the data in HBase accessible through Hive. The DIH (DataImportHandler) component provided by Solr then uses the JDBC interface provided by Hive to read the HBase data associated with the Hive external table, and DIH's automatic index creation function builds the index, thereby realizing automatic index creation for HBase data. ...
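
The DIH side of this setup is driven by a data configuration file. The sketch below generates such a file with a `JdbcDataSource` that points at Hive. It is an illustration under stated assumptions: the HiveServer2 driver class and `jdbc:hive2://` URL, the host `hive-host`, the Hive table `hbase_docs`, and the Solr field names are all hypothetical, and the patent does not specify them.

```python
import xml.etree.ElementTree as ET


def dih_data_config(jdbc_url, query, field_map,
                    driver="org.apache.hive.jdbc.HiveDriver"):
    """Build a DIH data-config document that reads rows from Hive over JDBC.

    field_map maps Hive column names to Solr field names.
    """
    root = ET.Element("dataConfig")
    ET.SubElement(root, "dataSource",
                  {"type": "JdbcDataSource", "driver": driver, "url": jdbc_url})
    document = ET.SubElement(root, "document")
    # One entity per query; each result row becomes one Solr document.
    entity = ET.SubElement(document, "entity",
                           {"name": "hbase_row", "query": query})
    for column, solr_field in field_map.items():
        ET.SubElement(entity, "field", {"column": column, "name": solr_field})
    return ET.tostring(root, encoding="unicode")


# Assumed connection details and schema, for illustration only.
config_xml = dih_data_config(
    "jdbc:hive2://hive-host:10000/default",
    "SELECT rowkey, title, body FROM hbase_docs",
    {"rowkey": "id", "title": "title", "body": "body"},
)
print(config_xml)
```

With a file like this registered for the DataImportHandler request handler and the Hive JDBC driver on Solr's classpath, a full import would typically be started with the handler's `full-import` command, so no indexing code has to be written.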


Abstract

The invention provides a method for automatically creating Solr index files from HBase data and belongs to the field of big data. Using a method based on Solr + HBase + Hive, indexes for the data in HBase can be created automatically through configuration. A Hive external table is created and associated with an HBase table, so that the data in HBase can be accessed through Hive. The DIH (DataImportHandler) component provided by Solr accesses the HBase data associated with the Hive external table through the JDBC interface provided by Hive. DIH's automatic index creation function is thereby used to create indexes for the HBase data automatically.

Description

Technical field

[0001] The invention relates to the field of big data, in particular to a method for automatically creating Solr index files from HBase data.

Background

[0002] Big data is often used to describe the large amount of unstructured and semi-structured data created by a company that would take too much time and money to load into a relational database for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets requires frameworks like MapReduce and HBase to distribute work to tens, hundreds, or even thousands of computers. Compared with traditional data warehouse applications, big data analysis is characterized by large data volumes and complex queries and analysis. Big data requires special techniques to process large volumes of data efficiently within a tolerable elapsed time. Technologies applicable to big data, including massively parallel processing (MPP) databases, data mining grids, d...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/9014
Inventor: 金洪殿, 赵仁明, 辛国茂, 刘伟
Owner: LANGCHAO ELECTRONIC INFORMATION IND CO LTD