A data import method, device and electronic equipment

A data import and data writing technique applied in the multimedia field. It addresses low data import efficiency, cumbersome operation steps, and strong dependencies between steps, and it offers high flexibility, lower labor costs, and no need to switch between active and standby nodes.

Active Publication Date: 2021-11-26
BEIJING QIYI CENTURY SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, in the course of realizing the present invention, the inventors found at least the following problems in the prior art: the current operation steps are cumbersome, and each step depends heavily on the others and requires human intervention, so data import efficiency is relatively low.




Embodiment Construction

[0039] The technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention.

[0040] In order to solve the problem of low data import efficiency, embodiments of the present invention provide a data import method, device, and electronic equipment, so as to improve data import efficiency.

[0041] Firstly, the data import method provided by the embodiment of the present invention will be introduced in detail below.

[0042] Referring to Figure 1, which is a flow chart of the data import method of an embodiment of the present invention, the method includes the following steps:

[0043] S101: by executing a Hive Client command, write the data in each data source in the Hadoop cluster into a respective CSV file in HDFS.

[0044] Specifically, the Hadoop cluster may include multiple data sources, and the embodiment of the present invention may perform parallel processing on multiple data so...
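Step S101 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the helper names (`build_export_command`, `export_all`), the table names, and the HDFS directories are hypothetical, and the use of `INSERT OVERWRITE DIRECTORY` with a comma delimiter is an assumption about how a Hive Client command might produce CSV output. The injectable `run` parameter exists only so the sketch can be exercised without a Hive installation.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_export_command(table: str, hdfs_dir: str) -> list[str]:
    """Build a `hive -e` argv that writes `table` as comma-delimited text to `hdfs_dir`.

    Assumption: the export is done with INSERT OVERWRITE DIRECTORY; the patent
    only says that a Hive Client command is executed.
    """
    query = (
        f"INSERT OVERWRITE DIRECTORY '{hdfs_dir}' "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        f"SELECT * FROM {table}"
    )
    return ["hive", "-e", query]


def export_all(sources: dict[str, str], run=subprocess.run, max_workers: int = 4) -> list[list[str]]:
    """Export every data source in parallel, one Hive command per source.

    `sources` maps a (hypothetical) table name to its target HDFS directory.
    Returns the commands that were submitted, in input order.
    """
    commands = [build_export_command(table, hdfs_dir) for table, hdfs_dir in sources.items()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces evaluation so that any failure is raised here.
        list(pool.map(lambda cmd: run(cmd, check=True), commands))
    return commands
```

Running one command per data source in a thread pool mirrors the parallel processing mentioned in [0044]; the worker count of 4 is an arbitrary placeholder.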


Abstract

The embodiment of the present invention provides a data import method, device and electronic equipment, which are applied in the field of multimedia technology. The method includes: writing the data in each data source in the Hadoop cluster into a respective comma-separated values (CSV) file in the Hadoop Distributed File System (HDFS); synchronizing HDFS from the Hadoop cluster to the Druid cluster through distcp; for each CSV file, generating a JSON configuration file according to the attribute information of the CSV file; and, according to the JSON configuration file, initiating a request to build an index task through a CURL command, so that the Druid cluster imports the data written in the CSV file into the Druid cluster. The invention can improve the efficiency of data import.
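The later steps of the abstract (distcp synchronization, JSON configuration generation, and the CURL request) might look roughly like the sketch below. Everything specific here is an assumption: the patent excerpt does not give the JSON fields, so the spec layout is modeled on Druid's native batch ingestion (`index_parallel` with an `hdfs` input source and `csv` input format), the task is assumed to be posted to the Overlord path `/druid/indexer/v1/task`, and all function names, hosts, and paths are hypothetical. A real spec would likely need further fields such as granularity, metrics, and tuning settings.

```python
import json


def build_distcp_command(src: str, dst: str) -> list[str]:
    """argv for copying a path from the Hadoop cluster's HDFS to the Druid cluster's HDFS."""
    return ["hadoop", "distcp", src, dst]


def build_index_spec(datasource: str, hdfs_path: str, columns: list[str], timestamp_column: str) -> dict:
    """Derive an ingestion spec from a CSV file's attributes (path, columns, time column).

    Assumption: the layout follows Druid's native batch ingestion spec.
    """
    dimensions = [c for c in columns if c != timestamp_column]
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": dimensions},
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "hdfs", "paths": hdfs_path},
                "inputFormat": {"type": "csv", "columns": columns},
            },
        },
    }


def build_curl_command(spec_file: str, overlord_url: str) -> list[str]:
    """argv for submitting the JSON configuration file as an index task."""
    return [
        "curl", "-X", "POST",
        "-H", "Content-Type: application/json",
        "-d", f"@{spec_file}",
        f"{overlord_url}/druid/indexer/v1/task",
    ]


def write_spec(spec: dict, spec_file: str) -> None:
    """Persist the spec as the JSON configuration file referenced by the curl command."""
    with open(spec_file, "w", encoding="utf-8") as fh:
        json.dump(spec, fh, indent=2)
```

Building the commands as argument lists rather than shell strings keeps the sketch free of quoting problems; each list could be passed to `subprocess.run` once the cluster addresses are known.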

Description

Technical field

[0001] The invention relates to the field of multimedia technology, in particular to a data import method, device and electronic equipment.

Background

[0002] In multimedia technology, time series data can be analyzed with Druid to support OLAP (Online Analytical Processing). Druid is a distributed data storage system that supports real-time analysis. Compared with traditional OLAP systems, Druid performs significantly better in data processing scale and real-time data processing, and it integrates with mainstream open source ecosystems, including Hadoop. Druid has high data throughput and can process billions to tens of billions of events per day. It supports both streaming and batch ingestion, supports queries on any dimension, and offers fast access. Hadoop is a distributed system infrastructure developed by the Apache Foundation. Users can develop distr...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/182
Inventor: 赵艳杰康林段效晨易帆秦占明
Owner: BEIJING QIYI CENTURY SCI & TECH CO LTD