Spark-based sky region coverage generation method for large-scale astronomical data

A big-data technology, applied in fields such as electrical digital data processing and special data processing applications, that addresses the poor scalability, low efficiency, and long processing time of traditional approaches.

Active Publication Date: 2017-12-19
天科大(天津)科技园有限责任公司

AI Technical Summary

Problems solved by technology

However, the above methods lack reasonable arrangements for astronomical data in the data preprocessing stage; and due to the massive nature of astronomical data, the application of traditional scientific comp



Embodiment Construction

[0033] Embodiments of the present invention will be described in further detail below in conjunction with the accompanying drawings.

[0034] The Spark-based sky region coverage generation method takes large-scale astronomical data as input. Its execution is illustrated below with reference to Figure 2a to Figure 2c, using a simple data set at level 2 as an example. Note that the numbers in Figure 2a to Figure 2c show only the HEALPix block numbers. The highlighted parts are data that meet the condition and are retained for the next iteration, and the dotted parts are data that do not meet the condition and are output to the file system.

[0035] The Spark-based sky region coverage generation method for large-scale astronomical data of the present invention, as shown in Figure 3, includes the following steps:

[0036] Step 1: Read astronomical data from HDFS (The ...


Abstract

The invention relates to a Spark-based sky region coverage generation method for large-scale astronomical data. Its main technical features are as follows: using Spark's map operator, each data record is assigned a block index according to its right ascension and declination, in combination with the HEALPix hierarchical spherical indexing method; using Spark's map operator, the HEALPix block number of each record at the current level is split by bit operations into a parent block number and a sub-block number; using Spark's combineByKey operator, all blocks are grouped (clustered); and these operations are iterated until the stop condition is met, yielding the data after sky region coverage generation. The method is reasonably designed and can complete sky region coverage generation for large-scale astronomical data in a short time. It supports rapid archiving of massive astronomical data and improves data access and processing efficiency. In addition, the generated result can be used for data visualization, so that the distribution over the sky of the astronomical data in a star catalogue can be displayed intuitively to researchers.
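
The split-and-group loop described in the abstract can be sketched in plain Python. This is only an illustration under stated assumptions, not the patented implementation: it assumes block numbers follow the HEALPix NESTED scheme (where the parent, or "father", block is `pix >> 2` and the sub-block index is `pix & 3`), it assumes the retention condition is "all four sub-blocks of a parent are present", and it uses a dictionary as a stand-in for Spark's combineByKey. The names `split_block` and `coverage` are hypothetical.

```python
from collections import defaultdict


def split_block(pix):
    """Split a NESTED HEALPix block number into (parent, sub-block index 0..3)."""
    return pix >> 2, pix & 0b11


def coverage(pixels, level):
    """Merge complete sibling groups upward, level by level.

    Returns {level: sorted block numbers}. Assumption: a parent is kept for
    the next iteration only when all four of its sub-blocks are present;
    otherwise its sub-blocks are emitted at the current level.
    """
    current = set(pixels)
    result = {}
    while level > 0 and current:
        # Group sub-block indices by parent (stand-in for combineByKey).
        groups = defaultdict(set)
        for pix in current:
            parent, child = split_block(pix)
            groups[parent].add(child)

        kept, emitted = set(), []
        for parent, children in groups.items():
            if len(children) == 4:
                kept.add(parent)  # retained for the next iteration
            else:
                # Rebuild the original block numbers and output them.
                emitted.extend((parent << 2) | c for c in children)

        if emitted:
            result[level] = sorted(emitted)
        current = kept
        level -= 1

    if current:
        result[level] = sorted(current)  # whatever survives at the stop level
    return result


# Level-2 example: blocks 0-3 and 16-31 form complete groups, 4-5 do not.
example = coverage([0, 1, 2, 3, 4, 5] + list(range(16, 32)), level=2)
print(example)
```

In an actual Spark job, the inner grouping would presumably be expressed with `map` and `combineByKey` on an RDD of block numbers, and the emitted blocks would be written to the file system at each iteration rather than collected in a dictionary.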

Description

Technical field

[0001] The invention belongs to the technical field of big data processing, and in particular relates to a Spark-based sky region coverage generation method for large-scale astronomical data.

Background art

[0002] Sky region coverage generation is an important part of astronomical data archiving, and its results are crucial to subsequent processing such as astronomical data retrieval and computation. Because of the massive volume of astronomical data, handling this problem with traditional scientific computing methods usually takes a long time and is inefficient, and because those methods are limited by storage space, they scale poorly.

[0003] In recent years, advances in science and technology have greatly improved the ability to collect astronomical data. The amount of data in each band has grown exponentially, and astronomy is gradually entering the era of "big data" with full-band sky surveys. Faced with such a huge amount of data, new challe...


Application Information

IPC(8): G06F17/30
CPC: G06F16/134, G06F16/182
Inventor: 熊聪聪, 田祖宸, 赵青, 史艳翠, 王丹, 苏静
Owner: 天科大(天津)科技园有限责任公司