Data collision flow analysis method and device, storage medium, and terminal

A data collision flow analysis technology, applied in the field of data circulation, that addresses problems such as easy loss of data, high cost investment, and poor transactional performance, with the effect of avoiding data loss and improving analysis accuracy.

Active Publication Date: 2018-12-11
上海数据发展科技有限责任公司

AI Technical Summary

Problems solved by technology

However, Kafka pursues only high data throughput. When it stores multiple batches of data under different upper-level business IDs, data is easily lost, and Kafka itself has poor transactional performance when applied in the field of data circulation. Storm or Spark-Streaming is usually used to analyze unbounded data streams: different batches of data to be analyzed are continuously added to the Storm or Spark-Streaming analysis machine, so when the analysis machine raises an error alarm, it cannot be determined which batch of data contains the error.
Using the distributed storage and computing platform MapR-Streaming instead of Kafka requires a large additional cost investment.



Examples


Embodiment Construction

[0028] As those skilled in the art will understand, and as mentioned in the background, there are two main existing big data analysis methods. The first saves the data obtained in circulation in a batch processing platform and then takes the data out at regular intervals for analysis; this method cannot perform real-time stream analysis. The second usually puts the data in circulation into the distributed publish-subscribe messaging system Kafka and then uses the data stream processing system Storm or Spark-Streaming to perform real-time stream analysis. However, when Kafka stores multiple batches of data under multiple different upper-level business IDs, data is easily lost, and Kafka itself is not transactional when applied in the field of data circulation. Storm or Spark-Streaming is usually used to analyze unbounded data streams, and different batches of data to be analyzed will be continuously added to the analysis machine ...


Abstract

The invention relates to a data collision flow analysis method and device, a storage medium, and a terminal. The data collision flow analysis method comprises the following steps: receiving a collision; querying to determine idle collision threads; and performing and analyzing the collision using an idle collision thread to obtain an analysis result, wherein each idle collision thread performs and analyzes only a single collision at a time. The technical scheme of the invention can effectively avoid the loss of circulated data and improve the accuracy of data analysis.
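
The three steps in the abstract (receive a collision, query for an idle collision thread, let that thread perform and analyze exactly one collision) can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names CollisionWorker and CollisionDispatcher, the polling interval, the result tuple layout, and the placeholder analysis function count_non_empty are all assumptions introduced for illustration; the source does not specify what a collision contains or how it is analyzed.

```python
import queue
import threading
import time


def count_non_empty(records):
    """Hypothetical analysis step: count the non-empty records in one collision."""
    return {"total": len(records), "non_empty": sum(1 for r in records if r)}


class CollisionWorker:
    """One collision thread; it holds at most one collision at a time (assumed design)."""

    def __init__(self, name, analyze, results):
        self.name = name
        self._analyze = analyze
        self._results = results
        self._inbox = queue.Queue(maxsize=1)
        self._idle = threading.Event()
        self._idle.set()  # a new worker starts out idle
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def is_idle(self):
        return self._idle.is_set()

    def assign(self, collision_id, records):
        self._idle.clear()  # mark busy before handing over the collision
        self._inbox.put((collision_id, records))

    def _run(self):
        while True:
            item = self._inbox.get()
            if item is None:  # stop signal
                break
            collision_id, records = item
            try:
                outcome = ("ok", self._analyze(records))
            except Exception as exc:
                # The failure stays attached to the collision that caused it.
                outcome = ("error", repr(exc))
            self._results.put((collision_id, self.name) + outcome)
            self._idle.set()  # ready for the next collision

    def stop(self):
        self._inbox.put(None)
        self._thread.join()


class CollisionDispatcher:
    """Receives collisions and hands each one to an idle collision thread."""

    def __init__(self, worker_count, analyze):
        self.results = queue.Queue()
        self._lock = threading.Lock()
        self._workers = [
            CollisionWorker(f"collision-thread-{i}", analyze, self.results)
            for i in range(worker_count)
        ]

    def submit(self, collision_id, records, poll_seconds=0.01):
        """Block until some worker is idle, then assign the collision to it."""
        with self._lock:
            while True:
                for worker in self._workers:
                    if worker.is_idle():
                        worker.assign(collision_id, records)
                        return worker.name
                time.sleep(poll_seconds)  # no idle thread yet; query again

    def close(self):
        for worker in self._workers:
            worker.stop()
```

A caller would create a dispatcher, for example CollisionDispatcher(2, count_non_empty), call submit once per received collision, and read tuples of the form (collision_id, thread_name, status, payload) from the results queue. Because each result carries the ID of the single collision the thread was working on, an error can be traced to a specific collision, which is the property the abstract attributes to the one-collision-per-thread rule.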

Description

Technical field

[0001] The invention relates to the technical field of data circulation, in particular to a data collision flow analysis method and device, a storage medium, and a terminal.

Background

[0002] Big data has been widely recognized as a strategic new resource that can define the massive data generated in today's era and the related technological development and service innovation. Big data contains huge commercial value. For users in the era of big data, the amount of data they need to store and process is large, and the data sources and data structures are various and complex, which brings many challenges to the analysis and application of big data.

[0003] At present, due to various data application requirements, data exchange is often carried out between data demanders and suppliers through data circulation. In order to test the data quality in the data circulation process, it is necessary to perform real-time streaming analysis of the data duri...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/48
CPC: G06F9/4843
Inventor: 汤奇峰, 蒋宇一
Owner: 上海数据发展科技有限责任公司