A data collision flow analysis method and device, storage medium and terminal

A data collision stream analysis technology applied in the field of data circulation. It addresses easy data loss, poor transactional guarantees, and high cost investment, with the effect of improving analysis accuracy, avoiding data loss, and ensuring transactionality.

Active Publication Date: 2021-10-15
上海数据发展科技有限责任公司
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, Kafka pursues only high data throughput. When it stores multiple batches of data under different upper-level business IDs, data is easily lost, and Kafka itself offers poor transactional guarantees when applied in the field of data circulation. Storm or Spark-Streaming is usually used to analyze unbounded data streams: different batches of data to be analyzed are continuously added to the Storm or Spark-Streaming analysis machine, so when the analysis machine raises an error alarm, it is not certain which batch of data contains the error.
Using the distributed storage and computing platform MapR-Streaming instead of Kafka requires a large additional cost investment.



Examples


Embodiment Construction

[0028] Those skilled in the art will understand that, as mentioned in the background, there are two main existing big data analysis methods. The first saves the data obtained in circulation in a batch processing platform and then periodically takes the data out for analysis. This method cannot achieve real-time stream analysis of the data. The second usually puts the data in circulation into the distributed publish-subscribe messaging system Kafka, and then works with the data stream processing system Storm or Spark-Streaming to perform real-time stream analysis. However, when Kafka stores multiple batches of data under multiple different upper-level business IDs, data is easily lost, and Kafka itself is not transactional when applied in the field of data circulation. Storm or Spark-Streaming is usually used for the analysis of unbounded data streams, and different batches of data to be analyzed will be continuously added to the analysis machine ...


Abstract

The present invention relates to a data collision flow analysis method and device, a storage medium, and a terminal. The data collision flow analysis method includes the following steps: receiving a collision; querying for and determining an idle collision thread; and using the idle collision thread to execute and analyze the collision to obtain an analysis result, wherein each idle collision thread executes and analyzes only a single collision at a time. The technical solution of the invention can effectively avoid the loss of circulation data and improve the accuracy of data analysis.
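The three steps in the abstract (receive a collision, find an idle collision thread, let that thread execute and analyze exactly one collision) can be illustrated with a minimal Python sketch. This is not the patented implementation: the names `CollisionThread`, `CollisionDispatcher`, and `analyze`, the representation of a collision as an ID plus a list of records, and the record-counting "analysis" are all assumptions made for illustration.

```python
import queue
import threading


def analyze(records):
    """Hypothetical analysis: count records and missing (None) values."""
    return {"total": len(records),
            "missing": sum(1 for r in records if r is None)}


class CollisionThread(threading.Thread):
    """Worker that executes and analyzes one collision at a time (assumed design)."""

    def __init__(self, name, idle_pool, results):
        super().__init__(name=name, daemon=True)
        self._inbox = queue.Queue(maxsize=1)  # room for a single collision only
        self._idle_pool = idle_pool
        self._results = results

    def assign(self, collision):
        self._inbox.put(collision)

    def run(self):
        while True:
            collision = self._inbox.get()
            if collision is None:  # stop signal
                break
            collision_id, records = collision
            try:
                self._results[collision_id] = analyze(records)
            except Exception as exc:
                # The error is recorded against the collision that caused it.
                self._results[collision_id] = {"error": repr(exc)}
            finally:
                self._idle_pool.put(self)  # report this thread as idle again


class CollisionDispatcher:
    """Receives collisions and hands each one to an idle collision thread."""

    def __init__(self, num_threads=2):
        self.results = {}
        self._idle = queue.Queue()
        self._threads = [CollisionThread(f"collision-{i}", self._idle, self.results)
                         for i in range(num_threads)]
        for t in self._threads:
            t.start()
            self._idle.put(t)

    def receive(self, collision_id, records):
        worker = self._idle.get()  # blocks until some thread is idle
        worker.assign((collision_id, records))

    def shutdown(self):
        for _ in self._threads:  # collect every thread once: all are idle
            self._idle.get()
        for t in self._threads:
            t.assign(None)
            t.join()


if __name__ == "__main__":
    dispatcher = CollisionDispatcher(num_threads=2)
    dispatcher.receive("c1", [1, None, 3])
    dispatcher.receive("c2", [4, 5])
    dispatcher.shutdown()
    print(dispatcher.results)
```

Because a thread leaves the idle pool when it is assigned a collision and returns only after finishing it, no thread ever holds two collisions, and any failure is stored under the ID of the collision that caused it. The latter is one possible way to tell which batch contains an error; the source does not state how the invention does this.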

Description

Technical field

[0001] The invention relates to the technical field of data circulation, in particular to a data collision flow analysis method and device, a storage medium, and a terminal.

Background

[0002] Big data is widely recognized as a new strategic resource; the term covers the massive data generated in today's era together with the related technological development and service innovation. Big data contains huge commercial value. Users in the era of big data need to store and process large amounts of data, and the data sources and data structures are varied and complex, which brings many challenges to the analysis and application of big data.

[0003] At present, due to various data application requirements, data exchange is often carried out between data demanders and suppliers through data circulation. In order to test the data quality in the data circulation process, it is necessary to perform real-time streaming analysis of the data duri...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48
CPC: G06F9/4843
Inventor: 汤奇峰, 蒋宇一
Owner: 上海数据发展科技有限责任公司