A method and system for quickly retrieving changing data segments from massive data

The technology concerns massive data and changing data, is applied in the field of data processing, and addresses the heavy workload of global searches over massive data.

Active Publication Date: 2020-04-24
HUNAN DATANG XIANYI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a method and system for quickly retrieving changing data segments from massive data, so as to solve the technical problem that a global search of massive data requires a large amount of work to obtain the start and end times of state changes.

Examples

Embodiment 1

[0030] The method for quickly retrieving changing data segments from massive data of the present invention comprises the following steps:

[0031] S1: Sort the massive data in order, taking one piece of data as the unit;

[0032] S2: For each piece of data, use that piece as the middle data segment, the preceding piece as the previous data segment, and the following piece as the latter data segment, and combine the three into a new piece of data;

[0033] S3: Search all the new data. All new data in which the previous data segment conforms to the first state and the middle data segment conforms to the second state are taken as first positions of the data and numbered in sequence (the first item is numbered 1, the second item is numbered 2, and so on), giving the set of all new data at the first position. All new data in which the middle data segment conforms to the second state and the latter data segment conforms to the third state are taken as second positions of the data and numbered in sequence (the first item is numbered 1, the second i...
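
Steps S1 to S3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's own implementation: the (time, state) record layout, the function name find_positions, and the example states "idle", "run", and "stop" are all hypothetical. Because the text above is truncated after S3, the sketch stops once the two numbered position sets have been collected.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, str]  # (time, state): a hypothetical record layout
Numbered = List[Tuple[int, Record]]


def find_positions(
    records: List[Record], first: str, second: str, third: str
) -> Tuple[Numbered, Numbered]:
    """Return the numbered first positions and second positions."""
    # S1: sort the data, one record per unit (here: by time).
    ordered = sorted(records, key=lambda r: r[0])

    # S2: combine each record with its predecessor and successor.
    # The predecessor of the first record and the successor of the
    # last record are empty (None).
    joined: List[Tuple[Optional[Record], Record, Optional[Record]]] = []
    for i, cur in enumerate(ordered):
        prev = ordered[i - 1] if i > 0 else None
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        joined.append((prev, cur, nxt))

    # S3: each test only looks at fields inside one combined record.
    starts = [
        cur for prev, cur, _ in joined
        if prev is not None and prev[1] == first and cur[1] == second
    ]
    ends = [
        cur for _, cur, nxt in joined
        if nxt is not None and cur[1] == second and nxt[1] == third
    ]

    # Number each set in sequence, starting at 1.
    return list(enumerate(starts, start=1)), list(enumerate(ends, start=1))


# Hypothetical sample: two "run" segments, given out of order.
sample = [(4, "stop"), (1, "idle"), (3, "run"), (2, "run"),
          (7, "stop"), (5, "idle"), (6, "run")]
first_positions, second_positions = find_positions(sample, "idle", "run", "stop")
```

With this sample, the first positions are the records at times 2 and 6, and the second positions are the records at times 3 and 6; entries with the same number delimit one segment in the second state.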

Embodiment 2

[0037] In this embodiment, the method for quickly retrieving changing data segments from massive data includes the following steps:

[0038] In this embodiment, a data record (hereinafter referred to as a record) is used for illustration. A record includes data items such as a serial number (id), a time (time), a state (state), and a value (value). SQL-like statements are used to represent the algorithm implementation.

[0039] 1. After sorting the records in order, for each record, combine its own data (denoted as S) with the previous data item (denoted as P) and the next data item (denoted as N) to form a new record. For the first item, the previous data P is empty; for the last item, the next data N is empty.

[0040] Combine each record with previous and subsequent records to get a new record jointData:

[0041] select id, value, state, lag(value,1) over(order by time) as pValue, lag(state,1) over(order by time) as pState, lead(value,1) over(order by time) as nValue, lead(state,1) over(or...
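
The effect of lag(..., 1) and lead(..., 1) ordered by time can be imitated in plain Python, which may help when reading the truncated statement above. The sketch below is an illustration under assumptions: the dictionary-based record layout and the function name build_joint_data are hypothetical, and the column name nState is assumed, since the original statement is cut off before that alias appears.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def build_joint_data(records: List[Row]) -> List[Row]:
    """Attach the previous (P) and next (N) value/state to each record (S)."""
    ordered = sorted(records, key=lambda r: r["time"])
    joint: List[Row] = []
    for i, s in enumerate(ordered):
        # P is empty before the first record, N is empty after the last one.
        p: Optional[Row] = ordered[i - 1] if i > 0 else None
        n: Optional[Row] = ordered[i + 1] if i + 1 < len(ordered) else None
        joint.append({
            "id": s["id"],
            "value": s["value"],
            "state": s["state"],
            "pValue": p["value"] if p is not None else None,
            "pState": p["state"] if p is not None else None,
            "nValue": n["value"] if n is not None else None,
            "nState": n["state"] if n is not None else None,  # assumed alias
        })
    return joint


# Hypothetical sample records.
rows = [
    {"id": 2, "time": 20, "state": 1, "value": 2.5},
    {"id": 1, "time": 10, "state": 0, "value": 1.5},
    {"id": 3, "time": 30, "state": 1, "value": 3.5},
]
joint_data = build_joint_data(rows)
```

Each resulting row carries the fields of its neighbours, so a state change such as pState = 0 and state = 1 can be tested by looking at a single row.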

Embodiment 3

[0053] A system for quickly retrieving changing data segments from massive data in this embodiment includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the steps of any of the above embodiments are implemented.

[0054] In summary, by combining each piece of data with the previous and subsequent data, the present invention transforms the comparison between different data records into a comparison between different data fields within one record. Jobs that originally required a global search can therefore be distributed to multiple nodes for execution, and the start and end ranges of data changes, together with their data, can be retrieved quickly. The method is also applicable to retrieving data with multiple data items and multi-state values.
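
The point about distribution can be illustrated with a small sketch. Once each combined record carries its neighbour's state, the state-change test reads only one record, so any partition of the combined records can be filtered independently and the partial results concatenated. The code below simulates nodes as list chunks; the names is_start, filter_chunk, and split, the chunk size, and the sample data are hypothetical, and no real cluster framework is involved.

```python
from typing import Dict, List, Optional

JointRecord = Dict[str, Optional[int]]


def is_start(rec: JointRecord, first: int, second: int) -> bool:
    # The test uses only fields of one combined record.
    return rec["pState"] == first and rec["state"] == second


def filter_chunk(chunk: List[JointRecord], first: int, second: int) -> List[Optional[int]]:
    """Work a single node would do on its own share of the records."""
    return [rec["id"] for rec in chunk if is_start(rec, first, second)]


def split(items: List[JointRecord], size: int) -> List[List[JointRecord]]:
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical combined records, already ordered by time.
joint = [
    {"id": 1, "pState": None, "state": 0},
    {"id": 2, "pState": 0, "state": 1},
    {"id": 3, "pState": 1, "state": 1},
    {"id": 4, "pState": 1, "state": 0},
    {"id": 5, "pState": 0, "state": 1},
]

# One pass over everything versus three simulated nodes.
global_result = filter_chunk(joint, 0, 1)
distributed_result = [
    rec_id
    for chunk in split(joint, 2)
    for rec_id in filter_chunk(chunk, 0, 1)
]
```

Both results contain the same ids, because no chunk needs to look at records held by another chunk.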

Abstract

The invention discloses a method and system for quickly retrieving changing data segments from massive data. The massive data is sorted in order, taking one piece of data as the unit. Each piece of data is used as the middle data segment, the previous piece as the previous data segment, and the next piece as the latter data segment, and the three are combined to form a new piece of data. Among all new data, those whose previous data segment conforms to the first state and whose middle data segment conforms to the second state are taken as the first position of the data; those whose middle data segment conforms to the second state and whose latter data segment conforms to the third state are taken as the second position of the data. The invention enables jobs that originally required a global search to be distributed to multiple nodes for execution, can quickly retrieve the start and end ranges of data changes and their data, and is also suitable for retrieving data with multiple data items and multi-state values.

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to a method and system for quickly retrieving changing data segments from massive data.

Background art

[0002] The application of the Internet of Things and sensor technology has accumulated a large amount of data, which usually needs to be analyzed before it is used. These data are often sampled in a certain order. For example, monitoring data of equipment status in industrial automation are generated by sensors through time-series sampling, and using them often requires obtaining the start and end times of status changes.

[0003] The traditional method is to search, in chronological order, for the nearest time at which a different state occurs before or after a certain time point. This method requires a global search of the data, and a global search of massive data is slow, involves a heavy workload, and is difficult to implement.

Contents of the invention

[0004] The purpose of the present inv...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/245, G06F16/2455, G06F16/2458
Inventor 刘文哲邹光球李号彩刘克勤向春波李志金刘有志张景王波白全生胡卫东罗文理
Owner HUNAN DATANG XIANYI TECH CO LTD