Data cleaning method and device based on time sequence similarity

A time series similarity technique applied to the field of data cleaning. It addresses problems such as the complexity of constructing and evaluating models, and achieves high accuracy, low computational complexity, and strong robustness.

Pending Publication Date: 2020-04-17
ELECTRIC POWER SCI & RES INST OF STATE GRID TIANJIN ELECTRIC POWER CO +3

AI Technical Summary

Problems solved by technology

The estimated value obtained by this method is often closer to the real value, but the process o

Example Embodiment

[0043] In order to make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be described in detail below. Obviously, the described embodiments are only a part of the embodiments of the present invention, rather than all the embodiments. Based on the embodiments of the present invention, all other implementations obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.

[0044] The embodiments of the present invention are described in further detail below with reference to Figures 1-6:

[0045] A data cleaning method based on time series similarity includes the following steps:

[0046] Step 1. Read historical data from the historical sample database and perform dimensionality reduction on the historical data;

[0047] The high-dimensional sample data (D dimension) collected from the smart grid is actually in a low-dimens...

Abstract

The invention provides a data cleaning method and device based on time sequence similarity, relates to the technical field of data cleaning, and mainly aims to provide a data cleaning method that is low in computational complexity and high in accuracy. The data cleaning method based on time sequence similarity comprises the following steps: step 1, reading historical data from a historical sample database and performing dimensionality reduction on the historical data; step 2, discretizing the historical data using a time series symbolization method; step 3, measuring and calculating the similarity of the data using a dynamic time warping algorithm; and step 4, cleaning the data against a set threshold to obtain a cleaned result sequence. The method is suitable for power load big data application scenarios, and can quickly and effectively clean power load big data to obtain high-quality data for subsequent analysis and processing.
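
Steps 2 to 4 of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract names neither the symbolization scheme nor the thresholding rule, so the sketch assumes a SAX-style symbolization (z-normalization, piecewise averaging, four symbols) and assumes that a curve is kept when its dynamic time warping distance to a known-good reference curve does not exceed the threshold. The names `symbolize`, `dtw_distance`, `clean`, `reference`, and `BREAKPOINTS` are hypothetical, and step 1 (dimensionality reduction) is omitted.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Assumption: 4-symbol alphabet, using the breakpoints that split a
# standard normal distribution into four equiprobable regions.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]


def symbolize(series, n_segments):
    """Step 2 (assumed SAX-style): z-normalize, average each segment,
    then map every segment average to an integer symbol 0..3."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    seg = len(z) // n_segments  # assumes len(series) is a multiple of n_segments
    averages = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    return [bisect_right(BREAKPOINTS, v) for v in averages]


def dtw_distance(a, b):
    """Step 3: classic dynamic time warping with absolute-difference cost,
    computed row by row in O(len(a) * len(b)) time."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            curr[j] = abs(x - y) + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[-1]


def clean(curves, reference, n_segments, threshold):
    """Step 4 (assumed rule): keep curves whose symbolized form lies within
    `threshold` DTW distance of the symbolized reference curve."""
    ref_symbols = symbolize(reference, n_segments)
    return [
        c for c in curves
        if dtw_distance(symbolize(c, n_segments), ref_symbols) <= threshold
    ]


# Illustrative data only: a reference load curve, a similar curve,
# and a curve whose peak occurs at the wrong time.
reference = [1, 1, 2, 2, 5, 5, 3, 3]
similar = [1, 1, 2, 3, 5, 5, 3, 2]
shifted = [5, 5, 3, 3, 1, 1, 2, 2]
cleaned = clean([similar, shifted], reference, n_segments=4, threshold=2)
```

Symbolizing before comparison shortens each sequence, which is one plausible way to keep the quadratic cost of dynamic time warping manageable on large load datasets. Because the sketch z-normalizes each curve, it compares shape only, not absolute load level.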

Description

Technical field

[0001] The invention relates to the technical field of data cleaning, in particular to a data cleaning method and device based on time series similarity.

Background

[0002] The smart grid collects users' electricity consumption data through smart meters and other equipment, and then transmits the information over the network back to the information center for analysis and processing. Power user load data provides important guidance for studying the economic and social development of a region. As the smart grid continues to develop, various information systems take on extremely important tasks in the operation of the power grid, and massive power user load data provides the data basis for the operation of these information systems.

[0003] The main characteristics of daily electric power load data are high dimensionality and large data volume. Due to communication problems, equipment failure and other reasons, a lar...


Application Information

IPC(8): G06F16/215, G06F16/2458, G06K9/62
CPC: G06F16/215, G06F16/2474, G06F18/213, G06F18/23, G06F18/10
Inventor 李野董得龙李刚卢静雅孔祥玉李予辉孙虹刘浩宇杨光顾强何泽昊季浩白涛乔亚男翟术然张兆杰吕伟嘉许迪赵紫敬
Owner ELECTRIC POWER SCI & RES INST OF STATE GRID TIANJIN ELECTRIC POWER CO