Data cleaning method for data files and data file processing method

A data file cleaning and processing technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problem that processing data inside the source systems no longer meets the requirements of statistical data analysis, and achieves the effects of reducing the processing burden, ensuring integrity, and saving storage space.

Inactive Publication Date: 2015-02-18
BANK OF CHINA
2 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

With the rapid development of the Bank of China's business, business processing is becoming more complex, business processes are constantly being updated, and the business systems are increasingly interrelated. Senior management and department-level managers are demanding more and more data statistics and analysis. All the basic data required for these statistics is processed in the various source business systems (the "source systems" in this document), which clearly no longer meets the Bank of China's requirements for statistical analysis of data.

Embodiment Construction

[0037] The present invention will be described in detail below in conjunction with the accompanying drawings.

[0038] According to an embodiment of the present invention, a data cleaning method for data files is provided. The data files come from various source systems and may also be called source files, source file data, or data files from source systems. Figure 1 is a flowchart of the data cleaning method for a data file according to an embodiment of the present invention. The data cleaning method includes:

[0039] Step S2: Determine the cleaning content and cleaning rules of each data file according to the data definitions of the source system data tables, the predefined unified data requirements of the data download platform, and the unified data requirements of the analysis system, and compile the cleaning process steps. Specifically, the data download platform is a download intermediary for downloading data files o...
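
Step S2 combines three inputs into per-column cleaning rules. The following Python fragment is only a minimal sketch of that idea, not the patented implementation. The function name `determine_cleaning_rules`, the column names, and the individual rules (`trim`, `date`, `default`) are all hypothetical, since the text above does not specify what the concrete rules are.

```python
def determine_cleaning_rules(source_definition, platform_requirements, analysis_requirements):
    """Sketch of step S2: derive an ordered list of cleaning steps per column.

    All rule names here are illustrative assumptions, not taken from the patent.
    """
    rules = {}
    for column, definition in source_definition.items():
        steps = []
        # Assumed rule from the source table definition:
        # character columns have surrounding padding removed.
        if definition["type"] == "char":
            steps.append("trim")
        # Assumed platform-wide requirement: one date format for every file.
        if definition["type"] == "date" and "date_format" in platform_requirements:
            steps.append("date:" + platform_requirements["date_format"])
        # Assumed analysis-system requirement: a default for empty values.
        if column in analysis_requirements.get("defaults", {}):
            steps.append("default:" + analysis_requirements["defaults"][column])
        rules[column] = steps
    return rules


# Hypothetical inputs for one data file.
source_definition = {
    "acct_no": {"type": "char"},
    "open_date": {"type": "date"},
    "branch": {"type": "char"},
}
platform_requirements = {"date_format": "%Y-%m-%d"}
analysis_requirements = {"defaults": {"branch": "UNKNOWN"}}

rules = determine_cleaning_rules(
    source_definition, platform_requirements, analysis_requirements
)
```

Keeping the rules as plain data, separate from the code that applies them, is what allows them to be written out later as a per-file configuration.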

Abstract

The invention discloses a data cleaning method for data files and a data file processing method. The data cleaning method comprises the following steps. S2: determining the cleaning content and cleaning rules of each data file according to the data definitions of the source system data tables, the predefined uniform data requirements of a data download platform, and the uniform data requirements of an analysis system, and compiling the cleaning process steps. S4: generating a cleaning configuration file for each data file according to the cleaning content and cleaning rules of that data file. S6: cleaning each data file according to its cleaning configuration file, following the cleaning process steps. The beneficial effects are that data files from different source systems are extracted and cleaned uniformly, then presented and shared in a uniform manner, which provides a uniform view for subsequent data processing at all levels and reduces the processing burden on the source systems.
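
Steps S4 and S6 can be illustrated with a short Python sketch: one function writes a cleaning configuration for a data file, and another cleans the file by reading that configuration. This is a sketch under stated assumptions rather than the method as claimed. The JSON layout, the `|` delimiter, the file name `accounts.dat`, and the operations `trim`, `date`, and `default` are hypothetical choices made for illustration.

```python
import json
from datetime import datetime


def generate_config(file_name, column_rules):
    """Sketch of step S4: serialize one data file's rules as a configuration document."""
    return json.dumps({"file": file_name, "delimiter": "|", "columns": column_rules})


def apply_step(value, step):
    """Apply a single (hypothetical) cleaning operation to one field value."""
    if step["op"] == "trim":
        return value.strip()
    if step["op"] == "date":
        return datetime.strptime(value, step["from"]).strftime(step["to"])
    if step["op"] == "default":
        return value if value else step["value"]
    raise ValueError("unknown cleaning step: " + step["op"])


def clean_file(raw_text, config_text):
    """Sketch of step S6: clean every record as directed by the configuration."""
    config = json.loads(config_text)
    cleaned = []
    for line in raw_text.splitlines():
        values = line.split(config["delimiter"])
        row = []
        for column, value in zip(config["columns"], values):
            # Steps run in the order listed in the configuration.
            for step in column["steps"]:
                value = apply_step(value, step)
            row.append(value)
        cleaned.append(row)
    return cleaned


# Hypothetical rules for a three-column, pipe-delimited source file.
column_rules = [
    {"name": "acct_no", "steps": [{"op": "trim"}]},
    {"name": "open_date", "steps": [{"op": "date", "from": "%Y%m%d", "to": "%Y-%m-%d"}]},
    {"name": "branch", "steps": [{"op": "trim"}, {"op": "default", "value": "UNKNOWN"}]},
]
config_text = generate_config("accounts.dat", column_rules)

raw = "  1001 |20140312|  BJ01\n1002|20140401|   \n"
cleaned = clean_file(raw, config_text)
```

Because `clean_file` reads everything it needs from the configuration text, supporting a new data file in this sketch means writing a new configuration rather than new cleaning code.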

Description

Technical field

[0001] The present invention relates to the field of data file processing in a data warehouse system or a data analysis system, and in particular to a data cleaning method for data files and a data file processing method.

Background

[0002] Before dedicated methods for processing massive data emerged, massive data was basically processed inside each source business processing system. The business system therefore not only bore the load of its own daily business processing, but also had to handle the processing, querying, and analysis of massive data, along with a large amount of other work. With the rapid development of the Bank of China's business, business processing is becoming more complex, business processes are constantly being updated, and the business systems are increasingly interrelated. Senior management and department-level managers are demanding more and more data statistics and analysis. All statistical institutes A...

Application Information

IPC(8): G06F17/30
CPC: G06F16/215
Inventors: 王莉, 陈世强
Owner: BANK OF CHINA