A data processing method and device

A data processing technology, applied in the field of data processing, that addresses data errors and table unavailability during data transfer and improves the stability of the transfer.

Active Publication Date: 2019-08-02
ALIBABA GRP HLDG LTD

AI Technical Summary

Problems solved by technology

During data transfer, data errors may leave some tables unavailable after they are transferred to a non-relational database.
For example, when multiple tables are imported from a relational database into a table in a non-relational database such as the Hadoop database (HBase), data errors can easily occur during the transfer. As a result, a table transferred to HBase may be unavailable, and the availability of other tables may even be affected.
Errors of this kind therefore have a great impact on the stability of data transfer.
[0004] At present, an unavailable table can only be discovered after the data transfer is complete, so the impact of errors that occur during the transfer cannot be effectively addressed.



Examples


Embodiment 1

[0062] Figure 1 is a flowchart of a data processing method provided by an embodiment of the present invention. The method is applied to data transfer from a relational database to a non-relational database and includes:

[0063] S101: Acquire a to-be-transferred task from the relational database, where the to-be-transferred task includes data of the to-be-transferred table in the relational database or data of a sub-table after the to-be-transferred table is divided.

[0064] For example, the to-be-transferred task may include all data to be transferred in the to-be-transferred table. Alternatively, when the to-be-transferred table is large and has been divided into multiple sub-tables, the to-be-transferred task may include all data to be transferred in one sub-table.
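The relationship between a table, its sub-tables, and the resulting tasks can be illustrated with a minimal sketch. It is not the patented implementation: the names `TransferTask` and `build_tasks`, the `max_rows` threshold, and the in-memory row lists are all assumptions made for illustration, since the source does not say how a large table is divided.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Row = Tuple  # one row of table data (hypothetical representation)


@dataclass
class TransferTask:
    table: str        # name of the to-be-transferred table
    part: int         # 0 for a whole table, otherwise the 1-based sub-table index
    rows: List[Row]   # data carried by this task


def build_tasks(table: str, rows: Sequence[Row], max_rows: int) -> List[TransferTask]:
    """Return one task for a small table, or one task per sub-table for a large one.

    `max_rows` is an assumed size threshold; the source only says that a
    table with "a large capacity" is divided into sub-tables.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    if len(rows) <= max_rows:
        # Small table: a single task carries all of its data.
        return [TransferTask(table, 0, list(rows))]
    # Large table: each consecutive slice of rows becomes one sub-table task.
    return [
        TransferTask(table, i // max_rows + 1, list(rows[i:i + max_rows]))
        for i in range(0, len(rows), max_rows)
    ]
```

With five rows and `max_rows=2`, this yields three tasks of sizes 2, 2, and 1; a table that fits under the threshold yields a single task with `part == 0`.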

[0065] Optionally, in this embodiment of the present invention, the relational database may specifically be an Oracle database; the non-relational database may specifically be HBase.

...

Embodiment 2

[0077] The primary key described in the embodiment of the present invention uniquely identifies each row of data in a table and makes it possible to find data quickly. Without a primary key, the data can only be scanned sequentially, and concurrent lookups cannot be performed. When data is transferred from a relational database to a non-relational database, a table without a primary key differs from a table with one: the traditional method cannot split it, so it can only be transferred by a single thread, which is very slow. If the table without a primary key is large, the speed of data transfer is seriously affected. To this end, an embodiment of the present invention provides a method for transferring a table without a primary key. On the basis of the embodiment corresponding to Figure 1, Figure 2 is a flowchart of a method for transferring a table without a primary key provided by ...
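To make the contrast concrete, the following sketch shows why a primary key permits splitting: a numeric key range can be cut into contiguous sub-ranges that are read concurrently. This illustrates only the conventional approach for tables that have a primary key. It does not reproduce the method of this embodiment for tables without one, because that description is truncated in the text above. The function names, the integer keys, and the dictionary standing in for a table are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def split_key_range(lo: int, hi: int, parts: int) -> List[Tuple[int, int]]:
    """Split the inclusive key range [lo, hi] into contiguous half-open ranges."""
    if parts <= 0 or hi < lo:
        raise ValueError("need parts > 0 and lo <= hi")
    total = hi - lo + 1
    parts = min(parts, total)          # never create empty ranges
    base, extra = divmod(total, parts)
    ranges = []
    start = lo
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        ranges.append((start, start + size))   # half-open: [start, start + size)
        start += size
    return ranges


def fetch_range(table: Dict[int, str], key_range: Tuple[int, int]) -> List[str]:
    """Read the rows whose primary key falls inside one sub-range."""
    start, end = key_range
    return [table[k] for k in range(start, end) if k in table]


def transfer_concurrently(table: Dict[int, str], parts: int) -> List[str]:
    """Read every sub-range in its own worker thread and concatenate the results."""
    ranges = split_key_range(min(table), max(table), parts)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map returns results in the order of `ranges`, so row order is preserved.
        chunks = list(pool.map(lambda r: fetch_range(table, r), ranges))
    return [row for chunk in chunks for row in chunk]
```

Without a key to range over, no such sub-ranges can be computed up front, which is the single-thread limitation the paragraph above describes.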

Embodiment 3

[0094] Figure 4 is a structural diagram of a data processing device provided by an embodiment of the present invention. The device is applied to data transfer from a relational database to a non-relational database and includes:

[0095] The first obtaining unit 401 is configured to obtain a task to be transferred from the relational database, where the task to be transferred includes data of a table to be transferred in the relational database or data of a sub-table after the table to be transferred is divided.

[0096] For example, the to-be-transferred task may include all data to be transferred in the to-be-transferred table. Alternatively, when the to-be-transferred table is large and has been divided into multiple sub-tables, the to-be-transferred task may include all data to be transferred in one sub-table.

[0097] Optionally, in this embodiment of the present invention, the relational database may specifically be an Oracle database; the non-relational database may s...


Abstract

The embodiment of the invention discloses a data processing method and device. The method comprises: obtaining to-be-transferred tasks from a relational database; before the to-be-transferred tasks are imported into a non-relational database, obtaining the to-be-transferred tasks from a metadatabase and generating an HDFS (Hadoop Distributed File System) temporary file; checking the HDFS temporary file to judge whether the data in the corresponding to-be-transferred task is correct; importing the corresponding to-be-transferred task into the non-relational database if the check succeeds; and regenerating the corresponding to-be-transferred task from the relational database if the check fails. Because a to-be-transferred task is not imported into the non-relational database when the check fails, tasks containing errors are effectively kept out of the target non-relational database during data transfer. This improves the stability of data transfer and effectively reduces the impact of data errors that occur during the transfer.
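The check-then-import flow summarized in the abstract can be sketched as follows. This is a hedged illustration, not the patented implementation: in-memory callables stand in for the relational source and for the HDFS temporary file, a dictionary stands in for the non-relational target, and the check itself (row count plus SHA-256 digest) and the `max_attempts` bound are assumptions, since the visible text does not specify how the temporary file is checked or how often regeneration is retried.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Row = Tuple[str, str]  # (row key, value); a hypothetical row shape


def digest(rows: List[Row]) -> str:
    """Order-sensitive SHA-256 digest over all rows."""
    h = hashlib.sha256()
    for key, value in rows:
        h.update(f"{key}\t{value}\n".encode("utf-8"))
    return h.hexdigest()


def transfer_task(
    read_source: Callable[[], List[Row]],            # (re)generates the task from the source
    write_temp: Callable[[List[Row]], List[Row]],    # stand-in for writing the temporary file
    target: Dict[str, str],                          # stand-in for the non-relational table
    max_attempts: int = 3,                           # assumed retry bound
) -> bool:
    """Import a task only if its temporary copy passes the check."""
    for _ in range(max_attempts):
        rows = read_source()
        temp_rows = write_temp(rows)
        if len(temp_rows) == len(rows) and digest(temp_rows) == digest(rows):
            target.update(temp_rows)  # check succeeded: import into the target
            return True
        # Check failed: nothing is imported; the next iteration regenerates the task.
    return False
```

The point of the design is that the target is modified only after a successful check, so a corrupted task never reaches it; a failed check costs one regeneration of that task rather than an unusable table.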

Description

Technical field

[0001] The present invention relates to the field of data processing, in particular to a data processing method and device.

Background

[0002] Traditional relational databases such as Oracle are widely used and powerful, and can store large amounts of data, but relational databases have high operating costs. In order to improve storage efficiency, important or core data is generally stored in relational databases. It often happens that all or part of the data stored in a relational database is moved to other non-relational databases.

[0003] Databases generally store data in the form of tables, and transferring data from relational databases to other non-relational databases can be understood as transferring tables. Because of data errors during the transfer, some tables may be unavailable after being transferred to other non-relational databases. For example, in the case of importing multiple tables from a relat...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/21
CPC: G06F16/214
Inventor: 张淼
Owner: ALIBABA GRP HLDG LTD