A database whole-library data access method and device, electronic equipment and medium
By configuring grouping and scheduling information, the problem of low efficiency in accessing the entire database was solved, and automated offline access to the entire database was achieved, improving efficiency and adaptability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GLODON CO LTD
- Filing Date
- 2023-03-20
- Publication Date
- 2026-06-23
AI Technical Summary
In existing technologies, the whole-database data access method is difficult to apply to offline access scenarios, resulting in low efficiency, large amount of repetitive work, and wasted manpower and time.
The method acquires the user-defined whole database data access task, groups the tables to be accessed based on the number of tables and the scheduling time period, creates target tables and scheduling tasks, and configures scheduling information, thereby realizing automatic access of whole database data.
It enables offline automatic access to the entire database, saving manual workload, improving access efficiency, and adapting to different data access modes and scenario requirements.
Figure CN116303507B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of database technology, and specifically to a method, apparatus, electronic device, and medium for accessing a complete database.
Background Technology
[0002] As enterprise IT infrastructure matures, the amount of accumulated data grows. Technologies such as big data and AI are used to analyze this accumulated data and compute statistics on it, extracting its value to support business decisions and serve operational needs. However, data value extraction cannot be based solely on raw business data; enterprises often need to build data warehouses to first integrate the necessary business data.
[0003] To integrate business data into a data warehouse, it is necessary to establish corresponding storage structures in the data warehouse (such as tables in Hive, MySQL, PostgreSQL, ElasticSearch indexes, and Mongo collections), and then create corresponding data integration programs (usually Flink or Spark type tasks). Through the data integration programs, business data is integrated into the storage structure of the data warehouse. These two steps are referred to as table creation and task creation.
[0004] Enterprises have different data access strategies. Some bring the business data of specific tables into the data warehouse only when needed, while others bring all tables in the database into the data warehouse so that they are readily available later. For whole-database access scenarios, a common approach is to parse the database logs of the business database (such as the MySQL binlog or Oracle redo logs) to obtain the business data, and then store the business data in the corresponding target tables. This method is suitable for real-time, incremental data access scenarios, but not for offline scenarios requiring scheduled access (e.g., daily access) or full access. For offline whole-database access scenarios, enterprises must manually create tables in the data warehouse for each table in the business database and then create data access tasks to perform the data access. This method involves a large amount of repetitive work, is labor-intensive and time-consuming, and has low efficiency.
Summary of the Invention
[0005] In view of this, embodiments of the present invention provide a method, apparatus, electronic device and medium for accessing a whole database, in order to overcome the problem that the existing methods for accessing whole databases are difficult to apply to offline access scenarios, resulting in low efficiency of accessing whole databases.
[0006] According to a first aspect, embodiments of the present invention provide a method for accessing entire database data, the method comprising:
[0007] The system retrieves the user-defined whole database data access task, which includes: the database to be accessed, the target database, table creation rules, and task scheduling rules. The table creation rules are used to characterize the table name management rules of the whole database data to be accessed in the target database, and the task scheduling rules include: the data access mode and its corresponding scheduling time period.
[0008] All tables to be accessed are grouped based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode.
[0009] Based on the table creation rules, target tables corresponding to each group of tables to be accessed are created in the target database respectively.
[0010] Create a corresponding scheduling task for each target table based on the current data access mode;
[0011] Based on the current scheduling time period and the grouping results of the tables to be accessed, the scheduling information of the scheduling tasks corresponding to each target table is configured, so that each scheduling task is executed according to its scheduling information and the whole database data in the database to be accessed is brought into the target database.
[0012] Optionally, grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode includes:
[0013] The current scheduling time period is divided based on a preset task start interval to determine the number of task starts;
[0014] Calculate the number of tables in a single startup based on the number of tables and the number of times the task is started;
[0015] All tables to be accessed are grouped according to the number of tables in a single startup.
[0016] Optionally, the step of creating target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules includes:
[0017] Obtain the hierarchy and data domain corresponding to the database to be accessed in the target database;
[0018] According to the table creation rules, a current target table corresponding to the current table to be accessed is created in the target database. The table name of the current target table includes the original table name of the current table to be accessed, the hierarchy, and the data domain.
[0019] Optionally, the data access modes include: one-time full plus periodic incremental, periodic incremental, periodic full, and one-time full. The step of creating a corresponding scheduling task for each target table based on the current data access mode includes:
[0020] Determine whether the current data access mode is one-time full plus periodic incremental;
[0021] When the current data access mode is one-time full plus periodic incremental, two scheduling tasks are created for each target table, one of which is a full scheduling task and the other a periodic scheduling task.
[0022] When the current data access mode is not one-time full plus periodic incremental, a scheduling task corresponding to the current data access mode is created for each target table.
[0023] Optionally, the scheduling information includes: scheduling time; the scheduling information for configuring the scheduling tasks corresponding to each target table based on the current scheduling time period and the grouping results of the tables to be accessed includes:
[0024] The scheduling time for each group is evenly allocated within the current scheduling time period according to the grouping results of the tables to be accessed.
[0025] Optionally, the whole database data access task further includes: access method; before grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode, the method further includes:
[0026] Determine whether the access method is offline access;
[0027] When the access method is offline access, all tables to be accessed are grouped based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode.
[0028] When the access method is not offline access, the data to be accessed in the database to be accessed is accessed into the target database according to the preset real-time access method.
[0029] Optionally, the database whole-database data access method further includes:
[0030] Monitor the creation status of the target table and / or the scheduled task;
[0031] Determine whether the creation status of the target table and / or the scheduled task is "creation failed";
[0032] When the creation status of the target table or the scheduling task is "creation failed", the reason for the creation failure is recorded so that the user can troubleshoot the problem based on that reason, and after troubleshooting, the step of obtaining the whole database data access task set by the user is re-executed.
[0033] According to a second aspect, embodiments of the present invention provide a database whole-database data access device, the device comprising:
[0034] The acquisition module is used to acquire the whole database data access task set by the user. The whole database data access task includes: database to be accessed, target database, table creation rules, and task scheduling rules. The table creation rules are used to characterize the table name management rules of the whole database data to be accessed in the target database. The task scheduling rules include: data access mode and its corresponding scheduling time period.
[0035] The first processing module is used to group all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode.
[0036] The second processing module is used to create target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules.
[0037] The third processing module is used to create corresponding scheduling tasks for each target table based on the current data access mode.
[0038] The fourth processing module is used to configure the scheduling information of the scheduling tasks corresponding to each target table based on the current scheduling time period and the grouping results of the tables to be accessed, so that each scheduling task is executed according to its scheduling information and the whole database data in the database to be accessed is brought into the target database.
[0039] According to a third aspect, embodiments of the present invention provide an electronic device, comprising:
[0040] A memory and a processor are communicatively connected, the memory storing computer instructions, and the processor executing the computer instructions to perform the method described in the first aspect and any of its alternative embodiments.
[0041] According to a fourth aspect, embodiments of the present invention provide a computer-readable storage medium storing computer instructions for causing a computer to perform the method described in the first aspect, or any optional embodiment of the first aspect.
[0042] The technical solution of this invention has the following advantages:
[0043] The database whole-database data access method provided in this embodiment of the invention groups all tables to be accessed according to the current scheduling time period corresponding to the current data access mode of the whole-database data access task set by the user, creates corresponding scheduling tasks for each target table, and finally configures the scheduling information of each scheduling task based on the grouping results and the current scheduling time period to achieve automatic access of whole-database data. Thus, the user only needs to set the current scheduling time period corresponding to the data access mode to achieve offline automatic access of whole-database data through task scheduling, saving manual workload and time, and greatly improving the efficiency of whole-database data access.
Attached Figure Description
[0044] To more clearly illustrate the specific embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the specific embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.
[0045] Figure 1 is a flowchart of the database whole-database data access method in an embodiment of the present invention;
[0046] Figure 2 is a flowchart of manual configuration in an embodiment of the present invention;
[0047] Figure 3 is a flowchart of the background program operation in an embodiment of the present invention;
[0048] Figure 4 is a flowchart of the whole database access process in an embodiment of the present invention;
[0049] Figure 5 is a schematic diagram of the structure of the database whole-database data access device according to an embodiment of the present invention;
[0050] Figure 6 is a schematic diagram of the structure of an electronic device according to an embodiment of the present invention.
Detailed Implementation
[0051] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0052] In the description of this invention, it should be noted that the terms "first," "second," and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.
[0053] The technical features involved in the different embodiments of the present invention described below can be combined with each other as long as they do not conflict with each other.
[0054] For scenarios involving full database data access, common practices include parsing database logs from the business database to obtain business data, and then storing this data in the corresponding target tables. This method is suitable for real-time, incremental data access scenarios, but not for offline scenarios requiring scheduled access (e.g., daily access) or full access. For offline full database data access scenarios, enterprises must manually create tables in the data warehouse for each table in the business database and then create data access tasks to perform the data access. This method involves a large amount of repetitive work, is labor-intensive and time-consuming, and is inefficient.
[0055] To address the aforementioned problems, embodiments of the present invention provide a method for accessing entire database data. As shown in Figure 1, the method specifically includes the following steps:
[0056] Step S101: Obtain the whole database data access task set by the user.
[0057] The whole database data access task includes: the database to be accessed, the target database, table creation rules, and task scheduling rules. The table creation rules characterize the table name management rules in the target database for the data to be accessed. The task scheduling rules include: the data access mode and its corresponding scheduling time period. Furthermore, in practical applications, the above whole database data access task also includes: the access method, which includes both offline access and real-time access.
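The contents of the whole database data access task listed above could be modeled as a small configuration object. The following is a minimal sketch only; every class, field, and value name here (`WholeDatabaseAccessTask`, `SchedulingRule`, `table_name_rule`, the sample database names, and so on) is a hypothetical illustration and does not come from the patent.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMethod(Enum):
    """Access method: offline or real-time (names are illustrative)."""
    OFFLINE = "offline"
    REAL_TIME = "real_time"


class AccessMode(Enum):
    """The four data access modes described in the text."""
    ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL = "one_time_full_plus_periodic_incremental"
    PERIODIC_INCREMENTAL = "periodic_incremental"
    PERIODIC_FULL = "periodic_full"
    ONE_TIME_FULL = "one_time_full"


@dataclass(frozen=True)
class SchedulingRule:
    """Task scheduling rule: a data access mode and its scheduling time period."""
    mode: AccessMode
    window_start: str  # "HH:MM", assumed format
    window_end: str    # "HH:MM", assumed format


@dataclass(frozen=True)
class WholeDatabaseAccessTask:
    """User-defined whole database data access task (hypothetical shape)."""
    source_database: str
    target_database: str
    table_name_rule: str  # assumed template syntax for the table creation rule
    scheduling: SchedulingRule
    access_method: AccessMethod = AccessMethod.OFFLINE


# Sample values below are invented for illustration.
example_task = WholeDatabaseAccessTask(
    source_database="business_db",
    target_database="warehouse_db",
    table_name_rule="{layer}_{table}_{domain}",
    scheduling=SchedulingRule(AccessMode.PERIODIC_FULL, "00:00", "02:00"),
)
```

Keeping the access method as a separate field with an offline default mirrors the text, where the access method is an additional item on top of the four required ones.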
[0058] Step S102: Group all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode.
[0059] Step S103: Based on the table creation rules, create the target tables corresponding to each group of tables to be accessed in the target database.
[0060] Step S104: Create a corresponding scheduling task for each target table based on the current data access mode.
[0061] Step S105: Based on the current scheduling time period and the grouping results of the tables to be accessed, configure the scheduling information of the scheduling tasks corresponding to each target table, so that each scheduling task is executed according to its scheduling information and the whole database data in the database to be accessed is brought into the target database.
[0062] The aforementioned scheduling information includes scheduling time, i.e., the execution time of the scheduled tasks. Specifically, the scheduling time for each group is evenly allocated within the current scheduling time period according to the grouping results of the tables to be accessed.
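One plausible reading of "evenly allocated" is that the start times of the groups are spaced at equal intervals from the beginning of the scheduling time period. The sketch below assumes that interpretation; the function name `allocate_start_times` and the sample dates are hypothetical, and the patent does not prescribe this exact formula.

```python
from datetime import datetime
from typing import List


def allocate_start_times(
    window_start: datetime, window_end: datetime, group_count: int
) -> List[datetime]:
    """Spread one start time per group evenly across the scheduling window."""
    if group_count <= 0:
        raise ValueError("group_count must be positive")
    if window_end <= window_start:
        raise ValueError("window_end must be later than window_start")
    # Equal spacing: the first group starts at the window start, and the
    # last group starts one step before the window end.
    step = (window_end - window_start) / group_count
    return [window_start + i * step for i in range(group_count)]


# A 0:00 to 2:00 window with 12 groups yields one start every 10 minutes.
window_start = datetime(2023, 3, 20, 0, 0)
window_end = datetime(2023, 3, 20, 2, 0)
start_times = allocate_start_times(window_start, window_end, 12)
```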
[0063] By performing the above steps, the database whole-database data access method provided in this embodiment of the invention groups all tables to be accessed according to the current scheduling time period corresponding to the current data access mode of the whole-database data access task set by the user, creates corresponding scheduling tasks for each table, and finally configures the scheduling information of each scheduling task according to the grouping results and the current scheduling time period to achieve automatic access of the whole-database data. Thus, the user only needs to set the current scheduling time period corresponding to the data access mode to achieve offline automatic access of the whole-database data through task scheduling, saving manual workload and time, and greatly improving the access efficiency of the whole-database data.
[0064] Specifically, the manual configuration process by which a user sets up a whole database data access task is shown in Figure 2 and includes:
[0065] S11: Select the access method as offline or real-time.
[0066] S12: Select the database type, database instance, and database of the access source, and then select the tables or views to be accessed under the database in batches.
[0067] S13: Select the target data source type, database instance, and target database.
[0068] S14: Configure table creation rules. Tables in the target database need to have unified table names based on the target database's hierarchy, data domain, etc., and table name rules should be configured.
[0069] S15: Select the synchronization mode, i.e., the data access mode. Specifically, the data access modes include: one-time full plus periodic incremental, periodic incremental, periodic full, and one-time full. The details are as follows. One-time full plus periodic incremental: two tasks are generated for each table; the full task has a separate execution time range and is executed only once, while the incremental task periodically synchronizes data based on set conditions. Periodic incremental: suitable for newly added business data; it periodically synchronizes data to the transaction table based on time conditions and fits behavioral and access data. Periodic full: suitable for business table access; it partitions and stores a full slice for each time period and fits data that requires frequent year-on-year and month-on-month calculations. One-time full: suitable for dimensional data that remains unchanged over the long term; the dimensions are synchronized to the DIM layer and do not require frequent changes.
[0070] S16: Based on the source and target information of the access, configure general access task rules, such as configuring the write method (overwrite or append), incremental conditions, target partitions, etc. Based on the general configuration, if some tables require different configuration content, personalized configurations can also be performed.
[0071] S17: Determine whether the access method is offline.
[0072] S18: If it is offline access, configure the scheduling information, which is the scheduling time period corresponding to the data access mode. For example, the scheduling time period for full access is 6 hours, and the scheduling time period for incremental access is 2 hours. The background program will automatically balance the scheduling of tasks within the time period to avoid the task execution time being too concentrated.
[0073] S19: Configuration complete.
[0074] Specifically, in one embodiment, step S102 described above includes the following steps:
[0075] Step S21: Divide the current scheduling time period based on the preset task start interval and determine the number of task starts.
[0076] Step S22: Calculate the number of tables in a single startup based on the number of tables and the number of task startups.
[0077] Step S23: Group all tables to be accessed according to the number of tables started in a single operation.
[0078] For example, suppose the currently selected scheduling time period, i.e., the task start time period, is 0:00 to 2:00, and the backend defaults to a 10-minute task start interval. Assume there are 2000 tables to be accessed, corresponding to 2000 tasks, all of which are to be started within the 2 hours. Starting once every 10 minutes gives 12 starts within 2 hours. 2000 / 12 ≈ 167, which is rounded up to about 170 tasks per start, so all tables to be accessed are divided into groups of roughly 170. Therefore, by grouping the entire database data according to the number of tables and the scheduling time period, and scheduling tasks by group, the efficiency of subsequent database data access is improved.
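Steps S21 to S23 can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the function name `group_tables` and the generated table names are hypothetical, and the per-start size is computed by rounding up exactly (167 for 2000 tables and 12 starts), whereas the example in the text works with the rounder figure of 170.

```python
import math
from typing import List, Sequence


def group_tables(
    tables: Sequence[str], window_minutes: int, start_interval_minutes: int = 10
) -> List[List[str]]:
    """Group tables so that one group is started at each task start."""
    if window_minutes <= 0 or start_interval_minutes <= 0:
        raise ValueError("window and interval must be positive")
    # S21: divide the scheduling window by the start interval.
    start_count = max(1, window_minutes // start_interval_minutes)
    # S22: tables per single start, rounded up so every table gets a slot.
    per_start = math.ceil(len(tables) / start_count)
    if per_start == 0:
        return []
    # S23: slice the table list into consecutive groups of that size.
    return [list(tables[i:i + per_start]) for i in range(0, len(tables), per_start)]


# 2000 hypothetical tables, a 2-hour window, and a 10-minute start interval.
tables = [f"table_{n:04d}" for n in range(2000)]
groups = group_tables(tables, window_minutes=120, start_interval_minutes=10)
```

Rounding up rather than down guarantees that the number of groups never exceeds the number of task starts available in the window.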
[0079] Specifically, in one embodiment, step S103 described above includes the following steps:
[0080] Step S31: Obtain the corresponding layer and data domain of the database to be accessed in the target database.
[0081] Step S32: Create the current target table in the target database according to the table creation rules. The table name of the current target table contains the original table name of the current table to be accessed, the layer, and the data domain.
[0082] Specifically, for each group of data source tables, corresponding target tables are created in batches in the target database based on the data source tables and the manually configured table creation rules. Since tables in the database are managed by layer and data domain, users can define table naming rules, and the table management service generates table names based on the layer and data domain. For example, tables that receive accessed data are generally placed in the ODS layer, and the data domain is specified according to the business scenario. Users define their own table naming rules, i.e., the aforementioned table creation rules, using the ODS layer as a prefix and the data domain as a suffix. When creating tables, the table management service generates a new table name in the data warehouse from the original table name in the data source, in the format "ods_original table name_data domain". Adding the layer and data domain of the accessed data in the target database to the table name facilitates subsequent management of newly accessed tables in the target database. Retaining the original table name in the new table name facilitates table queries, improving the user experience. Furthermore, creating target tables in batches by group further improves the efficiency of accessing the entire database.
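The naming format "ods_original table name_data domain" can be illustrated with a short sketch. The template syntax, the function names, and the sample table and domain names below are assumptions made for illustration; the patent only states that the layer is used as a prefix and the data domain as a suffix.

```python
from typing import Dict, Iterable

# Hypothetical rule template: layer prefix, original name, data domain suffix.
DEFAULT_RULE = "{layer}_{table}_{domain}"


def build_target_table_name(
    original_name: str, layer: str, domain: str, rule: str = DEFAULT_RULE
) -> str:
    """Apply the table creation rule to one source table name."""
    return rule.format(layer=layer, table=original_name, domain=domain)


def build_group_table_names(
    group: Iterable[str], layer: str, domain: str, rule: str = DEFAULT_RULE
) -> Dict[str, str]:
    """Map every source table in a group to its target table name."""
    return {name: build_target_table_name(name, layer, domain, rule) for name in group}


# "orders", "customers", and the "trade" domain are invented sample values.
names = build_group_table_names(["orders", "customers"], layer="ods", domain="trade")
```

Because the original table name is kept intact in the middle of the generated name, a source table can be located in the warehouse by a simple substring search.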
[0083] Specifically, in one embodiment, step S104 described above includes the following steps:
[0084] Step S41: Determine whether the current data access mode is one-time full plus periodic incremental.
[0085] Step S42: When the current data access mode is one-time full plus periodic incremental, create two scheduling tasks for each target table.
[0086] One of the scheduling tasks is a full scheduling task, and the other is a periodic scheduling task.
[0087] Step S43: When the current data access mode is not one-time full plus periodic incremental, create a scheduling task corresponding to the current data access mode for each target table.
[0088] Specifically, if the current data access mode is periodic incremental, periodic full, or one-time full, the corresponding scheduling task is a periodic incremental scheduling task, a periodic full scheduling task, or a one-time full scheduling task, respectively. This allows for the creation of corresponding scheduling tasks based on different user-defined data access modes, adapting to the needs of different whole-database data access scenarios and improving the user experience.
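The branching in steps S41 to S43 amounts to a small dispatch on the data access mode. The sketch below assumes a simple in-memory representation of a scheduling task; `AccessMode`, `ScheduledTask`, and `create_tasks` are hypothetical names, and a real system would instead call a task management service.

```python
from enum import Enum
from typing import List, NamedTuple


class AccessMode(Enum):
    ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL = "one_time_full_plus_periodic_incremental"
    PERIODIC_INCREMENTAL = "periodic_incremental"
    PERIODIC_FULL = "periodic_full"
    ONE_TIME_FULL = "one_time_full"


class ScheduledTask(NamedTuple):
    target_table: str
    kind: str  # which kind of scheduling task this is


def create_tasks(target_table: str, mode: AccessMode) -> List[ScheduledTask]:
    """Create the scheduling task(s) for one target table."""
    # S41/S42: the combined mode yields two tasks, one full and one periodic.
    if mode is AccessMode.ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL:
        return [
            ScheduledTask(target_table, AccessMode.ONE_TIME_FULL.value),
            ScheduledTask(target_table, AccessMode.PERIODIC_INCREMENTAL.value),
        ]
    # S43: every other mode yields exactly one task of the matching kind.
    return [ScheduledTask(target_table, mode.value)]


combined = create_tasks("ods_orders_trade", AccessMode.ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL)
single = create_tasks("ods_orders_trade", AccessMode.PERIODIC_FULL)
```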
[0089] Specifically, in one embodiment, before performing step S102 above, the database whole-database data access method provided by the present invention further includes the following steps:
[0090] Step S106: Determine whether the access method is offline access.
[0091] Specifically, when the access method is offline access, step S102 is executed as described above; when the access method is not offline access, the entire database data to be accessed from the database to be accessed is accessed to the target database according to the preset real-time access method. The preset real-time access method refers to existing methods for online access to the entire database, such as parsing the database logs of the database to be accessed to obtain business data, and then storing the business data in the corresponding target table of the target database. This invention is not limited to this method. By selecting and determining the access method, the entire database data can be imported for two different access methods, enriching the forms of entire database data access and further improving the user experience.
[0092] Specifically, in one embodiment, the database whole-database data access method provided by the present invention further includes the following steps:
[0093] Step S107: Monitor the creation status of the target table and / or scheduled tasks.
[0094] Step S108: Determine whether the creation status of the target table and / or the scheduled task is "creation failed".
[0095] Step S109: When the creation status of the target table or scheduled task is "creation failed", record the reason for the creation failure so that the user can troubleshoot the problem based on the reason for the creation failure, and re-execute the above step S101 after troubleshooting.
[0096] Specifically, during the creation of the target table or the scheduling task, real-time monitoring records the results (success or failure) and failure information of the table creation / task, facilitating rapid troubleshooting and resolution. After the problem is resolved, the entire database is re-accessed following the same process.
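Recording the result and failure reason of each creation attempt could look like the sketch below. It is an assumption-laden illustration: `CreationReport`, `create_with_monitoring`, and the stand-in `demo_create` function are hypothetical, and the failure reason is simply taken from the raised exception.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class CreationReport:
    """Success list plus a failure reason for each failed item."""
    succeeded: List[str] = field(default_factory=list)
    failed: Dict[str, str] = field(default_factory=dict)

    @property
    def all_failed(self) -> bool:
        """True when at least one item was attempted and none succeeded."""
        return bool(self.failed) and not self.succeeded


def create_with_monitoring(
    names: Iterable[str], create: Callable[[str], None]
) -> CreationReport:
    """Attempt each creation and record the outcome instead of aborting."""
    report = CreationReport()
    for name in names:
        try:
            create(name)
        except Exception as exc:  # record the reason so the user can troubleshoot
            report.failed[name] = f"{type(exc).__name__}: {exc}"
        else:
            report.succeeded.append(name)
    return report


def demo_create(name: str) -> None:
    """Stand-in for a real table or task creation call; fails for one name."""
    if "bad" in name:
        raise RuntimeError("permission denied")


report = create_with_monitoring(["ods_a_trade", "ods_bad_trade"], demo_create)
```

Continuing past individual failures lets the remaining tables proceed, while the recorded reasons give the user what is needed to fix the problem and rerun the access.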
[0097] For example, the database whole-database data access method provided in this embodiment of the invention can be run through a background program. Taking offline access to whole-database data as an example, the specific operation process is shown in Figure 3 and includes:
[0098] S201: The access execution status is marked as running.
[0099] S202: Group the tables according to the number of tables and scheduling information (scheduling time period).
[0100] S203: For each set of data source tables, create the corresponding target tables in the target database in batches according to the data source tables and the manually configured table creation rules.
[0101] S204: Record the table creation result (success, failure) and failure information.
[0102] S205: Determine if all table creation attempts failed. If all attempts failed, no further processing is required, and the entire process ends.
[0103] S206: Determine whether the access mode is one-time full plus periodic incremental.
[0104] S207: If the access mode is one-time full plus periodic incremental, create two tasks for each data source table in each group (one full access task and one periodic incremental access task).
[0105] S208: If the access mode is not one-time full plus periodic incremental, create a task corresponding to the access mode for each data source table in each group.
[0106] S209: Record the result of task creation (success, failure) and failure information.
[0107] S210: Determine if all tasks failed to create. If all tasks fail, no further processing is needed, and the entire process ends.
[0108] S211: Submit the task online, that is, put it into use.
[0109] S212: Allocate task scheduling information in a balanced manner based on the grouping and manually configured scheduling time periods.
[0110] S213: Update access task status to complete.
[0111] S214: Background program execution completed.
[0112] For example, in practical applications, the database whole-database access process established based on the database whole-database data access method provided in the embodiments of the present invention is shown in Figure 4 and specifically includes:
[0113] S1: Manual configuration, including selecting data sources, data targets, table creation rules, task scheduling rules, etc. The specific configuration process is shown in Figure 2.
[0114] S2: The background program automatically creates tables and tasks in batches based on manual configuration. The specific processing steps of the background program are shown in Figure 3.
[0115] S3: Manually check the progress and results of the access, i.e., which tables have been successfully accessed, which tables have not yet been accessed, and which tables have failed to be accessed.
[0116] S4: Determine if there are any tables showing failed access.
[0117] S5: For tables whose access failed, the program records the reasons why automatic access failed. Based on the recorded reasons, the operator resolves the problem and quickly retries the access.
[0118] S6: Tables and tasks are created to enable automatic access to the entire database.
[0119] The database whole-database data access solution provided in this embodiment of the invention is designed for scenarios involving the whole-database data access. By requiring only a small amount of user configuration and relying on the execution of background programs, it automatically creates tables and tasks, thereby saving manual labor and improving access efficiency.
[0120] This invention, based on single-table access, adds whole-database access management to achieve whole-database data access from various source component types to various target component types. Data access is divided into two aspects: tables and tasks. Tables hold the data, and tasks access the data. Regarding tables: the table management service manages tables for various storage components, such as Hive and MySQL, including operations like adding, deleting, modifying, and querying tables. Regarding tasks: the task management service defines and manages tasks; the scheduling system handles offline task scheduling and directed acyclic graph dependency processing; the Linkis computing middleware manages tasks, supporting various task types such as Flinkx, FlinkSQL, FlinkCDC, and FlinkJar; and Yarn and Kubernetes manage task resources. After creating tables and integrating tasks for those tables, single-table data access is achieved through task execution. Building upon single-table data access, whole-database access, through unified configuration, calls the API interfaces of the table management service and task management service to create tables and tasks for the entire database, ultimately achieving whole-database data access through task execution. Furthermore, the database whole-database data access solution provided in this embodiment of the invention also has the following advantages:
[0121] 1. While a unified table creation and task creation strategy can generally be adopted for all tables in the database, it is possible that individual tables in the database may have special characteristics. In this embodiment of the invention, personalized configuration of individual tables can be supported by manually configuring the data access task for the entire database.
[0122] 2. Based on real-time requirements, data access can be divided into offline access and real-time access. This embodiment of the invention supports both of these access methods.
[0123] 3. In offline data access scenarios, users may need full access, periodic incremental access, or both. The solution provided in this embodiment supports four modes, which makes it more adaptable: (a) one-time full access followed by periodic incremental access; (b) one-time full access; (c) periodic incremental access; and (d) periodic full access.
[0124] 4. Whether it is business data or data warehouse data, there are many types of data storage components, such as MySQL, PostgreSQL, Hive, Mongo, etc. The overall framework of the access solution provided by the embodiments of this invention can support various component types.
[0125] 5. In offline access, each table is accessed by a separate task. Because a database often contains many tables, scheduling and running a large number of tasks at the same moment could strain server resources. This embodiment of the invention avoids overly concentrated execution times by distributing the task scheduling times evenly across the specified time period.
[0126] 6. Since the entire database often contains a large number of tables, the solution provided in this embodiment of the invention can provide access progress and access details functions, allowing users to view the progress of table creation and task creation.
[0127] 7. During the process of creating tables and tasks, a certain failure rate is unavoidable due to network issues, configuration problems, etc. The solution provided in this embodiment of the invention can record the reason for failure when table or task creation fails, making it convenient for users to quickly troubleshoot the problem; and based on the failure record, it can quickly retry, improving efficiency.
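The failure recording and retry behavior described in item 7, together with the progress view in item 6, can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the names `CreationRecord`, `create_all`, `retry_failed`, and `progress` are hypothetical, and the `create` callback merely stands in for a call to the table management service or the task management service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CreationRecord:
    """Creation status of one table or task (hypothetical structure)."""
    name: str
    status: str = "pending"      # "pending", "created", or "creation failed"
    failure_reason: str = ""     # filled in only when creation fails


def create_all(names: List[str],
               create: Callable[[str], None]) -> Dict[str, CreationRecord]:
    """Try to create every item and record the reason for each failure."""
    records: Dict[str, CreationRecord] = {}
    for name in names:
        record = CreationRecord(name)
        try:
            create(name)  # assumed to raise an exception on failure
            record.status = "created"
        except Exception as exc:
            record.status = "creation failed"
            record.failure_reason = str(exc)
        records[name] = record
    return records


def retry_failed(records: Dict[str, CreationRecord],
                 create: Callable[[str], None]) -> Dict[str, CreationRecord]:
    """Re-run creation only for the items whose last attempt failed."""
    failed = [n for n, r in records.items() if r.status == "creation failed"]
    records.update(create_all(failed, create))
    return records


def progress(records: Dict[str, CreationRecord]) -> str:
    """Summarize progress as 'created/total'."""
    done = sum(r.status == "created" for r in records.values())
    return f"{done}/{len(records)}"
```

Because each record keeps its own failure reason, a retry in this sketch only touches the failed items and leaves the successfully created ones alone.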
[0128] This invention also provides a database whole-database data access device. As shown in Figure 5, the database whole-database data access device includes:
[0129] The acquisition module 101 is used to acquire the user-set whole-database data access task. The whole-database data access task includes: the database to be accessed, the target database, table creation rules, and task scheduling rules. Among them, the table creation rules are used to characterize the table name management rules of the whole-database data to be accessed in the target database, and the task scheduling rules include: the data access mode and its corresponding scheduling time period. For details, please refer to the relevant description of step S101 in the above method embodiment, which will not be repeated here.
[0130] The first processing module 102 is used to group all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode. For details, please refer to the relevant description of step S102 in the above method embodiments, which will not be repeated here.
[0131] The second processing module 103 is used to create target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules. For details, please refer to the relevant description of step S103 in the above method embodiment, which will not be repeated here.
[0132] The third processing module 104 is used to create a corresponding scheduling task for each target table based on the current data access mode. For details, please refer to the relevant description of step S104 in the above method embodiment, which will not be repeated here.
[0133] The fourth processing module 105 is used to configure the scheduling information of the scheduling tasks corresponding to each target table based on the current scheduling time period and the grouping results of the tables to be accessed, so as to execute each scheduling task according to the scheduling information and access the entire database data to be accessed from the database to be accessed to the target database. For details, please refer to the relevant description of step S105 in the above method embodiment, which will not be repeated here.
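The work of the second and third processing modules can be sketched as follows, assuming the table name format ods_[original table name]_[data domain] given in claim 1 and the four data access modes described above. The enum `AccessMode`, the functions `target_table_name` and `tasks_for`, and the task identifier strings are hypothetical; in the embodiment the tables and tasks are created through the table management service and the task management service rather than returned as strings.

```python
from enum import Enum
from typing import List


class AccessMode(Enum):
    """The four offline data access modes (identifier names are assumptions)."""
    ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL = "one_time_full_plus_periodic_incremental"
    ONE_TIME_FULL = "one_time_full"
    PERIODIC_INCREMENTAL = "periodic_incremental"
    PERIODIC_FULL = "periodic_full"


def target_table_name(original: str, data_domain: str, layer: str = "ods") -> str:
    """Build the target table name from the layer, original name, and data domain."""
    return f"{layer}_{original}_{data_domain}"


def tasks_for(target_table: str, mode: AccessMode) -> List[str]:
    """Return the scheduling task identifiers to create for one target table."""
    if mode is AccessMode.ONE_TIME_FULL_PLUS_PERIODIC_INCREMENTAL:
        # The combined mode needs two tasks: one full load and one periodic task.
        return [f"{target_table}:one_time_full",
                f"{target_table}:periodic_incremental"]
    # Every other mode maps to exactly one task of that mode.
    return [f"{target_table}:{mode.value}"]
```

For example, `target_table_name("orders", "trade")` yields `ods_orders_trade`, and the combined mode produces two tasks for that table while each of the other three modes produces one.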
[0134] The database whole-database data access device provided in this embodiment of the invention is used to execute the database whole-database data access method provided in the above embodiment. Its implementation method and principle are the same. For details, please refer to the relevant description of the above method embodiment, which will not be repeated here.
[0135] Through the collaborative operation of the aforementioned components, the database whole-database data access device provided in this embodiment of the invention groups all tables to be accessed according to the current scheduling time period corresponding to the current data access mode of the whole-database data access task set by the user, creates corresponding scheduling tasks for each table, and finally configures the scheduling information of each scheduling task based on the grouping results and the current scheduling time period to achieve automatic access of the whole-database data. Thus, the user only needs to set the current scheduling time period corresponding to the data access mode to achieve offline automatic access of the whole-database data through task scheduling, saving manual workload and time, and greatly improving the access efficiency of the whole-database data.
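The grouping and scheduling steps summarized above can be illustrated with a short Python sketch. It follows the steps recited in the claims (divide the scheduling time period by a preset task start interval, derive the number of tables per start, group the tables, and spread the group start times evenly over the period). The function name `group_and_schedule`, the rounding up of the tables per start, and the spacing of period length divided by number of groups are assumptions, since the text does not specify how remainders are handled.

```python
import math
from datetime import datetime, timedelta
from typing import Dict, List


def group_and_schedule(tables: List[str],
                       period_start: datetime,
                       period_end: datetime,
                       start_interval: timedelta) -> Dict[datetime, List[str]]:
    """Map each scheduled start time to the group of tables started at that time."""
    if start_interval <= timedelta(0) or period_end <= period_start:
        raise ValueError("need a positive start interval and a non-empty period")
    if not tables:
        return {}

    period = period_end - period_start
    # Number of task starts that fit into the scheduling time period.
    num_starts = max(1, period // start_interval)
    # Tables handled per start, rounded up so that every table is covered.
    per_start = math.ceil(len(tables) / num_starts)
    groups = [tables[i:i + per_start] for i in range(0, len(tables), per_start)]

    # Spread the group start times evenly across the period.
    step = period / len(groups)
    return {period_start + i * step: group for i, group in enumerate(groups)}
```

With 10 tables, a scheduling time period from 01:00 to 03:00, and a 30-minute start interval, this sketch yields four groups of 3, 3, 3, and 1 tables, started at 01:00, 01:30, 02:00, and 02:30.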
[0136] This invention also provides an electronic device. As shown in Figure 6, the electronic device includes a processor 901 and a memory 902, wherein the processor 901 and the memory 902 can be connected via a bus or other means. Figure 6 takes connection via a bus as an example.
[0137] Processor 901 can be a Central Processing Unit (CPU). Processor 901 can also be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or combinations of the above types of chips.
[0138] The memory 902, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions / modules corresponding to the methods in the above method embodiments. The processor 901 executes various functional applications and data processing of the processor by running the non-transitory software programs, instructions, and modules stored in the memory 902, thereby implementing the methods in the above method embodiments.
[0139] The memory 902 may include a program storage area and a data storage area. The program storage area may store the operating system and applications required for at least one function; the data storage area may store data created by the processor 901, etc. Furthermore, the memory 902 may include high-speed random access memory and may also include non-transitory memory, such as at least one disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 902 may optionally include memory remotely located relative to the processor 901, and these remote memories may be connected to the processor 901 via a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
[0140] One or more modules are stored in memory 902, and when executed by processor 901, they perform the methods described in the above method embodiments.
[0141] The specific details of the above-mentioned electronic device can be understood by referring to the relevant descriptions and effects in the above embodiments, and will not be repeated here.
[0142] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The implemented program can be stored in a computer-readable storage medium. When the program is executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), random access memory (RAM), flash memory, hard disk drive (HDD), or solid-state drive (SSD), etc.; the storage medium can also include combinations of the above types of memory.
[0143] Although embodiments of the invention have been described in conjunction with the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations all fall within the scope defined by the appended claims.
Claims
1. A database whole-database data access method, characterized in that the method includes: acquiring a user-set whole-database data access task, which includes: a database to be accessed, a target database, table creation rules, and task scheduling rules, wherein the table creation rules characterize the table name management rules, in the target database, of the whole-database data to be accessed, and the task scheduling rules include a data access mode and its corresponding scheduling time period; grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode; creating, based on the table creation rules, target tables corresponding to each group of tables to be accessed in the target database; creating a corresponding scheduling task for each target table based on the current data access mode; and configuring, based on the current scheduling time period and the grouping results of the tables to be accessed, the scheduling information of the scheduling tasks corresponding to each target table, so as to execute each scheduling task according to the scheduling information and access the whole-database data to be accessed from the database to be accessed into the target database;
wherein grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode includes: dividing the current scheduling time period based on a preset task start interval to determine the number of task starts; calculating the number of tables per start based on the number of tables and the number of task starts; and grouping all tables to be accessed according to the number of tables per start; creating target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules includes: obtaining the layer and data domain corresponding to the database to be accessed in the target database; and creating, according to the table creation rules, a current target table corresponding to the current table to be accessed in the target database, wherein the table name of the current target table includes the original table name of the current table to be accessed, the layer, and the data domain, the layer is the ODS layer, and the table name format of the target table is ods_[original table name]_[data domain]; the data access modes include: one-time full plus periodic incremental, periodic incremental, periodic full, and one-time full; creating a corresponding scheduling task for each target table based on the current data access mode includes: determining whether the current data access mode is one-time full plus periodic incremental; when it is, creating two scheduling tasks for each target table, one being a full scheduling task and the other a periodic scheduling task; and when it is not, creating for each target table a scheduling task corresponding to the current data access mode;
the whole-database data access task further includes an access method; before grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode, the method further includes: determining whether the access method is offline access; when the access method is offline access, grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode; and when the access method is not offline access, accessing the whole-database data to be accessed from the database to be accessed into the target database according to a preset real-time access method, wherein the preset real-time access method parses the database log of the database to be accessed to obtain business data and then stores the business data in the corresponding target table of the target database.
2. The database whole-database data access method according to claim 1, characterized in that the scheduling information includes a scheduling time, and configuring the scheduling information of the scheduling tasks corresponding to each target table based on the current scheduling time period and the grouping results of the tables to be accessed includes: evenly allocating the scheduling time of each group within the current scheduling time period according to the grouping results of the tables to be accessed.
3. The database whole-database data access method according to claim 1, characterized in that the method further includes: monitoring the creation status of the target table and/or the scheduling task; determining whether the creation status of the target table and/or the scheduling task is "creation failed"; and when the creation status of the target table or the scheduling task is "creation failed", recording the reason for the creation failure so that the user can troubleshoot the problem based on that reason and, after troubleshooting, returning to the step of acquiring the user-set whole-database data access task.
4. A database whole-database data access device, applied to the database whole-database data access method as described in any one of claims 1-3, characterized in that, The device includes: The acquisition module is used to acquire the whole database data access task set by the user. The whole database data access task includes: database to be accessed, target database, table creation rules, and task scheduling rules. The table creation rules are used to characterize the table name management rules of the whole database data to be accessed in the target database. The task scheduling rules include: data access mode and its corresponding scheduling time period. The first processing module is used to group all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode. Grouping all tables to be accessed based on the number of tables in the database to be accessed and the current scheduling time period corresponding to the current data access mode includes: dividing the current scheduling time period based on a preset task start interval to determine the number of task starts; calculating the number of tables started in a single start based on the number of tables and the number of task starts; and grouping all tables to be accessed according to the number of tables started in a single start. The second processing module is used to create target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules. 
The step of creating target tables corresponding to each group of tables to be accessed in the target database based on the table creation rules includes: obtaining the hierarchy and data domain corresponding to the database to be accessed in the target database; and creating a current target table corresponding to the current table to be accessed in the target database according to the table creation rules. The table name of the current target table contains the original table name of the current table to be accessed, the hierarchy, and the data domain. The third processing module is used to create corresponding scheduling tasks for each target table based on the current data access mode. The fourth processing module is used to configure the scheduling information of the scheduling tasks corresponding to each target table based on the current scheduling time period and the grouping results of the tables to be accessed, so as to execute each scheduling task according to the scheduling information and access the entire database data to be accessed in the database to be accessed to the target database.
5. An electronic device, characterized in that it includes: a memory and a processor, the memory and the processor being communicatively connected to each other, the memory storing computer instructions, and the processor executing the computer instructions to perform the method according to any one of claims 1-3.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a computer to perform the method according to any one of claims 1-3.