Data synchronization method and apparatus, electronic device, system, and storage medium
By using big data synchronization tools to obtain full user information and video records from the first and second databases, the problem of excessively long data synchronization time was solved, enabling rapid updates and efficient data retrieval from the third database.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING IQIYI TECH CO LTD
- Filing Date
- 2022-11-21
- Publication Date
- 2026-06-12
Figure CN115934733B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of data processing technology, and in particular to a data synchronization method, apparatus, electronic device, system, and storage medium.
Background Technology
[0002] As the company's business segments have expanded, data access has become increasingly complex. For example, if a project needs to access data from both business segments B and C, then read interfaces for data from both business segments B and C are required. To address this issue, existing technology establishes a third database to synchronize data from all business segments. This third database then provides interfaces to simultaneously access the necessary data from all business segments.
[0003] Take the first database as an example. When users upload video works, each user corresponds to a unique video ID, and a user can upload multiple video works under that ID. The user information is recorded in the first database, and the video work information is recorded in the second database. To count the total number of video works uploaded by each user and synchronize it to the third database, the existing technology obtains the video work information by calling the interface of the second database. Because the interface limits the number of queries per second (QPS), the video work information of only a single user can be retrieved at a time. After the total number of video works uploaded by that user is counted, the result is synchronized to the third database, and the same operation is repeated for the next user until all users have been traversed.
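To make the cost of this per-user traversal concrete, the following minimal sketch estimates a lower bound on the traversal time when each user requires one interface call and the interface is capped at a fixed QPS. The user count and the QPS limit are hypothetical figures chosen for illustration; neither appears in the source.

```python
def min_sync_seconds(user_count: int, qps_limit: int) -> float:
    """Lower bound on traversal time: one call per user, at most qps_limit calls per second."""
    if qps_limit <= 0:
        raise ValueError("qps_limit must be positive")
    return user_count / qps_limit


# Hypothetical figures: 10 million users, interface capped at 100 QPS.
seconds = min_sync_seconds(10_000_000, 100)
hours = seconds / 3600
print(f"{seconds:.0f} s, about {hours:.1f} h")
```

The bound grows linearly with the number of users, which is why continued user growth makes the per-user approach progressively slower.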
[0004] However, due to the continuous growth of users and video works, and the QPS limitation of the interface for calling the second database information, it takes too long to synchronize all users and the total number of their uploaded video works to the third database, resulting in untimely data updates and affecting data access for other services.
Summary of the Invention
[0005] The purpose of this invention is to provide a data synchronization method, apparatus, electronic device, system, and storage medium to solve the problem that synchronizing all users and their uploaded video works to a third database takes too long, resulting in untimely data updates and affecting data access for other services. The specific technical solution is as follows:
[0006] In a first aspect of this invention, a data synchronization method is provided, which may include:
[0007] Invoking a big data synchronization tool to obtain full user information corresponding to a video application from a first database and full video records from a second database;
[0008] Obtaining, from the full video records, the video records corresponding to the full user information;
[0009] Storing the video records corresponding to the full user information in a third database for different services to access.
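The three steps above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the patented implementation: the two input lists stand in for the bulk snapshots that the big data synchronization tool would return, a plain dictionary stands in for the third database, and the names `synchronize`, `user_id`, and `third_db` are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def synchronize(full_users: List[Record],
                full_videos: List[Record],
                third_db: Dict[object, List[Record]]) -> None:
    """Filter the full video records by known users and store them per user."""
    # Step 1 is assumed done: both lists were fetched in bulk, not per user.
    known_ids = {user["user_id"] for user in full_users}

    # Step 2: keep only video records published by users of the application.
    matched = [video for video in full_videos if video["user_id"] in known_ids]

    # Step 3: write the matched records to the stand-in third database.
    for video in matched:
        third_db.setdefault(video["user_id"], []).append(video)


third_db: Dict[object, List[Record]] = {}
synchronize(
    full_users=[{"user_id": 10}, {"user_id": 31}],
    full_videos=[
        {"user_id": 31, "title": "Illustration Art"},
        {"user_id": 30, "title": "Life Clip"},  # not a user of this application
    ],
    third_db=third_db,
)
print(third_db)
```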
[0010] Optionally, the step of calling the big data synchronization tool to obtain full user information corresponding to the video application from the first database and full video records from the second database includes:
[0011] Sending the full video records from the second database to the first database according to a target communication protocol;
[0012] Invoking the big data synchronization tool to obtain, from the first database, the full user information corresponding to the video application and the full video records.
[0013] Optionally, each piece of user information in the full set of user information has a corresponding first user identifier, and each video record in the full set of video records has a corresponding second user identifier;
[0014] The step of obtaining the video records corresponding to the full user information from the full video records includes:
[0015] If the first user identifier is detected to be consistent with the second user identifier, obtaining the target video records corresponding to the second user identifier from the full video records;
[0016] Generating the video records corresponding to the full user information based on the target video records.
[0017] Optionally, generating the video records corresponding to the full user information based on the target video records includes:
[0018] Obtaining the full user identifiers of the target video records;
[0019] Deduplicating the full user identifiers of the target video records to generate the full target user identifiers;
[0020] For any target user identifier among the full target user identifiers, calling a statistical function to obtain the total number of video records in the target video records that contain that target user identifier;
[0021] Upon detecting that the total number of video records has been obtained for all of the full target user identifiers, generating the video records corresponding to the full user information based on the full target user identifiers and the total number of video records.
[0022] Optionally, each video record in the full video records obtained from the second database further includes a video status, where the video status is either successfully published or failed to publish;
[0023] The step of calling the statistical function to obtain the total number of video records in the target video records that contain the target user identifier includes:
[0024] If the video status is detected as successfully published, calling the statistical function to increment by one the total number of video records counted for the target user identifier;
[0025] If the video status is detected as failed to publish, leaving the total number of video records counted for the target user identifier unchanged.
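The status-dependent counting described above can be illustrated with a short sketch. The status constants, the `(user_id, status)` tuple layout, and the function name `count_published` are assumptions made for this example; the source only specifies that a successful publication adds one to the total and a failed publication leaves it unchanged.

```python
from typing import Dict, Iterable, Tuple

PUBLISHED = "published"  # hypothetical encoding of "successfully published"
FAILED = "failed"        # hypothetical encoding of "failed to publish"


def count_published(records: Iterable[Tuple[int, str]]) -> Dict[int, int]:
    """Count, per user identifier, only the records that were successfully published."""
    totals: Dict[int, int] = {}
    for user_id, status in records:
        # Make sure every user seen has an entry, even if nothing was published.
        totals.setdefault(user_id, 0)
        if status == PUBLISHED:
            totals[user_id] += 1
        # A failed publication leaves the running total unchanged.
    return totals


totals = count_published([
    (31, PUBLISHED),
    (31, FAILED),
    (10, PUBLISHED),
    (20, FAILED),
])
print(totals)
```

Keeping a zero entry for a user whose uploads all failed is a design choice of this sketch; the source does not say whether such users appear in the result.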
[0026] Optionally, storing the video records corresponding to the full user information in a third database for different services to access includes:
[0027] Obtaining the binary log of the first database;
[0028] If an update event for the video records corresponding to the full user information is detected in the binary log, storing those video records in the third database for different services to access.
[0029] Optionally, the step of storing the video records corresponding to the full user information in the third database for different services to access, when an update event for those video records is detected in the binary log, includes:
[0030] If an update event for the video records corresponding to the full user information is detected in the binary log, sending those video records to message middleware;
[0031] Storing the video records corresponding to the full user information in the third database through the message middleware, for different services to access.
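The binary-log-driven path above can be sketched as a small producer and consumer. Everything concrete here is an assumption for illustration: the event dictionaries are a hypothetical, already-parsed form of binary log entries, Python's standard `queue.Queue` stands in for the message middleware, a dictionary stands in for the third database, and the table name `user_video_totals` is invented. Reading a real binary log would require a database-specific client, which is not shown.

```python
import queue
from typing import Dict, Iterable

Event = Dict[str, object]


def forward_updates(binlog_events: Iterable[Event],
                    middleware: queue.Queue,
                    watched_table: str) -> int:
    """Send rows from update events on the watched table to the middleware."""
    sent = 0
    for event in binlog_events:
        if event["type"] == "update" and event["table"] == watched_table:
            middleware.put(event["row"])
            sent += 1
    return sent


def drain_to_third_db(middleware: queue.Queue, third_db: Dict[int, int]) -> None:
    """Consume all pending messages and upsert them into the stand-in third database."""
    while True:
        try:
            row = middleware.get_nowait()
        except queue.Empty:
            break
        third_db[row["user_id"]] = row["total"]


events = [
    {"type": "update", "table": "user_video_totals", "row": {"user_id": 31, "total": 2}},
    {"type": "insert", "table": "audit_log", "row": {"id": 1}},
    {"type": "update", "table": "user_video_totals", "row": {"user_id": 10, "total": 1}},
]
middleware: queue.Queue = queue.Queue()
third_db: Dict[int, int] = {}
sent = forward_updates(events, middleware, "user_video_totals")
drain_to_third_db(middleware, third_db)
print(sent, third_db)
```

Placing a queue between detection and storage decouples the two sides, so a slow write to the third database does not block the scan of the binary log.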
[0032] In a second aspect of the invention, a data synchronization device is provided, which may include:
[0033] The first module is used to call the big data synchronization tool to obtain the full user information corresponding to the video application from the first database and the full video records from the second database;
[0034] The second module is used to obtain the video records corresponding to the full user information from the full video records;
[0035] The third module is used to store the video records corresponding to the full user information into a third database so that different services can read them.
[0036] Optionally, the first module further includes:
[0037] The first sending submodule is used to send the full video records of the second database to the first database according to the target communication protocol;
[0038] The first acquisition submodule is used to call the big data synchronization tool to obtain the full user information and the full video records corresponding to the video application from the first database.
[0039] Optionally, each piece of user information in the full set of user information has a corresponding first user identifier, and each video record in the full set of video records has a corresponding second user identifier;
[0040] The second module also includes:
[0041] The second acquisition submodule is used to acquire the target video record corresponding to the second user identifier in the full video record when the first user identifier is detected to be consistent with the second user identifier.
[0042] The first generation submodule is used to generate video records corresponding to the full user information based on the target video records.
[0043] Optionally, the first generation submodule is configured to include:
[0044] The third acquisition submodule is used to acquire the full user identifiers of the target video records;
[0045] The second generation submodule is used to generate the full target user identifiers by deduplicating the full user identifiers of the target video records;
[0046] The fourth acquisition submodule is used to call a statistical function, for any target user identifier among the full target user identifiers, to obtain the total number of video records in the target video records that contain that target user identifier;
[0047] The third generation submodule is used to generate the video records corresponding to the full user information based on the full target user identifiers and the total number of video records, once the total number of video records has been obtained for all of the full target user identifiers.
[0048] Optionally, each video record in the full video records obtained from the second database further includes a video status, where the video status is either successfully published or failed to publish;
[0049] The fourth acquisition submodule also includes:
[0050] The first detection submodule is used to call the statistical function to increment by one the total number of video records counted for the target user identifier when the video status is detected as successfully published;
[0051] The second detection submodule is used to leave the total number of video records counted for the target user identifier unchanged when the video status is detected as failed to publish.
[0052] Optionally, the third module further includes:
[0053] The fifth acquisition submodule is used to acquire the binary logs of the first database;
[0054] The first storage submodule is used to store the video records corresponding to the full user information in a third database when an update event for the video records corresponding to the full user information is detected in the binary log, so that different services can read them.
[0055] Optionally, the first storage submodule further includes:
[0056] The first sending submodule is used to send the video record corresponding to the full user information to the message middleware when an update event for the video record corresponding to the full user information is detected in the binary log.
[0057] The second storage submodule is used to store the video records corresponding to the full user information into the third database through the message middleware, so that different services can read them.
[0058] A third aspect of the present invention also provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus.
[0059] Memory, used to store computer programs;
[0060] The processor, when executing a program stored in memory, performs any of the data synchronization methods described above.
[0061] In a fourth aspect of the invention, a computer-readable storage medium is also provided, wherein instructions are stored therein, which, when executed on a computer, cause the computer to perform any of the data synchronization methods described above.
[0062] In a fifth aspect of the invention, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform any of the data synchronization methods described above.
[0063] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This method avoids the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records allows for targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. Storing the video records corresponding to all user information in a third database achieves centralized data synchronization, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, enabling rapid data updates in the third database. This not only saves time but also facilitates access for different services.
Attached Figure Description
[0064] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.
[0065] Figure 1 is a flowchart illustrating the steps of a data synchronization method provided in an embodiment of the present invention;
[0066] Figure 2 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the present invention;
[0067] Figure 3 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the present invention;
[0068] Figure 4 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the present invention;
[0069] Figure 5 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the present invention;
[0070] Figure 6 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the present invention;
[0071] Figure 7 is a structural block diagram of a data synchronization device provided in an embodiment of the present invention;
[0072] Figure 8 is a structural block diagram of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0073] The technical solutions of the embodiments of the present invention will now be described with reference to the accompanying drawings. Although exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application can be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this application can be thoroughly understood and its scope can be fully conveyed to those skilled in the art.
[0074] The terms "first," "second," etc., used in the specification and claims of this application distinguish similar objects and do not describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate, so that embodiments of this application can be implemented in orders other than those illustrated or described herein. The objects distinguished by "first," "second," etc., are generally of the same class, and their number is not limited; for example, a first object can be one or more. Furthermore, in the specification and claims, "and/or" indicates at least one of the connected objects, and the character "/" generally indicates that the preceding and following objects are in an "or" relationship.
[0075] In various embodiments of the present invention, it should be understood that the sequence number of each process described below does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
[0076] The data synchronization method, apparatus, electronic device, system, and storage medium provided in this application will be described in detail below with reference to the accompanying drawings and through specific embodiments and application scenarios.
[0077] Referring to Figure 1, which shows a flowchart of a data synchronization method provided in an embodiment of the present invention, the method may include:
[0078] Step 101: Use the big data synchronization tool to obtain full user information corresponding to the video application from the first database and full video records from the second database.
[0079] In this embodiment of the invention, the first database stores all user information of the video application, i.e., full user information. The video records stored in the second database include not only all videos published by users in the video application, but also video records published in other applications. The video information stored in the second database includes the video's publishing status, the user identifier who published the video, and parameters such as the video title and video size.
[0080] It should be noted that the full user information and the full video records are each represented as a data table. In this embodiment of the invention, the full user information corresponding to the video application stored in the first database and the full video records stored in the second database constitute a massive amount of data. One-time retrieval of this data can be accomplished using the big data synchronization tool Hive. Hive is a data warehouse tool built on Hadoop, a distributed system infrastructure, and is used for data extraction, transformation, and loading; it provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive can map structured data files to database tables. Hive therefore extracts all of the user information stored in the first database at once and converts it into a full user information table, and likewise extracts all of the video records sent from the second database to the first database at once and converts them into a full video record table. Because Hive maps the data to database tables, offline filtering, classification, and statistical operations can be performed once the full user information table and the full video record table have been obtained.
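Once both data sets are available as tables, the offline filtering and statistics mentioned above reduce to an ordinary join and grouped count. The sketch below illustrates this with SQL issued from Python. It deliberately swaps in the standard-library `sqlite3` module as a local stand-in for Hive so that it runs without a cluster; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_info (user_id INTEGER PRIMARY KEY);
CREATE TABLE video_record (user_id INTEGER, title TEXT, status TEXT, size_mb INTEGER);
""")
conn.executemany("INSERT INTO user_info VALUES (?)", [(10,), (20,), (31,)])
conn.executemany(
    "INSERT INTO video_record VALUES (?, ?, ?, ?)",
    [
        (31, "Illustration Art", "published", 30),
        (30, "Life Clip", "published", 20),  # user 30 is not in user_info
        (10, "Food Making", "published", 20),
        (31, "Military Technology", "failed", 33),
    ],
)

# The inner join keeps only records of known users; GROUP BY counts per user.
rows = conn.execute("""
SELECT v.user_id, COUNT(*) AS total
FROM video_record AS v
JOIN user_info AS u ON u.user_id = v.user_id
GROUP BY v.user_id
ORDER BY v.user_id
""").fetchall()
print(rows)
conn.close()
```

Because the join is an inner join, a user with no video records (user 20 here) does not appear in the result.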
[0081] In this embodiment of the invention, the big data synchronization tool is invoked to obtain all user information and all video records at once. Because the full video records are stored in the second database while the big data synchronization tool is connected to the first database, the full video records must first be sent from the second database to the first database. This can be done using a Remote Procedure Call (RPC) protocol, which serves as the target communication protocol. The specific implementation steps include:
[0082] According to the target communication protocol, the full video records of the second database are sent to the first database.
[0083] The big data synchronization tool is invoked to retrieve the full user information and full video records corresponding to the video application from the first database.
[0084] In addition, to ensure the consistency and integrity of the acquired data, a time period for data synchronization is set in advance. In this embodiment, during the predetermined time period, the big data synchronization tool is first called to obtain the full user information corresponding to the video application from the first database; the full video records are then sent to the first database through the target communication protocol; and the big data synchronization tool is called again to obtain the full video records.
[0085] Step 102: Obtain the video records corresponding to all user information from the full video records.
[0086] In this embodiment of the invention, the full video records include the video records published by all users of the video application, and the video information of each video record contains the user information of the user who published it. The video records corresponding to the full user information can therefore be obtained from the full video records based on the full user information obtained from the first database.
[0087] It should be noted that because the records in the full video record are sorted according to the time when the user posted the video, the video records corresponding to each user information are not placed together. After filtering out the video records corresponding to all user information, the records of the same user information can be merged and counted to make the records clearer and more intuitive.
[0088] Step 103: Store the video records corresponding to all user information in the third database so that different services can access them.
[0089] In this embodiment of the invention, a third database is set up to synchronize the data of different services in one place. This makes it easier for different services to retrieve multiple types of data at once and improves retrieval efficiency. It also reduces dependence on the first database, which stores all user information of the video application, thereby spreading risk. The third database can also be called a data center.
[0090] It should be noted that the video records corresponding to the full user information mentioned above are stored in the first database that is connected to the big data synchronization tool. In order to store the video records corresponding to the full user information in the third database, some communication protocols are also required. These communication protocols may include: Remote Procedure Call Protocol, File Transfer Protocol (FTP), User Datagram Protocol (UDP), etc., which are not specifically limited here.
[0091] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0092] Figure 2 is a flowchart of another data synchronization method provided in an embodiment of the invention. The data synchronization method disclosed in this embodiment is one feasible implementation of step 102 in the embodiment shown in Figure 1, and specifically includes:
[0093] Step 201: If the first user identifier and the second user identifier are found to be the same, obtain the target video record corresponding to the second user identifier in the full video record.
[0094] In this embodiment of the invention, for ease of understanding, the corresponding user identifier for each piece of user information in the full set of user information is set as the first user identifier, and the corresponding user identifier for each video record in the full set of video records is set as the second user identifier. To filter out video records belonging to users in the first database from the full set of video records, firstly, any video record is selected from the full set of video records, and the video information included in that video record is obtained. The video information includes: the second user identifier, video title, video status, etc. This "any video record" can be randomly selected or selected according to the order in which the records are obtained; this invention does not impose a specific limitation. Then, the second user identifier in the video information of this video record is compared one by one with all the first user identifiers in the full set of user information. If a match is found during the comparison, the video record corresponding to that second user identifier is retained. If no matching user identifier is found after comparing all the user identifiers in the first database, then this video record is discarded.
[0095] For example, suppose the video information of a video record selected from the full video records is: User ID 31, Illustration Art, Successfully Published, 30MB; and the first user identifiers in the full user information include User ID 10, User ID 20, User ID 54, User ID 64, User ID 25, and User ID 31. The second user identifier in this video record is compared with all of the first user identifiers in the full user information. Because "User ID 31" in the full user information stored in the first database matches "User ID 31" in the selected video record from the second database, this video record is retained. If the video information of the video record is instead: User ID 30, Life Clip, Successfully Published, 20MB, then "User ID 30" is not found in the full user information and the video record is discarded. Once all video records have been traversed in this way, the operation ends and the target video records are obtained.
[0096] Alternatively, any first user identifier can be taken from the full user information and compared one by one with the second user identifiers in the full video records. If a match is found, the video record corresponding to that second user identifier is retained. If no matching identifier is found after comparing against all second user identifiers in the full video records stored in the second database, the first user identifier is discarded and the next first user identifier is selected for comparison. This continues until all first user identifiers in the full user information have been traversed, at which point the operation ends and the target video records are obtained.
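The two traversal directions described for step 201 can be sketched side by side using the identifiers from the example above. The function names and the dictionary layout of a record are hypothetical. The first function replaces the one-by-one comparison with a set membership test, which is a substitution made here for brevity rather than something the source specifies.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def filter_by_record(video_records: Iterable[Record],
                     first_user_ids: Iterable[int]) -> List[Record]:
    """Iterate over video records; retain those whose identifier is a known user."""
    known = set(first_user_ids)  # set lookup stands in for one-by-one comparison
    return [record for record in video_records if record["user_id"] in known]


def filter_by_user(video_records: List[Record],
                   first_user_ids: Iterable[int]) -> List[Record]:
    """Iterate over first user identifiers; collect the records matching each one."""
    retained: List[Record] = []
    for user_id in first_user_ids:
        # An identifier with no matching record contributes nothing and is skipped.
        retained.extend(r for r in video_records if r["user_id"] == user_id)
    return retained


first_user_ids = [10, 20, 54, 64, 25, 31]
full_video_records = [
    {"user_id": 31, "title": "Illustration Art", "status": "Successfully Published", "size_mb": 30},
    {"user_id": 30, "title": "Life Clip", "status": "Successfully Published", "size_mb": 20},
]

target_a = filter_by_record(full_video_records, first_user_ids)
target_b = filter_by_user(full_video_records, first_user_ids)
print(target_a == target_b, [r["user_id"] for r in target_a])
```

Both directions retain the record for User ID 31 and discard the record for User ID 30. In general they return the same set of records, though possibly in a different order.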
[0097] Step 202: Generate video records corresponding to all user information based on the target video records.
[0098] In this embodiment of the invention, the target video records are the N (N≥1) video records whose identifiers matched when the first user identifiers were compared with the second user identifiers. At this point the video records are still unordered. To obtain a clear view of each user's video publishing activity, the video records with the same user identifier can be extracted from the target video records and counted. This yields the total number of videos published by each user, that is, the video records corresponding to the full user information. For example, the target video records obtained after comparison include: "User ID 31, Illustration Art, Successfully Published, 30MB", "User ID 10, Food Making, Successfully Published, 20MB", "User ID 31, Military Technology, Publish Failed, 33MB", "User ID 20, Celebrity Editing, Successfully Published, 26MB", and "User ID 54, Illustration Art, Successfully Published, 30MB". User ID 31 has two records, while each of the others has one. After sorting and counting, the video records corresponding to the full user information are therefore: User ID 31, 2; User ID 10, 1; User ID 20, 1; and User ID 54, 1.
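The grouping and counting in this example can be reproduced in a few lines. The sketch uses `collections.Counter` from the Python standard library as one convenient way to count; the tuple layout of a record is an assumption for illustration.

```python
from collections import Counter

# (user_id, title, status, size_mb), taken from the example above.
target_records = [
    (31, "Illustration Art", "Successfully Published", 30),
    (10, "Food Making", "Successfully Published", 20),
    (31, "Military Technology", "Publish Failed", 33),
    (20, "Celebrity Editing", "Successfully Published", 26),
    (54, "Illustration Art", "Successfully Published", 30),
]

# Count how many target records carry each user identifier.
totals = Counter(user_id for user_id, *_ in target_records)

# Counter keeps first-seen order, so the summary follows the record order.
summary = list(totals.items())
print(summary)
```

Note that in this embodiment the failed publication for User ID 31 is still counted, which matches the total of 2 given in the example.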
[0100] Figure 3 is a flowchart illustrating the steps of another data synchronization method provided in an embodiment of the invention. The data synchronization method disclosed in this embodiment is one feasible implementation of step 202 in the embodiment shown in Figure 2, and specifically includes:
[0101] Step 301: Obtain the full user identifiers of the target video recordings.
[0102] The target video records obtained in this embodiment of the invention are only video records published by all users in the video application, filtered from the total number of video records. However, in order for staff to have a more intuitive and clear understanding of the video publication activities of each user in the video application, the target video records need to be processed. Since each user has a unique user identifier, the records can be organized based on the user identifiers. Therefore, the first step is to obtain all user identifiers for the target video records.
[0103] Step 302: After deduplicating all user identifiers in the target video record, generate all target user identifiers.
[0104] In this embodiment of the invention, in order to obtain the video recording information under each user identifier, after obtaining all user identifiers of the target video record, it is first necessary to perform a deduplication operation on all user identifiers. This is because the target video record contains video records whose user identifiers are consistent with those of the user information in the full set of video records. However, these video records are quite disorganized, with multiple video records for the same user identifier, and they appear in different positions. To facilitate subsequent statistics, it is necessary to perform a deduplication operation on all user identifiers of the target video record to ensure that the obtained user identifiers are unique.
[0105] For example, the target video recording includes "User ID 31, Illustration Art, Successfully Published, 30MB", "User ID 10, Food Making, Successfully Published, 20MB", "User ID 31, Military Technology, Publish Failed, 33MB", "User ID 20, Celebrity Editing, Successfully Published, 26MB", and "User ID 54, Illustration Art, Successfully Published, 30MB". After deduplication, the user IDs are "User ID 31", "User ID 10", "User ID 20", and "User ID 54".
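The deduplication in step 302 can be sketched as follows. This is a minimal illustration only: the in-memory record layout, the field names, and the helper `dedupe_user_ids` are assumptions made for the example and are not specified by this embodiment.

```python
# Hypothetical in-memory copy of the target video records from the example above.
target_records = [
    {"user_id": 31, "category": "Illustration Art", "status": "Successfully Published", "size_mb": 30},
    {"user_id": 10, "category": "Food Making", "status": "Successfully Published", "size_mb": 20},
    {"user_id": 31, "category": "Military Technology", "status": "Publish Failed", "size_mb": 33},
    {"user_id": 20, "category": "Celebrity Editing", "status": "Successfully Published", "size_mb": 26},
    {"user_id": 54, "category": "Illustration Art", "status": "Successfully Published", "size_mb": 30},
]


def dedupe_user_ids(records):
    """Return the user IDs in first-seen order with duplicates removed."""
    # dict.fromkeys keeps insertion order and silently drops repeated keys.
    return list(dict.fromkeys(record["user_id"] for record in records))


target_user_ids = dedupe_user_ids(target_records)
print(target_user_ids)  # [31, 10, 20, 54]
```

Preserving first-seen order is a choice made here so that the output matches the order listed in the example; the embodiment itself only requires that the identifiers be unique.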
[0106] Step 303: For any target user identifier among all target user identifiers, call the statistical function to obtain the total number of video records in the target video records that contain that target user identifier.
[0107] In this embodiment of the invention, deduplicating all user identifiers of the target video records yields the full set of unique target user identifiers. One target user identifier is then selected from this set, either randomly or sequentially, and the `Count` function is used to count the video records corresponding to it. Specifically, the target video records are traversed from top to bottom or from bottom to top, and the `Count` function increments its result by one each time it finds a video record whose user identifier matches the selected target user identifier. The traversal continues until every video record in the target video records has been examined. If N (N≥1) matching video records are found during the traversal, the result returned by the `Count` function is N.
[0108] For example, the target video records contain two entries for "User ID 31": "User ID 31, Illustration Art, Successfully Published, 30MB" and "User ID 31, Military Technology, Publish Failed, 33MB". The `Count` function therefore returns 2.
[0109] Step 304: After detecting that the total number of video records has been obtained for all target user identifiers, generate the video records corresponding to all user information based on all target user identifiers and their total numbers of video records.
[0110] In this embodiment of the invention, deduplicating all user identifiers of the target video records yields a set of unique target user identifiers. The statistical function is then applied to each target user identifier in turn to count the videos published by that user. Once the statistics have been completed for every deduplicated target user identifier, the total number of videos published by each user in the target video records is obtained.
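The counting behaviour described for step 303 can be sketched as a plain traversal. The function name `count_records` and the tuple layout of the records are assumptions for illustration; the embodiment only names a `Count` statistical function without giving its signature.

```python
# Hypothetical target video records as (user_id, category, status) tuples.
target_records = [
    (31, "Illustration Art", "Successfully Published"),
    (10, "Food Making", "Successfully Published"),
    (31, "Military Technology", "Publish Failed"),
    (20, "Celebrity Editing", "Successfully Published"),
    (54, "Illustration Art", "Successfully Published"),
]


def count_records(records, target_user_id):
    """Traverse every record once and count those belonging to target_user_id."""
    total = 0
    for user_id, _category, _status in records:
        if user_id == target_user_id:
            total += 1  # one more record found for the selected user
    return total


print(count_records(target_records, 31))  # 2
```

Note that this version counts every record for the user regardless of publication status, which is what the example above does.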
[0111] For example, the target video records include "User ID 31, Illustration Art, Successfully Published, 30MB", "User ID 10, Food Making, Successfully Published, 20MB", "User ID 31, Military Technology, Publish Failed, 33MB", "User ID 20, Celebrity Editing, Successfully Published, 26MB", and "User ID 54, Illustration Art, Successfully Published, 30MB". After deduplication by user ID, the target video records are found for each target user ID, and a statistical function is called to return the total number of video records. The results are: "User ID 31, 2", "User ID 10, 1", "User ID 20, 1", and "User ID 54, 1". Integrating these results gives the total number of user video records.
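Steps 302 to 304 taken together amount to a per-user tally. As a hedged sketch, the same result can be produced in a single pass with Python's `collections.Counter`; this is a substituted technique chosen for brevity, not the per-identifier loop the embodiment describes, and the record layout and the name `user_video_totals` are assumptions.

```python
from collections import Counter

# Hypothetical target video records as (user_id, category, status) tuples.
target_records = [
    (31, "Illustration Art", "Successfully Published"),
    (10, "Food Making", "Successfully Published"),
    (31, "Military Technology", "Publish Failed"),
    (20, "Celebrity Editing", "Successfully Published"),
    (54, "Illustration Art", "Successfully Published"),
]

# Counter deduplicates the user IDs (they become keys) and counts the
# records per ID in one traversal, keeping first-seen key order.
totals = Counter(user_id for user_id, _category, _status in target_records)

# Pair each target user ID with its total, as in the example results.
user_video_totals = list(totals.items())
print(user_video_totals)  # [(31, 2), (10, 1), (20, 1), (54, 1)]
```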
[0112] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0113] See Figure 4, which is a flowchart illustrating the steps of another data synchronization method provided in this embodiment of the invention. The data synchronization method disclosed in this embodiment is one feasible implementation of step 303 in the embodiment illustrated in Figure 3, and specifically includes:
[0114] Step 401: If the video status is detected as successfully published, call the statistical function to increment by one the total number of video records counted for the target user identifier.
[0115] Step 402: If the video status is detected as a publishing failure, keep the total number of video records counted for the target user identifier unchanged.
[0116] In this embodiment of the invention, each video record obtained from the second database also includes a video status, which includes: successful publication and failed publication. The final statistics for the video records only need to count the video records with a successful publication status. Therefore, when a successful publication status is detected, the statistical function is called to increment the total number of video records for the corresponding target user identifier by one. When a failed publication status is detected, the total number of video records for the corresponding target user identifier remains unchanged.
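Steps 401 and 402 can be sketched as a status-aware variant of the count. The status strings, the tuple layout, and the function name `count_published` are assumptions for illustration only.

```python
PUBLISHED = "Successfully Published"  # assumed status label
FAILED = "Publish Failed"             # assumed status label

# Hypothetical target video records as (user_id, category, status) tuples.
target_records = [
    (31, "Illustration Art", PUBLISHED),
    (10, "Food Making", PUBLISHED),
    (31, "Military Technology", FAILED),
    (20, "Celebrity Editing", PUBLISHED),
    (54, "Illustration Art", PUBLISHED),
]


def count_published(records, target_user_id):
    """Count only the successfully published records of target_user_id."""
    total = 0
    for user_id, _category, status in records:
        if user_id != target_user_id:
            continue
        if status == PUBLISHED:
            total += 1  # step 401: successful publication increments the total
        # step 402: a failed publication leaves the total unchanged
    return total


print(count_published(target_records, 31))  # 1
```

Under this rule "User ID 31" yields 1 rather than 2, because the failed "Military Technology" record is skipped.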
[0117] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0118] See Figure 5, which is a flowchart illustrating the steps of another data synchronization method provided in this embodiment of the invention. The data synchronization method disclosed in this embodiment is one feasible implementation of step 103 in the embodiment illustrated in Figure 1, and specifically includes:
[0119] Step 501: Obtain the binary log of the first database.
[0120] In this embodiment of the invention, the first database establishes a connection with the big data synchronization tool. After the big data synchronization tool obtains the video records corresponding to all user information from the full video records, it saves these records to the first database before they are sent to the third database. To determine whether the video records obtained this time have actually been updated, the binary log of the first database is obtained. The binary log records all changes to database table structures and all modifications to table data, and is stored on disk in binary form. Each entry stored in the binary log is called an event, and each database update operation (insert, update, delete, etc.) corresponds to one event. Therefore, in this embodiment, the events for the video records corresponding to all user information in the first database are obtained from the binary log.
[0121] Step 502: If an update event for video records corresponding to all user information is detected in the binary log, the video records corresponding to all user information are stored in the third database for different services to read.
[0122] In this embodiment of the invention, if an update event for video records corresponding to all user information is detected in the binary log, the video records corresponding to all user information are stored in the third database. Conversely, if no update event is obtained from the binary log, no synchronization operation is performed, and a signal not to update is sent to the third database, thus avoiding unnecessary data updates and reducing data synchronization time.
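The gating logic of steps 501 and 502 can be sketched as below. This is a simplified illustration under stated assumptions: binary log events are represented as already-parsed dictionaries with hypothetical `table` and `type` keys, the table name `user_video_totals` is invented for the example, and a plain `dict` stands in for the third database. Reading a real binary log requires a replication client, which is not shown.

```python
UPDATE_EVENT_TYPES = {"insert", "update", "delete"}  # assumed event type labels


def has_update_event(binlog_events, table):
    """Return True if any parsed binlog event modifies the given table."""
    return any(
        event["table"] == table and event["type"] in UPDATE_EVENT_TYPES
        for event in binlog_events
    )


def sync_if_updated(binlog_events, user_video_totals, third_db,
                    table="user_video_totals"):
    """Write the totals to third_db only when the binlog shows a change."""
    if not has_update_event(binlog_events, table):
        return False  # no update event: skip the synchronization
    third_db.update(user_video_totals)
    return True


third_db = {}  # stand-in for the third database
events = [{"table": "user_video_totals", "type": "update"}]
synced = sync_if_updated(events, {31: 2, 10: 1}, third_db)
skipped = sync_if_updated([], {99: 5}, third_db)
print(synced, skipped, third_db)  # True False {31: 2, 10: 1}
```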
[0123] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0124] See Figure 6, which is a flowchart illustrating the steps of another data synchronization method provided in this embodiment of the invention. The data synchronization method disclosed in this embodiment is one feasible implementation of step 502 in the embodiment illustrated in Figure 5, and specifically includes:
[0125] Step 601: If an update event for video records corresponding to all user information is detected in the binary log, the video records corresponding to all user information are sent to the message middleware.
[0126] In this embodiment of the invention, the video records corresponding to all user information are sent to a message middleware. Message middleware, also known as a message queue, is the foundational software for sending and receiving messages in a distributed system. It uses an efficient and reliable message passing mechanism for platform-independent data exchange and integrates distributed systems on the basis of data communication. By providing message passing and message queue models, it extends inter-process communication to a distributed environment. Common message middleware includes RabbitMQ, ActiveMQ, Kafka, and RocketMQ; this invention does not limit which one is used.
[0127] Step 602: Store the video records corresponding to all user information in the third database through the message middleware so that different services can read them.
[0128] In this embodiment of the invention, the video records corresponding to all user information are synchronized to the third database through a message middleware. This reduces the coupling between the interacting systems, improves system response time, and provides architectural support for big data processing. Moreover, when data is synchronized through the message middleware, a failed synchronization can be retried and the data resent automatically, which increases the probability that the data is transmitted successfully.
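The retry behaviour described above can be sketched with Python's standard `queue.Queue` standing in for the message middleware. This is an assumption-laden illustration: a real deployment would use one of the middleware products named above, and the functions `publish` and `consume`, the `max_attempts` parameter, and the simulated failure are all hypothetical.

```python
import queue


def publish(mq, user_video_totals):
    """Step 601: send each (user_id, total) pair to the message queue."""
    for user_id, total in user_video_totals.items():
        mq.put((user_id, total))


def consume(mq, write, max_attempts=3):
    """Step 602: drain the queue, retrying each failed write."""
    failed = []
    while not mq.empty():
        message = mq.get()
        for _attempt in range(max_attempts):
            try:
                write(message)
                break  # write succeeded; move on to the next message
            except ConnectionError:
                continue  # transient failure; resend the same message
        else:
            failed.append(message)  # every attempt failed
    return failed


third_db = {}  # stand-in for the third database
calls = {"n": 0}


def flaky_write(message):
    """Simulated writer whose very first call fails."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("transient failure")
    user_id, total = message
    third_db[user_id] = total


mq = queue.Queue()
publish(mq, {31: 2, 10: 1})
failed = consume(mq, flaky_write)
print(third_db, failed)  # {31: 2, 10: 1} []
```

Because the producer only enqueues messages and the consumer only dequeues them, neither side needs to know about the other, which is the decoupling the paragraph above refers to.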
[0129] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0130] See Figure 7, which is a structural block diagram of a data synchronization device 700 provided in an embodiment of this application. As shown in Figure 7, the device may include:
[0131] The first module 701 is used to call the big data synchronization tool to obtain the full user information corresponding to the video application from the first database and the full video records from the second database.
[0132] The second module 702 is used to obtain the video records corresponding to all user information from the full video records.
[0133] The third module 703 is used to store the video records corresponding to all user information into the third database so that different services can read them.
[0134] Optionally, the first module also includes:
[0135] The first sending submodule is used to send the full video records of the second database to the first database according to the target communication protocol.
[0136] The first acquisition submodule is used to call the big data synchronization tool to obtain the full user information and full video records corresponding to the video application from the first database.
[0137] Optionally, each piece of user information in the full set of user information has a corresponding first user identifier, and each video record in the full set of video records has a corresponding second user identifier.
[0138] The second module also includes:
[0139] The second acquisition submodule is used to acquire the target video record corresponding to the second user identifier in the full video record when the first user identifier is detected to be consistent with the second user identifier.
[0140] The first generation submodule is used to generate video records corresponding to all user information based on the target video records.
[0141] Optionally, the first generation submodule includes:
[0142] The third acquisition submodule is used to obtain all user identifiers of the target video records.
[0143] The second generation submodule is used to generate the full set of target user identifiers after deduplicating all user identifiers of the target video records.
[0144] The fourth acquisition submodule is used to call a statistical function, for any target user identifier among all target user identifiers, to obtain the total number of video records in the target video records that contain that target user identifier.
[0145] The third generation submodule is used to generate the video records corresponding to all user information based on all target user identifiers and their total numbers of video records, after detecting that the total number of video records has been obtained for all target user identifiers.
[0146] Optionally, each video record in the full video record obtained from the second database also includes: video status, where the video status includes: successful publication and failed publication.
[0147] The fourth acquisition submodule also includes:
[0148] The first detection submodule is used to call a statistical function to increment by one the total number of video records counted for the target user identifier when the video status is detected as successfully published.
[0149] The second detection submodule is used to keep the total number of video records counted for the target user identifier unchanged when the video status is detected as a publishing failure.
[0150] Optionally, the third module also includes:
[0151] The fifth submodule is used to obtain the binary logs of the first database.
[0152] The first storage submodule is used to store the video records corresponding to all user information in the third database when an update event for the video records corresponding to all user information is detected in the binary log, so that different services can read them.
[0153] Optionally, the first storage submodule further includes:
[0154] The first sending submodule is used to send the video records corresponding to all user information to the message middleware when an update event for the video records corresponding to all user information is detected in the binary log.
[0155] The second storage submodule is used to store the video records corresponding to all user information into the third database through a message middleware, so that different services can read them.
[0156] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0157] This invention also provides an electronic device. Figure 8 is a structural block diagram of an electronic device provided in an embodiment of the present invention.
[0158] As shown in Figure 8, the electronic device includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804. The processor 801, communication interface 802, and memory 803 communicate with each other via the communication bus 804.
[0159] Memory 803 is used to store computer programs;
[0160] When processor 801 executes a program stored in memory 803, it performs the following steps:
[0161] The big data synchronization tool is invoked to obtain full user information corresponding to the video application from the first database and full video records from the second database.
[0162] Obtain the video records corresponding to the full user information from the full video records;
[0163] The video records corresponding to the full set of user information are stored in a third database for different services to access.
[0164] This invention provides a data synchronization method that retrieves all user information corresponding to a video application from a first database and all video records from a second database using a big data synchronization tool. This allows for the simultaneous retrieval of all user information and video records, avoiding the time wasted on retrieving data from only one user at a time. Retrieving video records corresponding to all user information from the full video records enables targeted filtering of video records relevant to the user information, reducing interference from irrelevant video records. By storing the video records corresponding to all user information in a third database, data synchronization is achieved, facilitating access for different services and reducing reliance on the first database. In summary, this embodiment, by calling a big data synchronization tool, can retrieve all user information and video records at once. After filtering out the video records corresponding to all user information, it synchronizes them to the third database, thereby achieving rapid data updates in the third database. This not only saves time but also facilitates access for different services.
[0165] The communication bus mentioned above can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc. For ease of illustration, only one thick line is used to represent it in the diagram, but this does not mean that there is only one bus or one type of bus.
[0166] The communication interface is used for communication between the aforementioned terminal and other devices.
[0167] The memory may include random access memory (RAM) or non-volatile memory, such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
[0168] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
[0169] In another embodiment of the present invention, a computer-readable storage medium is also provided, which stores instructions that, when executed on a computer, cause the computer to perform any of the data synchronization methods described in the above embodiments.
[0170] In another embodiment of the present invention, a computer program product containing instructions is also provided, which, when run on a computer, causes the computer to perform any of the data synchronization methods described in the above embodiments.
[0171] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially as a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present invention are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or third-party database to another website, computer, server, or third-party database via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or third-party database that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state disks (SSDs)).
[0172] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0173] The various embodiments in this specification are described in a related manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.
[0174] The above description is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention are included within the scope of protection of the present invention.
Claims
1. A data synchronization method, characterized in that the method includes: calling a big data synchronization tool to obtain full user information corresponding to a video application from a first database and full video records from a second database, wherein each piece of user information in the full user information has a corresponding first user identifier, and each video record in the full video records has a corresponding second user identifier; obtaining the video records corresponding to the full user information from the full video records; and storing the video records corresponding to the full user information in a third database for different services to read; wherein the step of obtaining the video records corresponding to the full user information from the full video records includes: filtering, based on the first user identifier and the second user identifier, the video records associated with the full user information from the full video records; and performing deduplication and statistical processing on the filtered video records to obtain a statistical result corresponding to each user identifier, the statistical results being used as the video records corresponding to the full user information.
2. The method according to claim 1, characterized in that the step of calling the big data synchronization tool to obtain full user information corresponding to the video application from the first database and full video records from the second database includes: sending the full video records of the second database to the first database according to a target communication protocol; and calling the big data synchronization tool to obtain the full user information corresponding to the video application and the full video records from the first database.
3. The method according to claim 1, characterized in that the step of filtering the video records associated with the full user information from the full video records based on the first user identifier and the second user identifier includes: if the first user identifier is detected to be consistent with the second user identifier, obtaining the target video records corresponding to the second user identifier in the full video records; and generating the video records corresponding to the full user information based on the target video records.
4. The method according to claim 3, characterized in that the step of generating the video records corresponding to the full user information based on the target video records includes: obtaining all user identifiers of the target video records; deduplicating all user identifiers of the target video records to generate the full set of target user identifiers; for any target user identifier among the full set of target user identifiers, calling a statistical function to obtain the total number of video records in the target video records that contain that target user identifier; and, upon detecting that the total number of video records has been obtained for the full set of target user identifiers, generating the video records corresponding to the full user information based on the full set of target user identifiers and the total numbers of video records.
5. The method according to claim 4, characterized in that each video record in the full video records obtained from the second database also includes a video status, wherein the video status includes successful publication and failed publication; and the step of calling the statistical function to obtain the total number of video records in the target video records that contain the target user identifier includes: if the video status is detected as successfully published, calling the statistical function to increment by one the total number of video records counted for the target user identifier; and if the video status is detected as a publishing failure, keeping the total number of video records counted for the target user identifier unchanged.
6. The method according to claim 1, characterized in that the step of storing the video records corresponding to the full user information in a third database for different services to read includes: obtaining the binary log of the first database; and, if an update event for the video records corresponding to the full user information is detected in the binary log, storing the video records corresponding to the full user information in the third database for different services to read.
7. The method according to claim 6, characterized in that the step of storing the video records corresponding to the full user information in the third database for different services to read, if an update event for the video records corresponding to the full user information is detected in the binary log, includes: if an update event for the video records corresponding to the full user information is detected in the binary log, sending the video records corresponding to the full user information to a message middleware; and storing the video records corresponding to the full user information in the third database through the message middleware for different services to read.
8. A data synchronization device, characterized in that the device includes: a first module, used to call a big data synchronization tool to obtain full user information corresponding to a video application from a first database and full video records from a second database, wherein each piece of user information in the full user information has a corresponding first user identifier, and each video record in the full video records has a corresponding second user identifier; a second module, used to obtain the video records corresponding to the full user information from the full video records, wherein obtaining the video records corresponding to the full user information from the full video records includes: filtering, based on the first user identifier and the second user identifier, the video records associated with the full user information from the full video records; and performing deduplication and statistical processing on the filtered video records to obtain a statistical result corresponding to each user identifier, the statistical results being used as the video records corresponding to the full user information; and a third module, used to store the video records corresponding to the full user information in a third database for different services to read.
9. An electronic device, characterized in that it includes a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; the memory is used to store a computer program; and the processor, when executing the program stored in the memory, implements the steps of the method described in any one of claims 1-7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the method described in any one of claims 1-7.