Data processing method and device, electronic equipment, medium and program product
By establishing the relationship between external and internal data in data processing, the problem of inconsistent data definition standards is solved, and efficient and accurate data classification, storage, and joint use are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
- Filing Date
- 2022-08-22
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies suffer from inconsistent data definition standards when classifying and storing data from external data sources within a company. This leads to a need for extensive manual intervention, resulting in low efficiency and difficulty in ensuring the joint use of data.
By extracting external metadata from external data sources and establishing target data relationships based on preset data source templates, external data is associated with internal metadata and stored directly according to internally defined standards.
It reduces the manual classification process, improves the efficiency and accuracy of data processing, and ensures the combined use of external and internal data.
Figure CN115481305B_ABST
Abstract
Description
Technical Field
[0001] This specification relates to the field of computer technology, and in particular to a data processing method, apparatus, electronic device, medium, and program product.
Background Technology
[0002] With the rapid popularization and development of the Internet, various data sources (platforms or companies) generate or use various types of data, both internal and external, in their production, operation, and decision-making processes. Due to the internal needs of the data source (platform or company), it is necessary to collect external data from other external data sources that cooperate with it, and to categorize and store the collected external data in order to ensure that the collected external data can be viewed and managed within the data source.
Summary of the Invention
[0003] This specification provides a data processing method, apparatus, electronic device, medium, and program product. By establishing a correspondence (target data relationship) between internal metadata and the corresponding external metadata of external data from external data sources, the externally defined standards (external metadata) of external data from each external data source are associated with the internally defined standards (internal metadata) of the company's internal data. This achieves more efficient and accurate classification and associated storage of external data from external data sources according to internal data classification requirements. The above technical solution is as follows:
[0004] Firstly, embodiments of this specification provide a data processing method, including:
[0005] Collect a target external dataset from an external data source; the target external dataset includes at least one target external data.
[0006] Extract the target external metadata corresponding to each target external data in the above target external dataset;
[0007] Based on the target external metadata corresponding to each target external data in the aforementioned target external dataset, at least one target external data is classified and stored in the storage space of the external metadata associated with each internal metadata according to the target data relationship; the aforementioned target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the aforementioned external data source.
[0008] In one possible implementation, before classifying and storing at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset according to the target data relationship, the method further includes:
[0009] Extract the internal metadata corresponding to each internal data in the internal data assets, and extract the external metadata corresponding to each external data in the external data sources;
[0010] Establish the association between the aforementioned internal metadata and the aforementioned external metadata to generate the target data relationship.
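The two steps above (extracting both kinds of metadata, then associating them) can be illustrated with a minimal sketch. It assumes that every piece of metadata carries a category label and that association means grouping by that label; the function name `build_target_data_relationship`, the metadata names, and the categories are hypothetical and do not come from this specification.

```python
from collections import defaultdict


def build_target_data_relationship(internal_metadata, external_metadata):
    """Associate internal and external metadata that share a category.

    internal_metadata: dict mapping internal metadata name -> category
    external_metadata: dict mapping (source, external metadata name) -> category
    Returns a dict: internal metadata name -> sorted list of (source, name).
    """
    # Group the external metadata by category first.
    by_category = defaultdict(list)
    for key, category in external_metadata.items():
        by_category[category].append(key)

    # Each internal metadata entry is linked to the external entries
    # of the same category (possibly none).
    return {
        name: sorted(by_category.get(category, []))
        for name, category in internal_metadata.items()
    }


# Hypothetical metadata, for illustration only.
internal = {"int_payments": "payments", "int_logistics": "logistics"}
external = {
    ("source_1", "pay_record"): "payments",
    ("source_2", "txn"): "payments",
    ("source_2", "shipment"): "logistics",
}
relationship = build_target_data_relationship(internal, external)
```

Grouping by a shared category is only one way to form the association; the specification also allows the relationship to be set up manually by a responsible person.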
[0011] In one possible implementation, extracting the external metadata corresponding to each external data in the external data source includes:
[0012] Based on a preset data source template, extracting the external metadata corresponding to each external data in the external data source; the preset data source template represents the external data standards defined by different industries corresponding to different external data sources.
[0013] In one possible implementation, the aforementioned target external dataset comes from one or more of the aforementioned external data sources.
[0014] In one possible implementation, the aforementioned target external data includes at least one of the following: text data, audio data, video data, and image data.
[0015] In one possible implementation, the aforementioned internal metadata and the corresponding associated external metadata belong to the same category.
[0016] In one possible implementation, after classifying and storing at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset according to the target data relationship, the method further includes:
[0017] Receive an invocation instruction for the aforementioned target external data;
[0018] In response to the above invocation instruction, based on the internal metadata associated with the above target external metadata, the above target external data in the storage space of the corresponding target external metadata is invoked.
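The invocation step can be sketched as a lookup that goes from an internal metadata key, through the target data relationship, to the storage spaces of the associated external metadata. This is a minimal in-memory sketch under assumed data structures; the class `ExternalDataStore`, its methods, and the sample keys are hypothetical.

```python
class ExternalDataStore:
    """In-memory stand-in for the storage spaces keyed by external metadata."""

    def __init__(self, relationship):
        # relationship: internal metadata -> list of associated external metadata
        self._relationship = relationship
        self._spaces = {}  # external metadata -> list of stored external data

    def store(self, external_meta, data):
        # Place one item of external data in the storage space of its metadata.
        self._spaces.setdefault(external_meta, []).append(data)

    def invoke(self, internal_meta):
        """Return every stored item reachable from one internal metadata key."""
        results = []
        for external_meta in self._relationship.get(internal_meta, []):
            results.extend(self._spaces.get(external_meta, []))
        return results


# Hypothetical relationship and data, for illustration only.
store = ExternalDataStore({"int_payments": ["pay_record", "txn"]})
store.store("pay_record", "invoice-001")
store.store("txn", "transfer-042")
store.store("shipment", "parcel-7")  # not associated with int_payments
invoked = store.invoke("int_payments")
```

Because the lookup starts from internal metadata, a caller inside the company does not need to know how each external data source names its data.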
[0019] Secondly, embodiments of this specification provide a data processing apparatus, the apparatus comprising:
[0020] The acquisition module is used to acquire a target external dataset from an external data source; the target external dataset includes at least one target external data.
[0021] The first extraction module is used to extract the target external metadata corresponding to each target external data in the above target external dataset;
[0022] The classification storage module is used to classify and store at least one target external data into the storage space of the external metadata associated with each internal metadata, based on the target external metadata corresponding to each target external data in the target external dataset and according to the target data relationship; the target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source.
[0023] In one possible implementation, the data processing apparatus further includes:
[0024] The second extraction module is used to extract the internal metadata corresponding to each internal data in the internal data assets, and to extract the external metadata corresponding to each external data in the external data source.
[0025] The generation module is used to establish the association between the aforementioned internal metadata and the aforementioned external metadata, and to generate the target data relationship.
[0026] In one possible implementation, the second extraction module is specifically used to: extract the external metadata corresponding to each external data in the external data source based on the preset data source template; the preset data source template represents the external data standards defined by different industries corresponding to different external data sources.
[0027] In one possible implementation, the aforementioned target external dataset comes from one or more of the aforementioned external data sources.
[0028] In one possible implementation, the aforementioned target external data includes at least one of the following: text data, audio data, video data, and image data.
[0029] In one possible implementation, the aforementioned internal metadata and the corresponding associated external metadata belong to the same category.
[0030] In one possible implementation, the data processing apparatus further includes:
[0031] The receiving module is used to receive the calling instruction for invoking the aforementioned target external data;
[0032] The calling module is used to respond to the above calling instruction and, based on the internal metadata associated with the above target external metadata, call the above target external data in the storage space of the corresponding target external metadata.
[0033] Thirdly, embodiments of this specification provide an electronic device, including: a processor and a memory;
[0034] The processor is connected to the memory.
[0035] The aforementioned memory is used to store executable program code;
[0036] The processor reads the executable program code stored in the memory to run the program corresponding to the executable program code, so as to execute the method provided by the first aspect of the embodiments of this specification or any possible implementation of the first aspect.
[0037] Fourthly, embodiments of this specification provide a computer storage medium storing a plurality of instructions adapted for loading by a processor and executing the method provided by the first aspect of the embodiments of this specification or any possible implementation thereof.
[0038] Fifthly, embodiments of this specification provide a computer program product containing instructions that, when run on a computer or processor, cause the computer or processor to execute the data processing method provided by the first aspect of the embodiments of this specification or any possible implementation thereof.
[0039] This embodiment extracts the target external metadata corresponding to each target external data in a target external dataset collected from an external data source, where the target external dataset includes at least one target external data. Based on that target external metadata, the at least one target external data is classified and stored, according to the target data relationship, in the storage space of the external metadata associated with each internal metadata. The target data relationship characterizes the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source. Through this correspondence (the target data relationship), the external definition standards (external metadata) of the external data from each external data source are associated with the internal definition standards (internal metadata) of the company's internal data. The target external data can therefore be categorized and stored directly in the storage space of the external metadata associated with each internal metadata, which reduces the need to manually label and categorize external data from the various data sources. This saves considerable manpower in processing external data and enables more efficient and accurate categorization and storage of external data according to internal data classification requirements. It also ensures that the categorized and stored external data can be used in conjunction with internal data or directly within the company.
Attached Figure Description
[0040] To more clearly illustrate the technical solutions in the embodiments of this specification, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this specification. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0041] Figure 1 is a schematic diagram of the architecture of a data processing system provided by an exemplary embodiment of this specification;
[0042] Figure 2 is a schematic diagram illustrating the implementation process of data processing in related technologies;
[0043] Figure 3 is a flowchart illustrating a data processing method provided by an exemplary embodiment of this specification;
[0044] Figure 4 is a schematic diagram illustrating a process for establishing a target data relationship, provided by an exemplary embodiment of this specification;
[0045] Figure 5 is a schematic diagram illustrating an implementation process for establishing target data relationships, provided by an exemplary embodiment of this specification;
[0046] Figure 6 is a schematic diagram of different data source templates provided by an exemplary embodiment of this specification;
[0047] Figure 7 is a schematic diagram illustrating the implementation process of a data processing method provided by an exemplary embodiment of this specification;
[0048] Figure 8 is a flowchart illustrating another data processing method provided by an exemplary embodiment of this specification;
[0049] Figure 9 is a schematic diagram of the structure of a data processing apparatus provided by an exemplary embodiment of this specification;
[0050] Figure 10 is a schematic diagram of the structure of an electronic device provided by an exemplary embodiment of this specification.
Detailed Implementation
[0051] The technical solutions in the embodiments of this specification will be clearly and completely described below with reference to the accompanying drawings.
[0052] The terms "first," "second," "third," etc., used in this specification, claims, and the foregoing drawings are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to such processes, methods, products, or apparatus.
[0053] To more clearly describe the technical solutions of the embodiments in this specification, some concepts in this specification will be described before the description to facilitate a better understanding of the solutions.
[0054] Internal data assets: refer to data resources, recorded in physical or electronic form, that are owned or controlled by a company (enterprise or platform) and that can bring future economic benefits to it, such as but not limited to documents and electronic data.
[0055] Data resources: refers to various types of internal and external data generated or used by a company (enterprise or platform) in its production, operation and decision-making processes.
[0056] Data source: refers to the company (enterprise or platform) that generates data resources.
[0057] Please refer to Figure 1, which is a schematic diagram of the architecture of a data processing system provided by an exemplary embodiment of this specification. As shown in Figure 1, the data processing system may include: an external server cluster 110, an internal server 120, and a terminal cluster 130. Wherein:
[0058] The external server cluster 110 can consist of servers (external servers) corresponding to various external data sources, specifically including one or more external servers, such as external server 110A, external server 110B, external server 110C, etc. Each external server in the external server cluster 110 stores external data corresponding to its external data source. When the external data source (external company, enterprise, or platform) corresponding to an external server in the external server cluster 110 cooperates with the company (enterprise or platform) corresponding to the internal server 120, each cooperating external server in the external server cluster 110 can provide the corresponding external data to the company (enterprise or platform) corresponding to the internal server 120 via the network to meet the company's (enterprise's or platform's) external data needs during operation. The external servers in the external server cluster 110 can be, but are not limited to, hardware servers, virtual servers, cloud servers, etc.
[0059] Understandably, each external data source may have one or more servers (external servers), and the embodiments of this specification do not limit this.
[0060] Internal server 120 can be a server capable of providing various data processing functions. It can receive external data from external data sources sent by any external server in the external server cluster 110 via the network. Specifically, it collects a target external dataset from the external data source, which includes at least one target external data point. Then, it extracts the target external metadata corresponding to each target external data point in the target external dataset. Based on the target external metadata corresponding to each target external data point in the target external dataset, it categorizes and stores at least one target external data point into the storage space of the external metadata associated with each internal metadata point, according to the target data relationship. The target data relationship is used to characterize the correspondence between each internal metadata point and the external metadata corresponding to each external data point in the aforementioned external data source.
[0061] Understandably, companies (enterprises or platforms) typically require a huge amount of external data. A server is a high-performance computer with strong data processing capabilities and high stability and reliability. Having the internal server 120 process the collected target external dataset therefore makes it possible to store a large amount of external data and helps ensure efficient and stable data processing.
[0062] Optionally, the internal server 120 can also provide any terminal in the terminal cluster 130 corresponding to each employee within the company (enterprise or platform) with the corresponding permissions for internal data or stored external data of the company (enterprise or platform), so that the employees corresponding to the terminals in the terminal cluster 130 can view and use the internal data of the company (enterprise or platform) or the external data stored within the company (enterprise or platform) through the terminal.
[0063] Understandably, internal server 120 can be, but is not limited to, hardware server, virtual server, cloud server, etc.
[0064] Terminal cluster 130 can consist of the employee terminals of the company (enterprise or platform) corresponding to internal server 120, specifically including one or more employee terminals, such as employee terminal 130A, employee terminal 130B, employee terminal 130C, etc. Employee-version software can be installed on the terminals in terminal cluster 130 so that employees can view and use, online, the internal data of the company (enterprise or platform) or the external data stored within the company (enterprise or platform). Any employee terminal in terminal cluster 130 can connect to the network and, through the network, establish a data connection with internal server 120, for example to receive or send internal data. Any employee terminal in terminal cluster 130 can be, but is not limited to, a mobile phone, tablet, laptop, or other device with the employee-version software installed.
[0065] Optionally, any employee terminal in the terminal cluster 130 can establish a data connection with any external server in the external server cluster 110 via the network. For example, it can receive external data sent by any external server in the external server cluster 110 that is cooperating with the company of the employee terminal. That is, it can collect target external datasets from external data sources. The target external dataset includes at least one target external data. Then, it extracts the target external metadata corresponding to each target external data in the target external dataset. Based on the target external metadata corresponding to each target external data in the target external dataset, it classifies and stores at least one target external data into the storage space of external metadata associated with each internal metadata according to the target data relationship. The target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source.
[0066] Understandably, since servers are not portable, having an employee terminal process the target external dataset that it collects over the network from the external data sources corresponding to external server cluster 110 is more convenient for employees than having internal server 120 process the collected target external dataset.
[0067] The network can be a medium that provides a communication link between any external server in the external server cluster 110 and the internal server 120, between the internal server 120 and any employee terminal in the terminal cluster 130, or between any external server in the external server cluster 110 and any employee terminal in the terminal cluster 130. It can also be the Internet, which includes network devices and transmission media, and is not limited thereto. The transmission media can be a wired link (e.g., but not limited to, coaxial cable, fiber optic cable, and digital subscriber line (DSL)) or a wireless link (e.g., but not limited to, wireless fidelity (WIFI), Bluetooth, and mobile device networks).
[0068] It is understood that the data processing method provided in the embodiments of this specification can be executed by any one or more employee terminals in the terminal cluster 130, or by one or more internal servers 120, or by at least one employee terminal in the terminal cluster 130 and at least one internal server 120. The embodiments of this specification do not limit this. All the following embodiments are described using the internal server 120 as an example of data processing.
[0069] Understandably, when multiple employee terminals or multiple internal servers 120 perform data processing, or when at least one employee terminal and at least one internal server 120 jointly process data, each employee terminal and / or each internal server 120 can connect to external data from different external data sources. This not only accelerates the efficiency of external data collection when the company needs a large amount of external data from external data sources, but also improves the efficiency of data processing by having multiple employee terminals or multiple internal servers 120 process a large amount of external data in parallel.
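The parallel arrangement described above can be sketched with a thread pool in which each worker stands in for one employee terminal or internal server handling one external data source. This is only an illustrative sketch: `process_source` is a hypothetical placeholder that counts items instead of performing the real collect, extract, and store steps, and the source names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def process_source(source_name, items):
    # Placeholder for the per-source work (collect, extract, classify, store).
    # Here it only reports how many items were handled.
    return source_name, len(items)


def process_in_parallel(datasets, max_workers=4):
    """Process each external data source's dataset in its own worker."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(process_source, name, items)
            for name, items in datasets.items()
        ]
        # Gather the (source, count) pairs once every worker has finished.
        return dict(future.result() for future in futures)


# Hypothetical datasets from two external data sources.
summary = process_in_parallel({"source_A": ["a1", "a2"], "source_B": ["b1"]})
```

In a real deployment the workers would be separate machines rather than threads; the thread pool is used here only to keep the example self-contained.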
[0070] Understandably, the numbers of external servers in the external server cluster 110, internal servers 120, and employee terminals in the terminal cluster 130 shown in the data processing system of Figure 1 are merely examples. In a specific implementation, the data processing system can contain any number of external servers, internal servers 120, and employee terminals, and this specification does not specifically limit this. For example, but not limited to, internal server 120 can be an internal server cluster composed of multiple internal servers.
[0071] In the data processing of related technologies, as shown in Figure 2, after the target external dataset needed by the company is collected, different external data sources may involve different industries, and the external definition standards for each industry may differ between sources. Furthermore, the external definition standards of each external data source may differ from the company's internal definition standards. Therefore, the data must be manually labeled against the external definition standards provided by each external data source before it can be classified and stored accurately and in line with internal requirements. As a result, the existing data processing flow consumes significant human resources and cannot guarantee that the classified and stored target external data can be used in conjunction with internal data or used within the company.
[0072] To solve the above problems, the data processing method provided in the embodiments of this specification is introduced below with reference to Figure 1 and Figure 2. Please refer to Figure 3, which is a flowchart illustrating a data processing method provided by an exemplary embodiment of this specification. As shown in Figure 3, this data processing method may include the following steps:
[0073] S302, Collect the target external dataset from an external data source.
[0074] Specifically, when the company needs a target external dataset from an external data source that it cooperates with, it can send a collection command to the corresponding external server through internal server 120. After receiving the collection command, the corresponding external server can send the target external dataset to internal server 120 via the network. Internal server 120 then receives the target external dataset sent by the external server via the network, which is equivalent to collecting the target external dataset from the external data source. The target external dataset includes at least one target external data.
[0075] For example, if Company A needs data (target external data) from Company B (external data source) and Company C (external data source) during its operation, it can directly collect at least one target external data, via the network, from the data stored on the external servers corresponding to Company B (external data source) and Company C (external data source). From the perspective of Company B (external data source) or Company C (external data source), the target external data is that company's own internal data.
[0076] Understandably, the company may cooperate with one or more external data sources, and may need one or more items of target external data internally. Accordingly, the target external dataset may be collected from one or more external data sources.
[0077] Furthermore, the target external data may include, but is not limited to, at least one of the following: text data, audio data, video data, and image data.
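Step S302 can be illustrated with a small in-memory sketch in which a dictionary stands in for the external servers and the four data types listed above are modeled as an enumeration. The names `DataKind`, `ExternalData`, and `collect_target_dataset`, and the sample payloads, are hypothetical; a real system would exchange a collection command and a response over the network.

```python
from dataclasses import dataclass
from enum import Enum


class DataKind(Enum):
    TEXT = "text"
    AUDIO = "audio"
    VIDEO = "video"
    IMAGE = "image"


@dataclass(frozen=True)
class ExternalData:
    source: str      # name of the external data source
    kind: DataKind   # one of the data types listed in the text
    payload: bytes


def collect_target_dataset(external_servers, requested_sources):
    """Gather the data held by each requested external data source."""
    dataset = []
    for name in requested_sources:
        dataset.extend(external_servers.get(name, []))
    # The target external dataset must include at least one item.
    if not dataset:
        raise ValueError("target external dataset is empty")
    return dataset


# Hypothetical contents of the external servers of Company B and Company C.
external_servers = {
    "B": [ExternalData("B", DataKind.TEXT, b"report")],
    "C": [
        ExternalData("C", DataKind.IMAGE, b"img"),
        ExternalData("C", DataKind.AUDIO, b"snd"),
    ],
}
target_dataset = collect_target_dataset(external_servers, ["B", "C"])
```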
[0078] S304, extract the target external metadata corresponding to each target external data in the target external dataset.
[0079] Specifically, after the target external dataset needed by the company is collected, note that different external data sources may involve different industries, and each industry may define its data by different standards; even for the same industry or the same category of data, the definition standards may differ between the company and the external data sources. This inconsistency makes it difficult to classify and store external data according to the company's internal needs. To address this, the target external metadata corresponding to each target external data in the target external dataset is first extracted according to the definition standards of each external data source. The target external metadata can be understood as the external data source's definition data for the target external data. The target external dataset is then classified and stored based on the target external metadata corresponding to each target external data, which avoids a large amount of manual labeling and classification and improves the efficiency and accuracy of data processing.
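Step S304 can be sketched as applying, to each record, the extraction rule of the source it came from. The sketch assumes that each record is a dictionary and that each source's definition standard can be expressed as a function from a record to a metadata name; the field names `biz_type` and `category_code` and all values are hypothetical.

```python
def extract_target_external_metadata(dataset, standards):
    """Pair each record with the metadata defined by its own source.

    dataset:   list of dicts, each with a "source" key plus source-specific fields
    standards: dict mapping source name -> function(record) -> metadata name
    Returns a list of (external metadata, record) pairs.
    """
    extracted = []
    for record in dataset:
        # Use the definition standard of the source that produced the record.
        standard = standards[record["source"]]
        extracted.append((standard(record), record))
    return extracted


# Hypothetical definition standards: each source names its metadata field differently.
standards = {
    "B": lambda record: record["biz_type"],
    "C": lambda record: record["category_code"],
}
dataset = [
    {"source": "B", "biz_type": "retail", "body": "order list"},
    {"source": "C", "category_code": "travel", "body": "booking log"},
]
pairs = extract_target_external_metadata(dataset, standards)
```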
[0080] S306, based on the target external metadata corresponding to each target external data in the target external dataset, classify and store at least one target external data into the storage space of the external metadata associated with each internal metadata according to the target data relationship.
[0081] Specifically, the target data relationship is used to characterize the correspondence between each internal metadata and the corresponding external metadata in external data sources. The aforementioned internal metadata refers to the metadata corresponding to each internal data in the internal data assets, i.e., the company's internal definition data for internal data. These internal data assets can be data resources owned or controlled by the company that can bring future economic benefits, recorded in physical or electronic form, such as, but not limited to, documents and electronic data. These data resources can be various types of internal and external data generated or used by the company (enterprise or platform) in its production, operation, and decision-making processes.
[0082] Furthermore, the internal metadata and the corresponding associated external metadata are of the same category. That is, based on the target external metadata corresponding to each target external data in the target external dataset, at least one target external data can be stored in the storage space of the external data corresponding to the internal data of the same category according to the target data relationship.
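Step S306 can be sketched as inverting the target data relationship and routing each item to the storage space of its external metadata under the associated internal metadata. This is a minimal sketch under assumed data structures (nested dictionaries as storage spaces); the function name `classify_and_store` and the sample keys are hypothetical, and the handling of unmatched items is an added assumption that the specification does not describe.

```python
from collections import defaultdict


def classify_and_store(extracted, relationship):
    """Route external data into storage spaces via the target data relationship.

    extracted:    iterable of (external metadata, data) pairs
    relationship: dict mapping internal metadata -> list of external metadata
    Returns (storage, unmatched), where storage[internal][external] is a list.
    """
    # Invert the relationship so each external metadata points to its internal one.
    external_to_internal = {
        external: internal
        for internal, externals in relationship.items()
        for external in externals
    }
    storage = defaultdict(lambda: defaultdict(list))
    unmatched = []  # assumption: items with no association are set aside
    for external, data in extracted:
        internal = external_to_internal.get(external)
        if internal is None:
            unmatched.append((external, data))
        else:
            storage[internal][external].append(data)
    return storage, unmatched


# Hypothetical relationship and extracted pairs.
relationship = {"int_retail": ["retail", "shop"], "int_travel": ["travel"]}
extracted = [("retail", "order list"), ("travel", "booking log"), ("misc", "notes")]
storage, unmatched = classify_and_store(extracted, relationship)
```

Keeping the external metadata as the inner key preserves the source's own definition alongside the internal category, which matches the idea of storing data in "the storage space of the external metadata associated with each internal metadata".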
[0083] Optionally, the aforementioned target data relationship can be established by the relevant person in charge within the company, based on the company's internal definition standards for internal data and the external definition standards for external data from external data sources, by pre-associating external metadata of the same industry (category) with internal metadata, thereby establishing the correspondence between each internal metadata and the corresponding external metadata in each external data source, i.e., the target data relationship.
[0084] Alternatively, to avoid consuming a large amount of human resources and to improve data processing efficiency, the process of establishing the target data relationship can also be automated by the internal server. Specifically, as shown in Figure 4, the process of establishing the target data relationship includes the following steps:
[0085] S402, extract the internal metadata corresponding to each internal data in the internal data assets, and extract the external metadata corresponding to each external data in the external data sources.
[0086] Specifically, the internal metadata corresponding to each internal data in the internal data assets, i.e., the company's internal definition data for internal data, can be extracted according to the company's internal definition standards for each internal data. The external metadata corresponding to each external data in the external data source, i.e., the external data source's definition data for that external data, can be extracted according to the external definition standards of the external data source. Note that the external data is the external data source's own internal data, so these external definition standards are the external data source's definition standards for its internal data.
[0087] Optionally, when extracting external metadata corresponding to each external data from external data sources, external metadata corresponding to each external data can also be extracted from external data sources based on preset data source templates. The preset data source templates represent the external data standards defined by different industries corresponding to different external data sources. Thus, by using preset data source templates, the definition standards of external data sources for data are conveyed to the internal server, so that the internal server can intuitively understand the distribution of data assets of external data sources, and combine them with the internal definition standards for internal data to achieve consistency and accuracy in the classification and transmission of external data to the company.
[0088] For example, as shown in Figure 5, suppose three external data sources cooperate with the company: external data source A, external data source B, and external data source C. The preset data source templates then include the A data source template corresponding to external data source A, the B data source template corresponding to external data source B, and the C data source template corresponding to external data source C. As shown in Figure 5, external data source A may operate transactions related to industry 1. The A data source template shows that the external data standard defined by external data source A for industry 1 is external metadata x11; that is, within external data source A, the metadata of the data corresponding to industry 1 is defined as x11. External data source B may operate transactions related to industry 1, industry 2, and industry 3. The B data source template shows that the external data standard defined by external data source B for industry 1 is external metadata x12; that is, within external data source B, the metadata of the data corresponding to industry 1 is defined as x12. Likewise, the metadata of the data corresponding to industry 2 is defined as y11, and the metadata of the data corresponding to industry 3 is defined as y21. External data source C may operate transactions related to industry 4. The C data source template shows that the external data standard defined by external data source C for industry 4 is external metadata z11; that is, within external data source C, the metadata of the data corresponding to industry 4 is defined as z11. Therefore, when establishing the target data relationship, the external metadata corresponding to each external data in external data sources A, B, and C can be extracted directly according to the A, B, and C data source templates shown in Figure 5. These templates convey each external data source's definition standards to the internal server, so that the internal server can directly see how the data assets of the external data sources are distributed. Combined with the internal definition standards for internal data, this ensures that external data is classified consistently and accurately as it is transmitted into the company.
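The template-based extraction in the Figure 5 example can be sketched as follows. This is a minimal illustration rather than the implementation of the specification: the dictionary layout, the function name `extract_external_metadata`, and the assumption that each record carries an `"industry"` field are all hypothetical. Only the source names (A, B, C), the industries, and the metadata labels (x11, x12, y11, y21, z11) come from the example above.

```python
# Hypothetical preset data source templates mirroring the Figure 5 example.
# Each template maps an industry to the external metadata that the
# external data source defines for that industry.
PRESET_TEMPLATES = {
    "A": {"industry 1": "x11"},
    "B": {"industry 1": "x12", "industry 2": "y11", "industry 3": "y21"},
    "C": {"industry 4": "z11"},
}


def extract_external_metadata(source, records, templates=PRESET_TEMPLATES):
    """Return (record, external_metadata) pairs for records from one source.

    Each record is assumed to be a dict with an "industry" field; that field
    name is an illustrative assumption, not part of the specification.
    """
    template = templates[source]
    extracted = []
    for record in records:
        industry = record["industry"]
        if industry not in template:
            raise KeyError(f"source {source!r} defines no standard for {industry!r}")
        extracted.append((record, template[industry]))
    return extracted


records_b = [
    {"id": 1, "industry": "industry 1"},
    {"id": 2, "industry": "industry 3"},
]
pairs = extract_external_metadata("B", records_b)
print([metadata for _, metadata in pairs])  # ['x12', 'y21']
```

Keeping one template per external data source means that onboarding a new source only requires adding a template entry, without changing the extraction logic.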
[0089] S404, establish the association between the internal metadata and the external metadata to generate the target data relationship.
[0090] Specifically, after extracting the internal metadata corresponding to each internal data in the internal data assets and the external metadata corresponding to each external data in the external data source, metadata of the same category can be associated to generate target data relationships. Target data relationships are used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source.
[0091] For example, the process of establishing the target data relationship can be as shown in Figure 6. As can be seen from Figure 6, data of class X may be defined as x1 (internal metadata x1) inside the company, as x11 (external metadata x11) in external data source A, and as x12 (external metadata x12) in external data source B. The data corresponding to internal metadata x1, external metadata x11, and external metadata x12 nevertheless all belong to the same category. Therefore, external metadata x11 and external metadata x12 can be associated with internal metadata x1, and so on for the other categories, to generate the final target data relationship.
[0092] It can be understood that the external data sources from which the target external dataset is collected in S302 should be one or more of the external data sources involved in establishing the target data relationship in S402.
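The same-category association described for class X can be sketched as below. The function name and the two input mappings are assumptions made for illustration. The specification states only that metadata of the same category are associated, and does not prescribe how the category of each metadata is determined.

```python
from collections import defaultdict


def build_target_data_relationship(internal_categories, external_categories):
    """Associate external metadata with internal metadata of the same category.

    internal_categories: {internal_metadata: category}
    external_categories: {(source, external_metadata): category}
    Returns {internal_metadata: [(source, external_metadata), ...]}.
    """
    # Group external metadata by category first.
    by_category = defaultdict(list)
    for key, category in external_categories.items():
        by_category[category].append(key)
    # Attach each group to the internal metadata of the same category.
    return {
        internal: sorted(by_category.get(category, []))
        for internal, category in internal_categories.items()
    }


# Class X from the Figure 6 example: x1 internally, x11 in A, x12 in B.
internal_categories = {"x1": "X"}
external_categories = {("A", "x11"): "X", ("B", "x12"): "X"}
relationship = build_target_data_relationship(internal_categories, external_categories)
print(relationship)  # {'x1': [('A', 'x11'), ('B', 'x12')]}
```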
[0093] For example, combining Figure 5 and Figure 6 yields the implementation process of a data processing method provided in an exemplary embodiment of this specification, as shown in Figure 7. As shown in Figure 7, when the target external dataset collected from external data sources A, B, and C includes four target external data, namely target external data 1, target external data 2, target external data 3, and target external data 4, these four target external data can be classified and stored, according to S302 to S306, in the storage space of the external metadata associated with each internal metadata. This makes it convenient for employees within the company to use these target external data jointly with internal data or to use them directly within the company.
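A minimal sketch of the classified storage step follows, assuming an in-memory dictionary as the "storage space" and tuples of (source, external metadata, payload) as target external data. The payload names and the handling of unmatched items are hypothetical, since the text does not say which metadata each of the four target external data carries or how unmatched data is treated.

```python
from collections import defaultdict

# Target data relationship for class X, as in the Figure 6 example.
TARGET_DATA_RELATIONSHIP = {"x1": [("A", "x11"), ("B", "x12")]}


def classify_and_store(target_external_data, relationship):
    """Store each item in the storage space of its (source, external metadata).

    target_external_data: iterable of (source, external_metadata, payload).
    Returns (storage, unmatched), where storage has the shape
    {internal_metadata: {(source, external_metadata): [payload, ...]}}.
    """
    # Reverse index: (source, external metadata) -> internal metadata.
    owner = {
        key: internal
        for internal, keys in relationship.items()
        for key in keys
    }
    storage = {internal: defaultdict(list) for internal in relationship}
    unmatched = []
    for source, metadata, payload in target_external_data:
        internal = owner.get((source, metadata))
        if internal is None:
            # No association exists yet; set aside instead of guessing.
            unmatched.append((source, metadata, payload))
        else:
            storage[internal][(source, metadata)].append(payload)
    return storage, unmatched


dataset = [
    ("A", "x11", "item-1"),
    ("B", "x12", "item-2"),
    ("C", "z11", "item-3"),  # z11 has no association in this small example
]
storage, unmatched = classify_and_store(dataset, TARGET_DATA_RELATIONSHIP)
print(dict(storage["x1"]))
print(unmatched)
```

Items whose external metadata has no associated internal metadata are returned separately here. In practice they might be routed to manual review, which is a design choice of this sketch rather than something the text specifies.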
[0094] This embodiment extracts the target external metadata corresponding to each target external data in a target external dataset collected from an external data source, where the target external dataset includes at least one target external data. Based on that target external metadata, the at least one target external data is classified and stored in the storage space of the external metadata associated with each internal metadata according to the target data relationship, which characterizes the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source. Thus, through this correspondence (the target data relationship), the external definition standards (external metadata) for the external data of each external data source are associated with the internal definition standards (internal metadata) for the company's internal data. The target external data can therefore be classified and stored directly in the storage space of the external metadata associated with each internal metadata, reducing the need to manually label and classify external data from the various data sources. This saves considerable manpower in processing external data and enables external data from external data sources to be classified and stored more efficiently and accurately according to internal data classification requirements. It also ensures that the classified and stored external data can be used jointly with internal data or directly within the company.
[0095] Please refer to Figure 8, which is a flowchart illustrating another data processing method provided in an exemplary embodiment of this specification. As shown in Figure 8, this data processing method includes the following steps:
[0096] S802, collect the target external dataset from an external data source.
[0097] Specifically, S802 is the same as S302, and will not be repeated here.
[0098] S804, extract the target external metadata corresponding to each target external data in the target external dataset.
[0099] Specifically, S804 is the same as S304, and will not be repeated here.
[0100] S806, based on the target external metadata corresponding to each target external data in the target external dataset, classify and store at least one target external data into the storage space of the external metadata associated with each internal metadata according to the target data relationship.
[0101] Specifically, S806 is the same as S306, which will not be repeated here.
[0102] S808, receive a call instruction for calling the target external data.
[0103] Specifically, when an employee within the company wants to call target external data, the employee's client can be triggered to send a call instruction for the target external data to the internal server 120 via the network, and the internal server 120 receives the call instruction sent by the employee's client. The call instruction can carry information such as the category and identifier of the target external data that the employee needs to call.
[0104] S810, in response to a call instruction, calls the target external data in the storage space of the corresponding target external metadata based on the internal metadata associated with the target external metadata.
[0105] Specifically, after receiving the call instruction, the internal server 120 can respond by finding the internal metadata of the corresponding category based on the information about the target external data carried in the call instruction, and then, among the target external metadata associated with that internal metadata, calling the target external data from the storage space of the corresponding target external metadata.
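The lookup performed in S810 can be illustrated with the sketch below. The instruction fields `"category"` and `"identifier"` follow the information the call instruction is said to carry, but the field names, the function name, and the dictionary-based storage are all assumptions made for illustration.

```python
def handle_call_instruction(instruction, category_to_internal, relationship, storage):
    """Return the stored target external data that matches a call instruction.

    instruction: {"category": ..., "identifier": ...}  (hypothetical field names)
    category_to_internal: {category: internal_metadata}
    relationship: {internal_metadata: [external_metadata, ...]}
    storage: {external_metadata: {identifier: payload}}
    """
    # Step 1: find the internal metadata of the requested category.
    internal = category_to_internal[instruction["category"]]
    # Step 2: search the storage spaces of the associated external metadata.
    for external in relationship[internal]:
        space = storage.get(external, {})
        if instruction["identifier"] in space:
            return space[instruction["identifier"]]
    return None  # nothing stored under that identifier


category_to_internal = {"X": "x1"}
relationship = {"x1": ["x11", "x12"]}
storage = {
    "x11": {"d1": "payload from A"},
    "x12": {"d2": "payload from B"},
}
result = handle_call_instruction(
    {"category": "X", "identifier": "d2"},
    category_to_internal,
    relationship,
    storage,
)
print(result)  # payload from B
```

Because the lookup goes through the internal metadata, the caller only needs to know the company's own category, not how each external data source names it.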
[0106] This embodiment of the specification extracts the target external metadata corresponding to each target external data in a target external dataset collected from an external data source. The target external dataset includes at least one target external data. Based on the target external metadata corresponding to each target external data in the target external dataset, the at least one target external data is classified and stored in the storage space of the external metadata associated with each internal metadata according to the target data relationship. It can also receive a call instruction to call the target external data and, in response to the call instruction, call the target external data in the storage space of the corresponding target external metadata based on the internal metadata associated with the target external metadata. This associates the external definition standard (external metadata) corresponding to the external data of each external data source with the internal definition standard (internal metadata) corresponding to the company's internal data. This not only achieves more efficient and accurate classification and associated storage of external data from external data sources according to internal data classification requirements, but also enables employees within the company to use the collected target external data more efficiently.
[0107] Please refer to Figure 9, which shows a data processing apparatus 900 provided in an exemplary embodiment of this specification. The data processing apparatus 900 includes:
[0108] The acquisition module 910 is used to acquire a target external dataset from an external data source; the target external dataset includes at least one target external data.
[0109] The first extraction module 920 is used to extract the target external metadata corresponding to each target external data in the above target external dataset;
[0110] The classification storage module 930 is used to classify and store at least one target external data into the storage space of the external metadata associated with each internal metadata, based on the target external metadata corresponding to each target external data in the target external dataset and according to the target data relationship; the target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source.
[0111] In one possible implementation, the data processing apparatus 900 further includes:
[0112] The second extraction module is used to extract the internal metadata corresponding to each internal data in the internal data assets, and to extract the external metadata corresponding to each external data in the external data source.
[0113] The generation module is used to establish the association between the aforementioned internal metadata and the aforementioned external metadata, and to generate the target data relationship.
[0114] In one possible implementation, the second extraction module is specifically used to: extract the external metadata corresponding to each external data from the external data source based on the preset data source template; the preset data source templates represent the external data standards that different external data sources define for different industries.
[0115] In one possible implementation, the aforementioned target external dataset comes from one or more of the aforementioned external data sources.
[0116] In one possible implementation, the aforementioned target external data includes at least one of the following: text data, audio data, video data, and image data.
[0117] In one possible implementation, the aforementioned internal metadata and the corresponding associated external metadata belong to the same category.
[0118] In one possible implementation, the data processing apparatus 900 further includes:
[0119] The receiving module is used to receive the calling instruction for invoking the aforementioned target external data;
[0120] The calling module is used to respond to the above calling instruction and, based on the internal metadata associated with the above target external metadata, call the above target external data in the storage space of the corresponding target external metadata.
[0121] The division of modules in the above-described data processing device is for illustrative purposes only. In other embodiments, the data processing device can be divided into different modules as needed to complete all or part of the functions of the above-described data processing device. The implementation of each module in the data processing device provided in the embodiments of this specification can be in the form of a computer program. This computer program can run on a terminal or server. The program modules constituted by this computer program can be stored in the memory of the terminal or server. When the computer program is executed by a processor, it implements all or part of the steps of the data processing method described in the embodiments of this specification.
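As one hypothetical way to realize the module division as a computer program, the sketch below composes the acquisition module 910, the first extraction module 920, and the classification storage module 930 into a single class. The class name, field names, and callable signatures are illustrative assumptions, and other divisions are equally possible, as the paragraph above notes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


@dataclass
class DataProcessingApparatus:
    """Hypothetical composition of modules 910, 920 and 930."""

    acquire: Callable[[], Iterable[Record]]  # acquisition module 910
    extract: Callable[[Record], str]         # first extraction module 920
    relationship: Dict[str, List[str]]       # internal -> external metadata
    storage: Dict[str, List[Record]] = field(default_factory=dict)

    def classify_and_store(self, record: Record, external: str) -> bool:
        """Classification storage module 930: store under the external metadata."""
        if not any(external in linked for linked in self.relationship.values()):
            return False  # no internal metadata is associated with it
        self.storage.setdefault(external, []).append(record)
        return True

    def run(self) -> int:
        """Run acquisition, extraction and classified storage; return stored count."""
        stored = 0
        for record in self.acquire():
            if self.classify_and_store(record, self.extract(record)):
                stored += 1
        return stored


apparatus = DataProcessingApparatus(
    acquire=lambda: [{"id": "d1", "meta": "x11"}, {"id": "d2", "meta": "q99"}],
    extract=lambda record: record["meta"],
    relationship={"x1": ["x11", "x12"]},
)
stored_count = apparatus.run()
print(stored_count)  # 1
```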
[0122] Please refer to Figure 10, which is a schematic diagram of the structure of an electronic device provided in an exemplary embodiment of this specification. As shown in Figure 10, the electronic device 1000 may include: at least one processor 1010, at least one communication bus 1020, a user interface 1030, at least one network interface 1040, and a memory 1050. The communication bus 1020 can be used to enable communication between the aforementioned components.
[0123] The user interface 1030 may include a display screen and a camera. Optionally, the user interface 1030 may also include a standard wired interface and a wireless interface.
[0124] The network interface 1040 may optionally include a Bluetooth module, a Near Field Communication (NFC) module, a Wireless Fidelity (Wi-Fi) module, etc.
[0125] The processor 1010 may include one or more processing cores. The processor 1010 connects to the various parts of the electronic device 1000 using various interfaces and lines. By running or executing instructions, programs, code sets, or instruction sets stored in the memory 1050, and by calling data stored in the memory 1050, it executes the various functions of the electronic device 1000 and processes its data. Optionally, the processor 1010 may be implemented in at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 1010 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and a modem. The CPU primarily handles the operating system, user interface, and applications; the GPU is responsible for rendering and drawing the content to be displayed; and the modem handles wireless communication. It can be understood that the modem may also be implemented as a separate chip rather than being integrated into the processor 1010.
[0126] The memory 1050 may include random access memory (RAM) or read-only memory (ROM). Optionally, the memory 1050 may include a non-transitory computer-readable medium. The memory 1050 can be used to store instructions, programs, code, code sets, or instruction sets. The memory 1050 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as acquisition function, extraction function, classification storage function, etc.), instructions for implementing the above-described method embodiments, etc.; the data storage area may store data involved in the above-described method embodiments, etc. Optionally, the memory 1050 may also be at least one storage device located remotely from the aforementioned processor 1010. As shown in Figure 10, the memory 1050, which serves as a computer storage medium, may include an operating system, a network communication module, a user interface module, and program instructions.
[0127] Specifically, the processor 1010 can be used to call program instructions stored in the memory 1050 and perform the following operations:
[0128] Collect target external datasets from external data sources; the target external datasets include at least one target external data.
[0129] Extract the target external metadata corresponding to each target external data in the above target external dataset.
[0130] Based on the target external metadata corresponding to each target external data in the aforementioned target external dataset, at least one target external data is classified and stored in the storage space of the external metadata associated with each internal metadata according to the target data relationship; the aforementioned target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the aforementioned external data source.
[0131] In some possible embodiments, before the processor 1010 executes the classification and storage of at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset, according to the target data relationship, it is further configured to execute:
[0132] Extract internal metadata corresponding to each internal data in the internal data assets, and extract external metadata corresponding to each external data in the external data sources.
[0133] Establish the association between the aforementioned internal metadata and the aforementioned external metadata to generate the target data relationship.
[0134] In some possible embodiments, when the processor 1010 extracts the external metadata corresponding to each external data in the external data source, it is specifically used to perform:
[0135] Based on the preset data source template, extract the external metadata corresponding to each external data from the external data source; the preset data source templates represent the external data standards that different external data sources define for different industries.
[0136] In some possible embodiments, the target external dataset described above comes from one or more of the aforementioned external data sources.
[0137] In some possible embodiments, the aforementioned target external data includes at least one of the following: text data, audio data, video data, and image data.
[0138] In some possible embodiments, the aforementioned internal metadata and the external metadata associated with the aforementioned internal metadata are of the same category.
[0139] In some possible embodiments, after the processor 1010 executes the classification and storage of at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset, it is further configured to execute:
[0140] Receive a call instruction for invoking the aforementioned target external data.
[0141] In response to the above invocation instruction, based on the internal metadata associated with the above target external metadata, the above target external data in the storage space of the corresponding target external metadata is invoked.
[0142] This specification also provides a computer-readable storage medium storing instructions that, when executed on a computer or processor, cause the computer or processor to perform one or more steps in the above embodiments. If the constituent modules of the above data processing apparatus are implemented as software functional units and sold or used as independent products, they can be stored in the above-described computer-readable storage medium.
[0143] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions. When these computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this specification are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in or transmitted through a computer-readable storage medium. The computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer or a data storage device such as a server or data center that integrates one or more available media. The aforementioned available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., Digital Versatile Discs (DVDs)), or semiconductor media (e.g., Solid State Disks (SSDs)).
[0144] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. This program can be stored in a computer-readable storage medium, and when executed, it can include the processes of the embodiments of the methods described above. The aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks. Unless otherwise specified, the technical features of this embodiment and its implementation can be combined arbitrarily.
[0145] The embodiments described above are merely preferred embodiments of this specification and are not intended to limit the scope of this specification. Any modifications and improvements made by those skilled in the art to the technical solutions of this specification without departing from the spirit of this specification should fall within the protection scope defined by the claims.
[0146] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recorded in the claims and specification may be performed in a different order than in the embodiments described in the specification and still achieve the desired results. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
Claims
1. A data processing method, the method comprising: Collect target external datasets from external data sources; The target external dataset includes at least one target external data; According to the definition criteria of the target external data from each of the external data sources, extract the target external metadata corresponding to each target external data in the target external dataset; Based on the target external metadata corresponding to each target external data in the target external dataset, the at least one target external data is classified and stored in the storage space of the external metadata associated with each internal metadata according to the target data relationship; The target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source; After classifying and storing at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset according to the target data relationship, the method further includes: Receive a call instruction for invoking the target external data; In response to the call instruction, the target external data in the storage space corresponding to the target external metadata is invoked based on the internal metadata associated with the target external metadata.
2. The method as described in claim 1, wherein before classifying and storing the at least one target external data into the storage space of the external metadata associated with each internal metadata based on the target external metadata corresponding to each target external data in the target external dataset according to the target data relationship, the method further includes: Extract the internal metadata corresponding to each internal data in the internal data assets, and extract the external metadata corresponding to each external data in the external data sources; Establish the association between the internal metadata and the external metadata to generate the target data relationship.
3. The method as described in claim 2, wherein extracting the external metadata corresponding to each external data in the external data source includes: Based on the preset data source template, extract the external metadata corresponding to each external data from the external data source; The preset data source template represents the external data standards defined by different industries corresponding to different external data sources.
4. The method according to any one of claims 1-3, wherein the target external dataset comes from one or more of the external data sources.
5. The method according to any one of claims 1-3, wherein the target external data includes at least one of the following: text data, audio data, video data, and image data.
6. The method according to any one of claims 1-3, wherein the internal metadata and the external metadata associated with the internal metadata are of the same category.
7. A data processing apparatus, the apparatus comprising: The acquisition module is used to acquire a target external dataset from an external data source; the target external dataset includes at least one target external data; The first extraction module is used to extract the target external metadata corresponding to each target external data in the target external dataset according to the definition standards of each external data source for the target external data; The classification storage module is used to classify and store at least one target external data into the storage space of the external metadata associated with each internal metadata, based on the target external metadata corresponding to each target external data in the target external dataset and according to the target data relationship; The target data relationship is used to characterize the correspondence between each internal metadata and the external metadata corresponding to each external data in the external data source; The data processing apparatus further includes: The receiving module is used to receive a call instruction for invoking the target external data; The calling module is used to respond to the call instruction and, based on the internal metadata associated with the target external metadata, call the target external data in the storage space corresponding to the target external metadata.
8. An electronic device comprising: Processor and memory; The processor is connected to the memory; The memory is used to store executable program code; The processor runs a program corresponding to the executable program code stored in the memory to perform the method as described in any one of claims 1-6.
9. A computer storage medium storing a plurality of instructions adapted for loading by a processor and performing the method steps of any one of claims 1-6.
10. A computer program product comprising instructions that, when run on a computer or processor, cause the computer or processor to perform the data processing method as described in any one of claims 1-6.