Data processing method and apparatus
By performing deduplication and field matching on the data dashboard's data sources, the problem of low data source replacement efficiency was solved, and an efficient and accurate data source replacement process was achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING DAJIA INTERNET INFORMATION TECH CO LTD
- Filing Date
- 2022-09-14
- Publication Date
- 2026-06-12
Description
Technical Field
[0001] This disclosure relates to the field of data processing, and more particularly to a data processing method and apparatus.
Background Technology
[0002] Data dashboards are an important form of data presentation. The content of a data dashboard consists of multiple data components, each corresponding to one or more data sources. The components visually present the source data in a style that is easy to analyze.
[0003] When configuring and maintaining a data dashboard, it is sometimes necessary to replace its data sources. The common approach at present is to replace the data source of each component in the dashboard one by one; that is, for every component, the fields of the original data source must be matched against the fields of the target data source. However, different components often share the same data source. Replacing data sources component by component is acceptable when only a single component is involved, but when many components need replacement, the same field matching is repeated many times and replacement efficiency becomes very low.
Summary of the Invention
[0004] This disclosure provides a data processing method and apparatus to at least solve the problem of low efficiency in data source replacement in related technologies.
[0005] According to a first aspect of the present disclosure, a data processing method is provided, comprising: determining at least one data source to be replaced corresponding to a data dashboard; performing deduplication processing on the at least one data source to be replaced to obtain a set of data sources to be replaced; determining a set of target data sources corresponding to the set of data sources to be replaced, wherein the set of target data sources includes, for each data source to be replaced in the set of data sources to be replaced, the target data source that will replace it; determining a field correspondence between the fields of each data source to be replaced in the set of data sources to be replaced and the fields of its corresponding target data source in the set of target data sources; and replacing, based on the field correspondence, the at least one data source to be replaced with the corresponding target data source in the set of target data sources.
[0006] Optionally, determining the field correspondence between the fields of the data source to be replaced in the set of data sources to be replaced and the fields of the target data source corresponding to the data source to be replaced in the set of target data sources includes: determining, based on a predetermined matching rule, a first field in the data source to be replaced in the set of data sources to be replaced, and determining the correspondence between the first field and a predetermined field as a first field correspondence, wherein the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field; and determining the field correspondence based on the first field correspondence.
[0007] Optionally, based on a predetermined matching rule, determining the first field in the data source to be replaced in the data source to be replaced set includes: determining the first sub-field in the data source to be replaced ...
[0008] Optionally, based on the first field correspondence, the field correspondence is determined, including: for the second field other than the first field in the data source to be replaced in the data source to be replaced set: in response to the user instruction, the matching field of the second field is selected in the target data source corresponding to the data source to be replaced in the target data source set, and the correspondence between the second field and the matching field is determined as the second field correspondence; the first field correspondence and the second field correspondence are used as the field correspondence.
[0009] Optionally, selecting, in response to a user instruction, a matching field for the second field from the target data source corresponding to the data source to be replaced in the target data source set includes: displaying a matching entry for each second field, wherein the matching entry is associated with the fields in the target data source corresponding to the data source to be replaced in the target data source set; receiving, through the matching entry, a field selected by the user from the fields associated with the matching entry; and using the selected field as the matching field for the second field.
[0010] Optionally, based on the field correspondence, at least one data source to be replaced is replaced with the corresponding target data source in the target data source set, including: based on the first field correspondence, replacing the first field in at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set; based on the second field correspondence, replacing the second field in at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set.
[0011] Optionally, replacing, based on the field correspondence, at least one data source to be replaced with the corresponding target data source in the target data source set includes: determining, based on the field correspondence, at least one unmatched field, the unmatched field being a field of the data source to be replaced for which no matching field exists in the corresponding target data source in the target data source set; and discarding the relevant information of the unmatched field.
[0012] Optionally, before discarding the information related to the unmatched field, the method further includes: displaying alarm information indicating the discarding of the information related to the unmatched field, wherein the alarm information includes the unmatched field.
[0013] According to a second aspect of the present disclosure, a data processing apparatus is provided, comprising: a data source acquisition unit configured to determine at least one data source to be replaced corresponding to a data dashboard; a deduplication unit configured to perform deduplication processing on the at least one data source to be replaced to obtain a set of data sources to be replaced; a target data source determination unit configured to determine a set of target data sources corresponding to the set of data sources to be replaced, wherein the set of target data sources includes, for each data source to be replaced in the set of data sources to be replaced, the target data source that will replace it; a relationship determination unit configured to determine the field correspondence between the fields of each data source to be replaced in the set of data sources to be replaced and the fields of its corresponding target data source in the set of target data sources; and a replacement unit configured to replace, based on the field correspondence, the at least one data source to be replaced with the corresponding target data source in the set of target data sources.
[0014] Optionally, the relationship determination unit is further configured to determine, based on a predetermined matching rule, a first field in the data source to be replaced in the set of data sources to be replaced, and to determine the correspondence between the first field and a predetermined field as a first field correspondence, wherein the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field; and to determine the field correspondence based on the first field correspondence.
[0015] Optionally, the relationship determination unit is further configured to: determine a first sub-field in the data source to be replaced in the set of data sources to be replaced based on a first predetermined matching rule, wherein the first sub-field is a field whose field identifier is the same as that of a field in the target data source corresponding to the data source to be replaced in the set of target data sources; determine a second sub-field among other fields based on a second predetermined matching rule, wherein the other fields are the fields in the data source to be replaced in the set of data sources to be replaced other than the first sub-field, and the second sub-field is a field that fuzzily matches a field in the target data source corresponding to the data source to be replaced in the set of target data sources; and use the first sub-field and the second sub-field as the first field.
[0016] Optionally, the relationship determination unit is further configured to, for the second field other than the first field in the data source to be replaced in the set of data sources to be replaced: in response to a user instruction, select the matching field of the second field in the target data source corresponding to the data source to be replaced in the set of target data sources, and determine the correspondence between the second field and the matching field as the second field correspondence; and take the first field correspondence and the second field correspondence as the field correspondence.
[0017] Optionally, the relationship determination unit is also configured to display a matching entry corresponding to each second field, wherein the matching entry is associated with a field in the target data source corresponding to the data source to be replaced in the target data source set; receive the field selected by the user from the field associated with the matching entry through the matching entry; and use the selected field as the matching field of the second field.
[0018] Optionally, the replacement unit is further configured to replace the first field in at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set, based on the first field correspondence; and to replace the second field in at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set, based on the second field correspondence.
[0019] Optionally, the replacement unit is also configured to determine, based on the field correspondence, at least one unmatched field, the unmatched field being a field of the data source to be replaced for which no matching field exists in the corresponding target data source in the target data source set; and to discard the relevant information of the unmatched field.
[0020] Optionally, the replacement unit is also configured to display alarm information indicating the discarding of information related to the unmatched field before discarding the information related to the unmatched field, wherein the alarm information includes the unmatched field.
[0021] According to a third aspect of the present disclosure, an electronic device is provided, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to execute instructions to implement a data processing method according to the present disclosure.
[0022] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided that, when instructions in the computer-readable storage medium are executed by at least one processor, causes at least one processor to perform the data processing method as described above according to the present disclosure.
[0023] According to a fifth aspect of the present disclosure, a computer program product is provided, including computer instructions that, when executed by a processor, implement the data processing method according to the present disclosure.
[0024] The technical solutions provided by the embodiments of this disclosure bring at least the following beneficial effects:
[0025] According to the data processing method and apparatus of this disclosure, instead of performing field matching of data sources for each component individually, all data sources to be replaced (at least one data source to be replaced) corresponding to all components are aggregated together for deduplication, resulting in a deduplicated set of data sources to be replaced. Field matching is then performed uniformly on this set of data sources to be replaced. Based on the matched field correspondences, the data sources to be replaced are replaced with the corresponding target data sources, avoiding repetitive matching operations and improving replacement efficiency. Therefore, this disclosure solves the problem of low data source replacement efficiency in related technologies.
[0026] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and are not intended to limit this disclosure.
Attached Figure Description
[0027] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure, and are not intended to unduly limit this disclosure.
[0028] Figure 1 is a schematic diagram illustrating an implementation scenario of a data processing method according to an exemplary embodiment of the present disclosure;
[0029] Figure 2 is a flowchart illustrating a data processing method according to an exemplary embodiment;
[0030] Figure 3 illustrates a data source replacement process according to an exemplary embodiment;
[0031] Figure 4 illustrates a data source replacement interface according to an exemplary embodiment;
[0032] Figure 5 illustrates a first field matching interface according to an exemplary embodiment;
[0033] Figure 6 illustrates a second field matching interface according to an exemplary embodiment;
[0034] Figure 7 is a block diagram illustrating a data processing apparatus according to an exemplary embodiment;
[0035] Figure 8 is a block diagram of an electronic device 800 according to an embodiment of the present disclosure.
Detailed Implementation
[0036] To enable those skilled in the art to better understand the technical solutions of this disclosure, the technical solutions in the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings.
[0037] It should be noted that the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this disclosure described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following examples do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.
[0038] It should be noted that the phrase "at least one of several items" in this disclosure refers to three parallel cases: "any one of the several items", "a combination of any number of the several items", and "all of the several items". For example, "including at least one of A and B" includes the following three parallel cases: (1) including A; (2) including B; (3) including A and B. As another example, "performing at least one of step one and step two" indicates the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing both step one and step two.
[0039] This disclosure provides a method for efficiently replacing the data source of a data dashboard. The following example illustrates the replacement scenario of the chart component of the data dashboard.
[0040] Figure 1 is a schematic diagram illustrating an implementation scenario of a data processing method according to an exemplary embodiment of the present disclosure. As shown in Figure 1, the implementation scenario includes server 100, user terminal 110, and user terminal 120. The number of user terminals is not limited to two, and the user terminals include, but are not limited to, devices such as mobile phones and personal computers. The user terminals can browse data dashboards. The server can be a single server, a server cluster composed of several servers, or a cloud computing platform or virtualization center.
[0041] User terminals 110 and 120 receive a user instruction, determine that the data sources of the chart components in the currently displayed data dashboard need to be replaced, and send the data source information corresponding to the chart components to server 100. Server 100 treats the data sources corresponding to the chart components as the data sources to be replaced, deduplicates them to obtain a set of data sources to be replaced, determines the target data source that will replace each data source to be replaced, and merges these target data sources into a set of target data sources. Server 100 then determines the field correspondence between the fields of each data source to be replaced in the set of data sources to be replaced and the fields of its corresponding target data source in the set of target data sources. Based on the field correspondence, it replaces all data sources to be replaced with the corresponding target data sources in the set of target data sources.
[0042] The data processing method and apparatus according to exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings.
[0043] Figure 2 is a flowchart illustrating a data processing method according to an exemplary embodiment. As shown in Figure 2, the data processing method includes the following steps:
[0044] In step S201, at least one data source to be replaced corresponding to the data dashboard is determined. The at least one data source to be replaced may be all of the data sources corresponding to the data dashboard, or only some of them. In either case, the at least one data source to be replaced covers every data source of the data dashboard that needs to be replaced.
[0045] In step S202, the at least one data source to be replaced is deduplicated to obtain a set of data sources to be replaced. For example, the set of data sources to be replaced can take the form of a data source list or any other form; this disclosure is not limited in this respect.
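Steps S201 and S202 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each component's configuration is a dictionary with a hypothetical `data_sources` key holding data source identifiers; the disclosure does not prescribe any particular representation.

```python
from typing import Dict, List


def collect_sources_to_replace(components: List[Dict]) -> List[str]:
    """Gather every data source referenced by the dashboard's components (S201)."""
    sources: List[str] = []
    for component in components:
        # "data_sources" is a hypothetical key used only for this sketch
        sources.extend(component.get("data_sources", []))
    return sources


def deduplicate(sources: List[str]) -> List[str]:
    """Drop repeated data sources while keeping first-seen order (S202)."""
    # dict preserves insertion order, so this yields a stable, duplicate-free list
    return list(dict.fromkeys(sources))


components = [
    {"name": "orders_chart", "data_sources": ["orders_v1"]},
    {"name": "revenue_chart", "data_sources": ["orders_v1", "payments_v1"]},
    {"name": "city_table", "data_sources": ["orders_v1"]},
]
to_replace = deduplicate(collect_sources_to_replace(components))
print(to_replace)  # ['orders_v1', 'payments_v1']
```

In this example three components reference four data source entries in total, but only two distinct data sources remain after deduplication, so field matching would need to be performed twice rather than once per component.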
[0046] In step S203, a target data source set corresponding to the set of data sources to be replaced is determined. The target data source set includes the target data source to which each data source to be replaced in the set of data sources to be replaced will be replaced. For example, in this step, the user can input an instruction to specify the target data source to which each data source to be replaced will be replaced. Alternatively, other methods can be used to determine the target data source corresponding to each data source to be replaced; this disclosure does not limit this method.
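One possible way to represent the target data source set of step S203 is a mapping from each data source to be replaced to the target chosen for it. The sketch below assumes the user's choices arrive as a plain dictionary (`user_choice` is a hypothetical name) and rejects an incomplete choice; other ways of determining the targets are equally compatible with the description.

```python
from typing import Dict, List


def build_target_set(to_replace: List[str], user_choice: Dict[str, str]) -> Dict[str, str]:
    """Pair every deduplicated data source with the target that will replace it (S203)."""
    missing = [source for source in to_replace if source not in user_choice]
    if missing:
        # every data source in the set needs a target before matching can start
        raise ValueError(f"no target data source specified for: {missing}")
    return {source: user_choice[source] for source in to_replace}


targets = build_target_set(
    ["orders_v1", "payments_v1"],
    {"orders_v1": "orders_v2", "payments_v1": "payments_v2"},
)
print(targets)  # {'orders_v1': 'orders_v2', 'payments_v1': 'payments_v2'}
```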
[0047] In step S204, the field correspondence between the fields of the data source to be replaced in the data source to be replaced set and the fields of the target data source corresponding to the data source to be replaced in the target data source set is determined.
[0048] According to an exemplary embodiment of this disclosure, determining the field correspondence between the fields of the data source to be replaced in the set of data sources to be replaced and the fields of the target data source corresponding to the data source to be replaced in the set of target data sources can be achieved as follows: based on a predetermined matching rule, a first field in the data source to be replaced in the set of data sources to be replaced is determined, and the correspondence between the first field and a predetermined field is determined as a first field correspondence. Here, the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field. The field correspondence is then determined based on the first field correspondence. According to this embodiment, automatically matching a subset of fields using predetermined matching rules can improve matching efficiency.
[0049] For example, the aforementioned pre-defined matching rules can be set based on user needs. For each data source to be replaced, the fields of the data source to be replaced and the corresponding target data source can be compared according to the target data source specified by the user. The fields can be automatically matched according to the pre-defined matching rules (such as matching by field name or fuzzy matching). For fields that cannot be automatically matched, users can also manually identify and associate fields with unrelated names but the same actual meaning.
[0050] According to an exemplary embodiment of this disclosure, determining a first field in a data source to be replaced within the set of data sources to be replaced based on predetermined matching rules may include: determining a first sub-field in the data source to be replaced based on a first predetermined matching rule, wherein the first sub-field is a field whose field identifier is the same as that of a field in the target data source corresponding to the data source to be replaced in the target data source set; determining a second sub-field among other fields based on a second predetermined matching rule, wherein the other fields are the fields in the data source to be replaced other than the first sub-field, and the second sub-field is a field that fuzzily matches a field in the target data source corresponding to the data source to be replaced in the target data source set; and using the first sub-field and the second sub-field as the first field. According to this embodiment, fields with the same field identifier are matched first, and fields with different field identifiers are then matched using other predetermined matching rules, which simplifies the matching operation and improves matching efficiency.
[0051] For example, the field identifier mentioned above can be a field name. In this case, the first predetermined matching rule can match fields whose names are identical, and the second predetermined matching rule can match fields whose names indicate the same meaning. Specifically, first identify the fields that have the same name in the data source to be replaced and in the target data source, and use these fields as the first field. Then use rules to match fields in the data source to be replaced and the target data source that may have the same meaning, such as names with the same meaning in different languages (for example, a non-English field name meaning "city" and the English name "city") or names that match fuzzily ("city_id" and "cityid", "city_" and "city"), and use these fields as the first field as well.
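The two-stage automatic matching described above can be sketched as follows. The fuzzy rule here is an assumption chosen only to reproduce the "city_id"/"cityid" and "city_"/"city" examples (lower-casing and dropping separators); the disclosure does not specify a particular fuzzy algorithm, and cross-language matching, which would need something like a translation table, is left out. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def normalize(name: str) -> str:
    """Illustrative fuzzy key: lower-case and drop underscores, hyphens and spaces."""
    return "".join(ch for ch in name.lower() if ch not in "_- ")


def auto_match(source_fields: List[str], target_fields: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Return (first field correspondence, fields left for manual matching)."""
    matched: Dict[str, str] = {}
    available = list(target_fields)  # target fields not yet claimed by a match

    # First predetermined rule: identical field names
    for field in source_fields:
        if field in available:
            matched[field] = field
            available.remove(field)

    # Second predetermined rule: fuzzy match on the remaining fields
    for field in source_fields:
        if field in matched:
            continue
        key = normalize(field)
        for candidate in available:
            if normalize(candidate) == key:
                matched[field] = candidate
                available.remove(candidate)
                break  # stop after the first fuzzy hit for this field

    unmatched = [field for field in source_fields if field not in matched]
    return matched, unmatched


first_correspondence, remaining = auto_match(
    ["city_id", "city_", "amount", "gmv"],
    ["cityid", "city", "amount", "revenue"],
)
print(first_correspondence)
print(remaining)
```

Running exact-name matching before the fuzzy rule keeps an exact match from being taken by a merely similar name, which mirrors the priority order given in the description.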
[0052] According to an exemplary embodiment of this disclosure, determining a field correspondence based on a first field correspondence includes: for a second field other than the first field in the set of data sources to be replaced: in response to a user instruction, selecting a matching field of the second field in the target data source corresponding to the data source to be replaced in the set of target data sources, and determining the correspondence between the second field and the matching field as the second field correspondence; and using the first field correspondence and the second field correspondence as the field correspondence. According to this embodiment, after automatically matching a portion of fields using predetermined matching rules, the remaining fields that cannot be automatically matched can be manually matched in conjunction with user instructions, thereby handling field matching in various situations and covering various data source replacement scenarios.
[0053] For example, fields with the same name in both the data source to be replaced and the target data source can first be identified and used as the first field. Rules are then used to match fields in the data source to be replaced and the target data source that may have the same meaning, such as names with the same meaning in different languages (for example, a non-English field name meaning "city" and the English name "city") or names that match fuzzily ("city_id" and "cityid", "city_" and "city"). After this automatic matching is completed, some fields in the data source to be replaced may still be unmatched. For these fields, the user can manually identify and associate fields whose names are unrelated but whose actual meaning is the same.
[0054] According to an exemplary embodiment of this disclosure, in response to a user instruction, selecting a matching field for a second field from the target data sources corresponding to the data sources to be replaced in the target data source set includes: displaying a matching entry for each second field, wherein the matching entry is associated with a field in the target data source corresponding to the data source to be replaced in the target data source set; receiving a field selected by the user from the fields associated with the matching entry; and using the selected field as the matching field for the second field. According to this embodiment, setting a matching entry facilitates the user in selecting a matching field for the second field.
[0055] For example, if there are still unmatched fields in the data source to be replaced after the predetermined matching rules have been applied, a matching entry can be displayed. The matching entry is associated with the fields in the target data source corresponding to the data source to be replaced, allowing the user to select matching fields manually. For instance, the user can click the relevant button of the matching entry to open the field matching interface and select a field from the drop-down list on that interface. The backend then uses the selected field as the matching field for the second field.
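The matching entries and the handling of the user's selection might be modelled as below. This is a sketch under assumptions: a matching entry is represented as the list of target fields offered for one unmatched field, and the user's selections arrive as a dictionary; the actual interface is the one shown in the figures, and all names here are hypothetical.

```python
from typing import Dict, List


def build_matching_entries(second_fields: List[str], target_fields: List[str]) -> Dict[str, List[str]]:
    """Associate every unmatched (second) field with the selectable target fields."""
    return {field: list(target_fields) for field in second_fields}


def apply_user_selection(entries: Dict[str, List[str]], selections: Dict[str, str]) -> Dict[str, str]:
    """Turn the user's choices into the second field correspondence."""
    second_correspondence: Dict[str, str] = {}
    for field, chosen in selections.items():
        if field not in entries:
            raise KeyError(f"{field!r} has no matching entry")
        if chosen not in entries[field]:
            raise ValueError(f"{chosen!r} is not offered for {field!r}")
        second_correspondence[field] = chosen
    return second_correspondence


entries = build_matching_entries(["gmv", "legacy_flag"], ["cityid", "city", "amount", "revenue"])
second_correspondence = apply_user_selection(entries, {"gmv": "revenue"})
print(second_correspondence)  # {'gmv': 'revenue'}
```

Fields for which the user makes no selection, such as `legacy_flag` here, simply stay out of the second field correspondence.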
[0056] In step S205, based on the field correspondence, at least one data source to be replaced is replaced with the corresponding target data source in the target data source set.
[0057] According to an exemplary embodiment of this disclosure, replacing at least one data source to be replaced with a corresponding target data source in a set of target data sources based on field correspondence relationships includes: replacing a first field in at least one data source to be replaced with a field of the target data source corresponding to the data source to be replaced in the set of target data sources based on a first field correspondence relationship; and replacing a second field in at least one data source to be replaced with a field of the target data source corresponding to the data source to be replaced in the set of target data sources based on a second field correspondence relationship. According to this embodiment, performing field replacement based on respective field correspondence relationships can improve the accuracy of data source replacement.
[0058] For example, for fields with matching relationships, that is, fields with corresponding relationships, the fields from the target data source can be used to replace the fields from the data source to be replaced in the configuration information based on the field correspondence. Specifically, firstly, based on the first field correspondence, fields in the data source to be replaced that have the same name as fields in the target data source are replaced with fields in the target data source that have the same name. Then, based on the second field correspondence, fields in the data source to be replaced that have different names are replaced with fields in the target data source that are matched according to the second predetermined rule.
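As a rough illustration of replacing fields in the configuration information, the sketch below rewrites one component's configuration using the first and second field correspondences. The configuration layout (`data_source` and `fields` keys) and the function name are assumptions made for this example only.

```python
from typing import Dict, List


def replace_component_source(
    component: Dict,
    target_source: str,
    first_correspondence: Dict[str, str],
    second_correspondence: Dict[str, str],
) -> Dict:
    """Return a copy of the component configuration that points at the target data source."""
    new_fields: List[str] = []
    for field in component["fields"]:
        if field in first_correspondence:
            # field covered by the first field correspondence
            new_fields.append(first_correspondence[field])
        elif field in second_correspondence:
            # field covered by the second field correspondence
            new_fields.append(second_correspondence[field])
        # fields without any correspondence are omitted in this sketch
    return {**component, "data_source": target_source, "fields": new_fields}


component = {"name": "city_chart", "data_source": "orders_v1", "fields": ["amount", "city_id", "gmv"]}
updated = replace_component_source(
    component,
    "orders_v2",
    {"amount": "amount", "city_id": "cityid"},
    {"gmv": "revenue"},
)
print(updated)
```

Returning a new dictionary instead of mutating the input keeps the original configuration available if the replacement has to be abandoned.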
[0059] According to an exemplary embodiment of this disclosure, replacing at least one data source to be replaced with a corresponding target data source in the target data source set based on the field correspondence includes: determining, based on the field correspondence, at least one unmatched field, that is, a field of the data source to be replaced for which no matching field exists in the corresponding target data source; and discarding the relevant information of the unmatched field. According to this embodiment, fields in the data source to be replaced that are not covered by the field correspondence can be discarded. Since fields that still lack a field correspondence after automatic and manual matching are generally unnecessary, discarding them not only improves replacement efficiency but also reduces memory usage. For example, if no matching field can be found for a field of the data source to be replaced even after manual matching, the configuration information associated with that field can be discarded.
[0060] According to an exemplary embodiment of this disclosure, before discarding the information related to the unmatched field, an alarm message indicating the discarding of the unmatched field's information may be displayed, wherein the alarm message includes the unmatched field. According to this embodiment, the alarm message reminds the user of the fields to be discarded before discarding information, avoiding the discarding of necessary fields and preventing unnecessary trouble.
[0061] For example, before discarding the relevant information of the unmatched fields, an alarm message can be displayed on the user's visible display interface. The alarm message includes the unmatched fields to be discarded. At this time, the user can further identify whether the fields to be discarded are necessary fields. When there are necessary fields among the fields to be discarded, the user can manually retain the necessary fields. For example, after the data source to be replaced is replaced with the corresponding target data source, the necessary fields can be inserted into the corresponding target data source to avoid the necessary fields being discarded and causing unnecessary trouble.
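The warn-then-discard behaviour can be sketched as follows. It assumes the per-field configuration is a dictionary keyed by field name and that the alarm is delivered through a caller-supplied callback (`warn`, a hypothetical parameter); how the alarm is actually displayed, and how a user retains a necessary field, is left to the interface.

```python
from typing import Callable, Dict, List


def discard_unmatched(
    field_config: Dict[str, Dict],
    correspondence: Dict[str, str],
    warn: Callable[[List[str]], None],
) -> Dict[str, Dict]:
    """Drop configuration of unmatched fields, raising an alarm that names them first."""
    unmatched = [name for name in field_config if name not in correspondence]
    if unmatched:
        # the alarm lists the unmatched fields before anything is discarded
        warn(unmatched)
    # keep only matched fields, re-keyed by the target data source's field name
    return {
        correspondence[name]: settings
        for name, settings in field_config.items()
        if name in correspondence
    }


shown: List[List[str]] = []  # stands in for the alarm display in this example
config = {"city_id": {"axis": "x"}, "amount": {"axis": "y"}, "legacy_flag": {"filter": True}}
kept = discard_unmatched(config, {"city_id": "cityid", "amount": "amount"}, shown.append)
print(shown)  # [['legacy_flag']]
print(kept)   # {'cityid': {'axis': 'x'}, 'amount': {'axis': 'y'}}
```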
[0062] The following systematically explains how to replace the data sources corresponding to the data dashboard. Figure 3 illustrates a data source replacement process according to an exemplary embodiment. As shown in Figure 3, the process mainly includes the following steps:
[0063] First, identify all data sources used in the data dashboard and deduplicate them to obtain a data source list. Second, the user inputs a command specifying, for each data source in the list, the target data source that will replace it. Third, the system automatically matches fields by comparing the fields of the data source to be replaced with those of the target data source, according to the following rules, in order of priority:
[0064] 1) Match fields with the same field name;
[0065] 2) Use rules to match fields that may have the same meaning, such as names with the same meaning in different languages (for example, the Chinese word for "city" and the English "city") and fuzzy matches ("city_id" and "cityid"; "city_" and "city").
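The two rules and their priority can be sketched as follows. This is a minimal Python illustration rather than the disclosed implementation: the names `normalize` and `auto_match_fields`, the `SYNONYMS` table and its single entry, and the decision to treat two names as fuzzily equal when they agree after lower-casing and removing separators are all assumptions made for the example.

```python
import re
from typing import Dict, Iterable

# Hypothetical synonym table for cross-language names; the entry is illustrative only.
SYNONYMS = {"城市": "city"}


def normalize(name: str) -> str:
    """Lower-case a field name, map known synonyms, and strip separators."""
    name = name.strip().lower()
    name = SYNONYMS.get(name, name)
    # Remove underscores and any non-word characters (spaces, hyphens, dots).
    return re.sub(r"[\W_]+", "", name)


def auto_match_fields(source_fields: Iterable[str],
                      target_fields: Iterable[str]) -> Dict[str, str]:
    """Map source fields to target fields: exact names first, then fuzzy names."""
    sources = list(source_fields)
    targets = list(target_fields)
    target_set = set(targets)
    matched: Dict[str, str] = {}
    used = set()

    # Rule 1 (highest priority): identical field names.
    for field in sources:
        if field in target_set:
            matched[field] = field
            used.add(field)

    # Rule 2: names that become equal after normalization.
    by_key: Dict[str, str] = {}
    for target in targets:
        key = normalize(target)
        if target not in used and key:
            by_key.setdefault(key, target)
    for field in sources:
        if field in matched:
            continue
        candidate = by_key.get(normalize(field))
        if candidate is not None and candidate not in used:
            matched[field] = candidate
            used.add(candidate)
    return matched
```

Fields that neither rule can pair are simply absent from the returned mapping, which leaves them for manual matching.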
[0066] Then, for fields that cannot be automatically matched, users can manually identify them and associate fields with unrelated names but the same actual meaning.
[0067] Finally, for fields in the data source to be replaced that cannot be matched by manual identification, the configuration information related to that field can be discarded; for fields that have a matching relationship, the field of the target data source can be used to replace the field of the data source to be replaced in the configuration information.
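As a rough illustration of this final step, the sketch below applies a field correspondence to a component's configuration. The disclosure does not specify how configuration information is stored, so the list-of-bindings layout, the `"field"` key, and the function name `rewrite_bindings` are assumptions made purely for the example.

```python
from typing import Dict, List, Tuple


def rewrite_bindings(bindings: List[dict],
                     correspondence: Dict[str, str]) -> Tuple[List[dict], List[dict]]:
    """Return (kept, discarded) bindings after applying the field correspondence.

    Each binding is a hypothetical configuration entry that names the field it
    uses under the "field" key.
    """
    kept: List[dict] = []
    discarded: List[dict] = []
    for binding in bindings:
        target_field = correspondence.get(binding["field"])
        if target_field is None:
            # No matching relationship: the related configuration is dropped.
            discarded.append(binding)
        else:
            # Matching relationship: swap in the target data source's field.
            kept.append({**binding, "field": target_field})
    return kept, discarded
```

Returning the discarded entries alongside the kept ones makes it straightforward to report to the user what was dropped.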
[0068] After the above logic has been processed, the system can seamlessly carry out the replacement between every matched data source to be replaced and the target data source specified for it by the user.
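The benefit of deduplicating before replacing can be seen in a short Python sketch. It assumes, hypothetically, that each dashboard component is a dictionary with a `"sources"` list; the function names are likewise illustrative and not taken from the disclosure.

```python
from typing import Dict, List


def unique_sources(components: List[dict]) -> List[str]:
    """Collect each data source used by the dashboard once, in first-seen order."""
    return list(dict.fromkeys(
        source for component in components for source in component["sources"]))


def replace_sources(components: List[dict],
                    source_map: Dict[str, str]) -> List[dict]:
    """Swap every mapped source in every component; unmapped sources stay as they are."""
    return [
        {**component,
         "sources": [source_map.get(s, s) for s in component["sources"]]}
        for component in components
    ]
```

With three components that together reference two distinct sources four times, the user specifies two targets instead of four, and a single pass over the components applies the mapping everywhere.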
[0069] In practical applications, the above embodiments can provide an entry point for data source replacement on the product's data dashboard editing page. Clicking this entry point opens the interface shown in Figure 4. According to the user's instructions, a target data source is selected in turn for each data source in the deduplicated data source list. After the target data sources are selected, the system automatically matches the fields. When a field is not matched, the system prompts the user with information about the unmatched field, as shown in Figure 5. At this point, a corresponding matching entry is provided for each unmatched field. This matching entry is associated with the fields in the target data source corresponding to the data source to be replaced, allowing the user to manually select a field for matching. For example, the user can click the relevant button for this matching entry to access the field matching interface and select a field from the drop-down list on this interface, as shown in Figure 6. The backend will automatically use the selected field as the matching field for the second field. The replacement operation is performed when all fields are matched. If there are unmatched fields and the user, having been informed of the risks, still performs the replacement operation, the configuration information related to the unmatched fields will be discarded.
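The manual matching step can be modelled, very roughly, as building one list of selectable target fields per unmatched field and then merging the user's picks into the automatic result. The sketch below is an assumption-laden illustration: the function names, the plain-dictionary representation of a matching entry, and the rule that a pick is ignored unless it was offered are not taken from the disclosure.

```python
from typing import Dict, List


def build_matching_entries(source_fields: List[str],
                           target_fields: List[str],
                           auto_matched: Dict[str, str]) -> Dict[str, List[str]]:
    """For each source field left unmatched, list the target fields offered to the user."""
    return {field: list(target_fields)
            for field in source_fields if field not in auto_matched}


def apply_selections(auto_matched: Dict[str, str],
                     entries: Dict[str, List[str]],
                     selections: Dict[str, str]) -> Dict[str, str]:
    """Merge user picks into the correspondence, ignoring picks that were not offered."""
    merged = dict(auto_matched)
    for field, choice in selections.items():
        if field in entries and choice in entries[field]:
            merged[field] = choice
    return merged
```

Any field that is still missing from the merged mapping after this step is a candidate for the discard handling described above.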
[0070] In summary, when a large number of data sources in a data dashboard need to be replaced, this disclosure uses batch operations to significantly improve the efficiency of replacement operations. Furthermore, by combining automatic intelligent matching rules with manual matching, it handles field matching in various situations, covers the needs of all data replacement scenarios, and maximizes the security and accuracy of replacement results. This supports more flexible data source replacement methods, lowers the requirements placed on data sources in replacement operations, and is applicable to more situations.
[0071] Figure 7 is a block diagram illustrating a data processing apparatus according to an exemplary embodiment. Referring to Figure 7, the apparatus includes: a data source acquisition unit 70, a deduplication unit 72, a target data source determination unit 74, a relationship determination unit 76, and a replacement unit 78.
[0072] The data source acquisition unit 70 is configured to determine at least one data source to be replaced corresponding to the data dashboard; the deduplication unit 72 is configured to perform deduplication processing on at least one data source to be replaced to obtain a set of data sources to be replaced; the target data source determination unit 74 is configured to determine the target data source set corresponding to the set of data sources to be replaced, wherein the target data source set includes the target data source to which each data source to be replaced in the set of data sources to be replaced will be replaced; the relationship determination unit 76 is configured to determine the field correspondence between the fields of the data sources to be replaced in the set of data sources to be replaced and the fields of the target data sources corresponding to the data sources to be replaced in the set of target data sources; and the replacement unit 78 is configured to replace at least one data source to be replaced with the corresponding target data source in the set of target data sources based on the field correspondence.
[0073] According to an exemplary embodiment of this disclosure, the relationship determination unit 76 is further configured to determine a first field in the data source to be replaced in the set of data sources to be replaced based on a predetermined matching rule, and to determine the correspondence between the first field and a predetermined field as a first field correspondence, wherein the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field; and to determine the field correspondence based on the first field correspondence.
[0074] According to an exemplary embodiment of this disclosure, the relationship determination unit 76 is further configured to: determine a first sub-field in the data source to be replaced in the set of data sources to be replaced based on a first predetermined matching rule, wherein the first sub-field is a field with the same field identifier as a field in the target data source corresponding to the data source to be replaced in the set of target data sources; determine a second sub-field among other fields based on a second predetermined matching rule, wherein the other fields are fields in the data source to be replaced in the set of data sources to be replaced other than the first sub-field, and the second sub-field is a field that is fuzzily matched to a field in the target data source corresponding to the data source to be replaced in the set of target data sources; and use the first sub-field and the second sub-field as the first field.
[0075] According to an exemplary embodiment of this disclosure, the relationship determination unit 76 is further configured to, for a second field other than the first field in the data source to be replaced in the set of data sources to be replaced, in response to a user instruction, select a matching field of the second field in the target data source corresponding to the data source to be replaced in the set of target data sources, and determine the correspondence between the second field and the matching field as the second field correspondence; and take the first field correspondence and the second field correspondence as the field correspondence.
[0076] According to an exemplary embodiment of this disclosure, the relationship determination unit 76 is further configured to display a matching entry corresponding to each second field, wherein the matching entry is associated with the fields in the target data source corresponding to the data source to be replaced in the target data source set; receive, through the matching entry, a field selected by the user from the fields associated with the matching entry; and use the selected field as the matching field of the second field.
[0077] According to an exemplary embodiment of this disclosure, the replacement unit 78 is further configured to replace a first field in at least one data source to be replaced with a field of a target data source corresponding to the data source to be replaced in the target data source set, based on a first field correspondence; and to replace a second field in at least one data source to be replaced with a field of a target data source corresponding to the data source to be replaced in the target data source set, based on a second field correspondence.
[0078] According to an exemplary embodiment of this disclosure, the replacement unit 78 is further configured to determine, based on the field correspondence, at least one unmatched field in the at least one data source to be replaced for which no matching field exists in the target data source corresponding to the data source to be replaced in the target data source set; and discard the relevant information of the unmatched field.
[0079] According to an exemplary embodiment of this disclosure, the replacement unit 78 is further configured to display alarm information indicating the discarding of information related to the unmatched field before discarding the information related to the unmatched field, wherein the alarm information includes the unmatched field.
[0080] According to embodiments of this disclosure, an electronic device may be provided. Figure 8 is a block diagram of an electronic device 800 according to an embodiment of the present disclosure. The electronic device includes at least one memory 801 and at least one processor 802. The at least one memory stores a set of computer-executable instructions. When the set of computer-executable instructions is executed by the at least one processor, a data processing method according to an embodiment of the present disclosure is performed.
[0081] As an example, electronic device 800 may be a PC, tablet, personal digital assistant, smartphone, or other device capable of executing the aforementioned set of instructions. Here, electronic device 800 is not necessarily a single electronic device, but may be any collection of devices or circuits capable of executing the aforementioned instructions (or instruction sets) individually or in combination. Electronic device 800 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interconnects locally or remotely (e.g., via wireless transmission) through an interface.
[0082] In electronic device 800, processor 802 may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a dedicated processor system, a microcontroller, or a microprocessor. By way of example and not limitation, processor 802 may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, etc.
[0083] The processor 802 can execute instructions or code stored in memory, wherein memory 801 can also store data. Instructions and data can also be sent and received over a network via a network interface device, wherein the network interface device can employ any known transmission protocol.
[0084] The memory 801 may be integrated with the processor 802, for example, by placing RAM or flash memory within an integrated circuit microprocessor. Alternatively, the memory 801 may include a separate device, such as an external disk drive, a storage array, or other storage device usable by any database system. The memory 801 and the processor 802 may be operatively coupled, or may communicate with each other, for example, via I / O ports, network connections, etc., enabling the processor 802 to read files stored in the memory 801.
[0085] In addition, the electronic device 800 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device can be interconnected via a bus and / or network.
[0086] According to embodiments of this disclosure, a computer-readable storage medium may also be provided, wherein when instructions in the computer-readable storage medium are executed by at least one processor, the at least one processor is caused to perform the data processing method of the embodiments of this disclosure. Examples of computer-readable storage media include: read-only memory (ROM), random access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card storage (such as multimedia cards, secure digital (SD) cards, or extreme digital (XD) cards), magnetic tape, floppy disk, magneto-optical data storage device, optical data storage device, hard disk, solid-state drive, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide the computer program and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the computer program. The computer program in the aforementioned computer-readable storage medium can run in an environment deployed in computer devices such as clients, hosts, agent devices, servers, etc. Furthermore, in one example, the computer program and any associated data, data files, and data structures are distributed across a networked computer system, such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed manner through one or more processors or computers.
[0087] According to an embodiment of this disclosure, a computer program product is provided, including computer instructions, which, when executed by a processor, implement the data processing method of the embodiment of this disclosure.
[0088] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0089] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. A data processing method, characterized by comprising: determining at least one data source to be replaced corresponding to a data dashboard; performing deduplication processing on the at least one data source to be replaced to obtain a set of data sources to be replaced; determining a target data source set corresponding to the set of data sources to be replaced, wherein the target data source set includes the target data source to which each data source to be replaced in the set of data sources to be replaced will be replaced; determining a field correspondence between the fields of the data source to be replaced in the set of data sources to be replaced and the fields of the target data source corresponding to the data source to be replaced in the set of target data sources; and replacing, based on the field correspondence, the at least one data source to be replaced with the corresponding target data source in the target data source set; wherein determining the field correspondence between the fields of the data source to be replaced in the set of data sources to be replaced and the fields of the target data source corresponding to the data source to be replaced in the set of target data sources includes: determining a first field in the data source to be replaced in the set of data sources to be replaced based on a predetermined matching rule, and determining the correspondence between the first field and a predetermined field as a first field correspondence, wherein the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field; and determining the field correspondence based on the first field correspondence;
wherein determining the field correspondence based on the first field correspondence includes: for a second field other than the first field in the data source to be replaced in the set of data sources to be replaced, in response to a user instruction, selecting a matching field of the second field in the target data source corresponding to the data source to be replaced in the set of target data sources, and determining the correspondence between the second field and the matching field as a second field correspondence; and taking the first field correspondence and the second field correspondence as the field correspondence.
2. The data processing method as described in claim 1, characterized in that determining the first field in the data source to be replaced in the set of data sources to be replaced based on a predetermined matching rule includes: determining, based on a first predetermined matching rule, a first sub-field in the data source to be replaced in the set of data sources to be replaced, wherein the first sub-field is a field that has the same field identifier as a field in the target data source corresponding to the data source to be replaced in the set of target data sources; determining, based on a second predetermined matching rule, a second sub-field among the other fields, wherein the other fields are the fields in the data source to be replaced in the set of data sources to be replaced other than the first sub-field, and the second sub-field is a field that is fuzzily matched to a field in the target data source corresponding to the data source to be replaced in the set of target data sources; and using the first sub-field and the second sub-field as the first field.
3. The data processing method as described in claim 1, characterized in that selecting, in response to a user instruction, the matching field of the second field in the target data source corresponding to the data source to be replaced in the target data source set includes: for each second field, displaying a matching entry corresponding to the second field, wherein the matching entry is associated with the fields in the target data source corresponding to the data source to be replaced in the target data source set; receiving, through the matching entry, a field selected by the user from the fields associated with the matching entry; and using the selected field as the matching field of the second field.
4. The data processing method as described in claim 1, characterized in that replacing the at least one data source to be replaced with the corresponding target data source in the target data source set based on the field correspondence includes: replacing, based on the first field correspondence, the first field in the at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set; and replacing, based on the second field correspondence, the second field in the at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set.
5. The data processing method as described in claim 1, characterized in that replacing the at least one data source to be replaced with the corresponding target data source in the target data source set based on the field correspondence includes: determining, based on the field correspondence, an unmatched field in the at least one data source to be replaced for which no matching field exists in the corresponding target data source in the target data source set; and discarding the relevant information of the unmatched field.
6. The data processing method as described in claim 5, characterized in that, before discarding the relevant information of the unmatched field, the method further includes: displaying alarm information indicating that the relevant information of the unmatched field will be discarded, wherein the alarm information includes the unmatched field.
7. A data processing apparatus, characterized by comprising: a data source acquisition unit configured to determine at least one data source to be replaced corresponding to a data dashboard; a deduplication unit configured to perform deduplication processing on the at least one data source to be replaced to obtain a set of data sources to be replaced; a target data source determination unit configured to determine a target data source set corresponding to the set of data sources to be replaced, wherein the target data source set includes the target data source to which each data source to be replaced in the set of data sources to be replaced will be replaced; a relationship determination unit configured to determine a field correspondence between the fields of the data source to be replaced in the set of data sources to be replaced and the fields of the target data source corresponding to the data source to be replaced in the set of target data sources; and a replacement unit configured to replace the at least one data source to be replaced with the corresponding target data source in the target data source set based on the field correspondence; wherein the relationship determination unit is further configured to determine a first field in the data source to be replaced in the set of data sources to be replaced based on a predetermined matching rule, and to determine the correspondence between the first field and a predetermined field as a first field correspondence, wherein the first field is a field for which a matching field exists in the target data source corresponding to the data source to be replaced in the set of target data sources, and the predetermined field is the field in that target data source that matches the first field; and to determine the field correspondence based on the first field correspondence;
wherein the relationship determination unit is further configured to, for a second field other than the first field in the data source to be replaced in the set of data sources to be replaced, in response to a user instruction, select a matching field of the second field in the target data source corresponding to the data source to be replaced in the target data source set, and determine the correspondence between the second field and the matching field as a second field correspondence; and use the first field correspondence and the second field correspondence as the field correspondence.
8. The data processing apparatus as described in claim 7, characterized in that the relationship determination unit is further configured to: determine, based on a first predetermined matching rule, a first sub-field in the data source to be replaced in the set of data sources to be replaced, wherein the first sub-field is a field with the same field identifier as a field in the target data source corresponding to the data source to be replaced in the set of target data sources; determine, based on a second predetermined matching rule, a second sub-field among other fields, wherein the other fields are fields in the data source to be replaced in the set of data sources to be replaced other than the first sub-field, and the second sub-field is a field that is fuzzily matched to a field in the target data source corresponding to the data source to be replaced in the set of target data sources; and use the first sub-field and the second sub-field as the first field.
9. The data processing apparatus as described in claim 7, characterized in that the relationship determination unit is further configured to display a matching entry corresponding to each second field, wherein the matching entry is associated with the fields in the target data source corresponding to the data source to be replaced in the target data source set; receive, through the matching entry, a field selected by the user from the fields associated with the matching entry; and use the selected field as the matching field of the second field.
10. The data processing apparatus as claimed in claim 7, characterized in that the replacement unit is further configured to: replace, based on the first field correspondence, the first field in the at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set; and replace, based on the second field correspondence, the second field in the at least one data source to be replaced with the field of the target data source corresponding to the data source to be replaced in the target data source set.
11. The data processing apparatus as claimed in claim 7, characterized in that the replacement unit is further configured to determine, based on the field correspondence, an unmatched field in the at least one data source to be replaced for which no matching field exists in the corresponding target data source in the target data source set; and discard the relevant information of the unmatched field.
12. The data processing apparatus as claimed in claim 11, characterized in that the replacement unit is further configured to display alarm information indicating the discarding of the relevant information of the unmatched field before discarding the relevant information of the unmatched field, wherein the alarm information includes the unmatched field.
13. An electronic device, characterized by comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the data processing method as described in any one of claims 1 to 6.
14. A computer-readable storage medium, characterized in that, when the instructions in the computer-readable storage medium are executed by at least one processor, the at least one processor is caused to perform the data processing method as described in any one of claims 1 to 6.