Data modeling method, device, simulator, and readable storage medium
By breaking down business requirements and analyzing metadata, logical and physical models are constructed, solving the problem of inconsistent model design in traditional experience-based modeling and achieving high-quality and controllable data modeling results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA UNITED NETWORK COMM GRP CO LTD
- Filing Date
- 2022-12-30
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, traditional experience-based modeling methods result in a disconnect between the three stages of business definition, logical model design, and physical model production. The lack of structured metadata specifications and constraints leads to poor model quality and inconsistencies between the business definition and the final model production.
By breaking down business requirements, we obtain metadata and analytical dimensions, achieve structured metadata specifications and constraints, determine business processes based on metadata, construct a bus matrix and extract consistency dimensions, establish logical and physical models, and ensure the uniformity of model design.
It achieves high-quality and controllable standardized modeling, standardizes business definitions, clearly describes indicator criteria, ensures model quality, and solves the problem of inconsistency between business definitions and final model production.
Figure CN115982292B_ABST
Description
Technical Field
[0001] This application relates to computer technology, and more particularly to a data modeling method, apparatus, simulator, and readable storage medium.
Background Technology
[0002] With the development of the digital economy, data has become a new factor of production. Faced with the massive amounts of data generated by enterprise growth, better mining the value of data to drive business development has become an industry consensus. Big data warehouse construction allows enterprises to transparently transmit and accumulate the value of data, achieving high-quality data-to-information transformation while providing data assurance for rapid trial and error and refined operations. Big data modeling is a crucial foundation for data warehouse construction, and good modeling methods are key to its success.
[0003] In existing technologies, traditional experience-based modeling methods are used; through the intensive co-construction of data warehouses, a large amount of cross-domain data is generated, requiring engineers to collaborate to complete business definition, logical model design, and physical model development.
[0004] However, because people differ in understanding and experience, and because existing technologies lack data architecture specifications and effective metadata-driven constraints, consistency between model design and actual physical model development cannot be guaranteed, resulting in poor model quality.
Summary of the Invention
[0005] This application provides a data modeling method, apparatus, simulator, and readable storage medium to solve the technical problem in the prior art where the three stages of business definition, logical model design, and physical model production in experience-based modeling are separated, and the lack of structured metadata specifications and constraints leads to inconsistencies between business definition and final model production.
[0006] In a first aspect, this application provides a data modeling method, including:
[0007] Each indicator in the business requirements is broken down to obtain multiple metadata items, and the type of each indicator is determined;
[0008] The topic to which the indicator belongs and the business processes under the topic are determined based on the metadata of the indicator and the topic standard library;
[0009] The analysis dimensions of each indicator in the business requirement are obtained, the analysis dimensions of each indicator are added to the business process, the bus matrix of the business process is obtained, and the consistency dimension of the business process is extracted from the bus matrix.
[0010] The logical model of each indicator is constructed based on the metadata of each indicator, the topic to which the indicator belongs, the business process under the topic, and the consistency dimension of the business process.
[0011] A physical model for each of the aforementioned indicators is established based on the logical model of each indicator.
[0012] Furthermore, the types of indicators include atomic indicators, calculated indicators, and derived indicators.
[0013] Furthermore, based on the logical model of each indicator, a physical model for each indicator is established, specifically including:
[0014] The name of the physical model of the indicator is constructed based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located.
[0015] The physical model of the indicator is constructed based on the type of the indicator and the data corresponding to each metadata of the indicator.
[0016] Furthermore, if the type of the indicator is the atomic indicator or the calculated indicator, then the physical model type corresponding to the indicator is the basic fact detail layer model; if the type of the indicator is the derived indicator, then the physical model type corresponding to the indicator is the light summary layer model.
[0017] Furthermore, the data table types include log-type data auto-mapping incremental table types and transaction-type data auto-mapping snapshot table types.
[0018] Furthermore, a physical model of the indicator is constructed based on the type of the indicator and the data corresponding to each metadata element of the indicator, specifically including:
[0019] When the physical model of the indicator is based on the basic fact detail model, the fields of the physical model of the indicator are directly retrieved or combined through simple aggregation functions based on the metadata of the indicator and the physical fields of the snapshot table of the preparation area of the business process in which the indicator is located;
[0020] When the physical model of the indicator is a derived fact light summary model, the atomic indicator corresponding to the derived indicator is determined, and the basic fact detail model is determined based on the metadata of the atomic indicator.
[0021] The time period extracted from the derived indicators is used as the limiting metadata. Logical operations are then performed on the fields of the basic fact detail model using the limiting metadata to generate the fields of the physical model of the indicators.
[0022] Furthermore, based on the metadata of the indicator and the topic standard library, the topic to which the indicator belongs and the business processes under the topic are determined, specifically including:
[0023] Based on the metadata of the indicator, the candidate topics to which the indicator belongs and the candidate business processes under the candidate topics are determined;
[0024] The candidate topics and candidate business processes under the candidate topics are matched with topics and business processes in the topic standard library to obtain the topic to which the indicator belongs and the business processes under the topic.
[0025] In a second aspect, this application provides a data modeling apparatus, comprising:
[0026] The processing module is used to break down each metric in the business requirements to obtain multiple metadata, and to classify each metric.
[0027] The processing module is also used to determine the topic to which the indicator belongs and the business process under the topic based on the indicator's metadata and the topic standard library;
[0028] The processing module is also used to obtain the analysis dimensions of each indicator in the business requirements, add the analysis dimensions of each indicator to the business process, obtain the bus matrix of the business process, and extract the consistency dimension of the business process from the bus matrix.
[0029] The construction module is used to construct a logical model for each indicator based on the metadata of each indicator, the topic to which the indicator belongs, the business process under the topic, and the consistency dimension of the business process; and to build a physical model for each indicator based on the logical model of each indicator.
[0030] In a third aspect, this application provides a simulator, including: a memory and a processor;
[0031] The memory stores computer programs that can run on the processor;
[0032] The processor implements the method described in the first aspect when executing the computer program.
[0033] In a fourth aspect, this application provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the method described in the first aspect.
[0034] In a fifth aspect, this application provides a computer program product, including a computer program that, when executed by a processor, implements the method described in the first aspect.
[0035] This application provides a data modeling method, apparatus, simulator, and readable storage medium. In this solution, metadata and analytical dimensions are obtained by decomposing business requirements, achieving structured metadata specifications and constraints within the business requirements. Then, business processes are determined based on the metadata, and analytical dimensions are added to these processes to obtain a bus matrix of business processes, from which consistency dimensions are extracted. Logical models for each indicator are constructed based on its metadata, the theme to which it belongs, the business processes under that theme, and the consistency dimensions of the business processes. Finally, physical models for each indicator are established based on their logical models. This ensures the consistency between model design and the final physical model, standardizes business definitions, clearly describes indicator definitions, and maps them to the physical model, guaranteeing model quality and achieving high-quality and controllable standardized modeling. This solves the problem of inconsistency between business definition and final model production caused by the fragmentation of the three stages of existing experience-based modeling.
Attached Figure Description
[0036] The accompanying drawings, which are incorporated in and form a part of this specification, illustrate embodiments consistent with this disclosure and, together with the description, serve to explain the principles of this disclosure.
[0037] Figure 1 is a flowchart of a data modeling method provided in an embodiment of this application;
[0038] Figure 2 is a flowchart of a method for establishing a physical model of each indicator based on the logical model of each indicator, provided in an embodiment of this application;
[0039] Figure 3 is a flowchart of another method for establishing a physical model of each indicator based on the logical model of each indicator, provided in an embodiment of this application;
[0040] Figure 4 is a flowchart of another data modeling method provided in an embodiment of this application;
[0041] Figure 5 is a schematic diagram of the structure of a data modeling device provided in an embodiment of this application;
[0042] Figure 6 is a schematic diagram of the structure of a simulator provided in an embodiment of this application.
[0043] The accompanying drawings have illustrated specific embodiments of this disclosure, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concepts of this disclosure to those skilled in the art through reference to particular embodiments.
Detailed Implementation
[0044] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0045] With the development of the digital economy, data has become a new factor of production. Faced with the massive amounts of data generated by enterprise growth, better mining the value of data to drive business development has become an industry consensus. Big data warehouse construction allows enterprises to transparently transmit and accumulate the value of data, achieving high-quality data-to-information transformation while providing data assurance for rapid trial and error and refined operations. Big data modeling is a crucial foundation for data warehouse construction, and good modeling methods are key to its success.
[0046] In existing technologies, traditional experience-based modeling methods are used; through the intensive co-construction of data warehouses, a large amount of cross-domain data is generated, requiring engineers to collaborate to complete business definition, logical model design, and physical model development.
[0047] However, because people differ in understanding and experience, and because existing technologies lack data architecture specifications and effective metadata-driven constraints, consistency between model design and actual physical model development cannot be guaranteed, resulting in poor model quality.
[0048] To address the aforementioned problems, this application provides a data modeling method, apparatus, simulator, and readable storage medium. It aims to solve the technical problem in existing experience-based modeling where the three stages of business definition, logical model design, and physical model production are fragmented, lacking structured metadata specifications and constraints, leading to inconsistencies between the business definition and the final model production. The technical concept of this application is as follows: Metadata and analytical dimensions are obtained by decomposing business requirements, achieving structured metadata specifications and constraints within the business requirements; then, business processes are determined based on the metadata, and analytical dimensions are added to the business processes to obtain a bus matrix of business processes, from which consistency dimensions of the business processes are extracted; logical models of each indicator are constructed based on the metadata of each indicator, the theme to which the indicator belongs, the business processes under the theme, and the consistency dimensions of the business processes; then, physical models of each indicator are established based on the logical models of each indicator; ensuring the consistency between model design and the final physical model, standardizing business definitions, clearly characterizing indicator definitions, and mapping them to the physical model to guarantee model quality, achieving high-quality and controllable standardized modeling. This solves the problem of inconsistencies between the business definition and the final model production caused by the fragmentation of the three stages of existing experience-based modeling.
[0049] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will be described below with reference to the accompanying drawings.
[0050] Figure 1 is a flowchart of a data modeling method provided in an embodiment of this application. As shown in Figure 1, the method includes:
[0051] 101. Break down each metric in the business requirements to obtain multiple metadata, and classify each metric.
[0052] For example, the execution subject of this embodiment can be a simulator, a terminal device, a data modeling apparatus or device, or other apparatus or device capable of executing this embodiment, and there are no limitations on this. In this embodiment, a simulator is used as the execution subject for description.
[0053] First, business requirements need to be obtained. These can be retrieved from storage or from the business platform. Business requirements include: the Chinese name of the metric, its dimension, and its business definition. The simulator then uses this business definition to structurally break down the metrics according to a structured metric scheme, generating metric decomposition metadata, and completing the technical definition of the metric based on this. Metric types include atomic metrics, calculated metrics, and derived metrics.
[0054] Atomic metrics: These are metrics that cannot be further divided within a specific business process; they are terms that contain specific business meanings. Physically, they are composed of aggregate operators added to a specific entity field within the business process.
[0055] Calculated metrics: Calculated metrics are obtained by performing addition, subtraction, multiplication, and division operations on multiple atomic metrics. Physically, calculated metrics can be automatically generated based on entity fields using defined calculation formulas.
[0056] Derived metrics: Derived metrics consist of "time period + multiple qualifiers + atomic / calculated metrics". Physically, derived metrics are generated from basic fact fields through dimensional roll-up according to defined aggregation functions and filtering conditions. Derived metrics mainly exist in the light summary layer of the layered data warehouse and are generated automatically, driven by metadata, from the basic fact detail layer through aggregation functions.
[0057] For example, suppose the Chinese name is "daily total traffic flowing from Beijing to IDC (Internet Data Center) server rooms"; the dimensions are source IP (Internet Protocol address) and destination IP; and the business definition of the indicator is the inflow traffic from source IPs in the Beijing region to destination IPs in Beijing IDC server rooms. Under structured decomposition, the indicator is the inflow traffic from a region to a regional IDC server room: the specific source and destination regions are one limiting condition, the IDC server room is another limiting condition, and the atomic indicator is inflow traffic. From the IDC server room qualifier, it can be determined that the data currently collected by the Netflow core routers includes IDC server room traffic, that Netflow belongs to the data network collection subject domain, and that the corresponding business process is traffic collection.
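The decomposition in step 101 can be illustrated with a minimal Python sketch. The class names, field names, and the classification rule below are illustrative assumptions rather than the structures defined by this application; the rule simply follows the three metric types described above (a time period or qualifiers indicate a derived metric, arithmetic over several atomic metrics indicates a calculated metric).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class MetricType(Enum):
    ATOMIC = "atomic"
    CALCULATED = "calculated"
    DERIVED = "derived"


@dataclass
class MetricMetadata:
    """Metadata obtained by structurally decomposing one business metric (hypothetical layout)."""
    name: str
    atomic_metrics: List[str]
    operators: List[str] = field(default_factory=list)    # arithmetic operators, if any
    time_period: Optional[str] = None                      # e.g. "day"
    qualifiers: Dict[str, str] = field(default_factory=dict)
    dimensions: List[str] = field(default_factory=list)


def classify(meta: MetricMetadata) -> MetricType:
    """Assumed rule: time period or qualifiers -> derived; arithmetic -> calculated."""
    if meta.time_period is not None or meta.qualifiers:
        return MetricType.DERIVED
    if meta.operators and len(meta.atomic_metrics) > 1:
        return MetricType.CALCULATED
    return MetricType.ATOMIC


# The Beijing-to-IDC example from the text, expressed as metadata.
daily_inflow = MetricMetadata(
    name="daily total traffic from Beijing to Beijing IDC server rooms",
    atomic_metrics=["inflow_traffic"],
    time_period="day",
    qualifiers={"src_region": "Beijing", "dst_region": "Beijing", "dst_site": "IDC"},
    dimensions=["src_ip", "dst_ip"],
)
print(classify(daily_inflow).value)  # prints: derived
```

Keeping the qualifiers separate from the atomic metric is what later allows the same atomic metric to be reused by other derived metrics.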
[0058] 102. Determine the theme to which the indicator belongs and the business processes under that theme based on the indicator's metadata and the theme standard library.
[0059] For example, the simulator can extract topic, entity, and business process attribute information from the metric's metadata. By matching this information against the topic standard library, it can accurately determine the topic and business process to which the metric belongs, and thus determine the topic and business process metadata. For instance, matching against the topic standard library determines that the topic domain is data network (IPnet) and the business process is network data traffic (flow).
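As a rough sketch of the matching in step 102, the standard library can be pictured as a lookup from topic and business process codes to the terms that identify them. The dictionary layout, the alias strings, and the function name are assumptions made for illustration; only the codes ipnet and flow come from the example above.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical topic standard library: topic code -> aliases and business processes.
STANDARD_LIBRARY: Dict[str, dict] = {
    "ipnet": {
        "aliases": {"data network", "data network collection"},
        "processes": {"flow": {"network data traffic", "traffic collection"}},
    },
}


def match_topic_and_process(
    candidate_topics: Iterable[str],
    candidate_processes: Iterable[str],
    library: Dict[str, dict] = STANDARD_LIBRARY,
) -> Optional[Tuple[str, str]]:
    """Return the first (topic, process) pair whose aliases match the candidates, else None."""
    topics = {t.lower() for t in candidate_topics}
    processes = {p.lower() for p in candidate_processes}
    for topic_code, entry in library.items():
        if not topics & entry["aliases"]:
            continue
        for process_code, aliases in entry["processes"].items():
            if processes & aliases:
                return topic_code, process_code
    return None


print(match_topic_and_process(["Data network collection"], ["Traffic collection"]))
# prints: ('ipnet', 'flow')
```

Returning None for an unmatched candidate makes it explicit that an indicator outside the standard library is not silently assigned to a topic.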
[0060] 103. Obtain the analysis dimensions of each indicator in the business requirements, add the analysis dimensions of each indicator to the business process, obtain the bus matrix of the business process, and extract the consistency dimension of the business process from the bus matrix.
[0061] For example, the simulator can obtain the analysis dimensions of various indicators in the business requirements, such as source IP and destination IP; add the analysis dimensions to the matching topic business process to form a business bus matrix under the topic, and determine the consistency dimension metadata of the business.
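The bus matrix in step 103 can be sketched as a mapping from each business process to the set of analysis dimensions attached to it. This application does not spell out the extraction rule, so the sketch below assumes the common dimensional-modeling convention that a dimension counts as a consistency (conformed) dimension of a process when at least one other process in the matrix also uses it. The alarm process and the date and severity dimensions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_bus_matrix(
    indicator_dims: Iterable[Tuple[str, Iterable[str]]]
) -> Dict[str, Set[str]]:
    """Each item is (business_process, analysis_dimensions) for one indicator."""
    matrix: Dict[str, Set[str]] = defaultdict(set)
    for process, dims in indicator_dims:
        matrix[process].update(dims)
    return dict(matrix)


def consistency_dimensions(matrix: Dict[str, Set[str]], process: str) -> Set[str]:
    """Assumed rule: dimensions of `process` that another process also uses."""
    others: Set[str] = set()
    for other_process, dims in matrix.items():
        if other_process != process:
            others |= dims
    return matrix[process] & others


matrix = build_bus_matrix([
    ("flow", ["src_ip", "dst_ip"]),            # indicator from the example
    ("flow", ["dst_ip", "date"]),              # hypothetical second indicator
    ("alarm", ["dst_ip", "date", "severity"]), # hypothetical second process
])
print(sorted(consistency_dimensions(matrix, "flow")))  # prints: ['date', 'dst_ip']
```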
[0062] 104. Construct a logical model for each indicator based on its metadata, the theme to which it belongs, the business process under that theme, and the consistency dimension of the business process.
[0063] For example, the simulator constructs a logical model for each indicator based on its metadata, the topic to which the indicator belongs, the business process under the topic, and the consistency dimension of the business process. It completes the design of business requirements and logical models, and through structured decomposition of business requirements, confirms the metadata of indicators, topics, business processes, and dimensions. This prepares the model for generating an unalterable physical model based on metadata-driven constraints. Simultaneously, the model design undergoes structured decomposition and specification constraints, eliminating model design duplication caused by human experience-based modeling and addressing deviations between business definitions and model design.
[0064] 105. Establish the physical model of each indicator based on the logical model of each indicator.
[0065] For example, the simulator can determine the physical model name based on the theme and business process metadata of the logical model, combined with the current data production flow of the business process. Then, based on the indicator metadata of the logical model, it maps the physical table fields to generate the basic facts of the physical model. Finally, based on the business bus matrix of the logical model, it obtains the consistency dimensions and degenerates the attributes of the related dimensions to generate the complete dimensions of the physical model.
[0066] For example, suppose the physical model name is being confirmed and the logical model metric is a derived metric. First, a basic fact detail layer model (DWD layer) is constructed from the atomic metrics. In the actual data production process, the preparation area data, namely the Netflow data collection, is log-type, i.e., incremental data (inc), and each day's data is expected to be stored in one table partition, i.e., refreshed daily (d). A custom table name, idc, identifies the content of the collected network traffic data. The generated physical model name is therefore dwd_ipnet_flow_idc_d_inc. The basic fact field "data flow bytes" in the basic fact detail table is pulled and mapped directly from the physical field of the preparation area snapshot table in which the atomic metric of the business process resides. Upstream preparation area fields such as byte and packet are likewise pulled and mapped directly, which binds the physical fields to the basic facts in the basic fact detail table of the physical model. Based on the consistency dimensions, common dimension tables are joined through foreign keys, and dimension attributes are degenerated into the fact table as far as possible, completing the assembly of the fact and dimension physical fields of the basic fact detail table.
[0067] For example, for the business definition "inflow traffic from source IPs in the Beijing region to destination IPs in the Beijing IDC data center", traffic is rolled up daily by the limiting conditions, namely the city of the source IP and the IDC and city of the destination IP, which automatically produces the derived fact field: the daily inflow traffic from a region to a regional IDC. Filtering the source and destination regions to Beijing then yields the daily total inflow traffic from Beijing to the Beijing IDC data center.
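The roll-up and filter described in this example can be traced on a few rows of data. The rows, field names, and byte counts below are invented for illustration; the two functions mirror the two stages in the text, first aggregating detail rows by day and by the limiting dimensions, then filtering the summary to Beijing.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical DWD detail rows; city and IDC flags stand for degenerated
# dimension attributes of the source and destination IPs.
DETAIL_ROWS = [
    {"day": "2022-12-01", "src_city": "Beijing", "dst_city": "Beijing", "dst_is_idc": True, "bytes": 100},
    {"day": "2022-12-01", "src_city": "Beijing", "dst_city": "Beijing", "dst_is_idc": True, "bytes": 50},
    {"day": "2022-12-01", "src_city": "Shanghai", "dst_city": "Beijing", "dst_is_idc": True, "bytes": 70},
    {"day": "2022-12-02", "src_city": "Beijing", "dst_city": "Beijing", "dst_is_idc": False, "bytes": 30},
    {"day": "2022-12-02", "src_city": "Beijing", "dst_city": "Beijing", "dst_is_idc": True, "bytes": 20},
]

Key = Tuple[str, str, str, bool]


def roll_up(rows: Iterable[dict]) -> Dict[Key, int]:
    """Sum bytes per (day, source city, destination city, destination-is-IDC)."""
    totals: Dict[Key, int] = defaultdict(int)
    for r in rows:
        totals[(r["day"], r["src_city"], r["dst_city"], r["dst_is_idc"])] += r["bytes"]
    return dict(totals)


def filter_region(summary: Dict[Key, int], src_city: str, dst_city: str) -> Dict[str, int]:
    """Keep only IDC-bound traffic between the given cities, keyed by day."""
    return {
        day: total
        for (day, src, dst, is_idc), total in summary.items()
        if src == src_city and dst == dst_city and is_idc
    }


summary = roll_up(DETAIL_ROWS)
print(filter_region(summary, "Beijing", "Beijing"))
# prints: {'2022-12-01': 150, '2022-12-02': 20}
```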
[0068] In this embodiment, metadata and analytical dimensions are obtained by decomposing business requirements, thus achieving structured metadata specifications and constraints within the business requirements. Then, business processes are determined based on the metadata, and analytical dimensions are added to these processes to obtain a bus matrix of business processes, from which consistency dimensions are extracted. Logical models for each indicator are constructed based on its metadata, the theme to which it belongs, the business processes under that theme, and the consistency dimensions of the business processes. Finally, physical models for each indicator are established based on their logical models. This ensures the consistency between model design and the final physical model, standardizes business definitions, clearly describes indicator definitions, and maps them to the physical model, guaranteeing model quality and achieving high-quality and controllable standardized modeling. This addresses the problem of inconsistencies between business definitions and the final model production caused by the fragmentation of business definition, logical model design, and physical model production in existing experience-based modeling.
[0069] Figure 2 is a flowchart of a method for establishing a physical model of each indicator based on the logical model of each indicator, provided in an embodiment of this application. As shown in Figure 2, the method includes:
[0070] 201. Construct the name of the physical model of the indicator based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located.
[0071] For example, the simulator can obtain the accumulated model themes and business processes, and, combined with the actual data production process, confirm the model name: Table Level + Theme Domain + Business Process + Custom Name + Time Period + Table Type. The table level is determined based on the accumulated indicator type, specifying whether it's a basic fact detail layer model or a lightly summarized layer model, thus determining the table level name. Atomic / calculated indicators are assigned to DWD (Basic Fact Detail Layer Model), and derived indicators are assigned to DWS (Lightly Summarized Layer Model). The time period and table type are determined based on the data type in the actual production process preparation area and the table data refresh response requirements. Log-type data is automatically mapped to incremental table types, and transactional data is automatically mapped to snapshot table types. This metadata-driven standardization of physical model names allows for a comprehensive understanding of the business data carried by the model, completing the automatic mounting of models and assets.
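The naming rule above (Table Level + Theme Domain + Business Process + Custom Name + Time Period + Table Type) lends itself to a small helper. The sketch below is one possible reading: the lookup tables and function name are assumptions, and the snapshot suffix "snap" in particular is a placeholder, since the text only gives "inc" for incremental tables.

```python
# Table level by indicator type, as stated in the text (DWD vs. DWS).
LEVEL_BY_METRIC_TYPE = {"atomic": "dwd", "calculated": "dwd", "derived": "dws"}

# Table type by data kind; "inc" comes from the text, "snap" is an assumed suffix.
TABLE_TYPE_BY_DATA_KIND = {"log": "inc", "transaction": "snap"}


def physical_model_name(
    metric_type: str,
    topic: str,
    process: str,
    custom_name: str,
    period: str,
    data_kind: str,
) -> str:
    """Join the six name components with underscores."""
    parts = [
        LEVEL_BY_METRIC_TYPE[metric_type],
        topic,
        process,
        custom_name,
        period,
        TABLE_TYPE_BY_DATA_KIND[data_kind],
    ]
    return "_".join(parts)


print(physical_model_name("atomic", "ipnet", "flow", "idc", "d", "log"))
# prints: dwd_ipnet_flow_idc_d_inc
```

Because every component is looked up from metadata, an unknown indicator type or data kind raises a KeyError instead of producing a nonconforming name.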
[0072] 202. Construct a physical model of the indicator based on the type of indicator and the data corresponding to each metadata of the indicator.
[0073] For example, the simulator can map physical table fields based on the indicator metadata of the logical model to generate the basic facts of the physical model. Based on the business bus matrix of the logical model, it obtains the consistency dimensions and degenerates the attributes of the related dimensions to generate the complete dimensions of the physical model.
[0074] In this embodiment, metadata constraints prevent the inconsistency between model design and physical model development caused by differences in human thinking. The physical model name, the dimension fields, and the fact measurement fields are bound in accordance with specifications and standards, and the physical model DDL (Data Definition Language) is generated to guide engineers in completing the physical model DML (Data Manipulation Language) development.
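Once the model name, dimension fields, and fact fields are bound, emitting the DDL is largely mechanical. The following sketch shows one way this could look; the column names, SQL types, and the generic CREATE TABLE syntax are assumptions, since this application does not specify a target database dialect.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (column name, SQL type)


def generate_ddl(table: str, dimensions: List[Column], facts: List[Column]) -> str:
    """Render a CREATE TABLE statement with dimension columns first, then facts."""
    columns = [f"    {name} {sql_type}" for name, sql_type in dimensions + facts]
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


ddl = generate_ddl(
    "dwd_ipnet_flow_idc_d_inc",
    dimensions=[
        ("src_ip", "VARCHAR(64)"),
        ("dst_ip", "VARCHAR(64)"),
        ("src_city", "VARCHAR(64)"),   # degenerated dimension attribute (hypothetical)
        ("dst_city", "VARCHAR(64)"),   # degenerated dimension attribute (hypothetical)
    ],
    facts=[("bytes", "BIGINT"), ("packets", "BIGINT")],
)
print(ddl)
```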
[0075] Figure 3 is a flowchart of another method for establishing a physical model of each indicator based on the logical model of each indicator, provided in an embodiment of this application. As shown in Figure 3, the method includes:
[0076] 301. Construct the name of the physical model of the indicator based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located.
[0077] For this step, refer to step 201 in Figure 2; the details are not repeated here.
[0078] 302. When the physical model of an indicator is based on the basic fact detail model, the fields of the physical model of the indicator are directly retrieved or combined using simple aggregation functions based on the indicator's metadata and the physical fields of the snapshot table of the preparation area of the business process in which the indicator is located.
[0079] For example, the electronic device can use the model name to constrain the facts that the model should produce. The basic fact fields of the basic fact detail table are populated from the atomic / calculated indicator metadata of the logical model. These fields are determined by the preparation area snapshot table of the business process to which the decomposed atomic or calculated indicators belong: each field is either retrieved directly from a physical field of the preparation area snapshot table or generated by combining physical fields through simple aggregation functions.
[0080] 303. When the physical model of the indicator is a derived fact light summary model, determine the atomic indicators corresponding to the derived indicators, and determine the basic fact detail model based on the metadata of the atomic indicators.
[0081] For example, the derived fact light summary model is a pre-calculation process of the basic fact detail model. The basic fact detail model can be found by using the metadata of the atomic indicators decomposed from the derived indicators.
[0082] 304. Use the time period decomposed from the derived indicators as the limiting metadata, and perform logical operations on the fields of the basic fact detail model through the limiting metadata to generate the fields of the physical model of the indicators.
[0083] For example, the simulator obtains the basic fact field, takes the time period decomposed from the derived indicator as the limiting metadata, applies the limiting metadata to the basic fact field, and binds the calculation logic (aggregate function operator) to generate the derived fact field.
[0084] In this embodiment, consistent dimensions generated by the logical model are obtained. Based on the foreign key association of these consistent dimensions with the common dimension table, dimension attributes are degenerated to the fact table as much as possible to complete the confirmation of complete consistent dimensions. Dimension addition is not based on requirements, but on business process consistency dimensions. This way, frequent changes in requirements later do not require adjustments to the basic model, improving model stability and reusability. Based on the above physical model design, the model name is determined, facts are completed, and dimension physical fields are mounted and assembled, automatically generating the physical model DDL. For lightly summarized physical models, this is a process of pre-calculating the detailed model. Through dimension roll-up and aggregation function binding, based on the DDL, the physical model DML is automatically generated.
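The automatic generation of DML for a lightly summarized model can be pictured as binding an aggregate function to each fact field and grouping by the rolled-up dimensions. The sketch below is a simplified assumption of how such a statement might be assembled; the target table name, column names, and the generic INSERT ... SELECT syntax are illustrative only.

```python
from typing import Dict, List, Optional, Tuple


def generate_summary_dml(
    target: str,
    source: str,
    group_by: List[str],
    measures: Dict[str, Tuple[str, str]],   # alias -> (aggregate function, source column)
    filters: Optional[List[str]] = None,
) -> str:
    """Build an INSERT ... SELECT that rolls the detail table up to the summary grain."""
    select_items = list(group_by) + [
        f"{func}({column}) AS {alias}" for alias, (func, column) in measures.items()
    ]
    lines = [
        f"INSERT INTO {target}",
        "SELECT " + ", ".join(select_items),
        f"FROM {source}",
    ]
    if filters:
        lines.append("WHERE " + " AND ".join(filters))
    lines.append("GROUP BY " + ", ".join(group_by))
    return "\n".join(lines) + ";"


dml = generate_summary_dml(
    target="dws_ipnet_flow_idc_d",           # hypothetical summary table name
    source="dwd_ipnet_flow_idc_d_inc",
    group_by=["stat_day", "src_city", "dst_city"],
    measures={"inflow_bytes": ("SUM", "bytes")},
)
print(dml)
```

Because the grouping columns come from the consistency dimensions rather than from an individual requirement, a new filter (for example another city) changes only the query against the summary table, not the model itself.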
[0085] Figure 4 is a flowchart of another data modeling method provided in an embodiment of this application. As shown in Figure 4, the method includes:
[0086] 401. Break down each metric in the business requirements to obtain multiple metadata, and classify each metric.
[0087] For this step, refer to step 101 in Figure 1; the details are not repeated here.
[0088] 402. Determine the candidate topics to which the indicators belong and the candidate business processes under the candidate topics based on the metadata of the indicators.
[0089] For example, the simulator can obtain business-meaning terms from the atomic indicators obtained by decomposing the indicator structure scheme, and extract themes, business processes and analysis entities from the business-meaning terms.
[0090] 403. Match the candidate topics and candidate business processes under the candidate topics with the topics and business processes in the topic standard library to obtain the topic to which the indicator belongs and the business processes under the topic.
[0091] For example, a calculated metric is obtained from multiple atomic metrics through arithmetic operations. The calculated metric is broken down into operands and operators, which yields its atomic metrics; these atomic metrics can then be assigned to specific business processes in the subsequent matching. Since business processes occur in a sequential order, the calculated metric is assigned to the business process that occurs later in the sequence. A derived metric consists of structured metadata; by structurally decomposing the metadata of its atomic or calculated metric, business-meaning terms can be obtained.
[0092] The theme and the business process do not change when requirements change. The required indicators are decomposed into metadata and bound to the theme and the business process. This keeps the model division stable and also ensures that each indicator has a unique technical definition, which avoids the inconsistent definitions and divergent names that arise when indicators are defined from personal experience or when the data cannot be governed globally.
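The decomposition of a calculated metric and its assignment to a business process can be illustrated with the following sketch. The process sequence, the mapping from atomic metrics to processes, and the rule of picking the latest process among the operands are assumptions made for this example, as are all names used.

```python
import re

# Hypothetical business processes, listed in the order in which they occur.
PROCESS_ORDER = ["place_order", "pay", "ship"]

# Hypothetical result of matching atomic metrics to business processes.
ATOMIC_TO_PROCESS = {"order_cnt": "place_order", "pay_amt": "pay"}


def decompose(expression: str):
    """Split a calculated metric into operands (atomic metrics) and operators."""
    operands = re.findall(r"[A-Za-z_]\w*", expression)
    operators = re.findall(r"[+\-*/]", expression)
    return operands, operators


def assign_process(expression: str) -> str:
    """Assign the calculated metric to the latest process among its atomic metrics."""
    operands, _ = decompose(expression)
    processes = [ATOMIC_TO_PROCESS[name] for name in operands]
    return max(processes, key=PROCESS_ORDER.index)


# Average payment per order: pay_amt belongs to "pay", order_cnt to "place_order".
print(decompose("pay_amt / order_cnt"))
print(assign_process("pay_amt / order_cnt"))
```

Assigning the metric to the later process reflects the fact that all of its inputs only exist once that process has taken place.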
[0093] 404. Obtain the analysis dimensions of each indicator in the business requirements, add the analysis dimensions of each indicator to the business process, obtain the bus matrix of the business process, and extract the consistency dimension of the business process from the bus matrix.
[0094] For this step, refer to step 103 in Figure 1; the details are not repeated here.
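One possible way to build a bus matrix and extract consistency dimensions is sketched below. The extraction rule used here (a dimension counts as a consistency dimension when more than one business process uses it) is an assumption based on common dimensional-modeling practice and is not stated in this application; the sample indicators, dimensions, and function names are likewise hypothetical.

```python
from collections import Counter


def build_bus_matrix(indicator_dims: dict, indicator_process: dict) -> dict:
    """Add each indicator's analysis dimensions to its business process.

    Returns a mapping of business process to the set of dimensions it uses,
    which corresponds to the rows and marked columns of the bus matrix.
    """
    matrix = {}
    for indicator, dims in indicator_dims.items():
        process = indicator_process[indicator]
        matrix.setdefault(process, set()).update(dims)
    return matrix


def consistency_dimensions(matrix: dict) -> dict:
    """Keep, per process, the dimensions shared with at least one other process (assumed rule)."""
    usage = Counter(dim for dims in matrix.values() for dim in dims)
    return {
        process: sorted(dim for dim in dims if usage[dim] > 1)
        for process, dims in matrix.items()
    }


indicator_dims = {
    "order_cnt": {"date", "user", "region"},
    "pay_amt": {"date", "user", "channel"},
}
indicator_process = {"order_cnt": "place_order", "pay_amt": "pay"}

matrix = build_bus_matrix(indicator_dims, indicator_process)
print(consistency_dimensions(matrix))
```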
[0095] 405. Construct a logical model for each indicator based on its metadata, the theme to which it belongs, the business process under that theme, and the consistency dimension of the business process.
[0096] For this step, refer to step 104 in Figure 1; the details are not repeated here.
[0097] 406. Establish the physical model of each indicator based on the logical model of each indicator.
[0098] For this step, refer to step 105 in Figure 1; the details are not repeated here.
[0099] In this embodiment, the logical model design is completed, the business definition is decomposed, and metadata such as indicators, themes, business processes, and consistency dimensions are extracted. This prepares for generating an unalterable physical model under metadata-driven constraints. At the same time, the model design is decomposed in a structured manner and its constraints are standardized, which eliminates problems caused by experience-based manual modeling, such as duplicated model designs, chaotic indicator binding, inconsistent indicator definitions, and deviation between business requirements and model design. This unifies the business definition and the logical model design.
[0100] Figure 5 is a schematic diagram of the structure of a data modeling device provided in an embodiment of this application. As shown in Figure 5, the device 500 includes:
[0101] The processing module 502 is used to break down each indicator in the business requirements to obtain multiple metadata, and to classify each indicator.
[0102] The processing module 502 is also used to determine the topic to which the indicator belongs and the business process under the topic based on the indicator's metadata and the topic standard library;
[0103] The acquisition module 501 is used to acquire the analysis dimensions of each indicator in the business requirements, add the analysis dimensions of each indicator to the business process, obtain the bus matrix of the business process, and extract the consistency dimension of the business process from the bus matrix.
[0104] The processing module 502 is also used to construct a logical model for each indicator based on the metadata of each indicator, the topic to which the indicator belongs, the business process under the topic, and the consistency dimension of the business process.
[0105] The processing module 502 is also used to establish the physical model of each indicator based on the logical model of each indicator.
[0106] In one embodiment, the processing module 502 is further specifically used for:
[0107] The name of the physical model of the indicator is constructed based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located.
[0108] A physical model of the indicator is constructed based on the type of indicator and the data corresponding to each metadata of the indicator.
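The construction of a physical model name from these elements can be sketched as follows. The mapping of indicator types to model layers and of data kinds to table types follows the claims of this application; the concrete abbreviations (`dwd`, `dws`, `inc`, `snap`), the underscore-separated pattern, and the function name are hypothetical naming conventions chosen for illustration.

```python
# Atomic and calculated indicators map to the basic fact detail layer;
# derived indicators map to the light summary layer. The prefixes are assumed.
LAYER_PREFIX = {"atomic": "dwd", "calculated": "dwd", "derived": "dws"}

# Log-type data maps to an incremental table; transaction-type data maps to
# a snapshot table. The suffixes are assumed.
TABLE_SUFFIX = {"log": "inc", "transaction": "snap"}


def physical_model_name(indicator_type: str, theme: str, business: str,
                        period: str, data_kind: str) -> str:
    """Join layer, theme, business, time period, and table type into one name."""
    if indicator_type not in LAYER_PREFIX:
        raise ValueError(f"unknown indicator type: {indicator_type}")
    if data_kind not in TABLE_SUFFIX:
        raise ValueError(f"unknown data kind: {data_kind}")
    parts = [LAYER_PREFIX[indicator_type], theme, business, period, TABLE_SUFFIX[data_kind]]
    return "_".join(parts)


print(physical_model_name("derived", "trade", "order", "7d", "transaction"))
print(physical_model_name("atomic", "trade", "order", "1d", "log"))
```

Because every part of the name is looked up from metadata rather than typed by hand, two modelers working from the same indicator would arrive at the same table name.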
[0109] In one embodiment, the processing module 502 is further specifically used for:
[0110] When the physical model of an indicator is based on the basic fact detail model, the fields of the physical model of the indicator are directly retrieved or combined using simple aggregation functions based on the indicator's metadata and the physical fields of the snapshot table of the preparation area of the business process in which the indicator is located.
[0111] When the physical model of the indicator is a derived fact light summary model, determine the atomic indicators corresponding to the derived indicators, and determine the basic fact detail model based on the metadata of the atomic indicators.
[0112] The time period decomposed from the derived indicators is used as qualifier metadata. Logical operations are then performed on the fields of the basic fact detail model using the qualifier metadata to generate the fields of the physical model of the indicators.
[0113] In one embodiment, the processing module 502 is further specifically used for:
[0114] Based on the metadata of the indicator, determine the alternative topic to which the indicator belongs and the alternative business process under the alternative topic;
[0115] Match the candidate topics and candidate business processes under the candidate topics with the topics and business processes in the topic standard library to obtain the topic to which the indicator belongs and the business processes under the topic.
[0116] The apparatus in this embodiment can execute the technical solutions in the above method. Its specific implementation process and technical principles are the same, and will not be repeated here.
[0117] Figure 6 is a schematic diagram of the structure of a simulator provided in an embodiment of this application. As shown in Figure 6, the electronic device 600 includes a memory 601 and a processor 602;
[0118] The memory 601 is used to store computer instructions that can be executed by the processor;
[0119] The processor 602 implements the various steps of the method in the above embodiments when executing computer instructions. For details, please refer to the relevant descriptions in the foregoing method embodiments.
[0120] Optionally, the memory 601 can be either independent of or integrated with the processor 602. When the memory 601 is configured independently, the electronic device also includes a bus for connecting the memory 601 and the processor 602.
[0121] This application also provides a non-transitory computer-readable storage medium, which, when the instructions in the storage medium are executed by the processor of an electronic device, enables the electronic device to perform the methods provided in the above embodiments.
[0122] This application also provides a computer program product that includes a computer program stored in a readable storage medium. At least one processor of an electronic device can read the computer program from the readable storage medium and execute it, causing the electronic device to perform the solution provided in any of the above embodiments.
[0123] Other embodiments of this disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this disclosure that follow the general principles of this disclosure and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this disclosure are indicated by the following claims.
[0124] It should be understood that this disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this disclosure is limited only by the appended claims.
Claims
1. A data modeling method, characterized by comprising:
breaking down each indicator in the business requirements to obtain multiple metadata, and classifying each indicator;
determining, based on the metadata of the indicator, the candidate topic to which the indicator belongs and the candidate business process under the candidate topic; matching the candidate topic and the candidate business process under the candidate topic with the topics and business processes in the topic standard library to obtain the topic to which the indicator belongs and the business process under the topic;
obtaining the analysis dimensions of each indicator in the business requirements, adding the analysis dimensions of each indicator to the business process, obtaining the bus matrix of the business process, and extracting the consistency dimension of the business process from the bus matrix;
constructing the logical model of each indicator based on the metadata of each indicator, the topic to which the indicator belongs, the business process under the topic, and the consistency dimension of the business process;
constructing the name of the physical model of the indicator based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located;
when the physical model of the indicator is a basic fact detail model, directly retrieving, or combining through simple aggregation functions, the fields of the physical model of the indicator based on the metadata of the indicator and the physical fields of the snapshot table of the preparation area of the business process in which the indicator is located; wherein the types of the indicator include atomic indicators, calculated indicators, and derived indicators;
when the physical model of the indicator is a derived fact light summary model, determining the atomic indicator corresponding to the derived indicator, and determining the basic fact detail model based on the metadata of the atomic indicator;
using the time period decomposed from the derived indicators as qualifier metadata, and performing logical operations on the fields of the basic fact detail model using the qualifier metadata to generate the fields of the physical model of the indicators;
wherein the data table types include an incremental table type to which log-type data is automatically mapped and a snapshot table type to which transaction-type data is automatically mapped.
2. The data modeling method according to claim 1, characterized in that: if the type of the indicator is the atomic indicator or the calculated indicator, then the physical model type corresponding to the indicator is the basic fact detail layer model; if the type of the indicator is the derived indicator, then the physical model type corresponding to the indicator is the light summary layer model.
3. A data modeling apparatus, comprising:
a processing module, used to break down each indicator in the business requirements to obtain multiple metadata, and to classify each indicator;
the processing module is further configured to determine the candidate topic to which the indicator belongs and the candidate business process under the candidate topic based on the metadata of the indicator; and to match the candidate topic and the candidate business process under the candidate topic with the topics and business processes in the topic standard library to obtain the topic to which the indicator belongs and the business process under the topic;
the processing module is also used to obtain the analysis dimensions of each indicator in the business requirements, add the analysis dimensions of each indicator to the business process, obtain the bus matrix of the business process, and extract the consistency dimension of the business process from the bus matrix;
the processing module is also used to construct the logical model of each indicator based on its metadata, the theme to which the indicator belongs, the business process under the theme, and the consistency dimension of the business process; and to construct the name of the physical model of the indicator based on the physical model type corresponding to the indicator, the theme to which the indicator belongs, the business of the theme, the time period of the data corresponding to each metadata of the indicator, and the data table type where the data corresponding to each metadata is located;
when the physical model of the indicator is a basic fact detail model, the processing module directly pulls or combines the physical fields of the snapshot table of the preparation area of the business process to which the indicator belongs, to fill the fields of the physical model of the indicator; wherein the types of indicators include atomic indicators, calculated indicators, and derived indicators;
when the physical model of the indicator is a derived fact light summary model, the processing module determines the atomic indicators corresponding to the derived indicators, and determines the basic fact detail model based on the metadata of the atomic indicators; the time period decomposed from the derived indicators is used as qualifier metadata, and logical operations are performed on the fields of the basic fact detail model through the qualifier metadata to generate the fields of the physical model of the indicator;
wherein the data table types include an incremental table type to which log data is automatically mapped and a snapshot table type to which transaction data is automatically mapped.
4. The data modeling apparatus of claim 3, wherein when the type of the indicator is the atomic indicator or the calculated indicator, the physical model type corresponding to the indicator is the basic fact detail layer model; when the type of the indicator is the derived indicator, the physical model type corresponding to the indicator is the light summary layer model.
5. A simulator, characterized by comprising: a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program stored in the memory to implement the data modeling method as described in claim 1 or 2.
6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, is used to implement the data modeling method as described in claim 1 or 2.