MES-based production data management system and method

By calculating the outlier characteristics and value of production data in the MES system, data with different storage requirements are filtered out, and balanced storage is performed based on a consistent hash ring. This solves the high cost and scalability problems under the multi-model database architecture and achieves low-cost, highly scalable data storage management.

CN121303569B · Active · Publication Date: 2026-06-26 · HAIMINGDE (YUEYANG) TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HAIMINGDE (YUEYANG) TECH CO LTD
Filing Date
2025-10-15
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

The existing MES production and manufacturing data storage adopts a multi-model database architecture, which results in high storage device procurement costs, high data center energy consumption, poor data storage management scalability, and high maintenance difficulty.

Method used

By acquiring data from each production stage in the manufacturing execution system, calculating data value based on outlier characteristics, filtering out data that needs to be fully or partially stored, and determining the correspondence between data and storage servers based on data value and storage server cost, a consistent hash ring is used for balanced storage.

Benefits of technology

It optimizes production data storage, reduces later maintenance and management costs, improves the scalability of the storage system, lowers the cost of expanding storage capacity, and achieves balanced management of data storage.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses an MES-based production data management system and method, relates to the technical field of data calculation, and comprises the following steps: obtaining production data of each production manufacturing stage in a manufacturing execution system; calculating data values of the production data based on the outlying characteristics of each production data; when the data values meet preset conditions, screening first storage data which needs full storage and second storage data which needs partial storage from the production data; matching each first storage data and second storage data with a storage server based on the data values and the storage cost of the storage server, determining the corresponding relationship between the production data and the storage server, and storing the production data in each storage server based on the corresponding relationship. The application achieves the technical effects of reducing the difficulty and cost of local maintenance of a super large data table in the later stage.

Description

Technical Field

[0001] This invention relates to the field of data computing technology, and specifically to a production data management system and method based on MES.

Background Technology

[0002] Manufacturing Execution System (MES), as a bridge connecting enterprise resource planning and industrial automation equipment, has become the core hub of smart factories. This system integration platform is tailored for the manufacturing industry, and its core mission is to comprehensively improve the execution efficiency of the production process, accurately coordinate the connection and cooperation of each production stage, realize real-time dynamic monitoring of production activities, and provide management with refined production data support throughout the entire process.

[0003] The current mainstream MES production and manufacturing data storage adopts a multi-model database architecture. This architecture requires the deployment of multiple heterogeneous database instances, each of which requires independent hardware resources and an operation and maintenance team. As the scale of production expands, the procurement cost of storage devices, the energy consumption of the data center, and the consumption of computing resources brought about by cross-database synchronization increase exponentially. This makes it extremely difficult and costly to maintain ultra-large data tables locally in the later stages, and the scalability of storage capacity in data storage management is poor.

Summary of the Invention

[0004] To address the technical problem in related technologies where managing production data through a multi-model database architecture results in extremely high difficulty and cost of maintaining ultra-large data tables locally in the later stages, this invention provides a production data management system and method based on MES.

[0005] The specific technical solution adopted is as follows:

[0006] Acquire production data from each stage of the manufacturing execution system;

[0007] Calculate the data value of production data based on the outlier characteristics of each production data point.

[0008] When the value of the data meets the preset conditions, the first storage data that needs to be fully stored and the second storage data that needs to be partially stored are selected from the production data.

[0009] Based on the data value and the storage cost of the storage server, the first and second storage data are matched with the storage server to determine the correspondence between production data and storage server.

[0010] Based on the correspondence, production data is stored evenly across various storage servers.

[0011] In one possible implementation of this application, calculating the data value of production data based on the outlier characteristics of each production data point includes:

[0012] The production data is transformed into multiple feature data points, and the outlier degree of each feature data point is calculated based on the Euclidean distance between them.

[0013] Based on the degree of outlier and the preset clustering algorithm, each feature data point is clustered to obtain multiple clusters;

[0014] Determine the first difference between the maximum data volume in each cluster and the corresponding data volume of each cluster. Normalize the first difference to obtain the data value of each cluster.

[0015] In one possible implementation of this application, based on the outlier degree and a preset clustering algorithm, each feature data point is clustered to obtain multiple clusters, including:

[0016] Based on the degree of outlier and the amount of data for each feature data point, an outlier curve is obtained by fitting each feature data point.

[0017] The neighborhood radius is determined based on the feature data point corresponding to the maximum rate of change on the outlier curve.

[0018] Based on the neighborhood radius and a preset clustering algorithm, each feature data point is clustered to obtain multiple clusters.

[0019] In one possible implementation of this application, when the data value meets preset conditions, a first set of storage data that needs to be fully stored and a second set of storage data that needs to be partially stored are selected from the production data, including:

[0020] When the data value is greater than a preset data value threshold, the production data corresponding to the data value is determined as the first storage data that needs to be stored in full; otherwise, it is the second storage data that needs to be partially stored.

[0021] In one possible implementation of this application, based on data value and storage costs of storage servers, each first and second stored data is matched with a storage server to determine the correspondence between production data and storage servers, including:

[0022] Obtain the amount of stored data, additional power consumption, and average access latency of each storage server in the previous storage stage;

[0023] The continuous storage cost of the storage server is calculated based on the ratio between the additional electricity cost and the amount of data stored.

[0024] The access cost of the storage server is calculated based on the normalized value of the average access latency.

[0025] Based on the data value and the distance between each feature data point and the center of the largest cluster, the storage value of each feature data point is calculated.

[0026] The average value of the stored value is used as the first weight. Based on the first weight, the continuous storage cost and access cost are weighted and summed to obtain the comprehensive storage cost of the storage server.

[0027] Based on the overall storage cost and storage value, each type of first and second storage data is matched with a storage server to determine the correspondence between production data and storage servers.

[0028] In one possible implementation of this application, the storage value of each feature data point is calculated based on its data value and the distance between each feature data point and the center of the largest cluster, including:

[0029] Determine the first distance between each feature data point and the maximum cluster center, and the second distance between the cluster center corresponding to each feature data point and the maximum cluster center;

[0030] Calculate the second difference between the first distance and the second distance;

[0031] The storage value of each feature data point is calculated based on the product of the data value and the second difference.

[0032] In one possible implementation of this application, based on comprehensive storage costs and storage value, each first and second storage data is matched with a storage server to determine the correspondence between production data and storage servers, including:

[0033] The overall storage costs are arranged from lowest to highest, and the storage servers are divided into ranges.

[0034] Based on the partitioned storage ranges and storage servers, a consistent hash ring is constructed.

[0035] The first and second stored data are sorted in descending order of their storage value so that each first and second stored data corresponds to a storage interval, thus obtaining the correspondence between production data and storage servers.

[0036] In one possible implementation of this application, a consistent hash ring is constructed based on the partitioned storage regions and storage servers, including:

[0037] Obtain the remaining storage space and available read / write speed of the storage servers in each storage region;

[0038] Calculate the percentage of remaining storage space in each storage server, and the first product between the percentage of remaining storage space and the available read / write speed;

[0039] The number of hash ring virtual nodes required for each storage server is calculated based on the product between the first product and the preset minimum number of virtual nodes.

[0040] By evenly distributing the virtual nodes corresponding to the number of virtual nodes in the hash ring and connecting them end to end, a consistent hash ring is constructed.
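The virtual-node weighting and ring construction described in paragraphs [0037] to [0040] can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `virtual_node_count` and `HashRing` are hypothetical, the available read/write speed is treated as a unitless relative factor, and virtual nodes are placed on the ring by hashing their names with MD5, which approximates even distribution rather than enforcing it.

```python
import bisect
import hashlib
import math


def _hash(key):
    # Assumption: MD5 is used only to spread keys around the ring.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def virtual_node_count(free_fraction, rw_speed, min_virtual_nodes):
    # First product: remaining-space fraction times available read/write speed.
    first_product = free_fraction * rw_speed
    # Scale the preset minimum number of virtual nodes by the first product.
    return max(1, math.ceil(first_product * min_virtual_nodes))


class HashRing:
    def __init__(self, servers, min_virtual_nodes=50):
        # servers: {name: (free_fraction, rw_speed)}, hypothetical input shape.
        self._ring = []
        for name, (free_fraction, rw_speed) in servers.items():
            count = virtual_node_count(free_fraction, rw_speed, min_virtual_nodes)
            for k in range(count):
                self._ring.append((_hash(f"{name}#{k}"), name))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    def lookup(self, data_number):
        # Walk clockwise to the first virtual node at or after the key's hash;
        # the modulo closes the ring end to end.
        pos = bisect.bisect_right(self._keys, _hash(str(data_number)))
        return self._ring[pos % len(self._ring)][1]


ring = HashRing({"s1": (0.5, 2.0), "s2": (0.25, 1.0)})
target = ring.lookup("batch-001")
```

A server with more free space and higher available speed receives more virtual nodes and therefore, on average, a larger share of the production data numbers.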

[0041] In one possible implementation of this application, production data is stored evenly across various storage servers based on a correspondence relationship, including:

[0042] The production data is numbered to obtain multiple production data numbers;

[0043] Based on the correspondence and production data number, each production data is compared and mapped with multiple virtual nodes corresponding to the storage server in order to store each production data.

[0044] To achieve the above objectives, this application also provides a production data management system based on MES. The system includes a memory, a processor, and a computer program stored in the memory and running on the processor. When the processor executes the computer program, it implements the steps of the production data management method based on MES as described above.

[0045] The present invention has, but is not limited to, the following technical effects:

[0046] Production data is acquired from each stage of the manufacturing execution system, and its data value is calculated based on the outlier characteristics of each production data point. When the data value meets preset conditions, the first storage data that needs to be fully stored and the second storage data that needs to be partially stored are selected from the production data. Based on the data value and the storage cost of the storage server, the first and second storage data are matched with storage servers to determine the correspondence between production data and storage servers. Then, based on this correspondence, the various types of production data are stored evenly across the storage servers, thereby reducing the total data storage volume. Furthermore, because data of different values is stored evenly, the storage management of data generated by different production processes can balance later storage costs according to the value of the data. This optimizes the data storage of production processes, reduces later maintenance and management costs, and provides high scalability at a low cost of expanding storage capacity.

Attached Figure Description

[0047] Figure 1 is a flowchart illustrating the first embodiment of the MES-based production data management method of this application;

[0048] Figure 2 is a schematic diagram of the system architecture involved in the MES-based production data management method of this application;

[0049] Figure 3 is a schematic diagram of the overall implementation process of the MES-based production data management method in this application;

[0050] Figure 4 is a schematic diagram of the outlier curve involved in the MES-based production data management method of this application;

[0051] Figure 5 is a schematic diagram illustrating the storage cost range loop involved in the MES-based production data management method of this application;

[0052] Figure 6 is a schematic diagram of the data storage process involved in the MES-based production data management method of this application;

[0053] Figure 7 is a schematic diagram of the device structure of the hardware operating environment involved in the embodiments of this application.

Detailed Implementation

[0054] It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to limit this application.

[0055] This application provides a production data management method based on MES. In the first embodiment of the production data management method based on MES in this application, referring to Figure 1, the method includes:

[0056] Step S10: Obtain production data for each production stage in the manufacturing execution system;

[0057] Step S20: Calculate the data value of the production data based on the outlier characteristics of each production data point;

[0058] Step S30: When the data value meets the preset conditions, select the first storage data that needs to be fully stored and the second storage data that needs to be partially stored from the production data.

[0059] Step S40: Based on the data value and the storage cost of the storage server, match each first storage data and second storage data with the storage server to determine the correspondence between production data and storage server;

[0060] Step S50: Based on the correspondence, the production data is stored evenly on each storage server.

[0061] This embodiment aims to: match the first and second stored data with storage servers, determine the correspondence between production data and storage servers, and then, based on the correspondence, store various types of production data evenly in various storage servers, thereby reducing the total data storage volume and balancing the subsequent storage costs according to its value scale.

[0062] Step S10: Obtain production data for each production stage in the manufacturing execution system.

[0063] As an example, the MES-based production data management method can be applied to the MES-based production data management device, which belongs to the category of MES-based production data management equipment.

[0064] As an example, the MES-based production data management method can also be applied to an MES-based production data management system, which can be a distributed storage architecture system. As shown in Figure 2, its main architecture consists of a central node (the central routing module in the diagram) and multiple edge storage service nodes (edge storage modules in the diagram). The central node stores the routing information of the data. All storage and access processes need to be routed from the central node. For each production data link in the manufacturing process, the generated production data is collected and temporary servers are set up nearby (servers 1 / 2 / 3, etc. in the diagram).

[0065] The temporary server has short-term data processing and storage capabilities, periodically transmitting data that needs to be uploaded to the central node of the distributed architecture. The central node obtains various parameter data from the edge servers, maps and routes the data that needs to be persistently stored to the corresponding edge servers, and stores the data routing information. The overall implementation process involved in this embodiment is shown in Figure 3, and the detailed process is explained below.

[0066] As an example, the scenario addressed by this application could be: selectively distributing the storage of data generated at different stages of the manufacturing process, thereby reducing the data storage cost of the manufacturing process.

[0067] As an example, the product manufacturing process mainly includes: procurement, warehousing management, production, and finished product quality inspection. Each manufacturing stage contains some production data. For example, in the procurement stage, the production data includes supplier information, quotations, purchase orders, goods purchase orders, payment applications, and material details. In this stage, a temporary server is set up to temporarily store the full amount of data for a certain period of time, and it can be queried. Production data can be obtained from these temporary servers.

[0068] Step S20: Calculate the data value of each production data based on its outlier characteristics.

[0069] As an example, a large amount of recorded data is generated at each stage of the manufacturing process, and the amount of data generated at different stages varies significantly. For example, the stages of material warehousing, management and production mainly consist of production instruction parameters, status logs and result records, while the stages of material inspection and finished product testing capture a large amount of data information, such as the detection data generated by machine vision solutions, which has high resolution and large memory usage.

[0070] However, existing traditional production data management systems typically allocate data from the same production stage to the same storage server based on geographical proximity. This can cause excessive storage pressure on edge storage servers that generate high-density data, while other storage servers store relatively less data and have a higher idle rate, resulting in an unbalanced data load in the distributed storage system.

[0071] Because a large amount of low-value data is accessed infrequently and contributes little to the analysis results during subsequent data use and analysis, the value of storing it in full is generally small. Therefore, high-value data generated during the manufacturing process is usually stored in full, while only a representative portion of the low-value data is stored. Within each manufacturing stage, the manufacturing process and methods are similar, so a small amount of data with suspected anomalies is generally more valuable for subsequent production process analysis. Clustering can therefore be used to classify the full data of each production stage: the smaller the amount of data in a category, the higher its value can be considered. Generally, high-value data is data with anomalies and obvious outlier characteristics. Such data accounts for a small proportion of the total batch data and is key to revealing problems, optimizing production processes, and avoiding losses. Conversely, data without outlier characteristics is of lower value. Therefore, the storage value of production data is calculated based on the outlier characteristics of the various production data.

[0072] As an example, data value is used to characterize the value of production data for storage. Data with high value will have higher access volume and play a key role in optimizing production processes.

[0073] Specifically, step S20 of the MES-based production data management method further includes steps S21 to S24:

[0074] Step S21: The production data is transformed into multiple feature data points, and the outlier degree of each feature data point is calculated based on the Euclidean distance between them.

[0075] As an example, when acquiring production data, the full amount of production data is retrieved from each nearby server. The time interval for data storage, uploading, and analysis is set. When the storage capacity threshold for the time interval is reached, the outlier degree of each piece of production data is calculated. The time interval can generally be set to 10 minutes or 1 hour, etc.

[0076] As an example, since the data types of different production and manufacturing processes may be inconsistent, such as status reflection and detection monitoring, and the data formats are also inconsistent, including sound vibration and image video, it is not possible to directly cluster the raw data in a unified way. Feature extraction and normalization are required to obtain multiple feature data points. Feature extraction is an existing technology and will not be elaborated here. The normalized features of each piece of full production data obtained are the same, and the feature dimensions of the data within the same production stage are the same.

[0077] As an example, this embodiment mainly aims to find a small number of feature data points with outlier characteristics. These outlier data points lie at significantly greater distances from the other, normal data points. For example, suppose point A is an outlier while B, C, and D are normal feature data points. The distances among B, C, and D are much smaller than the distances between A and each of B, C, and D. Based on this, the outlier degree of each feature data point is calculated according to the Euclidean distance between them.

[0078] Specifically, taking the i-th feature data point as an example, the outlier degree can be calculated as follows:

[0079]

[0080] The formula relates three quantities: the outlier degree of the i-th feature data point in the dataset of the corresponding production stage; the sum of the distances between the i-th feature data point and all other feature data points; and the average, over all feature data points, of the sum of distances to the other feature data points. All distances are Euclidean distances.
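A minimal sketch of an outlier degree built from the two distance quantities described above. The way they are combined here, as a ratio of a point's distance sum to the average distance sum, is an assumption made for illustration; the function name `outlier_degrees` and the sample coordinates are hypothetical.

```python
import math


def outlier_degrees(points):
    """Return one outlier degree per feature data point.

    Assumption: degree = (sum of Euclidean distances from the point to all
    other points) / (average of those sums over all points).
    """
    # The distance from a point to itself is 0, so including it is harmless.
    sums = [sum(math.dist(p, q) for q in points) for p in points]
    mean_sum = sum(sums) / len(sums)
    if mean_sum == 0:
        return [0.0 for _ in sums]
    return [s / mean_sum for s in sums]


# A lies far from B, C, and D, mirroring the example in paragraph [0077].
points = {"A": (10.0, 10.0), "B": (0.0, 0.0), "C": (0.0, 1.0), "D": (1.0, 0.0)}
degrees = dict(zip(points, outlier_degrees(list(points.values()))))
```

Under this ratio form, points with a degree above 1 are farther from the rest than average, so A scores above 1 while B, C, and D score below 1.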

[0081] Step S22: Based on the outlier degree and the preset clustering algorithm, cluster the feature data points to obtain multiple clusters.

[0082] As an example, since relatively isolated data with abnormal features are generally considered to be of higher value, a pre-defined clustering algorithm is used to cluster the data by extracting features. The pre-defined clustering algorithm is the DBSCAN algorithm, which is used to classify various feature data points to obtain multiple clusters.

[0083] Step S22 includes:

[0084] Based on the degree of outlier and the amount of data for each feature data point, an outlier curve is obtained by fitting each feature data point.

[0085] The neighborhood radius is determined based on the feature data point corresponding to the maximum rate of change on the outlier curve.

[0086] As an example, when clustering each feature data point using a pre-defined clustering algorithm, it is also necessary to determine the neighborhood radius and the minimum number of samples within the neighborhood radius. The algorithm takes a data point as the center and groups all data points within the neighborhood radius into one class, so both parameters can be estimated from the point with the largest rate of change on the outlier curve.

[0087] As an example, by accumulating data points with the same outlier level, the number of data points at each outlier level is obtained. Then, a curve is fitted with the x-axis representing the outlier level and the y-axis representing the number of feature data points, resulting in an outlier curve. This outlier curve reflects the change in the number of data points with different outlier levels. The curve is shown in Figure 4.

[0088] As an example, the point with the largest rate of change in the outlier curve is calculated. This point represents the location where the data point's distance from the feature changes significantly. The average distance between the data point represented by this point and other data points is taken as the neighborhood radius; the number of data points represented by this point is taken as the minimum sample size.
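One way the parameter estimate in paragraphs [0087] and [0088] might be sketched. Several choices here are assumptions rather than details from the text: outlier degrees are discretised into levels with a hypothetical `bin_width`, the rate of change is a finite difference between adjacent levels, and the upper level of the steepest step is taken as "the point".

```python
import math
from collections import defaultdict


def estimate_dbscan_params(points, degrees, bin_width=0.25):
    """Return (neighborhood_radius, min_samples) from the outlier curve."""
    # Accumulate points that share the same (discretised) outlier level.
    levels = defaultdict(list)
    for idx, deg in enumerate(degrees):
        levels[round(deg / bin_width)].append(idx)
    xs = sorted(levels)
    if len(xs) < 2:
        raise ValueError("need at least two distinct outlier levels")
    counts = [len(levels[x]) for x in xs]
    # Finite-difference rate of change of the count between adjacent levels.
    best = max(
        range(1, len(xs)),
        key=lambda k: abs(counts[k] - counts[k - 1]) / (xs[k] - xs[k - 1]),
    )
    members = levels[xs[best]]
    others = len(points) - 1
    # Average distance from the points at that level to all other points.
    mean_dists = [
        sum(math.dist(points[i], q) for q in points) / others for i in members
    ]
    radius = sum(mean_dists) / len(mean_dists)
    return radius, len(members)


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
degrees = [0.7, 0.7, 0.7, 0.75, 1.8]  # illustrative values only
radius, min_samples = estimate_dbscan_params(points, degrees)
```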

[0089] Based on the neighborhood radius and a preset clustering algorithm, each feature data point is clustered to obtain multiple clusters.

[0090] As an example, after determining the neighborhood radius and the minimum number of samples, a pre-defined clustering algorithm is used to cluster each feature data point to obtain multiple clusters. The clustering method of the clustering algorithm is an existing technology and will not be elaborated here.
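For readers unfamiliar with DBSCAN, a compact pure-Python version is sketched below. It is a textbook form of the algorithm written for illustration, not the embodiment's implementation; the function name `dbscan`, the sample points, and the parameter values are hypothetical. A point counts itself among its neighbours.

```python
import math


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = noise  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
# labels == [0, 0, 0, 1, 1, 1, -1]
```

The isolated point at (50, 50) is labelled as noise, which is the kind of small, isolated group the embodiment treats as potentially high-value.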

[0091] Step S23: Determine the first difference between the maximum data volume in each cluster and the corresponding data volume of each cluster, and normalize the first difference to obtain the data value of each cluster.

[0092] As an example, taking the a-th cluster as an example, the data value $V_a$ can be calculated as:

[0093] $V_a = \mathrm{norm}(N_{\max} - N_a)$

[0094] In the formula, $V_a$ indicates the storage value of the data in the a-th cluster within the current production stage; $N_{\max}$ represents the maximum amount of data among all clusters; $N_a$ indicates the number of data points in the a-th cluster of the clustering results; $N_{\max} - N_a$ is the first difference; and norm() indicates normalization.
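The cluster data value of step S23 can be illustrated with a short sketch. The normalization method is not specified in the text, so dividing by the largest first difference is an assumption; the function name `cluster_data_values` and the cluster names and sizes are hypothetical.

```python
def cluster_data_values(cluster_sizes):
    """Map each cluster to a data value in [0, 1]; smaller clusters score higher."""
    largest = max(cluster_sizes.values())
    # First difference: largest cluster size minus this cluster's size.
    diffs = {name: largest - size for name, size in cluster_sizes.items()}
    top = max(diffs.values())
    if top == 0:  # all clusters have the same size
        return {name: 0.0 for name in diffs}
    # Assumed normalization: scale by the largest first difference.
    return {name: diff / top for name, diff in diffs.items()}


values = cluster_data_values({"normal": 950, "drift": 40, "fault": 10})
```

Here the largest cluster receives a data value of 0 and the smallest a data value of 1, matching the idea that rare data is more valuable.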

[0095] Step S30: When the data value meets the preset conditions, select the first storage data that needs to be fully stored and the second storage data that needs to be partially stored from the production data.

[0096] Step S30 includes:

[0097] When the data value is greater than a preset data value threshold, the production data corresponding to the data value is determined as the first storage data that needs to be stored in full; otherwise, it is the second storage data that needs to be partially stored.

[0098] As an example, the preset data value threshold can be 0.8, 0.9, etc., and there is no specific limitation.

[0099] As an example, with a preset data value threshold of 0.9, when the data value is greater than 0.9, the data of this type is considered high-value first storage data and needs to be stored in its entirety; for other types of data, multiple data items near the center of the class are selected and stored, and these are the second storage data.

[0100] Specifically, when the data value of a certain type of data is lower than the preset data value threshold, only a small portion of the data near the center of the category is uploaded as a representative, while for the rest only indicator results are uploaded. Conversely, when the data belongs to a high-value category with a value greater than the threshold, all data in that category is uploaded. For categories that do not reach the threshold, the data closest to the category center by Euclidean distance, amounting to at least 1% of the category's total data volume, is selected and uploaded.
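The screening rule of step S30 might be sketched as follows. The function name `select_for_storage` and its parameters are hypothetical, and the category center is taken to be the coordinate-wise mean of the category's feature data points, which is an assumption.

```python
import math


def select_for_storage(cluster_points, data_value, threshold=0.9, min_fraction=0.01):
    """Return (points_to_store, mode) for one cluster."""
    if data_value > threshold:
        return list(cluster_points), "full"
    dims = range(len(cluster_points[0]))
    # Assumed center: coordinate-wise mean of the cluster.
    centre = tuple(
        sum(p[d] for p in cluster_points) / len(cluster_points) for d in dims
    )
    # Keep at least min_fraction of the cluster, and never fewer than one point.
    keep = max(1, math.ceil(min_fraction * len(cluster_points)))
    ranked = sorted(cluster_points, key=lambda p: math.dist(p, centre))
    return ranked[:keep], "partial"


cluster = [(float(i), 0.0) for i in range(250)]
kept, mode = select_for_storage(cluster, data_value=0.3)
everything, full_mode = select_for_storage(cluster, data_value=0.95)
```

With 250 points and a 1% floor, three points nearest the center are kept for the low-value cluster, while the high-value cluster is kept whole.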

[0101] As an example, high-value data is filtered out from the different production stages and then stored and managed for easy access later. Distributed storage can be used to persistently store the data from different production stages, since low-cost storage servers can be added dynamically and maintaining a single server is relatively simple.

[0102] Step S40: Based on the data value and the storage cost of the storage server, match each first storage data and second storage data with the storage server to determine the correspondence between production data and storage server.

[0103] As an example, during storage, a consistent hashing ring can be used to map each server to a virtual node, thus avoiding storage imbalance. However, because high-value data is more likely to be accessed later, and because the continuous storage cost and access efficiency cost of each edge server are inconsistent, the uniform routing distribution of the hashing ring algorithm may cause high-value data to be allocated to unsuitable server nodes, resulting in a higher overall storage access cost. For example, high-value data is stored on edge server nodes with high access efficiency costs, and high-value data is usually accessed frequently later, leading to a high final data management cost. Therefore, it is also necessary to match various types of primary and secondary storage data with storage servers based on the data value and storage cost of various data types, thereby avoiding situations where access costs or storage costs are high.

[0104] Step S40 includes:

[0105] Obtain the amount of stored data, additional power consumption, and average access latency of each storage server in the previous storage stage;

[0106] The continuous storage cost of the storage server is calculated based on the ratio between the additional electricity cost and the amount of data stored.

[0107] As an example, data exists in multiple storage stages during the production process. The amount of data stored on the storage server in the previous stage, the additional power consumption, and the average access latency can be obtained from the database or system records.

[0108] As an example, the continuous storage cost $C_{\text{store}}$ can be calculated as:

[0109] $C_{\text{store}} = \mathrm{norm}(\Delta E / M)$

[0110] In the formula, $\Delta E$ indicates the additional power consumption of the storage server relative to the previous storage stage, that is, the additional electricity cost; $M$ indicates the amount of data stored in the previous storage stage; and norm() represents the normalization calculation.

[0111] The access cost of the storage server is calculated based on the normalized value of the average access latency.

[0112] As an example, the access cost $C_{\text{access}}$ can be calculated as:

[0113] $C_{\text{access}} = \mathrm{norm}(S)$

[0114] In the formula, S represents the average access latency when reading from the server in the previous stage, and norm() represents the normalization calculation. The faster the data is obtained during the access process, the lower the cost consumed during the access.
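The two server-side costs can be illustrated together. In this sketch, min-max scaling across servers stands in for the unspecified normalization, and the dictionary keys (`extra_power_cost`, `stored_amount`, `avg_latency_ms`) are hypothetical field names.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def server_costs(servers):
    """Return a list of (continuous_storage_cost, access_cost) per server."""
    # Continuous storage cost: extra electricity cost per unit of stored data.
    storage = min_max([s["extra_power_cost"] / s["stored_amount"] for s in servers])
    # Access cost: normalized average access latency.
    access = min_max([s["avg_latency_ms"] for s in servers])
    return list(zip(storage, access))


servers = [
    {"extra_power_cost": 1000.0, "stored_amount": 1000.0, "avg_latency_ms": 5.0},
    {"extra_power_cost": 3000.0, "stored_amount": 1000.0, "avg_latency_ms": 20.0},
    {"extra_power_cost": 1000.0, "stored_amount": 500.0, "avg_latency_ms": 10.0},
]
costs = server_costs(servers)
```

The first server is cheapest on both measures and the second is most expensive on both, so they map to (0, 0) and (1, 1) respectively.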

[0115] Based on the data value and the distance between each feature data point and the center of the largest cluster, the storage value of each feature data point is calculated.

[0116] As an example, the value of each piece of data that needs to be uploaded is calculated in detail. When the distance between a data point and the largest class center (the center of the largest cluster) is greater than the distance between its own class center and the largest class center, it indicates that the storage value of the current data point is higher. Based on this, the storage value of each feature data point is calculated.

[0117] The step of calculating the storage value of each feature data point based on its data value and the distance between each feature data point and the center of the largest cluster includes:

[0118] Determine the first distance between each feature data point and the maximum cluster center, and the second distance between the cluster center corresponding to each feature data point and the maximum cluster center;

[0119] Calculate the second difference between the first distance and the second distance;

[0120] The storage value of each feature data point is calculated based on the product of the data value and the second difference.

[0121] Taking the i-th data point in the a-th cluster as an example, the storage value can be calculated as:

[0122]

[0123] The quantities in the formula are: the storage value of the i-th data point in the a-th cluster that needs to be uploaded and stored; the data value of cluster a; the first distance between the i-th feature data point in the a-th cluster and the center of the largest cluster; the second distance between the center of the a-th cluster and the center of the largest cluster; the second difference between the first distance and the second distance; and the natural constant e.
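The distance-based steps above can be sketched as follows. This is only an illustration under stated assumptions: it implements the plain product of the data value and the second difference as described in the steps, takes the second difference as first distance minus second distance, uses Euclidean distance, and leaves out the natural constant because its exact role in the formula is not recoverable from the text. The function name and sample coordinates are hypothetical.

```python
import math
from typing import Sequence


def storage_value(
    point: Sequence[float],
    own_center: Sequence[float],
    largest_center: Sequence[float],
    cluster_data_value: float,
) -> float:
    """Product of the cluster's data value and the second difference (assumed d1 - d2)."""
    # First distance: feature data point to the center of the largest cluster.
    first_distance = math.dist(point, largest_center)
    # Second distance: the point's own cluster center to the center of the largest cluster.
    second_distance = math.dist(own_center, largest_center)
    second_difference = first_distance - second_distance
    return cluster_data_value * second_difference


# Hypothetical 2-D example: the point lies farther from the largest cluster
# than its own cluster center does, so its storage value is positive.
value = storage_value(
    point=(6.0, 8.0),
    own_center=(3.0, 4.0),
    largest_center=(0.0, 0.0),
    cluster_data_value=0.8,
)
```

A point that sits farther from the largest cluster than its own cluster center receives a larger value, which matches the qualitative rule stated in the description.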

[0124] The average of the storage values is used as the first weight. Based on the first weight, the continuous storage cost and the access cost are weighted and summed to obtain the comprehensive storage cost of the storage server.

[0125] As an example, the higher the value of the data, the more suitable it is to store it on a server with a lower access cost, which makes subsequent access easier. When the value of the data is moderate (e.g., 0.5), the server's continuous storage cost and access cost are equally important, so the server cost intervals can be divided according to the average of the continuous storage cost and the access cost.

[0126] As an example, for high-value data, the product of the access cost and the data value can be used as the comprehensive storage cost. When the storage value of the data is moderate, the comprehensive storage cost of each storage server can be calculated as:

[0127]

[0128] In the formula, FZ represents the comprehensive storage cost of a server for data of moderate value that needs to be stored; the weighting term is the average storage value of the data that needs to be stored, that is, the first weight.
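One plausible reading of the weighted sum is sketched below. The exact formula is not recoverable from the text, so the form used here is an assumption: the first weight is applied to the access cost and its complement to the continuous storage cost, which reproduces the equal-importance case at a weight of 0.5. The function name and sample values are hypothetical.

```python
def comprehensive_cost(continuous_cost: float, access_cost: float, first_weight: float) -> float:
    """Weighted sum of the two costs; the weighting form is an assumption, not the patent's formula."""
    if not 0.0 <= first_weight <= 1.0:
        raise ValueError("first_weight is expected to lie in [0, 1]")
    # Higher average storage value puts more emphasis on access cost.
    return first_weight * access_cost + (1.0 - first_weight) * continuous_cost


# With a moderate weight of 0.5 the result is the plain average of the two costs.
fz = comprehensive_cost(continuous_cost=0.2, access_cost=0.6, first_weight=0.5)
```

Under this reading, valuable data penalizes slow servers more heavily, so it tends to be matched with servers that are cheap to access.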

[0129] Based on the overall storage cost and storage value, each type of first and second storage data is matched with a storage server to determine the correspondence between production data and storage servers.

[0130] As an example, after calculating the overall storage cost of the storage server and the storage value of each piece of production data, each piece of production data is matched with a storage server, and the resulting matching relationship is used as the correspondence between production data and storage servers.

[0131] The step of matching each piece of first and second stored data with storage servers based on comprehensive storage costs and storage value, and determining the correspondence between production data and storage servers, includes:

[0132] The overall storage costs are arranged from lowest to highest, and the storage servers are divided into ranges.

[0133] As an example, the overall storage costs of the storage servers needed to store the current data are sorted from smallest to largest and then divided into multiple storage intervals. There are at least eight storage intervals, and they are connected end to end from smallest to largest, as shown in Figure 5.
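The sort-and-divide step can be sketched as below. The text does not say how interval boundaries are chosen, so splitting the sorted servers into groups of near-equal size is an assumption; the server names and cost values are hypothetical.

```python
from typing import Dict, List


def partition_by_cost(costs: Dict[str, float], n_intervals: int = 8) -> List[List[str]]:
    """Sort servers by comprehensive cost (ascending) and split them into contiguous intervals."""
    ordered = sorted(costs, key=costs.get)
    base, extra = divmod(len(ordered), n_intervals)
    intervals: List[List[str]] = []
    start = 0
    for i in range(n_intervals):
        # The first `extra` intervals take one additional server each.
        size = base + (1 if i < extra else 0)
        intervals.append(ordered[start:start + size])
        start += size
    return intervals


# Hypothetical costs for 16 servers: srv15 is the cheapest, srv00 the most expensive.
costs = {f"srv{i:02d}": (16 - i) / 16 for i in range(16)}
intervals = partition_by_cost(costs, n_intervals=8)
```

Because the intervals are contiguous slices of the sorted list, the last server of one interval is adjacent in cost to the first server of the next.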

[0134] Based on the partitioned storage ranges and storage servers, a consistent hash ring is constructed.

[0135] As an example, the remaining storage space and available read/write speed of the edge servers within each storage interval are obtained, and a local consistent hash ring is constructed from the multiple storage servers in that interval.

[0136] The steps for constructing a consistent hash ring based on the partitioned storage regions and storage servers include:

[0137] Obtain the remaining storage space and available read / write speed of the storage servers in each storage region.

[0138] Calculate the percentage of remaining storage space in each storage server, and the first product between the percentage of remaining storage space and the available read / write speed.

[0139] The number of hash ring virtual nodes required for each storage server is calculated based on the product between the first product and the preset minimum number of virtual nodes.

[0140] As an example, when constructing a consistent hash ring, the number of virtual nodes in the hash ring must be determined. Taking the k-th distributed storage server in the i-th cost interval as an example, the number of hash ring virtual nodes can be calculated as:

[0141]

[0142] The quantities in the formula are: the preset minimum number of virtual nodes, which based on experience should be set to no less than 20; the proportion of remaining storage space to total storage space on the k-th edge storage server; the available read/write speed of the k-th server; and the first product of the latter two.

[0143] By evenly distributing the virtual nodes corresponding to the number of virtual nodes in the hash ring and connecting them end to end, a consistent hash ring is constructed.

[0144] As an example, the local hash virtual nodes within each storage interval are evenly distributed and connected end to end to form a ring structure. The number of hash virtual nodes in the i-th storage interval is recorded for the subsequent mapping.
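The ring construction for one storage interval can be sketched as follows, under several assumptions: the virtual node count is taken as the preset minimum multiplied by the first product and rounded up, the read/write speed is treated as a relative factor because its unit is not given, and "evenly distributed" is modeled by interleaving the servers' virtual nodes round-robin. All names and figures are hypothetical.

```python
import math
from typing import Dict, List


def virtual_node_count(remaining_fraction: float, rw_speed: float, min_nodes: int = 20) -> int:
    """Preset minimum times (remaining-space fraction x read/write speed), rounded up (assumed)."""
    first_product = remaining_fraction * rw_speed
    return max(1, math.ceil(min_nodes * first_product))


def build_ring(node_counts: Dict[str, int]) -> List[str]:
    """Interleave each server's virtual nodes round-robin; index i is virtual node ID i."""
    remaining = dict(node_counts)
    ring: List[str] = []
    while any(remaining.values()):
        for server in sorted(remaining):
            if remaining[server] > 0:
                ring.append(server)
                remaining[server] -= 1
    return ring


# Hypothetical interval with two servers.
counts = {
    "edge-a": virtual_node_count(remaining_fraction=0.5, rw_speed=2.0),
    "edge-b": virtual_node_count(remaining_fraction=0.25, rw_speed=1.5),
}
ring = build_ring(counts)
```

The list is treated as circular, so the last virtual node is followed by the first, and servers with more free space and faster I/O own proportionally more positions.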

[0145] The first and second stored data are sorted in descending order of their storage value so that each first and second stored data corresponds to a storage interval, thus obtaining the correspondence between production data and storage servers.

[0146] As an example, the data to be stored is sorted by storage value and distributed evenly across the storage intervals. In this way, production data with high storage value is stored on servers with low overall storage cost, thereby reducing the total cost. After matching, the correspondence between the first and second storage data and each storage interval is determined. According to the CAP theorem, no fewer than 3 replicas are stored for each piece of data. For example, if the total amount of data to be stored is taken as 1, the first 1/8 of the data, sorted from largest to smallest storage value, is stored in the first storage interval.
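The matching of data to storage intervals can be sketched as a rank-based split. This is an illustration only: it assumes the intervals are indexed from lowest to highest comprehensive cost and that each interval receives an equal share of the sorted data, following the 1/8 example in the text. Replication is not modeled, and the item names are hypothetical.

```python
from typing import Dict


def assign_to_intervals(storage_values: Dict[str, float], n_intervals: int) -> Dict[str, int]:
    """Map each data item to an interval index; the highest storage values go to interval 0."""
    ordered = sorted(storage_values, key=storage_values.get, reverse=True)
    total = len(ordered)
    # Rank r out of `total` items falls into interval floor(r * n_intervals / total).
    return {item: rank * n_intervals // total for rank, item in enumerate(ordered)}


# Hypothetical data: 16 items whose storage value equals their index.
values = {f"d{i}": float(i) for i in range(16)}
assignment = assign_to_intervals(values, n_intervals=8)
```

With 16 items and 8 intervals, the two most valuable items land in interval 0, which holds the cheapest servers.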

[0147] Step S50: Based on the correspondence, the production data is stored evenly on each storage server.

[0148] As an example, after determining the correspondence between production data and various storage servers, each production data is stored in its corresponding storage server according to that correspondence.

[0149] Step S50 includes:

[0150] The production data is numbered to obtain multiple production data numbers;

[0151] Based on the correspondence and production data number, each production data is compared and mapped with multiple virtual nodes corresponding to the storage server in order to store each production data.

[0152] As an example, each storage area obtains the data to be stored and its copies, assigns an ID to all the data, and obtains multiple production data IDs.

[0153] Next, the virtual nodes of each interval's local hash ring are assigned ascending ID numbers.

[0154] The mapping takes the data's ID number modulo the number of hash virtual nodes in the corresponding interval:

[0155] $$\text{virtual node number} = \text{data number} \bmod \text{number of hash virtual nodes in the } i\text{-th storage interval}$$

[0156] In the formula, the virtual node number is the number obtained after mapping the m-th data item; the data number is its number before mapping; and the divisor is the number of hash virtual nodes in the i-th storage interval. The virtual node number at which the data is stored is used to obtain the route to the actual edge storage server corresponding to that virtual node, and the routing information is stored in the central server.
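The modulo mapping and the resulting routing table can be sketched as below. The ring contents, data IDs, and the dictionary used as a routing table are hypothetical. Note that this follows the modulo rule described in the text, which differs from the more common consistent-hashing lookup that hashes a key and walks clockwise to the next virtual node.

```python
from typing import Dict, List


def map_to_virtual_node(data_id: int, n_virtual_nodes: int) -> int:
    """Virtual node number = data ID modulo the number of virtual nodes in the interval."""
    return data_id % n_virtual_nodes


# Hypothetical local ring for one interval: index = virtual node ID, value = edge server.
ring: List[str] = ["srv-a", "srv-b", "srv-a", "srv-c"]

# Routing table of the kind the central server would keep: data ID -> edge server.
routing_table: Dict[int, str] = {}
for data_id in (10, 11, 12, 13):
    vnode = map_to_virtual_node(data_id, len(ring))
    routing_table[data_id] = ring[vnode]
```

Consecutive data IDs map to consecutive virtual nodes, so data spreads across servers in proportion to the number of virtual nodes each server owns.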

[0157] As an example, in subsequent storage processes, high-value data from each production stage is acquired and transmitted to the central node of a low-cost distributed storage network. Based on the real-time load of the edge nodes, the central node calculates the data distribution route and stores the routing results at the distributed central node. The data transmitted to the distributed storage network is encrypted and distributed, via the storage route obtained from the central node, to each edge server node for storage. The encryption algorithm can be symmetric or asymmetric. A schematic diagram of the specific storage process is shown in Figure 6.

[0158] Specifically, when persistently storing data during the finished product quality inspection stage in the production process, such as the PCBA testing stage, including ICT testing, FCT testing, fatigue testing, aging testing, and extreme condition testing, each testing stage generates a large amount of data. For example, ICT testing includes continuous data such as circuit continuity, voltage, current values, curve fluctuations, amplitude, and noise. Then, a temporary storage server is used to perform calculations to obtain high-value, persistent, full-scale test data, and this high-value full-scale data is stored in a distributed persistent storage architecture. Other low-value data only stores status parameters, including: product number name, line stop status, UPH, H / C, and number of on-duty personnel.

[0159] Secondly, subsequent accesses utilize the central node for forwarding to retrieve data. Upon receiving a client's access request, the backend logic server processes the target data, obtains the route to the target edge storage server where the target data is stored via the distributed central node, and then forwards the data read request through the distributed central node to obtain the data required by the client.

[0160] For example, when a client needs to access warehouse work order data generated during the warehousing phase, the request parameters include the data date, work order number, receiving slip number, etc. The backend logic server calculates the target data logic parameters of the request, obtains the routing information of the server where the target data is stored through the distributed central node, and obtains the corresponding storage status parameters, including box number, quantity, inflow type, material number, PID, etc. Finally, the obtained data is transmitted to the client for display. Access to other data is handled in the same way, which integrates data storage and access management and reduces the overall cost of data storage management.
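The read path described above (client request, route lookup at the central node, read from the edge server) can be sketched with in-memory stand-ins. Every class, method, key format, and record field here is hypothetical and only illustrates the order of the steps; networking and encryption are omitted.

```python
from typing import Dict


class CentralNode:
    """Hypothetical routing registry kept by the distributed central node."""

    def __init__(self) -> None:
        self._routes: Dict[str, str] = {}

    def register(self, data_key: str, server_name: str) -> None:
        self._routes[data_key] = server_name

    def route(self, data_key: str) -> str:
        return self._routes[data_key]


class EdgeServer:
    """Hypothetical edge storage server holding records in memory."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def write(self, data_key: str, record: dict) -> None:
        self._records[data_key] = record

    def read(self, data_key: str) -> dict:
        return self._records[data_key]


def handle_request(central: CentralNode, servers: Dict[str, EdgeServer], data_key: str) -> dict:
    """Backend logic: resolve the route at the central node, then read from that edge server."""
    server_name = central.route(data_key)
    return servers[server_name].read(data_key)


central = CentralNode()
servers = {"edge-1": EdgeServer(), "edge-2": EdgeServer()}
key = "2025-10-15/WO-001"  # hypothetical key built from data date and work order number
servers["edge-2"].write(key, {"box_number": "B17", "quantity": 40})
central.register(key, "edge-2")
result = handle_request(central, servers, key)
```

Keeping the routes only at the central node means the backend never needs to know the layout of the hash rings.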

[0161] In this embodiment, production data from each stage of the manufacturing process in the manufacturing execution system is acquired. Based on the outlier characteristics of each piece of production data, the data value of the production data is calculated. When the data value meets a preset condition, the first storage data that needs to be fully stored and the second storage data that needs to be partially stored are selected from the production data. The first and second storage data are matched with storage servers according to the data value and the storage cost of the storage servers, and the correspondence between the production data and the storage servers is determined. Then, according to the correspondence, the production data is stored evenly across the storage servers, thereby reducing the total data storage volume. Furthermore, data with different data values is stored in a balanced manner, so that the storage management of data generated by different production processes can balance later storage costs according to its value. This optimizes the data storage of the production process, reduces later maintenance and management costs, provides high scalability, and keeps the cost of expanding storage capacity low.

[0162] Referring to Figure 7, Figure 7 is a schematic diagram of the device structure of the hardware operating environment involved in the embodiments of this application.

[0163] As shown in Figure 7, the MES-based production data management device may include: a processor 1001, a memory 1003, and a communication bus 1002. The communication bus 1002 is used to realize the connection and communication between the processor 1001 and the memory 1003.

[0164] Optionally, the MES-based production data management device may also include a user interface, a network interface, a camera, RF (Radio Frequency) circuitry, sensors, a WiFi module, etc. The user interface may include a display screen and an input submodule such as a keyboard; optional user interfaces may also include standard wired or wireless interfaces. The network interface may include standard wired or wireless interfaces (such as a Wi-Fi interface).

[0165] Those skilled in the art will understand that the structure of the MES-based production data management device shown in Figure 7 does not constitute a limitation on the device. The device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.

[0166] As shown in Figure 7, the memory 1003, serving as a storage medium, may include an operating system, a network communication module, and an MES-based production data management program. The operating system is a program that manages and controls the hardware and software resources of the MES-based production data management device and supports the operation of the MES-based production data management program and other software and/or programs. The network communication module is used to enable communication between the various components within the memory 1003, as well as communication with other hardware and software in the MES-based production data management system.

[0167] In the MES-based production data management device shown in Figure 7, the processor 1001 is used to execute the MES-based production data management program stored in the memory 1003 to implement the steps of the MES-based production data management method described above.

[0168] The specific implementation of the production data management device based on MES in this application is basically the same as the embodiments of the production data management method based on MES described above, and will not be repeated here.

[0169] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

[0170] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0171] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0172] The above are merely preferred embodiments of this application and do not limit the scope of this application. Any equivalent structural or procedural transformations made based on the description and drawings of this application, or direct or indirect applications in other related technical fields, are similarly included within the scope of protection of this application.

[0173] It should be noted that the order of the above embodiments of the present invention is merely for descriptive purposes and does not represent the superiority or inferiority of the embodiments. The processes depicted in the accompanying drawings do not necessarily require a specific or sequential order to achieve the desired result. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

[0174] The various embodiments in this specification are described in a progressive manner. The same or similar parts between the various embodiments can be referred to each other. Each embodiment focuses on describing the differences from other embodiments.

Claims

1. An MES-based production data management method, characterized in that the method includes:
acquiring production data from each stage of the manufacturing process in the manufacturing execution system, and transforming the production data into multiple feature data points;
calculating the data value of the production data based on the outlier characteristics of each piece of production data;
when the data value meets preset conditions, selecting from the production data first storage data that needs to be fully stored and second storage data that needs to be partially stored;
matching each of the first and second storage data with a storage server based on the data value and the storage cost of the storage server, to determine the correspondence between the production data and the storage servers, specifically including:
obtaining the amount of stored data, additional power consumption, and average access latency of each storage server in the previous storage stage;
calculating the continuous storage cost of the storage server based on the ratio between the additional electricity cost and the amount of stored data;
calculating the access cost of the storage server based on the normalized value of the average access latency;
calculating the storage value of each feature data point based on the data value and the distance between each feature data point and the center of the largest cluster;
using the average of the storage values as the first weight, and, based on the first weight, weighting and summing the continuous storage cost and the access cost to obtain the comprehensive storage cost of the storage server;
matching each of the first and second storage data with a storage server based on the comprehensive storage cost and the storage value, to determine the correspondence between the production data and the storage servers, specifically including:
arranging the comprehensive storage costs from smallest to largest and dividing the storage servers into storage intervals;
constructing a consistent hash ring based on the divided storage intervals and the storage servers;
sorting the first and second storage data in descending order of storage value so that each of the first and second storage data corresponds to one of the storage intervals of the consistent hash ring, thereby obtaining the correspondence between the production data and the storage servers; and
storing the production data evenly across the storage servers based on the correspondence.

2. The MES-based production data management method as described in claim 1, characterized in that calculating the data value of the production data based on the outlier characteristics of each piece of production data includes: calculating the outlier degree of each feature data point based on the Euclidean distance between the feature data points; clustering the feature data points based on the outlier degree and a preset clustering algorithm to obtain multiple clusters; determining a first difference between the maximum data volume among the clusters and the data volume of each cluster; and normalizing the first difference to obtain the data value of each cluster.

3. The MES-based production data management method as described in claim 2, characterized in that clustering the feature data points based on the outlier degree and a preset clustering algorithm to obtain multiple clusters includes: fitting the feature data points based on the outlier degree and the amount of data of the feature data points to obtain an outlier degree curve; determining the neighborhood radius based on the feature data point corresponding to the maximum rate of change on the outlier degree curve; and clustering the feature data points based on the neighborhood radius and the preset clustering algorithm to obtain multiple clusters.

4. The MES-based production data management method as described in claim 1, characterized in that, when the data value meets preset conditions, selecting from the production data first storage data that needs to be fully stored and second storage data that needs to be partially stored includes: when the data value is greater than a preset data value threshold, determining the production data corresponding to the data value to be the first storage data that needs to be fully stored; otherwise, determining it to be the second storage data that needs to be partially stored.

5. The MES-based production data management method as described in claim 1, characterized in that calculating the storage value of each feature data point based on the data value and the distance between each feature data point and the center of the largest cluster includes: determining a first distance between each feature data point and the center of the largest cluster, and a second distance between the center of the cluster to which each feature data point belongs and the center of the largest cluster; calculating the second difference between the first distance and the second distance; and calculating the storage value of each feature data point based on the product of the data value and the second difference.

6. The MES-based production data management method as described in claim 1, characterized in that constructing a consistent hash ring based on the divided storage intervals and the storage servers includes: obtaining the remaining storage space and available read/write speed of the storage servers in each storage interval; calculating the percentage of remaining storage space in each storage server, and the first product of the percentage of remaining storage space and the available read/write speed; calculating the number of hash ring virtual nodes required for each storage server based on the product of the first product and the preset minimum number of virtual nodes; and evenly distributing the corresponding number of hash ring virtual nodes and connecting them end to end to construct the consistent hash ring.

7. The MES-based production data management method as described in claim 6, characterized in that storing the production data evenly across the storage servers based on the correspondence includes: numbering the production data to obtain multiple production data numbers; and, based on the correspondence and the production data numbers, comparing and mapping each piece of production data against the multiple virtual nodes corresponding to the storage servers, so as to store each piece of production data.

8. An MES-based production data management system, characterized in that the system includes a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor executes the computer program to implement the steps of the method as described in any one of claims 1 to 7.