Distributed data-driven process modeling optimization method and system
A technology of distributed data and optimization methods, applied in data processing applications, neural learning methods, computing models, etc., to achieve the effect of saving manpower
Pending Publication Date: 2022-06-24
EAST CHINA UNIV OF SCI & TECH
AI-Extracted Technical Summary
Problems solved by technology
[0006] The purpose of the present invention is to provide a distributed data-driven process modeling optimization method and system to solve the pro...
Abstract
The invention relates to the field of industrial process optimization, in particular to a distributed data-driven process modeling optimization method and system. The method comprises the following steps: step S1, determining the machine learning model to be used according to the task requirements and the data set size, and issuing a model configuration file to each corresponding node; step S2, performing data-driven modeling at each node to obtain a node model, and uploading the node model to a parameter server; step S3, aggregating the node models at the parameter server to obtain a global model, and issuing the global model to each corresponding node; step S4, selecting a carrier for decision optimization, and determining the number of optimization objectives and an optimization strategy; step S5, using the global model and the node model on the carrier, in combination with the optimization strategy, to carry out evolutionary optimization and search for a feasible solution; and step S6, evaluating the feasible solution at the node, and repeating until a termination condition is satisfied. Based on federated learning, the method realizes privacy-preserving modeling and decision optimization when data are stored in a distributed manner, and it is widely applicable.
Application Domain
Forecasting; Character and pattern recognition (+5 more)
Technology Topic
Federated learning; Privacy protection (+9 more)
Example Embodiment
[0082] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the invention, but not to limit the invention.
[0083] Federated machine learning is also known as federated learning, joint learning, or alliance learning. It is a machine learning framework that helps multiple institutions use data and build machine learning models while meeting the requirements of user privacy protection, data security, and government regulations. As a privacy-preserving distributed machine learning framework, federated learning protects user privacy by sharing update directions instead of private data, and has broad application prospects.
[0084] The invention proposes a distributed data-driven process modeling and optimization method. Its modeling and optimization framework is based on federated learning: global knowledge is extracted for distributed modeling without collecting data centrally, so the data remain stored in a decentralized manner and models are built across multiple devices without uploading private data. On this basis, a federated optimization strategy is further proposed, and the resulting model is used for distributed decision optimization.
[0085] Figure 2 is a schematic flowchart of a distributed data-driven process modeling optimization method according to an embodiment of the present invention, and Figure 3 is a schematic diagram of the corresponding modeling optimization flow. As shown in Figures 2 and 3, the distributed data-driven process modeling optimization method proposed by the present invention includes the following steps:
[0086] Step S1, according to the task requirements and the size of the data set, determine the machine learning model used, and deliver the model configuration file to the corresponding nodes, where the nodes are distributed full-process equipment or cross-regional sub-factories;
[0087] Step S2, the node preprocesses the local offline data set and the online data set, uses the local offline data and the online data as the training set, performs data-driven modeling to obtain the node model, and uploads the node model to the parameter server;
[0088] Step S3, the parameter server aggregates the node models to obtain a global model, and delivers the global model to corresponding nodes;
[0089] Step S4, selecting a parameter server or node as a carrier for decision-making optimization, and determining the number of optimization targets and an optimization strategy;
[0090] Step S5, using the global model and the node model on the carrier, performing evolutionary optimization in combination with the optimization strategy, and searching to obtain a feasible solution;
[0091] Step S6, the node evaluates the feasible solution, updates the online data set, and performs an incremental update based on the received global model; steps S2-S5 are repeated until the termination condition is satisfied.
[0092] In a distributed data-driven process modeling optimization method proposed by the present invention, first, each distributed node uses local offline data and newly evaluated online data to establish a data-driven initial local model and upload it to a parameter server;
[0093] After the parameter server receives the node model that meets the requirements, it aggregates the model through the direct average or sorted average algorithm to obtain the global model and sends it back to each node;
[0094] Finally, according to the task configuration, the server or node uses the global model and the local model to assist a single-objective or multi-objective evolutionary optimization algorithm, which searches for feasible solutions in the decision space and optimizes the decision variables, such as the raw material formula and operating temperature of each sub-factory, to provide guidance for the task;
[0095] After the search is completed, the node evaluates the feasible solutions and stores the results in the online dataset, and performs the next round of incremental updates on the basis of the global model.
[0096] The present invention provides a distributed data-driven process modeling optimization method that is based on federated learning, targets distributed whole-process modeling in the process manufacturing industry, and is combined with an intelligent evolutionary optimization algorithm for process decision optimization. The raw materials, operating temperature, and process operating conditions are optimized, and the optimal operating point is selected to guide actual industrial operation. The method protects data privacy, is suitable for IoT edge devices, and can be deployed on distributed full-process equipment or in cross-regional sub-factories.
[0097] These steps are described in detail below. It should be understood that, within the scope of the present invention, the above-mentioned technical features and the technical features specifically described below (e.g., in the embodiments) can be combined with and related to each other, thereby constituting a preferred technical solution.
[0098] Step S1: Determine the machine learning model to be used according to the task requirements and the size of the data set, and deliver the model configuration file to corresponding nodes, where the nodes are distributed full-process equipment or cross-regional sub-factories.
[0099] For simplicity, the following distributed full-process equipment or cross-regional sub-factories are called nodes.
[0100] The machine learning model can be a shallow machine learning network or a deep neural network.
[0101] Different industrial lines and production tasks have different time scales, and the number and models of edge devices at nodes are different, resulting in a great difference in the number of samples collected by each node in the same time. Choosing the right machine learning model is critical for data-driven modeling tasks.
[0102] For fine-grained, big data tasks, choose deep neural networks;
[0103] For coarse-grained, common tasks, choose a shallow machine learning network.
[0104] Shallow machine learning networks cover a variety of traditional machine learning algorithms, including back-propagation (BP) neural networks and radial basis function neural networks, and are suitable for tasks with a small amount of data.
[0105] Deep neural networks include convolutional neural networks and recurrent neural networks. Their model capacity is large, which makes them suitable for big data tasks.
[0106] Therefore, the method can be adapted both to edge devices such as Internet of Things (IoT) devices and to high-performance computing platforms.
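The selection rule in step S1 can be illustrated with a small sketch. The numeric threshold, the `ModelConfig` type, and the function name are hypothetical; the patent only states that large, fine-grained tasks call for deep networks and small, coarse-grained tasks for shallow ones.

```python
from dataclasses import dataclass

# Hypothetical cutoff between "small" and "big" data sets; the patent
# gives no numeric threshold, so this value is purely illustrative.
BIG_DATA_THRESHOLD = 10_000


@dataclass
class ModelConfig:
    family: str        # "shallow" or "deep"
    architecture: str  # e.g. "rbf", "cnn", "rnn"


def choose_model(n_samples: int, sequential: bool = False) -> ModelConfig:
    """Pick a model family from the size of the node data set."""
    if n_samples >= BIG_DATA_THRESHOLD:
        # Fine-grained, big-data task: deep neural network.
        return ModelConfig("deep", "rnn" if sequential else "cnn")
    # Coarse-grained, ordinary task: shallow network.
    return ModelConfig("shallow", "rbf")
```

The returned configuration stands in for the model configuration file that the parameter server delivers to each node.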
[0107] In step S2, the node preprocesses the local offline data set and the online data set, uses the local offline data and the online data as the training set, performs data-driven modeling to obtain the node model, and uploads the node model to the parameter server.
[0108] With the two types of data as the training set, the model is trained using gradient descent or other algorithms; this is what is meant by data-driven modeling.
[0109] Two types of data include offline data and online data.
[0110] In this embodiment, federated learning is adopted: no information or data is exchanged between nodes during modeling, and each model is trained by gradient backpropagation.
[0111] The main goals of node modeling are empirical risk minimization and structural risk minimization. The local modeling objective function F_k of node k is as follows:
[0112]
[0113] where n_k is the total number of samples at node k, L is the node loss function, (x_i, y_i) is a sample pair, γ is the structural risk loss function, and w_k is the local model of node k.
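The formula image for paragraph [0112] is not reproduced in this text. A form consistent with the symbol definitions in [0113], offered as a reconstruction rather than the patent's exact expression, is the average loss over the node's samples plus the structural risk term:

```latex
F_k(w_k) = \frac{1}{n_k} \sum_{i=1}^{n_k} L(x_i, y_i; w_k) + \gamma(w_k)
```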
[0114] The node loss function L includes, but is not limited to, squared error, etc.
[0115] After modeling is complete, the node uploads the local model to the parameter server.
[0116] For ease of presentation, the local model of a node is also referred to as a node model, and no distinction is made.
[0117] It can be seen that local modeling does not need to communicate with other nodes or upload local data, there is no risk of data leakage, and node privacy is protected.
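As a minimal sketch of step S2, the following trains a node model by gradient descent on the union of offline and online samples. It assumes a linear model, squared-error loss, and an L2 penalty as the structural risk term; these choices, the function names, and the hyperparameter values are illustrative assumptions, since the patent allows shallow or deep networks and other losses.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (input x_i, target y_i)


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Linear model; the last entry of w is the bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def local_objective(w: Sequence[float], data: List[Sample], reg: float) -> float:
    """Mean squared error (empirical risk) plus an L2 penalty (structural risk)."""
    mse = sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)
    return mse + reg * sum(wi * wi for wi in w)


def train_local(w0, offline, online, lr=0.05, reg=1e-3, epochs=500) -> List[float]:
    """Gradient descent on the node's own data; nothing leaves the node."""
    data = list(offline) + list(online)
    n = len(data)
    w = list(w0)
    for _ in range(epochs):
        grad = [2.0 * reg * wi for wi in w]      # gradient of the L2 penalty
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n    # gradient of the MSE, weights
            grad[-1] += 2.0 * err / n            # gradient of the MSE, bias
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w
```

Only the returned weight vector would be uploaded to the parameter server; the samples themselves stay on the node.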
[0118] Step S3, the parameter server aggregates the node models to obtain a global model, and delivers the global model to corresponding nodes.
[0119] The parameter server uses the direct average or sorted average algorithm to perform model aggregation on the node models, and then sends the global model to each node.
[0120] Model aggregation is a key step of the method. It fuses the node models to extract global knowledge, with the goal of minimizing the global objective function F(w), whose expression is:
[0121]
[0122] where K is the total number of nodes and p_k is the aggregation weight of node k.
[0123] The aggregation weight p_k of node k is generally determined by the number of samples and satisfies the following conditions:
[0124]
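The formula images for paragraphs [0121] and [0124] are not reproduced in this text. A reading consistent with the surrounding definitions, given here as a reconstruction and not as the patent's exact expressions, is a weighted sum of the node objectives with non-negative weights that sum to one:

```latex
\min_{w} F(w) = \sum_{k=1}^{K} p_k F_k(w),
\qquad p_k \ge 0, \qquad \sum_{k=1}^{K} p_k = 1
```

If the weights follow the sample counts, as [0123] suggests, one common choice (an assumption here) is p_k = n_k / (n_1 + ... + n_K).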
[0125] In order to minimize the global objective function, this method adopts two strategies for model aggregation. The model aggregation method can be either direct aggregation or sorted aggregation.
[0126] The direct aggregation method is to use the direct average algorithm to aggregate the node model, and the corresponding expression is:
[0127]
[0128] where p_k is the aggregation weight of node k, configured by the server; w is the global model obtained in step S3; and w_k is the local model of node k.
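Direct aggregation can be sketched as a weighted average of the uploaded parameter vectors, w = p_1 w_1 + ... + p_K w_K. The function names are hypothetical, models are assumed to be flat lists of floats of equal length, and deriving the weights from sample counts is one possible configuration rather than the only one.

```python
from typing import List, Sequence


def sample_count_weights(sample_counts: Sequence[int]) -> List[float]:
    """Weights proportional to each node's sample count (an assumed choice)."""
    total = sum(sample_counts)
    return [n / total for n in sample_counts]


def direct_average(models: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Weighted element-wise average of the node models."""
    if len(models) != len(weights):
        raise ValueError("one weight is required per node model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("aggregation weights must sum to 1")
    dim = len(models[0])
    return [sum(p * m[j] for p, m in zip(weights, models)) for j in range(dim)]
```

For example, two nodes holding 100 and 300 samples receive weights 0.25 and 0.75, so the node with more data pulls the global model further toward its own parameters.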
[0129] Correspondingly, the sorted aggregation method uses the sorted average algorithm to aggregate the node models, and the corresponding expression is:
[0130]
[0131] where p_k is the aggregation weight of node k, configured by the server; the model being averaged is the sorted model of node k; K is the total number of nodes; and w is the global model obtained in step S3.
[0132] The sorted model of node k is calculated as follows:
[0133] First, the sorting index of the model is calculated from the core structure c_k of the local model w_k, that is:
[0134]
[0135] where d is the dimension of the core structure;
[0136] Then the sorted model is obtained by applying the index, and the corresponding expression is:
[0137]
[0138] where w_k is the local model of node k.
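The sorting formulas in [0134] and [0137] are not reproduced in this text, so the following is only one plausible reading. It assumes the core structure c_k is a set of hidden-unit centers (as in a radial basis function network), each of dimension d, and that the sorting key is the mean of a center's d coordinates. Both assumptions, and all names, are hypothetical.

```python
from typing import List, Sequence, Tuple

Unit = Tuple[List[float], float]  # (center of dimension d, output weight)
Model = List[Unit]


def sort_index(model: Model) -> List[int]:
    """ASSUMED key: mean of each center over its d coordinates."""
    keys = [sum(center) / len(center) for center, _ in model]
    return sorted(range(len(model)), key=keys.__getitem__)


def sort_model(model: Model) -> Model:
    """Reorder the hidden units of a local model by the sorting index."""
    return [model[i] for i in sort_index(model)]


def sorted_average(models: Sequence[Model], weights: Sequence[float]) -> Model:
    """Sort every node model first, then take the weighted unit-wise average."""
    ordered = [sort_model(m) for m in models]
    n_units = len(ordered[0])
    d = len(ordered[0][0][0])
    result: Model = []
    for u in range(n_units):
        center = [sum(p * m[u][0][j] for p, m in zip(weights, ordered))
                  for j in range(d)]
        out_w = sum(p * m[u][1] for p, m in zip(weights, ordered))
        result.append((center, out_w))
    return result
```

The motivation for sorting is that two nodes can learn the same hidden units in a different order. Averaging position by position would then blend unrelated units, whereas sorting first aligns them.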
[0139] Step S4, selecting a parameter server or node as a carrier for decision-making optimization, and determining the number of optimization targets and an optimization strategy.
[0140] In this embodiment, the decision optimization carrier includes, but is not limited to, a parameter server or a node.
[0141] After the process modeling is completed in step S3, in this embodiment, the model is used for decision optimization.
[0142] Different from the centralized optimization algorithm, in this embodiment, the privacy issue is considered during optimization, and the optimization carrier is determined according to the task and privacy requirements:
[0143] If the decision is a public decision, for example, each node handles the same task, the optimization carrier is the parameter server at this time, and the global model and the local models of all uploaded nodes are used for fitness evaluation;
[0144] If the decision is a node private task, the optimization carrier is the node, and the node uses the downloaded global model and the node local model for fitness evaluation.
[0145] In this embodiment, the number of optimization objectives includes but is not limited to single objectives and multiple objectives;
[0146] After the optimization carrier is determined, the optimization objectives and their number are determined according to the task. Since the present invention adopts a machine learning model, which can have a single output or multiple outputs, it can handle both single-objective and multi-objective problems.
[0147] In this embodiment, the optimization strategy is an intelligent optimization algorithm, including but not limited to a differential evolution algorithm and a particle swarm algorithm.
[0148] The present invention adapts well to different optimization strategies, because the data-driven global model and local model can be embedded in a wide range of optimization algorithms. An appropriate strategy, such as differential evolution or particle swarm optimization, can therefore be selected according to the task.
[0149] Step S5, using the global model and the node model on the carrier, performing evolutionary optimization in combination with the optimization strategy, and searching to obtain a feasible solution.
[0150] After the optimization carrier, optimization objectives, and optimization strategy have been selected, the local model and the global model are used for optimization.
[0151] Evolutionary optimization is performed on the carrier, with the global model and the node model assisting the optimization strategy in searching for feasible solutions.
[0152] The key of step S5 is to determine where to sample in the decision space, therefore, a suitable sampling function is required.
[0153] To solve this problem, the present invention proposes a Federated Lower Confidence Bound (FLCB) sampling strategy.
[0154] For any feasible solution x_p, the global model is first used to perform a global fitness evaluation. The expression of the global fitness evaluation value is:
[0155]
[0156] where w is the global model and the subscript fed denotes a prediction made by the global model.
[0157] Then, if the parameter server is used as the optimization carrier, all the collected node models (local models) will be used for local fitness evaluation, and the expression of the local fitness evaluation value is:
[0158]
[0159] where w_k is the local model of node k and p_k is the aggregation weight of node k.
[0160] If the node is used as the optimization carrier, only the local model of the node is used for local fitness evaluation, and the expression of the local fitness evaluation value is:
[0161]
[0162] where w_k is the local model of node k.
[0163] Finally, the average of the global fitness evaluation value and the local fitness evaluation value is taken as the estimated fitness value of the feasible solution. The corresponding expression is:
[0164]
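The formula images for paragraphs [0155], [0158], [0161], and [0164] are not reproduced in this text. Writing f(x; w) for the prediction of a model with parameters w (a notation assumed here), the description in [0154] to [0163] is consistent with the following reconstruction:

```latex
\begin{aligned}
\hat{f}_{fed}(x_p)   &= f(x_p; w) \\
\hat{f}_{local}(x_p) &= \sum_{k=1}^{K} p_k \, f(x_p; w_k)
  && \text{(parameter server as carrier)} \\
\hat{f}_{local}(x_p) &= f(x_p; w_k)
  && \text{(node $k$ as carrier)} \\
\hat{f}(x_p)         &= \tfrac{1}{2}\left(\hat{f}_{fed}(x_p) + \hat{f}_{local}(x_p)\right)
\end{aligned}
```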
[0165] In addition, in this embodiment, the uncertainty of the feasible solution x_p needs to be calculated to enhance the algorithm's exploration of unknown regions of the decision space.
[0166] The uncertainty of the feasible solution x_p is obtained as follows:
[0167] If the parameter server is the optimization carrier, the corresponding expression of the uncertainty of the feasible solution is:
[0168]
[0169] If the node is the optimization carrier, the corresponding expression of the uncertainty of the feasible solution is:
[0170]
[0171] where w_k is the local model of node k, K is the total number of nodes, and the remaining two terms are the estimated fitness value of the feasible solution x_p and the global fitness evaluation value.
[0172] The federated lower confidence bound of the feasible solution x_p is then expressed as:
[0173]
[0174] where the expression combines the estimated fitness value of the feasible solution x_p,
[0175] the uncertainty of the feasible solution x_p,
[0176] and μ, a hyperparameter that is generally set to 2.
[0177] In this embodiment, a solution with better properties can be sampled in the decision space by minimizing the lower confidence boundary. The lower confidence boundary is a sampling strategy that balances the predicted target value and uncertainty of feasible solutions.
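The FLCB sampling function can be sketched as follows. Two points are assumptions, because the corresponding formula images are missing from this text: the bound is taken in the usual lower-confidence-bound form (estimate minus μ times uncertainty), and the uncertainty is taken as the root mean squared deviation of all available predictions, local and global, from the estimated fitness. Models are represented as plain callables, and the function name is hypothetical.

```python
import math
from typing import Callable, Sequence

Predictor = Callable[[Sequence[float]], float]


def flcb(x: Sequence[float],
         global_model: Predictor,
         local_models: Sequence[Predictor],
         weights: Sequence[float],
         mu: float = 2.0) -> float:
    """Federated lower confidence bound of a candidate solution x."""
    f_fed = global_model(x)                        # global fitness evaluation
    local_preds = [m(x) for m in local_models]
    f_local = sum(p * f for p, f in zip(weights, local_preds))
    f_hat = 0.5 * (f_fed + f_local)                # estimated fitness
    # ASSUMPTION: spread of all predictions around the estimate.
    deviations = [(f - f_hat) ** 2 for f in local_preds]
    deviations.append((f_fed - f_hat) ** 2)
    s_hat = math.sqrt(sum(deviations) / len(deviations))
    return f_hat - mu * s_hat
```

With the parameter server as carrier, `local_models` holds every uploaded node model and `weights` the aggregation weights. With a node as carrier, it holds only that node's own model with weight 1.0. Candidates on which the models disagree receive a lower bound and are therefore more likely to be sampled under minimization.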
[0178] Step S6, the node evaluates the feasible solution, updates the online data set, and performs an incremental update based on the received global model; steps S2-S5 are repeated until the termination condition is satisfied.
[0179] After obtaining the global model and the feasible solution, the node evaluates the feasible solution, that is, the decision. After the evaluation is completed, the sample pair composed of the decision and its target value is stored in the online data set, and the updated global model is used to incrementally update the local model.
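A minimal sketch of step S6 at a single node follows. The `evaluate` callable stands for the real (typically expensive) evaluation of a decision, and `train` for whatever local training routine the node uses. Both, along with the function name, are hypothetical placeholders.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[List[float], float]


def step_s6(solution: Sequence[float],
            evaluate: Callable[[Sequence[float]], float],
            online_data: List[Sample],
            global_model: Sequence[float],
            train: Callable[[List[float], List[Sample]], List[float]]) -> List[float]:
    """Evaluate the decision, store the sample pair, and update the local model."""
    target = evaluate(solution)                    # real evaluation at the node
    online_data.append((list(solution), target))   # update the online data set
    # Incremental update: training starts from the received global model
    # rather than from a fresh initialization.
    return train(list(global_model), online_data)
```

Starting from the global model is what makes the update incremental: the node refines shared knowledge with its newest samples instead of retraining from scratch.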
[0180] Although the above-described methods are illustrated and described as a series of acts for simplicity of explanation, it should be understood and appreciated that these methods are not limited by the order of the acts. In accordance with one or more embodiments, some acts may occur in a different order and/or concurrently with other acts that are shown and described herein, or that are not shown and described herein but would be understood by those skilled in the art.
[0181] Figure 4 is a schematic diagram of a simple example according to an embodiment of the present invention. As shown in Figure 4, the example applies the proposed distributed data-driven process modeling optimization method to distributed process modeling and decision optimization. Its calculation steps are as follows:
[0182] In this example, the system is required to perform process modeling and decision optimization.
[0183] According to the foregoing embodiment, the detailed calculation steps of the present invention in the situation shown in Figure 4 are:
[0184] Depending on the specific task, the parameter server determines the model configuration and the participating nodes. In the situation shown in Figure 4, the participating nodes are nodes 401 to 404, and the selected model is an artificial neural network. After the configuration is completed, the participating nodes initialize their local models w_1 to w_4;
[0185] Each node trains on its local data to update its local model, and then communicates with the parameter server to upload the local model;
[0186] The parameter server aggregates the received local models, using the direct aggregation or sorted aggregation method, to obtain the updated global model, and distributes the global model to each node;
[0187] The optimization carrier is selected according to the task; it can be the parameter server or a node, such as node 404 in this example;
[0188] An optimization strategy is selected, in this case particle swarm optimization (PSO). The local model w_4 of node 404 and the global model w assist the optimization strategy in decision optimization. The optimal solution found is evaluated at the node and added to the online data set as online data, and the global model is used as the starting point for further optimizing the local model.
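The PSO search used in this example can be sketched as below. This is a textbook global-best particle swarm, not necessarily the variant used in the patent. The sampling function is passed in as an opaque callable `acquisition`, which in the method would be the federated lower confidence bound built from the models on the carrier. The function name, swarm size, coefficients, and seed are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(acquisition: Callable[[Sequence[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 n_particles: int = 20, n_iters: int = 100,
                 inertia: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                 seed: int = 0) -> Tuple[List[float], float]:
    """Search the decision space for the point that minimizes `acquisition`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_val = [acquisition(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (inertia * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                # Clamp to the feasible box of the decision variable.
                pos[i][j] = min(hi, max(lo, pos[i][j] + vel[i][j]))
            val = acquisition(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

A call such as `pso_minimize(lambda x: sum(t * t for t in x), [(-5.0, 5.0)] * 2)` searches a two-variable box using a simple quadratic as a stand-in for the sampling function. In the Figure 4 scenario the returned solution would then be evaluated at node 404 and appended to its online data set.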
[0189] The model and optimization method established based on the above steps can be applied to the distributed process modeling of the process industry and the optimization of operating conditions to improve the operation level.
[0190] A distributed data-driven process modeling optimization system proposed by the present invention includes:
[0191] at least one node and at least one parameter server;
[0192] The node includes at least a first memory and a first processor, wherein the first memory is used to store instructions executable by the first processor, and the first processor is used to execute the instructions to implement any of the above methods;
[0193] The parameter server includes at least a second memory and a second processor, wherein the second memory is used to store instructions executable by the second processor, and the second processor is used to execute the instructions to implement any of the above methods.
[0194] Figure 5 is a schematic block diagram of the node/parameter server of the distributed data-driven process modeling and optimization system according to an embodiment of the present invention. Taking the parameter server as an example, the parameter server shown in Figure 5 may include an internal communication bus 501, a processor 502, a read-only memory (ROM) 503, a random access memory (RAM) 504, a communication port 505, and a hard disk 507. The internal communication bus 501 enables data communication between the components of the parameter server. The processor 502 can make judgments and issue prompts. In some embodiments, the processor 502 may consist of one or more processors.
[0195] The communication port 505 enables data transmission and communication between the parameter server of the distributed data-driven process modeling and optimization system and external input/output devices. In some embodiments, the parameter server can send and receive information and data over the network through the communication port 505. In some embodiments, the parameter server can perform data transmission and communication with external input/output devices in wired form through the input/output terminal 506.
[0196] The distributed data-driven process modeling optimization system parameter server may also include different forms of program storage units and data storage units, such as hard disk 507, read only memory (ROM) 503 and random access memory (RAM) 504, capable of storing computer processing and/or various data files used for communication, and possibly program instructions executed by processor 502. The processor 502 executes these instructions to implement the main parts of the method. The result processed by the processor 502 is transmitted to the external output device through the communication port 505, and displayed on the user interface of the output device.
[0197] For example, the implementation process file of the parameter server of the above-mentioned distributed data-driven process modeling and optimization system may be a computer program that is stored in the hard disk 507 and loaded into the processor 502 for execution to implement the method of the present application.
[0198] When the implementation process file of the distributed data-driven process modeling optimization method is a computer program, it can also be stored in a computer-readable storage medium as an article of manufacture. For example, computer-readable storage media may include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic stripes), optical disks (e.g., compact disks (CDs), digital versatile disks (DVDs)), smart cards, and flash memory devices (e.g., electrically erasable programmable read-only memory (EEPROM), cards, sticks, key drives). Additionally, the various storage media described herein can represent one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" may include, but is not limited to, wireless channels and various other media (and/or storage media) capable of storing, containing, and/or carrying code and/or instructions and/or data.
[0199] A distributed data-driven process modeling optimization method and system provided by the present invention have the following beneficial effects:
[0200] 1) Clear structure: modeling and optimization complement each other, extract knowledge through federated learning and build a robust global model, providing a good model foundation for further optimization;
[0201] 2) Privacy security: Distributed modeling does not need to upload node data, which can ensure node data security and protect user privacy. At the same time, the model has a large selection space, and a shallow model or a deep model can be selected according to the size of the data set;
[0202] 3) Complete theory: The model aggregation method has a complete theoretical basis, which explains the aggregation operation in terms of the modeling mechanism and prevents the model from collapsing. In the decision-making stage, the global model containing the overall information is used as the carrier and combined with ensemble learning to achieve decision optimization;
[0203] 4) High flexibility: optimization can use any evolutionary optimization algorithm and operator, which can be replaced according to task requirements. At the same time, the decision-making stage can be deployed on the server or on the node.
[0204] As used in this application and in the claims, unless the context clearly dictates otherwise, the words "a", "an" and/or "the" do not refer specifically to the singular and may include the plural. Generally speaking, the terms "comprising" and "including" only imply that the clearly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and the method or apparatus may also include other steps or elements.
[0205] Those of skill in the art would understand that information, signals, and data may be represented using any of a variety of different technologies and techniques. For example, the data, instructions, commands, information, signals, bits, symbols, and chips referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, light fields or optical particles, or any combination thereof.
[0206] Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the specific application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
[0207] The various illustrative logic modules and circuits described in connection with the embodiments disclosed herein may be implemented or performed with general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors cooperating with a DSP core, or any other such configuration.
[0208] The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integrated into the processor. The processor and storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and storage medium may reside in a user terminal as discrete components.
[0209] In one or more exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software as a computer program product, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, including any medium that facilitates transfer of a computer program from one place to another. A storage medium can be any available medium that can be accessed by a computer. By way of example and not limitation, such computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures and that can be accessed by a computer. Any connection is also properly termed a computer-readable medium. For example, if the software is transmitted from a web site, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0210] The above-mentioned embodiments are provided for those skilled in the art to realize or use the present invention, and those skilled in the art can make various modifications or changes to the above-mentioned embodiments without departing from the inventive concept of the present invention. The protection scope of the present invention is not limited by the above-mentioned embodiments, but should be the maximum scope conforming to the innovative features mentioned in the claims.