An adhesive tape production line scheduling system based on intelligent optimization
The intelligent optimized tape production line scheduling system utilizes deep reinforcement learning and multi-objective task scheduling to adjust production plans in real time, solving the problem of low efficiency of traditional scheduling systems in dynamic environments. It optimizes equipment specification matching and capacity utilization, thereby improving production stability and efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI YONGGUAN ADHESIVE PROD CORP LTD
- Filing Date
- 2025-11-12
- Publication Date
- 2026-06-26
AI Technical Summary
Existing production scheduling systems for tape production lines struggle to achieve efficient and flexible production scheduling in the face of dynamic and ever-changing environments. Traditional scheduling theories differ significantly from actual needs, and reliance on manual adjustments leads to low efficiency and unreliable stability.
The production scheduling system for the tape production line adopts an intelligent optimization system, which includes a perception layer, a cloud-edge collaboration layer, a scheduling layer, an execution layer, and an interaction layer. By monitoring data through sensing devices, a constraint library is built. Using deep reinforcement learning and a multi-objective task scheduling system, the production plan is adjusted in real time to dynamically respond to changes in equipment and materials.
This reduces the equipment specification mismatch rate and the rework rate, reduces schedule delays caused by misjudgment of production capacity, improves production stability and efficiency, and helps ensure on-time order delivery.
Figure CN121094497B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of production scheduling system technology, specifically a production scheduling system for tape production lines based on intelligent optimization.
Background Technology
[0002] The tape production line scheduling system is a comprehensive system integrating automated control, data acquisition, intelligent algorithms, and production management logic. Its core objective is to achieve optimal allocation of production resources, improve production line efficiency, reduce costs, and ensure delivery while meeting constraints such as order demand, equipment capacity, and material supply. Manufacturing systems are inherently dynamic, and tasks and resources can change due to various factors such as equipment failure, maintenance, demand fluctuations, and constantly changing market conditions. Furthermore, the rapid development of technological innovation and the increasing integration of equipment in the manufacturing process have led to the complexity and unpredictability of the production environment.
[0003] When customer demands are diverse and personalized, the production line must cope with multi-variety, small-batch production. In this mode, the volume of production data is enormous, the data structure is complex, and the data is highly random, making control difficult. These factors require the production system to be highly flexible and responsive in real time, which widens the gap between traditional scheduling theory and reality and makes that theory difficult to apply to actual production lines. Furthermore, the advanced planning and scheduling systems in wide use today handle only static scheduling problems. Faced with a complex, constantly changing environment, the usual approach is to break the dynamic problem into multiple static problems and reschedule each, which results in slow response and low efficiency in practice. Human experience is often needed for adaptive adjustments, but manual adjustment depends on the experience and ability of the schedulers, and as the scheduling problem grows more complex, the quality and stability of the result are difficult to guarantee. In addition, manual adjustment consumes considerable time and manpower, offers little intelligence, and yields low production efficiency.
Summary of the Invention
[0004] To overcome the shortcomings of existing technologies and solve the aforementioned technical problems, this invention proposes a production scheduling system for tape production lines based on intelligent optimization.
[0005] The technical solution adopted by this invention to solve its technical problem is as follows: This invention proposes a production scheduling system for tape production lines based on intelligent optimization, the scheduling system comprising:
[0006] Perception layer: Sensing devices monitor the operating data of each piece of equipment; the data is organized into equipment data packages and uploaded to the control center. The equipment data packages include production status, operating status, process parameters, and material inventory.
[0007] Cloud-edge collaboration layer: A constraint library is built based on the equipment data packages. The constraint library includes an equipment constraint sub-library, a material constraint sub-library, and an order constraint sub-library.
[0008] Scheduling Layer: Construct a workshop scheduling system based on deep reinforcement learning, including equipment units, resource units, scheduling units, and execution units. Equipment units include detection devices for production equipment, resource units include past order scheduling records and scheduling logic, scheduling units include the initial scheduling scheme transmitted by the scheduling system, and execution units include transmission and display devices to notify and control the coordination between production equipment.
[0009] Execution layer: A multi-objective task scheduling system is embedded to evenly distribute order tasks into the workshop's scheduling system;
[0010] Interaction Layer: A monitoring and early warning module is constructed. The detection data from the detection device in the scheduling layer is used as the feature extraction value, and the tape product parameters are used as the preset threshold. When the deviation value exceeds the preset threshold, a dynamic adjustment mechanism is triggered to update the scheduling scheme.
[0011] Preferably, the constraint library includes: an equipment constraint sub-library containing equipment specification adaptation matrices, capacity decay function curves, and downtime maintenance time window cycles, constructed based on historical equipment operation data; a material constraint sub-library associated with data from material storage disks in the equipment, updating material inventory balances, storage locations, and expiration dates in real time, and setting multi-level inventory warning thresholds; and an order constraint sub-library using the analytic hierarchy process to prioritize orders, with priority factors including delivery urgency, order amount, and customer level.
[0012] Preferably, the workshop scheduling system includes a workshop scheduling environment, an offline training module, and an online application module:
[0013] Workshop scheduling environment: The workshop scheduling problem is modeled as a Markov decision process. The scheduling task is then broken down into multiple nodes using a disjunctive graph model, and a Gantt chart model is used to store the processing information matrix.
[0014] Offline training module: By enabling the agent to continuously interact with the environment, the generated reinforcement learning quadruple data is stored in the storage component. For the production line scenario, a deep reinforcement learning algorithm is selected from the algorithm pool to train the agent. The loss function is continuously calculated using the data of the agent's interaction, and the network weights of the agent are updated using the gradient descent algorithm until the network converges. Finally, the trained network model and weights are saved.
[0015] Online application module: The trained network model is loaded into the agent, and then the actual state of the workshop environment is input into the agent. The agent outputs the corresponding scheduling action through network decision, and then the workshop executes the scheduling action and updates to the next state. This cycle continues until all scheduling tasks are completed.
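The online loop described above (observe the workshop state, let the agent choose a scheduling action, execute it, move to the next state, repeat until no tasks remain) can be sketched as follows. This is only an illustration under stated assumptions: `ToyShop` is a hypothetical stand-in for the workshop environment, and `policy` is a simple shortest-operation-first rule used in place of the trained network, which the patent does not specify in detail.

```python
from dataclasses import dataclass, field

@dataclass
class ToyShop:
    """Hypothetical workshop environment (illustrative, not from the patent)."""
    jobs: list  # jobs[j] is an ordered list of (machine, processing_time)
    next_op: list = field(init=False)
    job_ready: list = field(init=False)
    machine_ready: dict = field(init=False)

    def __post_init__(self):
        self.next_op = [0] * len(self.jobs)      # index of each job's next operation
        self.job_ready = [0] * len(self.jobs)    # time at which each job is free
        self.machine_ready = {}                  # time at which each machine is free

    def done(self):
        return all(n == len(ops) for n, ops in zip(self.next_op, self.jobs))

    def legal_actions(self):
        return [j for j, ops in enumerate(self.jobs) if self.next_op[j] < len(ops)]

    def step(self, job):
        """Execute the next operation of `job` and advance the state."""
        machine, duration = self.jobs[job][self.next_op[job]]
        start = max(self.job_ready[job], self.machine_ready.get(machine, 0))
        end = start + duration
        self.job_ready[job] = end
        self.machine_ready[machine] = end
        self.next_op[job] += 1
        return (job, machine, start, end)

def policy(env):
    # Placeholder for the trained agent: pick the job whose next operation is shortest.
    return min(env.legal_actions(), key=lambda j: env.jobs[j][env.next_op[j]][1])

def run_online(env):
    """State -> action -> next state, until all scheduling tasks are completed."""
    schedule = []
    while not env.done():
        schedule.append(env.step(policy(env)))
    return schedule, max(env.job_ready)

shop = ToyShop(jobs=[[("coat", 3), ("slit", 2)], [("coat", 2), ("slit", 4)]])
schedule, makespan = run_online(shop)
```

In a real deployment the `policy` function would be replaced by a forward pass of the saved network, and `step` by the actual execution feedback from the workshop.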
[0016] Preferably, the disjunctive graph is complemented by Gantt charts, which reflect the processing time of each process and the specific time scale of the entire scheduling scheme. With the disjunctive graph and the Gantt chart working together, the scheduling problem is modeled as a discrete sequential decision-making process, and a state transition function, an action space, a reward function, and an agent network structure are designed for that process.
[0017] Preferably:
[0018] The state transition function contains three types of state information: processing time, processing flag, and cumulative processing time;
[0019] The action space consists of two parts, process sequence and machine selection, which together encode the available actions;
[0020] The reward function gives the immediate reward obtained by performing an action in the current state;
[0021] The intelligent agent network structure includes a feature extraction network and a decision network.
[0022] Preferably, the offline training module uses the proximal policy optimization (PPO) algorithm to train the agent. The algorithm architecture uses two identical agent networks, one of which collects samples while the other is updated repeatedly, so that policy iteration gradually converges toward the optimal policy.
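For illustration, the two-network arrangement corresponds to the "old" (sampling) policy and the "new" (updated) policy in the standard clipped surrogate objective of proximal policy optimization. The patent does not state which loss variant is used, so the sketch below assumes the common clipped form; the function name and the clip value 0.2 are assumptions, not values from the source.

```python
import math

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate loss (to be minimized) for one batch of samples.

    new_logp:   log-probabilities of the taken actions under the network being updated
    old_logp:   log-probabilities under the sampling network
    advantages: advantage estimates for the same actions
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)                      # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)               # pessimistic bound
    return -total / len(advantages)

# When both networks agree, the ratio is 1 and the loss is minus the mean advantage.
loss = ppo_clip_loss([0.0, 0.0], [0.0, 0.0], [1.0, -1.0])
```

Clipping keeps each update close to the sampling policy, which is why the same batch of samples can be reused for several gradient steps before the sampling network is refreshed.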
[0023] Preferably, the multi-objective task scheduling system includes a data collection and preprocessing module, a multi-objective optimization module, a cloud-edge resource allocation module, and a task scheduling algorithm.
[0024] A production scheduling method for a tape production line based on intelligent optimization, the specific steps of which are as follows:
[0025] Step 1: Collect data from the entire tape production process using industrial IoT devices to complete multi-dimensional data collection and preprocessing, including equipment data, process data, material data, and order data;
[0026] Step 2: Build a dynamic constraint library based on the preprocessed data, including equipment constraint sub-libraries, material constraint sub-libraries, and order constraint sub-libraries;
[0027] Step 3: Define the core optimization objective of scheduling, construct a mathematical model with the core objectives of minimizing total completion time, reducing energy consumption, and reducing production costs. Determine the weight of each objective using the entropy weight method, and integrate the equipment, material, and order constraints from Step 2 into mathematical expressions. This transforms the scheduling problem into a quantifiable mathematical optimization problem, balancing the conflict between efficiency, cost, and service quality.
[0028] Step 4: Generate an initial scheduling scheme through the algorithm, select the comprehensive optimal scheme from the optimized solutions, and generate information including equipment, processes, and time;
[0029] Step 5: Implement the scheduling plan through the industrial control system and track the execution status in real time;
[0030] Step 6: Extract features such as equipment load, material inventory, and order progress through heterogeneous graph neural networks, calculate the deviation between actual and planned values, provide real-time feedback data to identify deviations, and trigger a dynamic adjustment mechanism.
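Step 3 above assigns objective weights with the entropy weight method. A minimal sketch of that method follows, assuming each row of the input is one candidate schedule and each column is one objective value (for example total completion time, energy consumption, and production cost). The sample numbers are invented for illustration, and all values are assumed to be positive.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method: objectives whose values vary more get larger weights.

    matrix: rows are candidate schedules, columns are positive objective values.
    Note: if every column is constant, all divergences are ~0 and the
    normalization below is undefined; that case is not handled in this sketch.
    """
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total                 # share of this schedule in column j
            if p > 0:
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)  # degree of divergence
    s = sum(divergences)
    return [d / s for d in divergences]

# Hypothetical objective values for three candidate schedules.
weights = entropy_weights([[10, 5, 1], [10, 6, 9], [10, 7, 5]])
```

In this example the first objective is identical across candidates, so it carries almost no information and receives a weight close to zero; the third objective varies most and receives the largest weight.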
[0031] The beneficial effects of this invention are as follows:
[0032] 1. The intelligent optimization-based production scheduling system for tape production lines described in this invention transforms the static specifications, dynamic performance, and planning constraints of equipment into mathematical constraints that can be recognized by the scheduling algorithm. By clarifying the processing boundaries of equipment through an adaptation matrix, it avoids allocating large equipment to small-specification orders or low-precision equipment to high-precision requirements, thereby reducing the equipment specification mismatch rate and the rework rate caused by insufficient equipment capacity. Furthermore, by dynamically correcting capacity expectations through a decay curve, it avoids schedule delays caused by scheduling based on theoretical values. Capacity decay correction reduces scheduling deviations and improves the on-time delivery rate of orders.
[0033] 2. The intelligent optimization-based tape production line scheduling system described in this invention addresses the balance between efficiency and cost through a multi-objective task allocation system at the execution layer, avoiding resource waste caused by prioritizing the fastest completion of a single objective. The interaction layer, through feature extraction (such as comparing the deviation of the actual coating thickness from the standard value with a threshold), triggers a recalculation by the scheduling unit when the deviation exceeds the limit. If the deviation is due to material shortage, the material constraint sub-library is linked to update inventory warnings, and the task order at the execution layer is adjusted. If the deviation is due to equipment aging, the capacity decay curve from the equipment constraint sub-library is called to correct subsequent scheduling. The monitoring and early warning module at the interaction layer, through real-time deviation analysis (such as when process parameters exceed tolerances), triggers dynamic adjustments to compensate for the rigidity of traditional one-time scheduling, ensuring production stability.
Attached Figure Description
[0034] The invention will now be further described with reference to the accompanying drawings.
[0035] Figure 1 is a system flowchart of the present invention.
Detailed Implementation
[0036] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0037] Example 1:
[0038] To effectively solve the above problems, as shown in Figure 1 of the accompanying drawings, a production scheduling system for a tape production line based on intelligent optimization is disclosed. The scheduling system includes:
[0039] Perception layer: Sensing devices monitor the operating data of each piece of equipment; the data is organized into equipment data packages and uploaded to the control center. The equipment data packages include production status, operating status, process parameters, and material inventory.
[0040] Cloud-edge collaboration layer: A constraint library is built based on the equipment data packages. The constraint library includes an equipment constraint sub-library, a material constraint sub-library, and an order constraint sub-library.
[0041] Scheduling Layer: Construct a workshop scheduling system based on deep reinforcement learning, including equipment units, resource units, scheduling units, and execution units. Equipment units include detection devices for production equipment, resource units include past order scheduling records and scheduling logic, scheduling units include the initial scheduling scheme transmitted by the scheduling system, and execution units include transmission and display devices to notify and control the coordination between production equipment.
[0042] Execution layer: A multi-objective task scheduling system is embedded to evenly distribute order tasks into the workshop's scheduling system;
[0043] Interaction layer: A monitoring and early warning module is constructed. The detection data from the detection device in the scheduling layer is used as the feature extraction value, and the tape product parameters are used as the preset threshold. When the deviation value exceeds the preset threshold, a dynamic adjustment mechanism is triggered to update the scheduling scheme.
[0044] The constraint library includes: an equipment constraint sub-library containing equipment specification adaptation matrices, capacity decay function curves, and downtime maintenance time windows, built based on historical equipment operation data; a material constraint sub-library associated with data from material storage disks in the equipment, updating material inventory levels, storage locations, and expiration dates in real time, and setting multi-level inventory warning thresholds; and an order constraint sub-library using the analytic hierarchy process to prioritize orders, with priority factors including delivery urgency, order amount, and customer level.
[0045] Specifically: The equipment, material, and order data collected by the perception layer are processed by the cloud-edge collaboration layer and transformed into dynamic parameters of the constraint library, such as real-time equipment capacity and material inventory thresholds. Based on these constraints, the deep reinforcement learning model of the scheduling layer iteratively generates optimized solutions through state perception, action decision-making, and reward feedback. The execution layer breaks the solutions down into equipment instructions, and the interaction layer monitors deviations in real time, forming a closed loop of collection, modeling, decision-making, execution, and feedback. By collecting multi-dimensional data such as production status, operating parameters, and material inventory, the perception layer solves the data-silo problem of traditional scheduling. Tape production is highly continuous (coating, slitting, and rewinding all run as continuous operations), and process parameters such as glue viscosity and coating thickness strongly affect quality, so high-frequency collection is required to keep the data timely. The data is integrated into standardized data packages that provide a unified input for subsequent constraint modeling and intelligent decision-making, avoiding scheduling deviations caused by inconsistent data formats.
[0046] Furthermore, the multi-objective task allocation system at the execution layer addresses the balance between efficiency and cost, avoiding resource waste caused by prioritizing the fastest completion of a single objective. The interaction layer, through feature extraction (such as comparing the deviation of the actual coating thickness from the standard value with a threshold), triggers a recalculation by the scheduling unit when the deviation exceeds the limit. If the deviation is due to material shortage, the material constraint sub-library updates inventory warnings and adjusts the task order at the execution layer. If the deviation is due to equipment aging, the capacity decay curve from the equipment constraint sub-library is used to correct subsequent scheduling. The monitoring and early warning module at the interaction layer, through real-time deviation analysis (such as when process parameters exceed tolerances), triggers dynamic adjustments to compensate for the rigidity of traditional one-time scheduling, ensuring production stability.
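The deviation-triggered branching described above can be sketched as a small decision function. The function names, the returned action labels, and the sample thickness values are hypothetical; the patent names the branches (material shortage, equipment aging) but not an interface.

```python
def check_deviation(measured, target, tolerance):
    """Return the absolute deviation and whether it exceeds the preset threshold."""
    deviation = abs(measured - target)
    return deviation, deviation > tolerance

def handle_deviation(measured, target, tolerance, cause=None):
    """Map a monitored value to a follow-up action for the scheduling unit."""
    _, exceeded = check_deviation(measured, target, tolerance)
    if not exceeded:
        return "keep_schedule"
    if cause == "material_shortage":
        # Material constraint sub-library: update inventory warnings, reorder tasks.
        return "update_inventory_warning_and_adjust_task_order"
    if cause == "equipment_aging":
        # Equipment constraint sub-library: apply the capacity decay curve.
        return "apply_capacity_decay_correction"
    return "recalculate_schedule"

# Hypothetical coating thickness check: 27.5 um measured against a 25.0 um target.
action = handle_deviation(27.5, 25.0, tolerance=2.0, cause="equipment_aging")
```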
[0047] Through the combined action of the perception layer, constraint library, and interaction layer, the capacity decay status data curve is updated in real time, making equipment constraints change from static assumptions to dynamic adaptations. For example, after the coating machine has been running for 1000 hours, the system automatically corrects its production speed by extracting the operating features through the interaction layer, avoiding scheduling deviations caused by misjudgment of capacity.
[0048] To build the constraint library, static specification data of the equipment is collected through sensing devices in the perception layer, such as controllers and tension sensors, and dynamic operating data is collected through detection devices in the scheduling layer, such as tape coating detectors. Historical performance data and production capacity change curves are then derived from historical equipment data packages. The methods include, but are not limited to:
[0049] Equipment constraint sub-library:
[0050] Data acquisition: The Python Pandas library is used to clean the data, remove sensor outliers, and normalize the data, and missing maintenance data is filled in through time series analysis.
[0051] Constraint library construction: Using process, equipment, and product specification as three dimensions, a Boolean matching matrix is constructed and then updated through association rule mining on historical equipment data packages.
[0052] Using a linear regression or gradient boosting tree model, with operating hours as the independent variable and the ratio of actual capacity to theoretical capacity as the dependent variable, a decay curve is fitted. The curve is recalibrated quarterly based on newly collected operating-hour data.
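As a minimal sketch of the linear-regression option, the decay curve can be fitted by ordinary least squares and then used to correct the theoretical capacity. The sample data points and function names are invented for illustration, and a real deployment might use the gradient boosting alternative instead.

```python
def fit_decay_line(hours, ratios):
    """Least-squares fit of capacity ratio (actual / theoretical) against runtime."""
    n = len(hours)
    mean_x = sum(hours) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def corrected_capacity(theoretical, hours, slope, intercept):
    """Scale the theoretical capacity by the fitted ratio, clamped to [0, 1]."""
    ratio = min(1.0, max(0.0, intercept + slope * hours))
    return theoretical * ratio

# Hypothetical history: the ratio drops by 2 points per 500 operating hours.
slope, intercept = fit_decay_line([0, 500, 1000, 1500], [1.00, 0.98, 0.96, 0.94])
capacity_at_1000h = corrected_capacity(100.0, 1000, slope, intercept)
```

Scheduling against `capacity_at_1000h` rather than the nameplate value is what keeps the plan from assuming output the aged equipment can no longer deliver.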
[0053] Integrate preventive maintenance plans with real-time fault early warning data to construct time-axis constraints. Mark the unavailable periods of equipment in the equipment constraint sub-library. When the equipment vibration value exceeds the threshold, the maintenance window is automatically triggered in advance.
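The time-axis constraint can be illustrated with half-open unavailability intervals per machine. The helper names, the time units, and the vibration threshold are assumptions made for this sketch.

```python
def is_available(start, end, windows):
    """True if the task interval [start, end) overlaps no maintenance window."""
    return all(end <= w_start or start >= w_end for w_start, w_end in windows)

def advance_window(windows, index, now, vibration, threshold):
    """Pull a planned maintenance window forward when vibration exceeds the threshold.

    Returns a new list; the window keeps its duration but starts at `now`.
    """
    w_start, w_end = windows[index]
    if vibration > threshold and now < w_start:
        updated = list(windows)
        updated[index] = (now, now + (w_end - w_start))
        return updated
    return windows

# Hypothetical plan: maintenance from hour 100 to hour 120.
planned = [(100, 120)]
fits = is_available(80, 100, planned)                       # ends as the window starts
moved = advance_window(planned, 0, now=50, vibration=7.1, threshold=6.0)
```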
[0054] Cloud-edge collaborative updates: Edge computing nodes synchronize real-time device data every preset period. If the capacity decay rate exceeds the limit or the maintenance plan changes, the constraint library is updated immediately.
[0055] Traditional scheduling, which relies on experience to allocate equipment, can lead to specification mismatches, capacity misjudgments, and plan conflicts. Therefore, by combining the constraint library with the scheduling layer and describing equipment capabilities in a data-driven way, static specifications, dynamic performance, and plan constraints are transformed into mathematical constraints recognizable by the scheduling algorithm. The adaptation matrix clarifies equipment processing boundaries, preventing the allocation of large equipment to small-specification orders or low-precision equipment to high-precision requirements. This reduces the equipment specification mismatch rate and the rework rate caused by insufficient equipment capability. Furthermore, correcting capacity expectations dynamically via the decay curve avoids schedule delays caused by scheduling based on theoretical values, which reduces scheduling deviations and improves the on-time order delivery rate. Moreover, pre-locking unavailable periods through maintenance windows avoids time conflicts between production and maintenance, reduces unplanned downtime, improves equipment utilization, and ultimately enhances the scheduling efficiency of the tape production line's scheduling system.
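One way to picture the Boolean adaptation matrix over process, equipment, and product specification is a lookup keyed by those three dimensions. The equipment names and widths below are hypothetical, and the association-rule-mining update is not shown.

```python
# Hypothetical entries; none of these names or widths come from the patent.
MATCH = {
    ("coating", "coater_A", "48mm"): True,
    ("coating", "coater_A", "1280mm"): False,
    ("coating", "coater_B", "48mm"): False,
    ("coating", "coater_B", "1280mm"): True,
    ("slitting", "slitter_A", "48mm"): True,
}

def eligible_equipment(process, spec, match=MATCH):
    """Equipment whose matrix entry for (process, equipment, spec) is True."""
    return sorted(
        equipment
        for (proc, equipment, s), ok in match.items()
        if ok and proc == process and s == spec
    )

def mismatch_rate(assignments, match=MATCH):
    """Share of (process, equipment, spec) assignments that the matrix forbids."""
    bad = sum(1 for a in assignments if not match.get(a, False))
    return bad / len(assignments)

candidates = eligible_equipment("coating", "48mm")
```

Restricting the agent's machine-selection actions to `candidates` is one simple way such a matrix could act as a hard constraint for the scheduling algorithm.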
[0056] Material constraint sub-library:
[0057] Data Acquisition: RFID readers and writers for smart material racks and weight sensors for material conveying lines are used to collect data from the entire chain of core materials in tape production; RFID data is pushed to the cloud-edge collaboration layer every 10 seconds using the MQTT protocol, triggering real-time inventory updates when materials are issued or put into storage;
[0058] Constraint library construction: A relational table is built from material attributes and product specifications, and the adaptation relationship is updated based on quality inspection data. Multi-level inventory thresholds (safety, early warning, and emergency) are set based on the production consumption rate, the procurement cycle, and a safety factor. The time and loss incurred when switching between materials are quantified to provide a basis for order merging and scheduling. When the inventory falls below a threshold, the system automatically pushes a message. If the inventory of a material falls below the emergency threshold, the scheduling system automatically queries the adaptation table, gives priority to orders that do not depend on that material, and postpones the start time of the orders that do until the material is replenished.
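The patent names the inputs to the inventory thresholds (consumption rate, procurement cycle, safety factor) but not the formula. The sketch below assumes the simplest combination, consumption rate times procurement cycle times a per-level factor; the factor values and names are illustrative only.

```python
# Assumed per-level safety factors (illustrative, not from the patent).
FACTORS = {"safety": 1.5, "warning": 1.2, "emergency": 1.0}

def inventory_thresholds(daily_consumption, procurement_days, factors=FACTORS):
    """Assumed formula: threshold = consumption rate * procurement cycle * factor."""
    return {level: daily_consumption * procurement_days * f
            for level, f in factors.items()}

def inventory_level(stock, thresholds):
    """Return the most severe threshold the current stock has fallen below."""
    if stock < thresholds["emergency"]:
        return "emergency"   # postpone dependent orders, prioritize others
    if stock < thresholds["warning"]:
        return "warning"
    if stock < thresholds["safety"]:
        return "safety"
    return "normal"

# Hypothetical material: 100 units consumed per day, 5-day procurement cycle.
limits = inventory_thresholds(100, 5)
status = inventory_level(550, limits)
```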
[0059] In traditional scheduling, material information lags and supply and demand are often mismatched. Here, real-time perception, early warning linkage, and demand matching transform the inventory status, adaptability, and supply cycle of materials into scheduling constraints. This avoids downtime from material shortages caused by discrepancies between the recorded and actual inventory. It also shifts the system from passive response to proactive prevention, resolves supply risks in advance, and ensures that material quality meets product requirements, avoiding quality defects caused by material mismatch.
[0060] Order constraint sub-library:
[0061] Data collection: Import core order information from the ERP system, extract three priority factors, including delivery urgency, order amount, and customer level, and normalize the factor values using the range method;
[0062] Order priority weight allocation: The factors are compared pairwise and ranked by importance according to actual production needs, a judgment matrix is constructed, and the factor weights are calculated by the eigenvalue method. For example, the weights may be 40% for delivery urgency, 30% for order amount, and 30% for customer level; each order's weighted score is then computed, and the order sequence is formed by sorting on the scores.
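A minimal sketch of the eigenvalue method follows, using power iteration to approximate the principal eigenvector of the judgment matrix, together with range (min-max) normalization and scoring. The judgment matrix is constructed here so that it reproduces the 40/30/30 example; the order data and function names are hypothetical.

```python
def ahp_weights(judgment, iterations=100):
    """Approximate the normalized principal eigenvector by power iteration."""
    n = len(judgment)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(judgment[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]
    return w

def range_normalize(values):
    """Range method: map values to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def order_score(factors, weights):
    """Weighted sum of normalized priority factors."""
    return sum(f * w for f, w in zip(factors, weights))

# Pairwise comparisons: urgency is 4/3 as important as amount and customer level.
judgment = [[1, 4 / 3, 4 / 3],
            [3 / 4, 1, 1],
            [3 / 4, 1, 1]]
weights = ahp_weights(judgment)   # approximately [0.4, 0.3, 0.3]

# Hypothetical normalized factors: (urgency, amount, customer level) per order.
orders = {"A": (1.0, 0.2, 0.5), "B": (0.3, 1.0, 1.0)}
ranked = sorted(orders, key=lambda k: order_score(orders[k], weights), reverse=True)
```

When a new order arrives, recomputing `range_normalize` over the enlarged order set and re-sorting gives the automatic priority recalculation the scheduling layer relies on.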
[0063] Orders are linked to the scheduling layer: the scheduling system prioritizes allocating resources to high-scoring orders. When a new order is inserted, the priority is automatically recalculated. If it is an urgent order, a partial reordering of the existing schedule is triggered.
[0064] Traditional scheduling suffers from disordered order insertion and unbalanced resource allocation. By using multi-criteria quantification, scientific sorting, and dynamic adaptation, subjective order priorities are transformed into objective numerical constraints. This achieves optimal resource allocation while adapting to the diverse production needs of multiple varieties and small batches, and responding quickly to order changes.
[0065] Furthermore, the equipment, material, and order constraint sub-libraries do not operate in isolation, but form a closed loop through data exchange, constraint linkage, and goal collaboration. This makes the cloud-edge collaboration layer more effective in helping the scheduling layer optimize scheduling schemes, and lets the system cope with ad hoc, constantly changing scheduling requirements, thereby improving the practicality of the scheduling system.
[0066] The detection device includes a reciprocating sliding mechanism, a support platform, a laser thickness gauge, and a camera assembly. The support platform is mounted on the reciprocating sliding mechanism. The laser thickness gauge and the camera assembly are mounted on the bottom of the support platform and face the tape. An air jet pipe is mounted on the support platform, with the nozzle of the air jet pipe located on one side of the camera assembly and facing the other side and the laser thickness gauge. The opening of the air jet pipe is flat. A mounting frame is slidably connected to the support platform, and a sealing cover is slidably connected to the mounting frame via a spring. A mounting cover is slidably connected to the mounting frame via a spring, and a protective film is provided inside the mounting cover. An arc-shaped extrusion rubber is provided at the bottom of the mounting cover. An adhesive roller is rotatably connected to one end of the mounting frame, and the adhesive roller contacts the lens at the bottom of the camera assembly. A coating shaft is provided at the top of the mounting frame, and a coating agent is injected into the coating shaft.
[0067] The wet coating thickness of the tape is directly related to the dry coating thickness after drying: dry thickness = wet thickness × solid content of the adhesive, where the solid content is a known process parameter. Detecting the wet coating before drying identifies coating-amount problems earlier than detection after drying, reducing losses at the source. Measuring the wet coating thickness before drying and converting it to the dry coating thickness through the solid content provides core data for the tape production line scheduling system to perform front-end early warning, dynamic adjustment, and resource optimization. This shifts the system from passively responding to quality problems to proactively controlling the production rhythm, ultimately improving scheduling efficiency, capacity utilization, and order delivery quality.
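The solid-content conversion can be written as a pair of helper functions, taking dry thickness as wet thickness multiplied by the adhesive's solid content. This is a simplified sketch: it treats the solid content as a single fraction and ignores density differences between the wet and dried adhesive; the numbers are illustrative.

```python
def dry_thickness(wet_um, solid_content):
    """Predicted dry coating thickness from a wet measurement (micrometres)."""
    if not 0.0 < solid_content <= 1.0:
        raise ValueError("solid_content must be a fraction in (0, 1]")
    return wet_um * solid_content

def wet_target(dry_um, solid_content):
    """Wet thickness the coater must apply to reach a required dry thickness."""
    if not 0.0 < solid_content <= 1.0:
        raise ValueError("solid_content must be a fraction in (0, 1]")
    return dry_um / solid_content

# Hypothetical adhesive with 50% solids: a 50 um wet layer dries to 25 um.
predicted = dry_thickness(50.0, 0.5)
required_wet = wet_target(25.0, 0.5)
```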
[0068] Wet coating detection can identify thickness deviations immediately after coating. If the deviation is too large, the scheduling system, linked with the control system, triggers a sequence of real-time adjustment and local rejection actions, such as fine-tuning the blade gap and cutting off the abnormal wet coating section. This shortens the time spent handling anomalies, leaves the production schedule of subsequent orders unaffected, and helps the scheduling system account for product quality when scheduling.
[0069] If the lens protective film is aged or carries residue of old coating agent, the machine vision system can misjudge and raise false defect alarms, leading to unplanned downtime. A misjudgment may cause the scheduling system to arrange urgent equipment repair, and a missed real defect may let defective products flow into subsequent processes, requiring rework and disrupting the original order schedule. Therefore, when replacing the protective film on the camera assembly lens, personnel push the mounting frame toward the camera assembly until the adhesive roller contacts the protective film on the lens. The sealing cover stops moving because it is blocked by the camera assembly, while the mounting frame drives the adhesive roller onward, so that the roller peels off the old protective film as it moves. As the protective film is gradually peeled off, the exposed part of the lens immediately comes into contact with the sponge-like coating shaft, which holds coating agent to promote the coating effect. As the old protective film is peeled off, the coating shaft gradually wipes across the lens surface. The side of the coating shaft away from the mounting cover wipes away the old coating agent left behind by the old film, improving the cleanliness of the lens. The side of the coating shaft closer to the mounting cover wipes new coating agent onto the clean lens, so that the coating agent on the lens is renewed at the same time as the protective film is replaced. This improves the coating effect, and thus the detection quality and the scheduling accuracy.
[0070] Furthermore, while the mounting frame moves the mounting cover toward the lens, the sealing cover presses against the mounting cover to prevent it from rising. When the mounting cover reaches the film application position below the lens, the sealing cover moves away from the mounting cover, and the mounting cover rises under the action of the spring, applying the new protective film to the lens and completing the replacement of the old film with the new one. This makes the operation easier for personnel, simplifies the workflow, and improves replacement efficiency, thereby improving inspection efficiency. In addition, the short interval between peeling off the old film, applying the coating agent, and applying the new film improves film application efficiency and shortens the time the lens is exposed, preventing dust and other impurities from contaminating the lens, eliminating unnecessary lens cleaning steps, and avoiding scratches caused by repeated cleaning.
[0071] After the mounting cover brings the new film into contact with the lens, the spring continues to lift the mounting cover, which in turn causes the extrusion rubber to press the new film onto the lens. The center of the extrusion rubber contacts the center of the lens first. As the extrusion rubber is flattened, it gradually presses the new film onto the lens from the center outward. Air trapped during application is thereby expelled from the center outward, preventing bubbles from remaining under the film. Pressing the film from the center outward also pushes out excess coating agent, preventing it from affecting imaging and inspection. In addition, the outflow of coating agent helps carry the air bubbles with it, speeding up bubble removal, which improves film application efficiency and reduces inspection and maintenance time.
[0072] Furthermore, film replacement data can be integrated into the equipment data of the scheduling system to establish a correlation function curve between film replacement cycle and detection accuracy. The scheduling system can record the change in detection accuracy after each film replacement, automatically generate the optimal film replacement cycle, and push film replacement reminders to the scheduler 24 hours in advance, incorporating film replacement arrangements into planned maintenance. Moreover, it can avoid passively responding to detection inaccuracies caused by film aging, and instead proactively manage planned film replacements, making the scheduling plan more aligned with the equipment maintenance rhythm. In addition, the scheduling system can analyze the correlation between lens detection accuracy and effective production line capacity. For example, when the detection accuracy is ≥99.5%, the effective capacity reaches 30,000㎡ / day; when the accuracy drops to 98%, the effective capacity drops to 28,000㎡ / day. When the detection accuracy shows a downward trend, the scheduling system can predict capacity fluctuations in advance and proactively adjust subsequent order schedules, upgrading from post-event correction to pre-event prediction, improving the flexibility and risk resistance of the scheduling plan.
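The accuracy-to-capacity relationship in the example above can be sketched as a small lookup. The two data points (99.5% and 30,000㎡/day, 98% and 28,000㎡/day) come from the text; the linear interpolation between them, the clamping outside that range, and the function name `effective_capacity` are assumptions made only for illustration.

```python
def effective_capacity(accuracy: float) -> float:
    """Estimate effective capacity (m^2/day) from lens detection accuracy.

    Only the two anchor points are taken from the description; the straight
    line between them is an illustrative assumption, not the patented curve.
    """
    acc_lo, cap_lo = 0.98, 28_000.0
    acc_hi, cap_hi = 0.995, 30_000.0
    # Clamp to the documented range instead of extrapolating beyond it.
    a = min(max(accuracy, acc_lo), acc_hi)
    fraction = (a - acc_lo) / (acc_hi - acc_lo)
    return cap_lo + fraction * (cap_hi - cap_lo)


# A falling accuracy reading lets the scheduler lower the planned daily output
# before orders are actually delayed.
planned = effective_capacity(0.9875)
```

A scheduler could compare `planned` against the area already committed for the coming days and shift orders when the estimate falls short.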
[0073] Example 2:
[0074] Based on Example 1, the workshop scheduling system includes a workshop scheduling environment, an offline training module, and an online application module:
[0075] Workshop scheduling environment: The workshop scheduling problem is modeled as a Markov decision process. The scheduling task is then decomposed into multiple nodes using a disjunctive graph model, and a Gantt chart model is used to store the processing information matrix.
[0076] Offline training module: By enabling the agent to continuously interact with the environment, the generated reinforcement learning quadruple data is stored in the storage component. For the production line scenario, a deep reinforcement learning algorithm is selected from the algorithm pool to train the agent. The loss function is continuously calculated using the data of the agent's interaction, and the network weights of the agent are updated using the gradient descent algorithm until the network converges. Finally, the trained network model and weights are saved.
[0077] Online application module: The trained network model is loaded into the agent, and then the actual state of the workshop environment is input into the agent. The agent outputs the corresponding scheduling action through network decision, and then the workshop executes the scheduling action and updates to the next state. This cycle continues until all scheduling tasks are completed.
[0078] Disjunctive graphs utilize Gantt charts to reflect the processing time of each process and the specific time scale of the entire scheduling scheme. By combining disjunctive graphs and Gantt charts, the scheduling problem is modeled as a discrete sequential decision-making process. State transition functions, action spaces, reward functions, and agent network structures are designed for the sequential decision-making process.
[0079] The state transition function contains three types of state information: processing time, processing flag, and cumulative processing time;
[0080] The action space consists of two parts, process sequence and machine selection, which together encode each scheduling action;
[0081] The reward function is the short-term reward obtained by performing an action in the current state. When the agent performs an action, the environment undergoes a state transition and gives the agent an immediate reward. Since the maximum completion time cannot be known at the current moment, the quality of the current action can be approximated by subtracting the end time of the current state from the end time of the previous state.
[0082] The agent network structure includes a feature extraction network and a decision network. The feature extraction network consists of a convolutional neural network and a graph neural network. The convolutional neural network extracts features from the basic state of the workshop and embeds the state into the disjunctive graph. Then, the graph neural network is used to obtain the association information of the nodes from the disjunctive graph. Finally, the decision network inputs the features into a fully connected network and outputs the probability distribution of the agent's actions and the action value estimate.
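The immediate reward described in [0081] can be sketched as follows. It takes the end time of the partial schedule before the action and subtracts the end time after the action. The function names and the starting end time of 0 are illustrative assumptions.

```python
from typing import List


def step_reward(prev_end_time: float, curr_end_time: float) -> float:
    """Immediate reward: previous end time minus current end time.

    An action that does not lengthen the partial schedule earns 0; an action
    that pushes the end time out earns a negative reward.
    """
    return prev_end_time - curr_end_time


def episode_rewards(end_times: List[float]) -> List[float]:
    """Rewards for a sequence of end times, assuming the schedule starts at 0."""
    rewards = []
    prev = 0.0
    for end_time in end_times:
        rewards.append(step_reward(prev, end_time))
        prev = end_time
    return rewards


rewards = episode_rewards([3, 3, 7, 10])  # end time after each scheduled operation
```

Because the terms telescope, the rewards of an episode sum to the negative of the final completion time, so maximizing the cumulative reward is equivalent to minimizing the maximum completion time.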
[0083] Specifically, when deep reinforcement learning is used to solve the tape production line scheduling problem, the workshop scheduling problem is first modeled as a Markov decision process. The scheduling task is then decomposed into multiple nodes using a disjunctive graph model, and a Gantt chart model is used to store the processing information matrix. Because actual workshop scheduling is subject to various dynamic events, perturbation events such as human intervention and weather are added to the environment to form a dynamic production environment, which includes the state transition function, the reward function, the agent action decoder, and so on.
[0084] Then, by enabling the agent to continuously interact with the environment, the generated reinforcement learning quadruple data is stored in the storage component; that is, in the production line scenario, a deep reinforcement learning algorithm is selected from the algorithm pool to train the agent, the loss function is continuously calculated using the data of the agent's interaction, and the network weights of the agent are updated using the gradient descent algorithm until the network converges; finally, the trained network model and weights are saved.
[0085] The trained network model is loaded into the agent. Then, the actual state of the workshop environment, such as the workshop scheduling environment, is input into the agent. The agent outputs the corresponding scheduling action through network decision-making. Then, the workshop executes this scheduling action and updates to the next state. This process is repeated until all scheduling tasks are completed.
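The online application cycle (input the state, obtain an action, execute it, move to the next state, repeat until all tasks are scheduled) can be sketched as a plain loop. Everything named here is hypothetical: the toy state counts the operations still unscheduled for each job, and `longest_remaining_first` is a simple stand-in for the trained network, not the patented agent.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # toy state: operations still unscheduled per job


def run_online(state: State,
               policy: Callable[[State], int],
               step: Callable[[State, int], State],
               is_done: Callable[[State], bool],
               max_steps: int = 10_000) -> Tuple[State, List[int]]:
    """Repeat decide -> execute -> next state until all tasks are scheduled."""
    actions: List[int] = []
    for _ in range(max_steps):
        if is_done(state):
            break
        action = policy(state)       # agent outputs a scheduling action
        state = step(state, action)  # workshop executes it and moves on
        actions.append(action)
    return state, actions


def longest_remaining_first(state: State) -> int:
    """Stand-in policy: pick the job with the most operations left."""
    return max(range(len(state)), key=lambda j: state[j])


def toy_step(state: State, action: int) -> State:
    """Schedule one operation of the chosen job."""
    return tuple(n - 1 if j == action else n for j, n in enumerate(state))


def all_scheduled(state: State) -> bool:
    return all(n == 0 for n in state)


final_state, chosen = run_online((2, 1), longest_remaining_first,
                                 toy_step, all_scheduled)
```

Passing the policy, transition, and termination test as plain callables keeps the loop independent of how the real workshop state is represented.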
[0086] By employing deep reinforcement learning technology, a three-layer architecture of environment modeling, offline training, and online application is established. The workshop scheduling environment defines the problem boundaries and interaction rules, providing a virtual training ground for offline training and a state translator for online applications. Through interactive learning with the virtual environment, optimization strategies adapted to real-world scenarios are generated. The decision-making capabilities of the offline model are applied to actual production, while real data is fed back to offline training, forming a closed loop of virtual learning, real-world application, and data feedback. The constraint library also shares equipment, material, and order feature information with the online application module. Before execution, after making a scheduling decision, the accuracy of the production scheduling decision is verified again through real-time information from the constraint library, avoiding deviations caused by external factors during the process.
[0087] By using a Markov decision process, the scheduling problem is transformed into a sequential decision problem that an agent can optimize iteratively, and complex production constraints are transformed into data states that the agent can interpret. The disjunctive graph clearly expresses the logical constraints between processes, which makes the constraints easier for the agent to learn, reduces the modeling time of the scheduling problem, and speeds up response. Furthermore, the Gantt chart stores the processing status intuitively, so that high-dimensional information can be retrieved and updated efficiently, shortening the agent's query time and meeting the needs of real-time decision-making.
[0088] Dynamic modeling of state transitions abstracts the production process into a quantifiable state flow by combining three types of time and label information, satisfying the assumption of no aftereffect of Markov decision process, enabling the agent to make independent decisions based on the current state without tracing historical information, reducing decision complexity, avoiding complex decoding process, and improving scheduling decision generation efficiency.
[0089] The structured action space addresses the decision-making logic of workshop scheduling, which prioritizes selecting processes and then allocating equipment. It uses two-dimensional coding to accurately map the actual scheduling process, avoiding redundancy in the action space and improving the decision-making efficiency of the intelligent agent.
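One way to realize a two-part action (which process to schedule, then which machine to use) is to flatten the pair into a single index for the network's output layer and decode it afterwards. The flattening scheme and the names `encode_action` and `decode_action` are assumptions for illustration; the description only states that the action has these two parts.

```python
from typing import Tuple


def encode_action(op_index: int, machine_index: int, n_machines: int) -> int:
    """Map (process, machine) to one integer in [0, n_ops * n_machines)."""
    if not 0 <= machine_index < n_machines:
        raise ValueError("machine_index out of range")
    return op_index * n_machines + machine_index


def decode_action(action: int, n_machines: int) -> Tuple[int, int]:
    """Recover (process, machine) from the flat action index."""
    return divmod(action, n_machines)


# With 3 machines, scheduling process 4 on machine 2 becomes action 14.
flat = encode_action(4, 2, n_machines=3)
pair = decode_action(flat, n_machines=3)
```

Every flat index corresponds to exactly one (process, machine) pair, so the action space contains no redundant entries.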
[0090] The reward function addresses the pain point of sparse reward signals in the scheduling problem of reinforcement learning. Traditional methods require waiting for the entire process to be completed before the quality of actions can be evaluated. Real-time feedback accelerates the learning process of the agent and reduces the cost of offline training with the same amount of training data.
[0091] The feature fusion of convolutional neural networks and graph neural networks enables the comprehensive extraction of numerical and relational features. The combination of the two allows the agent to simultaneously grasp the state of production resources and logical constraints, making decisions more realistic, reducing the deviation between the agent's output of the optimal solution and the theoretical optimal solution, and improving the accuracy of the scheduling system.
[0092] Example 3:
[0093] Based on Example 2, the offline training module uses the proximal policy optimization algorithm to train the agent. The algorithm architecture uses two identical agent networks: one network is used for sampling, and the other is updated repeatedly, so that the policy iteration gradually converges to the optimal policy.
[0094] The basic steps of the proximal policy optimization algorithm for solving the workshop scheduling problem are as follows:
[0095] Step 1: Initialize the policy network parameters, value network parameters, and hyperparameters in the algorithm;
[0096] Step 2: Start the iteration and record the number of iterations;
[0097] Step 3: The agent outputs the probability distribution of actions through the policy function and stores the data generated by interacting with the workshop scheduling environment in the replay memory.
[0098] Step 4: Determine whether the current state is the end state of a round. If not, return to step 3. If so, calculate the return estimate by accumulating the rewards backwards from the end of the round.
[0099] Step 5: Record the number of agent updates and update the agent in an off-policy manner. Check the number of updates. If it is less than the threshold, repeat step 5; otherwise, proceed to the next step.
[0100] Step 6: Update the neural network weights to the old policy network, and compare the current scheduling scheme with the previous optimal scheduling scheme. If the current scheme is better than the previous scheme, save the current scheduling scheme and set it as the optimal scheduling scheme; otherwise, discard it.
[0101] Step 7: Determine the number of algorithm iterations. If the number is less than the threshold, return to step 2 to continue iterating; otherwise, the algorithm ends and outputs the optimal scheduling result and scheduling scheme.
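The return estimate computed at the end of a round in Step 4 can be sketched as a backward pass over the rewards of that round. The discount factor `gamma` and the function name `returns_to_go` are assumptions; the description does not state whether the rewards are discounted.

```python
from typing import List


def returns_to_go(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Accumulate rewards backwards from the last step of a round.

    out[t] = rewards[t] + gamma * out[t + 1], with out[T] = 0 after the end.
    """
    out = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# With gamma = 0.5 the influence of later rewards halves at each step back.
example = returns_to_go([1.0, 1.0, 1.0], gamma=0.5)
```

Setting `gamma=1.0` gives the undiscounted sum of the remaining rewards at each step.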
[0102] Specifically: The proximal policy optimization algorithm achieves stable iteration of the agent's policy through a dual-network alternating update mechanism. The core process is as follows:
[0103] Initialization configuration: Set the initial parameters for the policy network and value network, and configure hyperparameters;
[0104] Interactive sampling: The policy network outputs the probability distribution of actions based on the current workshop state. The agent interacts with the scheduling environment accordingly to generate four-tuple data (state, action, reward, next state) and stores it in the replay memory.
[0105] Round termination judgment: If the current state is the termination state, for example when all orders are completed, the total return of the round is calculated by accumulating the rewards backwards; otherwise, sampling continues.
[0106] Off-policy update: Batch data is extracted from the replay memory and the network is updated in an off-policy manner; the value network evaluates the state value and calculates the advantage function; the policy network limits the deviation between the new policy and the old policy through proximal constraints and iteratively optimizes until the preset number of updates is reached;
[0107] Network synchronization and scheme preservation: Synchronize the updated network weights to the old policy network, compare the current scheduling scheme with the historical best scheme, and retain the better scheme;
[0108] Iteration termination judgment: If the total number of iterations has not reached the threshold, return to step 2 to continue training; otherwise, output the optimal scheduling scheme.
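The proximal constraint that limits how far the new policy may move from the old one is commonly implemented as a clipped surrogate objective. The sketch below shows that common form in plain Python. The clipping variant, the value `eps=0.2`, and all names are assumptions, since the description does not specify which form of the constraint is used.

```python
import math
from typing import Sequence


def clipped_surrogate(new_logp: Sequence[float],
                      old_logp: Sequence[float],
                      advantages: Sequence[float],
                      eps: float = 0.2) -> float:
    """Mean clipped surrogate objective over a non-empty batch (maximized).

    ratio = pi_new(a|s) / pi_old(a|s), computed from log-probabilities.
    The ratio is clipped to [1 - eps, 1 + eps], and the smaller of the
    clipped and unclipped terms is kept for each sample.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# The new policy doubles the probability of a good action (advantage +1):
# the objective is capped at 1 + eps, so a larger step earns nothing extra.
capped = clipped_surrogate([math.log(2.0)], [0.0], [1.0])
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage, which is the starting point of each update round after the weights are synchronized.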
[0109] Traditional policy gradient algorithms suffer from instability during training due to the strong correlation between policy updates and sampled data. By separating the sampling network from the update network, the sampling network remains relatively fixed, ensuring a stable data distribution. The update network is optimized on previously sampled data, avoiding the drastic policy fluctuations and performance oscillations common in traditional algorithms. This ensures that the agent converges stably to a better policy and improves the accuracy of policy generation in the scheduling system.
[0110] Furthermore, the off-policy update mechanism allows a single piece of interaction data to be reused multiple times, reducing the number of interactions with the environment needed to achieve the same training effect, shortening the training cycle, and improving the response speed of the scheduling system.
[0111] Furthermore, batch updates using historical data from the replay memory reduce sample correlation. The advantage function quantifies the relative value of actions, focusing policy optimization on improving marginal returns, shortening the maximum completion time obtained by the proximal policy optimization algorithm, and further improving the accuracy of policy generation in the scheduling system.
[0112] Example 4:
[0113] Based on Example 3, the multi-objective task scheduling system includes a data collection and preprocessing module, a multi-objective optimization module, a cloud-edge resource allocation module, and a task scheduling algorithm module.
[0114] Specifically, the core modules of the multi-objective task scheduling system based on cloud-edge collaboration and intelligent optimization technology include:
[0115] Multi-objective optimization module:
[0116] The core objectives are to minimize total completion time, energy consumption, and production cost, and these are incorporated into the constraint library.
[0117] The problem of tape production line scheduling is abstracted into a multi-objective optimization model. For example, for the coating process, the correlation equations of equipment capacity, energy consumption and processing efficiency are constructed to quantify the multi-objective benefits under different processing parameters.
[0118] Cloud-edge resource allocation module:
[0119] Edge nodes are responsible for tasks with high real-time requirements, and the cloud undertakes computationally intensive tasks.
[0120] Task scheduling algorithm module:
[0121] A non-dominated sorting genetic algorithm is used to solve the multi-objective optimization problem. Through steps such as encoding the equipment allocation and process sequence, population initialization, crossover and mutation, and non-dominated sorting, a set of Pareto-optimal solutions is generated, covering scheduling schemes with different objective weights.
[0122] Based on the priority requirements of the tape production line, the optimal solution is selected from the Pareto front, and a scheduling Gantt chart including equipment, process, and time is generated.
[0123] To address the conflicting relationships between efficiency, cost, and energy consumption in tape production lines, such as increasing coating speed to shorten the production period but increasing energy consumption, a non-dominated sorting genetic algorithm is used to generate multiple optimal solutions under the premise of satisfying constraints. This avoids the deterioration of other indicators caused by single-objective optimization and provides decision-makers with flexible choices, such as prioritizing shortening the production period when rushing to meet deadlines and prioritizing reducing energy consumption when saving energy.
[0124] Furthermore, this improves the overall optimization of the tape production line in terms of completion time, energy consumption, and cost. The Pareto-optimal solution set provides suitable solutions for different scenarios: solutions with the shortest production period are given priority for urgent orders, and low-energy solutions are given priority in environmental control scenarios, thereby improving the flexibility of the scheduling system.
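The non-dominated filtering step and the scenario-based choice from the Pareto front can be sketched as follows. This shows only the first front, not the full non-dominated sorting genetic algorithm with crossover and mutation. The tuple layout (completion time, energy, cost), the sample values, and the weighted-sum selection are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Solution = Tuple[float, float, float]  # (completion time, energy, cost), all minimized


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions)]


def pick(front: List[Solution], weights: Sequence[float]) -> Solution:
    """Choose from the front by a weighted sum reflecting current priorities."""
    return min(front, key=lambda s: sum(w * x for w, x in zip(weights, s)))


candidates = [(10, 5, 3), (8, 6, 3), (12, 6, 4), (8, 6, 2)]
front = pareto_front(candidates)
rush_order = pick(front, (1, 0, 0))     # prioritize completion time
energy_saving = pick(front, (0, 1, 0))  # prioritize energy consumption
```

In practice the objectives would be normalized to comparable scales before a weighted sum is applied; the raw values are used here only to keep the sketch short.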
[0125] Example 5:
[0126] A production scheduling method for a tape production line based on intelligent optimization, the specific steps of which are as follows:
[0127] Step 1: Collect data from the entire tape production process using industrial IoT devices to complete multi-dimensional data collection and preprocessing, including equipment data, process data, material data, and order data;
[0128] Step 2: Build a dynamic constraint library based on the preprocessed data, including equipment constraint sub-libraries, material constraint sub-libraries, and order constraint sub-libraries;
[0129] Step 3: Define the core optimization objective of scheduling, construct a mathematical model with the core objectives of minimizing total completion time, reducing energy consumption, and reducing production costs. Determine the weight of each objective using the entropy weight method, and integrate the equipment, material, and order constraints from Step 2 into mathematical expressions. This transforms the scheduling problem into a quantifiable mathematical optimization problem, balancing the conflict between efficiency, cost, and service quality.
[0130] Step 4: Generate an initial scheduling scheme through the algorithm, select the comprehensive optimal scheme from the optimized solutions, and generate information including equipment, processes, and time;
[0131] Step 5: Implement the scheduling plan through the industrial control system and track the execution status in real time;
[0132] Step 6: Extract features such as equipment load, material inventory, and order progress through heterogeneous graph neural networks, calculate the deviation between actual and planned values, provide real-time feedback data to identify deviations, and trigger a dynamic adjustment mechanism.
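The entropy weight method named in Step 3 can be sketched as follows, using its standard textbook form. The layout of the input (rows are candidate schedules, columns are objectives, all values positive) and the function name are assumptions; the description does not give the exact formulation used.

```python
import math
from typing import List


def entropy_weights(matrix: List[List[float]]) -> List[float]:
    """Objective weights from the entropy weight method.

    Needs at least two rows, positive entries, and at least one column whose
    values differ. A column whose values barely differ across rows carries
    little information, has entropy near 1, and receives a weight near 0.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n_rows)
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]


# Two candidate schedules, two objectives: the first objective is identical in
# both schedules, so nearly all of the weight goes to the second.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
```

The resulting weights sum to 1 and can be used to combine the completion time, energy, and cost objectives into a single score.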
[0133] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and descriptions in the specification are merely illustrative of the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the present invention as claimed. The scope of protection of the present invention is defined by the appended claims and their equivalents.
Claims
1. A production scheduling system for a tape production line based on intelligent optimization, characterized in that the scheduling system includes: Perception layer: monitors the operating data of each device through sensing devices, organizes the device data into device data packets, and uploads them to the control center, the device data packets including production status, operating status, process parameters, and material inventory; Cloud-edge collaboration layer: a constraint library is built based on the device data packets, the constraint library including an equipment constraint sub-library, a material constraint sub-library, and an order constraint sub-library; Scheduling layer: a workshop scheduling system based on deep reinforcement learning is constructed, including equipment units, resource units, scheduling units, and execution units, wherein the equipment units include detection devices for production equipment, the resource units include past order scheduling records and scheduling logic, the scheduling units include the initial scheduling scheme transmitted by the scheduling system, and the execution units include transmission and display devices to notify and control the coordination between production equipment; Execution layer: a multi-objective task scheduling system is embedded to evenly distribute order tasks into the workshop scheduling system; Interaction layer: a monitoring and early warning module is built, in which the detection data of the detection device in the scheduling layer is used as the feature extraction value and the tape product parameters are used as the preset threshold, and when the deviation value exceeds the preset threshold, the dynamic adjustment mechanism is triggered to update the scheduling plan. The detection device includes a reciprocating sliding mechanism, a support platform, a laser thickness gauge, and a camera assembly.
The support platform is mounted on the reciprocating sliding mechanism, and the laser thickness gauge and camera assembly are mounted on the bottom of the support platform and face the tape. An air jet pipe is mounted on the support platform, with the nozzle of the air jet pipe located on one side of the camera assembly and facing the other side and the laser thickness gauge. The opening of the air jet pipe is flat. A mounting frame is slidably connected to the support platform, and a sealing cover is slidably connected to the mounting frame via a spring. A mounting cover is slidably connected to the mounting frame via a spring. The mounting cover contains a protective film, and the bottom of the mounting cover has an arc-shaped extrusion rubber. One end of the mounting bracket is rotatably connected to an adhesive roller, which contacts the lens at the bottom of the camera assembly; the top of the mounting bracket is equipped with a coating shaft, and the coating shaft is filled with an applicator. The thickness of the wet coating of the tape is directly related to the thickness of the dry coating after drying. Dry thickness = wet thickness * adhesive solid content. The wet coating inspection immediately identifies thickness deviations after coating. If the deviation is too large, the scheduling system, in conjunction with the control system, triggers continuous actions of immediate adjustment and local rejection, including fine-tuning the scraper gap and removing abnormal wet coating segments to reduce the time for abnormal processing. When replacing the protective film on the camera module lens, the operator pushes the mounting bracket close to the camera module until the adhesive roller contacts the protective film on the lens. The sealing cap stops moving due to the obstruction of the camera module, and the mounting bracket then drives the adhesive roller to continue moving. As the adhesive roller moves, it removes the old protective film. 
As the protective film is gradually peeled off the lens, the exposed part of the lens immediately comes into contact with the sponge-material coating roller. The coating roller stores regular coating agent to promote the application of the film. As the old protective film is peeled off, the coating roller gradually wipes across the lens surface. The side of the coating roller away from the mounting cap wipes away the old coating agent residue on the old film, improving the cleanliness of the lens. The side of the coating roller closer to the mounting cap wipes the new coating agent onto the clean lens, so that the coating agent is replaced at the same time as the protective film is replaced, improving the application effect. When the mounting bracket moves the mounting cover closer to the lens, the sealing cover presses against the mounting cover to prevent it from rising. When the sealing cover moves to the film application position below the lens, the sealing cover moves away from the mounting cover, and the mounting cover rises due to the influence of the spring, applying the new protective film to the lens, thus completing the replacement of the old and new films. After the mounting cap brings the new film into contact with the lens, the spring continues to drive the mounting cap to rise. The mounting cap then drives the extrusion rubber to press the new film onto the lens. The center of the extrusion rubber first contacts the center of the lens. As the extrusion rubber is flattened, it gradually applies the new film to the lens, using the center as a reference, so that air bubbles generated during the film application process can be expelled from the center to the edges. In addition, the expulsion of the coating agent can also expel air bubbles along with it, improving the fluidity of the air bubbles. 
The film replacement data is integrated into the equipment data of the scheduling system to establish a correlation function curve between the film replacement cycle and the detection accuracy. The scheduling system records the change in detection accuracy after each film replacement, automatically generates the optimal film replacement cycle, and pushes a film replacement reminder to the scheduler 24 hours in advance, thus incorporating the film replacement schedule into the planned maintenance.
2. The intelligent optimization-based tape production line scheduling system according to claim 1, characterized in that: The constraint library includes: an equipment constraint sub-library containing equipment specification adaptation matrices, capacity decay function curves, and downtime maintenance time windows, built based on historical equipment operation data; a material constraint sub-library associated with material storage disks in the equipment, updating material inventory balances, storage locations, and expiration dates in real time, and setting multi-level inventory warning thresholds; and an order constraint sub-library using the analytic hierarchy process to prioritize orders, with priority factors including delivery urgency, order amount, and customer level.
3. The intelligent optimization-based tape production line scheduling system according to claim 1, characterized in that: The workshop scheduling system includes a workshop scheduling environment, an offline training module, and an online application module. Workshop scheduling environment: The workshop scheduling problem is modeled as a Markov decision process. The scheduling task is then decomposed into multiple nodes using a disjunctive graph model, and a Gantt chart model is used to store the processing information matrix. Offline training module: By enabling the agent to continuously interact with the environment, the generated reinforcement learning quadruple data is stored in the storage component. For the production line scenario, a deep reinforcement learning algorithm is selected from the algorithm pool to train the agent. The loss function is continuously calculated using data from agent interactions, and the network weights of the agent are updated using the gradient descent algorithm until the network converges; finally, the trained network model and weights are saved. Online application module: The trained network model is loaded into the agent, and the actual state of the workshop environment is then input into the agent. The agent outputs the corresponding scheduling action through network decision-making. The workshop then executes this scheduling action and updates to the next state. This process is repeated until all scheduling tasks are completed.
4. The intelligent optimization-based tape production line scheduling system according to claim 3, characterized in that: Disjunctive graphs utilize Gantt charts to reflect the processing time of each process and the specific time scale of the entire scheduling scheme. By combining disjunctive graphs and Gantt charts, the scheduling problem is modeled as a discrete sequential decision-making process. State transition functions, action spaces, reward functions, and agent network structures are designed for the sequential decision-making process.
5. The intelligent optimization-based tape production line scheduling system according to claim 4, characterized in that: The state transition function contains three types of state information: processing time, processing flag, and cumulative processing time; The action space consists of two parts, process sequence and machine selection, which together encode each scheduling action; The reward function is the short-term reward obtained by performing an action in the current state; The agent network structure includes a feature extraction network and a decision network.
6. The intelligent optimization-based tape production line scheduling system according to claim 5, characterized in that: The offline training module uses a proximal policy optimization algorithm to train the agent. The algorithm architecture uses two identical agent networks, one of which samples and the other of which is updated repeatedly, so that the policy iteration gradually converges to the optimal policy.
7. The intelligent optimization-based tape production line scheduling system according to claim 1, characterized in that: The multi-objective task scheduling system includes a data collection and preprocessing module, a multi-objective optimization module, a cloud-edge resource allocation module, and a task scheduling algorithm.
8. A production scheduling method for a tape production line based on intelligent optimization, wherein the scheduling method uses the scheduling system described in any one of claims 1-7 above, characterized in that the specific steps are as follows: Step 1: Collect data from the entire tape production process using industrial IoT devices to complete multi-dimensional data collection and preprocessing, including equipment data, process data, material data, and order data; Step 2: Build a dynamic constraint library based on the preprocessed data, including equipment constraint sub-libraries, material constraint sub-libraries, and order constraint sub-libraries; Step 3: Define the core optimization objectives of scheduling, and construct a mathematical model with the core objectives of minimizing total completion time, reducing energy consumption, and reducing production costs. Determine the weight of each objective using the entropy weight method, and integrate the equipment, material, and order constraints from Step 2 into mathematical expressions. This transforms the scheduling problem into a quantifiable mathematical optimization problem, balancing the conflict between efficiency, cost, and service quality; Step 4: Generate an initial scheduling scheme through the algorithm, select the comprehensive optimal scheme from the optimized solutions, and generate information including equipment, processes, and time; Step 5: Implement the scheduling plan through the industrial control system and track the execution status in real time; Step 6: Extract features such as equipment load, material inventory, and order progress through heterogeneous graph neural networks, calculate the deviation between actual and planned values, provide real-time feedback data to identify deviations, and trigger a dynamic adjustment mechanism.