Intelligent warehousing management method and device
By constructing a virtual space data foundation and a digital twin model, combined with a federated learning multi-agent collaborative framework, the method addresses information silos and equipment compatibility problems in warehouse management systems, enabling accurate perception and prediction of warehouse status and improving management efficiency and operation and maintenance support.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- GKHT MEDICAL TECH CO LTD
- Filing Date
- 2026-05-26
- Publication Date
- 2026-06-23
AI Technical Summary
Existing warehouse management systems suffer from information silos, static and rigid planning strategies, poor compatibility due to equipment heterogeneity, and outdated equipment maintenance methods, making it difficult to meet the needs of efficient, flexible, and intelligent warehouse management in complex scenarios.
Multi-source heterogeneous data is collected through an industrial IoT two-way communication protocol stack. Semantic, structural, spatiotemporal, relational and state-behavior mapping is then performed to build a virtual space data foundation. A digital twin model is established on this foundation with a physical mechanism model and a data-driven model embedded in it, and a federated learning multi-agent collaborative framework is built to achieve accurate perception and prediction of warehouse status and collaborative action planning.
It improves warehouse management efficiency and operation and maintenance support, enables accurate perception and prediction of warehouse status, and enhances the system's adaptability and the real-time monitoring and predictive maintenance capabilities of equipment.
Figure CN122264705A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of data processing, specifically to an intelligent warehouse management method and apparatus.
Background Technology
[0002] Currently, warehouse management systems are widely used in industries such as manufacturing, e-commerce, and cold chain logistics. However, existing technologies still have many shortcomings and are difficult to meet the needs of efficient, flexible, and intelligent warehouse management in complex scenarios.
[0003] First, the problem of information silos is prominent. Traditional warehouse management systems (WMS), warehouse control systems (WCS), and transportation management systems (TMS) suffer from data fragmentation and lack a unified collaboration mechanism. Second, planning strategies are static and rigid. Existing systems rely heavily on manual experience or fixed rules in areas such as location allocation, picking route planning, and inventory scheduling, and cannot adapt to dynamic order fluctuations. Third, equipment is heterogeneous and poorly compatible. Warehouse environments often involve automated equipment such as AGVs, robots, shuttles, and conveyors from various brands, and incompatible protocols between devices make system integration difficult and costly. Fourth, equipment maintenance methods are outdated. Most current warehouse systems rely on manual inspections and experience-based judgment to detect equipment failures, lacking real-time monitoring and predictive maintenance capabilities.
[0004] In conclusion, there is an urgent need for an intelligent warehouse management method that can improve warehouse management efficiency and operation and maintenance support.
Summary of the Invention
[0005] To address the problems in the existing technology, this application provides an intelligent warehouse management method and device, which can improve warehouse management efficiency and operation and maintenance support.
[0006] To solve at least one of the above problems, this application provides the following technical solution: Firstly, this application provides an intelligent warehouse management method, including: Multi-source heterogeneous data in the warehouse environment is collected through the industrial Internet of Things bidirectional communication protocol stack. Semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping are performed on the multi-source heterogeneous data to determine the corresponding virtual space data base. A corresponding digital twin model is constructed based on the virtual space data base, and a preset physical mechanism model and a set data-driven model are embedded in the digital twin model. The real-time operating condition data stream of the warehousing system is received based on the digital twin model, and the real-time operating condition data stream is mapped into a global state tensor. The global state tensor is used to characterize the current warehousing status and warehousing prediction information. A federated learning framework is constructed based on multiple agents deployed on various device clusters. Each agent obtains local state information related to itself from the global state tensor and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate actions. According to the federated learning framework, each candidate action is uploaded to the central server. The preset global policy network resolves conflicts among the candidate actions to determine the corresponding set of cooperative actions. The set of cooperative actions is input into the task planner for task decomposition, and the executable atomic task sequence obtained after decomposition is sent to each device cluster to execute the warehouse management task.
[0007] Furthermore, the semantic mapping and the structural mapping include: The semantic mapping includes establishing a unified data dictionary, mapping data entities with the same meaning but different names or definitions from different source systems to the same standard identifier, and associating isolated data with physical entities, time tags, and spatial locations in context. The structure mapping includes mapping relational tables, time-series data, JSON documents, and unstructured data to the unified data structure of the digital twin model through predefined ETL or ELT transformation rules, and associating the storage path of unstructured data with the attribute fields of the corresponding physical entities.
[0008] Furthermore, the spatiotemporal mapping, the relational mapping, and the state-behavior mapping include: The spatiotemporal mapping includes establishing a precise geometric coordinate mapping for each physical entity in virtual space through GPS coordinates, indoor positioning beacons, or 3D model registration, and establishing a unified time axis for all time series data, synchronously processing timestamp differences from different sources. The relationship mapping includes constructing a knowledge graph between entities and defining the logical relationships and connection methods between physical entities; The state behavior mapping includes defining a state machine for each physical entity and setting real-time data conditions to trigger state transitions for the state machine.
[0009] Furthermore, the step of constructing a corresponding digital twin model based on the virtual space data base, and embedding a preset physical mechanism model and a set data-driven model into the digital twin model, includes: Based on the virtual space data base, a virtual model of the physical storage space, equipment entities, and work processes is constructed through a 3D modeling engine, and a mapping channel is constructed to synchronously map the real-time data of the physical entities to the entity attribute fields corresponding to the virtual model, thereby determining the corresponding digital twin model. A preset physical mechanism model is obtained and embedded into the digital twin model. The physical mechanism model includes Newtonian mechanics, thermodynamics, and fluid dynamics equations to describe the physical behavior of equipment movement, energy consumption, and cargo stress. A data-driven model is trained based on historical fusion data and then embedded into the digital twin model. The data-driven model includes an anomaly detection model trained based on unsupervised learning, a predictive maintenance model trained based on supervised learning, and an optimization control model trained based on reinforcement learning.
[0010] Further, the step of receiving real-time operational data streams of the warehousing system according to the digital twin model and mapping the real-time operational data streams to a global state tensor includes: The real-time operating condition data stream of the warehousing system is received according to the digital twin model. The real-time operating condition data stream is standardized based on the construction rules of the virtual space data base to determine the corresponding standardized data. The standardized data is associated and fused according to its corresponding physical entity spatial location and timestamp to determine the corresponding high-dimensional state tensor. The high-dimensional state tensor represents the current global status of the warehousing system. The physical mechanism model is invoked to perform deterministic evolution calculations on the standardized data, and the data-driven model is invoked to perform probabilistic trend predictions on the standardized data. The results of the evolution calculations and trend predictions are then added as extended dimensions to the high-dimensional state tensor to determine the corresponding global state tensor.
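The fusion and extension steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "tensor" is a plain nested list indexed by entity and feature, and the names `build_state_tensor`, `extend_with_predictions`, `physics_step` and `trend_estimate` are hypothetical placeholders, not the models of this application.

```python
from typing import Callable, List, Tuple

# Hypothetical standardized reading: (entity_id, feature_name, value).
Reading = Tuple[str, str, float]


def build_state_tensor(readings: List[Reading], entities: List[str],
                       features: List[str]) -> List[List[float]]:
    """Fuse standardized readings into an [entity][feature] grid (0.0 if missing)."""
    entity_index = {name: i for i, name in enumerate(entities)}
    feature_index = {name: j for j, name in enumerate(features)}
    tensor = [[0.0] * len(features) for _ in entities]
    for entity, feature, value in readings:
        tensor[entity_index[entity]][feature_index[feature]] = value
    return tensor


def extend_with_predictions(tensor: List[List[float]],
                            physics: Callable[[List[float]], float],
                            trend: Callable[[List[float]], float]) -> List[List[float]]:
    """Append one deterministic and one probabilistic value to every entity row."""
    return [row + [physics(row), trend(row)] for row in tensor]


def physics_step(row: List[float]) -> float:
    # Assumed mechanism model: position after 1 s at constant speed.
    return row[0] + row[1] * 1.0


def trend_estimate(row: List[float]) -> float:
    # Assumed stand-in for a trained model: a crude risk score in [0, 1].
    return min(1.0, row[1] / 10.0)


readings = [("agv_1", "position_m", 2.0), ("agv_1", "speed_mps", 1.5),
            ("agv_2", "position_m", 10.0)]
state = build_state_tensor(readings, ["agv_1", "agv_2"], ["position_m", "speed_mps"])
global_state = extend_with_predictions(state, physics_step, trend_estimate)
```

Each row of `global_state` then holds the current measurements followed by the two extended dimensions, which mirrors the idea of appending evolution and trend results to the high-dimensional state tensor.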
[0011] Furthermore, each of the intelligent agents obtains relevant local state information from the global state tensor, and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate action, including: Construct and pre-train a local policy network, which is a local reinforcement learning model, including a state space, an action space, and a reward function. The state space includes local state data, the action space defines the set of atomic actions that each agent can execute, and the reward function is defined as a multi-objective weighted sum. Each of the aforementioned intelligent agents subscribes to local state information related to itself from the global state tensor according to its function type, and performs dimensional indexing and normalization on the local state information according to a preset slicing rule to determine the corresponding local state feature vector. The local state feature vector is input into the local policy network for action prediction to determine the corresponding candidate actions.
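As a rough illustration of the slicing, normalization and reward definition described above, the following sketch assumes the global state is exposed as a dictionary of named features. The slicing rules, value bounds, objective names and weights are all hypothetical, and min-max scaling is only one possible normalization.

```python
from typing import Dict, List

# Hypothetical slicing rules: which global features each agent type subscribes to.
SLICING_RULES: Dict[str, List[str]] = {
    "agv": ["battery_pct", "queue_len"],
    "shuttle": ["queue_len"],
}
# Hypothetical value ranges used for min-max normalization.
BOUNDS = {"battery_pct": (0.0, 100.0), "queue_len": (0.0, 20.0)}


def local_feature_vector(global_state: Dict[str, float], agent_type: str) -> List[float]:
    """Index the subscribed features and scale each one into [0, 1]."""
    vector = []
    for name in SLICING_RULES[agent_type]:
        low, high = BOUNDS[name]
        vector.append((global_state[name] - low) / (high - low))
    return vector


def weighted_reward(objectives: Dict[str, float], weights: Dict[str, float]) -> float:
    """Multi-objective reward as a weighted sum of objective terms."""
    return sum(weights[name] * objectives[name] for name in weights)


agv_vector = local_feature_vector(
    {"battery_pct": 50.0, "queue_len": 5.0, "ambient_temp_c": 21.0}, "agv")
reward = weighted_reward({"throughput": 1.0, "energy": -0.5},
                         {"throughput": 0.7, "energy": 0.3})
```

The resulting `agv_vector` is what would be fed to the local policy network; the network itself is outside the scope of this sketch.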
[0012] Further, the step of resolving conflicts among the candidate actions according to a preset global policy network to determine the corresponding set of cooperative actions includes: Construct and pre-train a global policy network, which is a global reinforcement learning model. The state space of the global policy network is global state data, and the action space is the joint action of all agents. Spatiotemporal trajectory analysis is performed on each candidate action to construct a conflict graph. The conflict graph and the global state tensor are input into the global policy network. The conflict resolution value score of each candidate action combination is calculated according to a preset scoring function. The corresponding set of cooperative actions is determined based on maximizing the conflict resolution value score.
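The conflict graph and score maximization can be illustrated as follows. This sketch makes strong assumptions: each candidate action is reduced to the set of (x, y, time step) cells it occupies, two actions conflict when they share a cell, and an exhaustive search over conflict-free subsets with a summed value stands in for the global policy network and its scoring function. The names `build_conflict_graph` and `resolve` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Cell = Tuple[int, int, int]  # (x, y, time_step) occupied by a candidate action


def build_conflict_graph(trajectories: Dict[str, Set[Cell]]) -> Set[FrozenSet[str]]:
    """Add an edge between two candidate actions that occupy the same cell at the same time."""
    edges = set()
    for a, b in combinations(sorted(trajectories), 2):
        if trajectories[a] & trajectories[b]:
            edges.add(frozenset((a, b)))
    return edges


def resolve(trajectories: Dict[str, Set[Cell]], values: Dict[str, float]) -> List[str]:
    """Return the conflict-free subset of candidates with the highest total value."""
    edges = build_conflict_graph(trajectories)
    names = sorted(trajectories)
    best, best_score = [], float("-inf")
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if any(frozenset(pair) in edges for pair in combinations(subset, 2)):
                continue  # skip combinations containing a conflicting pair
            score = sum(values[name] for name in subset)
            if score > best_score:
                best, best_score = list(subset), score
    return best


candidates = {
    "agv_1_move": {(0, 0, 0), (1, 0, 1)},
    "agv_2_move": {(1, 0, 1), (2, 0, 2)},   # meets agv_1 at cell (1, 0) at t=1
    "shuttle_lift": {(5, 5, 0)},
}
cooperative = resolve(candidates, {"agv_1_move": 3.0, "agv_2_move": 2.0, "shuttle_lift": 1.0})
```

Exhaustive search grows exponentially with the number of candidates, so it only serves to make the selection criterion concrete; a learned policy or a heuristic would be needed at realistic scale.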
[0013] Secondly, this application provides an intelligent warehouse management device, comprising: The virtual data base determination module is used to collect multi-source heterogeneous data in the warehouse environment through the industrial Internet of Things bidirectional communication protocol stack, and perform semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping on the multi-source heterogeneous data to determine the corresponding virtual space data base. The global state determination module of the warehousing system is used to construct a corresponding digital twin model based on the virtual space data base, and embed a preset physical mechanism model and a set data-driven model into the digital twin model. It receives the real-time operating condition data stream of the warehousing system according to the digital twin model, and maps the real-time operating condition data stream into a global state tensor. The global state tensor is used to represent the current warehousing state and warehousing prediction information. The warehouse management action determination module is used to construct a federated learning framework based on multiple agents deployed on various device clusters. Each agent obtains local state information related to itself from the global state tensor and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate actions. According to the federated learning framework, each candidate action is uploaded to the central server. The preset global policy network is used to resolve conflicts among the candidate actions to determine the corresponding set of cooperative actions. The set of cooperative actions is input into the task planner for task decomposition, and the executable atomic task sequence obtained after decomposition is sent to each device cluster to execute the warehouse management task.
[0014] Thirdly, this application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the intelligent warehouse management method.
[0015] Fourthly, this application provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the steps of the intelligent warehouse management method described above.
[0016] Fifthly, this application provides a computer program product, including a computer program / instructions, which, when executed by a processor, implement the steps of the intelligent warehouse management method described above.
[0017] As can be seen from the above technical solution, this application provides an intelligent warehouse management method and device. It collects multi-source heterogeneous data through an industrial IoT protocol stack and constructs a unified virtual space data foundation using five types of mapping rules: semantic, structural, spatiotemporal, relational, and state-behavioral. Based on this data foundation, a digital twin model is constructed, and a physical mechanism model and a data-driven model are embedded into the digital twin model. This allows the model to map real-time operating condition data streams into a global state tensor, enabling accurate perception and prediction of warehouse status. A multi-agent collaborative framework based on federated learning is constructed. Agents in each device cluster obtain their respective local state information based on the global state tensor to generate candidate actions. After conflict resolution by the central server's global policy network, the task planner decomposes and distributes executable atomic task sequences, thereby improving warehouse management efficiency and operational support.
Attached Figure Description
[0018] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0019] Figure 1 is a flowchart illustrating the intelligent warehouse management method in the embodiments of this application; Figure 2 is a structural diagram of the intelligent warehouse management device in the embodiments of this application.
Detailed Implementation
[0020] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0021] The acquisition, storage, use, and processing of data in this application all comply with relevant laws and regulations.
[0022] Considering the lack of coordination mechanisms between intelligent warehouse management and control and transportation systems, resulting in data fragmentation and weak adaptive capabilities, this application provides an intelligent warehouse management method and device. It collects multi-source heterogeneous data through an industrial IoT protocol stack and constructs a unified virtual space data foundation using five types of mapping rules: semantic, structural, spatiotemporal, relational, and state-behavioral. Based on this data foundation, a digital twin model is built, embedding a physical mechanism model and a data-driven model into it. This allows the model to map real-time operating condition data streams into a global state tensor, enabling accurate perception and prediction of warehouse status. A multi-agent collaborative framework based on federated learning is constructed. Agents in each device cluster obtain their local state information based on the global state tensor to generate candidate actions. After conflict resolution via a central server's global policy network, the task planner decomposes and distributes executable atomic task sequences, thereby improving warehouse management efficiency and operational support.
[0023] To improve warehouse management efficiency and operational support, this application provides an embodiment of an intelligent warehouse management method. Referring to Figure 1, the intelligent warehouse management method specifically includes the following: Step S101: Collect multi-source heterogeneous data in the warehouse environment through the industrial Internet of Things bidirectional communication protocol stack, and perform semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping on the multi-source heterogeneous data to determine the corresponding virtual space data base. Optionally, to solve the problem of data silos, the unified mapping of multi-source data in this embodiment relies on the following four core mechanisms.
[0024] 1. A unified virtual space and a "single source of truth". Implementation: Digital twins create a high-fidelity virtual model that corresponds one-to-one to a physical entity (such as a factory, city, or product). This model then becomes a unified platform for integrating and displaying all relevant data.
[0025] Breaking down data silos: Regardless of the department (production, operations, supply chain) or system (MES, ERP, SCADA, IoT platform) from which the data originates, it ultimately converges and is mapped into this unified virtual model. All stakeholders view and operate on the same model and data source, avoiding disagreements arising from different data sources.
[0026] 2. Data fusion and contextual relationships. Implementation: Digital twins are not simply a collection of data. They use modeling and rules to link data from different isolated sources across time and space.
[0027] Spatial association: Associating equipment vibration data (from sensors) with the location of the equipment's 3D model.
[0028] Spatiotemporal correlation: Correlate production order data (from ERP) with energy consumption data (from IoT) at a specific time on a specific production line (from MES).
[0029] Breaking down data silos: This connection gives data new contextual meaning. For example, a single piece of equipment temperature data is just a numerical value, but when it is correlated with ambient temperature and humidity, equipment load, and maintenance records, it can be used for predictive maintenance.
[0030] 3. Standardized and mediated data interfaces. Implementation: Digital twin platforms typically act as data intermediaries and translators. They provide various adapters, APIs, and connectors to interface with different data sources, regardless of whether the protocol is OPC UA, MQTT, Modbus, or HTTP REST.
[0031] Breaking down silos: It doesn't require overturning existing IT / OT systems, but rather "overlaying" them, converting heterogeneous data into a unified format or data model that the platform can understand, thereby achieving interoperability.
[0032] 4. Business-oriented value services. Implementation: Based on the fused data, digital twins can be used to develop various advanced applications, such as simulation prediction, optimization control, and visual monitoring.
[0033] Breaking down data silos: The value generated by these applications, in turn, incentivizes departments to proactively share data. When production departments see that the predictive maintenance models from operations departments reduce downtime, they are more willing to provide production data to help optimize the models, thus creating a virtuous cycle of data sharing.
[0034] Optionally, in this embodiment, the core mapping rules for semantic mapping, structural mapping, spatiotemporal mapping, relational mapping, and state-behavior mapping are as follows: 1. Semantic mapping rules address the problem of inconsistent data meanings, that is, the same thing being named, defined, or measured in different units across different systems.
[0035] Rule definition: Establishing a unified meaning and context for data from different sources.
[0036] Unified Data Dictionary: Create a common, machine-readable vocabulary. For example, "Equipment ID" from the ERP system and "Asset Code" from the MES system may point to the same machine. Semantic mapping rules would define Equipment ID == Asset Code and uniformly name them UniqueEquipmentIdentifier.
[0037] Contextual association: Giving data a spatiotemporal context. For example, a temperature reading (90°C) is meaningless in itself. Semantic rules will map it to: "At time T, the outlet temperature of boiler A at location X is 90°C." This associates an isolated data point with a physical entity, time, and space.
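The two semantic rules above can be sketched as a lookup table plus a small wrapper. The field names "Equipment ID", "Asset Code" and UniqueEquipmentIdentifier come from the example in the text; the function names, record layout and sample values are hypothetical.

```python
from typing import Any, Dict, Tuple

# Unified data dictionary: (source system, source field) -> standard identifier.
DATA_DICTIONARY: Dict[Tuple[str, str], str] = {
    ("ERP", "Equipment ID"): "UniqueEquipmentIdentifier",
    ("MES", "Asset Code"): "UniqueEquipmentIdentifier",
}


def to_standard_fields(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename source-specific fields to their standard identifiers; keep the rest as is."""
    return {DATA_DICTIONARY.get((source, key), key): value
            for key, value in record.items()}


def with_context(value: float, unit: str, entity: str,
                 location: str, timestamp: str) -> Dict[str, Any]:
    """Attach the physical entity, location and time to an isolated reading."""
    return {"value": value, "unit": unit, "entity": entity,
            "location": location, "timestamp": timestamp}


erp_record = to_standard_fields("ERP", {"Equipment ID": "M-001", "Cost": 1200})
mes_record = to_standard_fields("MES", {"Asset Code": "M-001", "Line": "A"})
reading = with_context(90.0, "degC", "boiler_A_outlet", "location_X",
                       "2026-01-01T00:00:00+00:00")
```

After mapping, both records expose the machine under the same key, so they can be joined without system-specific logic.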
[0038] 2. Structure mapping rules solve the problem of "inconsistent data format and model" due to different database table structures and data storage formats.
[0039] Rule definition: Transform and map data with different structures (such as relational tables, time series data, documents, and images) into a unified data model of digital twins.
[0040] Data schema transformation: Define ETL or ELT rules to map rows, columns, JSON fields, etc., of the source data to specific attributes of the digital twin model. For example, map a row record in a database table to an attribute of a "device" object in the digital twin.
[0041] Multimodal data fusion: Defines how to associate unstructured data (such as CAD drawings and inspection videos) with structured data (such as sensor readings). For example, a rule could be to map the storage path of a device photo to the "appearance model" attribute of that device in a digital twin.
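A minimal sketch of the structure mapping rules follows, assuming a hypothetical `DeviceTwin` object as the unified data structure. It maps a relational row and a JSON document onto the same object and records the storage path of a photo in an "appearance model" attribute, as in the example above.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DeviceTwin:
    """Hypothetical unified 'device' object in the digital twin model."""
    device_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    appearance_model: Optional[str] = None  # storage path of unstructured data


def from_row(columns: List[str], row: List[Any]) -> DeviceTwin:
    """Map one relational table row to a device object (first column is the id)."""
    record = dict(zip(columns, row))
    device_id = str(record.pop(columns[0]))
    return DeviceTwin(device_id=device_id, attributes=record)


def merge_json(twin: DeviceTwin, document: str) -> None:
    """Map the top-level fields of a JSON document to device attributes."""
    twin.attributes.update(json.loads(document))


def attach_photo(twin: DeviceTwin, storage_path: str) -> None:
    """Associate an unstructured file with the device through its storage path."""
    twin.appearance_model = storage_path


twin = from_row(["device_id", "model", "rated_kw"], ["conv-01", "CX200", 7.5])
merge_json(twin, '{"temperature_c": 41.2, "status": "running"}')
attach_photo(twin, "/data/photos/conv-01.jpg")
```

The photo path and column names are illustrative only; a production mapping would be driven by declarative ETL or ELT rules rather than hard-coded functions.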
[0042] 3. Spatiotemporal mapping rules are specific to digital twins and distinguish them from traditional data platforms. They solve the problem of "data being disconnected from the physical world": without them, data carries no notion of "where" or "when" and cannot reproduce the real scene of the physical world.
[0043] Rule definition: Bind all data to its physical location and the time it occurs.
[0044] Spatial registration: Creating a precise geometric coordinate mapping for each physical entity in virtual space. For example, using GPS coordinates, indoor positioning beacons, or correspondences with 3D models, sensor data is precisely "attached" to the corresponding locations in a virtual factory.
[0045] Time synchronization: Establish a unified timeline for all time series data and handle differences in time zones and timestamps to ensure the correct causal relationships of events.
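The time synchronization rule can be illustrated with the standard `datetime` module. The sketch assumes every source emits ISO 8601 timestamps that carry an explicit UTC offset; the event labels and the function name `to_unified_timeline` are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Tuple


def to_unified_timeline(events: List[Tuple[str, str]]) -> List[Tuple[datetime, str]]:
    """Convert offset-aware ISO 8601 timestamps to UTC and order events by time."""
    unified = []
    for iso_timestamp, label in events:
        # Assumes the string includes an offset such as +08:00; naive strings
        # would be interpreted in the machine's local time zone by astimezone().
        source_time = datetime.fromisoformat(iso_timestamp)
        unified.append((source_time.astimezone(timezone.utc), label))
    return sorted(unified, key=lambda item: item[0])


events = [
    ("2026-01-01T00:00:05+00:00", "conveyor stopped"),       # logged in UTC
    ("2026-01-01T08:00:00+08:00", "motor current dropped"),  # logged in UTC+8
]
timeline = to_unified_timeline(events)
```

Although the second event has the later wall-clock reading, it occurred five seconds earlier once both are placed on the same axis, which is exactly the causal ordering the rule is meant to preserve.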
[0046] 4. Relationship and topology mapping rules address the problem of "missing data correlation," as isolated data cannot reflect system-level interactions and impacts. When data from one device is abnormal, other affected devices can be quickly located through relationship mapping.
[0047] Rule definition: Define and establish the logical relationships and connection methods between the physical entities represented by the data.
[0048] Define relationship types such as "belongs to", "connected to", "control", "upstream / downstream", etc.
[0049] Building a relationship graph: Constructing a knowledge graph in a digital twin to clearly show the network between entities.
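To show how relationship mapping helps locate affected devices, the sketch below stores relations as (subject, relation, object) triples and walks one relation type breadth-first. The entity names, the relation label "upstream_of" and the function `affected_entities` are hypothetical; a real deployment would likely use a graph database.

```python
from collections import deque
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def affected_entities(triples: List[Triple], start: str,
                      relation: str = "upstream_of") -> List[str]:
    """Follow one relation type breadth-first to list entities downstream of `start`."""
    neighbours: Dict[str, List[str]] = {}
    for subject, rel, obj in triples:
        if rel == relation:
            neighbours.setdefault(subject, []).append(obj)
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


graph = [
    ("plc_1", "controls", "conveyor_1"),
    ("conveyor_1", "upstream_of", "sorter_1"),
    ("sorter_1", "upstream_of", "agv_dock_1"),
]
impacted = affected_entities(graph, "conveyor_1")
```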
[0050] 5. State and behavior mapping rules solve the problem that "data is a static snapshot and cannot reflect dynamic processes." Data is "what," but digital twins can express "why" and "how" through this rule.
[0051] Rule definition: Mapping real-time data streams to the dynamic states and behaviors of entity objects in a digital twin.
[0052] State machine definition: Define possible states for an entity (such as "running", "stopping", "maintenance", "fault") and specify the data conditions that trigger state transitions. For example, a rule could be: "When vibration value > threshold and current == 0, the device state maps from 'running' to 'fault'."
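The state machine rule quoted above can be sketched as a list of guarded transitions. The threshold value, the data keys and the class names are assumptions made for illustration; only the "running" to "fault" rule itself comes from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

VIBRATION_THRESHOLD = 5.0  # hypothetical threshold, e.g. in mm/s


@dataclass
class Transition:
    source: str
    target: str
    condition: Callable[[Dict[str, float]], bool]


class EntityStateMachine:
    """Maps incoming real-time data to the state of one physical entity."""

    def __init__(self, initial: str, transitions: List[Transition]) -> None:
        self.state = initial
        self.transitions = transitions

    def update(self, data: Dict[str, float]) -> str:
        """Apply the first transition whose source state and data condition match."""
        for transition in self.transitions:
            if transition.source == self.state and transition.condition(data):
                self.state = transition.target
                break
        return self.state


rules = [
    Transition("running", "fault",
               lambda d: d["vibration"] > VIBRATION_THRESHOLD and d["current"] == 0),
]
machine = EntityStateMachine("running", rules)
first = machine.update({"vibration": 2.0, "current": 3.1})   # normal reading
second = machine.update({"vibration": 7.5, "current": 0.0})  # rule fires
```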
[0053] After the above five types of mapping processing, the originally scattered and unrelated raw data can be integrated into a virtual space data base with spatiotemporal semantics and correlations, serving as the core data source for the digital twin model and laying the data foundation for the subsequent construction and training of the digital twin model.
[0054] Step S102: Construct a corresponding digital twin model based on the virtual space data base, and embed the preset physical mechanism model and the set data-driven model into the digital twin model. Receive the real-time operating condition data stream of the warehousing system based on the digital twin model, and map the real-time operating condition data stream into a global state tensor. The global state tensor is used to characterize the current warehousing status and warehousing prediction information. Optionally, in this embodiment, the construction and training of the digital twin model can be decomposed into the following four key stages: Phase One: Model Building and Data Access Training. The goal of this phase is to create the "skeleton" and "nervous system" of the digital twin. This process involves abstracting the model ontology based on the virtual space data base.
[0055] 1. Ontology model training: Based on the data foundation, define the "vocabulary" and "grammar" of physical entities in this domain. That is, clarify what components an entity has (e.g., motor, valve, bearing), their attributes (e.g., speed, temperature, lifespan), and the relationships between them (e.g., motor A drives conveyor belt B).
[0056] Process: This is not an AI training process, but a knowledge engineering process. Building upon a virtual data foundation, it requires collaboration between domain experts (engineers, operations personnel) and data scientists to construct a unified data schema using ontology tools or knowledge graph technology. This schema serves as the "constitution" for all subsequent data integration.
[0057] Output: A standardized, machine-readable domain knowledge model (ontology / knowledge graph).
[0058] The process of ontology abstraction builds a standardized, machine-readable knowledge skeleton (ontology / knowledge graph) for the data foundation. It tells the system "what's in the repository and what their relationships are," serving as the logical basis for all subsequent mappings and simulations.
[0059] 2. Data connector training teaches the system how to automatically and securely extract data from various isolated systems (WMS, WCS, IoT).
[0060] Process: Protocol Adaptation: Develop or configure universal connectors for different types of databases (SQL, NoSQL), industrial protocols (OPC UA, Modbus), APIs, etc.
[0061] Data cleaning and formatting rule training: Using a rule engine or a simple machine learning model, identify and correct outliers and missing values in the data, and convert all data into a format defined by the ontology model.
[0062] Output: A "pipeline" that can automatically extract and initially clean data from data silos. It is the pipeline connecting the physical world (data silos) and the digital world (data foundation).
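The rule-engine variant of the cleaning step can be sketched as below. It assumes a single valid range per signal and repairs both missing values and out-of-range values by carrying the last valid reading forward; this repair policy and the name `clean_series` are illustrative choices, not the cleaning rules of this application.

```python
from typing import List, Optional


def clean_series(values: List[Optional[float]], low: float,
                 high: float) -> List[Optional[float]]:
    """Replace missing or out-of-range readings with the last valid reading."""
    cleaned: List[Optional[float]] = []
    last_valid: Optional[float] = None
    for value in values:
        if value is None or not (low <= value <= high):
            cleaned.append(last_valid)  # stays None until a valid value is seen
        else:
            last_valid = value
            cleaned.append(value)
    return cleaned


raw_temperatures = [20.0, None, 21.0, 999.0, 22.0]
temperatures = clean_series(raw_temperatures, low=-40.0, high=120.0)
```

Leading invalid readings remain `None`, so downstream steps can still tell that no trustworthy value was available at that point.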
[0063] It can be understood that Phase One not only builds the ontology model on the basis of historical data, but also uses the data connectors to teach the system how to extract and clean data from real-time updated sources, laying the foundation for a complete self-updating closed loop during the model training phase.
[0064] Phase Two: Data Fusion and Mapping Training – Breaking Down Data Silos and Linking Heterogeneous Data into a Unified Virtual Model. This phase builds upon Phase One by having the model receive data and update it in real-time based on rules. Data from different sources and formats is accurately "attached" to the correct positions and attributes of the virtual model according to the ontology model, resulting in a spatiotemporally synchronized digital twin.
[0065] 1. Entity alignment and association learning automatically identify data from different systems that refer to the same physical entity. For example, "Production Line A Station 3" in the MES system and "Temperature Sensor T-103" in the SCADA system are actually different attributes of the same device.
[0066] Process: This is a semi-supervised or unsupervised machine learning process.
[0067] The system utilizes the ontology model constructed in Phase 1 as prior knowledge.
[0068] By analyzing the metadata (such as device ID, location information, timestamp) and context of the data, potential relationships between data entities can be automatically discovered using graph algorithms or clustering algorithms.
[0069] Experts validate a subset of the association results, and the model is continuously optimized based on their feedback.
[0070] Output: A cross-system, unified entity relationship network (the knowledge graph becomes richer).
[0071] 2. Spatiotemporal data synchronization and mapping ensure that the spatiotemporal state of physical entities can be accurately mapped to the virtual model.
[0072] Process: For data streams containing time and location information (such as GPS coordinates and vibration sensor time-series data), the system needs to be trained to perform time series alignment and data fusion. For example, binding rotational speed, temperature, and pressure data at the same moment to the same component in the model.
[0073] Output: A time- and space-synchronized, dynamic digital twin.
[0074] Phase Three: Behavioral Model and Simulation Training. The digital twin evolves from a "static model" to a "dynamic model," gaining predictive and analytical capabilities. By introducing AI models in this phase, the digital twin can answer questions such as "What if...?" (simulation), "What will happen next?" (prediction), and "What's the best approach?" (optimization).
[0075] 1. Embedding physical mechanism models: Known physical laws (such as Newtonian mechanics, thermodynamics, and fluid mechanics) are embedded into digital twins to describe the intrinsic behavior of entities.
[0076] Process: This is traditional modeling and simulation, not machine learning training, but it is the foundation for high-fidelity digital twins.
[0077] 2. Data-driven model training: For complex behaviors that cannot be described by precise physical formulas (such as equipment degradation, nonlinear relationship between product quality and parameters), AI models are trained using fused data.
[0078] Anomaly detection model: Trained on normal historical data using unsupervised learning (such as isolation forest or autoencoder) to detect anomalous behavior in real time.
[0079] Predictive maintenance model: Using supervised learning (such as LSTM, XGBoost), a model is trained to predict the remaining useful life (RUL) of equipment, using historical operating data as features and failure occurrence as labels.
[0080] Optimization control model: Reinforcement learning is used to simulate various control strategies in the digital twin, find the optimal strategy that maximizes energy efficiency and output, and then apply it to the real entity.
[0081] Output: An "intelligent" digital twin with self-learning and self-predictive capabilities.
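To make the idea of training an anomaly detector only on normal data concrete, the following Python sketch uses a simple z-score rule. This is a deliberately simplified stand-in, not an isolation forest or autoencoder as named above. The class name, the threshold of 3 standard deviations, and the sample values are assumptions.

```python
from statistics import mean, stdev

class ZScoreDetector:
    """Learns mean and standard deviation from normal data only."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, normal_values):
        self.mu = mean(normal_values)
        self.sigma = stdev(normal_values) or 1e-9  # avoid division by zero
        return self

    def is_anomalous(self, value):
        # Flag values that deviate from the learned mean by more than
        # `threshold` standard deviations.
        return abs(value - self.mu) / self.sigma > self.threshold

# Hypothetical vibration amplitudes recorded during normal operation.
detector = ZScoreDetector().fit([2.0, 2.1, 1.9, 2.05, 1.95, 2.0])
flags = [detector.is_anomalous(v) for v in (2.02, 3.5)]
```

No fault labels are needed for fitting, which mirrors the unsupervised setting described above.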
[0082] Phase Four: Closed-loop optimization and autonomous decision-making training to achieve a closed loop from "perception-analysis" to "decision-execution".
[0083] 1. Decision-making logic: Train the system to automatically make the optimal decision in specific scenarios.
[0084] Based on the prediction model trained in Phase 3, and combined with business rules (such as cost and security constraints), a decision tree or policy network is constructed.
[0085] For example, when a predictive maintenance model warns that a bearing will fail in 72 hours, the decision logic will automatically generate a work order, reserve spare parts, and schedule a maintenance window, while adjusting the production plan to minimize the impact.
[0086] Output: Automated decision-making and execution processes.
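The bearing example above can be sketched as a small rule-based decision function. The function name, the action strings, and the `spare_in_stock` business rule are hypothetical and serve only to illustrate how a prediction can be combined with business rules. A trained policy network could take the place of the fixed rules.

```python
from dataclasses import dataclass, field

@dataclass
class MaintenanceDecision:
    actions: list = field(default_factory=list)

def decide(component, rul_hours, spare_in_stock, warn_hours=72):
    """Map a remaining-useful-life prediction to follow-up actions."""
    decision = MaintenanceDecision()
    if rul_hours > warn_hours:
        return decision  # no action needed yet
    decision.actions.append(f"create_work_order:{component}")
    # Assumed business rule: reserve a spare if available, otherwise order one.
    decision.actions.append(
        f"reserve_spare:{component}" if spare_in_stock
        else f"order_spare:{component}"
    )
    decision.actions.append(f"schedule_maintenance_window:{component}")
    decision.actions.append("adjust_production_plan")
    return decision

plan = decide("bearing_7", rul_hours=72, spare_in_stock=True)
```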
[0087] 2. Simulation verification and reinforcement learning: Before decisions are applied to physical entities, a large number of "sandbox simulations" are conducted in the digital twin to verify their safety and effectiveness.
[0088] Process: Using reinforcement learning, the AI agent learns the optimal strategy through repeated trial and error in a virtual environment. This process takes place entirely in the digital world, so it poses no risk to physical equipment and has a low cost.
[0089] Output: A highly reliable optimization strategy validated through extensive simulations.
[0090] It can be understood that the above process describes in detail the construction and training of the digital twin model. The digital twin model obtained through the above steps is connected in real time, through the Industrial Internet of Things interface, to the various working condition data streams in the warehouse, including equipment status, order pressure, inventory location, and environmental temperature and humidity. After receiving this real-time data, the digital twin model abstracts and maps it according to its trained ontology model, calls the embedded physical mechanism model to simulate and verify the current behavior of the equipment in real time, and calls the data-driven model to predict the short-term future state. The result is a quantitative output representing both the current state and the predicted future state of the warehouse system, namely the global state tensor.
[0091] In the global state tensor, the current warehouse state represents fine-grained information such as the location, load, and health of each piece of equipment, the occupancy rate of each storage location, and the congestion level of each path. The future prediction information includes forward-looking indicators such as order demand trends, potential equipment failure probabilities, and picking task congestion risks.
[0092] Step S103: Construct a federated learning framework based on multiple agents deployed on each device cluster. Each agent obtains local state information related to it from the global state tensor and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate actions. According to the federated learning framework, each candidate action is uploaded to the central server. According to the preset global policy network, conflict resolution is performed on each candidate action to determine the corresponding set of cooperative actions. The set of cooperative actions is input into the task planner for task decomposition, and the executable atomic task sequence obtained after decomposition is sent to each device cluster to execute the warehouse management task.
[0093] Optionally, in this embodiment, a federated learning framework and a reinforcement learning model are combined to predict warehouse management actions. The combination of the federated learning framework and the reinforcement learning model includes: Training phase: Each agent independently trains a local model on its own local data. The central server collects the model parameter updates (such as gradients or weights) uploaded by each agent, aggregates these parameters (federated averaging), and then distributes the aggregated parameters to each agent to form iterative training.
[0094] Execution phase: Each agent uses its locally trained model to make independent inferences.
[0095] Specifically, step S103 is the complete data processing flow of the execution phase.
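For reference, the parameter aggregation in the training phase described above (federated averaging) can be sketched as follows. The sketch assumes that each agent uploads a flat list of parameters together with its local sample count, and that the server weights each agent by that count. The sample-count weighting is a common convention and is an assumption here.

```python
def federated_average(updates):
    """Weighted average of parameter vectors; weights are local sample counts."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

# Hypothetical uploads: (local parameters, number of local samples).
updates = [([1.0, 0.0], 100), ([3.0, 2.0], 300), ([2.0, 4.0], 100)]
global_params = federated_average(updates)
```

The aggregated parameters would then be distributed back to every agent to start the next training round. Only parameters, not raw local data, leave the agents.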
[0096] First, each agent on each device cluster has a pre-trained local policy network. The local policy network is based on a reinforcement learning model. The state space of the local policy network defines the range of environmental information that the agent can perceive when making decisions. The action space defines the set of all atomic actions that the agent can execute. The reward function is the core signal that guides the agent to learn and optimize.
[0097] It can be understood that each agent's local policy network is different, since it is customized according to the agent's perceivable environmental range, set of atomic actions, and optimization signals. Pre-training of the local policy network is completed in the digital twin environment, using historical operating data or a simulation environment, so that the agent continuously adjusts its neural network parameters through trial and error until its decision-making behavior stably obtains high cumulative rewards and outputs high-quality candidate actions in real time.
[0098] During actual operation, each agent extracts information relevant to itself from the global state tensor. For example, an AGV responsible for material handling will subscribe to information such as "occupancy status of each aisle", "idle status of each charging pile", and "queue of tasks to be handled at each location"; while a sorting robot will subscribe to information such as "arrival time of goods on the conveyor belt", "barcode recognition result of goods", and "backlog level at each sorting exit".
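The subscription step above can be illustrated by slicing named fields out of a global state and scaling them into a feature vector. In the following Python sketch, the field names, the per-type subscription table, and the min-max bounds are hypothetical. The global state is represented as a plain dictionary rather than a real tensor.

```python
# Hypothetical subscription table: agent type -> fields of the global state.
SUBSCRIPTIONS = {
    "agv": ["aisle_occupancy", "charger_idle", "pending_tasks"],
    "sorter": ["conveyor_eta_s", "exit_backlog"],
}
# Hypothetical (min, max) bounds used for min-max normalization.
BOUNDS = {
    "aisle_occupancy": (0.0, 1.0), "charger_idle": (0.0, 1.0),
    "pending_tasks": (0.0, 50.0), "conveyor_eta_s": (0.0, 60.0),
    "exit_backlog": (0.0, 20.0),
}

def local_features(global_state, agent_type):
    """Slice the subscribed fields and scale each value into [0, 1]."""
    features = []
    for key in SUBSCRIPTIONS[agent_type]:
        lo, hi = BOUNDS[key]
        for value in global_state[key]:
            clipped = min(max(value, lo), hi)  # clamp out-of-range readings
            features.append((clipped - lo) / (hi - lo))
    return features

global_state = {
    "aisle_occupancy": [0.2, 0.9], "charger_idle": [1.0],
    "pending_tasks": [5.0, 25.0], "conveyor_eta_s": [12.0],
    "exit_backlog": [30.0],
}
agv_vector = local_features(global_state, "agv")
sorter_vector = local_features(global_state, "sorter")
```

Each agent type therefore sees only the part of the global state that is relevant to it, in a fixed order that its local policy network can rely on.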
[0099] After extracting the aforementioned local state information, each agent inputs it into its pre-trained local policy network, which quickly predicts one or more "candidate actions" based on the current local state. The candidate actions represent the optimal choice for each agent from the current perspective.
[0100] Next, all agents upload their generated candidate actions to the central server. The central server receives the action-level information and performs a global evaluation and conflict resolution on all received candidate actions.
[0101] Corresponding to the local policy network, the central server constructs and pre-trains the global policy network. The global policy network also employs a reinforcement learning model, but its state space is complete global state data—a digital description of the entire warehousing system—while its action space represents the joint actions of all agents. During training, the system teaches the global policy network how to evaluate the quality of a joint action, or directly learns the mapping from the global state to the optimal joint action. The training objective is to maximize the overall efficiency of the entire warehousing system while satisfying safety constraints.
[0102] Conflict resolution is required before the optimal set of cooperative actions can be obtained through global evaluation. For example, the candidate actions of two AGVs may simultaneously require occupying the same narrow aisle, or the candidate path of one AGV may overlap in time with the working area of a robotic arm. The global policy network comprehensively considers multiple objectives such as overall efficiency, safety constraints, and energy consumption optimization, and coordinates, prioritizes, or modifies these candidate actions to ultimately generate a conflict-free, globally optimal set of cooperative actions. This set of cooperative actions is used for warehouse management.
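The kind of conflict described above can be illustrated with a greedy, priority-based reservation of shared resources over discrete time slots. This Python sketch is a simplified stand-in for the learned global policy network. The action fields, the integer time slots, the priorities, and the fallback `wait` action are assumptions.

```python
def resolve_conflicts(candidates):
    """Grant each (resource, time slot) to the highest-priority requester.

    Losing candidates are replaced by a 'wait' action for that agent.
    """
    granted, reserved = [], set()
    for action in sorted(candidates, key=lambda a: -a["priority"]):
        slots = {(action["resource"], t)
                 for t in range(action["start"], action["end"])}
        if slots & reserved:
            granted.append({"agent": action["agent"], "action": "wait"})
        else:
            reserved |= slots
            granted.append({"agent": action["agent"],
                            "action": action["action"]})
    return granted

# Hypothetical candidate actions uploaded by three agents.
candidates = [
    {"agent": "agv_1", "action": "enter_aisle_3", "resource": "aisle_3",
     "start": 0, "end": 4, "priority": 1},
    {"agent": "agv_2", "action": "enter_aisle_3", "resource": "aisle_3",
     "start": 2, "end": 5, "priority": 5},
    {"agent": "arm_1", "action": "pick_bin", "resource": "cell_7",
     "start": 0, "end": 3, "priority": 2},
]
cooperative_actions = resolve_conflicts(candidates)
```

Here the two AGVs request the same aisle in overlapping time slots, so the lower-priority AGV is told to wait, while the robotic arm is unaffected because it uses a different resource.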
[0103] After the set of cooperative actions is obtained, it is input into the task planner, which further decomposes high-level, complex cooperative actions into "atomic task sequences" that can be directly executed by the devices. Finally, the system sends the decomposed executable atomic task sequences to each device cluster via the Industrial Internet of Things (IIoT) protocol. Each device executes the instructions in its sequence in order, completing a series of warehouse management tasks from goods handling, sorting, and shelving to outbound delivery.
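Task decomposition of this kind can be sketched with fixed templates that expand a high-level action into atomic tasks. The template table, action types, and device names below are hypothetical. An actual task planner may use more elaborate planning techniques.

```python
# Hypothetical decomposition templates: high-level action -> atomic tasks.
TEMPLATES = {
    "transport": ["move_to:{src}", "load", "move_to:{dst}", "unload"],
    "sort": ["scan_barcode", "divert_to:{dst}"],
}

def decompose(cooperative_actions):
    """Expand each cooperative action into a per-device atomic task sequence."""
    plan = {}
    for action in cooperative_actions:
        steps = [s.format(**action["params"])
                 for s in TEMPLATES[action["type"]]]
        plan.setdefault(action["device"], []).extend(steps)
    return plan

plan = decompose([
    {"device": "agv_1", "type": "transport",
     "params": {"src": "dock_2", "dst": "station_A"}},
    {"device": "sorter_1", "type": "sort", "params": {"dst": "exit_4"}},
])
```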
[0104] Preferably, this embodiment designs a cross-brand device abstraction layer (DAL) to convert heterogeneous device instructions into a unified control protocol. The generated execution instructions are sent to the device abstraction layer, which stores the brand, model, communication protocol, and capability model of each device. Based on the target device's metadata, the device abstraction layer converts the unified format control instructions into proprietary instructions that can be recognized by the device's private protocol or SDK, and sends them to the physical device for execution through the corresponding protocol adapter.
[0105] Specifically, in warehousing and logistics, AGVs (Automated Guided Vehicles), robotic arms, and sorting machines from different manufacturers need to work together. After the scheduling system sends the task instruction "transport goods to Picking Station A" to the DAL, the DAL is responsible for converting this task into calls to the proprietary SDK or API of each AGV brand (such as Geek+ or Quicktron).
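The device abstraction layer can be illustrated with an adapter pattern, in which one unified command is translated by a per-device adapter. The two vendors and their message formats below are entirely hypothetical and do not correspond to the SDK or API of any real AGV brand.

```python
from abc import ABC, abstractmethod

class ProtocolAdapter(ABC):
    @abstractmethod
    def translate(self, command: dict) -> str:
        """Convert a unified command into a vendor-specific instruction."""

class VendorAJsonAdapter(ProtocolAdapter):  # hypothetical vendor format
    def translate(self, command):
        return '{"op": "GOTO", "target": "%s"}' % command["destination"]

class VendorBTextAdapter(ProtocolAdapter):  # hypothetical vendor format
    def translate(self, command):
        return f"MOVE {command['destination']}"

class DeviceAbstractionLayer:
    def __init__(self):
        self._registry = {}  # device_id -> adapter chosen from device metadata

    def register(self, device_id, adapter):
        self._registry[device_id] = adapter

    def dispatch(self, device_id, command):
        # Look up the adapter for the target device and translate the command.
        return self._registry[device_id].translate(command)

dal = DeviceAbstractionLayer()
dal.register("agv_01", VendorAJsonAdapter())
dal.register("agv_02", VendorBTextAdapter())
unified = {"task": "transport", "destination": "picking_station_A"}
messages = [dal.dispatch(d, unified) for d in ("agv_01", "agv_02")]
```

With this structure the scheduling logic issues only unified commands, and supporting a new brand means adding one adapter class.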
[0106] After the physical device executes the command, its state changes are reverse-normalized into unified state data through the device abstraction layer and sent back to the digital twin model. The digital twin model updates the virtual state based on the feedback data and triggers the AI optimization model (prediction, allocation, energy consumption, etc.) to perform real-time or near-real-time re-optimization.
[0107] This example demonstrates how this embodiment achieves a closed loop of "perception-decision-execution" in warehousing through digital twins, breaking down information silos and solving the problem of heterogeneous equipment based on multi-agent collaboration using federated learning. This enables the warehousing system to respond to order changes in real time, dynamically plan equipment paths and operation sequences, improve throughput while reducing collision risks and energy consumption, and achieve autonomous collaborative operation of multiple devices and multiple tasks.
[0108] As described above, the intelligent warehouse management method provided in this application can collect multi-source heterogeneous data through the industrial Internet of Things protocol stack, and construct a unified virtual space data foundation using five types of mapping rules: semantic, structural, spatiotemporal, relational, and state behavior. Based on this data foundation, a digital twin model is constructed, and the physical mechanism model and data-driven model are embedded into the digital twin model so that the model can map real-time operating condition data streams into a global state tensor, thereby achieving accurate perception and prediction of warehouse status. A multi-agent collaborative framework based on federated learning is constructed, where agents in each device cluster obtain their respective local state information based on the global state tensor to generate candidate actions. After conflict resolution by the central server's global policy network, the task planner decomposes and distributes executable atomic task sequences, thereby improving warehouse management efficiency and operational support.
[0109] In one embodiment of the intelligent warehouse management method of this application, it may further include the following: Step S201: The semantic mapping includes establishing a unified data dictionary, mapping data entities with the same meaning but different names or definitions from different source systems to the same standard identifier, and associating isolated data with physical entities, time tags, and spatial locations in context. Step S202: The structure mapping includes mapping relational tables, time-series data, JSON documents and unstructured data to the unified data structure of the digital twin model through predefined ETL or ELT transformation rules, and associating the storage path of unstructured data with the attribute fields of the corresponding physical entities.
[0110] Optionally, in this embodiment, the core task of semantic mapping is to resolve the problem of "inconsistent data meaning" between different systems. In a warehousing environment, an ERP system might record the same device as "Asset Number A001," while a WCS system might call it "Device ID: conveyor_01." The names differ, but they point to the same physical entity. This step first establishes a unified data dictionary to define standardized identifiers and naming conventions for all key entities in the warehousing environment (such as equipment, storage locations, materials, orders, etc.). The system automatically scans the metadata of each source system, identifies fields with the same meaning but different expressions, and maps them uniformly to the same standard identifier, thereby eliminating semantic ambiguity.
[0111] Building upon this, semantic mapping further endows each isolated data point with complete contextual information. Raw sensor data is typically just a numerical value, such as "25.6". After semantic mapping, this value is associated with a specific physical entity (e.g., "temperature and humidity sensor in lane three"), given a precise time stamp (e.g., "2025-03-20 14:32:05.123"), and bound to a spatial location (e.g., "shelf area A12 with coordinates (125.3, 48.7, 2.0)"). Through this association, the originally isolated numbers become information units with complete spatiotemporal semantics.
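A minimal sketch of the semantic mapping described above is given below. The standard identifier `EQ-CONVEYOR-01`, the context registry, and the use of UTC for the time stamp are assumptions introduced for illustration.

```python
from datetime import datetime, timezone

# Hypothetical data dictionary: (source system, local identifier) -> standard ID.
DATA_DICTIONARY = {
    ("ERP", "A001"): "EQ-CONVEYOR-01",
    ("WCS", "conveyor_01"): "EQ-CONVEYOR-01",
}
# Hypothetical context registry keyed by standard ID.
CONTEXT = {
    "EQ-CONVEYOR-01": {"entity": "conveyor 01", "position": (125.3, 48.7, 2.0)},
}

def to_semantic_unit(system, local_id, value, timestamp):
    """Attach standard identifier, entity, time, and location to a raw value."""
    standard_id = DATA_DICTIONARY[(system, local_id)]
    return {"id": standard_id, "value": value,
            "time": timestamp.isoformat(), **CONTEXT[standard_id]}

ts = datetime(2025, 3, 20, 14, 32, 5, 123000, tzinfo=timezone.utc)
unit_a = to_semantic_unit("ERP", "A001", 25.6, ts)
unit_b = to_semantic_unit("WCS", "conveyor_01", 25.6, ts)
```

Both source records resolve to the same standard identifier, so the same raw value yields identical information units regardless of which system reported it.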
[0112] Optionally, in this embodiment, structure mapping aims to address the problem of "inconsistent data formats and models" between different systems. Data sources in warehousing systems are diverse and vary in format: relational databases store order and inventory records in two-dimensional tables, time-series databases save equipment sensor data in a time-series format, equipment logs are often output in JSON document format, while inspection images, CAD drawings, etc., are unstructured data. These data with different structures cannot be directly integrated into a unified digital twin model.
[0113] The system performs unified format conversions on various types of data using predefined ETL or ELT transformation rules. For data in relational tables, the system maps each row of records to an attribute field of an object in the digital twin model. For time-series data, the system aligns it along a timeline and binds it to the dynamic attributes of the corresponding device. For JSON documents, the system parses their key-value pair structure, extracts key information, and populates it into the specified fields of the model. Specifically, for unstructured data such as images, videos, or PDF drawings, the system does not attempt to directly convert their content; instead, it uses their storage path as an attribute value, associating it with the attribute fields of the corresponding physical entity. For example, if a maintenance photo of an AGV is stored at a specified path on a file server, that path will be written into the "Maintenance Record Attachment" field of that AGV device in the digital twin model.
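The conversion rules above can be sketched with three small helper functions, one per data type. The column names, JSON keys, field name, and file path are hypothetical. Time-series binding is omitted from this sketch.

```python
import json

def from_row(columns, row):
    """Relational row -> attribute fields of one twin object."""
    return dict(zip(columns, row))

def from_json(document, wanted):
    """JSON log -> selected key-value pairs."""
    parsed = json.loads(document)
    return {key: parsed[key] for key in wanted if key in parsed}

def attach_file(entity, field_name, path):
    """Unstructured file -> store only its path on the entity."""
    entity.setdefault(field_name, []).append(path)
    return entity

agv = from_row(["device_id", "model", "zone"], ("agv_01", "X200", "A12"))
agv.update(from_json('{"battery": 87, "fw": "1.4.2", "debug": "..."}',
                     ["battery", "fw"]))
attach_file(agv, "maintenance_record_attachment",
            "/files/agv_01/photo_001.jpg")  # hypothetical storage path
```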
[0114] Through step S202, this embodiment successfully uses semantic mapping rules and structural mapping rules to enable the processed data to have a traceable source and the same structure, thus forming data assets.
[0115] In one embodiment of the intelligent warehouse management method of this application, it may further include the following: Step S301: The spatiotemporal mapping includes establishing a precise geometric coordinate mapping for each physical entity in the virtual space through GPS coordinates, indoor positioning beacons or 3D model registration, and establishing a unified time axis for all time series data, and synchronously processing timestamp differences from different sources. Step S302: The relationship mapping includes constructing a knowledge graph between entities and defining the logical relationships and connection methods between physical entities; Step S303: The state behavior mapping includes defining a state machine for each physical entity and setting real-time data conditions to trigger state transitions for the state machine.
[0116] Optionally, in this embodiment, the core task of spatiotemporal mapping is to accurately copy the spatial location and temporal attributes of each entity in the physical world into the virtual space, so that the digital twin model has the ability to recreate the real spatiotemporal scene.
[0117] In the spatial dimension, the system establishes geometric coordinate mappings for each physical entity through various positioning technologies. For outdoor or large-scale scenarios, GPS coordinates are used to pinpoint the location of mobile devices such as forklifts and transport vehicles on a virtual map. For indoor warehousing environments, indoor positioning beacons such as UWB, Bluetooth, or Wi-Fi are used to determine the real-time location of entities such as AGVs, bins, and personnel with sub-meter accuracy. For fixed equipment such as shelves, conveyor lines, and robotic arms, 3D model registration is used to precisely align their CAD models to the coordinate system of the virtual warehouse, ensuring that each entity has unique spatial coordinates.
[0118] In terms of time, the system establishes a unified timeline for all time-series data. Due to differences in data acquisition frequencies and clock sources among different devices—for example, AGV positioning data is reported every 50 milliseconds, while temperature and humidity sensors collect data every 5 seconds—and the potential millisecond-level time discrepancies between different device systems, the system aligns the timestamps of all data to a unified benchmark using the NTP time synchronization protocol or a custom timestamp calibration mechanism. This ensures the correct reconstruction of the chronological order and causal relationships of events.
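The effect of timestamp calibration on event ordering can be shown with a short example. The sketch assumes that the clock offset of each device relative to the reference clock is already known (for example, from NTP or a custom calibration step). The offsets and events are hypothetical.

```python
import heapq

def calibrate(events, offset_ms):
    """Shift a device's local timestamps onto the reference clock."""
    return [(t + offset_ms, source, payload) for t, source, payload in events]

# Hypothetical offsets (ms) between each device clock and the reference clock.
agv = calibrate([(1000, "agv", "pos=1"), (1050, "agv", "pos=2")],
                offset_ms=-20)
sensor = calibrate([(1005, "th_sensor", "25.6C")], offset_ms=30)

# Both streams are already sorted, so a streaming merge restores global order.
timeline = list(heapq.merge(agv, sensor))
```

Before calibration the sensor reading appears to fall between the two AGV positions. After calibration it is placed after both, which illustrates why uncorrected clocks can distort the apparent order of events.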
[0119] Through spatiotemporal mapping, previously isolated data points have acquired clear spatiotemporal labels, providing a foundation for subsequent applications such as trajectory playback, event tracing, and spatial analysis.
[0120] Optionally, in this embodiment, the task of relation mapping is to construct a logical relation network between physical entities in virtual space, so that data is transformed from discrete points into an interconnected knowledge system.
[0121] By constructing a knowledge graph, various logical relationships between entities are defined and stored. Relationship types include, but are not limited to: spatial affiliation relationships, such as "the pallet is located in the 3rd row of the aisle" and "the AGV is assigned to charging pile No. 1 in the charging area"; connection relationships, such as "conveyor belt A connects to sorting port B" and "the elevator is downstream of the inlet"; control relationships, such as "the PLC controller manages conveyor segments 1 to 5" and "the scheduling system issues instructions to the AGV queue"; and upstream and downstream process relationships, such as "the outbound temporary storage area flows to the packaging station".
[0122] Relationships are stored and managed using graph databases or semantic networks, forming a relational topology graph covering the entire warehousing system. When a device malfunctions, the system can quickly locate the affected upstream and downstream devices through relational mapping. For example, when a sorting machine malfunction is triggered, the system can immediately identify which conveyor belts will become blocked and which AGVs need to be rerouted, thus providing structured information for subsequent collaborative decision-making.
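Locating the equipment affected by a fault amounts to a graph traversal. The following Python sketch uses a breadth-first search over hypothetical "flows to" edges. A graph database query would play the same role in a deployed system.

```python
from collections import deque

# Hypothetical "flows_to" edges of the relational topology graph.
FLOWS_TO = {
    "conveyor_A": ["sorter_B"],
    "conveyor_C": ["sorter_B"],
    "inlet_1": ["conveyor_A"],
    "sorter_B": ["packing_1"],
}

def upstream_of(failed, edges):
    """Return every node whose output eventually reaches the failed node."""
    reverse = {}
    for src, dsts in edges.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    affected, queue = set(), deque([failed])
    while queue:
        node = queue.popleft()
        for parent in reverse.get(node, []):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return sorted(affected)

blocked = upstream_of("sorter_B", FLOWS_TO)
```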
[0123] Optionally, in this embodiment, the core of state-behavior mapping is to dynamically transform the real-time data stream of an entity into its behavioral state in virtual space, so that the digital twin can reflect the operating conditions of the physical entity in real time.
[0124] Define a finite state machine for each physical entity. Taking an AGV as an example, its state machine may include states such as "idle," "driving," "charging," "loading," "faulty," and "offline." Each state has a clear semantic meaning and corresponding behavioral characteristics.
[0125] Subsequently, the system sets up mapping logic from real-time data to state transitions. This logic typically exists in the form of threshold conditions or rule expressions. For example, the following mapping rules are defined for AGVs: when the positioning data shows continuous movement and the speed is greater than zero, the state changes to "driving"; when the current sensor reading is zero and the charging interface voltage is positive, the state changes to "charging"; when the vibration sensor amplitude exceeds a preset threshold and the motor speed is zero, the state automatically changes from "driving" to "faulty"; when no heartbeat signal is received for 30 consecutive seconds, the state changes to "offline".
[0126] The mapping logic runs continuously on the data processing pipeline. Each time a new batch of real-time data is received, the system reassesses the current state of each entity and triggers the corresponding state transition. State change events are recorded in the time-series database and simultaneously pushed to the upper-level monitoring and decision-making modules via a message mechanism.
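The AGV rules above can be written as a small transition function. In this sketch the field names, the vibration threshold of 5.0, and the order in which the rules are evaluated are assumptions. The "continuously moving" condition is simplified to a positive speed.

```python
def next_state(state, reading, vibration_limit=5.0, heartbeat_timeout_s=30):
    """Apply the rules in an assumed priority order and return the new state."""
    if reading["seconds_since_heartbeat"] >= heartbeat_timeout_s:
        return "offline"
    if (state == "driving" and reading["vibration"] > vibration_limit
            and reading["motor_rpm"] == 0):
        return "faulty"
    if reading["charge_port_voltage"] > 0 and reading["current"] == 0:
        return "charging"
    if reading["speed"] > 0:
        return "driving"
    return state  # no rule fired: keep the current state

# Hypothetical readings for one AGV.
base = {"seconds_since_heartbeat": 1, "vibration": 0.5, "motor_rpm": 900,
        "charge_port_voltage": 0, "current": 3.2, "speed": 1.2}
s1 = next_state("idle", base)
s2 = next_state(s1, {**base, "vibration": 9.0, "motor_rpm": 0, "speed": 0})
s3 = next_state(s2, {**base, "seconds_since_heartbeat": 45})
```

The function is called once per batch of readings, and a change in the returned value corresponds to a state change event.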
[0127] Through state-behavior mapping, the digital twin model is no longer a static three-dimensional geometric model, but a dynamic mirror that can reflect the health status, working stage, and abnormal conditions of the equipment in real time.
[0128] Through step S303, this embodiment successfully uses spatiotemporal mapping, relational mapping, and state-behavior mapping to enable the processed virtual space data base to serve as the core data source for the digital twin model.
[0129] In one embodiment of the intelligent warehouse management method of this application, it may further include the following: Step S401: Based on the virtual space data base, construct a virtual model of the physical storage space, equipment entities, and work processes through a 3D modeling engine, and construct a mapping channel to synchronously map the real-time data of the physical entities to the entity attribute fields corresponding to the virtual model, thereby determining the corresponding digital twin model. Step S402: Obtain a preset physical mechanism model and embed the physical mechanism model into the digital twin model, wherein the physical mechanism model includes Newtonian mechanics, thermodynamics and fluid dynamics equations, used to describe the physical behavior of equipment movement, energy consumption generation and cargo stress. Step S403: Train a data-driven model based on historical fusion data, and embed the data-driven model into the digital twin model. The data-driven model includes an anomaly detection model trained based on unsupervised learning, a predictive maintenance model trained based on supervised learning, and an optimization control model trained based on reinforcement learning.
[0130] Optionally, in this embodiment, a 3D modeling engine is first used to build a virtual model that is highly consistent with the physical warehousing environment based on the geometric dimensions, spatial layout, equipment parameters, and other information stored in the virtual space data base. After modeling is completed, the system constructs a bidirectional mapping channel to synchronously write real-time data collected by sensors, encoders, RFID, and other devices deployed in the physical world into the corresponding entity attribute fields in the virtual model via an industrial IoT protocol. For example, the real-time coordinates of the AGV are synchronized to the position attribute of the virtual AGV, the real-time current of the motor is synchronized to the corresponding current attribute field, and the occupancy status of the storage location sensor is synchronized to the occupancy indicator of the shelf model.
[0131] Optionally, in this embodiment, in order to enable the static 3D model to reflect dynamic information in real time, an operable virtual carrier is embedded in the virtual model.
[0132] The pre-defined physical mechanism model is embedded into the constructed digital twin model. The physical mechanism model is built based on classical physics equations, including: Newton's equations of motion, used to describe the acceleration changes, force balance, and collision response during equipment start-up and shutdown; thermodynamic equations, used to calculate the temperature rise effect caused by mechanical friction and motor operation and its impact on efficiency; and fluid dynamics equations, used to simulate the impact of airflow organization on the temperature distribution of cold chain warehouses or the pressure transmission of hydraulic equipment.
[0133] Once embedded, the digital twin model gains the ability to extrapolate equipment behavior based on physical laws. For example, when the system plans for an AGV to turn at a certain speed, the physical mechanism model can calculate in real time the energy consumption required for the action, whether the lateral acceleration exceeds the limit, and the instantaneous wear increment of the tires.
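The lateral acceleration check mentioned above follows from the standard relation a = v²/r for motion along a circular arc at constant speed. The following Python sketch applies it. The speed, turn radius, and acceleration limit are hypothetical values, and energy and tire wear are not modeled.

```python
def lateral_acceleration(speed_mps, turn_radius_m):
    """Centripetal acceleration a = v^2 / r for a turn at constant speed."""
    return speed_mps ** 2 / turn_radius_m

def turn_is_safe(speed_mps, turn_radius_m, limit_mps2):
    """True if the turn stays within the lateral acceleration limit."""
    return lateral_acceleration(speed_mps, turn_radius_m) <= limit_mps2

def max_safe_speed(turn_radius_m, limit_mps2):
    """Largest speed that keeps v^2 / r within the limit."""
    return (limit_mps2 * turn_radius_m) ** 0.5

a = lateral_acceleration(2.0, 1.6)
ok = turn_is_safe(2.0, 1.6, limit_mps2=2.0)
v_max = max_safe_speed(1.6, limit_mps2=2.0)
```

A planner could use `max_safe_speed` to slow the AGV before the turn instead of rejecting the planned action outright.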
[0134] Three types of data-driven models are trained using historical fusion data and embedded into the digital twin model. The first type is an unsupervised learning-based anomaly detection model, trained on normal historical data and used to identify signals that deviate from normal patterns in real-time operating conditions, detecting early signs of equipment failure without requiring labels. The second type is a supervised learning-based predictive maintenance model, trained using historical operating data as features and fault occurrence records as labels, capable of predicting the remaining service life of the equipment and the probability of failure. The third type is a reinforcement learning-based optimization control model, which autonomously explores the most energy-efficient and effective control strategies through trial-and-error learning in the digital twin environment.
[0135] With these three types of models embedded, the digital twin model gains intelligent capabilities of self-sensing, self-prediction, and self-optimization. The anomaly detection model monitors the health status of equipment in real time, the predictive maintenance model provides early warnings of potential faults, and the optimization control model dynamically adjusts operating parameters to balance efficiency and energy consumption.
[0136] Through step S403, this embodiment successfully upgrades the digital twin model into an intelligent decision-making carrier through behavioral model simulation training. Combined with physical constraints, the data-driven model provides predictive capabilities based on the statistical laws of real data, enabling the digital twin model to make a trade-off between physical feasibility and statistical optimality.
[0137] In one embodiment of the intelligent warehouse management method of this application, it may further include the following: Step S501: Receive the real-time operating condition data stream of the warehousing system according to the digital twin model, standardize the real-time operating condition data stream based on the construction rules of the virtual space data base, determine the corresponding standardized data, associate and fuse the standardized data according to its corresponding physical entity spatial location and timestamp, and determine the corresponding high-dimensional state tensor. The high-dimensional state tensor represents the current global status of the warehousing system. Step S502: Call the physical mechanism model to perform deterministic evolution calculation on the standardized data, and call the data-driven model to perform probabilistic trend prediction on the standardized data. The results of evolution calculation and trend prediction are added to the high-dimensional state tensor as an extended dimension to determine the corresponding global state tensor.
[0138] Optionally, in this embodiment, this step is a data processing flow that transforms the original real-time data into a global state tensor after the digital twin model is constructed.
[0139] First, the digital twin model continuously receives various types of real-time operational data streams from the warehouse site, including equipment operating parameters, order task status, and sensor readings. The system then standardizes this raw data according to the construction rules defined by the previously established virtual space data foundation.
[0140] Preferably, once the ontology model is successfully constructed, standardized mapping can also be performed based on the ontology model. Standardization includes operations such as unifying data formats, converting units of measurement, and standardizing naming identifiers, transforming heterogeneous data from different vendors, protocols, and structures into a consistent standard expression form within the system, and determining the corresponding standardized data.
[0141] Secondly, the system associates each piece of standardized data with the corresponding physical entity's spatial location and timestamp, aggregating scattered perceptual information into a unified overall expression, and ultimately determining a high-dimensional state tensor. This tensor is a quantitative representation of the current global status of the warehousing system, encompassing fine-grained information such as the location, status, and load of each piece of equipment, the accessibility of each path, and the occupancy status of each storage location.
[0142] After obtaining the high-dimensional state tensor representing the current situation, the process is as follows: First, a pre-defined physical mechanism model is invoked to perform deterministic evolutionary calculations on the standardized data based on the laws of physics. For example, based on parameters such as the AGV's current speed, acceleration, and route gradient, its precise position sequence within the next few seconds is calculated. Second, a pre-trained data-driven model is invoked to perform probabilistic trend predictions on the standardized data based on patterns learned from historical data. For example, the probability of a device malfunctioning within the next hour or the likelihood of congestion on a picking path within the next five minutes is predicted. The results of the above evolutionary calculations and trend predictions are then added as new extended dimensions to the original high-dimensional state tensor, ultimately determining the corresponding global state tensor.
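The extension of the state tensor described above can be sketched with nested lists. In this Python example each device row holds position, speed, acceleration, and a health score. The deterministic part uses constant-acceleration kinematics, and the probabilistic part is a placeholder function standing in for a trained data-driven model. The row layout, the 2-second horizon, and the placeholder are all assumptions.

```python
def predict_position(x, v, a, horizon_s):
    """Deterministic evolution: constant-acceleration kinematics."""
    return x + v * horizon_s + 0.5 * a * horizon_s ** 2

def failure_probability(health):
    """Placeholder for a trained data-driven model (assumption)."""
    return round(1.0 - health, 3)

def extend_state(current, horizon_s=2.0):
    """Append predicted position and failure probability to each device row.

    Input row:  [x, v, a, health]
    Output row: [x, v, a, health, x_predicted, p_failure]
    """
    return [row + [predict_position(row[0], row[1], row[2], horizon_s),
                   failure_probability(row[3])]
            for row in current]

current_state = [[10.0, 1.5, 0.0, 0.9], [4.0, 0.0, 0.5, 0.6]]
global_state = extend_state(current_state)
```

The original columns are left unchanged, so agents that only need the current state can ignore the appended prediction columns.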
[0143] Through step S502, this embodiment determines the global state tensor, which gives subsequent multi-agent collaborative decision-making an input that combines "current cognition" with "future prediction".
[0144] In one embodiment of the intelligent warehouse management method of this application, it may further include the following:
Step S601: Construct and pre-train a local policy network, which is a local reinforcement learning model, including a state space, an action space, and a reward function. The state space includes local state data, the action space defines the set of atomic actions that each agent can execute, and the reward function is defined as a multi-objective weighted sum.
Step S602: Each agent subscribes to local state information related to itself from the global state tensor according to its function type, and performs dimensional indexing and normalization on the local state information according to the preset slicing rules to determine the corresponding local state feature vector.
Step S603: Input the local state feature vector into the local policy network to predict actions and determine the corresponding candidate actions.
[0145] Optionally, in this embodiment, the local policy network is a local reinforcement learning model.
[0146] The model comprises three core elements: state space, action space, and reward function.
[0147] The state space defines the types of local states that an agent focuses on when making decisions. For example, an AGV agent may focus on its own position, battery level, and the degree of congestion on the path ahead. The action space defines all the smallest operable units that the agent can execute, i.e., the set of atomic actions, such as "move forward one meter", "turn left 90 degrees", "stop" or "grab goods". The reward function serves as the training guide signal and is defined as a multi-objective weighted sum. This means it simultaneously considers multiple optimization objectives—such as picking efficiency, energy consumption, and equipment lifespan degradation—and assigns a weight to each objective, ultimately summing them to obtain a comprehensive reward value. By maximizing this cumulative reward, the agent learns during the pre-training phase how to make decisions that balance multiple objectives in a local environment.
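The multi-objective weighted sum can be illustrated with the following minimal sketch. The objective names, the weights, the sign convention (costs passed as negative values), and the discount factor used for the cumulative reward are assumptions of this example; the application does not fix any of them.

```python
from typing import Dict, List


def weighted_reward(objectives: Dict[str, float], weights: Dict[str, float]) -> float:
    """Multi-objective weighted sum; costs are passed as negative values."""
    return sum(weights[name] * value for name, value in objectives.items())


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Cumulative reward over an episode, discounted by an assumed factor gamma."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


step_reward = weighted_reward(
    {"picking_efficiency": 1.0, "energy_consumption": -0.5, "lifespan_degradation": -0.2},
    {"picking_efficiency": 0.5, "energy_consumption": 0.3, "lifespan_degradation": 0.2},
)
episode_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

Changing the weights shifts the balance the agent learns, for example toward lower energy consumption at the expense of picking speed.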
[0148] Optionally, in this embodiment, each agent actively subscribes to relevant local state information from the global state tensor based on its own function type. The global state tensor contains a complete state description of the entire warehousing system, but a single agent only cares about a small part of it. For example, a robotic arm agent only needs to know the location and weight of goods within its working area, and does not need to know the path planning of the remote AGV. After subscription, the agent processes the extracted data according to preset slicing rules: first, it performs dimensional indexing, that is, accurately extracts the required data fields from the multidimensional tensor; then, it performs normalization, uniformly scaling all data to a similar numerical range, such as mapping distance to 0 to 1 and speed to -1 to 1. After processing, a structured local state feature vector is output.
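A minimal sketch of the slicing rules follows, assuming the global state is addressable by field name and that each rule carries the source range and the target range of one field. The field names and ranges are hypothetical; only the two-stage structure (dimensional indexing, then normalization) comes from the description above.

```python
from typing import Dict, List, Tuple

# (field, source min, source max, target min, target max)
SliceRule = Tuple[str, float, float, float, float]


def scale(value: float, src_min: float, src_max: float,
          dst_min: float, dst_max: float) -> float:
    """Clip to the source range, then map linearly onto the target range."""
    clipped = min(max(value, src_min), src_max)
    ratio = (clipped - src_min) / (src_max - src_min)
    return dst_min + ratio * (dst_max - dst_min)


def local_feature_vector(global_state: Dict[str, float],
                         rules: List[SliceRule]) -> List[float]:
    """Dimensional indexing (pick fields) followed by normalization (scale them)."""
    return [scale(global_state[field], a, b, c, d) for field, a, b, c, d in rules]


global_state = {"agv1.distance_m": 25.0, "agv1.speed_mps": -1.0, "arm3.load_kg": 12.0}
agv1_rules: List[SliceRule] = [
    ("agv1.distance_m", 0.0, 50.0, 0.0, 1.0),    # distance mapped to 0..1
    ("agv1.speed_mps", -2.0, 2.0, -1.0, 1.0),    # speed mapped to -1..1
]
agv1_features = local_feature_vector(global_state, agv1_rules)
```

The robotic arm field `arm3.load_kg` is present in the global state but is never read by the AGV agent, which mirrors the subscription behaviour described above.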
[0149] Optionally, in this embodiment, the local state feature vector is input into a pre-trained local policy network. After forward computation, a candidate action is selected from the action space, and the decision result is output. The candidate action represents the next action that the agent considers optimal in the current local state.
[0150] For example, an AGV's local policy network might output "drive to the nearest charging station" as a candidate action based on the characteristics of low battery and congestion on the path ahead. The entire process is end-to-end: directly mapping from the original state features to specific actions, without the need for manually written rules, achieving adaptive and real-time local decision-making.
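The mapping from local state features to a candidate action can be sketched as a forward pass followed by a greedy selection. The single linear layer and the hand-set weights below merely stand in for a pre-trained local policy network; the action names and the two features (battery level and congestion ahead, both normalized to 0..1) are assumptions of this example.

```python
from typing import List, Sequence

ACTIONS: List[str] = ["move_forward", "turn_left", "stop", "go_to_charging_station"]

# Hand-set weights standing in for a pre-trained network: one row per action,
# one column per feature (battery level, congestion ahead).
WEIGHTS = [[1.0, -1.0], [0.0, 0.0], [0.0, 0.5], [-2.0, 1.0]]
BIASES = [0.0, 0.0, 0.0, 1.0]


def candidate_action(features: Sequence[float]) -> str:
    """Forward pass of a single linear layer followed by a greedy argmax."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(WEIGHTS, BIASES)]
    return ACTIONS[max(range(len(scores)), key=scores.__getitem__)]


low_battery_congested = candidate_action([0.1, 0.9])
charged_clear_path = candidate_action([0.9, 0.0])
```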
[0151] Through step S603, this embodiment uses a local reinforcement learning model so that each local agent can learn by trial and error in the virtual space and output the optimal candidate action for warehouse management, at zero risk and low cost.
[0152] In one embodiment of the intelligent warehouse management method of this application, it may further include the following:
Step S701: Construct and pre-train a global policy network, wherein the global policy network is a global reinforcement learning model, the state space of the global policy network is global state data, and the action space is the joint action of all agents;
Step S702: Perform spatiotemporal trajectory analysis on each candidate action, construct a conflict graph, input the conflict graph and the global state tensor into the global policy network, score the conflict resolution value of each candidate action combination according to a preset scoring function, and determine the corresponding set of cooperative actions based on maximizing the conflict resolution value score.
[0153] Optionally, in this embodiment, a global policy network is constructed and pre-trained. The global policy network is a global reinforcement learning model, whose state space is defined as global state data, i.e., a complete digital description of all equipment, storage locations, orders, routes, and other information in the entire warehousing system—that is, all the content covered by the global state tensor generated in the previous steps. The action space is defined as the joint actions of all agents. This means that the output of the global policy network is not a single action of a particular device, but rather a combination of actions that should be performed by every agent in the entire device cluster. During the pre-training phase, the system uses historical operating data or a simulation environment to train the reinforcement learning model, enabling it to learn to evaluate the long-term cumulative rewards of different combinations of joint actions given a global state, thereby acquiring the ability to make globally optimal decisions.
[0154] Optionally, in this embodiment, specific conflict resolution is performed on the candidate actions uploaded by each intelligent agent.
[0155] The system performs spatiotemporal trajectory analysis on all candidate actions. It analyzes the start and end times of each candidate action on the time axis and its spatial movement path or work area. For example, a candidate action for an AGV involves occupying a certain lane within a specific time period, while a candidate action for a robotic arm involves covering a workbench area within the same time period. Based on the analysis results, the system constructs a conflict graph. A conflict graph is a graph structure where each node represents a candidate action. If two candidate actions overlap in time and interfere spatially (e.g., path intersection, area overlap, resource contention), an edge is added between the corresponding nodes to indicate a conflict.
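The conflict graph construction can be illustrated by the following sketch. It assumes one candidate action per agent, a time interval per action, and a discretization of space into named cells (lanes, workbenches); these representations are assumptions of the example, since the application does not prescribe a data structure.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class CandidateAction:
    agent: str
    start: float             # seconds
    end: float               # seconds
    cells: FrozenSet[str]    # lanes or work areas occupied during [start, end)


def in_conflict(a: CandidateAction, b: CandidateAction) -> bool:
    """Two actions conflict when they overlap in time and share a spatial cell."""
    overlap_in_time = a.start < b.end and b.start < a.end
    overlap_in_space = bool(a.cells & b.cells)
    return overlap_in_time and overlap_in_space


def build_conflict_graph(actions: List[CandidateAction]) -> Dict[str, Set[str]]:
    """Adjacency sets: one node per candidate action, one edge per conflict."""
    graph: Dict[str, Set[str]] = {a.agent: set() for a in actions}
    for a, b in combinations(actions, 2):
        if in_conflict(a, b):
            graph[a.agent].add(b.agent)
            graph[b.agent].add(a.agent)
    return graph


graph = build_conflict_graph([
    CandidateAction("agv1", 0.0, 10.0, frozenset({"lane_3"})),
    CandidateAction("arm1", 5.0, 15.0, frozenset({"lane_3", "bench_1"})),
    CandidateAction("agv2", 20.0, 30.0, frozenset({"lane_3"})),
])
```

Here `agv2` uses the same lane as the other two agents but at a later time, so it has no edges in the graph.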
[0156] Subsequently, the system inputs the constructed conflict graph and the current global state tensor into the pre-trained global policy network. A scoring function based on the network head is used to score the conflict resolution value of different candidate action combinations. The scoring function comprehensively evaluates the global benefit of each candidate action combination, including multiple dimensions such as overall task completion efficiency, energy consumption cost, equipment lifespan loss, and collision risk, while also considering the conflict relationships reflected in the conflict graph.
[0157] The system iterates through or samples multiple possible combinations of candidate actions, calculating a conflict resolution value score for each combination. A higher score indicates that the combination can achieve a higher global cumulative reward while resolving conflict. Finally, the system selects the set of candidate action combinations that maximizes the score and defines it as the final set of cooperative actions.
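A minimal sketch of the enumeration and selection step follows. The hand-written score (sum of estimated values minus a fixed penalty per conflicting pair) is only a stand-in for the scoring function of the global policy network, and the option tuples, resource names, and penalty value are assumptions of this example. Exhaustive enumeration is shown for clarity; sampling would replace it when the number of combinations is large.

```python
from itertools import combinations, product
from typing import Dict, List, Tuple

# (action name, estimated value, resource claimed during execution)
Option = Tuple[str, float, str]


def score(combo: Tuple[Option, ...], conflict_penalty: float) -> float:
    """Total value of a combination minus a penalty for each conflicting pair."""
    value = sum(option[1] for option in combo)
    clashes = sum(1 for a, b in combinations(combo, 2) if a[2] == b[2])
    return value - conflict_penalty * clashes


def best_cooperative_actions(options: Dict[str, List[Option]],
                             conflict_penalty: float = 10.0) -> Dict[str, str]:
    """Enumerate every combination and keep the one with the highest score."""
    agents = sorted(options)
    best = max(product(*(options[a] for a in agents)),
               key=lambda combo: score(combo, conflict_penalty))
    return {agent: option[0] for agent, option in zip(agents, best)}


cooperative = best_cooperative_actions({
    "agv1": [("use_lane_3", 5.0, "lane_3"), ("use_lane_4", 4.0, "lane_4")],
    "agv2": [("use_lane_3", 6.0, "lane_3"), ("wait_at_dock", 1.0, "dock_2")],
})
```

Both agents prefer lane 3 locally, but the combination that sends `agv1` through lane 4 scores highest once the conflict penalty is applied.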
[0158] Through step S702, this embodiment realizes the transformation from locally optimal candidate actions to globally conflict-free optimal joint actions, ensuring the safety and efficiency of multi-device collaboration.
[0159] To improve warehouse management efficiency and operational support, this application provides an embodiment of an intelligent warehouse management device for implementing all or part of the aforementioned intelligent warehouse management method. Referring to Figure 2, the intelligent warehouse management device specifically includes the following components:
The virtual data base determination module 10 is used to collect multi-source heterogeneous data in the warehouse environment through the industrial Internet of Things bidirectional communication protocol stack, and perform semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping on the multi-source heterogeneous data to determine the corresponding virtual space data base.
The global state determination module 20 of the warehousing system is used to construct a corresponding digital twin model based on the virtual space data base, embed a preset physical mechanism model and a set data-driven model into the digital twin model, receive the real-time operating condition data stream of the warehousing system according to the digital twin model, and map the real-time operating condition data stream into a global state tensor. The global state tensor is used to characterize the current warehousing state and warehousing prediction information.
The warehouse management action determination module 30 is used to construct a federated learning framework based on multiple agents deployed on various device clusters. Each agent obtains local state information related to itself from the global state tensor, inputs the local state information into a preset local policy network for action prediction, determines the corresponding candidate actions, uploads each candidate action to the central server according to the federated learning framework, resolves conflicts of each candidate action according to the preset global policy network, determines the corresponding set of cooperative actions, inputs the set of cooperative actions into a task planner for task decomposition, and sends the executable atomic task sequence obtained after decomposition to each device cluster to execute the warehouse management task.
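The final stage performed by module 30, in which the task planner decomposes the set of cooperative actions into executable atomic task sequences, can be sketched as a table lookup. The decomposition table, the action names, and the atomic task names are hypothetical; the application does not specify how the task planner performs the decomposition.

```python
from typing import Dict, List

# Hypothetical decomposition table: cooperative action -> atomic tasks in order.
DECOMPOSITION: Dict[str, List[str]] = {
    "deliver_to_station": ["plan_route", "move_to_station", "unload"],
    "go_to_charger": ["plan_route", "move_to_charger", "dock"],
}


def plan(cooperative_actions: Dict[str, str]) -> Dict[str, List[str]]:
    """Decompose each agent's cooperative action into an atomic task sequence."""
    return {agent: list(DECOMPOSITION[action])
            for agent, action in cooperative_actions.items()}


# One sequence per agent, ready to be sent to the corresponding device cluster.
sequences = plan({"agv1": "deliver_to_station", "agv2": "go_to_charger"})
```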
[0160] As described above, the intelligent warehouse management device provided in this application can collect multi-source heterogeneous data through the industrial Internet of Things protocol stack, and construct a unified virtual space data foundation using five types of mapping rules: semantic, structural, spatiotemporal, relational, and state behavior. Based on this data foundation, a digital twin model is constructed, and the physical mechanism model and data-driven model are embedded into the digital twin model so that the model can map real-time operating condition data streams into a global state tensor, thereby achieving accurate perception and prediction of warehouse status. A multi-agent collaborative framework based on federated learning is constructed, where agents in each device cluster obtain their respective local state information based on the global state tensor to generate candidate actions. After conflict resolution by the central server's global policy network, the task planner decomposes and distributes executable atomic task sequences, thereby improving warehouse management efficiency and operational support.
[0161] This invention also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor performs the above-described intelligent warehouse management method.
[0162] This invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described intelligent warehouse management method.
[0163] This invention also provides a computer program product, which includes a computer program that, when executed by a processor, implements the above-described intelligent warehouse management method.
[0164] In this embodiment of the invention, multi-source heterogeneous data can be collected through an industrial IoT protocol stack, and a unified virtual space data foundation can be constructed using five types of mapping rules: semantic, structural, spatiotemporal, relational, and state behavior. A digital twin model is built based on this data foundation, and a physical mechanism model and a data-driven model are embedded into the digital twin model. This allows the model to map real-time operating condition data streams into a global state tensor, enabling accurate perception and prediction of warehouse status. A multi-agent collaborative framework based on federated learning is constructed. Agents in each device cluster obtain their respective local state information based on the global state tensor to generate candidate actions. After conflict resolution by the central server's global policy network, the task planner decomposes and distributes executable atomic task sequences, thereby improving warehouse management efficiency and operational support.
[0165] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0166] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and/or box of the flowchart illustrations and/or block diagrams, and combinations of processes and/or boxes in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0167] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0168] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more boxes of the block diagram.
[0169] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. An intelligent warehouse management method, characterized in that, The method includes: Multi-source heterogeneous data in the warehouse environment is collected through the industrial Internet of Things bidirectional communication protocol stack. Semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping are performed on the multi-source heterogeneous data to determine the corresponding virtual space data base. A corresponding digital twin model is constructed based on the virtual space data base, and a preset physical mechanism model and a set data-driven model are embedded in the digital twin model. The real-time operating condition data stream of the warehousing system is received based on the digital twin model, and the real-time operating condition data stream is mapped into a global state tensor. The global state tensor is used to characterize the current warehousing status and warehousing prediction information. A federated learning framework is constructed based on multiple agents deployed on various device clusters. Each agent obtains local state information related to itself from the global state tensor and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate actions. The candidate actions are then uploaded to the central server according to the federated learning framework. Conflicts among the candidate actions are then resolved according to the preset global policy network to determine the corresponding set of cooperative actions. The set of cooperative actions is then input into a task planner for task decomposition, and the resulting executable atomic task sequence is sent to each device cluster to execute the warehouse management task.
2. The intelligent warehouse management method according to claim 1, characterized in that, The semantic mapping and the structural mapping include: The semantic mapping includes establishing a unified data dictionary, mapping data entities with the same meaning but different names or definitions from different source systems to the same standard identifier, and associating isolated data with physical entities, time tags, and spatial locations in context. The structural mapping includes mapping relational tables, time-series data, JSON documents, and unstructured data to the unified data structure of the digital twin model through predefined ETL or ELT transformation rules, and associating the storage path of unstructured data with the attribute fields of the corresponding physical entities.
3. The intelligent warehouse management method according to claim 1, characterized in that, The spatiotemporal mapping, the relational mapping, and the state behavior mapping include: The spatiotemporal mapping includes establishing a precise geometric coordinate mapping for each physical entity in virtual space through GPS coordinates, indoor positioning beacons, or 3D model registration, and establishing a unified time axis for all time series data, synchronously processing timestamp differences from different sources. The relational mapping includes constructing a knowledge graph between entities and defining the logical relationships and connection methods between physical entities. The state behavior mapping includes defining a state machine for each physical entity and setting real-time data conditions to trigger state transitions for the state machine.
4. The intelligent warehouse management method according to claim 1, characterized in that, The step of constructing a corresponding digital twin model based on the virtual space data base, and embedding a preset physical mechanism model and a set data-driven model into the digital twin model, includes: Based on the virtual space data base, a virtual model of the physical storage space, equipment entities, and work processes is constructed through a 3D modeling engine, and a mapping channel is constructed to synchronously map the real-time data of the physical entities to the entity attribute fields corresponding to the virtual model, thereby determining the corresponding digital twin model. A preset physical mechanism model is obtained and embedded into the digital twin model. The physical mechanism model includes Newtonian mechanics, thermodynamics, and fluid dynamics equations to describe the physical behavior of equipment movement, energy consumption, and cargo stress. A data-driven model is trained based on historical fusion data and then embedded into the digital twin model. The data-driven model includes an anomaly detection model trained based on unsupervised learning, a predictive maintenance model trained based on supervised learning, and an optimization control model trained based on reinforcement learning.
5. The intelligent warehouse management method according to claim 1, characterized in that, The step of receiving real-time operational data streams from the warehousing system based on the digital twin model and mapping the real-time operational data streams to a global state tensor includes: The real-time operating condition data stream of the warehousing system is received according to the digital twin model. The real-time operating condition data stream is standardized based on the construction rules of the virtual space data base to determine the corresponding standardized data. The standardized data is associated and fused according to its corresponding physical entity spatial location and timestamp to determine the corresponding high-dimensional state tensor. The high-dimensional state tensor represents the current global status of the warehousing system. The physical mechanism model is invoked to perform deterministic evolution calculations on the standardized data, and the data-driven model is invoked to perform probabilistic trend predictions on the standardized data. The results of the evolution calculations and trend predictions are then added as extended dimensions to the high-dimensional state tensor to determine the corresponding global state tensor.
6. The intelligent warehouse management method according to claim 1, characterized in that, Each of the aforementioned agents obtains relevant local state information from the global state tensor, and inputs the local state information into a preset local policy network for action prediction to determine corresponding candidate actions, including: Construct and pre-train a local policy network, which is a local reinforcement learning model, including a state space, an action space, and a reward function. The state space includes local state data, the action space defines the set of atomic actions that each agent can execute, and the reward function is defined as a multi-objective weighted sum. Each of the aforementioned intelligent agents subscribes to local state information related to itself from the global state tensor according to its function type, and performs dimensional indexing and normalization on the local state information according to a preset slicing rule to determine the corresponding local state feature vector. The local state feature vector is input into the local policy network for action prediction to determine the corresponding candidate actions.
7. The intelligent warehouse management method according to claim 1, characterized in that, The step of resolving conflicts among the candidate actions according to a preset global policy network to determine the corresponding set of cooperative actions includes: Construct and pre-train a global policy network, which is a global reinforcement learning model. The state space of the global policy network is global state data, and the action space is the joint action of all agents. Spatiotemporal trajectory analysis is performed on each candidate action to construct a conflict graph. The conflict graph and the global state tensor are input into the global policy network. The conflict resolution value score of each candidate action combination is calculated according to a preset scoring function. The corresponding set of cooperative actions is determined based on maximizing the conflict resolution value score.
8. An intelligent warehouse management device, characterized in that, The device includes: The virtual data base determination module is used to collect multi-source heterogeneous data in the warehouse environment through the industrial Internet of Things bidirectional communication protocol stack, and perform semantic mapping, structural mapping, spatiotemporal mapping, relational mapping and state behavior mapping on the multi-source heterogeneous data to determine the corresponding virtual space data base. The global state determination module of the warehousing system is used to construct a corresponding digital twin model based on the virtual space data base, and embed a preset physical mechanism model and a set data-driven model into the digital twin model. It receives the real-time operating condition data stream of the warehousing system according to the digital twin model, and maps the real-time operating condition data stream into a global state tensor. The global state tensor is used to represent the current warehousing state and warehousing prediction information. The warehouse management action determination module is used to construct a federated learning framework based on multiple agents deployed on various device clusters. Each agent obtains local state information related to itself from the global state tensor and inputs the local state information into a preset local policy network for action prediction to determine the corresponding candidate actions. According to the federated learning framework, each candidate action is uploaded to the central server. The preset global policy network is used to resolve conflicts among the candidate actions to determine the corresponding set of cooperative actions. The set of cooperative actions is input into the task planner for task decomposition, and the executable atomic task sequence obtained after decomposition is sent to each device cluster to execute the warehouse management task.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, When the processor executes the program, it implements the steps of the intelligent warehouse management method according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, When executed by a processor, the computer program implements the steps of the intelligent warehouse management method according to any one of claims 1 to 7.