Learning systems, trained models, information processing devices, warehouse systems, and methods

JP2026100242A · Pending · Publication Date: 2026-06-19 · TOYOTA INDUSTRIES CORP +1


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: TOYOTA INDUSTRIES CORP
Filing Date: 2024-12-09
Publication Date: 2026-06-19

Application Information

Patent Timeline

09 Dec 2024: Application
19 Jun 2026: Publication (JP2026100242A)

IPC: B65G1/137; G06N3/04

AI Tagging

Application Domain

Neural architectures Storage devices



Abstract

[Problem] To quickly predict the optimal processing order for multiple tasks in an automated warehouse, taking predetermined constraints into account. [Solution] The learning system comprises a simulation unit 500, a determination unit 510, and a learning unit 515. The simulation unit 500 simulates the operation of multiple transport devices under predetermined constraints for each combination of the multiple picking locations specified for the multiple tasks and the multiple processing orders specified for each task queue. The determination unit 510 determines, for each of the multiple tasks, whether or not the task was processed in the simulation. The learning unit 515 generates a trained model 520 that has learned the relationship between these combinations and the determination results of the determination unit 510.


Description

Technical Field

[0001] The present disclosure relates to a learning system, a learned model, an information processing apparatus, a warehouse system, and a method.

Background Art

[0002] International Publication No. 2021/019702 (Patent Document 1) discloses an optimization system for an automated warehouse. This system generates a trained model from teacher data that includes various records obtained in past logistics processing of the automated warehouse (past inbound and outbound operation logs, operation logs of the conveying devices, attribute data of commodity items, etc.). This trained model is used to optimize the logistics of the automated warehouse. In Patent Document 1, the automated warehouse targeted for logistics optimization is a stacker crane type automated warehouse.

Prior Art Documents

Patent Documents

[0003] Patent Document 1: International Publication No. 2021/019702

Summary of the Invention

Problems to be Solved by the Invention

[0004] Shuttle-type automated warehouses are widely used. Unlike stacker crane type ones, shuttle-type automated warehouses may have, in addition to a plurality of first passages connecting a plurality of storage locations and a plurality of loading / unloading ports, a second passage that intersects these first passages and connects the plurality of loading / unloading ports to each other. Such an automated warehouse is subject to certain constraints (in one example, a constraint that a plurality of conveying devices (shuttle trolleys) travel in these passages without colliding with each other). For example, in this automated warehouse, only one conveying device may travel in each first passage, while on the second passage, two or more conveying devices may travel as long as they do not collide with each other. In Patent Document 1, such constraints are not considered.

[0005] Under the constraints described above, predicting the optimal processing order for multiple tasks in an automated warehouse and operating multiple conveying devices according to that order is important from the standpoint of processing efficiency for these tasks. Therefore, it is conceivable to predict the optimal processing order by simulating the operation of multiple conveying devices under the above constraints. However, in this case, the computational cost may increase and it may take a long time.

[0006] This disclosure has been made to solve the above-mentioned problems, and its purpose is to provide a learning system, a trained model, an information processing device, a warehouse system, and a method that enable the rapid prediction of a preferred processing order for multiple tasks in an automated warehouse, taking into account predetermined constraints.

Means for Solving the Problem

[0007] The learning system disclosed herein is used for learning to process multiple tasks, which involve picking multiple items stored in multiple storage locations in an automated warehouse using multiple conveying devices and transporting them to multiple exits. The multiple tasks are defined by multiple task queues, each corresponding to one of the multiple exits and arranged in parallel with the others. The learning system comprises a simulation unit, a determination unit, and a learning unit. The simulation unit performs a simulation of the operation of multiple conveying devices in the automated warehouse under predetermined constraints for each combination of multiple picking locations specified for each of the multiple tasks and multiple processing sequences specified for each task queue in the multiple tasks. The determination unit determines, for each of the multiple tasks, whether or not the task was processed in the simulation. The learning unit generates a trained model in which the relationship between the combination and the determination result of the determination unit is learned.

[0008] The combination of multiple picking locations and processing sequences (which storage location's goods to pick for each task, and in what order those goods are transported to the corresponding exit) is related to the processing status of multiple tasks (whether each task has been processed or not). The processing status of multiple tasks when a transport device operates in an automated warehouse under the above combinations can be predicted in advance through simulation. With the above configuration, the trained model can be used as a surrogate model to replace the simulator. Specifically, using the trained model, it is possible to infer the simulator's prediction results for the processing status of multiple tasks (for example, the predicted number of processed tasks) for various combinations of multiple picking locations and multiple processing sequences related to multiple tasks to be processed. As a result, multiple processing sequences that optimize the processing status of multiple tasks can be predicted as the optimal processing sequence. Furthermore, a surrogate model can generally perform optimization in a shorter time than the simulator from which the model is generated. Therefore, with the above configuration, the optimal processing sequence for multiple tasks in an automated warehouse, taking into account predetermined constraints, can be predicted in a short time.
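The data-generation loop described in [0007] and [0008] (simulate each combination, then record per task whether it was processed) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: `toy_simulate` is a hypothetical stand-in for the simulator, and the queue sizes, the time budget, and the travel-time rule are all invented for the example.

```python
import random

NUM_QUEUES = 4        # one task queue per exit (assumed size)
TASKS_PER_QUEUE = 8   # assumed queue length
TIME_BUDGET = 60      # hypothetical simulation horizon


def toy_simulate(picking_locations, processing_orders):
    """Hypothetical stand-in for the simulator.

    picking_locations[k] is the (row, column) of task k and
    processing_orders[k] is the (queue, position) of task k.
    Returns one boolean per task: True if the task finished within
    TIME_BUDGET. A real simulator would move the conveying devices
    under the collision constraints; here each queue is served serially
    and a task costs a round trip proportional to its row.
    """
    processed = [False] * len(picking_locations)
    for queue in range(NUM_QUEUES):
        # Tasks of this queue, sorted by their position in the queue.
        members = sorted(
            (pos, k) for k, (q, pos) in enumerate(processing_orders) if q == queue
        )
        clock = 0
        for _, k in members:
            row, _col = picking_locations[k]
            clock += 2 * row  # toy round-trip travel time
            if clock > TIME_BUDGET:
                break
            processed[k] = True
    return processed


def make_sample(rng):
    """Draw one random combination and label it with the simulation result."""
    num_tasks = NUM_QUEUES * TASKS_PER_QUEUE
    picking = [(rng.randint(1, 8), rng.randint(1, 8)) for _ in range(num_tasks)]
    slots = [(q, p) for q in range(NUM_QUEUES) for p in range(TASKS_PER_QUEUE)]
    rng.shuffle(slots)  # random assignment of tasks to (queue, position) slots
    labels = toy_simulate(picking, slots)
    return (picking, slots), labels


rng = random.Random(0)
dataset = [make_sample(rng) for _ in range(100)]
```

Each entry of `dataset` pairs one combination (picking locations plus processing orders) with the per-task determination results, which is the kind of input and target pair a surrogate model would be fitted to.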

[0009] Preferably, the learning unit includes an input vector group generation unit, a learnable vector group generation unit, an output vector group generation unit, and an adjustment unit. The input vector group generation unit generates a plurality of input vectors, each representing a combination feature, according to a plurality of picking locations and a plurality of processing orders. The learnable vector group generation unit generates a plurality of learnable vectors, each representing the degree of correlation between the plurality of input vectors, according to the plurality of input vectors using an attention mechanism. The output vector group generation unit generates a plurality of output vectors that represent the features of the plurality of input vectors by transforming the plurality of input vectors according to the plurality of learnable vectors. The adjustment unit adjusts each component of the plurality of input vectors and each component of the plurality of learnable vectors so that the error function based on the vector generated by fully connecting each component of the plurality of output vectors and the vector representing the determination result of the determination unit is optimized.

[0010] With the above configuration, in addition to the characteristics of each combination, the characteristics of the correlation between the multiple input vectors (the correlation between picking locations and processing order among multiple tasks) are reflected in the multiple output vectors. Then, each component of the multiple input vectors and each component of the multiple learnable vectors are adjusted so that the error function is optimized (e.g., minimized). As a result, a trained model is generated that takes into account not only the characteristics of the multiple picking locations themselves and the characteristics of the multiple processing orders themselves, but also the characteristics of the correlation between picking locations and processing order among multiple tasks. Consequently, inference can be appropriately performed using the trained model. Therefore, based on this inference result, it is possible to predict more appropriate multiple processing orders.
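The forward computation described in [0009] and [0010] can be illustrated with a small NumPy sketch. The source does not name the attention variant or the error function, so scaled dot-product self-attention and binary cross-entropy are assumptions here, as are all dimensions and the reading of the attention weights as the "degree of correlation" between input vectors. Only the forward pass and the loss are shown; the parameter adjustment (for example by gradient descent) is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(inputs, w_q, w_k, w_v):
    """Scaled dot-product self-attention over per-task input vectors.

    inputs has shape (num_tasks, d). Returns the output vectors and the
    attention weights (one row of correlation weights per task).
    """
    q = inputs @ w_q
    k = inputs @ w_k
    v = inputs @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # shape (num_tasks, num_tasks)
    return weights @ v, weights


rng = np.random.default_rng(0)
num_tasks, d = 32, 16  # assumed sizes
inputs = rng.normal(size=(num_tasks, d))  # placeholder input vectors
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
outputs, weights = self_attention(inputs, w_q, w_k, w_v)

# Fully connect every component of the output vectors into one score per task.
w_out = rng.normal(size=(num_tasks * d, num_tasks)) * 0.01
logits = outputs.reshape(-1) @ w_out
pred = 1.0 / (1.0 + np.exp(-logits))  # predicted probability "task was processed"

# Placeholder determination results (1 = processed, 0 = not processed).
labels = rng.integers(0, 2, size=num_tasks)
eps = 1e-12
loss = -np.mean(labels * np.log(pred + eps) + (1 - labels) * np.log(1 - pred + eps))
```

Training would repeat this forward pass over many simulated combinations and adjust the parameters so that `loss` decreases.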

[0011] Preferably, the input vector group generation unit includes a first generation unit, a second generation unit, a third generation unit, and a fourth generation unit. The first generation unit generates a plurality of first vectors, each representing a characteristic of a plurality of picking locations, according to a plurality of picking locations. The second generation unit generates a plurality of second vectors, each representing a characteristic of a plurality of processing orders, according to a plurality of processing orders. The third generation unit generates a plurality of third vectors by concatenating the plurality of first vectors and the plurality of second vectors. The fourth generation unit generates, as the plurality of input vectors, a plurality of fourth vectors, each having a dimension lower than the dimension of the plurality of third vectors, by fully connecting the components of each of the plurality of third vectors.

[0012] With the above configuration, the characteristics of multiple picking locations are reflected in multiple first vectors, and the characteristics of multiple processing orders are reflected in multiple second vectors. The characteristics of the multiple first vectors themselves and the characteristics of the multiple second vectors themselves are collectively reflected in multiple third vectors. For each task, the characteristics of the correlation between the first and second vectors are reflected in the fourth vector. Therefore, for each task, a trained model is generated that takes into account the correlation between picking locations and processing orders. As a result, inference can be appropriately performed using the trained model. Therefore, based on these inference results, it is possible to predict more appropriate multiple processing orders.

[0013] Preferably, the first generation unit generates a plurality of first vectors by performing a convolution operation on each of a plurality of first tensors that each indicate a plurality of picking locations.

[0014] Preferably, the first generation unit generates a plurality of first vectors by embedding information of a plurality of picking locations into a vector space of a predetermined dimension.

[0015] Preferably, the second generation unit generates a plurality of second vectors by performing a convolution operation on each of a plurality of second tensors that each indicate a plurality of processing orders.

[0016] Preferably, the second generation unit generates a plurality of second vectors by embedding information of a plurality of processing sequences into a vector space of a predetermined dimension.
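The input-vector construction of [0011], using the embedding variants of [0014] and [0016], can be sketched as below. This is an illustrative sketch only: the lookup-table embeddings, the random weights, and every dimension (`D_LOC`, `D_ORD`, `D_IN`) are assumptions chosen so that the fourth vectors have a lower dimension than the third vectors. The convolution variants of [0013] and [0015] are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ROWS, N_COLS = 8, 8            # storage grid size used in the example
NUM_QUEUES, QUEUE_LEN = 4, 8     # assumed number of queues and slots per queue
NUM_TASKS = NUM_QUEUES * QUEUE_LEN
D_LOC, D_ORD, D_IN = 8, 8, 6     # assumed dimensions, with D_IN < D_LOC + D_ORD

# Embedding tables: one row per picking location and per (queue, position) slot.
loc_table = rng.normal(size=(N_ROWS * N_COLS, D_LOC))
ord_table = rng.normal(size=(NUM_QUEUES * QUEUE_LEN, D_ORD))

# Fully connected layer that reduces each third vector to a fourth vector.
w_fc = rng.normal(size=(D_LOC + D_ORD, D_IN)) * 0.1
b_fc = np.zeros(D_IN)


def input_vectors(loc_ids, ord_ids):
    """Build one input vector per task from location and order indices."""
    first = loc_table[loc_ids]                        # first vectors
    second = ord_table[ord_ids]                       # second vectors
    third = np.concatenate([first, second], axis=1)   # third vectors
    fourth = third @ w_fc + b_fc                      # fourth vectors (lower dimension)
    return fourth


# Placeholder combination: random picking locations, random slot assignment.
loc_ids = rng.integers(0, N_ROWS * N_COLS, size=NUM_TASKS)
ord_ids = rng.permutation(NUM_TASKS)
x = input_vectors(loc_ids, ord_ids)
```

In a trainable version, `loc_table`, `ord_table`, `w_fc`, and `b_fc` would be parameters updated during learning rather than fixed random arrays.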

[0017] Preferably, the predetermined constraints include the constraint that the conveying devices included in the plurality of conveying devices do not collide with each other in each of the plurality of first passages connecting the plurality of storage locations and the plurality of exits, and in the second passage connecting the plurality of exits to each other.

[0018] By adopting the above configuration, it is possible to predict the optimal processing order for multiple tasks in a short time, while taking into account constraints to avoid collisions between conveying devices.

[0019] The trained model of this disclosure is generated by the learning system described above. The information processing device disclosed herein comprises a model storage unit, a receiving unit, an inference unit, and a search unit. The model storage unit stores a trained model generated by the learning system described above. The receiving unit receives processing commands for multiple tasks that transport multiple packages stored in an automated warehouse to multiple exits using multiple conveying devices. The inference unit uses the trained model in the model storage unit to perform an inference process that infers a predicted value of the number of tasks processed among the multiple tasks, according to the multiple picking locations specified in the multiple tasks and a processing order of the multiple tasks that has been provisionally set for this inference. The search unit searches for the processing order of the multiple tasks that optimizes the predicted value.
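The interplay of the inference unit and the search unit in [0019] can be sketched as a loop that proposes a provisional processing order, scores it with the model, and keeps the better candidate. The search algorithm is not specified in this passage, so the random local search below is an assumption, and `predict_processed_count` is a hypothetical toy stand-in for inference with the trained model. The sketch also assumes that a task stays in its own queue because each queue is tied to one exit.

```python
import random

NUM_QUEUES, QUEUE_LEN = 4, 8  # assumed sizes


def predict_processed_count(picking_rows, order):
    """Hypothetical stand-in for inference with the trained model.

    order[q] lists the task indices of queue q in processing order.
    Toy rule: tasks in a queue are counted as processed until the
    accumulated round-trip distance exceeds a fixed budget.
    """
    total = 0
    for queue in order:
        clock = 0
        for k in queue:
            clock += 2 * picking_rows[k]
            if clock > 40:
                break
            total += 1
    return total


def search_order(picking_rows, order, iterations, rng):
    """Random local search: swap two tasks inside one queue, keep non-worse moves."""
    best = [list(q) for q in order]
    best_score = predict_processed_count(picking_rows, best)
    for _ in range(iterations):
        cand = [list(q) for q in best]
        q = rng.randrange(NUM_QUEUES)
        i, j = rng.sample(range(QUEUE_LEN), 2)
        cand[q][i], cand[q][j] = cand[q][j], cand[q][i]
        score = predict_processed_count(picking_rows, cand)
        if score >= best_score:
            best, best_score = cand, score
    return best, best_score


rng = random.Random(0)
picking_rows = [rng.randint(1, 8) for _ in range(NUM_QUEUES * QUEUE_LEN)]
initial = [list(range(q * QUEUE_LEN, (q + 1) * QUEUE_LEN)) for q in range(NUM_QUEUES)]
initial_score = predict_processed_count(picking_rows, initial)
best, best_score = search_order(picking_rows, initial, 500, rng)
```

Because each candidate is scored by a cheap model call rather than a full simulation, many provisional orders can be evaluated in the time one simulation run would take, which is the point of using a surrogate.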

[0020] The warehouse system of this disclosure comprises an automated warehouse and the information processing device described above. The method disclosed herein is a method used for learning to process multiple tasks of picking multiple goods stored in multiple storage locations in an automated warehouse using multiple conveying devices and transporting them to multiple exits. The multiple tasks are defined by multiple task queues, each corresponding to one of the multiple exits and arranged in parallel with the others. The method includes the step of performing a simulation of the operation of multiple conveying devices in the automated warehouse under predetermined constraints for each combination of multiple picking locations specified in each of the multiple tasks and multiple processing sequences specified for each task queue in each of the multiple tasks. The method further includes the step of determining, for each of the multiple tasks, whether or not the task was processed in the simulation, and the step of generating a trained model in which the relationship between the combination and the determination result in the determination step has been learned.

[0021] Preferably, the step of generating the trained model includes: generating a plurality of input vectors each indicating a combined feature according to a plurality of picking locations and a plurality of processing orders; generating, by an attention mechanism according to the plurality of input vectors, a plurality of learnable vectors each representing the degree of correlation between the plurality of input vectors; generating a plurality of output vectors indicating the features of the plurality of input vectors by converting the plurality of input vectors according to the plurality of learnable vectors; and adjusting each component of the plurality of input vectors and each component of the plurality of learnable vectors so that an error function based on a vector generated by fully connecting each component of the plurality of output vectors and a vector indicating the determination result in the determination step is optimized.

[0022] Preferably, the step of generating the plurality of input vectors includes: generating a plurality of first vectors each indicating the feature of a plurality of picking locations according to the plurality of picking locations; generating a plurality of second vectors each indicating the feature of a plurality of processing orders according to the plurality of processing orders; generating a plurality of third vectors by concatenating the plurality of first vectors and the plurality of second vectors; and generating, as the plurality of input vectors, a plurality of fourth vectors having a dimension lower than the dimension of the plurality of third vectors by fully connecting each component of the plurality of third vectors.

Advantages of the Invention

[0023] According to the present disclosure, it is possible to predict, in a short time, a suitable processing order for a plurality of tasks in an automated warehouse considering predetermined constraints.

Brief Description of the Drawings

[0024]
[Figure 1] This is an overall configuration diagram of a warehouse system according to an embodiment.
[Figure 2] This is a schematic diagram illustrating the configuration of an automated warehouse.
[Figure 3] This is a schematic diagram illustrating the configuration of an automated warehouse.
[Figure 4] This is a schematic diagram illustrating the configuration of an automated warehouse.
[Figure 5] This diagram illustrates multiple tasks in an automated warehouse, managed using task management information.
[Figure 6] This is a diagram to explain task management information.
[Figure 7] This is a functional block diagram of a terminal device according to an embodiment.
[Figure 8] This is a conceptual diagram illustrating the method for generating a trained model (described later) in the embodiment.
[Figure 9] This diagram illustrates the detailed functional configuration of the learning unit.
[Figure 10] This diagram illustrates the detailed functional configuration of the input vector group generation unit.
[Figure 11] This figure illustrates an example of a method used by the feature vector group generator to generate a set of feature vectors.
[Figure 12] This figure illustrates another example of a method by which a feature vector group generator generates a set of feature vectors.
[Figure 13] This figure illustrates an example of a method used by the feature vector group generator to generate a set of feature vectors.
[Figure 14] This figure illustrates another example of a method by which a feature vector group generator generates a set of feature vectors.
[Figure 15] This diagram illustrates the detailed method of inference using the trained model.
[Figure 16] This figure shows experimental results demonstrating that a trained model, acting as a surrogate model, can approximate the simulator with high accuracy.
[Figure 17] This figure shows the values of various hyperparameters for the trained model that were set in the above experiment.
[Figure 18] This flowchart shows an example of a process performed by the terminal device in the embodiment.
[Figure 19] This flowchart shows the detailed steps for generating the trained model in S115.
[Figure 20] This flowchart shows the detailed procedure for generating the input vector set in S205.
[Figure 21] This flowchart illustrates an example of a process performed by the task management device.

Modes for Carrying Out the Invention

[0025] Embodiments of this disclosure will be described in detail below with reference to the drawings. The same or corresponding parts in the drawings will be denoted by the same reference numerals and their descriptions will not be repeated. Each of the embodiments and its modifications may be combined with one another as appropriate.

Embodiment

Figure 1 is an overall configuration diagram of a warehouse system according to an embodiment. The warehouse system 100 includes an automated warehouse 1, a task management device 2, an order management device 3, and a terminal device 4.

[0026] Automated warehouse 1 stores multiple types of goods and is configured to automatically receive and ship these goods. Automated warehouse 1 is a shuttle-type automated warehouse. The configuration of automated warehouse 1 is explained in detail in Figure 2.

[0027] Task management device 2 is an information processing device that manages multiple tasks in automated warehouse 1. Task management device 2 is typically a server or workstation, but may also be a general-purpose computer such as a PC (Personal Computer) or a smart device. Task management device 2 includes a processor 21, memory 22, storage 23, input device 24, network controller 26, and bus 27.

[0028] The processor 21 includes processing circuitry such as a CPU (Central Processing Unit), MPU (Micro Processing Unit), or GPU (Graphics Processing Unit). The memory 22 includes volatile storage devices such as DRAM (Dynamic Random Access Memory) or SRAM (Static Random Access Memory). The storage 23 includes non-volatile storage devices such as ROM (Read Only Memory), HDD (Hard Disk Drive), SSD (Solid State Drive), or flash memory. The storage 23 stores a system program (not shown) including an OS (Operating System), a program 231 for task management of the automated warehouse 1, and task management information 232 used by program 231. The processor 21 generates or updates the task management information 232 by reading the system program and program 231, expanding them into memory 22, and executing them. The processor 21 is capable of performing various parallel processes.

[0029] Although only one processor is shown in Figure 1, the task management device 2 may include multiple processors. That is, the task management device 2 includes one or more processors. The same applies to the memory 22 and storage 23. In this specification, "processor" is not limited to a processor in the narrow sense that executes processing in a stored-program manner, but may include hardwired circuits such as ASICs (Application Specific Integrated Circuits) and FPGAs (Field-Programmable Gate Arrays). Therefore, the term "processor" can also be interpreted as a processing circuit (circuitry or processing circuitry) whose processing is predefined by computer-readable code and / or hardwired circuits. The task management device 2 may be divided into multiple units according to function.

[0030] The input device 24 is a keyboard or touchscreen, etc., that accepts user operation. The network controller 26 is configured to communicate with the outside of the task management device 2 (order management device 3 and terminal device 4) according to the control of the processor 21. The bus 27 connects the components of the task management device 2 in a communicative manner.

[0031] The order management device 3 outputs, to the task management device 2, processing commands (task processing commands INS) for multiple tasks of shipping packages from the automated warehouse 1. The task processing commands INS define multiple tasks, each of which transports one of the multiple packages. When the task management device 2 receives the task processing commands INS from the order management device 3, it determines the multiple tasks defined in the task processing commands INS and generates task management information 232 to process these tasks.

[0032] Terminal device 4 is an information processing device such as a PC, and includes a communication device 40, a storage device 41, an input device 42, and a processing device 44. The communication device 40 can communicate with the task management device 2. The storage device 41 stores a simulator 45. The simulator 45 is computer simulation software for simulating the operation of multiple shuttles (described later) in the automated warehouse 1. The simulator 45 will be described in detail later. The input device 42 receives input for various user operations. The processing device 44 includes a processor and memory (neither shown). The processor executes various processes, including parallel processing, according to the program stored in the memory. Terminal device 4 corresponds to an example of the "learning system" of this disclosure.

[0033] In addition, the task management device 2 can determine multiple tasks based on user operations on the input device 24, instead of determining multiple tasks according to the task processing command INS from the order management device 3.

[0034] The task management device 2 outputs transport commands to multiple shuttles (described later) of the automated warehouse 1 based on the task management information 232, so as to transport multiple packages specified in each of the multiple tasks defined by the task processing command INS or user operation.

[0035] Figures 2 to 4 are schematic diagrams illustrating the configuration of automated warehouse 1. Figure 2 shows a plan view of automated warehouse 1 from above. Referring to Figure 2, automated warehouse 1 is provided with four vertical passages 411 to 414, one lateral passage 43, and four shipping ports 440 (441 to 444). Automated warehouse 1 also includes eight storage units 421 to 428 and four shuttles 51 to 54. Note that the number of each component other than the lateral passage 43 is not limited to these values and can be any number of two or more.

[0036] The vertical passages 411 to 414 are transport passages that extend in the vertical direction (direction V in the figure). Each of the vertical passages 411 to 414 is subject to the constraint that only one shuttle may travel in it at a time. The vertical passages 411 to 414 correspond to the "multiple first passages" in this disclosure.

[0037] Storage units 421 to 428 are located on either side of the vertical passages 411 to 414, respectively. Each of the storage units 421 to 428 has multiple storage locations 46. In this example, each storage location 46 holds at most one package L: it either stores exactly one package L or stores none.

[0038] Referring to Figure 3, the number of storage locations 46 in the V direction is also represented as "n". The number of storage units in the H direction is also represented as "m". In the figure, the total number of storage locations 46 is determined by n × m. In this example, m = 8 and n = 8. Of the n × m storage locations 46, multiple packages L are stored in multiple storage locations 46. In the figure, the coordinates of the storage location 46 located at the i-th position from the top and the j-th position from the left are also represented as x(i,j) (1 ≤ i ≤ n, 1 ≤ j ≤ m).

[0039] Referring again to Figure 2, the lateral passage 43 is a passage that extends in the lateral direction (direction H in the figure). The lateral passage 43 connects two adjacent shipping ports 440 among the shipping ports 441 to 444 on the upper side of the figure, and connects two adjacent vertical passages among the vertical passages 411 to 414 on the lower side. The lateral passage 43 is subject to the constraint that two or more shuttles may travel in the lateral passage 43 as long as they do not collide with each other. Note that the lateral passage 43 corresponds to the "second passage" in this disclosure. Each of the shipping ports 441 to 444 corresponds to an exit for shipping cargo L.

[0040] Shuttles 51-54 are transport devices (shuttle carts) for transporting packages L. The shuttles may also be called agents. Shuttles 51-54 operate according to transport commands from the task management device 2 and are used to process multiple tasks, each of which transports one of the multiple packages L. These tasks involve picking multiple packages L stored in multiple storage locations 46 using shuttles 51-54 and transporting them to shipping ports 441-444. Each of the shuttles 51-54 picks a package L at the picking location specified in the transport command from the task management device 2 and transports the picked package L to the shipping port 440 specified in the transport command (one of the shipping ports 441-444).

[0041] In the situation shown in Figure 2, shuttle 51 is transporting a package L in the lateral passage 43. Shuttle 52 is transporting a package L in the vertical passage 411. Shuttles 53 and 54 are traveling along the vertical passages 412 and 414, respectively, in order to pick packages L.

[0042] Referring to Figure 4, if each of the vertical passages 411 to 414 is divided into the same number of cells 415 as the number of storage locations 46 along that passage (8 in this example), there are a total of 32 (= 8 × 4) cells 415 in the vertical passages 411 to 414. The location of the cell 415 that is the p-th from the top and the q-th from the left is also represented as x(p,q). In this example, 1 ≤ p ≤ 8 and 1 ≤ q ≤ 4. x(p,q) corresponds to the storage locations 46 located on either side of that cell 415 (Figure 2 or Figure 3).
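The relationship between storage coordinates x(i,j) and passage-cell coordinates x(p,q) can be made concrete with a short sketch. The text only says that a cell corresponds to the storage locations on either side of it; the specific rule below, in which storage columns 2q-1 and 2q flank passage q, is an assumption about the left-to-right layout, and the function names are hypothetical.

```python
N_ROWS = 8          # n: storage locations along each passage
N_STORAGE_COLS = 8  # m: storage units in the H direction
N_PASSAGES = 4      # vertical passages


def storage_to_cell(i, j):
    """Map storage location x(i, j) to the adjacent passage cell x(p, q).

    Assumes storage columns 2q-1 and 2q sit on either side of passage q
    (1-indexed), which is an illustrative layout assumption.
    """
    if not (1 <= i <= N_ROWS and 1 <= j <= N_STORAGE_COLS):
        raise ValueError("storage coordinates out of range")
    return i, (j + 1) // 2


def cell_to_storage(p, q):
    """Return the two storage locations flanking passage cell x(p, q)."""
    if not (1 <= p <= N_ROWS and 1 <= q <= N_PASSAGES):
        raise ValueError("cell coordinates out of range")
    return [(p, 2 * q - 1), (p, 2 * q)]


# Enumerate all passage cells: 8 rows x 4 passages = 32 cells.
all_cells = [(p, q) for p in range(1, N_ROWS + 1) for q in range(1, N_PASSAGES + 1)]
```

Under this assumed layout, every one of the 64 storage locations maps to exactly one of the 32 cells, and every cell maps back to two storage locations.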

[0043] Figure 5 is a diagram illustrating multiple tasks in the automated warehouse 1, which are managed using task management information 232. Referring to Figure 5, multiple tasks T are shown, and in this example, there are 32 tasks T. These tasks T involve picking multiple packages L stored in multiple storage locations 46 using shuttles 51-54 and transporting them to shipping ports 441-444. In detail, each task T picks, at the picking location designated for that task, the package L to be transported by that task and transports it to the shipping port 440 corresponding to that task among the shipping ports 441-444.

[0044] Each of the 32 tasks T is defined by one of the task queues TQ1 to TQ4. Task queues TQ1 to TQ4 are arranged in parallel and correspond to shipping ports 441 to 444, respectively. When the task queues are not distinguished, they are also referred to as "task queue TQ". The tasks T defined by task queues TQ1 to TQ4 are also referred to as tasks T1-1 to T1-8, T2-1 to T2-8, T3-1 to T3-8, and T4-1 to T4-8, respectively. The numbers 1 to 8 at the end indicate the processing order within each task queue.

[0045] For example, each of tasks T1-1 to T1-8 in task queue TQ1 corresponds to a task of transporting package L from its designated picking location to the shipping port 441. Similarly, each of tasks T2-1 to T2-8 in task queue TQ2 corresponds to a task of transporting package L from its designated picking location to the shipping port 442. Each of tasks T3-1 to T3-8 in task queue TQ3 corresponds to a task of transporting package L from its designated picking location to the shipping port 443. Each of tasks T4-1 to T4-8 in task queue TQ4 corresponds to a task of transporting package L from its designated picking location to the shipping port 444.

[0046] For each task queue TQ, tasks T are processed in the order shown from top to bottom in the diagram. For example, in task queue TQ1, tasks T1-1, ... T1-8 are processed in that order; in task queue TQ2, tasks T2-1, ... T2-8 are processed in that order; in task queue TQ3, tasks T3-1, ... T3-8 are processed in that order; and in task queue TQ4, tasks T4-1, ... T4-8 are processed in that order.

[0047] As described above, each task T picks up a package L at the designated picking location for that task and transports the package L to the shipping port 440 corresponding to the task queue TQ that defines the task. The coordinates of this picking location are determined by x(i,j) (Figure 3) or x(p,q) (Figure 4).

[0048] Figure 6 is a diagram illustrating task management information 232. Referring to Figure 6, task management information 232 includes picking location information 233 and processing order information 234.

[0049] The picking location information 233 indicates multiple picking locations specified for each of the multiple tasks T. These picking locations are also referred to as a "picking location set". For example, if there are 32 tasks T, the picking location set will be represented by 32 picking locations.

[0050] The processing order information 234 indicates multiple processing orders specified for each task queue TQ in the above-mentioned multiple tasks T. Specifically, the processing order information 234 indicates which task queue TQ each of these tasks T is processed in and at what position (which of tasks T1-1, ..., T4-8 each task T corresponds to). The above-mentioned multiple processing orders are also referred to as a "processing order set".
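The two parts of task management information 232 (the picking location set and the processing order set) can be represented with a simple data structure. The sketch below is only one possible encoding: the class, field, and function names are hypothetical, and the 4 × 8 queue layout follows the example in the text.

```python
from dataclasses import dataclass

NUM_QUEUES, QUEUE_LEN = 4, 8  # TQ1..TQ4 with 8 tasks each, as in the example


@dataclass(frozen=True)
class Task:
    queue: int               # 1..4, index of the task queue TQ
    position: int            # 1..8, processing order within the queue
    picking_location: tuple  # (i, j) coordinates of the storage location

    @property
    def label(self):
        """Label in the style used in the text, e.g. 'T1-1'."""
        return f"T{self.queue}-{self.position}"


def build_task_management_info(picking_locations):
    """Build the picking location set and processing order set.

    picking_locations holds 32 (i, j) tuples listed queue by queue,
    in processing order within each queue.
    """
    if len(picking_locations) != NUM_QUEUES * QUEUE_LEN:
        raise ValueError("expected one picking location per task")
    tasks = []
    for idx, loc in enumerate(picking_locations):
        queue, position = divmod(idx, QUEUE_LEN)
        tasks.append(Task(queue + 1, position + 1, loc))
    return {
        "picking_location_set": [t.picking_location for t in tasks],
        "processing_order_set": [(t.queue, t.position) for t in tasks],
        "labels": [t.label for t in tasks],
    }


# Placeholder picking locations: task k picks at row (k % 8) + 1, column (k // 4) + 1.
locations = [((k % 8) + 1, (k // 4) + 1) for k in range(32)]
info = build_task_management_info(locations)
```

Reordering tasks within a queue changes only `processing_order_set`, while the picking location attached to each task stays fixed.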

[0051] As described above, the automated warehouse 1 is subject to a constraint for avoiding collisions between shuttles (corresponding to an example of the "predetermined constraints" in this disclosure): only one of shuttles 51 to 54 can travel in each of the vertical aisles 411 to 414, while two or more of shuttles 51 to 54 can travel in the horizontal aisle 43. Under such constraints, predicting the preferred processing order of multiple tasks T in the automated warehouse 1 (which task T should be processed in what order within which task queue TQ) is important from the viewpoint of the processing efficiency of these tasks T. This is because the processing order of each task T within each task queue TQ can affect the performance of the automated warehouse 1.

[0052] Furthermore, the automated warehouse 1 faces the so-called MAPD (multi-agent pickup and delivery) problem, which concerns how to transport multiple packages L to shipping ports 441 to 444 using shuttles 51 to 54 without causing collisions. It is therefore conceivable to use the simulator 45 as a solver for this MAPD problem to predict the optimal processing order of multiple tasks T under the above constraints. However, using the simulator 45 in this way may incur a high computational cost and a long computation time. Therefore, there is a need to predict the optimal processing order of multiple tasks T in a shorter time than the simulator 45.

[0053] As described below, the terminal device 4 (Figure 1) according to the embodiment has a configuration that enables it to predict the above-mentioned preferred processing sequence in a shorter time than the simulator 45.

[0054] Figure 7 is a functional block diagram of the terminal device 4 according to the embodiment. Figure 8 is a conceptual diagram illustrating the method for generating a trained model (described later) in the embodiment. Referring to Figure 7, the terminal device 4 includes, as its functional configuration, a simulation unit 500, a condition setting unit 505, a determination unit 510, and a learning unit 515. These functions are realized by the processing unit 44 of the terminal device 4 running the simulator 45.

[0055] The simulation unit 500 performs a simulation of the operation of shuttles 51 to 54 in the automated warehouse 1 within a predetermined time, according to the conditions set by the condition setting unit 505 (described later). The simulation unit 500 performs the above simulation under the constraint that only one shuttle from among shuttles 51 to 54 can travel in each of the vertical aisles 411 to 414, while two or more shuttles from among shuttles 51 to 54 can travel in the horizontal aisle 43 as long as they do not collide. In this simulation as well, 32 tasks T are assumed, and these tasks T are defined by one of the task queues TQ1 to TQ4, as shown in the example in Figure 5. Shuttles 51 to 54 are assumed to operate in order to process these tasks T sequentially.

[0056] The condition setting unit 505 sets the execution conditions for the above simulation. Referring to Figure 8, these execution conditions are set as a combination of a picking location set and a processing sequence set. The picking location set corresponds to the plurality of picking locations (x1 to x32) specified for the 32 tasks T, respectively. The processing sequence set corresponds to the plurality of processing sequences (y1 to y32) specified for these tasks T for each task queue TQ. Each processing sequence is paired with the picking location of the same task T (for example, y1 corresponds to x1) and indicates in which task queue TQ, and at what position within it, that task T is processed. In one example, y1 to y8, y9 to y16, y17 to y24, and y25 to y32 represent the processing orders in task queues TQ1, TQ2, TQ3, and TQ4, respectively.
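
The index grouping in the example above can be illustrated with a minimal Python sketch. The function name `split_by_queue`, the constants, and the placeholder labels are hypothetical and are used only for illustration; the sketch assumes four task queues of eight tasks each, as in the example in Figure 5.

```python
from typing import Dict, List, Sequence

NUM_QUEUES = 4        # TQ1 to TQ4
TASKS_PER_QUEUE = 8   # eight tasks per queue, as in the Figure 5 example


def split_by_queue(values: Sequence) -> Dict[str, List]:
    """Group a 32-element sequence (y1..y32) into per-queue lists by index."""
    if len(values) != NUM_QUEUES * TASKS_PER_QUEUE:
        raise ValueError("expected 32 values")
    return {
        f"TQ{q + 1}": list(values[q * TASKS_PER_QUEUE:(q + 1) * TASKS_PER_QUEUE])
        for q in range(NUM_QUEUES)
    }


# Hypothetical placeholder labels standing in for y1..y32.
ys = [f"y{r}" for r in range(1, 33)]
by_queue = split_by_queue(ys)
print(by_queue["TQ1"][0], by_queue["TQ4"][-1])  # y1 y32
```

The same grouping applies to x1 to x32, since each picking location is paired with the processing sequence of the same index.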

[0057] The picking location set and processing sequence set differ for each of the combination patterns described above. For example, the combination of x1 to x32 and y1 to y32 in combination pattern 1 differs from the combination of x1 to x32 and y1 to y32 in any other combination pattern.

[0058] The simulation unit 500 executes a simulation for each possible combination pattern. In this example, the condition setting unit 505 sets the execution conditions (combinations of x1 to x32 and y1 to y32) for each of the above combination patterns, and the simulation unit 500 executes the simulation for each of the execution conditions set in this manner.

[0059] The determination unit 510 determines for each task T whether or not the task T was processed within the predetermined time in the simulation performed by the simulation unit 500. The determination unit 510 sets the result of this determination process as zi (i=1,...32). For example, the determination unit 510 determines a task T that was determined to have been processed within the predetermined time in the simulation as "processed" and sets zi to "1" for this task T. On the other hand, it determines a task T that was determined not to have been processed within the predetermined time in the simulation as "unprocessed" and sets zi to "0" for this task T.

[0060] The determination process by the determination unit 510 is performed for each of the above combination patterns. The determination result in this determination process reflects the processing status of the 32 tasks T in the simulation performed for a given combination pattern, specifically the number of tasks that were processed within a predetermined time (number of processed tasks). In the above example, the number of processed tasks corresponds to the number of zi for which "1" is defined.
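
The determination and counting described above can be sketched as follows. This is a minimal illustration, not the determination unit 510 itself: the function name `determine`, the use of completion times, and the sample values are assumptions introduced here, and "within the predetermined time" is interpreted as "less than or equal to the time limit".

```python
from typing import List, Optional


def determine(completion_times: List[Optional[float]], time_limit: float) -> List[int]:
    """Return z_i = 1 if task i finished within time_limit, else 0.

    None means the task never finished in the simulation (assumed encoding).
    """
    return [1 if t is not None and t <= time_limit else 0 for t in completion_times]


# Hypothetical completion times for four tasks; the embodiment assumes 32 tasks.
times = [12.0, None, 58.5, 61.0]
z = determine(times, time_limit=60.0)
print(z)       # [1, 0, 1, 0]
print(sum(z))  # 2 (number of processed tasks)
```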

[0061] The learning unit 515 generates a trained model 520 in the memory device 41. The trained model 520 is a neural network model, specifically a deep learning model in which the relationship between the picking location set and processing sequence set (x1 to x32, y1 to y32) in the above combination pattern and the determination result (z1 to z32) of the determination unit 510 has been learned. The trained model 520 is used to optimize the processing efficiency of the 32 tasks T defined in the task queues TQ1 to TQ4. This will be explained in more detail later.

[0062] The above combination patterns indicate from which storage location 46 the package L will be picked in each task T, and in what order those packages L will be transported to the corresponding shipping port 440. These combination patterns relate to the processing status of multiple tasks T within a predetermined time in the automated warehouse 1 (whether or not each task T has been processed). For example, the processing status in combination pattern 1 may differ from the processing status in another combination pattern. The processing status of the 32 tasks T when shuttles 51 to 54 operate for a predetermined time in each combination pattern can be predicted in advance by simulation by the simulation unit 500.

[0063] The trained model 520 is generated as a surrogate model for the simulator 45. In this embodiment, the trained model 520 can be used in place of the simulator 45 to predict a suitable processing sequence set (y1 to y32) when the order management device 3 (Figure 1) defines multiple tasks T scheduled for processing and specifies their picking location set (x1 to x32). Specifically, the trained model 520 can be used to infer the prediction results of the simulator 45 for the processing status (z1 to z32) of each task T for various combinations of the picking location set and a certain processing sequence set.

[0064] As a result, a single set of processing sequences that maximizes the number of processed tasks based on the inferred prediction results can be predicted as the preferred processing sequence set. In other words, by searching for and identifying a solution to a combinatorial optimization problem to optimize the combination patterns of a given set of picking locations and the set of processing sequences as variables, the processing sequence set indicated by the identified combination pattern can be treated as the predicted result of the preferred processing sequence set.

[0065] Furthermore, since the trained model 520 is a surrogate model for the simulator 45, it is generated taking into account the aforementioned constraints regarding the movement of shuttles 51-54. As a surrogate model, the trained model 520 can predict the above-mentioned preferred processing sequence set in a shorter time and with lower computational cost than the simulator 45 from which it is generated. As will be described in detail later, the inventors have experimentally demonstrated that the computation time for this prediction has been significantly reduced. From the above, according to the embodiment, the preferred processing sequence for each task T, taking into account the constraints for avoiding collisions between shuttles, can be predicted in a shorter time than the simulator 45.

[0066] Figure 9 is a diagram illustrating the detailed functional configuration of the learning unit 515. Referring to Figure 9, the learning unit 515 includes, as its functional configuration, an input vector group generation unit 530, a learnable vector group generation unit 535, an output vector group generation unit 540, a fully connected unit 545, a determination result vector generation unit 547, and an adjustment unit 550. These functions are executed for each combination pattern (x1 to x32, y1 to y32) of picking location sets and processing sequence sets set by the condition setting unit 505.

[0067] The input vector group generation unit 530 generates an input vector group 555 according to the combination pattern set by the condition setting unit 505. The detailed functional configuration of the input vector group generation unit 530 and the specific method for generating the input vector group 555 will be explained in detail later. The input vector group 555 contains 32 input vectors 557 that are generated in parallel, the same number as the number of tasks T assumed in the simulation. Each of these tasks T is defined by one of the task queues TQ1 to TQ4, as in the example in Figure 5, and is processed sequentially within the task queue TQ that defines the task T.

[0068] Each input vector 557 is a feature vector that represents the characteristics of the above combination pattern and has a predetermined dimension. In the following description, 128 dimensions will be used as the predetermined dimension. "128 dimensions" is an example of a dimension larger than the total number of storage locations 46 in the automated warehouse 1 (8 × 8 = 64 in the example of Figures 2 and 3). The input vector group 555 is input to the learnable vector group generation unit 535 and the output vector group generation unit 540, respectively, which are described below.

[0069] The learnable vector group generator 535 has an attention mechanism. The attention mechanism is, for example, the self-attention mechanism of a transformer. The learnable vector group generator 535 generates a learnable vector group 560 according to the input vector group 555 using the attention mechanism. The learnable vector group 560 contains 32 learnable vectors 562 that are generated in parallel, the same number as the input vectors 557. Each learnable vector 562 represents the degree of correlation between the input vectors 557 and is used for feature output of the input vector group 555. In this example, each learnable vector 562 is a 128-dimensional vector, similar to the input vectors 557. The learnable vector group generator 535 outputs the learnable vector group 560 to the output vector group generator 540.
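
For reference, one common form of transformer self-attention is scaled dot-product attention. The following is a minimal sketch under stated assumptions, not the configuration of the learnable vector group generator 535: the learned query, key, and value projections of a real attention layer are omitted (identity projections are assumed), the function names are hypothetical, and small 4-dimensional vectors stand in for the 128-dimensional input vectors 557.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax; the returned weights sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(inputs: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity projections (assumption)."""
    d = len(inputs[0])
    outputs = []
    for q in inputs:
        # Similarity of this vector to every input vector, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in inputs]
        weights = softmax(scores)
        # Weighted average of all input vectors.
        outputs.append([sum(w * v[i] for w, v in zip(weights, inputs)) for i in range(d)])
    return outputs


# Toy stand-ins for input vectors 557 (three vectors instead of 32).
toy = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
out = self_attention(toy)
print(len(out), len(out[0]))  # 3 4
```

As in the text, the sketch produces one vector per input vector, each with the same dimension as the inputs, and each result depends on how strongly the corresponding input correlates with the others.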

[0070] The output vector group generator 540 is, for example, an encoder for a transformer. The output vector group generator 540 generates the output vector group 565 by transforming the input vector group 555 according to the learnable vector group 560. The output vector group 565 contains 32 output vectors 567 generated in parallel, the same number as the input vectors 557, and represents the characteristics of the input vector group 555. Each output vector 567 is a 128-dimensional vector, just like the input vectors 557 in this example. The output vector group generator 540 outputs the output vector group 565 to the fully connected unit 545.

[0071] The fully connected unit 545 calculates a scalar quantity 571 for each output vector 567 by fully connecting the 128 components (elements) of that output vector (specifically, by inputting a linear combination of all these components into a predetermined activation function). This generates a prediction result vector 570 containing the 32 scalar quantities 571 as components. Each scalar quantity 571 of the prediction result vector 570 represents the predicted processing status (processed / unprocessed) of the corresponding task T. For example, the scalar quantity 571 is set to "1" for a task T predicted to be determined as "processed" by the determination unit 510, and to "0" for a task T predicted to be determined as "unprocessed" by the determination unit 510. The prediction result vector 570 reflects the characteristics of the combination pattern set by the condition setting unit 505. The fully connected unit 545 outputs the prediction result vector 570 to the adjustment unit 550.
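
The full connection of one output vector into one scalar can be sketched as follows. The text only states that a predetermined activation function is used; the sigmoid function, the 0.5 threshold, the function name `fully_connected_scalar`, and the sample weights are assumptions made here for illustration, and a 3-dimensional vector stands in for a 128-dimensional output vector 567.

```python
import math
from typing import List


def fully_connected_scalar(vector: List[float], weights: List[float], bias: float) -> float:
    """Linear combination of all components, passed through a sigmoid (assumed activation)."""
    s = sum(w * v for w, v in zip(weights, vector)) + bias
    return 1.0 / (1.0 + math.exp(-s))


# Hypothetical 3-dimensional stand-in for one output vector 567.
output_vector = [0.5, -1.0, 2.0]
weights = [1.0, 0.5, 0.25]
score = fully_connected_scalar(output_vector, weights, bias=0.0)

# Assumed thresholding into "processed" (1) or "unprocessed" (0).
status = 1 if score >= 0.5 else 0
print(round(score, 4), status)  # 0.6225 1
```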

[0072] The determination result vector generation unit 547 generates a determination result vector 572 that indicates the determination result of the determination unit 510. In this example, the determination result vector 572 is a 32-dimensional vector and has 32 components that correspond to the processing status of the 32 tasks T in the simulation described above. Each component has, for example, "1" if the corresponding task T is determined to be processed by the determination unit 510, and "0" if the corresponding task T is determined to be unprocessed by the determination unit 510. The determination result vector generation unit 547 outputs the determination result vector 572 to the adjustment unit 550.

[0073] The adjustment unit 550 adjusts each component of the 32 input vectors 557 and each component of the 32 learnable vectors 562, through the input vector group generation unit 530 and the learnable vector group generation unit 535, so that a predetermined error function based on the prediction result vector 570 and the determination result vector 572 is optimized (adjustment process). The error function is, for example, the sum, over all of the above possible combination patterns, of the squared magnitude of the difference between the prediction result vector 570 and the determination result vector 572. Specifically, the adjustment process updates the above components (which correspond to multiple weight parameters in deep learning) repeatedly until the error function falls below a predetermined criterion. As a result of the adjustment process, each component of the prediction result vector 570 approaches the corresponding component of the determination result vector 572 as closely as possible. Consequently, the prediction result vector 570 accurately reflects the results that the simulator 45 would give for the processing status (processed / unprocessed) of each task T.
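
The example error function described above, a sum of squared differences over combination patterns, can be written as the following sketch. The function names and the sample vectors are hypothetical; two patterns with two tasks each stand in for the full set of patterns and the 32 tasks T.

```python
from typing import List, Sequence


def squared_error(prediction: Sequence[float], target: Sequence[float]) -> float:
    """Squared magnitude of the difference between two vectors."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target))


def total_error(predictions: List[Sequence[float]], targets: List[Sequence[float]]) -> float:
    """Sum of the squared errors over all combination patterns."""
    return sum(squared_error(p, t) for p, t in zip(predictions, targets))


# Hypothetical prediction result vectors and determination result vectors.
predictions = [[0.9, 0.2], [0.4, 0.7]]
targets = [[1, 0], [0, 1]]
print(round(total_error(predictions, targets), 2))  # 0.3
```

The error is zero only when every predicted component equals the corresponding simulated result, which is why reducing it brings the prediction result vector closer to the determination result vector.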

[0074] According to the above-described components of the learning unit 515, in addition to the characteristics of the picking location set (x1 to x32) and processing sequence set (y1 to y32) of the combination pattern set by the condition setting unit 505, the characteristics of the correlation between the input vectors 557 are reflected in the output vector group 565 and the prediction result vector 570. That is, for each task T, the correlation between the picking location and the processing sequence (xr,yr) specified in that task and the picking location and the processing sequence (xs,ys) specified in another task T is reflected in the output vector group 565 and the prediction result vector 570 (1≦r≦32, 1≦s≦32, r≠s). xr represents the coordinates of the r-th picking location (1≦r≦32) out of 32 picking locations. yr represents the r-th processing sequence out of 32 processing sequences. Then, each component of the input vector group 555 and each component of the learnable vector group 560 are adjusted so that the aforementioned error function is optimized (for example, minimized).

[0075] As a result, the trained model 520 is appropriately generated by considering the characteristics of the picking location set (x1 to x32) itself, the characteristics of the processing sequence set (y1 to y32) itself, and the characteristics of the correlation described above. Consequently, the aforementioned inference can be appropriately performed using the trained model 520. That is, without using the simulator 45, the prediction results of the processing status of each task T by the simulator 45 can be appropriately inferred based on the values (1 or 0) of each component of the prediction result vector 570. Therefore, based on this inference result, the optimal processing sequence set (y1 to y32) for a given picking location set (x1 to x32) can be predicted as the solution to the aforementioned combinatorial optimization problem.

[0076] Figure 10 is a diagram illustrating the detailed functional configuration of the input vector group generation unit 530. Referring to Figure 10, the input vector group generation unit 530 includes feature vector group generation units 580 and 582, a concatenation unit 584, and a fully connected unit 586.

[0077] The feature vector group generator 580 generates a feature vector group 590 according to the 32 picking locations (x1 to x32) of the combination pattern picking location set defined by the condition setting unit 505. The feature vector group 590 contains 32 feature vectors 592 generated in parallel, the same number as the number of tasks T. These feature vectors 592 each represent the features of the 32 picking locations. In this example, each feature vector 592 is a 128-dimensional vector, similar to the input vector 557. The feature vector 592 representing the features of the r-th picking location (xr) among the 32 picking locations is also referred to as feature vector 592_r. Feature vectors 592_1 to 592_32 are generated according to x1 to x32, respectively. The feature vector group generator 580 outputs the feature vector group 590 to the concatenation unit 584. The feature vector group generator 580 corresponds to an example of the "first generator" in this disclosure.

[0078] Figure 11 is a diagram illustrating an example of the method by which the feature vector group generator 580 generates the feature vector group 590. Referring to Figure 11, the feature vector group generator 580 generates the feature vector group 590 by embedding the information (coordinates) of 32 picking locations in the picking location set into a 128-dimensional vector space. The coordinates (x1, ... x32) of the 32 picking locations in real space are represented by x(i1, j1), ... x(i32, j32), respectively. Each of these coordinates corresponds to x(i, j) in Figure 3. The above embedding process is performed in parallel, for example, using a predetermined function that converts 2-dimensional vectors into 128-dimensional vectors. According to the above embedding, these picking locations are each represented by different 128-dimensional vectors 592_1, ... 592_32.

[0079] Figure 12 illustrates another example of the method by which the feature vector group generator 580 generates the feature vector group 590. Referring to Figure 12, in this example, the coordinates (x1, ... x32) of the 32 picking locations in the picking location set are represented by x(p1, q1), ... x(p32, q32), respectively. Each of x(p1, q1), ... x(p32, q32) corresponds to x(p, q) in Figure 4. The feature vector group generator 580 generates tensors 591_1, ... 591_32, respectively, by tensorizing the coordinates of the 32 picking locations. Tensors 591_1 to 591_32 represent x(p1, q1), ... x(p32, q32), respectively.

[0080] For example, each tensor is represented as a one-hot tensor corresponding to the coordinates of cell 415 in Figure 4. In one example, the tensor representing the topmost and leftmost cell (1,1) 415 in Figure 4 has only its 1st row and 1st column component as 1, with all other components being 0. Similarly, the tensor representing the bottommost and rightmost cell (8,4) 415 in Figure 4 has only its 8th row and 4th column component as 1, with all other components being 0.
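
A one-hot tensor of this kind can be sketched as a small grid of zeros with a single 1. The function name `one_hot_tensor` is hypothetical, and the 8-row by 4-column shape is an assumption inferred from the cell indices (1,1) and (8,4) mentioned above.

```python
from typing import List


def one_hot_tensor(row: int, col: int, rows: int = 8, cols: int = 4) -> List[List[int]]:
    """Return a rows x cols grid of zeros with a single 1 at the 1-based (row, col)."""
    if not (1 <= row <= rows and 1 <= col <= cols):
        raise ValueError("cell index out of range")
    grid = [[0] * cols for _ in range(rows)]
    grid[row - 1][col - 1] = 1
    return grid


top_left = one_hot_tensor(1, 1)
bottom_right = one_hot_tensor(8, 4)
print(top_left[0])       # [1, 0, 0, 0]
print(bottom_right[-1])  # [0, 0, 0, 1]
```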

[0081] The feature vector group generator 580 generates a feature vector group 590 by performing a predetermined convolution operation on each of the tensors 591_1 to 591_32. This convolution operation is performed in parallel with predetermined pooling, flattening, and affine transformations. In this example, the coordinates (x1 to x32) of the 32 picking locations in the picking location set are each represented by distinct 128-dimensional vectors 592_1, ... 592_32.

[0082] Referring again to Figure 10, the feature vector group generator 582 generates a feature vector group 594 according to the 32 processing sequences (y1 to y32) of the processing sequence set of the combination pattern defined by the condition setting unit 505. The feature vector group 594 contains 32 feature vectors 596 generated in parallel, the same number as the number of tasks T. These feature vectors 596 each represent the features of the 32 processing sequences. In this example, each feature vector 596 is a 128-dimensional vector, similar to the input vector 557. The feature vector 596 representing the features of the r-th processing sequence (yr) among the 32 processing sequences is also referred to as feature vector 596_r. Feature vectors 596_1 to 596_32 are generated according to y1 to y32, respectively. The feature vector group generator 582 outputs the feature vector group 594 to the concatenation unit 584. The feature vector group generator 582 corresponds to an example of the "second generator" in this disclosure.

[0083] Figure 13 illustrates an example of the method by which the feature vector group generator 582 generates the feature vector group 594. Referring to Figure 13, the feature vector group generator 582 generates the feature vector group 594 by embedding information of 32 processing orders (y1, ... y32) of the processing order set into a 128-dimensional vector space. According to the above embedding, these processing orders are each represented by distinct 128-dimensional vectors 596_1, ... 596_32.

[0084] Figure 14 illustrates another example of the method by which the feature vector group generator 582 generates the feature vector group 594. Referring to Figure 14, the feature vector group generator 582 generates tensors 595_1 to 595_32 by tensorizing the information of the 32 processing orders (y1, ... y32) of the processing order set. Each tensor represents the processing order of the corresponding task T.

[0085] For example, each tensor is represented as a one-hot tensor that represents the position of the corresponding task T in the task queue TQ (Figure 5) that defines the task T. In one example, if a task T is task T1-1, which is processed first in task queue TQ1, the tensor representing the processing order of this task T has only the first row and first column component as 1, and all other components as 0. Similarly, if a task T is task T4-8, which is processed last in task queue TQ4, the tensor representing the processing order of this task T has only the eighth row and fourth column component as 1, and all other components as 0.

[0086] The feature vector group generator 582 generates a feature vector group 594 by performing a predetermined convolution operation on each of the tensors 595_1 to 595_32. As a result, the 32 processing orders (y1, ... y32) of the processing order set are each represented by distinct 128-dimensional vectors 596_1, ... 596_32.

[0087] Referring again to Figure 10, the concatenation unit 584 generates a concatenated vector group 597 by concatenating the feature vector group 590 and the feature vector group 594. The concatenated vector group 597 contains 32 concatenated vectors 599 that are generated in parallel, the same number as the number of feature vectors 592 and 596. For each value of r, the concatenation unit 584 generates a concatenated vector 599_r by concatenating the feature vector 592_r and the feature vector 596_r. Since each of the feature vectors 592_r and 596_r has 128 dimensions, the concatenated vector 599_r has 256 (=128+128) dimensions. The concatenation unit 584 outputs the concatenated vector group 597 to the fully connected unit 586. The concatenation unit 584 corresponds to an example of the "third generation unit" of this disclosure.

[0088] The fully connected unit 586 generates 32 input vectors 557 (128-dimensional vectors) as an input vector group 555 by fully connecting the components of each concatenated vector 599 and thereby reducing the dimension of these vectors. The input vector group 555 is input to the learnable vector group generation unit 535 (attention mechanism) and the output vector group generation unit 540 (transformer encoder), respectively.
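
The concatenation and the subsequent dimension reduction can be sketched as follows. This is a minimal illustration under stated assumptions: the full connection is modeled as a plain linear map without an activation function, the weights are random placeholders rather than learned values, and the function names are hypothetical.

```python
import random
from typing import List

FEATURE_DIM = 128  # dimension of feature vectors 592_r and 596_r in the text


def concatenate(a: List[float], b: List[float]) -> List[float]:
    """Join two feature vectors end to end (128 + 128 = 256 components)."""
    return a + b


def fully_connect(vector: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Linear map: one output component per weight row (assumed form of full connection)."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


rng = random.Random(0)
# Random placeholders for a picking location feature and a processing order feature.
location_feature = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
order_feature = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
# Placeholder 128 x 256 weight matrix; in training these values would be learned.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(2 * FEATURE_DIM)] for _ in range(FEATURE_DIM)]
bias = [0.0] * FEATURE_DIM

concatenated = concatenate(location_feature, order_feature)
input_vector = fully_connect(concatenated, weights, bias)
print(len(concatenated), len(input_vector))  # 256 128
```

Because every output component mixes components from both halves of the concatenated vector, each resulting 128-dimensional vector depends jointly on the picking location feature and the processing order feature.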

[0089] According to the feature vector group generation unit 580, the features of the picking location set (x1 to x32) are reflected in the feature vector group 590. According to the feature vector group generation unit 582, the features of the processing order set (y1 to y32) are reflected in the feature vector group 594. According to the concatenation unit 584, the features of the feature vector group 590 itself and the features of the feature vector group 594 itself are reflected together in the concatenated vector group 597. According to the fully connected unit 586, for each value of r, the 256-dimensional concatenated vector 599_r is transformed into a 128-dimensional input vector 557_r. As a result, the features of the correlation between feature vectors 592_r and 596_r (the correlation between the picking location and the processing order of each task T) are reflected in the input vector 557_r.

[0090] Therefore, for each task T, a trained model 520 is generated by considering the characteristics of the correlation between the picking location (xr) and the processing order (yr) related to it. As a result, the aforementioned inference can be appropriately performed using the trained model 520. Consequently, a suitable set of processing orders (y1 to y32) can be predicted based on this inference result.

[0091] Figure 15 is a diagram illustrating the detailed method of inference using the trained model 520. Referring to Figure 15, when the terminal device 4 generates the trained model 520, it transmits information representing this model to the task management device 2.

[0092] The task management device 2 receives the above information transmitted from the terminal device 4 and stores it in the storage 23 as a trained model 602. The trained model 602 is the same as the trained model 520. After the trained model 602 is stored in the storage 23, the network controller 26 of the task management device 2 receives a task processing command INS from the order management device 3. This command is a signal that instructs the automated warehouse 1 to process 32 tasks T sequentially. These tasks T involve transporting 32 packages L, each stored in one of the 32 storage locations 46 of the automated warehouse 1, to the shipping ports 441 to 444 using shuttles 51 to 54. The task processing command INS includes information that specifies a set of picking locations representing 32 picking locations corresponding to each of these packages L.

[0093] The processor 21 of the task management device 2 includes, as part of its functional configuration, an inference unit 605 and a search unit 610. These functions are realized when the processor 21 executes program 231 (Figure 1).

[0094] The inference unit 605 performs inference processing using the trained model 602 in response to the task processing command INS. The inference processing predicts how many of the 32 tasks T specified in the task processing command INS would be determined to be processed within the predetermined time in a simulation by the simulator 45 (this number is hereinafter referred to as the "predicted value"). Specifically, the inference processing includes (1) a process of inferring the results (z1 to z32) that the simulator 45 would give for the processing status of each task T, according to the picking location set (x1 to x32) specified in the task processing command INS and a processing sequence set (provisional y1 to y32) that has been provisionally set for this inference, and (2) a process of calculating the predicted value (the number of zi for which "1" is defined) based on the inference results.

[0095] The search unit 610 provisionally sets a set of processing sequences to be used in the inference process and searches for and identifies a single set of processing sequences (optimal sequence set) that optimizes the predicted value. This search process is equivalent to searching for and identifying the solution to the aforementioned combinatorial optimization problem. In one example, the search unit 610 provisionally sets all possible patterns of processing sequence sets and predicts that the single set of processing sequences that maximizes the predicted value is the optimal sequence set. Alternatively, the search unit 610 may calculate the gradient of the predicted value for a predetermined number of processing sequence sets and identify the optimal sequence set according to the calculation results.
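
The exhaustive variant of this search can be sketched on a toy problem. Everything in the sketch is an assumption for illustration: `toy_predict` is a hand-written stand-in for the trained model 602 (not a neural network), it handles a single queue of four tasks instead of 32 tasks in four queues, and the per-task costs and the time limit are invented values.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def toy_predict(costs: Sequence[float], order: Sequence[int], limit: float) -> List[int]:
    """Stand-in for the trained model: z_i = 1 if task i finishes within limit
    when the tasks are processed one after another in the given order."""
    z = [0] * len(costs)
    elapsed = 0.0
    for task in order:
        elapsed += costs[task]
        if elapsed <= limit:
            z[task] = 1
    return z


def search_best_order(costs: Sequence[float], limit: float) -> Tuple[Tuple[int, ...], int]:
    """Try every processing order and keep the first one with the highest predicted count."""
    best_order: Tuple[int, ...] = ()
    best_count = -1
    for order in permutations(range(len(costs))):
        count = sum(toy_predict(costs, order, limit))  # predicted number of processed tasks
        if count > best_count:
            best_order, best_count = order, count
    return best_order, best_count


# Hypothetical per-task costs (0-based task indices) and time limit.
costs = [5.0, 1.0, 3.0, 2.0]
order, count = search_best_order(costs, limit=6.0)
print(order, count)  # (1, 2, 3, 0) 3
```

In this toy case, placing the three cheapest tasks first lets three of the four tasks finish within the limit, so the search selects an order that begins with those tasks.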

[0096] The set of processing sequences identified by the search unit 610 is used as a predicted result of a preferred set of processing sequences to sequentially process the 32 tasks T specified in the task processing command INS. In other words, shuttles 51 to 54 sequentially process these tasks T according to the 32 processing sequences of the identified set of processing sequences. This optimizes the processing efficiency in the automated warehouse 1 and maximizes the number of processed tasks.

[0097] Figure 16 shows experimental results demonstrating that the trained model 520 (602) as a surrogate model can approximate the simulator 45 with high accuracy. In this example, the feature vector group generator 580 generates the feature vector group 590 by embedding (Figure 11), and the feature vector group generator 582 generates the feature vector group 594 by embedding (Figure 13). Figure 17 shows the values of various hyperparameters for the trained model 520 set in the above experiment.

[0098] Referring to Figure 16, the horizontal axis represents the number of training cycles performed by the learning unit 515 when generating the trained model 520. Specifically, this number represents the number of adjustments performed by the adjustment unit 550 on each component of the input vector group 555 and each component of the learnable vector group 560 (the number of weight parameter updates in deep learning). The vertical axis represents the error function mentioned above. Specifically, this error function represents the error between the number of processed tasks in the simulator 45 simulation and the predicted number of processed tasks inferred using the trained model 520.

[0099] Line 905 represents the error function (training error) of the trained model 520 during the training phase. Line 907 represents the error function (test error) of the trained model 520 during the test phase. As shown in the figure, both the training error and the test error decreased sufficiently and converged as the number of training iterations increased. Therefore, the inventors were able to confirm that the trained model 520 was properly generated and approximated the simulator 45 with high accuracy.

[0100] Although not shown in the figures, the inventors were able to confirm that, in cases where the feature vector group generation unit 580 generates the feature vector group 590 by convolution (Figure 12), or where the feature vector group generation unit 582 generates the feature vector group 594 by convolution (Figure 14), the trained model 520 is appropriately generated and approximates the simulator 45 with high accuracy, similar to the example in Figure 16.

[0101] The inventors also experimentally confirmed that, under the hyperparameter conditions shown in Figure 17, the trained model 520 could predict the processing status (z1 to z32) and the preferred processing sequence set for each task T approximately 30 times faster than the simulator 45. Thus, the inventors demonstrated that the trained model 520 is superior in both inference accuracy and computation speed.

[0102] Figure 18 is a flowchart illustrating an example of processing performed by the terminal device 4 in this embodiment. This flowchart begins, for example, when the input device 42 of the terminal device 4 receives a user operation instructing the generation of a trained model 520. Hereinafter, each step will be abbreviated as "S".

[0103] Referring to Figure 18, terminal device 4 performs a simulation of the operation of shuttles 51-54 in the automated warehouse 1 within a predetermined time for each combination pattern of picking location sets and processing sequence sets it has set (S105). Terminal device 4 determines whether each task T assumed in the simulation was processed within the predetermined time (S110). This determination process is performed for each combination pattern. Terminal device 4 then generates a trained model 520 in which the relationship between the above combination patterns and the determination results in S110 has been learned (S115). After that, the process ends.
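The data-generation part of this flow (S105 and S110) can be sketched as follows. The simulator here is a deliberately simple, hypothetical stand-in for the simulator 45: a single shuttle serves tasks in sequence order, and the cost of a task is assumed to equal its picking location value. The names `toy_simulate` and `build_training_set` are illustrative only.

```python
import itertools


def toy_simulate(picking_locations, sequence, time_limit):
    """Hypothetical stand-in for simulator 45 (not the source's simulator).

    One shuttle processes tasks in `sequence` order; task i is assumed to
    cost picking_locations[i] time units. Returns one 0/1 flag per task.
    """
    flags = [0] * len(picking_locations)
    elapsed = 0
    for task in sequence:
        elapsed += picking_locations[task]
        if elapsed > time_limit:
            break  # the predetermined time has run out
        flags[task] = 1
    return flags


def build_training_set(location_sets, time_limit):
    """Pair every combination pattern with its per-task determination result."""
    dataset = []
    for locations in location_sets:
        for sequence in itertools.permutations(range(len(locations))):
            flags = toy_simulate(locations, sequence, time_limit)  # S105, S110
            dataset.append(((tuple(locations), sequence), flags))
    return dataset
```

The resulting list of (combination pattern, determination result) pairs is the kind of data from which a model would be trained in S115.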

[0104] Figure 19 is a flowchart showing the detailed steps of the process for generating the trained model 520 in S115. Figure 9 will be referred to as appropriate in the following explanation.

[0105] Referring to Figure 19, terminal device 4 generates input vector group 555 according to the combination pattern of the set picking location set and processing sequence set (S205). Note that S205, and S210 to S220 described below, are executed for each combination pattern.

[0106] Terminal device 4 generates a learnable vector group 560 according to the input vector group 555 using the transformer's attention mechanism (S210). Terminal device 4 generates an output vector group 565 by transforming the input vector group 555 according to the learnable vector group 560 (S215). Terminal device 4 generates a prediction result vector 570 by fully connecting the components of each output vector 567 of the output vector group 565 (S220).
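S210 to S220 can be illustrated with a minimal sketch of scaled dot-product self-attention followed by a fully connected layer. This is a simplification under stated assumptions: it omits the learned query, key, and value projections of a full transformer, and the sigmoid output is an assumption chosen so that each component reads as a processed/not-processed score. All function names are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(inputs):
    """S210 (sketch): correlation weights between every pair of input vectors."""
    scale = math.sqrt(len(inputs[0]))
    return [
        softmax([sum(a * b for a, b in zip(query, key)) / scale for key in inputs])
        for query in inputs
    ]


def transform(inputs, weights):
    """S215 (sketch): each output vector is a weighted sum of the input vectors."""
    dim = len(inputs[0])
    return [
        [sum(w * vec[c] for w, vec in zip(row, inputs)) for c in range(dim)]
        for row in weights
    ]


def fully_connect(outputs, fc_weights, bias):
    """S220 (sketch): one score in (0, 1) per output vector (sigmoid assumed)."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * o for w, o in zip(fc_weights, vec)) + bias)))
        for vec in outputs
    ]
```

For example, `fully_connect(transform(x, attention_weights(x)), w, b)` yields one score per task for an input vector group `x`.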

[0107] Terminal device 4 determines whether a predetermined error function based on the prediction result vector 570 and the judgment result vector 572 has been optimized for all combination patterns (S225). Terminal device 4 determines, for example, whether this error function is less than a predetermined threshold.

[0108] If the error function is optimized (YES in S225), the process ends. On the other hand, if the error function is not yet optimized (NO in S225), the terminal device 4 appropriately updates and adjusts each component of the input vector group 555 and each component of the learnable vector group 560 for all combination patterns (S230). Then, the terminal device 4 executes S205 to S220 again based on these adjusted components. The terminal device 4 executes S205 to S220 and S230 until the error function is optimized.
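The loop formed by S225 and S230 can be sketched as below. The model is a hypothetical two-parameter linear stand-in rather than the attention model of the embodiment, the error is assumed to be a mean squared error, and the adjustment is assumed to be plain gradient descent; `max_updates` is an added safeguard that the source does not describe.

```python
def train(samples, threshold=1e-4, learning_rate=0.1, max_updates=10_000):
    """Repeat forward pass, error check, and adjustment until the error is small.

    samples: list of (x, target) pairs for a toy model pred = w * x + b.
    Returns (w, b, error, number_of_updates_performed).
    """
    w, b = 0.0, 0.0
    error = float("inf")
    for update in range(max_updates):
        # Forward pass (stands in for S205 to S220).
        preds = [w * x + b for x, _ in samples]
        error = sum((p - t) ** 2 for p, (_, t) in zip(preds, samples)) / len(samples)
        # S225: stop once the error is below the predetermined threshold.
        if error < threshold:
            return w, b, error, update
        # S230: adjust the parameters along the negative gradient.
        grad_w = sum(2 * (p - t) * x for p, (x, t) in zip(preds, samples)) / len(samples)
        grad_b = sum(2 * (p - t) for p, (_, t) in zip(preds, samples)) / len(samples)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, error, max_updates
```

The number of updates returned corresponds to the count plotted on the horizontal axis of a training curve.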

[0109] Figure 20 is a flowchart showing the detailed procedure for generating the input vector group 555 in S205. Figure 10 will be referred to as appropriate in the following explanation.

[0110] Referring to Figure 20, terminal device 4 generates a feature vector group 590 by embedding or convolution according to the picking location set (x1 to x32) of the combination pattern that has been set (S305). Terminal device 4 generates a feature vector group 594 by embedding or convolution according to the processing sequence set (y1 to y32) of the same combination pattern (S310).

[0111] Terminal device 4 generates a concatenated vector group 597 by concatenating feature vector group 590 and feature vector group 594 (S315). Terminal device 4 generates an input vector group 555 by fully connecting the components of each concatenated vector in the concatenated vector group 597, thereby reducing the dimensionality of these vectors (S320). After S320, the process proceeds to S210 in Figure 19.
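The embedding variant of S305 to S320 can be sketched as follows. The table sizes, the small dimension, and the random initial values are assumptions for illustration (the embodiment uses 128 dimensions), and `make_table` and `build_input_vectors` are hypothetical names. Bias terms and activation functions are omitted.

```python
import random


def make_table(num_rows, dim, rng):
    """Random matrix used both as an embedding table and as a weight matrix."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(num_rows)]


def build_input_vectors(locations, orders, loc_table, order_table, fc_matrix):
    """Embed, concatenate, then reduce the dimension with a fully connected layer.

    locations: picking location index per task (x values, assumed integer ids).
    orders: processing order index per task (y values, assumed integer ids).
    fc_matrix: one row per output component, each row as long as the
    concatenated vector.
    """
    inputs = []
    for loc, order in zip(locations, orders):
        location_vec = loc_table[loc]          # S305: embedding lookup
        order_vec = order_table[order]         # S310: embedding lookup
        concatenated = location_vec + order_vec  # S315: list concatenation
        reduced = [
            sum(w * c for w, c in zip(row, concatenated)) for row in fc_matrix
        ]                                      # S320: fully connected reduction
        inputs.append(reduced)
    return inputs
```

With 4-dimensional embeddings, each concatenated vector has 8 components and a 4-row `fc_matrix` brings it back to 4, mirroring the reduction to the predetermined dimension.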

[0112] Figure 21 is a flowchart illustrating an example of the process executed by the task management device 2. This flowchart is initiated in response to the task management device 2 receiving the task processing command INS (Figure 1).

[0113] Referring to Figure 21, the task management device 2 determines the picking location set (x1 to x32) specified in the task processing command INS (S405). The task management device 2 provisionally sets a task sequence set (provisional y1 to y32) for inferring the predicted value of the number of tasks processed by the simulator 45 (S410). The task management device 2 uses the trained model 602 to infer the above predicted value according to the picking location set determined in S405 and the task sequence set provisionally set in S410 (S415).

[0114] Task management device 2 determines whether the predicted value inferred in S415 has been optimized (S420). For example, task management device 2 determines whether this predicted value is equal to or greater than a predetermined threshold value.

[0115] If the predicted value is not optimized (NO in S420), the task management device 2 executes S410 and S415 again, thereby searching for a task sequence set that optimizes the predicted value. On the other hand, if the predicted value is optimized (YES in S420), the task management device 2 predicts that the provisional task sequence set used to infer the predicted value in the S415 executed immediately before S420 is the preferred task sequence set (S425). Subsequently, the task management device 2 sends transport commands to shuttles 51-54 so that the multiple packages L (32 packages L in this example) specified in the task processing command INS are transported to the shipping ports 441-444 according to the preferred task sequence set predicted in S425 (S430).
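The search in S410 to S425 can be sketched as a loop that proposes provisional sequences and scores them with a surrogate. Random shuffling as the proposal strategy, the `max_trials` limit, and the toy surrogate are all assumptions; the source does not specify how provisional sequences are chosen. The names `search_sequence` and `toy_surrogate` are hypothetical.

```python
import random


def toy_surrogate(picking_locations, sequence, time_limit=3):
    """Hypothetical stand-in for trained model 602: predicted processed count."""
    elapsed, count = 0, 0
    for task in sequence:
        elapsed += picking_locations[task]
        if elapsed > time_limit:
            break
        count += 1
    return count


def search_sequence(picking_locations, surrogate, threshold, max_trials=1000, seed=0):
    """Propose sequences until the predicted value reaches the threshold."""
    rng = random.Random(seed)
    best_sequence, best_value = None, float("-inf")
    for _ in range(max_trials):
        candidate = list(range(len(picking_locations)))
        rng.shuffle(candidate)                            # S410: provisional set
        value = surrogate(picking_locations, candidate)   # S415: inference
        if value > best_value:
            best_sequence, best_value = candidate, value
        if value >= threshold:                            # S420: optimized?
            break                                         # S425: adopt candidate
    return best_sequence, best_value
```

Because each call to the surrogate is cheap compared with a full simulation, many provisional sequences can be scored before transport commands are issued.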

[0116] As described above, according to the embodiment, when a picking location set is specified in the task processing command INS, a preferred task sequence set for multiple tasks T can be predicted using the trained model 520 (602). This prediction process can be performed appropriately with an accuracy close to that of the simulator 45, and in a shorter time than the simulator 45. As a result, the processing efficiency of multiple tasks T in the automated warehouse 1 can be appropriately optimized.

[0117] [Modifications] In this embodiment, a trained model 520 is generated for processing 32 tasks T that pick and unload 32 packages L (Figure 5). Alternatively, a trained model 520 may be generated for processing n×m tasks that pick and unload an arbitrary number n×m of packages. In this case, the trained model 602 can be used to predict a suitable processing sequence for the set of picking locations specified in the task processing command INS that defines the n×m tasks.

[0118] In this embodiment, the predetermined dimension is 128, meaning that each of the feature vectors 592 and 596, the input vector 557, the learnable vector 562, and the output vector 567 has 128 dimensions. However, the predetermined dimension may be a number other than 128 (for example, another number of dimensions greater than the total number of storage locations 46 (=64)).

[0119] In this embodiment, the functions of the simulation unit 500, the condition setting unit 505, the determination unit 510, and the learning unit 515 are implemented by the terminal device 4. However, some or all of these functions may be implemented by the task management device 2. In this case, the task management device 2 and the terminal device 4, or the task management device 2 alone, correspond to an example of the "learning system" of this disclosure.

[0120] In this embodiment, the "predetermined constraints" of this disclosure are constraints to avoid collisions between conveying devices, but other constraints may be added. For example, constraints may be added such as transporting a specific package out of a group of packages to a specific exit, and / or processing the specific package within a specific order (time). Also, in this embodiment, as a constraint to avoid collisions between conveying devices, only one of the shuttles 51 to 54 can travel in each of the vertical aisles 411 to 414, while two or more of the shuttles 51 to 54 can travel in the horizontal aisle 43, but other constraints may be used. For example, two or more shuttles may be allowed to travel in each of the vertical aisles 411 to 414 as long as they do not collide. The "predetermined constraints" may be changed to constraints that are appropriate depending on the storage location and aisles of the warehouse in question.
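The aisle constraint of this embodiment can be expressed as a simple occupancy check. This is a sketch under stated assumptions: time is discretized into steps, each shuttle occupies exactly one aisle per step, and the function name and data layout are hypothetical.

```python
from collections import Counter


def violates_aisle_constraint(positions_by_time, single_occupancy_aisles):
    """Return True if any single-occupancy aisle holds two or more shuttles.

    positions_by_time: one dict per time step mapping shuttle id -> aisle id.
    single_occupancy_aisles: aisle ids limited to one shuttle at a time.
    Aisles outside that set (such as a shared horizontal aisle) are not limited.
    """
    for positions in positions_by_time:
        occupancy = Counter(positions.values())
        for aisle in single_occupancy_aisles:
            if occupancy[aisle] > 1:
                return True
    return False
```

Relaxing the constraint, for example by allowing two shuttles in a vertical aisle as long as they do not collide, would replace the occupancy count with a finer position check.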

[0121] The embodiments disclosed herein should be considered in all respects to be illustrative and not restrictive. The scope of the present invention is indicated by the claims rather than by the foregoing description, and all modifications within the meaning and scope equivalent to the claims are intended to be included.

[Explanation of Symbols]

[0122] 1 Automated warehouse, 2 Task management device, 3 Order management device, 4 Terminal device, 100 Warehouse system, 500 Simulation unit, 505 Condition setting unit, 510 Determination unit, 515 Learning unit, 520, 602 Trained model, 530 Input vector group generation unit, 535 Learnable vector group generation unit, 540 Output vector group generation unit, 545, 586 Fully connected unit, 547 Judgment result vector generation unit, 550 Adjustment unit, 580, 582 Feature vector group generation unit, 584 Concatenation unit, 605 Inference unit, 610 Search unit.

Claims

1. A learning system used for learning to process multiple tasks of picking multiple items stored in multiple storage locations of an automated warehouse using multiple transport devices and transporting them to multiple exits, wherein the multiple tasks are defined by multiple task queues, each corresponding to one of the multiple exits and arranged in parallel with one another, the learning system comprising: a simulation unit that, for each combination of multiple picking locations specified in each of the multiple tasks and multiple processing sequences specified for each task queue in the multiple tasks, performs a simulation of the operation of the multiple transport devices in the automated warehouse under predetermined constraints; a determination unit that determines, for each of the multiple tasks, whether or not the task was processed in the simulation; and a learning unit that generates a trained model in which the relationship between the combination and the determination result of the determination unit has been learned.

2. The learning system according to claim 1, wherein the learning unit comprises: an input vector group generation unit that generates a plurality of input vectors, each representing the characteristics of the combination, according to the plurality of picking locations and the plurality of processing sequences; a learnable vector group generation unit that generates a plurality of learnable vectors, each representing the degree of correlation among the plurality of input vectors, according to the plurality of input vectors using an attention mechanism; an output vector group generation unit that generates a plurality of output vectors representing the characteristics of the plurality of input vectors by transforming the plurality of input vectors according to the plurality of learnable vectors; and an adjustment unit that adjusts each component of the plurality of input vectors and each component of the plurality of learnable vectors so as to optimize an error function based on a vector generated by fully connecting each component of the plurality of output vectors and a vector indicating the determination result of the determination unit.

3. The learning system according to claim 2, wherein the input vector group generation unit comprises: a first generation unit that generates a plurality of first vectors, each representing the characteristics of the plurality of picking locations, according to the plurality of picking locations; a second generation unit that generates a plurality of second vectors, each representing the characteristics of the plurality of processing sequences, according to the plurality of processing sequences; a third generation unit that generates a plurality of third vectors by concatenating the plurality of first vectors and the plurality of second vectors; and a fourth generation unit that generates, as the plurality of input vectors, a plurality of fourth vectors having a dimension lower than the dimension of the plurality of third vectors by fully connecting the components of each of the plurality of third vectors.

4. The learning system according to claim 3, wherein the first generation unit generates the plurality of first vectors by performing a convolution operation on each of the plurality of first tensors that each indicate the plurality of picking locations.

5. The learning system according to claim 3, wherein the first generation unit generates the plurality of first vectors by embedding the information of the plurality of picking locations into a vector space of a predetermined dimension.

6. The learning system according to claim 3, wherein the second generation unit generates the plurality of second vectors by performing a convolution operation on each of a plurality of second tensors that each represent the plurality of processing sequences.

7. The learning system according to claim 3, wherein the second generation unit generates the plurality of second vectors by embedding the information of the plurality of processing sequences into a vector space of a predetermined dimension.

8. The learning system according to claim 1, wherein the predetermined constraints include a constraint that the transport devices included in the multiple transport devices do not collide with each other in each of a plurality of first passages connecting the multiple storage locations and the multiple exits, and in each of a plurality of second passages connecting the multiple exits.

9. A trained model generated by the learning system described in any one of claims 1 to 8.

10. An information processing device comprising: a model storage unit that stores a trained model generated by the learning system according to any one of claims 1 to 8; a receiving unit that receives a processing command for multiple tasks of transporting multiple packages stored in the automated warehouse to the multiple exits using the multiple transport devices; an inference unit that performs inference processing to infer a predicted value of the number of tasks processed among the multiple tasks, using the trained model in the model storage unit, according to a plurality of picking locations specified in the multiple tasks and a processing order of the multiple tasks that has been provisionally set for inferring the predicted value; and a search unit that searches for a processing order of the multiple tasks for which the predicted value is optimized.

11. A warehouse system comprising: the automated warehouse; and the information processing device according to claim 10.

12. A method used for learning to process multiple tasks of picking multiple items stored in multiple storage locations of an automated warehouse using multiple transport devices and transporting them to multiple exits, wherein the multiple tasks are defined by multiple task queues, each corresponding to one of the multiple exits and arranged in parallel with one another, the method comprising the steps of: performing, for each combination of multiple picking locations specified in each of the multiple tasks and multiple processing sequences specified for each task queue in the multiple tasks, a simulation of the operation of the multiple transport devices in the automated warehouse under predetermined constraints; determining, for each of the multiple tasks, whether or not the task was processed in the simulation; and generating a trained model in which the relationship between the combination and the determination result in the determining step has been learned.

13. The method according to claim 12, wherein the step of generating the trained model comprises the steps of: generating a plurality of input vectors, each representing the characteristics of the combination, according to the plurality of picking locations and the plurality of processing sequences; generating a plurality of learnable vectors, each representing the degree of correlation among the plurality of input vectors, according to the plurality of input vectors using an attention mechanism; generating a plurality of output vectors representing the characteristics of the plurality of input vectors by transforming the plurality of input vectors according to the plurality of learnable vectors; and adjusting each component of the plurality of input vectors and each component of the plurality of learnable vectors so as to optimize an error function based on a vector generated by fully connecting each component of the plurality of output vectors and a vector indicating the determination result in the determining step.

14. The method according to claim 13, wherein the step of generating the plurality of input vectors comprises the steps of: generating a plurality of first vectors, each representing the characteristics of the plurality of picking locations, according to the plurality of picking locations; generating a plurality of second vectors, each representing the characteristics of the plurality of processing sequences, according to the plurality of processing sequences; generating a plurality of third vectors by concatenating the plurality of first vectors and the plurality of second vectors; and generating, as the plurality of input vectors, a plurality of fourth vectors having a dimension lower than the dimension of the plurality of third vectors by fully connecting the components of each of the plurality of third vectors.