Method, apparatus, controller, robot and program product for controlling a device

By introducing machine learning models into the model predictive control strategy, target observations are generated using sensor data and control information is optimized, solving the inaccuracy problem caused by sensor noise in robot control and achieving higher precision equipment control and improved user experience.

CN122308159A · Pending · Publication Date: 2026-06-30 · ROBERT BOSCH GMBH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ROBERT BOSCH GMBH
Filing Date
2024-12-31
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

In existing robot control, model predictive control strategies based on observations fail to effectively utilize sensor observation data, resulting in inaccurate control, especially when the quality of observations is insufficient, making it difficult to achieve precise motion control.

Method used

By introducing a machine learning model into the model predictive control strategy, target observations are generated using sensor data, and the control information is optimized by combining the loss function of the target state and the observations to improve control accuracy.

Benefits of technology

It improves the accuracy of robot control and user experience, ensures that the equipment can perform tasks better, and reduces the impact of sensor noise on control.

✦ Generated by Eureka AI based on patent content.


Abstract

Embodiments of the present disclosure relate to a method, an apparatus, a controller, a robot and a program product for controlling a device. The method comprises determining, based on sensor data detected by a sensor of the device, a target observation value related to a target task performed by the device using a machine learning model. The method further comprises determining control information for the device based on a model predictive control policy formed based on a first loss related to a target state of the device and a second loss related to the target observation value. By the method of the embodiments of the present disclosure, the information detected by the sensor is incorporated in the model predictive control, which improves the accuracy of the control of the device, enables the device to perform the task better and more accurately, and improves the user experience.

Description

Technical Field

[0001] The embodiments of this disclosure generally relate to the field of device control, and more particularly to methods, apparatus, controllers, robots, and program products for controlling devices.

Background Technology

[0002] With the advancement of information technology, the application of robots is gradually increasing. The application fields of robots have expanded to include industrial production, healthcare, and daily life. Due to the rapid development of sensor technology, robots have acquired a certain degree of perception capability. Furthermore, with the rapid development of intelligent technology, artificial intelligence and robots are beginning to merge, resulting in more natural human-computer interaction and an increasingly better user experience.

[0003] Model Predictive Control (MPC) is an advanced control strategy. MPC performs control based on predictions made with a dynamic model of the system. By building a simplified model, it can predict the system's state changes over a future period. Furthermore, by solving an optimization problem, MPC can determine the optimal control actions to achieve the control objective. Therefore, MPC is increasingly used in various fields, and its application in robot control has improved the accuracy and versatility of robot control.

Summary of the Invention

[0004] Embodiments of this disclosure provide methods, apparatus, controllers, robots, and program products for controlling devices.

[0005] According to a first aspect of this disclosure, a method for controlling a device is provided. The method includes using a machine learning model to determine target observations related to a target task performed by the device, based on sensor data detected by sensors of the device. The method further includes determining control information for the device based on a model predictive control strategy formed by a first loss related to a target state of the device and a second loss related to the target observations.

[0006] According to a second aspect of this disclosure, an apparatus for controlling a device is provided. The apparatus includes a target observation determination module configured to determine target observations related to a target task performed by the device using a machine learning model based on sensor data detected by sensors of the device; and a control information determination module configured to determine control information for the device based on a model predictive control strategy formed by a first loss related to the target state of the device and a second loss related to the target observations.

[0007] According to a third aspect of this disclosure, a controller is provided. The controller includes at least one processor; and a memory coupled to the at least one processor and having instructions stored thereon, which, when executed by the at least one processor, cause the controller to perform the steps of the method in the first aspect of this disclosure.

[0008] According to a fourth aspect of this disclosure, a robot is provided. The robot includes sensors and a controller as described in a third aspect of this disclosure.

[0009] According to a fifth aspect of this disclosure, a machine program product is provided. The machine program product includes machine-executable instructions, wherein the machine-executable instructions are executed by a processor to implement the steps of the method in the first aspect of this disclosure.

Attached Figure Description

[0010] Figure 1 shows a schematic diagram of an example environment in which devices and/or methods according to some embodiments of the present disclosure may be implemented;

[0011] Figure 2 shows a schematic diagram of an example method for controlling a device according to some embodiments of the present disclosure;

[0012] Figure 3 shows a schematic diagram of an example process for controlling a robot according to some embodiments of the present disclosure;

[0013] Figure 4 shows a schematic diagram of training a machine learning model according to some embodiments of the present disclosure;

[0014] Figure 5 shows a schematic diagram of an example process for controlling a vehicle according to some embodiments of the present disclosure;

[0015] Figure 6 shows a schematic diagram of an apparatus for controlling a device according to some embodiments of the present disclosure;

[0016] Figure 7 shows a schematic block diagram of an example device suitable for implementing some embodiments of the present disclosure.

[0017] In the various figures, the same or corresponding reference numerals indicate the same or corresponding parts.

Detailed Implementation

[0018] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.

[0019] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below.

[0020] As mentioned earlier, MPC (Model Predictive Control) has been applied to robot control. However, the objective function of the MPC framework is always designed based on the state and/or control variables, and it does not include terms based on observations obtained from sensors. Using this framework, tasks such as navigation or tracking can be performed, and the necessary information (e.g., pose estimation, obstacle detection) can be obtained. However, this approach optimizes the control variables by minimizing the objective function without using observations, which may lead to inaccurate control.

[0021] The quality of observations can directly impact the task. Advancements in observation or perception algorithms, the capability of low-cost cameras, and increased computational resources have made it possible to establish synergies between observation and action and to integrate them into a single problem. This single problem involves incorporating observations as visual feedback into a loss or cost function. Despite the impressive accuracy demonstrated in robotic observation or perception research, millimeter-level observation or perception navigation systems remain insufficient for generating accurate motion. In designing navigation systems, noise in the perception module is often difficult to estimate using manual design methods, leading to inaccurate robot control.

[0022] Therefore, embodiments of this disclosure provide a method for controlling a device. In this method, a controller acquires sensor data detected by the device's sensors and then uses a machine learning model to determine target observations related to a target task performed by the device. After determining the target observations, the controller can further determine control information for the device using a model predictive control strategy formed by a first loss related to the device's target state and a second loss related to the target observations. By incorporating sensor-detected information into model predictive control, the accuracy of device control is improved, enabling the device to perform tasks better and more accurately, thus improving the user experience.

[0023] The embodiments of this disclosure will now be described in further detail with reference to the accompanying drawings, wherein Figure 1 illustrates an example environment in which the devices and/or methods of embodiments of this disclosure may be implemented.

[0024] As shown in Figure 1, example environment 100 includes controller 102. Controller 102 can control the device containing it, such as controlling the movement or operation of the device. In one example, controller 102 is a controller in a mobile robot, which can control the movement of the mobile robot. In another example, controller 102 is a controller in a vehicle, which can control the movement of the vehicle. The above examples are merely for describing this disclosure and are not intended to specifically limit this disclosure.

[0025] The device may also include a sensor 104, which can be used to detect information related to the task the device is about to perform. In one example, the sensor 104 may be a camera that can acquire visual information, such as image information or video information. For example, when the device is a mobile robot, the sensor can acquire an image of the area where the track is located when it wants to move to a predetermined track. In another example, the sensor may be a radar or an inertial measurement unit, used to measure the device's acceleration or angular velocity, etc.

[0026] Sensor data acquired by sensor 104 can be input into controller 102. However, the sensor data actually acquired contains noise. Therefore, if the sensor data is used directly with a predetermined calculation formula to obtain the target observation value, or if the sensor data itself is used as the target observation value, the accuracy of the control information calculated by model predictive control strategy module 108 will be low. Repeated experimental tests found that when sensor data is processed by a neural network model to generate target observation values, the noise problem caused by the sensor itself can be eliminated. Therefore, a pre-trained machine learning model 106, such as a neural network model, runs in controller 102 ahead of model predictive control strategy module 108. Controller 102 uses this machine learning model 106 to process the sensor data and obtain target observation values related to the target task. For example, the neural network model can process image data to determine the distance between the device and the track, and the calculated distance is the target observation value related to the navigation task.

[0027] The target observation is then input to the model predictive control strategy module 108 for further processing. In the model predictive control strategy module 108, the optimal control information 110 can be determined using a model predictive control strategy. The objective function for optimization control utilizes not only losses related to the device's state but also losses generated from the target observation. For ease of description, the loss related to the device's state is referred to as the first loss, and the loss related to the target observation is referred to as the second loss. Alternatively or additionally, the model predictive control strategy is a perception model predictive control strategy.

[0028] In this process, the model predictive control strategy module 108 uses the objective function and corresponding constraints to determine the control information 110. Then, the controller 102 provides the control information 110 to the device to perform operations. For example, when the device is a mobile robot, the control information can be the linear velocity in the x-direction and the rotational speed in the z-axis, thereby controlling the mobile robot to move forward.

[0029] This method improves the accuracy of device control by incorporating sensor-detected information into model predictive control, enabling the device to perform tasks better and more accurately, thus enhancing the user experience.

[0030] An example environment in which some embodiments of this disclosure can be implemented has been described above with reference to Figure 1. Figure 2 shows a schematic diagram of an example method for controlling a device according to some embodiments of the present disclosure. The method shown in Figure 2 can be executed in the environment shown in Figure 1 or in any suitable environment, for example by the controller 102 in Figure 1 or by any suitable computing device.

[0031] As shown in Figure 2, in process 200, at block 202, controller 102 uses machine learning model 106 to determine target observations related to the target task performed by the device, based on sensor data detected by the device's sensors. Typically, devices use system dynamics models to predict the state at the next moment, such as predicting the device's position. However, to ensure that the device can use the correct control information to reach the predetermined state, model predictive control strategies are usually used in the device to optimize the predictions, thereby obtaining accurate control information for the device.

[0032] To obtain more accurate control information, information observed by sensors can be applied to the predictive optimization process. However, sensor data contains noise, which is inherent to the sensor itself, such as noise caused by the physical characteristics of the electronic components within the sensor or by variations in current within the components. For example, while computer vision technology has developed rapidly in the past, the observed information still contains white noise related to the probability density function. To reduce the impact of this noise, trained machine learning models can be used to process sensor data and filter out sensor-induced noise.

[0033] In some embodiments, the sensor is a camera. In this case, when using the machine learning model 106 to determine target observations related to the target task performed by the device, the controller 102 can receive image data related to the target task from the camera. The image data can be a single image or a video stream composed of multiple video frames. The controller 102 then uses the machine learning model to process this acquired image data to calculate target observations related to the target task performed by the device, such as distance information or location information.

[0034] In some embodiments, the pre-trained machine learning model 106 can be trained using the following method. The pre-trained machine learning model 106 can be trained in the controller 102. During training, the controller 102 can acquire sample image data from the device's sensors. In addition to acquiring the sample image data, the controller 102 also acquires the ground truth values corresponding to the sample image data processed by the machine learning model. After obtaining this data, the machine learning model can be used to process the sample image data to obtain calculation results. Then, the calculation results are compared with the ground truth values to adjust the parameters of the machine learning model. This achieves the training of the machine learning model.
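The training procedure in paragraph [0034] can be sketched as a supervised regression loop. The sketch below is only an illustration under stated assumptions: a two-parameter linear model stands in for machine learning model 106, each sample is reduced to a single hypothetical scalar feature instead of real image data, and the function names, learning rate, and epoch count are invented for the example.

```python
def train_distance_model(samples, truths, lr=0.05, epochs=2000):
    """Fit distance ~ w * feature + b by gradient descent on mean squared error.

    samples: hypothetical scalar features extracted from sample images (assumption)
    truths:  ground truth distances, e.g. from a motion capture system
    """
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, truths):
            err = (w * x + b) - y      # compare the calculation result with the ground truth
            grad_w += 2 * err * x / n
            grad_b += 2 * err / n
        w -= lr * grad_w               # adjust the model parameters
        b -= lr * grad_b
    return w, b


def mse(w, b, samples, truths):
    """Mean squared error of the linear model on the given data."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(samples, truths)) / len(samples)


# Toy data where the true relation is distance = 2 * feature.
samples = [0.5, 1.0, 1.5, 2.0]
truths = [1.0, 2.0, 3.0, 4.0]
w, b = train_distance_model(samples, truths)
```

A real embodiment would replace the linear model with a neural network and the scalar features with image tensors; the compare-and-adjust structure of the loop is the part that corresponds to the text.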

[0035] In one example, when the controller acquires the ground truth corresponding to the sample image data, the ground truth can be obtained using the device's motion capture system. For example, when the machine learning model is used to calculate distance, the ground truth could be the distance between the device and the target object calculated by the motion capture system. In another example, when the neural network model is used to determine location, a QR code can be placed on the target object when determining the ground truth, and the controller can then use the acquired QR code image to determine the ground truth, for example, using a Perspective-n-Point (PnP) algorithm to determine the device's location. The above examples are merely for describing this disclosure and are not intended to specifically limit it. The pre-trained machine learning model 106 can be trained by any suitable computing device.

[0036] In some embodiments, in examples utilizing image data, the corresponding target task may be a navigation or tracking task targeting a target object. In this case, a machine learning model can be used to process the image data to determine the target distance between the device and the target object. In one example, if the device is a robot, its target task is to navigate to a robot charging area. In this case, images acquired by the robot's camera are input into a machine learning model to determine the distance between the robot and the charging area. In another example, if the device is a robot, its target task is to track a target object. In this case, images acquired by the robot's camera are input into a machine learning model to determine the distance between the robot and the target object. The above examples are merely for describing this disclosure and are not intended to specifically limit this disclosure.

[0037] In some embodiments, when using image data, the target task can be other suitable tasks, such as cargo transportation, and in this case, the target observation determined by applying the image data to a machine learning model is the location of the device.

[0038] In some embodiments, the sensor can be another suitable sensor device. For example, the sensor can be an inertial measurement unit, which can be used to measure the angular velocity of the device's rotation. In this case, the target task can be to move the robot's robotic arm to a predetermined position. The controller can then use a pre-trained machine learning model to process the obtained angular velocity to obtain target observations related to the target task, such as the target angle of movement. The machine learning model in this case is likewise trained on the corresponding sample data and ground truth.

[0039] At block 204, controller 102 determines control information for the device based on a model predictive control strategy formed by a first loss related to the device's target state and a second loss related to the target observation. To generate control information for the device more accurately, information related to the observation is incorporated into the model predictive control strategy. The process of determining control information will be described below with reference to some embodiments.

[0040] This method improves the accuracy of device control by incorporating sensor-detected information into model predictive control, enabling the device to perform tasks better and more accurately, thus enhancing the user experience.

[0041] An example method for controlling a device according to some embodiments of the present disclosure has been described above with reference to Figure 2. The process of determining control information is described below with reference to some embodiments.

[0042] In some embodiments, when determining control information for a device, the controller 102 can construct an objective function in the model predictive control strategy module 108 based on a first loss and a second loss, such that the objective function used for optimization includes not only losses related to the state or control but also losses related to the observations. Furthermore, the controller 102 can further obtain the constraints on the objective function in the model predictive control strategy. Then, using the constructed objective function and the constraints on the objective function, the control information for the device is calculated.

[0043] When determining the second loss, controller 102 can pre-acquire a reference observation corresponding to the target observation, that is, the expected value of the observation. Controller 102 can then use the target observation and the reference observation to calculate the second loss. In one example, the difference between the target observation and the reference observation is calculated, for example the difference between the target distance determined by the neural network model and the reference distance. Controller 102 then combines this difference with other information to determine the second loss, for example by constructing a function of the difference. In another example, the difference between the target observation and the reference observation is itself taken as the second loss.

[0044] Additionally, when determining the first loss, the controller 102 can determine a reference state corresponding to the target state of the device. This reference state can be a desired state, such as the state the robot is expected to reach in a robot navigation task, including x-coordinates, y-coordinates, and rotational speed about the z-axis. The controller 102 then calculates the first loss based on the predicted target state and the reference state. In one example, the first loss can be determined based on a function constructed from the difference between the target state and the reference state. In another example, the difference between the target state and the reference state can be taken as the first loss. Furthermore, the first and second losses are for a first time point among multiple time points. Therefore, when determining the objective function in the model predictive control strategy, the controller 102 can obtain multiple predicted states, multiple reference states, multiple observations, and multiple reference observations for the multiple time points. Thus, the controller 102 can calculate a state-related loss and an observation-related loss for each time point. At this point, the controller 102 can determine a first plurality of state-related losses and a second plurality of observation-related losses for the multiple time points. Then, the controller 102 determines the control information based on the first plurality of losses and the second plurality of losses, using the corresponding constraints.
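The per-time-point loss computation described above can be sketched as follows. This is a minimal illustration, assuming squared differences as the loss function (the text only says the loss is built from the difference) and scalar observations such as a distance; the function names are hypothetical.

```python
def squared_error(a, b):
    """Squared Euclidean difference between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def horizon_losses(pred_states, ref_states, observations, ref_observations):
    """Return per-time-point state losses, observation losses, and their total.

    Squared differences are an assumed choice of loss function.
    """
    state_losses = [squared_error(x, xr) for x, xr in zip(pred_states, ref_states)]
    obs_losses = [(z - zr) ** 2 for z, zr in zip(observations, ref_observations)]
    total = sum(state_losses) + sum(obs_losses)
    return state_losses, obs_losses, total


# Two time points, states as (x, y, yaw), observations as distances.
state_losses, obs_losses, total = horizon_losses(
    pred_states=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ref_states=[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    observations=[2.0, 1.0],
    ref_observations=[0.0, 0.0],
)
```

Keeping the two lists separate mirrors the "first plurality" and "second plurality" of losses in the text; a weighted sum could replace the plain sum if the two terms need different emphasis.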

[0045] The process of determining control information is described below with an example. The objective function of the model predictive control strategy is expressed by the following formula (1):

[0046]

[0047] Here, x represents the state, which is (x, y, yaw) if the device is a mobile robot, where yaw is the angle about the z-axis; u represents the control information, for example (v_x, w_z) for a robot, where v_x is the linear velocity in the x-direction and w_z is the rotational velocity about the z-axis; L_a represents the loss or cost associated with the action and is expressed through the function f below; L_p represents the loss or cost associated with the observation and is obtained using the output of machine learning model 106; z represents the observation; T_0 represents the current time; T_f represents the prediction window, which is N × dt, where dt is the length of one prediction period; and min_u indicates that the objective is minimized with respect to u.
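Formula (1) itself does not appear in this text. One form that would be consistent with the symbol definitions in paragraph [0047] is given below as an assumption, not as the filed formula. In particular, the integral form (rather than a discrete sum over N periods) and the upper limit T_0 + T_f are guesses based on T_f being described as a window length.

```latex
\min_{u} \int_{T_0}^{T_0 + T_f} \bigl[ L_a(x, u) + L_p(z) \bigr] \, dt
```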

[0048] The constraints on the objective function above are represented by the following formulas (2)-(6):

[0049] x_{t+1} = f(x_t, u_t) (2)

[0050] x_min ≤ x_t ≤ x_max (3)

[0051] u_min ≤ u_t ≤ u_max (4)

[0052] x_0 = x_init (5)

[0053] C_j x(t+T_f | t) + d_j + d_safe ≤ 0, j = 1, 2, …, m (6)

[0054] Here, f() is the dynamic model function, which is used to calculate the state of the device at the next moment; x_t represents the state at time t, u_t represents the control information at time t, and x_{t+1} represents the state at time t+1; x_min and x_max represent the preset minimum and maximum values of the state; u_min and u_max represent the minimum and maximum values of the control information (for a robot, for example, u_t can represent speed, and the robot has minimum and maximum speed limits); x_init indicates the initial state; C_j indicates a safety constraint; and d_j and d_safe are parameter values related to the safety constraints.
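To make the role of constraints (2) to (5) concrete, the following sketch solves a small constrained problem by brute force. It is a toy under several assumptions that are not in the source: unicycle dynamics for f, squared-error losses for L_a and L_p, a hypothetical observation model in which z is the distance from the predicted position to a known target, a control held constant over the window, and a grid search in place of a real optimizer. All numeric bounds are invented, and safety constraint (6) is omitted.

```python
import math
from itertools import product

DT = 0.1                                   # length of one prediction period (assumed)
N = 10                                     # periods in the prediction window (assumed)
U_MIN, U_MAX = (0.0, -1.0), (0.5, 1.0)     # assumed bounds on (v_x, w_z), constraint (4)
X_MIN, X_MAX = (-5.0, -5.0), (5.0, 5.0)    # assumed bounds on (x, y), constraint (3)


def f(state, u):
    """Unicycle dynamics, constraint (2): x_{t+1} = f(x_t, u_t)."""
    x, y, yaw = state
    v, w = u
    return (x + v * math.cos(yaw) * DT, y + v * math.sin(yaw) * DT, yaw + w * DT)


def rollout_cost(x_init, u, ref_state, ref_obs, target):
    """Sum state and observation losses over the window; None if infeasible."""
    state, cost = x_init, 0.0              # constraint (5): x_0 = x_init
    for _ in range(N):
        state = f(state, u)
        if not all(lo <= s <= hi for s, lo, hi in zip(state[:2], X_MIN, X_MAX)):
            return None                    # constraint (3) violated
        l_a = sum((s - r) ** 2 for s, r in zip(state, ref_state))
        # Assumed observation model: distance from predicted position to target.
        z = math.hypot(target[0] - state[0], target[1] - state[1])
        l_p = (z - ref_obs) ** 2
        cost += (l_a + l_p) * DT
    return cost


def solve_mpc(x_init, ref_state, ref_obs, target, grid=11):
    """Pick the best constant control from a grid inside the bounds of (4)."""
    vs = [U_MIN[0] + i * (U_MAX[0] - U_MIN[0]) / (grid - 1) for i in range(grid)]
    ws = [U_MIN[1] + i * (U_MAX[1] - U_MIN[1]) / (grid - 1) for i in range(grid)]
    best_u, best_cost = None, math.inf
    for u in product(vs, ws):
        cost = rollout_cost(x_init, u, ref_state, ref_obs, target)
        if cost is not None and cost < best_cost:
            best_u, best_cost = u, cost
    return best_u, best_cost


# Robot at the origin facing a target one meter ahead.
best_u, best_cost = solve_mpc((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.0, (1.0, 0.0))
```

In this toy case the search selects the maximum forward speed with no rotation, since that reduces both losses fastest. A practical controller would optimize a full control sequence with a numerical solver rather than a grid.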

[0055] As can be seen, the observation term L_p(z) has been added to the formula. As mentioned above, direct observations from sensor 104 still carry white noise with respect to the probability density function. The observation z is therefore modeled as following a normal distribution N centered on the true value, with Σ representing the noise. Accordingly, the observation term can be rewritten as L_p(E(z)), where E() denotes the maximum likelihood. For this maximum likelihood, machine learning techniques can then be used to learn a function f_p() that produces the observations, so that the observation term becomes L_p(f_p(E(z))).
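As a small illustration of why a maximum likelihood estimate helps with Gaussian noise: for independent scalar readings with normally distributed noise, the maximum likelihood estimate of the true value is the sample mean. The sketch below assumes this simple scalar setting, with a hypothetical true distance and noise level; it is not the learned function f_p described in the text.

```python
import random


def ml_estimate(observations):
    """Maximum likelihood estimate of the mean under i.i.d. Gaussian noise."""
    return sum(observations) / len(observations)


rng = random.Random(0)                     # fixed seed for repeatability
true_distance = 1.50                       # hypothetical true value, in meters
readings = [true_distance + rng.gauss(0.0, 0.05) for _ in range(1000)]
estimate = ml_estimate(readings)
```

Individual readings scatter around the true value with a standard deviation of 0.05, while the estimate from 1000 readings has a standard error of roughly 0.0016.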

[0056] Additionally, after determining the control information, the controller 102 can use the control information to control the device to perform a target task, such as a navigation task or a tracking task.

[0057] An example of controlling a robot according to some embodiments of the present disclosure is described below with reference to Figure 3. The mobile robot in Figure 3 is the device referred to in Figure 1 and Figure 2.

[0058] Example 300 describes the use of a mobile robot 308 to perform docking in an application scenario. The docking process involves the mobile robot 308 performing a navigation task to move to a docking area, such as a supermarket driveway or a charging station. In box 302, a docking area is shown that can be used to perform docking, and a track exists within the docking area that can be used to receive the robot.

[0059] The mobile robot 308 has sensors such as a camera. Therefore, the mobile robot 308 can use the camera to acquire images of the docking area, including the target object's track. The images of the docking area are then transmitted to a pre-trained neural network model 304. This neural network model 304 then processes the received image data to calculate the distance or position of the mobile robot relative to the track. The acquired distance or position can then be input into the model predictive control strategy module 306. This model predictive control strategy module 306 has an objective function and corresponding constraints as shown above. The objective function is used to calculate the state of the mobile robot and can be any function suitable for the mobile robot.

[0060] Then, the mobile robot 308 can calculate control information through the model predictive control strategy module 306 to control its movement. After the mobile robot moves, its state information at the next moment can be obtained. Then, based on the state information and the distance at the next moment, further control information for the next moment is generated, thereby controlling the mobile robot 308 to move to the track in the docking area.
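The repeated observe, plan, and move cycle in paragraph [0060] can be sketched as a receding-horizon loop. The example below is a one-dimensional toy under stated assumptions: the robot moves straight toward the track, averaging several noisy readings stands in for neural network model 304, the planner picks a constant forward speed from a small candidate set, and all names and numbers are hypothetical.

```python
import random

DT = 0.1                                           # control period in seconds (assumed)
V_CANDIDATES = [i * 0.05 for i in range(11)]       # 0.0 to 0.5 m/s, assumed speed limits


def estimate_distance(noisy_readings):
    """Stand-in for the neural network model: average the raw readings (assumption)."""
    return sum(noisy_readings) / len(noisy_readings)


def plan_velocity(distance, horizon=10):
    """Choose the constant speed minimizing summed squared predicted distance."""
    def cost(v):
        d, total = distance, 0.0
        for _ in range(horizon):
            d -= v * DT                            # one-dimensional dynamics
            total += d * d
        return total
    return min(V_CANDIDATES, key=cost)


def dock(true_distance, steps=200, noise=0.02, seed=0):
    """Run the closed loop and return the remaining distance to the track."""
    rng = random.Random(seed)
    for _ in range(steps):
        readings = [true_distance + rng.gauss(0.0, noise) for _ in range(5)]
        z = estimate_distance(readings)            # observation for this moment
        v = plan_velocity(z)                       # control information for this moment
        true_distance -= v * DT                    # apply the first control, then re-plan
    return true_distance


remaining = dock(2.0)
```

Only the first planned control is applied at each step before a fresh observation is taken, which is what lets the loop correct for noise and model error as the robot approaches the track.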

[0061] Examples of controlling a robot according to some embodiments of this disclosure have been described above with reference to Figure 3. Figure 4 shows a schematic diagram of training a machine learning model according to some embodiments of the present disclosure. Figure 4 is an example of training the neural network model in Figure 3.

[0062] In Example 400, image data of the docking area 402 can be collected using the camera of the mobile robot. A corresponding truth value can be obtained for this docking area 402. For example, a truth value acquisition module 404 can be used to obtain a truth value 406 corresponding to the image data. The truth value acquisition module 404 can be a motion capture system or a QR code processing module. If the truth value acquisition module 404 is a motion capture system, the actual distance of the mobile robot relative to the track can be determined by the motion capture system, and this actual distance can be used as the truth value. If the truth value acquisition module 404 is a QR code processing module, the position of the mobile robot relative to the track can be determined by processing the QR code image data from the docking area or the track, and this position can be used as the truth value. In the QR code processing module, a perspective N-point algorithm can be used to calculate the position of the robot relative to the track.

[0063] Then, the image data from the docking region 402 and the ground truth 406 are input into the neural network model 408 to train the neural network model. During training, prediction results are obtained by inputting image data into the neural network model 408, and the parameters of the neural network model 408 are adjusted by comparing the prediction results with the ground truth. Therefore, the trained neural network model can process the features of the docking region and generate observations that can be used for the corresponding task.

[0064] above Figure 4 The process of training a neural network model 408 is described. When the ground truth is distance, the trained neural network model can calculate distance; when the ground truth is position, the trained neural network model can calculate position. The above examples are only for describing this disclosure and are not intended to limit this disclosure. Any suitable input data and ground truth, along with a suitable loss function, can be used to train any neural network model suitable for this scheme.

[0065] A schematic diagram illustrating a method for training a machine learning model according to some embodiments of the present disclosure has been described above with reference to Figure 4. Figure 5 shows a schematic diagram of an example of controlling a vehicle according to some embodiments of the present disclosure. The application of the methods of the present disclosure to robots has been described above. In this disclosure, the device in Figure 1 and Figure 2 can also be a vehicle, and the method described with reference to Figure 2 can be used to control the vehicle.

[0066] Example 500 describes the process of using vehicle 508 to park in a parking area. This parking process involves vehicle 508 performing a navigation task to move to a parking space, such as a parking space in an underground garage or mechanical parking garage. In box 502, a parking area with available parking spaces is shown that can be used to park vehicle 508.

[0067] Vehicle 508 has sensors such as cameras or radar. Therefore, vehicle 508 can use the camera or radar to acquire images of the parking area, including the target parking space. The images of the parking area are then transmitted to a pre-trained neural network model 504. This neural network model 504 then processes the received image data and can calculate the distance of the vehicle relative to the target parking space. The acquired distance can then be input into the model predictive control strategy module 506. This model predictive control strategy module 506 has an objective function and corresponding constraints as shown above. The objective function is used to calculate the state of the vehicle and can be any function suitable for the vehicle.

[0068] Then, the control information can be calculated by the model predictive control strategy module 506. The vehicle 508 then moves according to this control information. After the vehicle 508 moves, the state information for the next moment can be obtained. The state information and distance for the next moment are then used to further generate control information for the next moment, thereby controlling the vehicle 508 to move to the parking space within the parking area.
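The closed loop described above (observe, optimize, move, repeat) can be sketched in Python as a receding-horizon controller. This is an illustrative sketch under several assumptions and not the implementation of module 506: the vehicle is reduced to a one-dimensional position, the control is a velocity chosen from a small admissible set (which plays the role of the constraints), the optimizer is a brute-force search, and the observation inside the prediction horizon is produced by a simple distance function standing in for the neural network model 504. All names, weights and numeric values are hypothetical.

```python
from itertools import product

DT = 0.5                                    # time step (assumed)
U_CANDIDATES = (-1.0, -0.5, 0.0, 0.5, 1.0)  # admissible controls, i.e. |u| <= 1
HORIZON = 3                                 # number of predicted steps


def step(x, u):
    """Simplified vehicle model: position advances by velocity * DT."""
    return x + u * DT


def predict_observation(x, target):
    """Stand-in for the learned observation: signed distance to the target."""
    return target - x


def mpc_control(x, x_ref, target, w_state=1.0, w_obs=1.0):
    """Return the first control of the lowest-cost sequence over the horizon."""
    best_cost, best_u0 = float("inf"), 0.0
    for seq in product(U_CANDIDATES, repeat=HORIZON):
        xk, cost = x, 0.0
        for u in seq:
            xk = step(xk, u)
            state_loss = (xk - x_ref) ** 2                       # loss on the state
            obs_loss = predict_observation(xk, target) ** 2      # loss on the observation
            cost += w_state * state_loss + w_obs * obs_loss
        if cost < best_cost:
            best_cost, best_u0 = cost, seq[0]
    return best_u0


def run(x0=0.0, target=2.0, steps=10):
    """Apply only the first control each step, then re-plan (receding horizon)."""
    x = x0
    for _ in range(steps):
        u = mpc_control(x, x_ref=target, target=target)
        x = step(x, u)
    return x
```

With the default arguments, `run()` moves the position from 0.0 to the target at 2.0 and then holds it there. A practical controller would replace the exhaustive search with a numerical optimizer and the distance function with the output of the trained model.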

[0069] Figure 6 further shows a schematic diagram of an apparatus 600 for controlling a device according to an embodiment of the present disclosure. The apparatus 600 can be applied to a controller 110 and may include multiple modules for performing the corresponding steps of method 200 discussed with reference to Figure 2. As shown in Figure 6, the apparatus 600 includes a target observation determination module 602, configured to determine, using a machine learning model and based on sensor data detected by the device's sensors, target observations related to a target task performed by the device; and a control information determination module 604, configured to determine control information for the device based on a model predictive control strategy formed by a first loss related to the device's target state and a second loss related to the target observations.

[0070] In some embodiments, the sensor is a camera, and the target observation determination module (602) includes: an image data receiving module configured to receive image data related to a target task from the camera; and a first observation determination module configured to determine target observations related to a target task performed by the device by applying the image data to a machine learning model.

[0071] In some embodiments, the target task is a navigation or tracking task for a target object, and the first observation determination module includes: a target distance determination module configured to determine the target distance between the device and the target object by applying image data to a machine learning model; and / or a position determination module configured to determine the position of the device relative to the target object by applying image data to a machine learning model.

[0072] In some embodiments, the apparatus 600 further includes modules for training the machine learning model: a sample image data acquisition module configured to acquire sample image data from the device's sensors; a truth value determination module configured to determine a truth value corresponding to the sample image data; and a machine learning model training module configured to train a machine learning model based on the sample image data and the truth value; wherein the truth value determination module includes: a first truth value determination module configured to acquire the truth value from the device's motion capture system; or a second truth value determination module configured to determine the truth value based on the acquired QR code image.

[0073] In some embodiments, the control information determination module 604 includes: an objective function determination module configured to determine an objective function in a model predictive control strategy based on a first loss and a second loss; a constraint determination module configured to determine constraints on the objective function in the model predictive control strategy; and an information determination module configured to determine control information for the device based on the objective function and constraints. In some embodiments, the device 600 further includes: a reference observation determination module configured to determine a reference observation corresponding to a target observation; a first loss determination module configured to determine a first loss based on the target observation and the reference observation; and / or a reference state determination module configured to determine a reference state corresponding to a target state of the device; and a second loss determination module configured to determine a second loss based on the target state and the reference state.

[0074] In some embodiments, the first loss and the second loss are for a first time among a plurality of times, and the control information determination module 604 includes: a plurality of loss determination modules configured to acquire, for the plurality of times, a first plurality of losses related to the state of the device and a second plurality of losses related to the observations; and a first information determination module configured to determine control information based on the first plurality of losses and the second plurality of losses.

[0075] In some embodiments, the apparatus 600 further includes a target task execution module, configured to control the device to execute a target task based on control information.

[0076] In some embodiments, the model predictive control strategy is a perception model predictive control strategy.

[0077] In some embodiments, the model predictive control strategy is a perception model predictive control strategy.

[0078] Figure 7 shows a schematic block diagram of an example device 700 that can be used to implement embodiments of the present disclosure. The controller 102 in Figure 1 can be implemented using device 700. As shown, device 700 includes a central processing unit (CPU) 701, which can perform various appropriate actions and processes according to computer program instructions stored in read-only memory (ROM) 702 or computer program instructions loaded into random access memory (RAM) 703. RAM 703 can also store various programs and data required for the operation of device 700. The CPU 701, ROM 702, and RAM 703 are interconnected via bus 704. An input / output (I / O) interface 705 is also connected to bus 704.

[0079] The various processes and procedures described above, such as method 200, can be executed by processor 701. For example, in some embodiments, method 200 may be implemented as a computer software program tangibly contained in a machine-readable medium. In some embodiments, part or all of the computer program may be loaded and / or installed on device 700 via ROM 702. When the computer program is loaded into RAM 703 and executed by processor 701, one or more actions of method 200 described above may be performed.

[0080] This disclosure can be a method, apparatus, system, and / or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of this disclosure.

[0081] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example, but not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. The computer-readable storage media used herein are not to be construed as transitory signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.

[0082] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.

[0083] Computer program instructions used to perform the operations of this disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing the state information of the computer-readable program instructions to implement various aspects of this disclosure.

[0084] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0085] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.

[0086] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.

[0087] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than that shown in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0088] The various embodiments of this disclosure have been described above. These descriptions are exemplary and not exhaustive, and are not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical applications, or their technical improvements over technologies found in the marketplace, or to enable others skilled in the art to understand the embodiments disclosed herein.

Claims

1. A method (200) for controlling a device, comprising: determining (202), using a machine learning model and based on sensor data detected by a sensor of the device, a target observation related to a target task performed by the device; and determining (204) control information for the device based on a model predictive control strategy formed by a first loss related to a target state of the device and a second loss related to the target observation.

2. The method (200) of claim 1, wherein the sensor is a camera, and determining, using the machine learning model and based on the sensor data detected by the sensor of the device, the target observation related to the target task performed by the device comprises: receiving image data related to the target task from the camera; and determining the target observation related to the target task performed by the device by applying the image data to the machine learning model.

3. The method (200) of claim 2, wherein the target task is a navigation or tracking task for a target object, and determining the target observation related to the target task performed by the device by applying the image data to the machine learning model comprises: determining a target distance between the device and the target object by applying the image data to the machine learning model; and / or determining a position of the device relative to the target object by applying the image data to the machine learning model.

4. The method (200) according to claim 2, wherein training the machine learning model comprises: acquiring sample image data from the sensor of the device; determining a truth value corresponding to the sample image data; and training the machine learning model based on the sample image data and the truth value; wherein determining the truth value corresponding to the sample image data comprises: acquiring the truth value from a motion capture system of the device; or determining the truth value based on an acquired QR code image.

5. The method (200) of claim 1, wherein determining the control information for the device comprises: determining an objective function in the model predictive control strategy based on the first loss and the second loss; determining constraints on the objective function in the model predictive control strategy; and determining the control information for the device based on the objective function and the constraints.

6. The method (200) according to claim 5, further comprising: determining a reference observation corresponding to the target observation; and determining the first loss based on the target observation and the reference observation; and / or determining a reference state corresponding to the target state of the device; and determining the second loss based on the target state and the reference state.

7. The method (200) of claim 1, wherein the first loss and the second loss are for a first time among a plurality of times, and determining the control information for the device comprises: obtaining, for the plurality of times, a first plurality of losses related to the state of the device and a second plurality of losses related to the observations; and determining the control information based on the first plurality of losses and the second plurality of losses.

8. The method (200) according to claim 1, further comprising: controlling the device to perform the target task based on the control information.

9. The method (200) according to claim 1, wherein the model predictive control strategy is a perception model predictive control strategy.

10. An apparatus (600) for controlling a device, comprising: a target observation determination module (602) configured to determine, using a machine learning model and based on sensor data detected by a sensor of the device, a target observation related to a target task performed by the device; and a control information determination module (604) configured to determine control information for the device based on a model predictive control strategy formed by a first loss related to a target state of the device and a second loss related to the target observation.

11. A controller (700), comprising: at least one processor; and a memory coupled to the at least one processor and having instructions stored thereon which, when executed by the at least one processor, cause the controller to perform the method according to any one of claims 1 to 9.

12. A robot comprising the controller and the sensor according to claim 11.

13. A machine program product comprising machine-executable instructions, wherein the machine-executable instructions, when executed by a processor, implement the method according to any one of claims 1 to 9.