Control method for robot guidance task point, and related apparatus

By utilizing on-site images and correction information in the metaverse platform, the problem of low task completion rate caused by the error between virtual reality and the real environment was solved, and accurate correction of robot image acquisition pose was achieved, thereby improving task completion rate and efficiency.

WO2026138067A1 · PCT designated stage · Publication Date: 2026-07-02 · LENOVO (BEIJING) LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: LENOVO (BEIJING) LTD
Filing Date: 2025-10-10
Publication Date: 2026-07-02

IPC: G06T7/70; G06T1/00; G06T19/00

AI Tagging

Technology Topics

Information processing; Radiology

AI Technical Summary

Technical Problem

Under the metaverse construction platform, during the robot gimbal-guided task process, the task completion rate is low due to the errors between virtual reality and the real environment and the robot's positioning error.

Method used

By acquiring on-site images, the image acquisition pose of the virtual robot in the target 3D scene is determined, and correction information is obtained based on the on-site images. This information is then used by the real robot to correct the image acquisition pose in the real scene, ensuring accurate image acquisition.

Benefits of technology

This improves the accuracy and efficiency of robots in completing tasks in real-world scenarios, and reduces the need for on-site intervention by technical personnel.

Abstract

The present application relates to the field of information processing. Disclosed are a control method for a robot gimbal guidance task point, and a related apparatus. The method comprises: obtaining an on-site image, wherein the on-site image is an image used for constructing a target three-dimensional scenario; obtaining a target image collection pose of a virtual robot at a corresponding target task point in the target three-dimensional scenario; on the basis of the on-site image, obtaining a target image corresponding to the target image collection pose; and on the basis of the target image, obtaining correction information, wherein the correction information is used for correcting an image collection pose of a real robot when the real robot detects the target task point in a real scenario.

Description

Control methods and related devices for robot-guided task points

[0001] This application claims priority to Chinese Patent Application No. 202411930139.6, filed on December 25, 2024, entitled "Control Method and Related Apparatus for Robot Guiding Task Points", the entire contents of which are incorporated herein by reference.

Technical Field

[0002] This application relates to the field of information processing, and in particular to a control method and related apparatus for guiding robot task points.

Background Technology

[0003] A gimbal is a device used to fix and adjust the position of a robot or other equipment. In the field of robotics, gimbals are widely used in various service, industrial, and research robots to adjust the robot's posture and position, enabling it to better perform its tasks.

[0004] As an emerging technology, the metaverse can integrate various new technologies to create new types of virtual and real internet applications and social forms. The metaverse can be used to build platforms to control robot gimbals and guide tasks.

[0005] The robot gimbal guidance task process under the Metaverse construction platform can include: scanning the on-site environment using LiDAR (Light Detection and Ranging); editing robot task points on the Metaverse construction platform and increasing fault tolerance by utilizing optional gimbal guidance tasks; fine-tuning according to the actual situation and debugging on-site with real machines; and deploying the real machine.

[0006] In the above process, the position, orientation, and virtual gimbal parameters of the virtual robot twin need to be adjusted to appropriate values in the metaverse platform according to the objects in the virtual environment. However, because of the error between the virtual environment and the real environment, as well as the positioning errors and mechanical-structure operation errors of the robot when performing tasks, the deployed robot cannot reach the absolute position and configuration parameters specified during task editing, resulting in a low task completion rate.

Summary of the Invention

[0007] The first aspect of this application provides a control method for robot-guided task points, comprising:

[0008] Obtain on-site images, which are used to construct the target 3D scene;

[0009] Obtain the target image acquisition pose of the virtual robot at the corresponding target task point in the target 3D scene;

[0010] Obtain a target image corresponding to the target image acquisition pose based on the on-site image;

[0011] Correction information is obtained from the target image. The correction information is used to correct the image acquisition pose of the real robot when the real robot detects the target task point in a real scene.

[0012] One possible implementation also includes:

[0013] In response to receiving a command, the target image acquisition pose and the correction information are sent to the real robot. The real robot can then acquire a first image in the real scene based on the target image acquisition pose and correct the image acquisition pose according to the first image and the correction information.

[0014] In one possible implementation, the on-site images include all images used to construct the target 3D scene.

[0015] In one possible implementation, obtaining the on-site image includes:

[0016] Obtain the on-site image corresponding to the target task point from an on-site image set, wherein the on-site image set is a set of images used to construct the target 3D scene.

[0017] In one possible implementation, obtaining the target image corresponding to the target image acquisition pose based on the on-site image includes:

[0018] The on-site image is imported into the target 3D scene based on the shooting pose of the on-site image to obtain the on-site image 3D scene;

[0019] Based on the target image acquisition pose, the image acquisition range of the virtual robot in the target 3D scene is determined;

[0020] The target image is obtained by rendering the image corresponding to the image acquisition range using the on-site image 3D scene.

[0021] In one possible implementation, obtaining the target image corresponding to the target image acquisition pose based on the on-site image includes:

[0022] Obtain the shooting pose of the on-site image;

[0023] Based on the on-site image and the shooting pose, establish the correspondence between the target image acquisition pose and the shooting pose;

[0024] Based on the aforementioned correspondence, the cropping range corresponding to the target image acquisition pose in the on-site image is determined;

[0025] Based on the cropping range, the target image is cropped from the on-site image.

[0026] In one possible implementation, the target image acquisition pose includes:

[0027] The position coordinates of the virtual robot in the target 3D scene and the orientation angle of the virtual robot gimbal.

[0028] A second aspect of this application provides a control device for guiding a robot to a task point, comprising:

[0029] An on-site image acquisition module is used to acquire on-site images, which are images used to construct a target 3D scene;

[0030] The pose acquisition module is used to acquire the target image acquisition pose of the virtual robot at the corresponding target task point in the target 3D scene;

[0031] The target image acquisition module is used to acquire a target image corresponding to the target image acquisition pose based on the on-site image;

[0032] The correction information acquisition module is used to obtain correction information based on the target image. The correction information is used to correct the image acquisition pose of the real robot when the real robot detects the target task point in a real scene.

[0033] A third aspect of this application provides a computer program product including computer-readable instructions that, when executed on an electronic device, cause the electronic device to implement the robot guidance task point control method described in the first aspect or any implementation thereof.

[0034] A fourth aspect of this application provides an electronic device, including at least one processor and a memory connected to the processor, wherein:

[0035] The memory is used to store computer programs;

[0036] The processor is used to execute the computer program so that the electronic device can implement the robot guidance task point control method of the first aspect or any implementation thereof.

[0037] The fifth aspect of this application provides a computer storage medium carrying one or more computer programs which, when executed by an electronic device, enable the electronic device to implement the robot guidance task point control method of the first aspect or any implementation thereof.

Attached Figure Description

[0038] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and that components and elements are not necessarily drawn to scale.

[0039] Figure 1 is a schematic diagram of the control system for robot guidance task points provided in this application;

[0040] Figure 2 is a flowchart illustrating a robot-guided task point control method provided in an embodiment of this application;

[0041] Figure 3 is a schematic diagram of the target image and correction information provided in an embodiment of this application;

[0042] Figure 4 is a schematic diagram of the first image provided in an embodiment of this application;

[0043] Figure 5 is a schematic flowchart of obtaining a target image corresponding to the target image acquisition pose based on the on-site image according to an embodiment of this application;

[0044] Figure 6 is a schematic diagram of the virtual robot determining the image acquisition range in a target three-dimensional scene according to an embodiment of this application;

[0045] Figure 7 is a schematic diagram of the image corresponding to the image acquisition range provided in an embodiment of this application;

[0046] Figure 8 is another schematic flowchart of obtaining a target image corresponding to the target image acquisition pose based on the on-site image according to an embodiment of this application;

[0047] Figure 9 is a schematic diagram of the structure of a robot-guided task point control device provided in an embodiment of this application;

[0048] Figure 10 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

Detailed Implementation

[0049] The embodiments of this application are described below with reference to the accompanying drawings. The terminology used in the implementation section of this application is for explaining specific embodiments only and is not intended to limit the scope of this application.

[0050] Those skilled in the art will recognize that, with technological advancements and the emergence of new scenarios, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.

[0051] The terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such terms are interchangeable where appropriate; this is merely a way of distinguishing objects with the same attributes in the embodiments of this application. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion, so that a process, method, system, product, or apparatus that comprises a series of elements is not necessarily limited to those elements but may include other elements not explicitly listed or inherent to those processes, methods, products, or apparatuses.

[0052] To facilitate understanding of the solution in this application, the control system for the robot guidance task point in this application will be introduced first.

[0053] Figure 1 is a schematic diagram of the control system for robot-guided task points provided in this application. The system in this embodiment may include: platform device 101 and real robot 102.

[0054] The platform device 101 can construct a virtual 3D scene from on-site images, build a virtual robot in that scene, collect the pose of the virtual robot at each task point in the virtual 3D scene, and then deploy the collected pose information to the real robot 102.

[0055] The real robot 102 is equipped with a moving device 1021 and a gimbal 1022. The moving device can move the real robot according to the acquisition pose, and the gimbal can acquire images of the environment around the real robot.

[0056] In this application, correction information is determined using on-site images. This correction information can instruct the real robot to correct its image acquisition pose when performing detection in a real scene, ensuring that the acquired image matches the task point.

[0057] Figure 2 is a flowchart illustrating a robot-guided task point control method according to an embodiment of this application, which may include steps 201 to 204. These steps are described in detail below.

[0058] 201. Obtain on-site images, which are used to construct the target 3D scene;

[0059] This embodiment provides a robot guidance task point control method, which is applied to an electronic device that serves as a robot control platform, such as a metaverse construction platform.

[0060] The on-site image can construct a target 3D scene, which is a virtual representation of the real scene.

[0061] The on-site image can be obtained by scanning using LiDAR (Light Detection and Ranging, also known as optical radar) technology, and can be visualized or non-visualized data after being processed by the platform.

[0062] As an example, the visualization data could be in FBX (Filmbox) format.

[0063] The on-site image can be a multi-frame image used to stitch together a panoramic image.

[0064] When constructing the target 3D scene, the multi-frame on-site images can be stitched together to form a panoramic image, and the panoramic image can be used to construct the target 3D scene.

[0065] In one possible implementation, the acquired on-site images include all images used to construct the target 3D scene.

[0066] This involves obtaining all images used to construct the target 3D scene, and then using these images to obtain the target image.

[0067] In one possible implementation, obtaining an on-site image includes: obtaining the on-site image corresponding to the target task point from an on-site image set, the on-site image set being a collection of images used to construct the target 3D scene.

[0068] The on-site image set includes several frames of images obtained by scanning the real scene, and only some of them may contain content relevant to the task point. In this embodiment, therefore, the on-site images obtained may be limited to those corresponding to the target task point, which reduces the amount of image processing in the subsequent process of obtaining the target image.

[0069] This involves obtaining only a portion of the images used to construct the target 3D scene, and then using that portion to obtain the target image.
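As a minimal sketch of picking the on-site images for one task point out of the full set, the Python snippet below keeps only the images shot within a given distance of the task point. The application does not state a selection criterion, so the distance rule, the `OnSiteImage` record, and the function `select_images_for_task_point` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class OnSiteImage:
    """Hypothetical record: one scanned frame and where it was shot."""
    image_id: str
    shot_x: float  # shooting position in the ground frame (assumed to be recorded)
    shot_y: float


def select_images_for_task_point(images, task_x, task_y, max_distance):
    """Return the images shot within max_distance of the task point, nearest first."""
    nearby = []
    for img in images:
        distance = math.hypot(img.shot_x - task_x, img.shot_y - task_y)
        if distance <= max_distance:
            nearby.append((distance, img))
    nearby.sort(key=lambda pair: pair[0])
    return [img for _, img in nearby]


image_set = [
    OnSiteImage("a", 0.0, 0.0),
    OnSiteImage("b", 5.0, 0.0),
    OnSiteImage("c", 1.0, 1.0),
]
selected = select_images_for_task_point(image_set, 0.0, 0.0, max_distance=2.0)
print([img.image_id for img in selected])  # ['a', 'c']
```

Only the selected frames would then be passed to the later step that produces the target image, which is the reduction in processing that the text describes.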

[0070] 202. Obtain the target image acquisition pose of the virtual robot at the corresponding target task point in the target 3D scene;

[0071] After constructing the target 3D scene, the platform uses the virtual robot to set the image acquisition pose for each task point in the target 3D scene.

[0072] The image acquisition pose is the pose adopted by the virtual robot in the target 3D scene to acquire images of the corresponding task points.

[0073] The image acquisition pose is the acquisition pose of the virtual robot at the target task point in the target 3D scene. Correspondingly, in the real scene, the acquisition pose of the real robot at the task point is the pose corresponding to the image acquisition pose.

[0074] In one possible implementation, the target image acquisition pose includes: the position coordinates of the virtual robot in the target 3D scene and the orientation angle of the virtual robot gimbal.

[0075] The position coordinates can be XYθ, where XY is the position in the environmental ground coordinate system and θ represents the rotation angle.

[0076] The orientation angle can be expressed as PTZ, which is an abbreviation for Pan / Tilt / Zoom, representing the pan / tilt head's omnidirectional (up and down, left and right) movement and the lens's zoom and magnification control.
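The pose described above, XYθ for the robot plus PTZ for the gimbal, can be pictured as a small data structure. The Python sketch below is illustrative only: the class and field names, the units, and the convention that pan is measured relative to the robot body heading are assumptions, not details taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasePose:
    """Robot position in the ground coordinate system (the XYθ of the text)."""
    x: float          # position along the ground X axis (unit assumed: metres)
    y: float          # position along the ground Y axis (unit assumed: metres)
    theta_deg: float  # rotation angle of the robot body, in degrees


@dataclass(frozen=True)
class GimbalPTZ:
    """Gimbal orientation expressed as Pan / Tilt / Zoom."""
    pan_deg: float   # left-right rotation (assumed relative to the robot body)
    tilt_deg: float  # up-down rotation
    zoom: float      # lens magnification factor


@dataclass(frozen=True)
class ImageAcquisitionPose:
    """Target image acquisition pose for one task point."""
    task_point_id: str
    base: BasePose
    gimbal: GimbalPTZ

    def camera_heading_deg(self) -> float:
        """Camera heading in the ground frame, wrapped to [0, 360)."""
        return (self.base.theta_deg + self.gimbal.pan_deg) % 360.0


pose = ImageAcquisitionPose(
    "cabinet-01",
    BasePose(x=12.5, y=3.0, theta_deg=90.0),
    GimbalPTZ(pan_deg=-30.0, tilt_deg=10.0, zoom=2.0),
)
print(pose.camera_heading_deg())  # 60.0
```

Keeping the base pose and the gimbal values in one immutable record makes it straightforward to store one such pose per task point and to send it to the real robot unchanged.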

[0077] In one possible implementation, the on-site data obtained from real-world scanning is processed by the platform to obtain visualized data. The rendering engine loads the visualized data and imports it into the virtual environment, so that the scanned content is visualized there. In the virtual environment, the virtual robot's pose for each task point is generated. Specifically, the pose XYθ of the virtual robot and the PTZ values of the virtual gimbal on the virtual robot are adjusted in the virtual environment, where the virtual camera on the virtual robot displays the part of the FBX scene it sees, until the gimbal image is in the appropriate position. This completes the editing of the task point.

[0078] In practical applications, multiple, dozens, or even hundreds of task points can be set for a specific scenario. The control method of this application embodiment can be executed for each task point to obtain the correction information for each task point.

[0079] 203. Obtain the target image corresponding to the target image acquisition pose based on the on-site image;

[0080] The on-site image is an image used to construct the target 3D scene. Due to factors such as the accuracy of the constructed model, the target 3D scene will have some error compared to the real scene.

[0081] Since the target 3D scene is constructed from the on-site image, there is a correspondence between the two. Based on this correspondence and the target image acquisition pose, the image content that the virtual robot can acquire in the target 3D scene with that pose can be extracted from the on-site image.

[0082] When the virtual robot acquires images in the target 3D scene using the target image acquisition pose, its acquisition range corresponds to a region. The target image is obtained by extracting, from the on-site image, the image of the region corresponding to that acquisition range.

[0083] The target image is a portion of the on-site image, cropped from it. The target task point is also arranged in the target 3D scene, and the target image is the part of the on-site image that corresponds to the target task point in the target 3D scene.

[0084] As an example, the task performed at the target task point is to collect detection data of a flow meter. The target image can be an image corresponding to the target task point, including the flow meter and its installation environment.

[0085] In one possible implementation, the on-site image is first imported into the target 3D scene to obtain the on-site image 3D scene. The image acquisition range of the virtual robot in the target 3D scene is then determined. The target image is obtained by using the on-site image 3D scene and the image corresponding to the image acquisition range. This process is explained in detail with reference to Figure 5.

[0086] In one possible implementation, a correspondence can be established between the shooting pose of the on-site image and the target image acquisition pose. Based on this correspondence, the target image is cropped from the on-site image. This process is explained in detail with reference to Figure 8.
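To make the idea of a cropping range concrete, the Python sketch below maps a camera heading, tilt, and zoom onto a rectangle in a stitched panorama. It rests on several assumptions that the application does not state: the on-site image is an equirectangular panorama shot from the same position as the virtual robot, the field of view at zoom 1 is known, and dividing the field of view by the zoom factor is an acceptable approximation. The function name and default values are hypothetical.

```python
def crop_range_in_panorama(pano_width, pano_height, heading_deg, tilt_deg, zoom,
                           base_hfov_deg=60.0, base_vfov_deg=40.0):
    """Approximate crop rectangle (left, top, width, height) in pixels.

    Assumes an equirectangular panorama: the width spans 360 degrees of
    heading and the height spans 180 degrees of tilt (+90 at the top row).
    """
    # Field of view shrinks as zoom grows (linear approximation).
    hfov = base_hfov_deg / zoom
    vfov = base_vfov_deg / zoom

    # Pixel position of the viewing direction.
    center_u = (heading_deg % 360.0) / 360.0 * pano_width
    center_v = (90.0 - tilt_deg) / 180.0 * pano_height

    # Size of the crop in pixels.
    crop_w = hfov / 360.0 * pano_width
    crop_h = vfov / 180.0 * pano_height

    # Horizontal coordinate wraps around the panorama seam; the vertical one is clamped.
    left = (center_u - crop_w / 2.0) % pano_width
    top = min(max(center_v - crop_h / 2.0, 0.0), pano_height - crop_h)
    return round(left), round(top), round(crop_w), round(crop_h)


print(crop_range_in_panorama(3600, 1800, heading_deg=90.0, tilt_deg=0.0, zoom=2.0))
# (750, 800, 300, 200)
```

A rectangle whose `left + width` exceeds the panorama width crosses the seam and would have to be assembled from two slices. A faithful perspective view would also need reprojection rather than a plain rectangular crop.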

[0087] 204. Obtain correction information based on the target image. This correction information is used to correct the image acquisition pose of the real robot when the real robot detects the target task point in a real scene.

[0088] When a real robot detects a target task point in a real scene, it may be unable to reach the absolute position and configuration parameters specified during task editing when acquiring images of the target task point. Its image acquisition pose then differs from the edited position, and the configuration parameters it actually achieves differ from the edited ones, so the image captured by the real robot cannot support completion of the task. The correction information is the information used to correct the image acquisition pose of the real robot during detection.

[0089] The correction information corresponds to the position and configuration parameters specified during task editing. When the real robot performs image acquisition on the target task point, if it reaches the absolute position and configuration parameters specified during task editing, the image it acquires will match the correction information.

[0090] The matching includes: the content corresponding to the correction information is in the central area of the image acquired by the real robot, and the image acquired by the real robot contains only the content corresponding to the correction information. This application does not limit the specific matching method.

[0091] As an example, the correction information could be task-specific information, such as the task nameplate set at the task point.

[0092] The target image is extracted from the on-site image and contains information about the target task point in the real scene. A portion of the target image is extracted as correction information, which can then be compared with the image acquired in the real scene to correct the image acquisition pose of the real robot.

[0093] The platform will configure the target image acquisition pose determined in the target 3D scene to the real robot. The real robot will then use the configured acquisition pose to reach the corresponding position in the real scene and perform image acquisition.

[0094] The target image contains information about the real scene of the target task point in the on-site image, and a portion of the target image is extracted as correction information.

[0095] In one possible implementation, selection information can be received, which is a selection of the cropping range of the target image. Using this selection information, the content of the portion within the cropping range of the target image is cropped, and the cropped content is the correction information.

[0096] The selection information can be a range manually entered by a technician; or it can be a range automatically selected from the portion of the target image containing the feature information of the target task point, based on the feature information of the target task point that has been pre-entered.

[0097] Figure 3 is a schematic diagram of the target image and correction information provided in an embodiment of this application. The target image 301 includes the target task point and the surrounding environment, and the correction information 302 includes the target task point. The correction information is obtained by cropping a portion of the target image.

[0098] In Figure 3, the target task point is a power cabinet (35kV 1# substation transformer cabinet). The target image 301 is an image containing the power cabinet and its surrounding environment. A portion of the target image 301 is extracted as correction information. The correction information 302 is the display area image of the power cabinet.
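Cropping the correction information out of the target image, given a selected range, can be sketched as follows. This is an illustrative Python example in which the image is a plain list of pixel rows and the selection is a rectangle; the function name `crop_correction_info` and the rectangle representation are assumptions, since the application does not prescribe a data format.

```python
def crop_correction_info(target_image, left, top, width, height):
    """Return the selected rectangle of a row-major image (a list of pixel rows)."""
    rows = len(target_image)
    cols = len(target_image[0]) if rows else 0
    if width <= 0 or height <= 0:
        raise ValueError("selection must have a positive width and height")
    if left < 0 or top < 0 or left + width > cols or top + height > rows:
        raise ValueError("selection lies outside the target image")
    # Slice the rows first, then the columns of each kept row.
    return [row[left:left + width] for row in target_image[top:top + height]]


# A 4x4 toy "target image" whose pixel value encodes its row and column.
target_image = [[10 * r + c for c in range(4)] for r in range(4)]
correction_info = crop_correction_info(target_image, left=1, top=1, width=2, height=2)
print(correction_info)  # [[11, 12], [21, 22]]
```

The rectangle could come either from a technician's manual selection or from an automatic detector of the task point's features, as the text describes; the crop itself is the same in both cases.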

[0099] In this embodiment, an on-site image used to construct a target 3D scene is obtained, together with the target image acquisition pose of the virtual robot at the corresponding target task point in the target 3D scene. A target image corresponding to the target image acquisition pose is obtained from the on-site image, and correction information is obtained from the target image. Because the target image is cropped from the on-site image according to the target image acquisition pose, it is the portion of the on-site image that corresponds to the target task point in the target 3D scene. The correction information obtained from it therefore reflects the real scene and is tied to the target image acquisition pose. Using this correction information, the image acquisition pose of the real robot when it detects the target task point in the real scene can be corrected so that the real robot reaches the absolute position and configuration parameters specified during task editing, thereby improving the task completion rate.

[0100] In one possible implementation, after obtaining the correction information, the following is also included:

[0101] In response to receiving a command, the target image acquisition pose and the correction information are sent to the real robot. The real robot can then acquire a first image in the real scene based on the target image acquisition pose and correct the image acquisition pose based on the first image and the correction information.

[0102] The real robot can be equipped with a recognition algorithm. The recognition algorithm in the robot can be trained using the correction information. The recognition algorithm can then be used to identify the relevant features of the target task point in the captured first image. It can be determined whether the relevant features of the target task point meet the specific task completion conditions. If not, the correction parameters of the real robot can be determined.

[0103] Alternatively, the real robot can be equipped with a different recognition algorithm that compares the correction information with the captured first image. When the two correspond, it is determined whether the first image meets the task completion conditions. If it does not, the correction information and the first image are used to determine the correction parameters of the real robot.
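One simple way to picture the comparison between the correction information and the first image is a brute-force template search. The Python sketch below is not the recognition algorithm of the application, which is left unspecified; it substitutes a mean-absolute-difference search over grayscale images stored as lists of rows, and the function name `locate_template` is hypothetical.

```python
def locate_template(image, template):
    """Find where template best fits inside image.

    Both arguments are row-major lists of grayscale values. Returns
    (left, top, mean_abs_diff); a score of 0.0 means a pixel-exact match.
    """
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    if tmpl_h > image_h or tmpl_w > image_w:
        raise ValueError("template is larger than the image")

    best = None
    for top in range(image_h - tmpl_h + 1):
        for left in range(image_w - tmpl_w + 1):
            total = 0
            for r in range(tmpl_h):
                for c in range(tmpl_w):
                    total += abs(image[top + r][left + c] - template[r][c])
            score = total / (tmpl_h * tmpl_w)
            # Keep the first position with the lowest score.
            if best is None or score < best[2]:
                best = (left, top, score)
    return best


# Toy first image: a 5x5 dark frame with a 2x2 bright patch at column 3, row 1.
first_image = [[0] * 5 for _ in range(5)]
patch = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        first_image[1 + r][3 + c] = patch[r][c]

print(locate_template(first_image, patch))  # (3, 1, 0.0)
```

The returned position tells where the task point appears in the first image. A threshold on the score (an assumption, to be tuned) could decide whether the first image "matches" the correction information at all.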

[0104] The recognition algorithm can be an image recognition algorithm, which can identify the features of objects in an image and obtain the image features of the object.

[0105] In one possible implementation, if the first image acquired for a certain task point matches the corresponding correction information and the first image meets the task completion conditions, no adjustment is needed for the image acquisition pose of that task point; if the first image acquired for a certain task point matches the corresponding correction information but does not meet the task completion conditions, the image acquisition pose of that task point needs to be adjusted; if the first image acquired for a certain task point does not match the corresponding correction information, the image acquisition pose of that task point needs to be adjusted.
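The three cases above reduce to a small decision rule, sketched here in Python. The enum and function names are illustrative assumptions; the logic follows the cases as stated in the text.

```python
from enum import Enum


class Action(Enum):
    KEEP_POSE = "keep the image acquisition pose"
    ADJUST_POSE = "adjust the image acquisition pose"


def decide_action(matches_correction_info: bool, meets_completion_condition: bool) -> Action:
    """Apply the three cases: only a matching image that also meets the
    task completion condition leaves the pose unchanged."""
    if matches_correction_info and meets_completion_condition:
        return Action.KEEP_POSE
    # Covers both "matches but condition not met" and "does not match".
    return Action.ADJUST_POSE


print(decide_action(True, True).name)    # KEEP_POSE
print(decide_action(True, False).name)   # ADJUST_POSE
print(decide_action(False, False).name)  # ADJUST_POSE
```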

[0106] It should be noted that, although there are errors between the constructed target 3D scene and the real environment, as well as positioning errors and mechanical-structure operation errors of the real robot during task execution, the combined error is generally not large. The first image acquired by the real robot at a task point will therefore generally not fail to match the corresponding correction information. The usual case is that the first image matches the corresponding correction information but does not meet the task completion conditions, so the image acquisition pose for that task point needs to be adjusted.

[0107] As an example, the task completion condition is that the area corresponding to the target task point in the first image is located in the central region.

[0108] The image corresponding to the task point in the first image can be located at any position in the first image. If it is located in the central region of the first image, the task completion condition is determined to be met; otherwise, the task completion condition is not met. Of course, if the first image does not contain the image corresponding to the task point, the task completion condition is determined not to be met.
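The central-region condition can be sketched as a check on the bounding box of the task point in the first image. In the Python example below, the size of the central region (`central_fraction`), the bounding-box format, and the linear pixel-to-angle mapping used to suggest a gimbal correction are all assumptions; the application does not specify how correction parameters are computed.

```python
def center_offset(image_w, image_h, box):
    """Pixel offset (dx, dy) of the box centre from the image centre."""
    left, top, width, height = box
    return (left + width / 2 - image_w / 2, top + height / 2 - image_h / 2)


def meets_completion_condition(image_w, image_h, box, central_fraction=0.2):
    """True when the box centre lies in the central region of the image.

    box is (left, top, width, height) in pixels, or None when the task
    point was not found in the first image.
    """
    if box is None:
        return False
    dx, dy = center_offset(image_w, image_h, box)
    return (abs(dx) <= central_fraction * image_w / 2
            and abs(dy) <= central_fraction * image_h / 2)


def suggest_gimbal_correction(image_w, image_h, box, hfov_deg, vfov_deg):
    """Rough pan/tilt change (degrees) that would centre the box.

    Uses a linear pixel-to-angle approximation; positive pan is assumed to
    turn right and positive tilt to turn up.
    """
    dx, dy = center_offset(image_w, image_h, box)
    return dx / image_w * hfov_deg, -dy / image_h * vfov_deg


W, H = 1920, 1080
left_box = (100, 440, 300, 200)      # task point on the left side
centered_box = (810, 440, 300, 200)  # task point in the centre

print(meets_completion_condition(W, H, left_box))      # False
print(meets_completion_condition(W, H, centered_box))  # True
pan_delta, tilt_delta = suggest_gimbal_correction(W, H, left_box, 60.0, 34.0)
print(round(pan_delta, 2))  # -22.19
```

Here a negative pan change means turning the gimbal to the left, toward the task point; the real robot would re-acquire an image after applying it and repeat the check.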

[0109] The command is a task issuance instruction, generated when an operator of the platform triggers the issue button. Its purpose is to send the image acquisition poses of the corresponding task points configured for the virtual robot in the target 3D scene to the real robot, so that the real robot can use the received image acquisition poses to acquire images of the corresponding task points in the real scene and complete the task.

[0110] Specifically, the image acquisition pose corresponding to each task point and its corresponding correction information are sent to the real robot so that the real robot can use the image acquisition pose to acquire the first image of the corresponding task point.

[0111] Figure 4 is a schematic diagram of the first image provided in an embodiment of this application. The first image 401 includes an image 402 corresponding to the task point, which contains the display area of the 35kV 1# substation power cabinet. In (a), the image 402 is located on the left side of the first image 401, which does not meet the task completion condition; in (b), the image 402 is located in the upper right corner of the first image 401, which does not meet the task completion condition; in (c), the image 402 is located in the center of the first image 401, which meets the task completion condition; in (d), the image 402 contains only a part of the task point and is located on the left side of the first image 401, which does not meet the task completion condition.

[0112] In one possible implementation, after the real robot adjusts its image acquisition pose, the adjusted image acquisition pose can be fed back to the platform so that the platform can refer to the adjusted image acquisition pose when constructing the target 3D scene, thereby improving the accuracy of the target 3D scene construction.

[0113] In one possible implementation, after the real robot adjusts its image acquisition pose, the adjusted image acquisition pose can be recorded. When the same task point is executed again, the adjusted image acquisition pose can be used, reducing the amount of data processing for the real robot.

[0114] In one possible implementation, the real robot uses its configured image acquisition pose and correction information to determine whether a correction process is needed each time it executes a corresponding task point, and performs correction when necessary, until the number of times no correction is needed reaches a certain number. At this point, the robot can directly execute the corresponding task point using the image acquisition pose without further determining whether a correction process is needed, thus reducing the data processing load of the real robot.
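The counting behaviour described in this implementation could be sketched as follows. The class name `CorrectionGate`, the threshold of three runs, and the choice to reset the count whenever a correction is needed are assumptions for illustration; the embodiment only states that the check can be skipped once the number of times no correction is needed reaches a certain number.

```python
class CorrectionGate:
    """Decide per task point whether the correction check is still needed."""

    def __init__(self, required_clean_runs: int = 3) -> None:
        # Hypothetical threshold: runs without correction before skipping.
        self.required_clean_runs = required_clean_runs
        self._clean_runs = {}  # task point id -> runs without correction

    def should_check(self, task_point_id: str) -> bool:
        """True while the task point has not yet reached the threshold."""
        return self._clean_runs.get(task_point_id, 0) < self.required_clean_runs

    def record(self, task_point_id: str, correction_was_needed: bool) -> None:
        """Record the outcome of one execution of the task point."""
        if correction_was_needed:
            # Assumption: a needed correction restarts the count.
            self._clean_runs[task_point_id] = 0
        else:
            self._clean_runs[task_point_id] = (
                self._clean_runs.get(task_point_id, 0) + 1
            )
```

A real robot would call `should_check` before executing a task point and `record` afterwards; once `should_check` returns False, the stored image acquisition pose is used directly.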

[0115] In this embodiment, after the correction information and the target image acquisition pose are obtained, the target image acquisition pose and the correction information are sent to the real robot in response to receiving the task issuance instruction. The real robot can acquire a first image in the real scene based on the target image acquisition pose, and correct the image acquisition pose according to the first image and the correction information. In this way, the real robot can automatically adjust its image acquisition pose in the real scene without technicians having to go to the site, which improves the efficiency of task completion.

[0116] Figure 5 is a flowchart illustrating the process of obtaining a target image corresponding to the target image acquisition pose based on the scene image according to an embodiment of this application. It may include steps 501 to 503, which are described in detail below.

[0117] 501. Based on the shooting pose of the on-site image, import the on-site image into the target 3D scene to obtain the on-site image 3D scene;

[0118] The shooting pose is the pose of the image acquisition device at the moment it captures an image in the real scene.

[0119] In one possible implementation, the shooting pose of each on-site image is recorded during capture. It includes the position coordinates and the orientation angle of the shooting device in the real scene. The position coordinates can include XYθ, and the orientation angle can be expressed as PTZ (pan, tilt, zoom).

[0120] The shooting pose can be saved on the platform along with the on-site images.
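One way to picture the record saved alongside each on-site image is the following sketch. The field names, the reading of XYθ as planar position plus heading, the units, and the JSON encoding are all assumptions made for illustration; the embodiment does not define a storage format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ShootingPose:
    """Pose of the shooting device when an on-site image was captured."""
    x: float          # position in the real scene (assumed metres)
    y: float
    theta_deg: float  # heading of the device (assumed degrees)
    pan_deg: float    # gimbal pan
    tilt_deg: float   # gimbal tilt
    zoom: float       # zoom factor


@dataclass(frozen=True)
class OnSiteImageRecord:
    """An on-site image identifier stored together with its shooting pose."""
    image_id: str
    pose: ShootingPose


def to_json(record: OnSiteImageRecord) -> str:
    """Serialise a record so it can be saved with the image."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json(text: str) -> OnSiteImageRecord:
    """Rebuild a record from its serialised form."""
    data = json.loads(text)
    return OnSiteImageRecord(
        image_id=data["image_id"], pose=ShootingPose(**data["pose"])
    )
```

Keeping the pose in the same record as the image identifier means the pose is available later when the image is imported into the target 3D scene.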

[0121] Specifically, importing the on-site image into the target 3D scene according to its shooting pose yields the on-site image 3D scene, which is the 3D scene corresponding to the part of the scene covered by the on-site image.

[0122] The on-site image 3D scene corresponds to the real scene shown in the on-site image.

[0123] In this embodiment, it is also possible to import into the target 3D scene only those on-site images in the on-site image set that correspond to the task point, to obtain the on-site image 3D scene.

[0124] 502. Based on the target image acquisition pose, determine the image acquisition range of the virtual robot in the target 3D scene;

[0125] The target image acquisition pose is the image acquisition pose set by the virtual robot in the target 3D scene for the target task point.

[0126] In determining the target image acquisition pose in the target 3D scene, image acquisition parameters from a gimbal in a real robot are incorporated. In this embodiment, these image acquisition parameters are further utilized to determine the image acquisition range of the virtual robot in the target 3D scene based on the determined target image acquisition pose.

[0127] The image acquisition range can be the range determined by the virtual robot in the target 3D scene based on the target image acquisition pose.

[0128] The virtual robot uses a virtual gimbal camera, and the image acquisition range is the range corresponding to the FOV (field of view, the range that the lens can cover) parameter of the virtual gimbal camera.
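To illustrate how an FOV parameter bounds the image acquisition range, the sketch below checks whether a point of the target 3D scene falls inside the horizontal field of view of the virtual gimbal camera. It is a simplified top-down (2D) model; the function name, the parameters, and the optional `max_range` limit are assumptions, not details from the embodiment.

```python
import math
from typing import Optional


def in_acquisition_range(
    robot_x: float,
    robot_y: float,
    view_dir_deg: float,
    hfov_deg: float,
    point_x: float,
    point_y: float,
    max_range: Optional[float] = None,
) -> bool:
    """Return True if the point lies within the horizontal FOV wedge.

    view_dir_deg is the direction the virtual camera faces in the scene
    plane, and hfov_deg is its full horizontal field of view.
    """
    dx = point_x - robot_x
    dy = point_y - robot_y
    if dx == 0 and dy == 0:
        return True  # the camera position itself
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular difference wrapped into [-180, 180).
    diff = (bearing - view_dir_deg + 180.0) % 360.0 - 180.0
    if abs(diff) > hfov_deg / 2.0:
        return False
    if max_range is not None and math.hypot(dx, dy) > max_range:
        return False
    return True
```

A full implementation would also take the vertical field of view and occlusion by scene geometry into account; the wedge test above only conveys the basic idea.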

[0129] 503. Based on the on-site image 3D scene, render the image corresponding to the image acquisition range to obtain the target image.

[0130] Among the multiple frames of on-site images, the portions that fall within the image acquisition range are determined. A given task point may correspond to two or more frames of on-site images.

[0131] Specifically, based on the aforementioned determined 3D scene of the on-site image, the image corresponding to the image acquisition range is rendered to obtain the target image, which contains all the content information of the corresponding task point.

[0132] As an example, task point A corresponds to a cabinet-type device. Each frame of on-site images is imported into the target 3D scene according to its shooting pose to obtain the on-site image 3D scene of task point A. The image acquisition range of the virtual robot for task point A in the target 3D scene is then determined; in this example it corresponds to two frames of on-site images. These two frames are rendered based on the on-site image 3D scene to obtain the target image.

[0133] Figure 6 is a schematic diagram illustrating the determination of the image acquisition range of a virtual robot in a target 3D scene according to an embodiment of this application. In the target 3D scene 601, the image acquisition range 603 corresponds to the virtual robot 602. The image acquisition range in Figure 6 is represented by a dashed line. In the target 3D scene, any area falling within the image acquisition parameters corresponding to the virtual robot belongs to the image acquisition range.

[0134] Figure 7 is a schematic diagram of the on-site images corresponding to the image acquisition range provided in an embodiment of this application. In this schematic diagram, the image acquisition range corresponds to three on-site images 701-703, and the task point is the 35kV1# substation power cabinet. For this task point, the image acquisition range corresponds to the right half of on-site image 701, the entirety of on-site image 702, and the left half of on-site image 703. The three images are rendered based on the on-site image 3D scene to obtain the target image 704. The target image includes the content of the right half of on-site image 701, the entirety of on-site image 702, and the left half of on-site image 703.
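The situation in Figure 7 can be reproduced with a small interval calculation. The sketch assumes that each on-site image covers a known horizontal angular span and that angle increases from left to right across each image; the spans used in the usage note are invented for illustration and do not come from the embodiment.

```python
from typing import Dict, Tuple

Span = Tuple[float, float]


def covered_parts(
    image_spans: Dict[str, Span], acquisition_span: Span
) -> Dict[str, Span]:
    """Return, per image, the fraction of its width inside the range.

    Each result is (start_fraction, end_fraction) measured from the left
    edge of the image. Images without overlap are omitted.
    """
    low, high = acquisition_span
    parts = {}
    for image_id, (start, end) in image_spans.items():
        overlap_start = max(start, low)
        overlap_end = min(end, high)
        if overlap_end <= overlap_start:
            continue  # this image lies outside the acquisition range
        width = end - start
        parts[image_id] = (
            (overlap_start - start) / width,
            (overlap_end - start) / width,
        )
    return parts
```

If images 701, 702 and 703 are assumed to cover 0 to 40, 40 to 80 and 80 to 120 degrees, an acquisition range of 20 to 100 degrees selects the right half of 701, all of 702, and the left half of 703.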

[0135] In this embodiment, the on-site image is imported into the target 3D scene using the shooting pose of the on-site image to obtain the on-site image 3D scene, thus realizing the construction of the on-site image 3D scene of the target task point; based on the previously determined target image acquisition pose, the image acquisition range of the virtual robot in the target 3D scene is determined; based on the on-site image 3D scene, the image corresponding to the image acquisition range is rendered to obtain the target image, thus realizing the acquisition of the target image corresponding to the target image acquisition pose based on the on-site image. The target image is an image containing the target task point and its surrounding environment, providing a basis for subsequent determination of correction information.

[0136] Figure 8 is a flowchart illustrating the process of obtaining a target image corresponding to the target image acquisition pose based on the scene image according to an embodiment of this application. It may include steps 801 to 804, which are described in detail below.

[0137] 801. Obtain the shooting pose of the on-site image;

[0138] The shooting pose is the pose of the image acquisition device at the moment it captures an image in the real scene.

[0139] During the shooting process, the shooting pose of each on-site image is recorded. It includes the position coordinates and the orientation angle of the shooting device in the real scene. The position coordinates can include XYθ, and the orientation angle can be expressed as PTZ (pan, tilt, zoom).

[0140] The shooting pose can be saved on the platform along with the on-site images.

[0141] 802. Based on the scene image and the shooting pose, establish the correspondence between the target image acquisition pose and the shooting pose;

[0142] Since the spatial positions in the real 3D scene and the virtual 3D scene correspond one to one, a correspondence can be established between the shooting pose of an on-site image obtained by scanning the real 3D scene and the target image acquisition pose in the virtual 3D scene.

[0143] Generally, when the real 3D scene is scanned to obtain on-site images, each task point needs to be captured in detail. The shooting poses of the on-site images therefore include a shooting pose for each task point, and that shooting pose faces the task point.

[0144] The correspondence between the target image acquisition pose and the shooting pose represents the difference between the acquisition angle of the virtual robot's virtual gimbal and the acquisition angle of the scanning device in the real scene, and the difference between the two is fixed.

[0145] The target image acquisition pose is the acquisition pose for the target task point. For the target task point, the shooting pose of the corresponding on-site image is determined; this shooting pose is the counterpart, in the real 3D scene, of the target image acquisition pose.

[0146] In practice, for each task point in the target 3D scene, the corresponding shooting poses for the target image acquisition pose can be determined, and the correspondence between the image acquisition pose and the shooting pose of each task point can be established.

[0147] 803. Based on this correspondence, determine the cropping range corresponding to the target image acquisition pose in the on-site image;

[0148] This correspondence represents the difference between the acquisition angle of the virtual gimbal of the virtual robot and the acquisition angle of the scanning device in the real scene, and the difference between the two is fixed.

[0149] Therefore, based on this correspondence, the cropping range corresponding to the target image acquisition pose is determined in the on-site image. The cropping range can span one or more frames of on-site images, and can cover part or all of any given frame.

[0150] In determining the target image acquisition pose in the target 3D scene, image acquisition parameters from a gimbal in a real robot are incorporated. In this embodiment, these image acquisition parameters are further utilized, combined with the target image acquisition pose, to determine the corresponding image acquisition range in the on-site image.
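As a hedged illustration of step 803, the sketch below converts the angular difference between the shooting pose and the target image acquisition pose into a range of pixel columns in one on-site image. It assumes an ideal pinhole camera, a shared optical centre for both poses, and a pan angle that increases to the right; the function and parameter names are hypothetical, and the embodiment does not prescribe this particular model.

```python
import math
from typing import Tuple


def cropping_columns(
    image_width: int,
    image_hfov_deg: float,
    shooting_pan_deg: float,
    target_pan_deg: float,
    target_hfov_deg: float,
) -> Tuple[int, int]:
    """Return (left, right) pixel columns of the cropping range.

    The result is clamped to the on-site image; left == right means the
    target view does not overlap this image.
    """
    if not 0.0 < image_hfov_deg < 180.0:
        raise ValueError("image_hfov_deg must be between 0 and 180")
    half_image = image_hfov_deg / 2.0
    # Focal length in pixels from the pinhole model.
    focal_px = (image_width / 2.0) / math.tan(math.radians(half_image))
    centre = image_width / 2.0
    # Difference between the two acquisition angles.
    offset = target_pan_deg - shooting_pan_deg

    def column(angle_deg: float) -> int:
        # Clamp to the part of the view that the on-site image covers.
        clamped = max(-half_image, min(half_image, angle_deg))
        return round(centre + focal_px * math.tan(math.radians(clamped)))

    left = column(offset - target_hfov_deg / 2.0)
    right = column(offset + target_hfov_deg / 2.0)
    return left, right
```

For a 1000-pixel-wide on-site image with a 90 degree field of view, a target view of 40 degrees aimed along the same direction maps to a symmetric band around the image centre, while a target view turned 45 degrees to the right maps to a band that ends at the right edge.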

[0151] 804. Based on the cropping range, crop the target image from the on-site image.

[0152] The cropping range can correspond to part or all of the scene image.

[0153] When the cropping range is a portion of the scene image, the corresponding portion of the scene image is cropped to obtain the target image.

[0154] As an example, the cropping range is the left half of the scene image, and the left half of the scene image is cropped as the target image.

[0155] When the cropping range is the entirety of the scene image, the scene image is taken as the target image.

[0156] If the cropping range involves multiple frames of on-site images, the cropped images can be stitched together in a set order to obtain the target image.

[0157] In one possible implementation, the stitching can be done through rendering.
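The cropping and stitching of step 804 can be sketched with images represented as plain lists of pixel rows. This is a simplified illustration that joins pieces side by side in the given order; it stands in for, and is not, the rendering-based stitching mentioned above, and the helper names are hypothetical.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values


def crop_columns(image: Image, left: int, right: int) -> Image:
    """Keep columns left (inclusive) to right (exclusive) of every row."""
    return [row[left:right] for row in image]


def stitch_horizontally(pieces: List[Image]) -> Image:
    """Join cropped pieces side by side in the given order."""
    if not pieces:
        raise ValueError("at least one piece is required")
    height = len(pieces[0])
    if any(len(piece) != height for piece in pieces):
        raise ValueError("all pieces must have the same height")
    return [
        [pixel for piece in pieces for pixel in piece[row]]
        for row in range(height)
    ]


# Three small 2 x 4 frames standing in for on-site images.
frame_a = [[1, 2, 3, 4], [5, 6, 7, 8]]
frame_b = [[11, 12, 13, 14], [15, 16, 17, 18]]
frame_c = [[21, 22, 23, 24], [25, 26, 27, 28]]

# Right half of the first frame, all of the second, left half of the third.
target_image = stitch_horizontally(
    [crop_columns(frame_a, 2, 4), frame_b, crop_columns(frame_c, 0, 2)]
)
```

When the cropping range covers a whole frame, that frame is passed through unchanged, which matches the case in which the on-site image itself is taken as the target image.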

[0158] In this embodiment, the shooting pose of the scene image is obtained; based on the scene image and the shooting pose, a correspondence between the target image acquisition pose and the shooting pose is established; according to the correspondence, the cropping range corresponding to the target image acquisition pose in the scene image is determined; according to the cropping range, the target image is cropped from the scene image. In this process, the spatial position correspondence between the real 3D scene and the target 3D scene is utilized to establish a correspondence between the shooting pose of the scene image in the real environment and the target image acquisition pose. This correspondence is used to determine the cropping range corresponding to the target image acquisition pose in the scene image, and then the target image is cropped from the scene image according to the cropping range. This achieves the acquisition of a target image corresponding to the target image acquisition pose based on the scene image. The target image is an image containing the target task point and its surrounding environment, providing a basis for subsequent determination of correction information.

[0159] The above describes a robot guidance task point control method provided by the embodiments of this application. The following will describe the apparatus for executing the above robot guidance task point control method.

[0160] Figure 9 is a schematic diagram of a robot guidance task point control device provided in an embodiment of this application. As shown in Figure 9, the robot guidance task point control device 900 includes:

[0161] The on-site image acquisition module 901 is used to acquire on-site images, which are used to construct the target 3D scene.

[0162] The pose acquisition module 902 is used to acquire the target image acquisition pose of the virtual robot for the target task point in the target 3D scene.

[0163] The target image acquisition module 903 is used to acquire a target image corresponding to the target image acquisition pose based on the scene image;

[0164] The correction information acquisition module 904 is used to obtain correction information based on the target image. This correction information is used to correct the image acquisition pose of the real robot when the real robot detects the target task point in a real scene.
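The way modules 901 to 904 hand data to one another could be sketched as below. Each module is modelled as a callable supplied by the caller; the class name, the method name `run`, and the use of untyped payloads are assumptions for illustration, since the embodiment does not fix any programming interface.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass
class GuidanceTaskPointController:
    """Chains the four modules of device 900 (interfaces are hypothetical)."""
    acquire_on_site_images: Callable[[], Sequence[Any]]        # module 901
    acquire_target_pose: Callable[[str], Any]                  # module 902
    acquire_target_image: Callable[[Sequence[Any], Any], Any]  # module 903
    acquire_correction_info: Callable[[Any], Any]              # module 904

    def run(self, task_point_id: str) -> Tuple[Any, Any]:
        """Return the target pose and correction info for one task point."""
        images = self.acquire_on_site_images()
        pose = self.acquire_target_pose(task_point_id)
        target_image = self.acquire_target_image(images, pose)
        correction = self.acquire_correction_info(target_image)
        return pose, correction
```

The returned pair is what would later be sent to the real robot; passing the modules in as callables keeps the sketch independent of how each module is actually implemented.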

[0165] One possible implementation also includes:

[0166] The response module is used to respond to the received instruction by sending the target image acquisition pose and the correction information to the real robot. The real robot can acquire a first image in the real scene based on the target image acquisition pose and correct the image acquisition pose based on the first image and the correction information.

[0167] In one possible implementation, the on-site image includes all images used to construct the target 3D scene.

[0168] In one possible implementation, the scene image acquisition module is specifically used for:

[0169] Obtain the scene image corresponding to the target task point from the scene image set, which is a collection of images used to construct the target 3D scene.

[0170] In one possible implementation, the target image acquisition module includes:

[0171] The import unit is used to import the scene image into the target 3D scene according to the shooting pose of the scene image to obtain the scene image 3D scene;

[0172] The first determining unit is used to determine the image acquisition range of the virtual robot in the target three-dimensional scene based on the target image acquisition pose;

[0173] The rendering unit is used to render, based on the on-site image 3D scene, the image corresponding to the image acquisition range to obtain the target image.

[0174] In one possible implementation, the target image acquisition module includes:

[0175] The acquisition unit is used to acquire the shooting pose of the scene image;

[0176] The establishment unit is used to establish the correspondence between the target image acquisition pose and the shooting pose based on the scene image and the shooting pose;

[0177] The second determining unit is used to determine, based on the correspondence, the cropping range corresponding to the target image acquisition pose in the on-site image;

[0178] The cropping unit is used to crop the target image from the scene image according to the cropping range.

[0179] In one possible implementation, the target image acquisition pose includes:

[0180] The position coordinates of the virtual robot in the target 3D scene and the orientation angle of the virtual robot gimbal.

[0181] It should be noted that the functional explanations of each component in the robot guidance task point control device provided in this embodiment are as described in the explanations in the aforementioned method embodiments, and will not be repeated in this embodiment.

[0182] In this embodiment, on-site images used to construct the target 3D scene are obtained, together with the target image acquisition pose of the virtual robot for the target task point in the target 3D scene. A target image corresponding to the target image acquisition pose is obtained from the on-site images, and correction information is obtained from the target image. Using the target image acquisition pose of the virtual robot at the target task point, the on-site images are cropped to obtain the target image, which is the portion of the on-site images that corresponds to the target task point in the target 3D scene. The correction information obtained from this target image therefore corresponds to the real scene and is related to the target image acquisition pose. Using this correction information, the image acquisition pose of the real robot can be corrected when it detects the target task point in the real scene, ensuring that the real robot can reach the absolute position and configuration parameters of the task, thereby improving the task completion rate.

[0183] This application also provides an electronic device. Figure 10 shows a schematic diagram of a structure suitable for implementing the electronic device in this application. The electronic device in this application may include, but is not limited to, mobile terminals such as mobile phones, laptops, PDAs (personal digital assistants), and PADs (tablet computers), and fixed terminals such as desktop computers. The electronic device shown in Figure 10 is merely an example and should not impose any limitations on the functionality and scope of use of the embodiments of this application.

[0184] As shown in Figure 10, the electronic device may include a processing unit (e.g., a central processing unit, a graphics processing unit, etc.) 1001, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage device 1008 into a random access memory (RAM) 1003. When the electronic device is powered on, the RAM 1003 also stores various programs and data required for the operation of the electronic device. The processing unit 1001, ROM 1002, and RAM 1003 are interconnected via a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.

[0185] Typically, the following devices can be connected to the I/O interface 1005: input devices 1006 including, for example, a touchscreen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 1007 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 1008 including, for example, a memory card, hard disk, etc.; and communication devices 1009. The communication device 1009 allows the electronic device to exchange data with other devices over wireless or wired connections. Although Figure 10 shows an electronic device with various devices, it should be understood that not all of the devices shown need to be implemented or provided. Alternatively, more or fewer devices may be implemented or provided.

[0186] This application also provides a computer program product including computer-readable instructions, which, when executed on an electronic device, cause the electronic device to implement any of the robot guidance task point control methods provided in this application.

[0187] This application also provides a computer-readable storage medium carrying one or more computer programs. When the one or more computer programs are executed by an electronic device, the electronic device can implement any of the robot guidance task point control methods provided in this application.

[0188] It should also be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. In addition, in the device embodiment drawings provided in this application, the connection relationship between modules indicates that they have a communication connection, which can be implemented as one or more communication buses or signal lines.

[0189] Through the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus necessary general-purpose hardware, or it can be implemented by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memory, special-purpose components, etc. Generally, any function performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structure used to implement the same function can also be diverse, such as analog circuits, digital circuits, or special-purpose circuits. However, for this application, software program implementation is more often the preferred implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, mobile hard disk, ROM, RAM, magnetic disk, or optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, training equipment, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0190] In the above embodiments, the implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, in the form of a computer program product.

[0191] The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state drives (SSDs)).

Claims

1. A control method for guiding a robot to a task point, comprising: obtaining on-site images, the on-site images being used to construct a target 3D scene; obtaining a target image acquisition pose of a virtual robot for a target task point in the target 3D scene; obtaining, based on the on-site images, a target image corresponding to the target image acquisition pose; and obtaining correction information from the target image, the correction information being used to correct the image acquisition pose of a real robot when the real robot detects the target task point in a real scene.

2. The robot guidance task point control method according to claim 1 further includes: In response to receiving a command, the target image acquisition pose and the correction information are sent to the real robot. The real robot can then acquire a first image in the real scene based on the target image acquisition pose and correct the image acquisition pose according to the first image and the correction information.

3. The robot guidance task point control method according to claim 1, wherein the field image includes all images used to construct the target three-dimensional scene.

4. The robot guidance task point control method according to claim 1, wherein obtaining the scene image includes: Obtain the scene image corresponding to the target task point from the scene image set, wherein the scene image set is a set of images used to construct the target 3D scene.

5. The robot guidance task point control method according to claim 1, wherein obtaining the target image corresponding to the target image acquisition pose based on the on-site image includes: importing the on-site image into the target 3D scene based on the shooting pose of the on-site image to obtain an on-site image 3D scene; determining, based on the target image acquisition pose, the image acquisition range of the virtual robot in the target 3D scene; and rendering, based on the on-site image 3D scene, the image corresponding to the image acquisition range to obtain the target image.

6. The robot guidance task point control method according to claim 1, wherein obtaining the target image corresponding to the target image acquisition pose based on the on-site image includes: obtaining the shooting pose of the on-site image; establishing, based on the on-site image and the shooting pose, the correspondence between the target image acquisition pose and the shooting pose; determining, based on the correspondence, the cropping range corresponding to the target image acquisition pose in the on-site image; and cropping the target image from the on-site image based on the cropping range.

7. The robot guidance task point control method according to claim 1, wherein the target image acquisition pose includes: The position coordinates of the virtual robot in the target 3D scene and the orientation angle of the virtual robot gimbal.

8. A control device for guiding a robot to a task point, comprising: an on-site image acquisition module, used to acquire on-site images, which are images used to construct a target 3D scene; a pose acquisition module, used to acquire the target image acquisition pose of the virtual robot for the target task point in the target 3D scene; a target image acquisition module, used to acquire a target image corresponding to the target image acquisition pose based on the on-site image; and a correction information acquisition module, used to obtain correction information based on the target image, the correction information being used to correct the image acquisition pose of the real robot when the real robot detects the target task point in a real scene.

9. A computer program product, characterized in that it includes computer-readable instructions that, when executed on an electronic device, cause the electronic device to implement the robot guidance task point control method as described in any one of claims 1 to 7.

10. An electronic device, characterized in that it includes at least one processor and a memory connected to the processor, wherein: the memory is used to store computer programs; and the processor is used to execute the computer programs to enable the electronic device to implement the robot guidance task point control method as described in any one of claims 1 to 7.