Obstacle recognition method, anti-collision control system and training data augmentation method

By employing training data augmentation methods and data fusion between visual and radar sensors, the accuracy problem of obstacle recognition in aerial work platforms was solved, enabling efficient obstacle recognition under shaking and swaying conditions, reducing data acquisition costs, and enhancing environmental perception capabilities.

CN116051817B · Active · Publication Date: 2026-06-23 · ZOOMLION INTELLIGENT ACCESS MASCH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: ZOOMLION INTELLIGENT ACCESS MASCH CO LTD
Filing Date: 2022-12-30
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

During operation, aerial work platforms suffer from blurred image data due to the large deflection of the slender boom and the installation gaps in the boom system. Furthermore, the scale of obstacles changes, making it impossible for existing visual models to accurately identify them, thus posing a collision risk.

Method used

Disturbance images are generated by training data augmentation to simulate the shaking scenario of an aerial work platform. By combining data fusion from visual and radar sensors, a target recognition model is used to identify obstacles and perform data compensation processing.

Benefits of technology

It improves the accuracy of obstacle recognition when the aerial work platform is shaking and swaying, reduces data collection costs, and enhances environmental perception and safety.

✦ Generated by Eureka AI based on patent content.


Abstract

Embodiments of the present application provide a training data augmentation method for a target recognition model, an obstacle recognition method, a processor, an anti-collision control system, an aerial work platform, and a machine-readable storage medium. The training data augmentation method comprises: obtaining an original image dataset, the original image dataset comprising a plurality of original images; extracting a target obstacle image from the original images; adjusting parameters of the target obstacle image to generate a perturbed image of the target obstacle; adding the perturbed image to the original images to generate an augmented image; and adding the augmented image to the original image dataset to generate an augmented image dataset. Through the above technical solution, the recognition performance of the target recognition model when the work platform is shaking or swaying can be improved. More importantly, image data does not need to be collected at height or while the work platform is shaking (swaying), so data collection is inexpensive and convenient (images only need to be collected on the ground).


Description

Technical Field

[0001] This application relates to the field of safety control of aerial work platforms, specifically to a training data augmentation method for a target recognition model, an obstacle recognition method, a processor, a collision avoidance control system, an aerial work platform, and a machine-readable storage medium.

Background Technology

[0002] Aerial work platforms are mobile work platforms used for various industries to perform high-altitude operations, such as equipment installation and maintenance. Traditional aerial work platform products include scissor lifts, vehicle-mounted aerial work platforms, articulated boom lifts, self-propelled aerial work platforms, aluminum alloy aerial work platforms, and telescopic boom lifts.

[0003] Currently, during the operation of aerial work platforms, improper operation or blind spots may lead to collisions between the work platform and the external environment. Such a collision can cause huge economic losses and even casualties. Adding an anti-collision control system to the aerial work platform can prevent collisions through early warning and movement restriction. Some existing anti-collision control systems use a combination of cameras and radar to identify obstacles: by calculating the feature clustering matrix of environmental image data and matching it with the feature clustering matrices of obstacles in an environmental sample database, the category of obstacles in the environmental image data can be determined. However, aerial work scenarios pose two problems: (1) because of the large deflection of the slender boom and the installation gaps in the boom system, the work platform shakes during operation, which often blurs the image data and makes it difficult for the visual model to identify obstacles accurately; (2) as the aerial work platform is raised, its height changes continuously, so the scale of obstacles in the acquired images also changes. Earlier deep learning-based visual models often imposed constraints on the input size during training. Because of these characteristics of aerial work platforms, traditional pre-trained visual models cannot be used directly.

Summary of the Invention

[0004] The purpose of this application is to provide a training data augmentation method for a target recognition model, an obstacle recognition method, a processor, a collision avoidance control system, an aerial work platform, and a machine-readable storage medium.

[0005] To achieve the above objectives, the first aspect of this application provides a training data augmentation method for a target recognition model, wherein the target recognition model is applied to a collision avoidance control system of an aerial work platform, and the training data augmentation method includes:

[0006] Obtain the raw image dataset, which includes multiple raw images;

[0007] Extract the target obstacle image from the original image;

[0008] Adjust the parameters of the target obstacle image to generate a perturbation image of the target obstacle;

[0009] Add the perturbation image to the original image to generate the augmented image;

[0010] Add augmented images to the original image dataset to generate an augmented image dataset.

[0011] In this embodiment of the application, adjusting the parameters of the target obstacle image to generate a perturbation image of the target obstacle includes at least one of the following:

[0012] The target obstacle image is moved horizontally and / or vertically to generate at least one perturbation image of the target obstacle;

[0013] The target obstacle image is rotated to generate at least one perturbation image of the target obstacle;

[0014] The target obstacle image is scaled to generate at least one perturbation image of the target obstacle.

[0015] In this embodiment of the application, adjusting the parameters of the target obstacle image to generate a perturbation image of the target obstacle further includes:

[0016] Increase the transparency of the perturbation image.

[0017] A second aspect of this application provides an obstacle recognition method applied to an aerial work platform. The aerial work platform includes a vision sensor and a radar sensor. The obstacle recognition method includes:

[0018] Acquire environmental images of the area surrounding the aerial work platform from visual sensors;

[0019] Acquire point cloud data of the environment surrounding the aerial work platform collected by radar sensors;

[0020] The environmental image is input into the target recognition model to output obstacle recognition results;

[0021] Determine obstacle information based on point cloud data; and

[0022] The obstacle information is associated with the obstacle recognition results.

[0023] In this embodiment of the application, associating obstacle information with obstacle recognition results includes:

[0024] Determine the first timestamp of the current data frame in the point cloud data;

[0025] Determine the second timestamp in the image frame of the environmental image that is closest to the first timestamp, and the associated image frame corresponding to the second timestamp;

[0026] Determine the current obstacle information corresponding to the current data frame;

[0027] Determine the associated obstacle recognition results corresponding to the associated image frames;

[0028] Associate the current obstacle information with the associated obstacle recognition results.

[0029] In this embodiment of the application, obstacle information includes the obstacle's position information and speed information. Associating the current obstacle information with the associated obstacle identification results includes:

[0030] The current position information of the current obstacle is compensated using the current speed information, the first timestamp, and the second timestamp of the current obstacle information;

[0031] The compensated current obstacle information is associated with the associated obstacle identification results.

[0032] In this embodiment of the application, compensating for the current position information of the current obstacle information using the current speed information, the first timestamp, and the second timestamp of the current obstacle information includes compensating for the current position information of the current obstacle information using the following formula:

[0033] $x_{t_{\text{image}}} = x_{t_{\text{radar}}} + (t_{\text{image}} - t_{\text{radar}}) \times v_x$

[0034] $y_{t_{\text{image}}} = y_{t_{\text{radar}}} + (t_{\text{image}} - t_{\text{radar}}) \times v_y$

[0035] Where $x_{t_{\text{image}}}$ and $y_{t_{\text{image}}}$ are the compensated x and y coordinates of the obstacle, respectively; $x_{t_{\text{radar}}}$ and $y_{t_{\text{radar}}}$ are the x and y coordinates of the obstacle in the current position information, respectively; $v_x$ and $v_y$ are the speeds of the obstacle along the horizontal and vertical axes in the current speed information, respectively; and $t_{\text{radar}}$ and $t_{\text{image}}$ are the first timestamp and the second timestamp, respectively.

[0036] In this embodiment of the application, the obstacle recognition method further includes:

[0037] If an obstacle recognition result remains unassociated within a fusion cycle, only that obstacle recognition result is output;

[0038] If obstacle information remains unassociated within the fusion cycle, the obstacle recognition result is determined based on that obstacle information.

[0039] In this embodiment, the fusion cycle is the least common multiple of the sampling period of the visual sensor and the sampling period of the radar sensor.
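The least-common-multiple rule can be illustrated with a short Python sketch. The sampling periods used here (40 ms for the camera, 50 ms for the radar) are hypothetical values chosen only for illustration; the patent text does not state the actual sensor rates.

```python
import math

def fusion_cycle_ms(camera_period_ms: int, radar_period_ms: int) -> int:
    """Least common multiple of the two sampling periods, in milliseconds."""
    return camera_period_ms * radar_period_ms // math.gcd(camera_period_ms, radar_period_ms)

# Hypothetical periods: camera every 40 ms, radar every 50 ms.
cycle = fusion_cycle_ms(40, 50)
print(cycle)  # 200
```

With these assumed periods, both sensors produce a whole number of frames in each 200 ms cycle (five camera frames and four radar frames).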

[0040] In this embodiment of the application, the target recognition model is trained using the augmented image dataset obtained by the above-described training data augmentation method for the target recognition model.

[0041] A third aspect of this application provides a processor configured to perform the above-described training data augmentation method for a target recognition model.

[0042] A fourth aspect of this application provides a processor configured to perform the obstacle recognition method described above.

[0043] The fifth aspect of this application provides a collision avoidance control system for use on aerial work platforms. The collision avoidance control system includes:

[0044] A vision sensor is configured to acquire images of the environment surrounding the aerial work platform;

[0045] Radar sensors are configured to collect point cloud data of the environment surrounding the aerial work platform; and

[0046] The processor mentioned above.

[0047] The sixth aspect of this application provides an aerial work platform, comprising:

[0048] Work platform; and

[0049] The aforementioned collision avoidance control system.

[0050] A seventh aspect of this application provides a machine-readable storage medium storing instructions that, when executed by a processor, cause the processor to implement the aforementioned training data augmentation method for a target recognition model or the aforementioned obstacle recognition method.

[0051] The above technical solution can improve the target recognition model's performance when the work platform is shaking or swaying. More importantly, it eliminates the need to collect image data in high-altitude environments or under shaking conditions, resulting in low data acquisition costs and convenient data collection (only ground-based data collection is required).

[0052] Other features and advantages of the embodiments of this application will be described in detail in the following detailed description section.

Attached Figure Description

[0053] The accompanying drawings are provided to further illustrate the embodiments of this application and form part of the specification. They are used together with the following detailed description to explain the embodiments of this application, but do not constitute a limitation on the embodiments of this application. In the drawings:

[0054] Figures 1A and 1B schematically illustrate the arrangement of various sensors in a collision avoidance control system for an aerial work platform according to an embodiment of this application;

[0055] Figure 2 schematically shows a flowchart of a training data augmentation method for a target recognition model according to an embodiment of this application;

[0056] Figure 3 schematically illustrates the process of generating augmented images in the training data augmentation method shown in Figure 2;

[0057] Figure 4 schematically shows a flowchart of an obstacle recognition method according to an embodiment of this application;

[0058] Figure 5 schematically illustrates the coordinate calibration of the radar sensor and the vision sensor.

Detailed Implementation

[0059] The specific embodiments of this application will be described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for illustration and explanation only and are not intended to limit the embodiments of this application.

[0060] It should be noted that if the embodiments of this application involve directional indicators (such as up, down, left, right, front, back, etc.), the directional indicators are only used to explain the relative positional relationship and movement of each component in a certain specific posture (as shown in the figure). If the specific posture changes, the directional indicators will also change accordingly.

[0061] If the embodiments of this application involve descriptions such as "first" or "second," these descriptions are for descriptive purposes only and should not be construed as indicating or implying their relative importance or implicitly specifying the number of technical features indicated. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include at least one of those features. Furthermore, the technical solutions of various embodiments can be combined with each other, but this must be based on the ability of those skilled in the art to implement them. When the combination of technical solutions is contradictory or impossible to implement, it should be considered that such a combination of technical solutions does not exist and is not within the scope of protection claimed in this application.

[0062] In the embodiments of this application, the terms "left side," "right side," and "rear side" of the work platform are defined relative to the "front side" of the work platform, which refers to the side that the worker usually faces when standing on the work platform.

[0063] Figures 1A and 1B schematically show the arrangement of various sensors in a collision avoidance control system for an aerial work platform according to an embodiment of this application. Referring to Figures 1A and 1B, this embodiment provides a collision avoidance control system for an aerial work platform. The aerial work platform may include a work platform and other components. Depending on the type of aerial work platform, these other components may include, but are not limited to, the main body (e.g., a traveling mechanism), boom, lifting mechanism, etc. The collision avoidance control system may include:

[0064] A vision sensor is configured to acquire images of the environment surrounding the aerial work platform;

[0065] Radar sensors are configured to collect point cloud data of the environment surrounding the aerial work platform; and

[0066] processor.

[0067] Specifically, the vision sensor may include at least one camera, such as a CCD camera. Examples of radar sensors may include, but are not limited to, ultrasonic sensors, millimeter-wave sensors, and laser sensors. Preferably, the radar sensor may be a millimeter-wave sensor. In one example, the vision sensor and the radar sensor may be combined into a data acquisition module. In one example, the data acquisition module may include a left data acquisition module 301, a rear data acquisition module 302, and a right data acquisition module 303 located on the left, rear, and right sides of the work platform, respectively. For example, these data acquisition modules may be disposed on the sides of the bottom of the work platform.

[0068] Examples of processors may include, but are not limited to, microcontrollers, microprocessors, field programmable gate arrays (FPGAs), programmable logic controllers (PLCs), digital signal processors (DSPs), application-specific integrated circuits (ASICs), state machines, etc.

[0069] The processor can be configured to acquire detection signals from radar sensors and environmental images captured by cameras, determine the presence of target obstacles based on the detection signals and environmental images, and execute appropriate collision avoidance measures if the presence of a target obstacle is confirmed. In one example, the collision avoidance measures may include warnings or alarms, such as audible or visual alarms. The collision avoidance control system may also include an alarm device: when the processor determines that an alarm is needed, it can send a command instructing the alarm device to issue an alarm, for example by sound or light. In another example, the processor can control the movement of the work platform. For example, the aerial work platform may include drive mechanisms for driving the movement of the work platform, such as a boom slewing mechanism, a boom luffing mechanism, a platform lifting mechanism, etc. Based on the obstacle information (e.g., position information, speed information, etc.) of the detected target obstacle, the processor can either control the drive mechanism directly to perform corresponding actions that avoid a collision between the work platform and the target obstacle, or send a command to the controller of the drive mechanism, which then controls the drive mechanism to perform the corresponding actions upon receiving the command.

[0070] In this embodiment, a target recognition model can be used to identify obstacles based on environmental images captured by a camera. Specifically, the target recognition model can be a deep learning model, examples of which include, but are not limited to, one-stage detection models, such as models from the YOLO series (e.g., YOLOv5) and SSD models, and two-stage detection models, such as models from the R-CNN series. The captured environmental images are input into the target recognition model, which can output obstacle recognition results, such as obstacle category and obstacle location.

[0071] The accuracy of a target recognition model depends to some extent on how well the model is trained, which in turn depends on the training samples. In high-altitude operation scenarios: (1) the large deflection of the slender boom and the gaps in the boom system cause the work platform to shake during operation, which often blurs the image data and makes it difficult for the visual model to recognize the target accurately; (2) as the aerial work platform is raised, its height changes continuously, so the scale of obstacles in the acquired images also changes. Earlier deep learning-based visual models often imposed constraints on the input size during training, and visual models trained in the traditional way may not be applicable to high-altitude operation scenarios. In view of this, this application provides a training data augmentation method for target recognition models. Figure 2 schematically shows a flowchart of a training data augmentation method for a target recognition model according to an embodiment of this application. Figure 3 schematically illustrates the process of generating augmented images in the training data augmentation method shown in Figure 2. Referring to Figures 2 and 3, in the embodiments of this application, the training data augmentation method may include the following steps.

[0072] In step S210, the original image dataset is obtained, which includes multiple original images.

[0073] Specifically, the original image dataset can be a collection of multiple images pre-captured for the scene to be monitored. Images can be captured without any camera shake, ensuring the images in the original dataset are clear and captured under normal conditions. Some of the images may include the environmental background and the target obstacle (foreground). These images can be annotated (e.g., using image annotation tools to mark the target obstacle).

[0074] In step S220, the target obstacle image is extracted from the original image.

[0075] Specifically, for any image in the original image dataset, the image of the target obstacle can be extracted and used as the foreground image.

[0076] In step S230, the parameters of the target obstacle image are adjusted to generate a perturbation image of the target obstacle.

[0077] Specifically, to simulate images captured while the camera shakes under different working conditions of the aerial work platform, some parameters of the extracted target obstacle image can be adjusted to generate a perturbed image. More specifically, adjusting the parameters of the target obstacle image to generate a perturbed image of the target obstacle includes at least one of the following:

[0078] The target obstacle image is moved horizontally and / or vertically to generate at least one perturbation image of the target obstacle. Specifically, by translating a foreground image patch (i.e., the extracted obstacle image) along the horizontal / vertical coordinates of the plane containing the background image, the pitch angle change caused by the high-frequency jitter of the aerial work platform load can be simulated. This can also be used to simulate the relative movement of the obstacle along a plane parallel to the camera's focal plane in a real-world scene. In one example, multiple perturbation images can be generated in this way (i.e., multiple images are obtained through multiple translations).

[0079] The target obstacle image is rotated to generate at least one perturbation image of the target obstacle. Specifically, by rotating a foreground image block about a point on the background image as an axis, the roll angle change of an aerial work platform caused by uneven ground during walking can be simulated. It can also be used to simulate the rotation of an obstacle along a plane parallel to the camera's focal plane in a real scene. In one example, multiple perturbation images can be generated in this way (i.e., multiple images are obtained by multiple rotations).

[0080] The target obstacle image is scaled to generate at least one perturbation image of the target obstacle. Specifically, by scaling the foreground image patch along the horizontal or vertical coordinate, the low-frequency swaying caused by deformation of the main boom of the aerial work platform or assembly gaps is simulated, allowing the target recognition model to adapt to different scales of the obstacle. It can also simulate the rotation of the obstacle along an axis perpendicular or parallel to the plane of the camera in a real scene.
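The three perturbations above (translation, rotation, scaling) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: images are treated as 2D lists of grey values, resampling is nearest-neighbour, and rotation is performed about the patch centre rather than about an arbitrary point on the background image. The function names are hypothetical.

```python
import math

def translate_box(box, dx, dy):
    """Shift a bounding box (x, y, w, h) by dx, dy pixels."""
    x, y, w, h = box
    return (x + dx, y + dy, w, h)

def scale_patch(patch, sx, sy):
    """Nearest-neighbour resize of a 2D patch by horizontal/vertical factors sx, sy."""
    h, w = len(patch), len(patch[0])
    new_h, new_w = max(1, round(h * sy)), max(1, round(w * sx))
    return [[patch[min(h - 1, int(r / sy))][min(w - 1, int(c / sx))]
             for c in range(new_w)] for r in range(new_h)]

def rotate_patch(patch, degrees, fill=None):
    """Rotate a patch about its centre; pixels with no source value get `fill`."""
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            src_c = cos_a * (c - cx) + sin_a * (r - cy) + cx
            src_r = -sin_a * (c - cx) + cos_a * (r - cy) + cy
            sr, sc = round(src_r), round(src_c)
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = patch[sr][sc]
    return out
```

Calling each function several times with different offsets, angles, or factors yields the multiple perturbation images mentioned in the text. A real pipeline would more likely use an image library with interpolated affine warps.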

[0081] In step S240, the perturbation image is added to the original image to generate an augmented image. Specifically, after obtaining the perturbation image, the perturbation image is combined with the original image to obtain an augmented image, which includes the original target obstacle image and the newly added perturbation image.

[0082] In step S250, the augmented image is added to the original image dataset to generate an augmented image dataset. Specifically, after obtaining the augmented image, the category and border information of obstacles can be labeled on the augmented image, and then the augmented image and the corresponding labels are added to the original image dataset to update the dataset.

[0083] After obtaining the augmented image dataset, the target recognition model can be trained using it. The specific training steps for the target recognition model are known to those skilled in the art and will not be elaborated here.

[0084] In a preferred embodiment of this application, the transparency of the perturbation image can also be adjusted to further simulate a jittery scene. For example, the opacity attribute of the perturbation image can be adjusted to increase its transparency. Thus, the transparency of the perturbation image is greater than the transparency of the target obstacle image. For example, the opacity of the perturbation image can be set to 70%.
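Steps S220 to S250, together with the opacity adjustment, can be sketched as follows. This is a simplified illustration that assumes a single-channel image stored as a 2D list, a rectangular obstacle box `(x, y, w, h)`, and a translation-only perturbation; the function names and the label format are hypothetical. The 0.7 default mirrors the 70% opacity example in the text.

```python
def paste_with_opacity(background, patch, top, left, opacity=0.7):
    """Alpha-blend `patch` onto a copy of `background` (2D lists of grey values)."""
    out = [row[:] for row in background]
    for r, patch_row in enumerate(patch):
        for c, value in enumerate(patch_row):
            rr, cc = top + r, left + c
            if value is None or not (0 <= rr < len(out) and 0 <= cc < len(out[0])):
                continue  # skip empty pixels and anything outside the frame
            out[rr][cc] = opacity * value + (1 - opacity) * out[rr][cc]
    return out

def augment(image, label, box, dx, dy, opacity=0.7):
    """Build one augmented sample: the original obstacle plus a shifted,
    semi-transparent copy, with labels for both."""
    x, y, w, h = box
    patch = [row[x:x + w] for row in image[y:y + h]]                       # S220: extract foreground
    augmented = paste_with_opacity(image, patch, y + dy, x + dx, opacity)  # S230 + S240
    labels = [(label, box), (label, (x + dx, y + dy, w, h))]               # S250: category and box
    return augmented, labels

# Toy 4x4 image with a single bright obstacle pixel in the top-left corner.
image = [[0.0] * 4 for _ in range(4)]
image[0][0] = 100.0
augmented, labels = augment(image, "pole", (0, 0, 1, 1), dx=2, dy=1)
```

The original obstacle stays fully opaque while the pasted copy is blended at 70% opacity, so the copy is more transparent than the original, as the paragraph above describes.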

[0085] In a preferred embodiment of this application, a first-update mode can be used to train the target recognition model. For example, new image samples are collected as new data, and the target recognition model is trained iteratively to achieve incremental updates of the target recognition model, and the updated model can replace the previous model.

[0086] Training a target recognition model using the scheme of this application can improve the recognition performance of the target recognition model when the work platform is shaking or swaying. More importantly, it eliminates the need to collect image data in high-altitude environments or while the platform is shaking (swaying), so data acquisition is inexpensive and convenient (data only needs to be collected on the ground).

[0087] Figure 4 schematically shows a flowchart of an obstacle recognition method according to an embodiment of this application. This method can be applied to the collision avoidance control system of any of the above embodiments. As shown in Figure 4, in the embodiments of this application, the obstacle recognition method may include the following steps.

[0088] In step S410, an environmental image of the area surrounding the aerial work platform is acquired by a vision sensor.

[0089] Specifically, a vision sensor (e.g., a camera) can acquire environmental images around the aerial work platform (e.g., around the work platform) at a given sampling period. The acquired environmental images are a series of environmental image frames, each with a corresponding timestamp.

[0090] In step S420, point cloud data of the environment surrounding the aerial work platform collected by the radar sensor is acquired.

[0091] Specifically, radar sensors (such as millimeter-wave radar sensors) can acquire point cloud data of the environment surrounding the aerial work platform (e.g., around the work platform) at a given sampling period. The acquired point cloud data can be a series of point clouds, and the point cloud data acquired at each sampling time can be called a data frame.

[0092] In step S430, the environmental image is input into the target recognition model to output the obstacle recognition result.

[0093] Specifically, after receiving an input environmental image, the target recognition model can identify obstacles in the image and output the recognition results. The model can use bounding boxes to enclose identified obstacles in the image and display their type, distance, etc. Therefore, in one example, the recognition results may include obstacle type and location (e.g., relative distance to the work platform). If the input to the target recognition model is a series of image frames, the output will be a dynamic recognition result.

[0094] In this embodiment, the target recognition model can be trained using the augmented image dataset obtained by the training data augmentation method of any of the above embodiments.

[0095] In step S440, obstacle information is determined based on point cloud data.

[0096] Specifically, for example, the processor can identify obstacles and determine obstacle information based on point cloud data collected by radar sensors. Obstacle information may include the obstacle's position information (e.g., distance relative to the work platform) and velocity information (e.g., velocity relative to the work platform).

[0097] In addition, the collected point cloud data can be preprocessed. For example, invalid information can be filtered out to reduce redundant calculations, and targets outside the effective detection range around the working platform can be removed.
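A minimal sketch of such preprocessing is shown below. What counts as "invalid information" is not specified in the text, so this example assumes it means non-finite coordinates; the point format `(x, y)` in metres relative to the platform and the `max_range_m` parameter are likewise assumptions made for illustration.

```python
import math

def preprocess_points(points, max_range_m):
    """Drop invalid radar points and points outside the effective detection range."""
    kept = []
    for x, y in points:
        if not (math.isfinite(x) and math.isfinite(y)):
            continue  # assumed definition of an invalid return
        if math.hypot(x, y) > max_range_m:
            continue  # beyond the effective detection range
        kept.append((x, y))
    return kept

raw = [(1.0, 2.0), (float("nan"), 0.0), (30.0, 40.0), (3.0, 4.0)]
filtered = preprocess_points(raw, max_range_m=10.0)
```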

[0098] In step S450, the obstacle information is associated with the obstacle recognition result.

[0099] Specifically, image data from the visual sensor and point cloud data from the radar sensor can be aligned in time. This embodiment uses frame-synchronized data sequence matching to align image data and radar data in the time dimension, with interpolation and extrapolation used for time-series registration of the point cloud and image data acquired by the radar and camera. The overall idea is to compare the timestamps of frames from the two sensors: taking the low-frame-rate millimeter-wave radar sensor as the reference, find the frame from the high-frame-rate camera sensor with the smallest time difference from each radar frame. Then, using the motion information in the millimeter-wave radar frame together with that time difference, calculate the position of each radar target at the time of the matched camera frame. This is equivalent to creating a virtual radar frame at the camera frame's timestamp, which completes frame synchronization between the sensor data and ensures the temporal consistency of the fused data from the two sensors.

[0100] More specifically, the steps of associating obstacle information with obstacle recognition results may include:

[0101] Determine the first timestamp of the current data frame in the point cloud data;

[0102] Determine the second timestamp in the image frame of the environmental image that is closest to the first timestamp, and the associated image frame corresponding to the second timestamp;

[0103] Determine the current obstacle information corresponding to the current data frame;

[0104] Determine the associated obstacle recognition results corresponding to the associated image frames;

[0105] Associate the current obstacle information with the associated obstacle recognition results.
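The timestamp matching in these steps can be sketched with a binary search over the image timestamps. This is an illustrative sketch assuming the image timestamps are sorted in ascending order and expressed in the same units as the radar timestamp; the function name is hypothetical.

```python
from bisect import bisect_left

def nearest_image_frame(image_timestamps, t_radar):
    """Return the index of the image frame whose timestamp is closest to t_radar.

    `image_timestamps` must be non-empty and sorted in ascending order.
    """
    pos = bisect_left(image_timestamps, t_radar)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(image_timestamps)]
    return min(candidates, key=lambda i: abs(image_timestamps[i] - t_radar))

# Hypothetical timestamps in milliseconds.
image_ts = [0, 40, 80, 120]
idx = nearest_image_frame(image_ts, 50)
```

The returned index identifies the associated image frame, and `image_ts[idx]` is the second timestamp used in the position compensation.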

[0106] The obstacle information includes the obstacle's location and speed information. Associating the current obstacle information with the associated obstacle identification results includes:

[0107] The current position information of the current obstacle is compensated using the current speed information, the first timestamp, and the second timestamp of the current obstacle information;

[0108] The compensated current obstacle information is associated with the associated obstacle identification results.

[0109] The compensation for the current position information of the current obstacle information using the current speed information, the first timestamp, and the second timestamp of the current obstacle information includes using the following formula to compensate for the current position information of the current obstacle information:

[0110] $x_{t\_image} = x_{t\_radar} + (t\_image - t\_radar) \times v_x$

[0111] $y_{t\_image} = y_{t\_radar} + (t\_image - t\_radar) \times v_y$

[0112] where $x_{t\_image}$ and $y_{t\_image}$ are the compensated x and y coordinates of the obstacle; $x_{t\_radar}$ and $y_{t\_radar}$ are the x and y coordinates of the obstacle in the current position information; $v_x$ and $v_y$ are the speeds of the obstacle along the horizontal and vertical axes in the current speed information; and $t\_radar$ and $t\_image$ are the first and second timestamps, respectively.
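
The compensation formulas of [0110] and [0111] amount to a constant-velocity extrapolation over the time difference between the two frames. A minimal sketch follows; the function name and the sample values are hypothetical.

```python
def compensate_position(x_radar, y_radar, v_x, v_y, t_radar, t_image):
    """Shift a radar position to the camera timestamp, assuming constant velocity."""
    dt = t_image - t_radar  # negative when the image frame precedes the radar frame
    return x_radar + dt * v_x, y_radar + dt * v_y


# Hypothetical target: at (10.0, 2.0) m, moving at (2.0, -1.0) m/s,
# radar frame at t = 1.0 s, matched image frame at t = 1.5 s.
x_img, y_img = compensate_position(10.0, 2.0, 2.0, -1.0, t_radar=1.0, t_image=1.5)
print(x_img, y_img)  # 11.0 1.5
```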

[0113] After the data from the visual sensor and the radar sensor are synchronized as described above, the obstacle recognition results obtained from the visual sensor can be associated with the obstacle information obtained from the radar sensor. Based on the projection of the radar targets into the image coordinate system and the obstacle recognition results output by the target recognition model, a global nearest neighbor (GNN) data association algorithm can be used for the association. The associated obstacle information and obstacle recognition results can then be displayed together in the image frame.

[0114] In one example, the global nearest neighbor algorithm may include the following steps:

[0115] Calculate the cost matrix for assigning the set of all measurements at a given moment, $\{m_j\}$ ($j = 1, 2, 3, 4, \ldots$), to the prediction set $\{p_i\}$ ($i = 1, 2, 3, 4, \ldots$). This patent uses the Mahalanobis distance for this calculation.

[0116]

[0117] $e_{ij}(k) = m_j(k) - p_i(k)$

[0118] where $e_{ij}(k)$ is the residual vector between element $m_j$ of the measurement set ($j = 1, 2, 3, 4, \ldots$) and element $p_i$ of the prediction set ($i = 1, 2, 3, 4, \ldots$), and $S(k)$ is the covariance of the residual vector.

[0119] Set the association threshold value G and the maximum distance value.

[0120] Calculate the risk value of each measurement set element when it is assigned to each prediction set element, and select the assignment corresponding to the minimum risk.

[0121]

[0122] where $c_{ij}$ is an element of the cost matrix: when the association threshold condition is met, $c_{ij}$ takes the calculated distance value; otherwise, it takes the maximum distance value. $\varepsilon_{ij}$ is 1 when an element of the measurement set is assigned to a track, and 0 otherwise.
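
The global nearest neighbor steps can be sketched as below. Several points are assumptions, since the formulas of [0116] and [0121] are not reproduced in this text: the distance is taken to be the usual squared Mahalanobis distance $e^T S^{-1} e$, the gate is taken to be "distance not greater than G", a single 2x2 covariance `S` is shared by all pairs, and the minimum-cost assignment is found by brute force rather than by an optimized solver. All names are hypothetical.

```python
from itertools import permutations


def mahalanobis_sq(m, p, S):
    """Squared Mahalanobis distance e^T S^-1 e for 2-D vectors, with e = m - p."""
    ex, ey = m[0] - p[0], m[1] - p[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (ex * (d * ex - b * ey) + ey * (-c * ex + a * ey)) / det


def gnn_associate(measurements, predictions, S, gate, max_cost):
    """Return {prediction_index: measurement_index} with minimum total cost."""
    n_p, n_m = len(predictions), len(measurements)
    # Cost matrix: gated distance, or max_cost when the pair fails the gate.
    cost = [[max_cost] * n_m for _ in range(n_p)]
    for i, p in enumerate(predictions):
        for j, m in enumerate(measurements):
            d2 = mahalanobis_sq(m, p, S)
            if d2 <= gate:
                cost[i][j] = d2
    # Enumerate one-to-one assignments of the smaller set into the larger one.
    pred_is_small = n_p <= n_m
    k, n = (n_p, n_m) if pred_is_small else (n_m, n_p)
    best_total, best_pairs = float("inf"), []
    for perm in permutations(range(n), k):
        pairs = [(s, l) if pred_is_small else (l, s) for s, l in enumerate(perm)]
        total = sum(cost[i][j] for i, j in pairs)
        if total < best_total:
            best_total, best_pairs = total, pairs
    # Pairs that only matched at max_cost are treated as unassociated.
    return {i: j for i, j in best_pairs if cost[i][j] < max_cost}


predictions = [(0.0, 0.0), (10.0, 10.0)]
measurements = [(9.5, 10.2), (0.3, -0.1), (50.0, 50.0)]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(gnn_associate(measurements, predictions, identity, gate=9.0, max_cost=1e6))
```

Brute-force enumeration is only practical for a handful of targets; for larger sets an assignment solver such as the Hungarian algorithm would normally be substituted.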

[0123] In this embodiment of the application, if there are unassociated obstacle recognition results within a fusion cycle, only those obstacle recognition results are output.

[0124] If unassociated obstacle information exists within the fusion cycle, the output obstacle recognition result is determined based on this obstacle information. Specifically, such obstacle information may include the obstacle's position and velocity. The position information (e.g., coordinates) can be mapped to the image coordinate system of the visual sensor. Centered on the mapped point, the type and location of the target obstacle are determined from the obstacle recognition results based on the relative distance to the obstacle, and the obstacle's distance and velocity can be displayed below the target bounding box.
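
One possible reading of the fallback in [0124] is to pick the recognition result whose bounding-box center lies closest to the mapped radar point. The sketch below illustrates only that reading; the detection format (a dict with `label` and `box` keys, boxes given as `(x1, y1, x2, y2)` in pixels) and the function name are hypothetical.

```python
import math


def nearest_detection(mapped_point, detections):
    """Return the detection whose box center is closest to mapped_point, or None."""
    u, v = mapped_point

    def center_distance(det):
        x1, y1, x2, y2 = det["box"]
        return math.hypot((x1 + x2) / 2 - u, (y1 + y2) / 2 - v)

    return min(detections, key=center_distance, default=None)


detections = [
    {"label": "beam", "box": (100, 50, 200, 150)},
    {"label": "wall", "box": (400, 300, 500, 420)},
]
match = nearest_detection((410, 350), detections)
print(match["label"])  # wall
```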

[0125] In this embodiment, the fusion period can simultaneously include an integer number of sampling periods from both the visual sensor and the radar sensor. In one example, the fusion period is the least common multiple of the sampling periods of the visual sensor and the radar sensor.
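
As a small worked example of the least-common-multiple rule, the sketch below computes a fusion period from two sampling periods. The periods are expressed as integer milliseconds, and the values 40 ms and 50 ms are hypothetical, not taken from the patent.

```python
from math import gcd


def fusion_period_ms(camera_period_ms, radar_period_ms):
    """Least common multiple of two sampling periods given in whole milliseconds."""
    return camera_period_ms * radar_period_ms // gcd(camera_period_ms, radar_period_ms)


# Hypothetical sensors: camera every 40 ms (25 Hz), radar every 50 ms (20 Hz).
period = fusion_period_ms(40, 50)
print(period, period // 40, period // 50)  # 200 5 4
```

With these values each fusion period contains exactly five camera frames and four radar frames, so both sensors complete a whole number of sampling periods.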

[0126] In this embodiment of the application, the coordinate systems of the visual sensor and the radar sensor can be calibrated. Figure 5 schematically illustrates the coordinate calibration of the radar and vision sensors. Referring to Figure 5, and taking a millimeter-wave radar as the radar sensor and a monocular camera as the vision sensor as an example, aligning the spatial information of the millimeter-wave radar and the monocular camera requires accurate coordinate transformation relationships between the millimeter-wave radar coordinate system, the three-dimensional world coordinate system, the camera coordinate system, the image coordinate system, and the pixel coordinate system.

[0127] The camera coordinate system is defined as $O_c$-$X_cY_cZ_c$, the millimeter-wave radar coordinate system as $O_r$-$X_rY_rZ_r$, the camera's image coordinate system as $O_0$-$xy$, and the image's pixel coordinate system as $O_1$-$uv$. For any point P in space, its coordinates in the camera coordinate system are $(X_c, Y_c, Z_c)$, its coordinates in the radar coordinate system are $(X_r, Y_r, Z_r)$, its coordinates in the image coordinate system are $(x, y)$, and its pixel coordinates are $(u, v)$. Based on the imaging process and principles of a camera, the transformation relationships between the camera, image, and pixel coordinate systems are as follows:

[0128] Transformation relationship between pixel coordinate system and image coordinate system:

[0129]

[0130] Transformation relationship between image coordinate system and camera coordinate system:

[0131]

[0132] Therefore, the transformation relationship between the pixel coordinate system and the camera coordinate system is as follows:

[0133]

[0134] In the formula, $Z_c$ represents the actual depth from the spatial point to the camera plane; $d_x$ and $d_y$ represent the unit length of a pixel along the x-axis and y-axis of the image coordinate system, respectively; $f_x$ and $f_y$ are the focal lengths of the camera along the x and y axes of the image coordinate system, respectively; and $u_0$, $v_0$ represent the offsets of the camera's principal point. M is called the camera's intrinsic parameter matrix.
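
The formulas of [0129], [0131], and [0133] are not reproduced in this text. Under the standard pinhole camera model and the symbols defined in [0134], the three relations are usually written as below. The single physical focal length $f$ and the identities $f_x = f / d_x$, $f_y = f / d_y$ are assumptions of this sketch, not statements from the patent.

```latex
\begin{aligned}
u &= \frac{x}{d_x} + u_0, & v &= \frac{y}{d_y} + v_0 \\
x &= \frac{f X_c}{Z_c}, & y &= \frac{f Y_c}{Z_c} \\
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  &= \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
     \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
   = M \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}
\end{aligned}
```

The first row converts image coordinates to pixel coordinates, the second row is the perspective projection from the camera frame onto the image plane, and the third row combines the two into the intrinsic matrix M.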

[0135] Furthermore, the camera coordinate system and the millimeter-wave radar coordinate system in the world coordinate system satisfy the following:

[0136]

[0137] R represents rotation, and T represents translation; these are the camera's extrinsic parameters.

[0138] Therefore, for any point P in space, the following transformation relationship exists between its corresponding pixel coordinates and radar coordinates:

[0139]

[0140] M is the camera intrinsic parameter matrix, obtained using Zhang Zhengyou's calibration method; R and T represent the rotation matrix and translation vector between the camera coordinate system and the radar coordinate system, respectively, which can be obtained through pre-calibration and direct measurement.
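
The chain from radar coordinates to pixel coordinates can be sketched as follows. The formulas of [0136] and [0139] are not reproduced in this text, so the sketch assumes the common convention that a camera-frame point equals R times the radar-frame point plus T, followed by the pinhole projection with M. The function name and all numeric values are hypothetical, and the identity rotation in the example assumes the radar axes are already aligned with the camera axes.

```python
def radar_to_pixel(point_radar, R, T, M):
    """Project a 3-D radar-frame point to pixel coordinates (u, v)."""
    # Extrinsics (assumed convention): camera point = R * radar point + T.
    cam = [sum(R[r][k] * point_radar[k] for k in range(3)) + T[r] for r in range(3)]
    X_c, Y_c, Z_c = cam
    if Z_c <= 0:
        raise ValueError("point is not in front of the camera")
    # Intrinsics: Z_c * [u, v, 1]^T = M * [X_c, Y_c, Z_c]^T.
    u = (M[0][0] * X_c + M[0][1] * Y_c + M[0][2] * Z_c) / Z_c
    v = (M[1][0] * X_c + M[1][1] * Y_c + M[1][2] * Z_c) / Z_c
    return u, v


# Hypothetical calibration: aligned axes, radar mounted 0.5 m above the camera.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
T = [0.0, 0.5, 0.0]
M = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
print(radar_to_pixel((1.0, 0.0, 10.0), R, T, M))  # (400.0, 280.0)
```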

[0141] The calibration steps are as follows:

[0142] 1) Prepare a chessboard grid based on Zhang Zhengyou's calibration method. The size of the chessboard grid is known. Use a camera to take pictures of it from different positions, angles, and postures to obtain a set of images.

[0143] 2) Detect feature points in the image, such as the corner points of the calibration board, and obtain the pixel coordinate values of the corner points of the calibration board. Calculate the physical coordinate values of the corner points of the calibration board based on the known size of the chessboard and the origin of the world coordinate system.

[0144] 3) Solve for the intrinsic and extrinsic parameter matrices; based on the relationship between physical coordinates and pixel coordinates, ignoring distortion, each calibration board can be used to solve for the homography matrix between world coordinates and pixel coordinates, which is then used to solve for the camera intrinsic parameter matrix; ignoring tangential distortion and intrinsic parameter matrix errors, solve for the radial distortion parameters.

[0145] 4) Optimize the above parameters using the LM (Levenberg-Marquardt) algorithm.
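
Part of step 2), computing the physical coordinates of the chessboard corners from the known square size, can be sketched as below. The sketch assumes the world origin is placed at the first inner corner and that the board lies in the plane Z = 0, which is the usual convention in Zhang's method; the function name and board dimensions are hypothetical. Corner detection, homography estimation, and Levenberg-Marquardt refinement are not shown.

```python
def chessboard_corner_coordinates(rows, cols, square_size):
    """Return (X, Y, Z) world coordinates of the inner corners, row by row.

    The origin is the first corner and the board lies in the plane Z = 0.
    """
    return [
        (c * square_size, r * square_size, 0.0)
        for r in range(rows)
        for c in range(cols)
    ]


# Hypothetical board: 2 x 3 inner corners, 25 mm squares.
corners = chessboard_corner_coordinates(rows=2, cols=3, square_size=25.0)
print(corners[0], corners[-1])  # (0.0, 0.0, 0.0) (50.0, 25.0, 0.0)
```

These world coordinates are paired with the detected pixel coordinates of the same corners to solve for the homography in step 3).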

[0146] This application provides a processor configured to execute the training data augmentation method for a target recognition model according to any of the above embodiments.

[0147] This application provides a processor configured to execute the obstacle recognition method of any of the above embodiments.

[0148] This application provides a collision avoidance control system for use on aerial work platforms. The collision avoidance control system includes:

[0149] A vision sensor is configured to acquire images of the environment surrounding the aerial work platform;

[0150] Radar sensors are configured to collect point cloud data of the environment surrounding the aerial work platform; and

[0151] The processor is configured to execute the obstacle recognition method of any of the above embodiments.

[0152] This application provides an aerial work platform, including:

[0153] Work platform; and

[0154] The aforementioned collision avoidance control system.

[0155] The methods provided in any of the above embodiments can be executed or implemented in a single processor. Alternatively, depending on the hardware entity corresponding to each step, the methods of the above embodiments can be executed or implemented in different processing entities. For example, the process of processing point cloud data collected by a radar sensor to obtain obstacle information can be completed in a radar module, so the radar module may include hardware entities such as a processor and memory. The process of obtaining obstacle recognition results from images collected by a visual sensor (e.g., a camera) can be completed in a camera module, so the camera module may include hardware entities such as a processor and memory, and the processor may embed or call a target recognition model. The data association, fusion, and output of fusion results between obstacle information and obstacle recognition results can be executed or implemented in other processing entities (e.g., other processors and memories).

[0156] This application provides a machine-readable storage medium storing instructions that, when executed by a processor, cause the processor to implement any of the training data augmentation methods described above for a target recognition model.

[0157] This application provides a machine-readable storage medium storing instructions that, when executed by a processor, cause the processor to implement the obstacle recognition method of any of the above embodiments.

[0158] The solution provided in this application embodiment may have at least one of the following beneficial effects:

[0159] (1) It addresses the safety issue that arises when an aerial work platform cannot accurately identify obstacle locations during operation and therefore avoids obstacles too late. It improves the accuracy of the platform's autonomous obstacle location identification, enabling precise perception of the platform's surrounding environment.

[0160] (2) A data augmentation method is proposed to improve the recognition effect of the visual recognition model when the working platform is shaking or swaying. More importantly, it eliminates the need to collect image data in high-altitude environments or under shaking conditions, and the data collection cost is low and the collection is convenient (it only needs to be collected on the ground).

[0161] (3) The recognition model is trained based on the type of obstacle and the edge vertex information of the obstacle in the working scene of the aerial work platform. It can accurately identify the type of obstacle and the edge information in the working scene, and integrate millimeter-wave radar data and camera data to identify the relative position of the obstacle and the aerial work platform.

[0162] (4) The camera uses a deep learning-based multi-target recognition method for obstacle target detection. In the early stage, a large number of training sample datasets for image target obstacle recognition in different environmental scenarios are collected, and data augmentation methods are combined to expand the sample dataset, improve the accuracy of target recognition and reduce the impact of the surrounding environment of high-altitude operations on target recognition.

[0163] (5) Using multi-sensor data fusion technology can improve the environmental perception capability of high-altitude operation scenarios and improve the environmental adaptability of the environmental perception system under extreme weather conditions, thereby improving the reliability of the system.
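
The data augmentation mentioned in effect (2) can be illustrated with a toy example: a translated, semi-transparent copy of the obstacle is blended into the image to imitate blur from platform shaking. This is a simplified sketch on a grayscale image stored as nested lists; the function name, the mask representation, and the blending rule are assumptions, and only translation is shown (not rotation or scaling).

```python
def add_translated_ghost(image, mask, dx, dy, opacity=0.7):
    """Blend a translated copy of the masked obstacle onto a copy of image.

    image: 2-D list of gray values; mask: 2-D list of 0/1 marking obstacle pixels.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < h and 0 <= nx < w
            # The original obstacle stays fully opaque; only background is blended.
            if mask[y][x] and inside and not mask[ny][nx]:
                out[ny][nx] = opacity * image[y][x] + (1 - opacity) * image[ny][nx]
    return out


image = [[10, 200, 10, 10]]
mask = [[0, 1, 0, 0]]
print(add_translated_ghost(image, mask, dx=1, dy=0, opacity=0.5))
```

Because the shifted copy is less opaque than the original obstacle, the augmented image keeps a sharp obstacle with a fainter displaced copy beside it.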

[0164] Those skilled in the art will understand that embodiments of this application can be provided as methods, systems, or computer program products. Therefore, this application can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, this application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0165] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flowchart illustrations and / or one or more block diagrams.

[0166] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flowcharts and / or one or more block diagrams.

[0167] These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions, which execute on the computer or other programmable apparatus, provide steps for implementing the functions specified in one or more flowcharts and / or one or more block diagrams.

[0168] In a typical configuration, a computing device includes one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0169] Memory may include non-persistent memory in computer-readable media, such as random access memory (RAM) and / or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0170] Computer-readable media include both permanent and non-permanent, removable and non-removable media that can store information using any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0171] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.

[0172] The above are merely embodiments of this application and are not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of the claims of this application.

Claims

1. An obstacle recognition method, characterized in that the obstacle recognition method is applied to an aerial work platform that includes a visual sensor and a radar sensor, and includes: acquiring an environmental image of the area surrounding the aerial work platform collected by the visual sensor; acquiring point cloud data of the environment surrounding the aerial work platform collected by the radar sensor; inputting the environmental image into a target recognition model to output an obstacle recognition result; determining obstacle information based on the point cloud data; and associating the obstacle information with the obstacle recognition result; wherein the target recognition model is trained using an augmented image dataset obtained through the following steps: obtaining an original image dataset, which includes multiple original images, at least a portion of which include an environmental background and a target obstacle; extracting a target obstacle image from at least a portion of the original images as a foreground image; adjusting parameters of the target obstacle image to generate a perturbation image of the target obstacle; combining the perturbation image with the original image to generate an augmented image, the augmented image including the target obstacle image and the perturbation image; increasing the transparency of the perturbation image so that the transparency of the perturbation image is greater than the transparency of the target obstacle image; and adding the augmented image to the original image dataset to generate the augmented image dataset; wherein the step of adjusting the parameters of the target obstacle image to generate a perturbation image of the target obstacle includes at least one of the following: translating the target obstacle image along the horizontal / vertical coordinates of the plane containing the background image to generate at least one perturbation image of the target obstacle; rotating the target obstacle image about a point on the background image as an axis to generate at least one perturbation image of the target obstacle; and scaling the target obstacle image along the horizontal or vertical axis to generate at least one perturbation image of the target obstacle.

2. The obstacle recognition method according to claim 1, characterized in that the opacity of the perturbation image is set to 70%.

3. The obstacle recognition method according to claim 1, characterized in that the step of associating the obstacle information with the obstacle recognition result includes: determining the first timestamp of the current data frame in the point cloud data; determining the second timestamp, among the image frames of the environmental image, that is closest to the first timestamp, and the associated image frame corresponding to the second timestamp; determining the current obstacle information corresponding to the current data frame; determining the associated obstacle recognition result corresponding to the associated image frame; and associating the current obstacle information with the associated obstacle recognition result.

4. The obstacle recognition method according to claim 3, characterized in that the obstacle information includes position information and speed information of the obstacle, and associating the current obstacle information with the associated obstacle recognition result includes: compensating the current position information of the current obstacle information using the current speed information of the current obstacle information, the first timestamp, and the second timestamp; and associating the compensated current obstacle information with the associated obstacle recognition result.

5. The obstacle recognition method according to claim 4, characterized in that compensating the current position information of the current obstacle information using the current speed information, the first timestamp, and the second timestamp includes compensating the current position information using the following formulas: $x_{t\_image} = x_{t\_radar} + (t\_image - t\_radar) \times v_x$ and $y_{t\_image} = y_{t\_radar} + (t\_image - t\_radar) \times v_y$, where $x_{t\_image}$ and $y_{t\_image}$ are the compensated x-coordinate and y-coordinate of the obstacle, respectively; $x_{t\_radar}$ and $y_{t\_radar}$ are the x and y coordinates of the obstacle in the current position information, respectively; $v_x$ and $v_y$ are the speeds of the obstacle along the horizontal and vertical axes in the current speed information, respectively; and $t\_radar$ and $t\_image$ are the first timestamp and the second timestamp, respectively.

6. The obstacle recognition method according to claim 1, characterized in that it further includes: if there are unassociated obstacle recognition results within a fusion cycle, outputting only those obstacle recognition results; and if there is unassociated obstacle information within the fusion cycle, determining the output obstacle recognition result based on that obstacle information.

7. The obstacle recognition method according to claim 6, characterized in that the fusion cycle is the least common multiple of the sampling period of the visual sensor and the sampling period of the radar sensor.

8. A processor, characterized in that it is configured to perform the obstacle recognition method according to any one of claims 1 to 7.

9. A collision avoidance control system, characterized in that the collision avoidance control system is applied to an aerial work platform and includes: a visual sensor configured to acquire images of the environment surrounding the aerial work platform; a radar sensor configured to acquire point cloud data of the environment surrounding the aerial work platform; and the processor according to claim 8.

10. An aerial work platform, characterized in that it includes: a work platform; and the collision avoidance control system according to claim 9.

11. A machine-readable storage medium, characterized in that the machine-readable storage medium stores instructions that, when executed by a processor, cause the processor to implement the obstacle recognition method according to any one of claims 1 to 7.