Pallet pose recognition method, controller and forklift
By acquiring sensor images and point clouds on a forklift and fusing them, and then registering them with pallet type template point clouds, the problem of environmental interference in traditional pallet pose recognition is solved, achieving high-precision pallet pose recognition and improving handling efficiency.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SHANGHAI MUANT ROBOT TECH CO LTD
- Filing Date
- 2025-12-16
- Publication Date
- 2026-07-02
AI Technical Summary
Traditional pallet position recognition methods that rely on sensor-based image and other data acquisition are easily affected by environmental interference such as changes in lighting and occlusion, impacting recognition accuracy and failing to cope with complex and ever-changing operating environments.
The sensor image and point cloud are acquired in front of the forklift fork arm. The pallet image and point cloud are obtained through image detection and fused into a target point cloud. The pallet type template point cloud is then registered to determine the pallet pose information.
It improves the accuracy and success rate of pallet position recognition, reduces manual intervention, and enhances handling efficiency.
Description
Pallet pose recognition method, controller and forklift
[0001] This application claims priority to Chinese Patent Application No. 202411959252.7, filed on December 27, 2024, entitled "Pallet Pose Recognition Method, Controller and Forklift", the entire contents of which are incorporated herein by reference.
Technical Field
[0002] This application relates to the field of image recognition technology, for example to a pallet pose recognition method, a controller, and a forklift.
Background Technology
[0003] With the development of intelligent driving technology, unmanned forklifts have also been applied in the field of industrial automation. In modern warehousing and logistics, the use of unmanned forklifts has greatly improved the efficiency of cargo handling and reduced labor costs.
[0004] Automated forklifts typically move goods by picking up pallets. During the process, sensors mounted on the forklift collect images and other data to identify the pallet's position and orientation, thus controlling the forklift to complete the automated handling. Therefore, pallet position recognition is crucial to the efficiency of automated forklift handling. However, traditional methods of pallet position recognition using sensor-based image and data acquisition are easily affected by environmental interference such as changes in lighting and occlusion, which can severely impact the accuracy of pallet position recognition and make it unsuitable for complex and changing operating environments.
Summary of the Invention
[0005] The purpose of this application is to provide a pallet pose recognition method, a controller, and a forklift. When the forklift is in front of the pallet to be picked up, a sensor image and a first point cloud of the area in front of the forklift's forks are first acquired. Both the sensor image and the first point cloud contain the pallet. Detection is then performed on the sensor image to obtain an image detection result, and based on the image detection result, the pallet image corresponding to the pallet in the sensor image and the pallet point cloud corresponding to the pallet in the first point cloud are obtained. The pallet image and the pallet point cloud are then fused to obtain the target point cloud of the pallet. The target point cloud contains the features of both the pallet image and the pallet point cloud, so the pallet pose information determined from it is more accurate. A more accurate pallet pose improves the success rate of pallet picking, reduces manual intervention, and improves overall handling efficiency.
[0006] This application provides a pallet pose recognition method, comprising: when a forklift is located in front of a pallet to be picked up, acquiring a sensor image and a first point cloud of the area in front of the forklift's fork arm; based on the image detection result obtained by performing detection on the sensor image, acquiring a pallet image of the pallet in the sensor image and a pallet point cloud of the pallet in the first point cloud; fusing the pallet image and the pallet point cloud to obtain a target point cloud of the pallet; and determining the pose information of the pallet based on the target point cloud of the pallet.
[0007] This application also provides a controller configured to perform the above-described pallet pose recognition method.
[0008] This application also provides a forklift, including: a forklift body, a data acquisition device, and the aforementioned controller; the data acquisition device is mounted on the forklift body and is used to acquire sensor images and point clouds in front of the fork arms of the forklift and send them to the controller.
[0009] This application also provides a computer-readable storage medium, which is a non-volatile or non-transitory storage medium, on which a computer program is stored. When the computer program is run by a processor, it executes the steps of the pallet pose recognition method described above.
[0010] In one embodiment, the image detection result includes the pallet type of the pallet; determining the pose information of the pallet based on the target point cloud of the pallet includes:
[0011] Based on the target point cloud of the pallet, the preliminary pose information of the pallet is obtained;
[0012] Based on the preliminary pose information of the pallet, the target point cloud, and the pallet template point cloud corresponding to the pallet type, the final pose information of the pallet is obtained.
[0013] In one embodiment, obtaining the final pose information of the pallet based on the preliminary pose information of the pallet, the target point cloud, and the pallet template point cloud corresponding to the pallet type includes:
[0014] The pallet template point cloud and the target point cloud are registered to obtain the transformation information of the pallet template point cloud in the coordinate system of the target point cloud;
[0015] Based on the transformation information and the preliminary pose information, the final pose information of the pallet is obtained.
[0016] In one embodiment, obtaining the preliminary pose information of the pallet based on the target point cloud of the pallet includes:
[0017] Select, from the target point cloud, multiple feature points located within a defined feature area of the pallet;
[0018] Calculate the center point of the multiple feature points, and use it as the reference center point of the pallet;
[0019] Based on the coordinates of the multiple feature points, obtain the vertical vector of the target plane of the pallet, the target plane being the plane of the pallet that faces the forklift;
[0020] The reference heading angle of the target plane relative to the forklift is determined based on the vertical vector of the target plane;
[0021] The preliminary pose information includes the reference center point and the reference heading angle.
[0022] In one embodiment, obtaining the vertical vector of the target plane of the pallet facing the forklift based on the coordinates of the multiple feature points includes:
[0023] The multiple feature points are combined into a feature data matrix, and the coordinate values of each dimension contained in the feature data matrix are standardized.
[0024] Obtain the covariance matrix of the feature data matrix after standardization, and use the eigenvector corresponding to the largest eigenvalue among all eigenvalues of the covariance matrix as the vertical vector.
[0025] In one embodiment, the image detection result includes field-of-view information for the time at which the sensor image was acquired; the pallet point cloud in the first point cloud is obtained as follows:
[0026] The second point cloud is obtained by segmenting the first point cloud based on the field of view information;
[0027] Obtain the pallet point cloud from the second point cloud.
[0028] In one embodiment, fusing the pallet image with the pallet point cloud to obtain the target point cloud of the pallet includes:
[0029] Obtain the image weight of the sensor image and the point cloud weight of the first point cloud;
[0030] Based on the image weight and the point cloud weight, the coordinates of each point in the pallet image and the coordinates of each point in the pallet point cloud are fused to obtain the target point cloud of the pallet.
[0031] In one embodiment, obtaining the image weight of the sensor image and the point cloud weight of the first point cloud includes:
[0032] The quality of the sensor image and the quality of the first point cloud are evaluated separately to obtain the image quality parameters of the sensor image and the point cloud quality parameters of the first point cloud.
[0033] Based on the image quality parameters and the point cloud quality parameters, the image weight of the sensor image and the point cloud weight of the first point cloud are obtained.
[0034] In one embodiment, after obtaining the image weight of the sensor image and the point cloud weight of the first point cloud, the method further includes:
[0035] Based on the illumination intensity value of the scene where the forklift is currently located and / or the integrity assessment value of the first point cloud, the image weights and the point cloud weights are adjusted.
[0036] Fusing, based on the image weight and the point cloud weight, the coordinates of each point in the pallet image with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet then includes:
[0037] Based on the adjusted image weight and the adjusted point cloud weight, the coordinates of each point in the pallet image and the coordinates of each point in the pallet point cloud are fused to obtain the target point cloud of the pallet.
[0038] In one embodiment, the quality of the sensor image is assessed as follows:
[0039] Obtain the image quality assessment values of the sensor image, which include a sharpness assessment value, an exposure assessment value, and a motion blur assessment value;
[0040] Based on the image quality assessment values of the sensor image, the image quality parameters of the sensor image are obtained.
[0041] In one embodiment, the quality assessment method for the first point cloud is as follows:
[0042] Obtain the point cloud quality assessment value of the first point cloud, including the point cloud density assessment value, the point cloud integrity assessment value, and the noise level assessment value;
[0043] Based on the point cloud quality assessment value of the first point cloud, the point cloud quality parameters of the first point cloud are obtained.
Attached Figure Description
[0044] Figure 1 is a detailed flowchart of the pallet pose recognition method according to the first embodiment of this application;
[0045] Figure 2 is a schematic diagram of the sensor image after detection according to the first embodiment of this application;
[0046] Figure 3 is a flowchart of step 104 of the pallet pose recognition method in Figure 1;
[0047] Figure 4 is a schematic diagram of the registration of the pallet template point cloud and the target point cloud according to the first embodiment of this application;
[0048] Figure 5 is a flowchart of the pallet pose recognition method according to the second embodiment of this application.
Detailed Implementation
[0049] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments of this disclosure to aid understanding, and should be considered merely exemplary. For clarity and brevity, descriptions of well-known functions and structures, as well as functions and structures less relevant to the embodiments described below, are omitted in the following description.
[0050] Figure 1 is a flowchart of a pallet pose recognition method provided in Embodiment 1 of this application, which is applied to a forklift. The forklift is equipped with a controller, which can be a controller dedicated to pallet pose recognition or the main controller in the forklift.
[0051] The forklift body is equipped with data acquisition equipment, which is used to acquire image and point cloud data of the area in front of the forklift's forks and send it to the controller. The data acquisition equipment may include: a radar sensor (e.g., a lidar) and an image sensor (e.g., a camera). The radar sensor is mounted on the front of the forklift body and is used to acquire point cloud data of the area in front of the forklift's forks; the image sensor is mounted on the forklift's forks (e.g., at the fork tips) and is used to acquire images of the area in front of the forklift's forks. This application only describes the pallet pose recognition part of the forklift in detail; for the working principle of the forklift itself, please refer to relevant forklift specifications.
[0052] Step 101: When the forklift is in front of the pallet to be picked up, acquire the sensor image and the first point cloud in front of the forklift's fork arm.
[0053] The forklift can move towards the location of the goods to be transported either in response to received control commands or autonomously. During the movement, the forklift can determine whether it has reached the goods from its position relative to the goods, combined with the acquired images. The goods rest on a pallet. After the forklift has moved to the location of the goods, that is, when the forklift is in front of the pallet to be picked up (specifically, the forklift forks are facing the pallet), the image sensor acquires a sensor image of the area in front of the forklift forks, including the pallet, and the radar sensor acquires a first point cloud of the area in front of the forklift forks, including the pallet.
[0054] In one embodiment, the first point cloud can be obtained by fusing several frames of radar point clouds collected by a radar sensor. The fused point cloud can first undergo ground removal processing, and point cloud filtering can be performed according to the ground range. The filtered point cloud is then subjected to adaptive voxel filtering, where the voxel size is set according to the local point cloud density, with smaller voxels used for high-density point cloud regions and larger voxels used for low-density point cloud regions. In each voxel, the centroid coordinates of the points within that voxel are calculated, and the centroid is used to represent all the original points within that voxel. This reduces the amount of point cloud data while preserving point cloud features. Subsequently, the point cloud obtained after ground removal processing and adaptive voxel filtering is used as the final first point cloud, thereby achieving the effect of noise reduction.
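The voxel filtering described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the coarse grid used to estimate local density, the function names, and all numeric parameters (`coarse_size`, `small_voxel`, `large_voxel`, `dense_count`) are assumptions introduced here. Ground removal is not shown.

```python
import math
from collections import defaultdict


def voxel_centroid_downsample(points, voxel_size):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        buckets[key].append((x, y, z))
    centroids = []
    for pts in buckets.values():
        n = len(pts)
        centroids.append((sum(p[0] for p in pts) / n,
                          sum(p[1] for p in pts) / n,
                          sum(p[2] for p in pts) / n))
    return centroids


def adaptive_voxel_downsample(points, coarse_size=0.5, small_voxel=0.02,
                              large_voxel=0.1, dense_count=50):
    """Use small voxels in dense regions and large voxels in sparse regions.

    Local density is estimated (assumption) by counting the points that
    fall into each cell of a coarse grid.
    """
    coarse = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / coarse_size),
               math.floor(y / coarse_size),
               math.floor(z / coarse_size))
        coarse[key].append((x, y, z))
    result = []
    for pts in coarse.values():
        size = small_voxel if len(pts) >= dense_count else large_voxel
        result.extend(voxel_centroid_downsample(pts, size))
    return result
```

With these assumed defaults, a densely sampled region keeps more centroids than it would under the large voxel size, while sparse regions are reduced more aggressively.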
[0055] Step 102: Based on the image detection result obtained by performing detection on the sensor image, obtain the pallet image in the sensor image and the pallet point cloud in the first point cloud.
[0056] First, the sensor image acquired by the image sensor is preprocessed, for example by image distortion correction, illumination balancing, and size normalization. Detection is then performed on the preprocessed sensor image, including pallet type detection, overall pallet detection, and pallet leg position recognition. This yields the image detection result, which includes the pallet type, the overall position information of the pallet in the sensor image, and the position information of the pallet legs. The overall pallet position information includes the coordinate range of the pallet and the pallet leg position information. Subsequently, based on the overall pallet position information indicated in the image detection result, a pallet image containing the pallet is obtained. This pallet image is the region of the sensor image that shows the face of the pallet facing the forklift. Figure 2 shows the regions of the overall pallet and the pallet legs marked after detection of the sensor image, where 10 represents a pallet leg and 11 represents the overall pallet.
[0057] Image detection is performed by a preset image recognition algorithm or by a network model trained specifically for pallet detection. The pallet detection network model can adopt a YOLO network architecture, such as the YOLOv5s network architecture. The collected training data for the pallet detection network model takes the form (X, Y), where X represents the input image and Y represents the output image detection result. The input images X can cover different scenes, such as: different lighting conditions (e.g., sunny days, cloudy days, and indoor scenes, which can be represented by the light intensity of the environment); different angles of the forklift relative to the pallet (1 to 180 degrees, with an angle interval selected as needed, such as 15 degrees, 10 degrees, or 5 degrees); different distances between the forklift and the pallet (e.g., 0.5 m to 3 m); different degrees of pallet occlusion (occlusion rate 20% to 50%); and different types of pallets (e.g., grid pallets, cross pallets, wooden pallets, plastic pallets, European standard pallets). The image detection result Y includes the pallet type, the overall position information of the pallet in the image, and the position information of the pallet legs in the image. The collected training data is used to train the pallet detection network model; for the specific training process, refer to the related art.
[0058] During detection of the sensor image, the FOV (Field of View) range of the image sensor is also calculated. The calculation uses the overall position information of the pallet obtained from detection together with the intrinsic parameters of the image sensor: based on the overall position information of the pallet in the image, the actual size of the pallet, and the focal length of the image sensor, the FOV range is calculated using geometric relationships or existing image processing algorithms. For example, based on the relationship between the imaged size of the pallet face that faces the forklift and the camera's focal length, combined with the known actual size of that face (which can be obtained from the identified pallet type), the horizontal and vertical FOV angles or ranges of the camera are calculated using methods such as the principle of similar triangles, thereby obtaining the FOV range of the image sensor.
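One way to read the similar-triangles relationship is sketched below. It assumes an ideal pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` in pixels and a detected pallet bounding box in pixel coordinates; these names and the helper functions are illustrative assumptions, not taken from the application.

```python
import math


def pallet_distance(focal_px, real_width_m, width_px):
    """Similar triangles: real_width / distance = width_px / focal_px."""
    return focal_px * real_width_m / width_px


def box_angular_range(box, fx, fy, cx, cy):
    """Horizontal and vertical viewing angles (radians) spanned by a box.

    box is (u_min, v_min, u_max, v_max) in pixels; the angle of a pixel
    column u relative to the optical axis is atan((u - cx) / fx).
    """
    u_min, v_min, u_max, v_max = box
    horizontal = (math.atan((u_min - cx) / fx), math.atan((u_max - cx) / fx))
    vertical = (math.atan((v_min - cy) / fy), math.atan((v_max - cy) / fy))
    return horizontal, vertical
```

For example, with a focal length of 600 px, a pallet face 1.2 m wide that appears 360 px wide would be about 2 m away under this model.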
[0059] Based on the extrinsic parameter calibration between the image sensor and the radar sensor, the first point cloud is transformed into the coordinate system of the image sensor. The first point cloud in the image sensor coordinate system is then segmented using the FOV range of the image sensor to obtain the second point cloud, which lies within the same FOV range as the sensor image. The second point cloud is then clustered, and the largest cluster is selected as the pallet point cloud. Optionally, statistical filtering can be applied to the second point cloud before clustering to remove noise; the filtered second point cloud is then clustered, and the largest cluster is selected as the pallet point cloud.
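The FOV segmentation and largest-cluster selection can be sketched as follows. The sketch assumes the points are already expressed in the camera frame with x to the right, y downward, and z forward, and it uses a simple distance-threshold (Euclidean) clustering; the application does not name a clustering algorithm, so that choice, the `radius` parameter, and the function names are assumptions. Statistical filtering is not shown.

```python
import math
from collections import deque


def crop_to_fov(points_cam, h_range, v_range):
    """Keep points whose viewing angles lie inside the given ranges (radians)."""
    kept = []
    for x, y, z in points_cam:
        if z <= 0:
            continue  # behind the camera
        if (h_range[0] <= math.atan2(x, z) <= h_range[1]
                and v_range[0] <= math.atan2(y, z) <= v_range[1]):
            kept.append((x, y, z))
    return kept


def largest_cluster(points, radius):
    """Group points whose neighbours are within `radius`; return the biggest group."""
    unvisited = set(range(len(points)))
    best = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        if len(cluster) > len(best):
            best = cluster
    return [points[i] for i in best]
```

The neighbour search here is brute force, which is adequate for a sketch but would normally be replaced by a spatial index for full-size point clouds.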
[0060] Step 103: Fuse the pallet image with the pallet point cloud to obtain the target point cloud of the pallet.
[0061] The coordinates of each point on the pallet in the pallet image are fused with the coordinates of the corresponding point in the pallet point cloud to obtain the fused coordinates of each point on the pallet. These fused coordinates form the target point cloud of the pallet. The target point cloud integrates useful features carried in the sensor image collected by the image sensor (such as auxiliary information for pallet type recognition) and the precise spatial location information contained in the first point cloud collected by the lidar (which is not affected even in low light). The fusion calculation may, for example, take the average of the two coordinates, select one of the two coordinates, or assign different weights to the coordinates of the points on the pallet in the pallet image and the coordinates of the points on the pallet in the pallet point cloud.
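The three fusion options mentioned above (average, select one, weighted) can all be expressed as one weighted combination. The sketch below assumes the image-derived points and the point cloud points have already been put into correspondence and share one coordinate frame; the function name and signature are illustrative.

```python
def fuse_points(image_points, cloud_points, image_weight, cloud_weight):
    """Weighted per-coordinate fusion of corresponding points.

    Equal weights give the average; weights (0, 1) or (1, 0) select one source.
    """
    total = image_weight + cloud_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    wi = image_weight / total
    wc = cloud_weight / total
    return [tuple(wi * a + wc * b for a, b in zip(p, q))
            for p, q in zip(image_points, cloud_points)]
```

Normalizing by the sum of the weights keeps the fused point on the segment between the two source points regardless of how the weights are scaled.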
[0062] Step 104: Determine the pose information of the pallet based on the target point cloud of the pallet.
[0063] The target point cloud of the pallet can be understood as representing the target plane of the pallet facing the forklift. Then, based on the target plane, the pose information of the pallet can be obtained. The pose information can include the coordinates of the center point of the pallet and the heading angle of the pallet relative to the forklift.
[0064] In one embodiment, because the target point cloud of the pallet fuses image and point cloud information, it may contain some deviations. It can therefore be further optimized to obtain more accurate pallet pose information. Referring to Figure 3, step 104 includes the following sub-steps:
[0065] Sub-step 1041: Based on the target point cloud of the pallet, obtain the preliminary pose information of the pallet.
[0066] The target point cloud obtained through coordinate fusion can be used to analyze and obtain preliminary pose information of the pallet. This preliminary pose information can be used as reference information for subsequent calculations to obtain accurate pallet pose information. The method for obtaining the preliminary pose information of the pallet is as follows:
[0067] Multiple feature points are selected from the target point cloud, located within a defined feature area of the pallet. In other words, a defined feature area is selected from the pallet to represent the entire pallet; this defined feature area could be, for example, the pallet legs. The pallet legs are generally also centrally symmetrical, so all feature points on the pallet legs are selected from the target point cloud. The center point of these multiple feature points is then calculated as the reference center point of the pallet.
[0068] The vertical vector of the target plane of the pallet facing the forklift is obtained based on the coordinates of the multiple feature points. The coordinates of these feature points form a feature data matrix. The coordinate values of each dimension in the feature data matrix are then standardized: the mean and standard deviation of the coordinates of all feature points are calculated along each coordinate axis, and the coordinates of each feature point are converted to standardized coordinates with a mean of 0 and a standard deviation of 1, yielding the standardized feature data matrix. Subsequently, the covariance matrix of the standardized feature data matrix is obtained. Assuming the standardized feature data matrix is A, which contains the coordinates of n standardized feature points, the covariance matrix B of A is $B = \frac{1}{n-1} A^{T} A$.
[0069] Find the eigenvalues and corresponding eigenvectors of the covariance matrix B, sort all eigenvalues from largest to smallest, and use the eigenvector corresponding to the largest eigenvalue of the covariance matrix B as the vertical vector of the target plane of the pallet facing the forklift.
[0070] The target plane is in the forklift coordinate system. Therefore, the reference heading angle of the target plane relative to the forklift can be determined based on the vertical vector of the target plane. The angle between the vertical vector of the target plane and the positive direction of the horizontal axis (x-axis) of the forklift coordinate system is the reference heading angle of the target plane relative to the forklift. The aforementioned calculated reference center point and reference heading angle constitute the preliminary pose information of the pallet.
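The preliminary pose computation can be sketched as below, following the text literally: center point, per-axis standardization, covariance matrix B = AᵀA / (n − 1), eigenvector of the largest eigenvalue, and the angle to the positive x-axis. Several details are assumptions made for illustration: the eigenvector is found by power iteration, the heading is measured in the x-y plane with `atan2`, an axis with zero spread is mapped to 0, and the `scale` flag is an addition that is not in the text.

```python
import math


def center_point(points):
    """Mean of the feature points, used as the reference center point."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def standardize(points, scale=True):
    """Center each axis; with scale=True also divide by the sample std."""
    n = len(points)
    mean = center_point(points)
    std = []
    for k in range(3):
        var = sum((p[k] - mean[k]) ** 2 for p in points) / (n - 1)
        std.append(math.sqrt(var))
    rows = []
    for p in points:
        row = []
        for k in range(3):
            value = p[k] - mean[k]
            if scale:
                value = value / std[k] if std[k] > 0 else 0.0
            row.append(value)
        rows.append(tuple(row))
    return rows


def covariance(a):
    """B = A^T A / (n - 1) for an n x 3 matrix A given as a list of rows."""
    n = len(a)
    return [[sum(r[i] * r[j] for r in a) / (n - 1) for j in range(3)]
            for i in range(3)]


def dominant_eigenvector(b, iterations=200):
    """Power iteration: eigenvector of the largest eigenvalue of symmetric b."""
    v = [1.0, 0.5, 0.25]
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0:
            break
        v = [c / norm for c in w]
    return v


def preliminary_pose(feature_points, scale=True):
    """Return (reference center point, reference heading angle in degrees)."""
    center = center_point(feature_points)
    vec = dominant_eigenvector(covariance(standardize(feature_points, scale)))
    heading = math.degrees(math.atan2(vec[1], vec[0]))
    return center, heading
```

Because scaling every axis to unit variance changes the relative spread of the axes, the direction obtained with `scale=True` is expressed in standardized coordinates; `scale=False` only centers the data and is offered here purely as a point of comparison.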
[0071] Sub-step 1042: Based on the preliminary pose information of the pallet, the target point cloud, and the pallet template point cloud corresponding to the pallet type, the final pose information of the pallet is obtained.
[0072] The pallet template point cloud and the target point cloud are registered to obtain the transformation information of the pallet template point cloud in the coordinate system of the target point cloud. For example, the ICP (Iterative Closest Point) algorithm is used for the registration. First, the corresponding point pairs between the pallet template point cloud and the target point cloud are obtained, and a rotation and translation matrix is constructed from them. The constructed rotation and translation matrix is then used to transform the pallet template point cloud into the coordinate system of the target point cloud, and the error function between the transformed pallet template point cloud and the target point cloud is calculated. If the value of the error function is greater than the preset error threshold, the iteration continues: the corresponding point pairs are adjusted, the rotation and translation matrix is reconstructed, and the error function is recalculated, until its value is less than or equal to the preset error threshold. At this point, the error between the pallet template point cloud and the target point cloud is minimized. Figure 4 shows a schematic diagram after the pallet template point cloud and the target point cloud are registered. The white circles represent the pallet template point cloud, the black dots represent the detected target point cloud, and the arrow is the vertical vector of the pallet template point cloud.
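The ICP loop described above can be illustrated with a planar (2D) point-to-point variant. This is a simplified sketch under stated assumptions: points are reduced to the x-y plane, correspondences are found by brute-force nearest neighbour, the rigid transform is solved in closed form for 2D, and the error function is the mean squared distance over the matched pairs. The function names and default thresholds are hypothetical.

```python
import math


def apply_transform(points, theta, tx, ty):
    """Rotate by theta (radians) and translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def nearest_point(p, candidates):
    """Closest candidate to p by squared Euclidean distance."""
    return min(candidates,
               key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_transform(src, dst):
    """Least-squares rotation and translation mapping src[i] onto dst[i]."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def icp_2d(template, target, error_threshold=1e-10, max_iterations=50):
    """Register template onto target; returns (theta, tx, ty, error)."""
    theta, tx, ty = 0.0, 0.0, 0.0
    error = float("inf")
    for _ in range(max_iterations):
        moved = apply_transform(template, theta, tx, ty)
        pairs = [nearest_point(p, target) for p in moved]      # correspondences
        theta, tx, ty = best_rigid_transform(template, pairs)  # rebuild R, t
        moved = apply_transform(template, theta, tx, ty)
        error = sum((m[0] - q[0]) ** 2 + (m[1] - q[1]) ** 2
                    for m, q in zip(moved, pairs)) / len(moved)
        if error <= error_threshold:
            break
    return theta, tx, ty, error
```

Solving each iteration from the original template to the current matches yields the total transform directly, so no composition of incremental transforms is needed. Like any ICP variant, the sketch relies on a reasonable initial alignment.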
[0073] The rotation and translation matrix that minimizes the error between the pallet template point cloud and the target point cloud is the transformation information of the pallet template point cloud in the coordinate system of the target point cloud.
[0074] Based on the transformation information and the preliminary pose information, the final pose information of the pallet is obtained. For example, with the preliminary pose information (the reference center point and the vertical vector of the target plane of the pallet facing the forklift) as the starting point, the rotation and translation matrix is used to transform the pallet center point and the vertical vector of the pallet template point cloud. Based on the vertical vector of the transformed pallet template point cloud, the final heading angle of the pallet can be obtained. The pallet center point of the pallet template point cloud and the final heading angle constitute the final pose information of the pallet.
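As one possible reading of this step, the sketch below applies a planar rotation and translation to the center point and vertical vector of the template and derives a heading angle from the rotated vector. It assumes a 2D (x-y) transform given as an angle `theta` and a translation `(tx, ty)`; the function name and parameterization are hypothetical. A point is rotated and translated, while a direction vector is only rotated.

```python
import math


def final_pose(template_center, template_vector, theta, tx, ty):
    """Transform the template center and vector; return (center, heading in degrees)."""
    c, s = math.cos(theta), math.sin(theta)
    cx, cy = template_center
    vx, vy = template_vector
    center = (c * cx - s * cy + tx, s * cx + c * cy + ty)  # rotate + translate
    vector = (c * vx - s * vy, s * vx + c * vy)            # rotate only
    heading = math.degrees(math.atan2(vector[1], vector[0]))
    return center, heading
```

For instance, a template centered at (1, 0) with its vector along the x-axis, rotated by 90 degrees and shifted by (0, 1), ends up centered near (0, 2) with a heading of 90 degrees.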
[0075] Based on the final pose information of the pallet, the forklift's position and angle facing the pallet are adjusted. For example, based on the pallet's center point in the final pose information, the forklift's position is adjusted so that its center point aligns with the pallet's center point. The angle between the forklift and pallet is adjusted based on the final heading angle in the final pose information, ensuring they are parallel, i.e., the pallet's vertical vector coincides with the forklift's horizontal axis in the coordinate system. After alignment, the forklift can be directly controlled to move forward and pick up the pallet. Additionally, based on the identified pallet type, the corresponding socket spacing is obtained, and the fork arm spacing is then adjusted.
[0076] Based on the above process, preliminary pose information of the pallet is first obtained from the target point cloud. Since the target point cloud is a fusion of pallet image and point cloud information, the preliminary pose information is also a fusion of features from both images and point clouds. Then, by registering the pallet template point cloud corresponding to the pallet type with the target point cloud, the transformation information of the pallet template point cloud to the target point cloud coordinate system is obtained. At this point, the error between the transformed pallet template point cloud and the target point cloud is minimized, resulting in a more standard pallet point cloud. This avoids the deviation caused by fusing the pallet image and point cloud. Based on the transformation information and the preliminary pose information, more accurate pallet pose information can be obtained, improving the accuracy of pallet pose recognition. The horizontal position accuracy of the pallet reaches ±10mm, and the attitude angle accuracy reaches ±1. This directly shortens the pallet pose recognition time and improves the success rate of forklift pallet picking.
[0077] Figure 5 is a flowchart of a pallet pose recognition method provided in Embodiment 2 of this application. It is described based on the above technical solution and can be combined with the above multiple implementation methods.
[0078] Step 201: When the forklift is in front of the pallet to be picked up, acquire a sensor image and a first point cloud in front of the forklift's fork arm. This is largely the same as step 101 in the previous embodiment and will not be described again here.
[0079] Step 202: Based on the image detection result obtained by performing detection on the sensor image, acquire the pallet image in the sensor image and the pallet point cloud in the first point cloud. This is largely the same as step 102 in the previous embodiment and will not be described again here.
[0080] Step 203 includes the following sub-steps:
[0081] Sub-step 2031: Obtain the image weight of the sensor image and the point cloud weight of the first point cloud.
[0082] Sub-step 2032: Based on the image weight and the point cloud weight, the coordinates of each point in the pallet image are fused with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet.
[0083] Different weights are set for the image and the point cloud, and these weights are used to fuse the coordinates of corresponding points in the pallet image and the pallet point cloud.
[0084] In one embodiment, the image weight and the point cloud weight can be preset. The image weight and the point cloud weight are related to the quality of the acquired sensor image and first point cloud: the quality of the sensor image and the quality of the first point cloud are evaluated separately to obtain the image quality parameters of the sensor image and the point cloud quality parameters of the first point cloud, and the image weight of the sensor image and the point cloud weight of the first point cloud are then obtained from these quality parameters.
[0085] First, the quality of the sensor image is assessed to obtain the image quality parameters of the sensor image. The sensor image quality assessment values include sharpness assessment value, exposure assessment value, and motion blur assessment value.
[0086] The sharpness evaluation value is used to characterize the sharpness of the sensor image. It is calculated as follows: the gradient of the sensor image is calculated using the Laplacian operator, the gradient distribution characteristics of the sensor image are statistically analyzed, and a sharpness evaluation value $S_q$ is then generated based on these gradient distribution characteristics. The calculation formula is as follows:
[0087] $$S_q = \min\left(100,\ \max\left(0,\ \alpha \cdot \frac{\frac{1}{N}\sum_{(x,y)} \left|\nabla^{2} I(x,y)\right| - T_{min}}{T_{max} - T_{min}} \cdot 100\right)\right)$$
[0088] Where: (x, y) represents the coordinate position of a pixel in the sensor image, ∇²I(x, y) represents the Laplacian operator result of the sensor image at the pixel position (x, y), N represents the total number of pixels in the sensor image, α represents the preset weight coefficient (e.g., 1.2), T_min represents the preset lower limit threshold of image sharpness (e.g., 20), and T_max represents the preset upper limit threshold of image sharpness (e.g., 150).
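The sharpness formula above can be sketched in plain Python as follows. This is only an illustrative sketch: the function name `sharpness_score`, the 4-neighbour discrete Laplacian kernel, and the choice to skip border pixels (so they contribute zero to the sum) are assumptions that the text does not specify.

```python
def sharpness_score(image, alpha=1.2, t_min=20.0, t_max=150.0):
    """Sharpness score S_q in [0, 100] from the mean absolute Laplacian.

    `image` is a 2D list of grayscale values (0-255).
    Assumption: 4-neighbour Laplacian; border pixels are skipped.
    """
    h, w = len(image), len(image[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1]
                   - 4 * image[y][x])
            total += abs(lap)
    n = h * w  # N: total number of pixels in the image
    mean_abs_lap = total / n
    raw = alpha * (mean_abs_lap - t_min) / (t_max - t_min) * 100.0
    return min(100.0, max(0.0, raw))


# A uniform image has no gradients, so its score is clamped to 0.
flat = [[128] * 6 for _ in range(6)]
print(sharpness_score(flat))
```

With the example thresholds, a mean absolute Laplacian at or below T_min = 20 is clamped to 0, and the score saturates at 100 once the scaled value exceeds the upper bound.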
[0089] The exposure assessment value characterizes the exposure status of the sensor image. It is calculated as follows: a brightness histogram of the sensor image is obtained; the proportions of overexposed and underexposed areas in the sensor image are calculated; and an exposure assessment value S_b is generated based on these proportions. The calculation formula is as follows:
[0090] S_b = min(100, max(0, 100 * (1 - β * max(|μ - μ_target| / σ_target, P_over + P_under))));
[0091] Where: μ represents the average brightness of pixels in the sensor image; μ_target represents the preset target brightness value (e.g., 128); σ_target represents the preset target standard deviation (e.g., 45); P_over represents the proportion of overexposed pixels in the sensor image, which are pixels with brightness values greater than the overexposure threshold (e.g., 220); P_under represents the proportion of underexposed pixels in the sensor image, which are pixels with brightness values less than the underexposure threshold (e.g., 30); β represents the preset weighting coefficient (e.g., 1.5).
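As a rough illustration of the exposure formula, the sketch below computes the mean brightness and the over/underexposed pixel proportions directly from the pixel values rather than from an explicit histogram. The function name `exposure_score` and the 2D-list image representation are assumptions made for this example.

```python
def exposure_score(image, mu_target=128.0, sigma_target=45.0,
                   over_thr=220, under_thr=30, beta=1.5):
    """Exposure score S_b in [0, 100] for a 2D list of grayscale values."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mu = sum(pixels) / n                                   # mean brightness
    p_over = sum(1 for p in pixels if p > over_thr) / n    # overexposed share
    p_under = sum(1 for p in pixels if p < under_thr) / n  # underexposed share
    penalty = max(abs(mu - mu_target) / sigma_target, p_over + p_under)
    return min(100.0, max(0.0, 100.0 * (1.0 - beta * penalty)))


# A uniform image slightly brighter than the target brightness.
print(exposure_score([[137] * 4 for _ in range(4)]))
```

For a uniform image at brightness 137, the deviation term is 9 / 45 = 0.2 and no pixel is over- or underexposed, so the score is approximately 100 * (1 - 1.5 * 0.2) = 70.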
[0092] The motion blur evaluation value characterizes the degree of motion blur in the sensor image. It is calculated as follows: the difference between the sensor image and an adjacent frame is calculated, the edge blur of the sensor image is analyzed, and a motion blur evaluation value S_m is generated based on the edge blur. The calculation formula is as follows:
[0093] S_m = min(100, max(0, 100 * (1 - γ * Σ|I_t(x, y) - I_{t-1}(x, y)| / (N * 255))));
[0094] Where: I_t(x, y) represents the pixel value of the current frame of the sensor image at (x, y); I_{t-1}(x, y) represents the pixel value of the previous frame of the sensor image at (x, y); N is the total number of pixels in the sensor image; γ is a preset weighting coefficient (e.g., 2.0).
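The frame-difference formula can be sketched as below. The function name `motion_blur_score` is hypothetical, and the sketch assumes both frames are 2D lists of grayscale values (0-255) with identical dimensions.

```python
def motion_blur_score(curr, prev, gamma=2.0):
    """Motion blur score S_m in [0, 100] from the mean absolute frame difference.

    Assumption: `curr` and `prev` are same-sized 2D lists of values in 0-255.
    """
    diff = sum(abs(c - p)
               for row_c, row_p in zip(curr, prev)
               for c, p in zip(row_c, row_p))
    n = sum(len(row) for row in curr)  # N: total number of pixels
    return min(100.0, max(0.0, 100.0 * (1.0 - gamma * diff / (n * 255))))


prev = [[0] * 4 for _ in range(4)]
curr = [[51] * 4 for _ in range(4)]
print(motion_blur_score(curr, prev))
```

Here every pixel changes by 51, which is 20% of the full range, so the score is approximately 100 * (1 - 2.0 * 0.2) = 60; identical frames score 100.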
[0095] Based on the at least one image quality assessment value of the sensor image, the image quality parameter of the sensor image is obtained. If only one of the sharpness assessment value, exposure assessment value, and motion blur assessment value is selected, that value can be used directly as the image quality parameter of the sensor image. If more than one of them is selected, the image quality parameter can be calculated as their mean or weighted average.
[0096] When the sharpness evaluation value, exposure evaluation value, and motion blur evaluation value of the sensor image are all acquired, the weights of the three values are obtained respectively, and the image quality parameter of the sensor image can be calculated by weighted summation. The expression of the image quality parameter Kt of the sensor image is as follows:
[0097] Kt = w_c * sharpness assessment value + w_e * exposure assessment value + w_m * motion blur assessment value;
[0098] Where: w_c is the sharpness weight, for example, 0.4; w_e is the exposure weight, for example, 0.3; w_m is the motion blur weight, for example, 0.3.
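The rule for turning one or more assessment values into a single quality parameter can be sketched as follows. The function name `combine_scores` and the dictionary keys are hypothetical. Dividing by the sum of the weights is an added assumption; it leaves the weighted sum unchanged when the weights already sum to 1, as in the example values 0.4, 0.3 and 0.3.

```python
def combine_scores(scores, weights=None):
    """Combine the selected assessment values into one quality parameter.

    scores:  dict such as {"sharpness": 80.0, "exposure": 70.0}
    weights: optional dict with the same keys; plain mean when omitted.
    """
    if not scores:
        raise ValueError("at least one assessment value is required")
    if len(scores) == 1:
        # A single selected value is used directly as the parameter.
        return next(iter(scores.values()))
    if weights is None:
        return sum(scores.values()) / len(scores)
    total_w = sum(weights[k] for k in scores)
    return sum(weights[k] * scores[k] for k in scores) / total_w


kt = combine_scores(
    {"sharpness": 80.0, "exposure": 70.0, "motion_blur": 60.0},
    {"sharpness": 0.4, "exposure": 0.3, "motion_blur": 0.3},
)
print(kt)
```

With these inputs Kt is approximately 0.4 * 80 + 0.3 * 70 + 0.3 * 60 = 71.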
[0099] The quality of the first point cloud is evaluated to obtain the point cloud quality parameter of the first point cloud. This includes obtaining at least one point cloud quality assessment value of the first point cloud; the available assessment values are the point cloud density assessment value, the point cloud integrity assessment value, and the noise level assessment value.
[0100] The point cloud density assessment value characterizes the uniformity of the first point cloud. The number of points per unit volume in the first point cloud is calculated, and the uniformity of the first point cloud is assessed based on this number, resulting in the point cloud density assessment value S_d. The calculation formula is as follows:
[0101] S_d = min(100, max(0, 100 * (1 - δ * |ρ_actual - ρ_target| / ρ_target)));
[0102] Where: ρ_actual represents the actual point cloud density of the first point cloud, which is the number of points in the first point cloud divided by the volume; ρ_target is the preset target point cloud density, which can be set according to the current scene, the characteristics of the radar sensor and the accuracy requirements, for example, a value in the range of 100-500 points / cubic meter; δ is the preset weighting coefficient (for example, 1.5).
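A minimal sketch of the density formula is given below. The text does not say how the volume is measured, so the axis-aligned bounding box used in `bbox_volume` is an assumption, as are both function names and the default target density of 300 points per cubic meter (chosen from within the stated 100-500 range).

```python
def bbox_volume(points):
    """Volume of the axis-aligned bounding box of 3D points (assumed measure)."""
    spans = [max(p[i] for p in points) - min(p[i] for p in points)
             for i in range(3)]
    return spans[0] * spans[1] * spans[2]


def density_score(num_points, volume_m3, rho_target=300.0, delta=1.5):
    """Density score S_d in [0, 100]; volume in cubic meters."""
    if volume_m3 <= 0:
        raise ValueError("volume must be positive")
    rho_actual = num_points / volume_m3
    deviation = abs(rho_actual - rho_target) / rho_target
    return min(100.0, max(0.0, 100.0 * (1.0 - delta * deviation)))


print(density_score(240, 1.0))
```

A cloud with 240 points in one cubic meter deviates from the target by 20%, giving a score of approximately 100 * (1 - 1.5 * 0.2) = 70. Note that a perfectly planar cloud has zero bounding-box volume, so a real implementation would need a different volume measure for that case.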
[0103] The point cloud integrity assessment value characterizes the integrity of the first point cloud. The first point cloud is compared with the pallet template point cloud corresponding to the pallet type, and the coverage rate of the first point cloud over the key feature region of the pallet template point cloud (e.g., the pallet face toward the forklift) is calculated. First, the pallet template point cloud is transformed into the coordinate system of the radar sensor using the extrinsic calibration matrix of the radar sensor. Then, with both point clouds in the same coordinate system, a matching algorithm (e.g., a nearest neighbor algorithm) finds the closest point pairs between the pallet template point cloud and the first point cloud; these closest point pairs are the matching point pairs. The ratio of the number of matched points to the total number of points in the pallet template point cloud gives the coverage rate. The integrity assessment value S_w of the first point cloud is then obtained. The calculation formula is as follows:
[0104] S_w = min(100, max(0, 100 * (N_match / N_template) * (1 - ε * D_avg / D_max)));
[0105] Where: N_match represents the number of points in the first point cloud that match the pallet template point cloud; N_template represents the number of points in the pallet template point cloud; D_avg represents the average point distance of the matching point pairs, obtained by calculating the Euclidean distance of each matching point pair between the first point cloud and the pallet template point cloud and averaging these distances; D_max represents the preset maximum allowable distance, which is set based on the geometric features of the pallet (for a standard pallet, for example, a value within 20-50 mm); ε represents the preset weighting coefficient (e.g., 0.8).
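The integrity calculation can be sketched with a brute-force nearest neighbor search. Several details here are assumptions not stated in the text: the function name `integrity_score`, coordinates in meters, a template that has already been transformed into the sensor frame, and the rule that a template point counts as matched only when its nearest cloud point lies within D_max.

```python
import math


def integrity_score(cloud, template, d_max=0.05, eps=0.8):
    """Integrity score S_w in [0, 100] for lists of (x, y, z) tuples in meters.

    Assumption: a template point is matched if its nearest neighbor in
    `cloud` is at most `d_max` away.
    """
    if not cloud or not template:
        return 0.0
    matched = []
    for t in template:
        d = min(math.dist(t, c) for c in cloud)  # brute-force nearest neighbor
        if d <= d_max:
            matched.append(d)
    if not matched:
        return 0.0
    coverage = len(matched) / len(template)   # N_match / N_template
    d_avg = sum(matched) / len(matched)       # D_avg over matched pairs
    return min(100.0, max(0.0, 100.0 * coverage * (1.0 - eps * d_avg / d_max)))


template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
print(integrity_score(template[:2], template))
```

In this usage example the cloud covers exactly half of the template with zero distance, so the score is 50. The brute-force search is quadratic in the number of points; a k-d tree would be the usual choice for real point clouds.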
[0106] The noise level assessment value characterizes the noise in the first point cloud. First, the proportion of outliers in the first point cloud is determined: the first point cloud is clustered to obtain point cloud clusters, and for each point the distance to the center of its local point cloud cluster is calculated and compared with a preset threshold. A point whose distance exceeds the preset threshold is identified as an outlier; otherwise it is a normal point. The preset threshold is 1.5-2 times the mean distance from the points in the cluster to the cluster center. Next, smoothness analysis is performed on the first point cloud, using the standard deviation of local points to reflect the smoothness. The local points can be selected from a neighborhood in the first point cloud (with a radius of, for example, 5-10 cm) on the pallet face toward the forklift, and the standard deviation of the points in this neighborhood is taken as the standard deviation of the local points. Finally, a noise level assessment value S_z is generated. The calculation formula is as follows:
[0107] S_z = min(100, max(0, 100 * (1 - η * (N_outlier / N_total + σ_local / σ_max))));
[0108] Where: N_outlier represents the number of outliers in the first point cloud; N_total represents the total number of points in the first point cloud; σ_local represents the local standard deviation, which is the standard deviation of the local points mentioned above; σ_max is the maximum permissible standard deviation, which is set based on the accuracy of the radar sensor, for example, a value within 5-10mm or 10-20mm; η is the weighting coefficient (for example, 1.2).
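The outlier rule and the noise formula can be sketched as follows. For brevity the sketch treats the whole input as a single cluster, which is a simplifying assumption (the text clusters the point cloud first). The function names, the use of meters, and passing `sigma_local` in as a precomputed value are also assumptions.

```python
import math


def count_outliers(points, k=1.5):
    """Count points farther than k * mean distance from the centroid.

    Assumption: all points belong to one cluster; k is in the 1.5-2 range.
    """
    n = len(points)
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    dists = [math.dist(p, centroid) for p in points]
    mean_d = sum(dists) / n
    return sum(1 for d in dists if d > k * mean_d)


def noise_score(n_outlier, n_total, sigma_local, sigma_max=0.01, eta=1.2):
    """Noise score S_z in [0, 100]; sigma values in meters."""
    penalty = n_outlier / n_total + sigma_local / sigma_max
    return min(100.0, max(0.0, 100.0 * (1.0 - eta * penalty)))


print(noise_score(10, 100, 0.004))
```

With 10% outliers and a local standard deviation of 4 mm against a 10 mm maximum, the penalty is 0.1 + 0.4 = 0.5 and the score is approximately 100 * (1 - 1.2 * 0.5) = 40.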
[0109] Based on the at least one point cloud quality assessment value of the first point cloud, the point cloud quality parameter of the first point cloud is obtained. If only one of the point cloud density assessment value, point cloud integrity assessment value, and noise level assessment value is selected, that value can be used directly as the point cloud quality parameter of the first point cloud. If more than one of them is selected, the point cloud quality parameter can be calculated as their mean or weighted average.
[0110] When the point cloud density assessment value, point cloud integrity assessment value, and noise level assessment value of the first point cloud are all obtained, the weights of the three values are obtained respectively, and the point cloud quality parameter of the first point cloud can be calculated by weighted summation. The expression of the point cloud quality parameter Kd of the first point cloud is as follows:
[0111] Kd = w_d * Point cloud density assessment value + w_b * Point cloud integrity assessment value + w_n * Noise level assessment value;
[0112] Where: w_d is the point cloud density weight, for example, 0.3; w_b is the point cloud integrity weight, for example, 0.4; w_n is the noise level weight, for example, 0.3.
[0113] The foregoing calculations yield the image quality parameter Kt of the sensor image and the point cloud quality parameter Kd of the first point cloud. The image weight of the sensor image and the point cloud weight of the first point cloud are then calculated as follows:
[0114] The formula for calculating the image weight Wc of the sensor image is: Wc = Kt / (Kt + Kd);
[0115] The formula for calculating the point cloud weight Wd of the first point cloud is: Wd = Kd / (Kt + Kd).
[0116] Subsequently, based on the image weight and the point cloud weight, the coordinates of each point in the pallet image and the coordinates of each point in the pallet point cloud can be fused to obtain the target point cloud of the pallet.
[0117] Each point on the pallet has corresponding coordinates in both the pallet image and the pallet point cloud. The two coordinates corresponding to a point are fused to obtain the coordinates of that point. Taking a point P as an example, the fused coordinates (P_xm, P_ym, P_zm) of point P are calculated as follows:
[0118] P_xm = Wc * P_xt + Wd * P_xc;
[0119] P_ym = Wc * P_yt + Wd * P_yc;
[0120] P_zm = Wc * P_zt + Wd * P_zc;
[0121] Where: (P_xt, P_yt, P_zt) represents the coordinates of point P in the pallet image, and (P_xc, P_yc, P_zc) represents the coordinates of point P in the pallet point cloud.
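The weight normalization and the per-point coordinate fusion can be sketched together as below. The function names are hypothetical. The sketch assumes the image-derived points already carry 3D coordinates in the same frame as the point cloud and that the two lists are in one-to-one correspondence; the equal-weight fallback when both quality parameters are zero is also an assumption.

```python
def fusion_weights(kt, kd):
    """Return (Wc, Wd) from the quality parameters Kt and Kd."""
    total = kt + kd
    if total == 0:
        return 0.5, 0.5  # assumption: equal weights when both scores are 0
    return kt / total, kd / total


def fuse_points(image_points, cloud_points, wc, wd):
    """Fuse corresponding (x, y, z) tuples coordinate by coordinate."""
    if len(image_points) != len(cloud_points):
        raise ValueError("point lists must correspond one to one")
    return [tuple(wc * a + wd * b for a, b in zip(pt, pc))
            for pt, pc in zip(image_points, cloud_points)]


wc, wd = fusion_weights(40.0, 60.0)
print(fuse_points([(1.0, 2.0, 3.0)], [(2.0, 4.0, 6.0)], wc, wd))
```

With Kt = 40 and Kd = 60 the weights are 0.4 and 0.6, so the fused point is approximately (1.6, 3.2, 4.8). Because the two weights sum to 1, each fused coordinate always lies between the two source coordinates.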
[0122] In one embodiment of this application, the image weight and point cloud weight are adjusted based on the illumination intensity value of the scene where the forklift is currently located and / or the integrity assessment value of the first point cloud. For example, upper and lower threshold values for illumination intensity, as well as upper and lower threshold values for integrity, are preset in advance. If the illumination intensity value of the scene where the forklift is currently located is greater than the upper threshold value, and / or the integrity assessment value of the first point cloud is less than the lower threshold value, then the image weight is increased and the point cloud weight is decreased accordingly. For example, if the current image weight is 0.4 and the point cloud weight is 0.6, then the image weight can be increased to 0.5 and the point cloud weight decreased to 0.5. If the illumination intensity value of the scene where the forklift is currently located is less than the lower threshold value, and / or the integrity assessment value of the first point cloud is greater than the upper threshold value, then the image weight is decreased and the point cloud weight is increased accordingly. For example, if the current image weight is 0.4 and the point cloud weight is 0.6, then the image weight can be decreased to 0.3 and the point cloud weight increased to 0.7.
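The threshold-based adjustment can be sketched as follows. All threshold values, the fixed step of 0.1 (chosen to match the 0.4 to 0.5 and 0.4 to 0.3 examples), and the function name `adjust_weights` are hypothetical. The text does not say what happens when the two conditions point in opposite directions, so leaving the weights unchanged in that case is an assumption.

```python
def adjust_weights(wc, lux, integrity,
                   lux_low=100.0, lux_high=1000.0,
                   integ_low=40.0, integ_high=80.0, step=0.1):
    """Return adjusted (image weight, point cloud weight).

    Assumption: thresholds and step are illustrative placeholders.
    """
    favour_image = lux > lux_high or integrity < integ_low
    favour_cloud = lux < lux_low or integrity > integ_high
    if favour_image and not favour_cloud:
        wc = min(1.0, wc + step)
    elif favour_cloud and not favour_image:
        wc = max(0.0, wc - step)
    # Conflicting or absent triggers leave the weights unchanged.
    return wc, 1.0 - wc


print(adjust_weights(0.4, lux=1500.0, integrity=60.0))
```

The point cloud weight is recomputed as 1 minus the image weight so that the two weights continue to sum to 1 after the adjustment.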
[0123] Based on the adjusted image weight and point cloud weight, the coordinates of each point in the pallet image are fused with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet. The calculation is the same as that described above and is not repeated here.
[0124] Step 204: Determine the pose information of the pallet based on the target point cloud of the pallet. This is largely the same as step 104 in the previous embodiment and will not be described again here.
[0125] Based on the aforementioned process, the weights are adjusted according to the current scene of the forklift and point cloud fusion is achieved to adapt to different lighting conditions and various pallet types. This increases the anti-interference capability of pallet pose recognition in the point cloud fusion stage and improves the accuracy of the target point cloud of the pallet obtained by fusion.
[0126] One embodiment of this application relates to a controller applied to a forklift. This controller can be a dedicated controller for pallet pose recognition or a main controller within the forklift. The controller is used to execute the pallet pose recognition method steps described in the foregoing embodiments.
[0127] Another embodiment of this application relates to a forklift, which includes: a forklift body, a data acquisition device, and the controller described in the foregoing embodiment.
[0128] The data acquisition device is mounted on the forklift body to collect image and point cloud data of the area in front of the forklift's forks and send the data to the controller. The data acquisition device may include a radar sensor (e.g., a lidar) and an image sensor (e.g., a camera). The radar sensor is mounted on the front of the forklift body to acquire point cloud data of the area in front of the forklift's forks; the image sensor is mounted on the forklift's forks (e.g., at the fork tips) to collect images of the area in front of the forklift's forks. This application describes in detail only the pallet pose recognition part of the forklift; for the working principle of the forklift itself, refer to the related art.
[0129] Another embodiment of this application relates to a computer-readable storage medium, which is a non-volatile or non-transitory storage medium, on which a computer program is stored. When the computer program is run by a processor, it executes the steps of the pallet pose recognition method described in the foregoing embodiments.
[0130] Steps can be reordered, added, or deleted using the various forms of processes shown above. For example, the multiple steps described in this application can be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution provided in this application can be achieved; this is not limited herein.
Claims
1. A pallet pose recognition method, comprising: when a forklift is in front of a pallet to be picked up, acquiring a sensor image and a first point cloud in front of the fork arm of the forklift; based on an image detection result obtained by detecting the sensor image, obtaining a pallet image of the pallet in the sensor image and a pallet point cloud of the pallet in the first point cloud; fusing the pallet image with the pallet point cloud to obtain a target point cloud of the pallet; and determining pose information of the pallet based on the target point cloud of the pallet.
2. The pallet pose recognition method according to claim 1, wherein the image detection result includes the pallet type of the pallet, and determining the pose information of the pallet based on the target point cloud of the pallet comprises: obtaining preliminary pose information of the pallet based on the target point cloud of the pallet; and obtaining final pose information of the pallet based on the preliminary pose information of the pallet, the target point cloud, and the pallet template point cloud corresponding to the pallet type.
3. The pallet pose recognition method according to claim 2, wherein obtaining the preliminary pose information of the pallet based on the target point cloud of the pallet comprises: selecting, from the target point cloud, multiple feature points located within a defined feature area of the pallet; calculating the center point of the multiple feature points and using it as the reference center point of the pallet; obtaining, based on the coordinates of the multiple feature points, the perpendicular vector of the target plane of the pallet that faces the forklift; and determining the reference heading angle of the target plane relative to the forklift based on the perpendicular vector of the target plane; wherein the preliminary pose information includes the reference center point and the reference heading angle.
4. The pallet pose recognition method according to claim 3, wherein obtaining, based on the coordinates of the multiple feature points, the perpendicular vector of the target plane of the pallet that faces the forklift comprises: combining the multiple feature points into a feature data matrix and standardizing the coordinate values of each dimension contained in the feature data matrix; and obtaining the covariance matrix of the standardized feature data matrix, and using the eigenvector corresponding to the largest eigenvalue among all eigenvalues of the covariance matrix as the perpendicular vector.
5. The pallet pose recognition method according to claim 1, wherein fusing the pallet image with the pallet point cloud to obtain the target point cloud of the pallet comprises: obtaining an image weight of the sensor image and a point cloud weight of the first point cloud; and, based on the image weight and the point cloud weight, fusing the coordinates of each point in the pallet image with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet.
6. The pallet pose recognition method according to claim 5, wherein obtaining the image weight of the sensor image and the point cloud weight of the first point cloud comprises: evaluating the quality of the sensor image and the quality of the first point cloud respectively to obtain an image quality parameter of the sensor image and a point cloud quality parameter of the first point cloud; and obtaining the image weight of the sensor image and the point cloud weight of the first point cloud based on the image quality parameter and the point cloud quality parameter.
7. The pallet pose recognition method according to claim 5, wherein, after obtaining the image weight of the sensor image and the point cloud weight of the first point cloud, the method further comprises: adjusting the image weight and the point cloud weight based on the illumination intensity value of the scene where the forklift is currently located and / or the integrity assessment value of the first point cloud; and wherein fusing, based on the image weight and the point cloud weight, the coordinates of each point in the pallet image with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet comprises: fusing, based on the adjusted image weight and the adjusted point cloud weight, the coordinates of each point in the pallet image with the coordinates of each point in the pallet point cloud to obtain the target point cloud of the pallet.
8. The pallet pose recognition method according to claim 6, wherein the quality of the sensor image is evaluated by: obtaining at least one image quality assessment value of the sensor image, wherein the image quality assessment value is any one of the following: a sharpness assessment value, an exposure assessment value, and a motion blur assessment value; and obtaining the image quality parameter of the sensor image based on the at least one image quality assessment value of the sensor image; and wherein the quality of the first point cloud is evaluated by: obtaining at least one point cloud quality assessment value of the first point cloud, wherein the point cloud quality assessment value is any one of the following: a point cloud density assessment value, a point cloud integrity assessment value, and a noise level assessment value; and obtaining the point cloud quality parameter of the first point cloud based on the at least one point cloud quality assessment value of the first point cloud.
9. A controller configured to perform the steps of the pallet pose recognition method according to any one of claims 1-8.
10. A forklift, comprising: a forklift body, a data acquisition device, and the controller according to claim 9; wherein the data acquisition device is mounted on the forklift body and is used to acquire the sensor image and the point cloud in front of the forklift's forks and send them to the controller.