A depth vision recognition and positioning method for AGV end operation

By combining depth cameras and checkerboard calibration with deep learning and traditional algorithms, accurate identification and positioning of pallets in AGV end-point operations were achieved, solving the problem of inaccurate pallet identification in unstructured scenarios and improving the operational efficiency and accuracy of AGVs.

CN118644532B · Active · Publication Date: 2026-06-16 · ANHUI HELI CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ANHUI HELI CO LTD
Filing Date: 2024-06-11
Publication Date: 2026-06-16

AI Technical Summary

Technical Problem

In existing technologies, AGV end-of-line operations face difficulties in pallet recognition and positioning in unstructured scenarios. They are also susceptible to the effects of light intensity and background noise, resulting in inaccurate recognition and making it difficult to adapt to complex and ever-changing scenarios.

Method used

By employing a depth camera combined with a checkerboard calibration method, pallet feature points are detected and segmented through RGB-D/IR images. Combining deep learning and traditional algorithms, accurate pallet identification and positioning are achieved. Supplementary identification is performed in infrared images, and a local path planning algorithm is designed to complete the end-point operation.

Benefits of technology

It achieves high-precision identification and positioning of pallets in complex and ever-changing unstructured scenarios, improves the accuracy and efficiency of AGV end-point operations, adapts to low-light environments, and reduces computing costs.

Generated by Eureka AI based on patent content.


Abstract

The application discloses a depth vision recognition and positioning method for AGV end operations, relating to the technical field of AGVs. The method includes: mapping the pixel coordinate system to the depth camera coordinate system of the forklift and calibrating the extrinsic parameters of the depth camera; obtaining the pose of the current vehicle body center in the world coordinate system, capturing RGB-D / IR images with the depth camera, detecting and segmenting the pallet in the RGB or IR image, and locating the pixel coordinates of the pallet leg feature points and center line point; associating the corresponding depth information of the pallet, calculating the coordinates of the pallet in the world coordinate system, and calculating the pose of the pallet; and generating a local path with a designed local path planning algorithm to complete the end operation. The application discloses a pallet recognition algorithm flow from RGB / IR images to AGV end operations; a pallet detection algorithm, a pallet hole detection algorithm, and a pre-trained infrared image recognition neural network model are used to complete pallet detection and segmentation based on infrared and RGB image recognition.

Description

Technical Field

[0001] This invention relates to the field of AGV technology, and more specifically, to a depth vision recognition and positioning method for AGV end-effector operations.

Background Technology

[0002] Given the large number of material handling tasks in industrial production, Automated Guided Vehicles (AGVs) have emerged, capable of autonomous driving and performing end-of-line operations in the forklift industry. They use sensors such as lasers and ultrasonic waves for sensing and control algorithms for adaptive navigation. AGVs achieve efficient material handling by automatically performing tasks such as picking up, transporting, and placing pallets.

[0003] In existing AGV end-of-line operations, depth sensor solutions such as depth vision, LiDAR, and millimeter-wave radar are typically used for pallet identification and positioning within a certain range. Image processing methods are employed to extract key features of the pallet, such as its holes and legs, and the coordinates of each leg are used to determine whether the distance between adjacent legs meets preset conditions, after which the goods are picked up and placed. However, this approach presents several challenges. First, pallet identification and segmentation by image processing is susceptible to light intensity, image background, and noise, leading to poor or erroneous pallet identification. Second, even for standardized, structured pallets, the length and width of the pallet legs must be known in advance. Third, as handling scenarios become more unstructured and complex, the requirements on AGV end-of-line operations become increasingly stringent. Therefore, relying solely on image processing methods is insufficient for identifying and locating unstructured pallets.

[0004] Therefore, in order to solve the problems of inconsistent AGV end-effector structure and positional deviations of pallets and cages during AGV end-effector operations, this invention proposes a depth vision recognition and positioning method for AGV end-effector operations as a further improvement.

Summary of the Invention

[0005] In order to overcome the above-mentioned defects of the prior art, embodiments of the present invention provide a depth vision recognition and positioning method for AGV end-of-line operations to solve the problems mentioned in the background art.

[0006] To achieve the above objectives, the present invention provides the following technical solution: a depth vision recognition and positioning method for AGV end-effector operations, the specific operation of which is as follows:

[0007] S1: Fix the depth camera on the forklift and ensure that the pallet area falls within the depth camera's field of view when the forklift picks up goods; use a checkerboard grid to map the pixel coordinate system to the depth camera's coordinate system, and calibrate the extrinsic parameters of the depth camera based on Zhang Zhengyou's calibration method;

[0008] S2: When the AGV arrives at the pickup point, the current position of the vehicle center in the world coordinate system is obtained. Based on the pallet height in the actual operation scenario, the fork tip is raised so that the depth camera can capture RGB-D / IR images. The pallet in the RGB or IR image is detected and segmented, and the pixel coordinates of the pallet leg feature points and center line point are located.

[0009] S3: Based on the feature points of the tray legs and the pixel coordinates of the center line point in S2, associate the relevant depth information of the tray, calculate the coordinates of the tray in the world coordinate system, and calculate the pose of the tray through the transformation relationship between pixel coordinates.

[0010] S4: Based on the attitude of the pallet in the world coordinate system calculated in S3, a designed local path planning algorithm generates a local path, and the AGV tracks the expected path to reach the target point and complete the end operation.

[0011] Furthermore, in S1, the depth camera is installed at the intersection of the fork tip center and the vehicle body. This position is used to ensure that the depth camera can capture the pallet area image after the unmanned forklift arrives at the designated pick-up and drop-off point.

[0012] Furthermore, in step S1, the extrinsic parameter calibration of the depth camera includes the following steps:

[0013] S11: Construct a checkerboard pixel coordinate system based on the checkerboard grid, and make the checkerboard pixel coordinate system parallel to the forklift center coordinate system;

[0014] S12: Acquire chessboard images using a depth camera, perform chessboard corner detection using image processing methods, obtain chessboard corner coordinates in multiple chessboard pixel coordinate systems, and convert them into coordinates in the depth camera coordinate system.

[0015] S13: Based on Zhang Zhengyou's calibration method, calculate the transformation matrix between the depth camera coordinate system and the chessboard coordinate system, and use it as the extrinsic parameter matrix from the depth camera to the chessboard coordinate system.

[0016] S14: By measuring the translation of the forklift center coordinate system and the checkerboard pixel coordinate system on the x and y axes in the world coordinate system, the transformation matrix between the two is obtained and used as the external parameter matrix from the checkerboard to the center of the vehicle.

[0017] S15: Based on the two extrinsic parameter matrices, calculate the extrinsic parameter matrix from the depth camera to the vehicle center, and complete the extrinsic parameter calibration of the depth camera;

[0018] S16: Repeat S11-S15 in sequence to obtain the extrinsic parameter matrices between multiple depth camera coordinate systems and the forklift center coordinate system. Use the average value of multiple extrinsic parameter matrices as the final extrinsic parameter matrix.

[0019] Furthermore, in S2, the method for capturing RGB-D / IR images and detecting and segmenting pallets in RGB or IR images includes: a pallet recognition algorithm flow from RGB / IR images to AGV end-of-line operations;

[0020] The pallet recognition algorithm flow is as follows:

[0021] Step 1: Design the target detection network model;

[0022] Step 2: Collect datasets of pallets and cages in end-of-line operation scenarios, label them and train the model on them, and design detection and classification algorithms to quickly and accurately detect and locate the pallet or cage in the region of interest of the image.

[0023] Step 3: Extract the tray or cage insertion holes through image enhancement, edge detection, morphological image filtering, segmentation, and clustering, and combine them with the geometric features of the tray insertion holes or legs to complete the tray recognition algorithm.

[0024] Furthermore, in S2, the method for capturing RGB-D / IR images and detecting and segmenting trays in RGB or IR images includes: a tray detection algorithm and a tray hole detection algorithm;

[0025] The pallet detection algorithm includes: offline training of the network model and online target detection;

[0026] The network model is trained offline: the tray detection model is trained on a server based on a self-made tray dataset;

[0027] The online target detection involves deploying the trained tray detection model onto an industrial computer, using a depth camera to acquire RGB / IR images of the tray in real time, inputting the preprocessed image into the tray detection model for prediction, and determining whether a tray exists. If a tray exists, a predicted bounding box is output and mapped onto the aligned depth image for further processing. If not, the next frame image is acquired again for judgment.

[0028] The tray hole detection algorithm: Based on the output of the tray detection model, the target box and contour corner coordinate information of the tray are obtained, and the tray hole is segmented based on image processing.

[0029] Furthermore, the tray hole segmentation operation in the tray hole detection algorithm is as follows:

[0030] S21: Select the V channel of the HSV color space for processing through gamma transformation;

[0031] S22: The edge detection of the tray socket is performed using the Canny operator, and the image gradient is used to determine whether the abrupt change is obvious. The specific operation is as follows:

[0032] S221: Convert the image to a grayscale image;

[0033] S222: Use a Gaussian filter to smooth the image and perform noise reduction processing on the image;

[0034] S223: Calculate the magnitude and direction of the gradient of gray value change using the finite difference of first-order partial derivatives;

[0035] S224: Perform nonmaximum suppression on gradient values;

[0036] S225: Use a dual threshold algorithm to detect and connect edges.

[0037] S23: Use morphology to filter the tray image, eliminate unnecessary lines and holes, and complete the segmentation of the tray holes.

[0038] Furthermore, the image segmentation in S22 is determined based on the discontinuity of brightness values.

[0039] Furthermore, the attitude of the pallet calculated in S3 includes: the pallet's x-axis coordinate in the world coordinate system, the pallet's y-axis coordinate in the world coordinate system, the pallet's height, and the pallet's yaw angle.

[0040] Furthermore, in step S4, based on the pallet posture calculated in step S3, a local path is planned with the designed local path planning algorithm and smoothed to form a curve constructed from multiple points. The AGV tracks the desired path and adjusts the fork tip posture to reach the front of the pallet.

[0041] The technical effects and advantages of this invention are as follows:

[0042] Compared with existing technologies, deep learning algorithms are combined with traditional algorithms to achieve pallet detection and hole segmentation in the RGB / IR image plane, overcoming the inaccurate prediction boxes caused by complex and variable pallet backgrounds, pallet plane tilt, and the presence of adjacent pallets. The pallet recognition algorithm flow from RGB / IR images to AGV end-point operations is disclosed. Furthermore, because pallet detection and segmentation in RGB images is difficult under low light, a pre-trained infrared image recognition neural network model is used together with the pallet detection and pallet hole detection algorithms to recognize the pallet in infrared images and obtain its position there, so that pallet detection and segmentation are completed on the basis of both infrared and RGB image recognition.

Attached Figure Description

[0043] Figure 1 is a flowchart of the method of the present invention.

[0044] Figure 2 is a flowchart of the tray recognition algorithm of the present invention.

[0045] Figure 3 is a flowchart of the pallet detection process of the present invention.

[0046] Figure 4 is a schematic diagram showing the pixel coordinates of the centers of the three pallet pillars in the image.

[0047] Figure 5 is a schematic diagram showing the adjustment of the AGV forklift's pallet-picking posture.

[0048] Figure 6 is a schematic diagram of the principle of Bézier curve path smoothing.

[0049] Figure 7 is a diagram illustrating the pallet pickup path planning and the forklift pickup process.

Detailed Implementation

[0050] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.

[0051] As shown in Figure 1, a depth vision recognition and localization method for AGV end-effector operations proceeds as follows.

[0052] S1: Fix the depth camera on the forklift and ensure that the pallet area falls within the depth camera's field of view when the forklift picks up goods; use a checkerboard grid to map the pixel coordinate system to the depth camera's coordinate system, and calibrate the extrinsic parameters of the depth camera based on Zhang Zhengyou's calibration method;

[0053] S2: When the AGV arrives at the pickup point, the current position of the vehicle center in the world coordinate system is obtained. Based on the pallet height in the actual operation scenario, the fork tip is raised to enable the depth camera to capture RGB-D / IR images. The pallet in the RGB or IR images is detected and segmented, and the pixel coordinates of the pallet leg feature points and center line point are located.

[0054] In particular, considering multi-layer stacking scenarios, in order to accurately locate the absolute height of the pallet above the ground during operation, it is necessary to measure the pallet height in the actual operation scenario; then, when the fork tip is raised so that the depth camera can capture RGB-D / IR images, the actual height of the fork tip is obtained based on the pose of the current vehicle center in the world coordinate system.

[0055] In order to overcome the problems of inaccurate prediction boxes caused by complex and variable tray backgrounds, tray plane tilt, and the presence of adjacent trays, deep learning algorithms are combined with traditional algorithms to achieve tray detection and jack segmentation in the RGB / IR image plane.

[0056] The method for capturing RGB-D / IR images and detecting and segmenting pallets in RGB or IR images includes: a pallet recognition algorithm flow from RGB / IR images to AGV end-of-line operations, a pallet detection algorithm, and a pallet hole detection algorithm.

[0057] S3: Based on the feature points of the tray legs and the pixel coordinates of the center line point in S2, associate the relevant depth information of the tray, calculate the coordinates of the tray in the world coordinate system, and calculate the pose of the tray through the transformation relationship between pixel coordinates.

[0058] S4: Based on the attitude of the pallet in the world coordinate system calculated in S3, a designed local path planning algorithm generates a local path, and the AGV tracks the expected path to reach the target point and complete the end operation.

[0059] Due to constraints such as the cost of foreign computing chips, AGV forklifts, power consumption, and operational efficiency, existing technical solutions are not suitable for unstructured pallet recognition and positioning. Furthermore, the computational cost is high, making them unsuitable for domestic embedded platforms and limiting their application in actual AGVs. Therefore, this invention uses a depth camera as a sensor solution, with the algorithm running on a domestic embedded ARM chip hardware platform. It discloses a depth vision recognition and positioning method for AGV end-effector operations, as well as the necessary devices, electronic equipment, and storage media. It is mainly applied to end-effector operations in unstructured scenarios using unmanned forklifts.

[0060] In a preferred embodiment, in step S1, the depth camera is installed at the intersection of the center of the fork tip and the vehicle body. This position is used to ensure that the depth camera can capture the image of the pallet area after the unmanned forklift arrives at the designated pick-up and drop-off point.

[0061] In a preferred embodiment, step S1, considering the pose difference between the depth camera coordinates of the forklift and the forklift center coordinate system, includes the following steps for extrinsic parameter calibration of the depth camera:

[0062] S11: Construct a checkerboard pixel coordinate system based on the checkerboard grid, and make the checkerboard pixel coordinate system parallel to the forklift center coordinate system;

[0063] S12: Acquire chessboard images using a depth camera, perform chessboard corner detection using image processing methods, obtain chessboard corner coordinates in multiple chessboard pixel coordinate systems, and convert them into coordinates in the depth camera coordinate system.

[0064] S13: Based on Zhang Zhengyou's calibration method, calculate the transformation matrix between the depth camera coordinate system and the chessboard coordinate system, and use it as the extrinsic parameter matrix from the depth camera to the chessboard coordinate system.

[0065] S14: By measuring the translation of the forklift center coordinate system and the checkerboard pixel coordinate system on the x and y axes in the world coordinate system, the transformation matrix between the two is obtained and used as the external parameter matrix from the checkerboard to the center of the vehicle.

[0066] S15: Based on the two extrinsic parameter matrices, calculate the extrinsic parameter matrix from the depth camera to the vehicle center, and complete the extrinsic parameter calibration of the depth camera;

[0067] S16: Repeat S11-S15 in sequence to obtain the extrinsic parameter matrices between multiple depth camera coordinate systems and the forklift center coordinate system. Use the average value of multiple extrinsic parameter matrices as the final extrinsic parameter matrix.

[0068] In order to reduce the errors in the calibration of external parameters, S11-S15 need to be repeated multiple times in sequence.
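The composition in S15 and the averaging in S16 can be sketched as follows. This is a minimal illustration in plain Python, assuming 4x4 homogeneous transforms stored as nested lists; the names `matmul4`, `planar_offset`, `camera_to_vehicle`, and `average_extrinsics` are hypothetical and do not come from the patent. The element-wise mean follows the wording of S16; it only approximates a valid rotation when the individual calibration runs agree closely, so a real system might prefer a proper rotation average.

```python
from typing import List

Matrix = List[List[float]]  # 4x4 homogeneous transform, row-major


def matmul4(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def planar_offset(tx: float, ty: float) -> Matrix:
    """Checkerboard -> vehicle-center transform (S14).

    Assumes the two frames are parallel (S11), so only the measured
    x/y translation is needed.
    """
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def camera_to_vehicle(board_from_cam: Matrix,
                      vehicle_from_board: Matrix) -> Matrix:
    """Chain camera -> board and board -> vehicle (S15)."""
    return matmul4(vehicle_from_board, board_from_cam)


def average_extrinsics(runs: List[Matrix]) -> Matrix:
    """Element-wise mean over repeated calibration runs (S16)."""
    n = len(runs)
    return [[sum(m[i][j] for m in runs) / n for j in range(4)]
            for i in range(4)]
```

Here `board_from_cam` stands for the matrix obtained in S13; in practice it would come from a checkerboard calibration routine rather than being written by hand.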

[0069] In a preferred embodiment, the method for capturing RGB-D / IR images and detecting and segmenting pallets in RGB or IR images in S2 includes: a pallet recognition algorithm flow from RGB / IR images to AGV end-of-line operations;

[0070] Figure 2 shows the flowchart of the tray recognition algorithm, which is as follows:

[0071] Step 1: Considering that the pallet occupies only a small proportion of the pixels in the depth camera image, a lightweight small-object detection network model is designed.

[0072] Step 2: Collect datasets of pallets and cages in end-of-line operation scenarios, label them and train the model on them, and design detection and classification algorithms to quickly and accurately detect and locate the pallet or cage in the region of interest of the image.

[0073] Step 3: Extract the tray or cage insertion holes through image enhancement, edge detection, morphological image filtering, segmentation, and clustering, and combine them with the geometric features of the tray insertion holes or legs to complete the tray recognition algorithm.

[0074] In a preferred embodiment, the method for capturing RGB-D / IR images and detecting and segmenting trays in RGB or IR images in S2 includes: a tray detection algorithm and a tray hole detection algorithm;

[0075] Figure 3 shows the pallet detection workflow;

[0076] The pallet detection algorithm includes: offline training of the network model and online target detection;

[0077] The network model is trained offline: the tray detection model is trained on a server based on a self-made tray dataset;

[0078] The online target detection involves deploying the trained tray detection model onto an industrial computer, using a depth camera to acquire RGB / IR images of the tray in real time, inputting the preprocessed image into the tray detection model for prediction, and determining whether a tray exists. If a tray exists, a predicted bounding box is output and mapped onto the aligned depth image for further processing. If not, the next frame image is acquired again for judgment.

[0079] In particular, considering that it is not easy to detect and segment trays in RGB images under low light conditions, a pre-trained infrared image recognition neural network model is used to identify trays in infrared images and obtain the position of trays in infrared images; thus, tray detection and segmentation based on infrared and RGB image recognition are completed.
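The online loop in [0078] together with the infrared supplement in [0079] can be sketched as below. This is only an illustration under stated assumptions: the two detectors are passed in as plain callables that return a bounding box or `None`, the RGB-first, infrared-second order is an assumption (the text does not fix the order), and the names `find_pallet` and `crop_depth` are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Detector = Callable[[object], Optional[Box]]
Frame = Tuple[object, object, List[List[float]]]  # (rgb, ir, aligned depth)


def crop_depth(depth: List[List[float]], box: Box) -> List[List[float]]:
    """Map a predicted box onto the aligned depth image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in depth[y0:y1]]


def find_pallet(frames: Iterable[Frame],
                detect_rgb: Detector,
                detect_ir: Detector):
    """Return (frame index, source, box, depth region) for the first hit.

    Frames without a detection are skipped and the next frame is tried,
    as in the online detection step. Returns None if no frame matches.
    """
    for index, (rgb, ir, depth) in enumerate(frames):
        box = detect_rgb(rgb)
        source = "rgb"
        if box is None:
            # Assumed fallback: try the infrared model when RGB fails,
            # for example under low light.
            box = detect_ir(ir)
            source = "ir"
        if box is None:
            continue
        return index, source, box, crop_depth(depth, box)
    return None
```

Passing the detectors as callables keeps the loop independent of whichever network model is actually deployed on the industrial computer.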

[0080] The tray hole detection algorithm: Based on the output of the tray detection model, the target box and contour corner coordinate information of the tray are obtained, and the tray hole is segmented based on image processing.

[0081] In a preferred embodiment, the tray hole segmentation operation in the tray hole detection algorithm is as follows:

[0082] S21: Select the V channel of the HSV color space for processing through gamma transformation;

[0083] Among them, gamma transformation can reduce the impact of external factors such as uneven lighting on the image; since lighting has the greatest impact on the V channel, the V channel of the HSV color space is selected for processing to eliminate some of the impact of lighting on tray recognition.

[0084] S22: Edge detection of tray sockets is performed using the Canny operator, while image segmentation is determined based on the discontinuity of brightness values, using image gradients to determine whether abrupt changes are significant.

[0085] The specific steps are as follows:

[0086] S221: Convert the image to a grayscale image;

[0087] S222: Use a Gaussian filter to smooth the image and perform noise reduction processing on the image;

[0088] S223: Calculate the magnitude and direction of the gradient of gray value change using the finite difference of first-order partial derivatives;

[0089] S224: Perform nonmaximum suppression on gradient values;

[0090] S225: Use a dual threshold algorithm to detect and connect edges.

[0091] S23: Use morphology to filter the tray image to eliminate unnecessary lines and holes so that the edges become smooth and the tray holes are segmented.
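Three of the steps above can be sketched in plain Python: the gamma transform on the V channel (S21), the finite-difference gradient (S223), and the dual-threshold edge linking (S225). This is a simplified illustration, not the patented implementation; Gaussian smoothing (S222), non-maximum suppression (S224), and the morphological filtering (S23) are omitted, and the function names and threshold values are assumptions. A library routine such as OpenCV's `cv2.Canny` covers the gradient, non-maximum suppression, and hysteresis stages in practice.

```python
import colorsys
import math


def gamma_on_v(rgb, gamma):
    """S21: gamma-transform only the V channel of an 8-bit RGB pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    v = v ** gamma  # gamma < 1 brightens, gamma > 1 darkens
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


def gradients(img):
    """S223: central finite differences on a 2D grayscale list.

    Returns per-pixel gradient magnitude and direction (radians);
    border pixels are left at zero.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx)
    return mag, ang


def double_threshold(mag, low, high):
    """S225: keep strong pixels, plus weak pixels 8-connected to them."""
    h, w = len(mag), len(mag[0])
    edges = [[mag[y][x] >= high for x in range(w)] for y in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if edges[y][x]]
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges
```

The dual threshold is what connects broken edge fragments: a weak response survives only when it touches a strong one, which suppresses isolated noise while keeping the outline of the socket continuous.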

[0092] The pre-designed pallet detection model recognizes pallets with a reliability of 94% and can locate the pixel coordinates of the pallet leg feature points and the center line point.

[0093] In a preferred embodiment, the attitude of the pallet calculated in S3 includes: the pallet's x-axis coordinate in the world coordinate system, the pallet's y-axis coordinate in the world coordinate system, and the pallet's yaw angle.

[0094] In S3, the calculation proceeds in four stages. First, based on the position of the pallet's leg feature points in pixel coordinates and the associated depth data, the physical 3D coordinates of the leg feature points in the depth camera coordinate system are calculated. Second, based on the calibrated extrinsic parameter matrix between the depth camera coordinate system and the AGV vehicle body, the physical 3D coordinates of the leg feature points in the forklift center coordinate system are calculated. Third, based on the vehicle body's pose in the world coordinate system when it reaches the forward point, the coordinates (x, y) of the leg feature points in the world coordinate system are calculated. Finally, the yaw angle of the pallet is calculated from multiple feature points.
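The last link in this chain, from forklift-center coordinates to world coordinates, can be sketched as a planar rigid transform. The sketch assumes the vehicle pose is given as (x, y, heading), with the heading in radians measured counter-clockwise from the world x-axis; that convention and the name `vehicle_to_world` are assumptions, not taken from the patent.

```python
import math
from typing import Tuple


def vehicle_to_world(point_xy: Tuple[float, float],
                     vehicle_pose: Tuple[float, float, float]
                     ) -> Tuple[float, float]:
    """Rotate a vehicle-frame point by the heading, then translate."""
    px, py = point_xy
    vx, vy, heading = vehicle_pose
    c, s = math.cos(heading), math.sin(heading)
    return (vx + c * px - s * py, vy + s * px + c * py)
```

For example, a feature point 1 m ahead of a vehicle standing at (1, 2) and facing along the world y-axis lands at approximately (1, 3) in world coordinates.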

[0095] During the end-effector operation of the AGV forklift, a depth camera is installed at a fixed position on the forklift. The relationship between the pixel coordinate system and the depth camera coordinate system of the forklift is shown in Figure 4, which illustrates the mapping of the pallet under the pinhole camera model. In the depth camera coordinate system of the forklift, the xz plane is parallel to the ground, the z-axis points directly in front of the depth camera, and the y-axis is perpendicular to the ground and points downwards.

[0096] The normalized plane is the z=1 plane located in front of the depth camera;

[0097] P is any point on the pallet, with coordinates (X,Y,Z) in the depth camera coordinate system of the forklift.

[0098] p is the point corresponding to P in the normalized plane, with pixel coordinates (x, y).

[0099] p and P have the following mapping relationship:

[0100]

[0101] where f_x and f_y are the normalized focal lengths, c_x and c_y are the coordinates of the principal point, K is the intrinsic parameter matrix of the camera, H is the physical height of the tray, h is the thickness of the upper edge of the fork opening, L is the length of the tray, θ is the angle between the front of the tray and the projection of the coordinate system axis onto the ground, and D is the center position of the front of the tray.
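The equation for paragraph [0100] is not present in this text. Assuming the standard pinhole camera model and the symbols defined above, the mapping between P = (X, Y, Z) and the pixel coordinates (x, y) of p would take the following form; this is a reconstruction under that assumption, not the patent's own formula.

```latex
Z \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= K \begin{bmatrix} X \\ Y \\ Z \end{bmatrix},
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
```

Written out per component, this gives x = f_x X / Z + c_x and y = f_y Y / Z + c_y.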

[0102] Among them, the calculation of pallet position and yaw angle is as follows:

[0103] Based on the predicted key points, the coordinates of the centers of the three pallet uprights are obtained. Figure 4 shows the pixel coordinates of the centers of the three pallet pillars in the image. From left to right, the center coordinates of the three pillars are a(u1,v1), o(u,v) and b(u2,v2).

[0104] The depth information Z1, Z_c, and Z2 at these pixel coordinates is obtained directly from the values of the corresponding pixels in the depth image, and the intrinsic parameter matrix of the camera is K.

[0105] Specifically, by applying the perspective projection principle of the camera imaging model, the coordinates (X_c, Y_c, Z_c) of the pillar center points in the depth camera coordinate system are calculated.

[0106]

[0107] Substituting a(u1,v1), o(u,v), and b(u2,v2) into the calculation gives the 3D coordinates of these three points in the depth camera coordinate system. Figure 6 shows the pose calculation principle of the center point O of the tray in the RGB depth camera coordinate system. The corresponding point coordinates are denoted A(X1,Y1,Z1), O(X_c,Y_c,Z_c), and B(X2,Y2,Z2).
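The back-projection of a pixel with known depth into the depth camera coordinate system can be sketched as follows, assuming the standard pinhole model with intrinsics f_x, f_y, c_x, c_y. The function name `backproject` and the numeric intrinsics in the usage note are hypothetical.

```python
from typing import Tuple


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float
                ) -> Tuple[float, float, float]:
    """Invert the pinhole projection for one pixel with known depth Z.

    Returns (X, Y, Z) in the camera frame, where Z is the depth value
    read from the aligned depth image.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)
```

With illustrative intrinsics f_x = f_y = 600 and principal point (320, 240), a pixel at (380, 240) with a depth of 2.0 m maps to roughly (0.2, 0.0, 2.0) in the camera frame. Applying the same function to a, o, and b yields A, O, and B.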

[0108] The yaw angle of the pallet center point is calculated based on the 3D coordinates of the three points.

[0109]

[0110] After the above steps are performed, the pose of the center point O of the tray in the depth camera coordinate system is obtained.
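The yaw formula of paragraph [0109] is not present in this text. One plausible form, assuming the yaw is the angle of the line from A to B within the camera's xz plane (which is parallel to the ground) measured against the camera x-axis, is sketched below. Both the formula and the sign convention are assumptions, and `pallet_yaw` is a hypothetical name.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]  # (X, Y, Z) in the camera frame


def pallet_yaw(a: Point3, b: Point3) -> float:
    """Angle of the pallet front face relative to the camera x-axis.

    Uses only X and Z because the xz plane is parallel to the ground.
    Returns 0 when the pallet front is square to the camera.
    """
    dx = b[0] - a[0]
    dz = b[2] - a[2]
    return math.atan2(dz, dx)
```

When the left and right pillars are at the same depth the result is zero; if the right pillar is 0.2 m farther away across a 0.8 m span, the yaw is about 0.245 rad.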

[0111] In a preferred embodiment, in step S4, a local path is planned with the designed local path planning algorithm based on the pallet posture calculated in step S3 and then smoothed to form a curve constructed from multiple points. The AGV tracks the desired path and adjusts the fork tip posture to reach the front of the pallet in order to achieve accurate pallet picking.

[0112] Figure 5 illustrates the adjustment of the AGV forklift's pallet-picking posture. The green dashed box represents the ideal pallet position, while the blue solid box indicates the actual pallet position in the scenario.

[0113] Figure 6 illustrates the principle of Bézier curve path smoothing. For an AGV to pick up a pallet accurately, it needs to perform local path planning based on the global pose of the pallet and make adaptive control adjustments according to the new trajectory. That is, the picking task is completed in two segments, AB and BP. The path formed by AB and BP still has defects at its inflection point and needs to be smoothed. The planned picking path is therefore processed with a third-order Bézier curve that fits the inflection point in the path. Two control points are needed, namely the midpoints B1 and B2 of AB and BP, respectively.
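The smoothing step can be sketched with a third-order Bézier curve whose inner control points are the midpoints B1 and B2. The sketch assumes that A and P serve as the curve's end points, which the text implies but does not state; the names `midpoint`, `cubic_bezier`, and `smooth_pick_path` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def midpoint(p: Point, q: Point) -> Point:
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point,
                 t: float) -> Point:
    """Evaluate a third-order Bézier curve at parameter t in [0, 1]."""
    s = 1.0 - t
    w0, w1, w2, w3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    return (w0 * p0[0] + w1 * p1[0] + w2 * p2[0] + w3 * p3[0],
            w0 * p0[1] + w1 * p1[1] + w2 * p2[1] + w3 * p3[1])


def smooth_pick_path(a: Point, b: Point, p: Point,
                     samples: int = 20) -> List[Point]:
    """Replace the corner A -> B -> P with a smooth curve from A to P.

    B1 and B2, the midpoints of AB and BP, are the inner control points.
    Requires samples >= 2.
    """
    b1, b2 = midpoint(a, b), midpoint(b, p)
    return [cubic_bezier(a, b1, b2, p, i / (samples - 1))
            for i in range(samples)]
```

Because the curve leaves A along AB and arrives at P along BP, the vehicle's heading at both ends matches the original segments while the sharp turn at B is rounded off.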

[0114] Figure 7 shows the pallet picking path plan and the forklift picking process. The green path in the figure is planned with the third-order Bézier curve. The path is made sufficiently smooth according to the relative pose of the front point, the end point, the vehicle body, and the pallet, so that the controller drives the vehicle body along the planned path to point B directly in front of the pallet.

[0115] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.

[0116] Although embodiments of the invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims and their equivalents.

Claims

1. A depth vision recognition and positioning method for AGV end-point operations, characterized in that the specific steps are as follows:
S1: Fix the depth camera on the forklift, ensuring that the pallet area falls within the depth camera's field of view when the forklift picks up goods; map the pixel coordinate system to the depth camera coordinate system using a checkerboard, and calibrate the depth camera's extrinsic parameters based on Zhang Zhengyou's calibration method; the extrinsic parameter calibration of the depth camera includes the following steps:
S11: Construct a checkerboard pixel coordinate system based on the checkerboard grid, and make the checkerboard pixel coordinate system parallel to the forklift center coordinate system;
S12: Acquire checkerboard images using the depth camera, perform checkerboard corner detection using image processing methods, obtain checkerboard corner coordinates in multiple checkerboard pixel coordinate systems, and convert them into coordinates in the depth camera coordinate system;
S13: Based on Zhang Zhengyou's calibration method, calculate the transformation matrix between the depth camera coordinate system and the checkerboard coordinate system, and use it as the extrinsic parameter matrix from the depth camera to the checkerboard coordinate system;
S14: Measure the translation between the forklift center coordinate system and the checkerboard pixel coordinate system along the x and y axes in the world coordinate system, obtain the transformation matrix between the two, and use it as the extrinsic parameter matrix from the checkerboard to the vehicle center;
S15: Based on the two extrinsic parameter matrices, calculate the extrinsic parameter matrix from the depth camera to the vehicle center, and complete the extrinsic parameter calibration of the depth camera;
S16: Repeat S11-S15 in sequence to obtain multiple extrinsic parameter matrices between the depth camera coordinate system and the forklift center coordinate system, and take the average of these matrices as the final extrinsic parameter matrix;
S2: When the AGV arrives at the pickup point, obtain the current position of the vehicle center in the world coordinate system; based on the pallet height in the actual operation scenario, raise the fork tip so that the depth camera can capture RGB-D / IR images; detect and segment the pallet in the RGB or IR image, and locate the pixel coordinates of the pallet leg feature points and the center line point;
S3: Based on the pixel coordinates of the pallet leg feature points and the center line point obtained in S2, associate the relevant depth information of the pallet, calculate the coordinates of the pallet in the world coordinate system, and calculate the pose of the pallet through the transformation relationship between pixel coordinates;
S4: Based on the pose of the pallet in the world coordinate system calculated in S3, generate a local path using a designed local path planning algorithm; the AGV tracks the expected path to reach the target point and complete the end-point operation.

2. The depth vision recognition and positioning method for AGV end-point operations according to claim 1, characterized in that: In S1, the depth camera is installed at the intersection of the center of the fork tip and the vehicle body. This position is used to ensure that the depth camera can capture the pallet area after the unmanned forklift arrives at the designated pick-up and drop-off point.

3. The depth vision recognition and positioning method for AGV end-point operations according to claim 1, characterized in that: In step S2, the method for capturing RGB-D / IR images and detecting and segmenting pallets in RGB or IR images includes a pallet recognition algorithm flow from RGB / IR images to AGV end-point operations; the pallet recognition algorithm flow is as follows: Step 1: Design the target detection network model; Step 2: Collect datasets of pallets and cages in end-point operation scenarios, label them and train on them, and design detection and classification algorithms to quickly and accurately detect and locate the pallet or cage in the region of interest of the image; Step 3: Extract the pallet or cage insertion holes through image enhancement, edge detection, morphological image filtering, segmentation, and clustering, and complete the pallet recognition algorithm in combination with the geometric features of the pallet insertion holes or legs.

4. The depth vision recognition and positioning method for AGV end-point operations according to claim 3, characterized in that: In step S2, the method for capturing RGB-D / IR images and detecting and segmenting pallets in RGB or IR images further includes a pallet detection algorithm and a pallet hole detection algorithm; the pallet detection algorithm includes offline training of the network model and online target detection; in the offline training, the pallet detection model is trained on a server based on a self-made pallet dataset; in the online target detection, the trained pallet detection model is deployed onto an industrial computer, the depth camera acquires RGB / IR images of the pallet in real time, and the preprocessed image is input into the pallet detection model for prediction to determine whether a pallet exists; if a pallet exists, a predicted bounding box is output and mapped onto the aligned depth image for further processing; if not, the next frame is acquired and judged again; in the pallet hole detection algorithm, the target box and contour corner coordinate information of the pallet are obtained based on the output of the pallet detection model, and the pallet hole is segmented based on image processing.

5. The depth vision recognition and positioning method for AGV end-point operations according to claim 4, characterized in that: The pallet hole segmentation operation in the pallet hole detection algorithm is as follows: S21: Select the V channel of the HSV color space and process it with a gamma transformation; S22: Perform edge detection of the pallet hole using the Canny operator, using the image gradient to determine whether the abrupt change is obvious; the specific operation is as follows: S221: Convert the image to a grayscale image; S222: Use a Gaussian filter to smooth the image and reduce noise; S223: Calculate the magnitude and direction of the gradient of the gray value change using finite differences of the first-order partial derivatives; S224: Perform non-maximum suppression on the gradient values; S225: Use a dual-threshold algorithm to detect and connect edges; S23: Use morphology to filter the pallet image, eliminate unnecessary lines and holes, and complete the segmentation of the pallet holes.

6. The depth vision recognition and positioning method for AGV end-point operations according to claim 5, characterized in that: The image segmentation in S22 is determined based on the discontinuity of brightness values.

7. The depth vision recognition and positioning method for AGV end-point operations according to claim 1, characterized in that: The pose of the pallet calculated in S3 includes: the pallet's x-axis coordinate in the world coordinate system, the pallet's y-axis coordinate in the world coordinate system, and the pallet's yaw angle.

8. The depth vision recognition and positioning method for AGV end-point operations according to claim 7, characterized in that: In step S4, based on the pallet pose calculated in step S3, a local path planning algorithm is designed and the path is smoothed to form a curve constructed from multiple points. The AGV tracks the desired path and adjusts the fork tip pose to reach the front of the pallet.