An uncoupling method using an uncoupling robot

By combining stereo vision sensors and deep learning networks, the problems of complex environment and changing lighting during the uncoupling process of train carriages were solved, enabling efficient and precise robot uncoupling operations.

CN115619853B · Active · Publication Date: 2026-06-19 · WUHAN POWER EQUIP WORKS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: WUHAN POWER EQUIP WORKS
Filing Date: 2022-10-12
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing technologies often fail to uncouple train carriages due to complex on-site environments, numerous obstacles, and unfavorable lighting conditions. Furthermore, existing image processing methods have low detection accuracy, making it difficult to achieve precise uncoupling path planning.

Method used

Information is collected using a stereo vision sensor and processed by a deep learning network for real-time detection and recognition, and an uncoupling path is planned. The point cloud coordinate data of the coupler is calculated by registering the depth image to the RGB image, and is then converted into the robot's reference coordinate system to generate the grasping path.

Benefits of technology

It improves the accuracy and success rate of the uncoupling process, overcomes the influence of complex environments and changes in lighting, and achieves efficient and precise robot uncoupling operation.

✦ Generated by Eureka AI based on patent content.

Figure: CN115619853B_ABST

Abstract

This invention proposes an uncoupling method using an uncoupling robot, belonging to the field of train carriage uncoupling technology. The method collects the position and speed of the train carriage and determines whether the carriage has entered the robot's grasping window; if it has not, this check is repeated. Once the robot enters the grasping window, the camera is started and acquires RGB and depth images. The depth image is registered to the RGB image, and the point cloud coordinates of the coupler are calculated. The robot receives the grasping path and executes the grasping action in the order of the points that make up the path. Once the carriage is uncoupled, the robot's grasping action is complete. The RGB and depth images are acquired using a perspective projection model, and multiple sets of checkerboard images are collected with the Zhang Zhengyou calibration method. This method overcomes the complex uncoupling environment, numerous obstacles, short uncoupling window, and unfavorable lighting conditions that can lead to uncoupling failure.

Description

Technical Field

[0001] This invention relates to the field of uncoupling technology for train carriages, and more specifically, to an uncoupling method using an uncoupling robot.

Background Technology

[0002] With the rapid development of China's economy and society, railway transportation has been widely used in industrial production due to its advantages of large transport capacity and high timeliness. After freight cars arrive at the station, they need to be uncoupled from the adjacent cars so that the goods can be unloaded. Uncoupling is difficult and risky, and the working environment is harsh, so manual uncoupling is rarely used now.

[0003] Publication No.: CN 112441055 B, Title: Uncoupling Control Method for Train Uncoupling Robot, discloses a method based on image recognition equipment, using background difference and inter-frame difference methods to detect carriage targets, using a similarity measurement algorithm to associate targets, determining the type and location of the carriage to be uncoupled, and determining the handle position of the coupler to be uncoupled, thus completing the uncoupling process. Because freight cars are always in a dynamic state due to the inertia before and after braking, image processing-based detection methods have low matching accuracy in coupler detection and grasping points. Furthermore, the complex environment of the uncoupling site, with numerous obstacles, a short uncoupling window, and unfavorable factors such as early morning and late evening lighting can easily lead to uncoupling failure.

[0004] Therefore, there is an urgent need for an uncoupling method that overcomes the complex environment, obstacles, and lighting conditions at the uncoupling site and provides a more precise uncoupling path.

Summary of the Invention

[0005] In view of this, the present invention proposes a method that extracts information from the uncoupling site using a stereo vision sensor, uses a deep learning network to detect and identify the visual information in real time, and plans the uncoupling path and method.

[0006] The technical solution of the present invention is implemented through the following steps:

[0007] S1. Collect the position and speed of the carriage, and determine whether the carriage has entered the robot's grasping window. If the carriage has not entered the robot's grasping window, repeat step S1.

[0008] S2. The robot enters the grasping window, starts the camera, and acquires RGB and depth images. At the same time, it registers the depth image to the RGB image and calculates the point cloud coordinate data of the coupler.

[0009] S3. Calculate the position of the coupler using a deep learning network, and determine, within the point cloud coordinate data, the point cloud coordinates corresponding to the robot's grasping path.

[0010] S4. Convert the point cloud coordinate data corresponding to the coupler into the robot reference coordinate system, obtain the reference coordinate data of the gripping point in the robot reference coordinate system, and generate the robot grasping path.

[0011] S5. The robot receives the robot grasping path and performs the grasping action according to the order of the points that make up the robot grasping path. The carriage is uncoupled, the robot grasping action is completed, and the robot returns to step S1 to prepare for the next uncoupling task.

[0012] Based on the above technical solution, preferably, the camera includes two cameras with identical parameters, and the pixel depth value Z of the scene captured by the cameras is Z = f·B / d, where f is the focal length of the two cameras, d is the parallax (disparity) between the two cameras, and B is the baseline, that is, the projection width between the two cameras in the same plane.

[0013] Based on the above technical solutions, preferably, the camera acquires RGB images and depth images using a perspective projection model, and the camera captures images of the calibration board to obtain optimized parameters.

[0014] Based on the above technical solutions, preferably, the Zhang Zhengyou calibration method is used: multiple sets of chessboard images are collected, the point cloud coordinates of the internal corner points are calculated, and the positions of all chessboard corner points in the image are returned. The distortion parameters are compensated by polynomial fitting.

[0015] Based on the above technical solutions, preferably, the deep learning network is trained pixel by pixel using a fully convolutional neural network model with an encoder and decoder structure based on SegNet, and the coupler image is manually annotated. The trained network is then deployed in the uncoupling scenario to test the accuracy and real-time performance of the uncoupling operation.

[0016] Based on the above technical solutions, preferably, the deep learning network consists of 11 blocks, of which the SegNet encoder has 5 convolutional blocks, each convolutional block consists of 2 to 3 convolutional layers and a max pooling layer, each convolutional block in the SegNet decoder contains an upsampling layer and 2 to 3 convolutional layers, and the SegNet decoder also contains a softmax layer.

[0017] Based on the above technical solutions, preferably, the deep learning network automatically adjusts its learning rate through an optimization formula that is applied over multiple updates. The optimization formula is: θ_{t+1} = θ_t - (RMS[Δθ]_{t-1} / RMS[g]_t) · g_t, where θ_{t+1} and θ_t are the parameters at times t+1 and t, respectively; RMS[Δθ]_{t-1} is the root mean square of the exponentially decaying average of the parameter updates at time t-1; g_t is the gradient of parameter θ at time t; and RMS denotes the root mean square value. The method also includes a performance metric function, which evaluates the effectiveness of the updated deep learning network. The performance metric function is:

[0018] precision = TP / (TP + FP);

[0019] recall = TP / (TP + FN);

[0020] where precision is the precision rate, recall is the recall rate, TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives.

[0021] Based on the above technical solutions, preferably, the generated robot grasping path includes the uncoupling action path and the carriage action path, and the robot grasping path is obtained by fitting the uncoupling action path and the carriage action path.

[0022] Based on the above technical solutions, preferably, the uncoupling action path is an arc with the uncoupling auxiliary rod as the center O and the swing arm as the radius R, and the point cloud coordinates of any point on the uncoupling action path are extracted by the equal chord length method as follows:

[0023] The point cloud coordinates of a point P on the arc are (x, y), and the arc satisfies x² + y² = R²;

[0024] If the next path point P' has coordinates (x', y'), then the point cloud coordinates of P' are calculated as:

[0025] x' = x cos θ - y sin θ;

[0026] y' = x sin θ + y cos θ;

[0027] θ = 2 arcsin(L / (2R));

[0028] where L is the length of the chord between points P and P';

[0029] The carriage action path is determined by the carriage speed and the uncoupling time: S = vt', where S represents the carriage action path (the distance travelled), v represents the carriage speed, and t' represents the duration of the robot's grasping path.

[0030] Based on the above technical solutions, preferably, a planar coordinate system is established for the uncoupling action path, with the starting point as the origin, the X-axis representing the horizontal movement distance, the Y-axis representing the vertical movement distance, and the t'-axis representing the time of the robot's grasping path. A planar coordinate system is also established for the carriage action path, with the moment the carriage enters the robot's grasping window as the origin, the t'-axis representing the time of the robot's grasping path, and the X'-axis representing the carriage action path. The carriage action path along the X'-axis is fitted with the X-axis component of the uncoupling action path to obtain the planar coordinate system of the robot's grasping path.

[0031] Compared with the prior art, the uncoupling method using an uncoupling robot of the present invention has the following advantages:

[0032] (1) RGB and depth images are acquired using a perspective projection model, and multiple sets of chessboard images are collected with the Zhang Zhengyou calibration method. This overcomes the complex environment, numerous obstacles, short uncoupling window, and adverse factors such as early morning and late evening lighting that lead to uncoupling failure.

[0033] (2) The carriage action path is fitted with the X-axis component of the uncoupling action path to obtain the robot's grasping path, making the robot's uncoupling action more accurate.

Attached Figure Description

[0034] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0035] Figure 1 is a front view of an uncoupling device according to the present invention;

[0036] Figure 2 is a left view of Figure 1;

[0037] Figure 3 is a schematic diagram illustrating the principle of the equal chord length method for extracting the uncoupling action path according to the present invention;

[0038] Figure 4 shows the planar coordinate system established for the uncoupling action path of the present invention;

[0039] Figure 5 shows the planar coordinate system for the carriage action path of the present invention;

[0040] Figure 6 shows the planar coordinate system for the robot's grasping path of the present invention.

Detailed Implementation

[0041] The technical solutions of the present invention will be clearly and completely described below with reference to the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.

[0042] As shown in Figures 1-6, an uncoupling method using an uncoupling robot includes the following steps:

[0043] S1. Collect the position and speed of the carriage, and determine whether the carriage has entered the robot's grasping window. If the carriage has not entered the robot's grasping window, repeat step S1.

[0044] A distance measuring sensor measures the distance between the carriage and the robot to determine the carriage's position, and a speed sensor measures the carriage's speed. Whether the carriage has entered the robot's grasping window is determined from the position and speed between the carriage and the robot. A specific carriage speed is set as the grasping speed; when the robot needs to grasp, the carriage moves in a straight line at this constant speed. When the distance between the carriage and the robot reaches the grasping distance, the carriage enters the robot's grasping window. Existing methods mark the carriages or couplers, and image or digital sensors detect the carriage or coupler to be uncoupled and feed this back to the robot. The robot then prepares to uncouple and performs the uncoupling when the uncoupling position is reached. However, the preparation time this leaves is insufficient, which easily leads to uncoupling failure. This technical solution pre-adjusts the carriage's motion path, resulting in a more accurate robot grasping path.

[0045] If the carriage has not entered the grasping window, no uncoupling is needed yet; simply repeat step S1.
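The S1 decision described above can be sketched as a simple check on the two sensor readings. This is a minimal illustration rather than the patented implementation: the names `CarriageState` and `in_grasping_window` and the speed tolerance are hypothetical, and the sketch assumes the window opens once the carriage travels at the preset grasping speed and its distance to the robot has fallen to the grasping distance.

```python
from dataclasses import dataclass


@dataclass
class CarriageState:
    distance_m: float  # distance between carriage and robot (ranging sensor)
    speed_mps: float   # carriage speed (speed sensor)


def in_grasping_window(state: CarriageState,
                       grasp_distance_m: float,
                       grasp_speed_mps: float,
                       speed_tol_mps: float = 0.05) -> bool:
    """Return True when the carriage has entered the robot's grasping window.

    Assumption: the window opens once the carriage moves at the preset
    grasping speed (within an illustrative tolerance) and is no farther
    from the robot than the grasping distance.
    """
    at_grasp_speed = abs(state.speed_mps - grasp_speed_mps) <= speed_tol_mps
    within_distance = state.distance_m <= grasp_distance_m
    return at_grasp_speed and within_distance
```

A controller would call this on every sensor update and stay in step S1 until it returns True.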

[0046] S2. The robot enters the grasping window, starts the camera, and acquires RGB and depth images. At the same time, it registers the depth image to the RGB image and calculates the point cloud coordinate data of the coupler.

[0047] Depth images, also known as distance images, are images that use the distance (depth) from the image acquisition device to various points in the scene as pixel values. They directly reflect the geometry of the visible surfaces of objects. Depth images can be converted into point cloud data through coordinate transformation, and point cloud data with regularity and necessary information can also be converted into depth image data.

[0048] In the image frames provided by the depth data stream, each pixel represents the distance from the camera plane to the nearest object at that specific (x, y) coordinate within the depth sensor's field of view.
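The conversion from a depth image to point cloud data can be illustrated with the standard pinhole back-projection. The sketch below is an assumption about how that coordinate transformation might look, not the patent's own code; the function name `depth_to_point_cloud` and the intrinsic parameters `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels) are introduced here for illustration.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def depth_to_point_cloud(depth: Sequence[Sequence[float]],
                         fx: float, fy: float,
                         cx: float, cy: float) -> List[Point3D]:
    """Back-project a depth image into camera-frame 3D points.

    depth[v][u] is the depth Z at pixel column u, row v.
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points
```

Because the depth image is registered to the RGB image, the same (u, v) index can be used to look up the color or segmentation label of each 3D point.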

[0049] To acquire depth images, the camera includes two cameras with identical parameters. The pixel depth value Z of the scene captured by the cameras is Z = f·B / d, where f is the focal length of the two cameras, d is the parallax (disparity) between the two cameras, and B is the baseline, that is, the projection width between the two cameras in the same plane.
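As a worked example of the stereo depth relation, the sketch below assumes the standard rectified-stereo form Z = f·B / d, with the focal length f in pixels, the baseline B in metres, and the parallax (disparity) d in pixels. The function name `stereo_depth` and the numeric values in the usage note are illustrative only.

```python
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified pair of identical cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For instance, with a hypothetical focal length of 800 px, a baseline of 0.12 m, and a disparity of 48 px, the function yields a depth of about 2 m; halving the disparity doubles the depth.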

[0050] The camera acquires RGB and depth images using a perspective projection model, and optimizes parameters by capturing images of a calibration board. RGB is an industry color standard that obtains a wide range of colors by varying and superimposing the red (R), green (G), and blue (B) color channels. This standard encompasses almost all colors perceptible to human vision, resulting in more realistic and reliable images.

[0051] A perspective projection model uses images captured by a camera to calculate the geometric parameters of a measured object in three-dimensional space. The image is the projection of the spatial object onto the image plane through the imaging system. The grayscale of each pixel in the image reflects the intensity of reflected light at a point on the surface of the spatial object, and the position of that pixel in the image is related to the geometric position of the corresponding point on the object's surface. The relationship between these positions is determined by the geometric projection model of the camera imaging system.

[0052] The Zhang Zhengyou calibration method is used: multiple sets of chessboard images are collected, the point cloud coordinates of the internal corner points are calculated, and the positions of all chessboard corner points in the image are returned. The distortion parameters are compensated by polynomial fitting.
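The source does not state which polynomial is fitted for distortion compensation. The sketch below assumes the radial polynomial model that is commonly paired with Zhang's calibration, with two hypothetical coefficients `k1` and `k2` acting on normalized image coordinates. It shows how a distorted point could be compensated once the coefficients are known; estimating the coefficients from chessboard corners is not shown.

```python
from typing import Tuple


def distort(x: float, y: float, k1: float, k2: float) -> Tuple[float, float]:
    """Apply radial polynomial distortion to normalized image coordinates."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def undistort(xd: float, yd: float, k1: float, k2: float,
              iterations: int = 20) -> Tuple[float, float]:
    """Invert the distortion by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    polynomial scale evaluated at the current estimate.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale
    return x, y
```

The fixed-point inversion converges quickly for the mild distortion typical of calibrated industrial lenses; strongly distorted lenses may need a different solver.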

[0053] Zhang Zhengyou's camera calibration method is a single-plane checkerboard camera calibration method proposed by Professor Zhang Zhengyou in 1998. Traditional calibration methods require three-dimensional calibration boards with extremely high precision, which are difficult to manufacture. Professor Zhang's method lies between traditional and self-calibration methods, overcoming the need for high-precision calibration materials required by traditional methods, while only requiring a printed checkerboard. It also improves accuracy and simplifies operation compared to self-calibration. Therefore, Zhang's calibration method is widely used in computer vision.

[0054] In summary, the acquisition of RGB and depth images using a perspective projection model and the Zhang Zhengyou calibration method to acquire multiple sets of chessboard grids overcomes the challenges of complex on-site environments with numerous obstacles, short unhooking windows, and unfavorable factors such as early morning and late evening lighting, which can lead to unhooking failures.

[0055] To further improve the accuracy of uncoupling, step S3 uses a deep learning network to calculate the position of the coupler and the corresponding point cloud coordinates of the robot's grasping path. Deep learning is a branch of machine learning, and its introduction brings machine learning closer to artificial intelligence. Deep learning learns the inherent patterns and hierarchical representations of sample data, and the information gained during this learning process greatly aids in interpreting data such as text, images, and sound. Its ultimate goal is to give machines human-like analytical learning capabilities, so that they can recognize data such as text, images, and sound. Deep learning is a complex machine learning algorithm whose results in speech and image recognition far exceed those of earlier techniques. Therefore, employing a deep learning network makes the uncoupling action more precise.

[0056] The deep learning network is trained pixel-by-pixel using a fully convolutional neural network model with a SegNet encoder and SegNet decoder structure. The coupler images are manually annotated, and the trained network is deployed in the uncoupling scenario to test the accuracy and real-time performance of the uncoupling operation.

[0057] The deep learning network consists of 11 blocks. The SegNet encoder has 5 convolutional blocks, each of which consists of 2 to 3 convolutional layers and a max pooling layer. The SegNet decoder contains an upsampling layer and 2 to 3 convolutional layers in each convolutional block. The SegNet decoder also contains a softmax layer.
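The 11-block layout can be made concrete by tracking the feature-map size through the network. The sketch below is an illustration under stated assumptions: it takes the 11 blocks to be 5 encoder blocks, 5 decoder blocks, and 1 softmax layer, and it assumes a VGG16-style split of (2, 2, 3, 3, 3) convolutional layers per block, which is one arrangement consistent with "2 to 3 convolutional layers". The name `segnet_layout` and the input size in the usage note are hypothetical.

```python
from typing import List, Tuple

# Conv layers per encoder block. Assumption: a VGG16-style split, which
# satisfies the "2 to 3 convolutional layers" description in the text.
ENCODER_CONVS: Tuple[int, ...] = (2, 2, 3, 3, 3)


def segnet_layout(height: int, width: int) -> List[Tuple[str, int, int, int]]:
    """List (block name, conv layers, out height, out width) for 11 blocks.

    Each encoder block ends with 2x2 max pooling (halves H and W); each
    decoder block starts with 2x upsampling (doubles H and W).
    """
    layout: List[Tuple[str, int, int, int]] = []
    h, w = height, width
    for i, n in enumerate(ENCODER_CONVS, start=1):
        h, w = h // 2, w // 2
        layout.append((f"encoder{i}", n, h, w))
    for i, n in enumerate(reversed(ENCODER_CONVS), start=1):
        h, w = h * 2, w * 2
        layout.append((f"decoder{i}", n, h, w))
    layout.append(("softmax", 0, h, w))  # per-pixel class probabilities
    return layout
```

For a 480 x 640 input, the deepest encoder block works on a 15 x 20 feature map, and the softmax output returns to 480 x 640 so that every pixel receives a class label.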

[0058] The deep learning network automatically adjusts its learning rate using an optimization formula that is applied over multiple updates. The optimization formula is: θ_{t+1} = θ_t - (RMS[Δθ]_{t-1} / RMS[g]_t) · g_t, where θ_{t+1} and θ_t are the parameters at times t+1 and t, respectively; RMS[Δθ]_{t-1} is the root mean square of the exponentially decaying average of the parameter updates at time t-1; g_t is the gradient of parameter θ at time t; and RMS denotes the root mean square value. The method also includes a performance metric function, which evaluates the effectiveness of the updated deep learning network. The performance metric function is:

[0059] precision = TP / (TP + FP);

[0060] recall = TP / (TP + FN);

[0061] where precision is the precision rate, recall is the recall rate, TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives.
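The learning-rate adjustment and the performance metric can be sketched together. The symbol definitions in the text (decaying averages, RMS, gradient at time t) resemble an Adadelta-style update, so the first function below assumes that form; this is an interpretation, not a confirmed reading of the patent's formula, and the decay rate `rho` and stabilizer `eps` are illustrative defaults. The second function computes pixel-wise precision and recall from binary masks, where the label 1 is assumed to mean "coupler".

```python
import math
from typing import Sequence, Tuple


def adadelta_step(theta: float, grad: float,
                  avg_sq_grad: float, avg_sq_delta: float,
                  rho: float = 0.95, eps: float = 1e-6
                  ) -> Tuple[float, float, float]:
    """One Adadelta-style update for a single scalar parameter.

    Returns (new theta, new E[g^2], new E[delta^2]).
    RMS[x] is computed as sqrt(E[x^2] + eps).
    """
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    rms_delta_prev = math.sqrt(avg_sq_delta + eps)  # RMS of updates up to t-1
    rms_grad = math.sqrt(avg_sq_grad + eps)         # RMS of gradients up to t
    delta = -(rms_delta_prev / rms_grad) * grad
    avg_sq_delta = rho * avg_sq_delta + (1.0 - rho) * delta * delta
    return theta + delta, avg_sq_grad, avg_sq_delta


def precision_recall(predicted: Sequence[int],
                     actual: Sequence[int]) -> Tuple[float, float]:
    """Pixel-wise precision and recall for binary masks (1 = coupler)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this form the step size is derived from the running averages, so no global learning rate has to be tuned by hand, which matches the text's description of automatic adjustment.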

[0062] S4. Transform the gripping point cloud coordinate data corresponding to the coupler's point cloud coordinate data into the robot's reference coordinate system, obtain the reference coordinate data of the gripping point in the robot's reference coordinate system, and generate the robot's gripping path. Transforming gripping point cloud coordinate data into reference coordinate data is existing technology. For example, CN1094017A, "Method for Acquiring and Solving Laser Scanning Data Reflected by Arbitrary Curved Mirrors," discloses how to transform positioned point cloud coordinates into a predefined spatial coordinate system. The transformation method is not elaborated here.
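Although the patent defers the transformation to existing technology, a common way to express it is a rigid transform p_robot = R·p_cam + t. The sketch below assumes that form; the function name `camera_to_robot` is hypothetical, and the rotation matrix `rotation` and offset `translation` are assumed to come from a separate hand-eye calibration that the source does not describe.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def camera_to_robot(point_cam: Vec3,
                    rotation: Sequence[Sequence[float]],
                    translation: Vec3) -> Vec3:
    """Map a camera-frame point into the robot reference frame.

    Computes p_robot = R @ p_cam + t, where R is a 3x3 rotation matrix
    and t is the camera origin expressed in the robot frame.
    """
    x, y, z = (
        sum(rotation[i][j] * point_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    return (x, y, z)
```

Applying this function to every gripping point in the coupler's point cloud gives the reference coordinate data from which the gripping path is built.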

[0063] S5. The robot receives the robot grasping path and performs the grasping action according to the order of the points that make up the robot grasping path. The carriage is uncoupled, the robot grasping action is completed, and the robot returns to step S1 to prepare for the next uncoupling task.

[0064] The robot's grasping path includes the uncoupling action path and the carriage action path, and the robot's grasping path is obtained by fitting the uncoupling action path and the carriage action path.

[0065] As shown in Figures 1-3, the robot swings the handle 4, which rotates the uncoupling auxiliary rod 1 and causes the uncoupling rod 5 to swing. The connector 3 connected to the uncoupling rod 5 is pulled out from the connection position of the two carriages, and the carriages are uncoupled. So that the connector 3 can be pulled out by the uncoupling rod 5, the uncoupling rod 5 and the connector 3 are connected by a chain.

[0066] The uncoupling action path is an arc with the end face of the uncoupling auxiliary rod 1 as the center O and the swing arm 2 as the radius R. The point cloud coordinates of any point on the uncoupling action path are extracted by the equal chord length method as follows:

[0067] The point cloud coordinates of a point P on the arc are (x, y), and the arc satisfies x² + y² = R²;

[0068] If the next path point P' has coordinates (x', y'), then the point cloud coordinates of P' are calculated as:

[0069] x' = x cos θ - y sin θ;

[0070] y' = x sin θ + y cos θ;

[0071] θ = 2 arcsin(L / (2R));

[0072] where L is the length of the chord between points P and P';
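The equal chord length method above can be sketched as repeated rotation about the center O. This is a minimal illustration that takes O as the coordinate origin, as the rotation formulas imply; the function name `equal_chord_path` and its parameters are hypothetical.

```python
import math
from typing import List, Tuple


def equal_chord_path(start: Tuple[float, float], chord: float,
                     steps: int) -> List[Tuple[float, float]]:
    """Sample an arc centred at the origin O with equal chord length L.

    Each new point is the previous one rotated by theta = 2*asin(L / (2R)),
    where R is the distance of the start point from O.
    """
    x, y = start
    radius = math.hypot(x, y)
    if not 0 < chord <= 2 * radius:
        raise ValueError("chord length must satisfy 0 < L <= 2R")
    theta = 2.0 * math.asin(chord / (2.0 * radius))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    path = [(x, y)]
    for _ in range(steps):
        x, y = x * cos_t - y * sin_t, x * sin_t + y * cos_t
        path.append((x, y))
    return path
```

Every returned point stays at distance R from O, and consecutive points are separated by the same chord length L, so a smaller L gives a denser path along the arc.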

[0073] The carriage action path is determined by the carriage speed and the uncoupling time: S = vt', where S represents the carriage action path (the distance travelled), v represents the carriage speed, and t' represents the duration of the robot's grasping path.

[0074] A planar coordinate system is established for the uncoupling action path, as shown in Figure 4, with the starting point as the origin, the X-axis representing the horizontal movement distance, the Y-axis representing the vertical movement distance, and the t'-axis representing the time of the robot's grasping path. A planar coordinate system is also established for the carriage action path, as shown in Figure 5, with the moment the carriage enters the robot's grasping window as the origin, the t'-axis representing the time of the robot's grasping path, and the X'-axis representing the carriage action path. The carriage action path along the X'-axis is fitted with the X-axis component of the uncoupling action path to obtain the robot's grasping path, as shown in Figure 6.
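The source does not define the fitting step precisely. The sketch below assumes one simple interpretation: the arc points are evenly spaced in time over the grasping duration, and the carriage displacement S = vt' is added to the arc's X coordinate so that the gripper follows the moving carriage. The function name `fit_grasping_path` and this superposition rule are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def fit_grasping_path(arc_points: Sequence[Tuple[float, float]],
                      carriage_speed: float,
                      duration: float) -> List[Tuple[float, float, float]]:
    """Combine the arc path with carriage motion into (t', X, Y) samples.

    Assumptions: arc points are evenly spaced in time over `duration`, and
    "fitting" means adding the carriage displacement S = v * t' to the
    arc's X coordinate.
    """
    n = len(arc_points)
    if n < 2:
        raise ValueError("need at least two path points")
    fitted = []
    for i, (x, y) in enumerate(arc_points):
        t = duration * i / (n - 1)
        fitted.append((t, x + carriage_speed * t, y))
    return fitted
```

Under this interpretation the Y coordinate of the uncoupling action path is unchanged, and only the horizontal component is shifted by the carriage's travel.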

[0075] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. An uncoupling method using an uncoupling robot, characterized in that it includes the following steps:
S1. Collect the position and speed of the carriage, and determine whether the carriage has entered the robot's grasping window based on the position and speed between the carriage and the robot. A carriage speed is set as the grasping speed; when the robot needs to grasp, the carriage moves in a straight line at the constant grasping speed, and when the distance between the carriage and the robot reaches the grasping distance, the carriage enters the robot's grasping window. If the carriage has not entered the robot's grasping window, repeat step S1.
S2. The robot enters the grasping window, starts the camera, and acquires RGB and depth images. At the same time, it registers the depth image to the RGB image and calculates the point cloud coordinate data of the coupler.
S3. Calculate the position of the coupler using a deep learning network, and determine the corresponding point cloud coordinates of the robot's grasping path in the point cloud coordinate data.
S4. Convert the point cloud coordinates of the coupler into the robot's reference coordinate system, obtain the reference coordinates of the gripping point in the robot's reference coordinate system, and generate the robot's gripping path, specifically including: the generated robot grasping path includes an uncoupling action path and a carriage action path; the uncoupling action path is an arc with the end face position of the uncoupling auxiliary rod (1) as the center O and the swing arm (2) as the radius R, and the point cloud coordinates of any point on the uncoupling action path are extracted by the equal chord length method as follows: the point cloud coordinates of a point P on the arc are (x, y), and the arc satisfies x² + y² = R²; if the next path point P' has coordinates (x', y'), then the point cloud coordinates of P' are calculated as: x' = x cos θ - y sin θ; y' = x sin θ + y cos θ; θ = 2 arcsin(L / (2R)); where L is the length of the chord between points P and P'. The carriage action path is determined by the carriage speed and the uncoupling time: S = vt', where S represents the carriage action path, v represents the speed of the carriage, and t' represents the duration of the robot's grasping path. The robot's grasping path is obtained by fitting the uncoupling action path and the carriage action path.
S5. The robot receives the robot grasping path and performs the grasping action in the order of the points that make up the robot grasping path. The robot swings the handle (4), the uncoupling auxiliary rod (1) rotates, and the uncoupling rod (5) swings. The connector (3) connected to the uncoupling rod (5) is pulled out from the connection position of the two carriages; the uncoupling rod (5) and the connector (3) are connected by a chain. The carriages are uncoupled, the robot grasping action is completed, and the robot returns to step S1 to prepare for the next uncoupling task.

2. The uncoupling method using an uncoupling robot as described in claim 1, characterized in that: the camera includes two cameras with identical parameters. The pixel depth value Z of the scene captured by the cameras is Z = f·B / d, where f is the focal length of the two cameras, d is the parallax (disparity) between the two cameras, and B is the baseline, that is, the projection width between the two cameras in the same plane.

3. The uncoupling method using an uncoupling robot as described in claim 2, characterized in that: the camera acquires RGB and depth images using a perspective projection model, and obtains optimized parameters by capturing images of a calibration board.

4. The uncoupling method using an uncoupling robot as described in claim 2, characterized in that: the Zhang Zhengyou calibration method is used to collect multiple sets of chessboard images, calculate the point cloud coordinates of the internal corner points, and return the positions of all chessboard corner points in the image. The distortion parameters are compensated by polynomial fitting.

5. The uncoupling method using an uncoupling robot as described in claim 3, characterized in that: the deep learning network is trained pixel-by-pixel using a fully convolutional neural network model with a SegNet encoder and SegNet decoder structure. The coupler images are manually annotated, and the trained network is deployed in the uncoupling scenario to test the accuracy and real-time performance of the uncoupling operation.

6. The uncoupling method using an uncoupling robot as described in claim 4, characterized in that: the deep learning network consists of 11 blocks. The SegNet encoder has 5 convolutional blocks, each of which consists of 2-3 convolutional layers and a max pooling layer. Each convolutional block in the SegNet decoder has one upsampling layer and 2-3 convolutional layers. The SegNet decoder also includes a softmax layer.

7. The uncoupling method using an uncoupling robot as described in claim 5, characterized in that: the deep learning network automatically adjusts its learning rate using an optimization formula that is applied over multiple updates. The optimization formula is: θ_{t+1} = θ_t - (RMS[Δθ]_{t-1} / RMS[g]_t) · g_t, where θ_{t+1} and θ_t are the parameters at times t+1 and t, respectively; RMS[Δθ]_{t-1} is the root mean square of the exponentially decaying average of the parameter updates at time t-1; g_t is the gradient of parameter θ at time t; and RMS denotes the root mean square value. The method also includes a performance metric function, which evaluates the effectiveness of the updated deep learning network. The performance metric function is: precision = TP / (TP + FP); recall = TP / (TP + FN); where precision is the precision rate, recall is the recall rate, TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives.

8. The uncoupling method using an uncoupling robot as described in claim 1, characterized in that: a planar coordinate system is established for the uncoupling action path with the starting point as the origin, where the X-axis represents the horizontal movement distance, the Y-axis represents the vertical movement distance, and the t'-axis represents the time of the robot's grasping path. A planar coordinate system is established for the carriage action path with the moment the carriage enters the robot's grasping window as the origin, where the t'-axis represents the time of the robot's grasping path and the X'-axis represents the carriage action path. The carriage action path along the X'-axis is fitted with the X-axis component of the uncoupling action path to obtain the planar coordinate system of the robot's grasping path.