A method for positioning a pose of a tank truck filling port and a storage medium
By combining depth cameras with 3D point clouds and 2D images, rapid, accurate and safe positioning of methanol filling ports on tank trucks was achieved, solving the problems of low positioning accuracy and insufficient safety in existing technologies.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NAT ENERGY COAL & COKING GRP CO LTD
- Filing Date
- 2024-05-28
- Publication Date
- 2026-06-16
Figure CN119722784B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of automatic filling of tank trucks, and specifically relates to a method for positioning the pose of a tank truck filling port and a storage medium.
Background Technology
[0002] Currently, the methanol refueling process in tanker trucks is entirely manual. This process mainly involves aligning the refueling gun with the filling port, monitoring whether the tank is full, closing the valve, and retracting the refueling gun. During refueling, constant monitoring is required to prevent accidents such as truck runaway or methanol leakage, making it both time-consuming and labor-intensive. Furthermore, methanol is a highly toxic and flammable chemical, posing a significant safety hazard as it can easily cause serious harm to people and property.
[0003] To date, research on methanol refueling port positioning is limited, and no mature positioning solution has been developed. Existing technologies generally suffer from low accuracy, poor robustness, and slow speed in methanol refueling port positioning, as well as safety issues during the methanol refueling process in tank trucks.
Summary of the Invention
[0004] To address the shortcomings of existing technologies, this invention provides a method and storage medium for locating the position of a methanol refueling port on a tanker truck. The method solves the problems of low accuracy, poor robustness, and slow speed in methanol refueling port positioning, thereby improving the safety of the methanol refueling process on tanker trucks. Depth cameras output 2D images and 3D point cloud data simultaneously, which helps improve the accuracy and robustness of methanol refueling port positioning. Furthermore, some depth cameras are explosion-proof and have low power consumption, making them suitable for locating the refueling port on tanker trucks.
[0005] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is: a method for locating the position of the filling port of a tanker truck, comprising the following steps:
[0006] Acquire image data and initial point cloud of the filling port of the tanker truck;
[0007] Based on the image data of the filling port of the tanker truck, a visual detection model is constructed, and the filling port of the tanker truck is coarsely located using the visual detection model to obtain the approximate three-dimensional pose of the filling port of the tanker truck.
[0008] The initial point cloud is segmented based on the physical model of the filling port and the approximate three-dimensional pose to obtain local point cloud data containing only the filling port.
[0009] The local point cloud data is filtered according to the adaptive filtering threshold function to obtain the target point cloud data with the optimal density.
[0010] Based on the smoothness, edge point features are extracted from the target point cloud data with the optimal density to obtain a set of edge feature points;
[0011] The set of edge feature points is clustered using Euclidean clustering to obtain a cluster set.
[0012] Based on the stability of triangles, construct the triangle descriptor for the cluster set;
[0013] The triangular descriptors are paired according to the feature association algorithm and similarity scoring function to obtain a set of descriptor pairs. The set of descriptor pairs is then coarsely registered to obtain the target descriptor pairs and their coarse transformed poses.
[0014] A nonlinear optimization function is used to perform fine registration on the coarse transformation pose of the target descriptor pair to obtain the accurate pose of the filling port of the tanker truck.
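Of the steps listed above, the Euclidean clustering of edge feature points can be illustrated with a minimal sketch. It uses only the Python standard library and a brute-force neighbour search; the function name `euclidean_clusters`, the `tol` and `min_size` parameters, and the sample points are assumptions made for illustration and are not taken from the invention.

```python
import math
from collections import deque
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def euclidean_clusters(points: Sequence[Point], tol: float,
                       min_size: int = 1) -> List[List[int]]:
    """Group point indices so that each point lies within `tol` of another member."""
    unvisited = set(range(len(points)))
    clusters: List[List[int]] = []
    while unvisited:
        seed = min(unvisited)  # deterministic seed order
        unvisited.remove(seed)
        members, queue = [seed], deque([seed])
        while queue:
            i = queue.popleft()
            # Brute-force neighbour search; a k-d tree would replace this at scale.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= tol]
            for j in near:
                unvisited.remove(j)
                members.append(j)
                queue.append(j)
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return clusters


# Hypothetical edge points forming two well-separated groups (metres).
edge_pts = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.02, 0.0, 0.0),
            (1.0, 0.0, 0.0), (1.01, 0.0, 0.0)]
clusters = euclidean_clusters(edge_pts, tol=0.015)
```

In this sketch `tol` is passed in directly; the method instead derives the clustering threshold from the optimal voxel filtering parameter, so in practice `tol` would be computed from that value.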
[0015] Furthermore, based on the image data of the tanker truck's filling port, a visual detection model is constructed, and the filling port is coarsely located using the visual detection model to obtain the approximate three-dimensional pose of the filling port, including:
[0016] Real-time image data of the methanol filling port is acquired using a depth camera; the image data includes information about the methanol filling port and its surrounding environment.
[0017] The acquired image data is preprocessed and features are extracted. Combined with image data of different resolutions, multi-scale information fusion technology is used to process the data to obtain the training dataset.
[0018] The training dataset is input into the YOLOv5 model based on temporal information learning and attention mechanism for training, resulting in the YOLO visual detection model;
[0019] The environmental image of the filling port location is input into the YOLO vision detection model for target detection and feature extraction, and the two-dimensional pixel coordinates of the center point of the filling port are output; the two-dimensional pixel coordinates of the center point are then converted into the approximate three-dimensional pose of the filling port of the tanker truck.
[0020] Furthermore, the two-dimensional pixel coordinates of the center point are transformed into an approximate three-dimensional pose of the filling port of the tanker truck, including:
[0021] Obtain the two-dimensional pixel coordinates of the center point of the filling port of the tanker truck from the visual inspection model;
[0022] Based on the depth map captured by the depth camera, the pixel value of the two-dimensional pixel coordinates of the center point of the filling port is obtained, and the depth value of the center of the filling port of the tanker truck is obtained through an affine function.
[0023] Based on the depth value, the approximate three-dimensional pose of the tanker truck's filling port in the camera coordinate system is obtained using the camera's three-dimensional projection model.
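The conversion from the detected center pixel to an approximate camera-frame position can be sketched as follows, assuming the "three-dimensional projection model" is the standard pinhole model. The intrinsic values, the affine coefficients (`scale`, `offset`), and all names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Tuple


class Intrinsics(NamedTuple):
    """Pinhole intrinsics in pixels (the values used below are hypothetical)."""
    fx: float
    fy: float
    cx: float
    cy: float


def raw_to_depth(raw: float, scale: float = 0.001, offset: float = 0.0) -> float:
    """Affine map from a raw depth-map pixel value to metres (assumed form)."""
    return scale * raw + offset


def back_project(u: float, v: float, z: float,
                 k: Intrinsics) -> Tuple[float, float, float]:
    """Back-project pixel (u, v) with depth z into the camera coordinate system."""
    x = (u - k.cx) * z / k.fx
    y = (v - k.cy) * z / k.fy
    return (x, y, z)


k = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)
z = raw_to_depth(1500)  # 1500 raw units map to 1.5 m under the assumed scale
center_xyz = back_project(380.0, 200.0, z, k)
```

This yields only the position of the port center; the orientation part of the pose is refined later by the point cloud registration steps.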
[0024] Furthermore, the initial point cloud is segmented based on the physical model of the filling port and the approximate three-dimensional pose to obtain local point cloud data containing only the filling port, including:
[0025] Based on the actual physical model of the filling port of the tanker truck, it is modeled as a cylindrical object of fixed size, and the physical parameters and filling port contour model are obtained based on the MSAC fitting algorithm.
[0026] Using the approximate three-dimensional pose as the center and the nozzle contour model as the contour, the initial point cloud is segmented to obtain a local point cloud containing only the methanol nozzle.
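One plausible reading of this segmentation step is a cylindrical crop: keep only the points that lie inside a cylinder placed at the approximate center. The sketch below assumes the fitted radius, axis, and height are already available; the `margin` parameter and all names are hypothetical additions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def crop_cylinder(points: Sequence[Point], center: Point, axis: Point,
                  radius: float, height: float, margin: float = 0.0) -> List[Point]:
    """Keep points inside a cylinder centred at `center` and aligned with `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    unit = tuple(a / norm for a in axis)
    kept = []
    for p in points:
        d = tuple(pi - ci for pi, ci in zip(p, center))
        along = sum(di * ai for di, ai in zip(d, unit))  # signed axial offset
        radial = math.sqrt(max(sum(di * di for di in d) - along * along, 0.0))
        if abs(along) <= height / 2 + margin and radial <= radius + margin:
            kept.append(p)
    return kept


# Hypothetical scene: two points on the port, one off to the side, one behind it.
cloud = [(0.0, 0.0, 1.0), (0.05, 0.0, 1.0), (0.5, 0.0, 1.0), (0.0, 0.0, 2.0)]
local = crop_cylinder(cloud, center=(0.0, 0.0, 1.0), axis=(0.0, 0.0, 1.0),
                      radius=0.1, height=0.2)
```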
[0027] Furthermore, the local point cloud data is filtered according to an adaptive filtering threshold function to obtain target point cloud data with optimal density, including:
[0028] Based on the depth map captured by the camera, the pixel value at the center of the filling port is obtained, and the depth value at the center of the filling port of the tanker truck is obtained through an affine function.
[0029] An adaptive filtering threshold function is constructed based on the depth value to obtain the optimal voxel filtering parameters;
[0030] Voxel downsampling is performed on the local point cloud data based on the optimal voxel filtering parameters to obtain the target point cloud data with the optimal density.
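A minimal sketch of depth-dependent voxel downsampling follows. The text does not reproduce the adaptive filtering threshold function, so `adaptive_voxel_size` below is a purely hypothetical stand-in (linear in depth, clamped to a range); only the voxel-grid averaging itself is the standard technique.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def adaptive_voxel_size(depth_m: float, ideal_accuracy_m: float = 0.002,
                        ref_depth_m: float = 1.0,
                        lo: float = 0.001, hi: float = 0.02) -> float:
    """Hypothetical threshold function: voxel edge grows linearly with depth."""
    return min(max(ideal_accuracy_m * depth_m / ref_depth_m, lo), hi)


def voxel_downsample(points: Sequence[Point], voxel: float) -> List[Point]:
    """Replace all points that fall in the same voxel by their centroid."""
    buckets: Dict[Tuple[int, int, int], List[Point]] = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel) for c in p)
        buckets[key].append(p)
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]


voxel = adaptive_voxel_size(1.5)  # 3 mm voxel edge at 1.5 m under the assumed function
sample = [(0.0, 0.0, 0.0), (0.001, 0.001, 0.001), (0.05, 0.0, 0.0)]
down = voxel_downsample(sample, voxel=0.01)
```

Tying the voxel size to depth avoids manual retuning when the camera-to-port distance changes, which is the stated motivation for the adaptive function.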
[0031] Furthermore, edge point features are extracted from the target point cloud data of the optimal density based on smoothness to obtain an edge feature point set; the edge feature point set is clustered using Euclidean clustering to obtain a cluster set; based on the stability of triangles, a triangle descriptor for the cluster set is constructed, including:
[0032] Based on the selection of 5 points before and after each point in the target point cloud data with optimal density, the curvature of each point in the target point cloud data with optimal density is calculated, and the set of edge feature points is selected according to the smoothness threshold.
[0033] A functional relationship between the clustering threshold and the optimal voxel filtering parameters is constructed, and the Euclidean clustering algorithm is used to cluster the set of edge feature points to obtain a cluster set;
[0034] Obtain the coordinates of the center point of each cluster, obtain the direction vector of each cluster through PCA principal component analysis, project the coordinates of the center point of each cluster onto the direction vector to obtain the interval between two adjacent points, that is, the length value of each cluster.
[0035] Traverse the cluster set, search for the 5 nearest neighbor clusters of each cluster, and construct a stable triangle descriptor by traversing all pairwise unique combinations of clusters from the nearest neighbor clusters and using the cluster as the vertex. At the same time, remove the cluster from the cluster set.
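The curvature computation over 5 points before and after each point can be sketched as below. The text does not give the formula, so this sketch assumes a LOAM-style smoothness value (norm of the summed neighbour offsets, normalised by window size and range) and assumes the points are stored in scan order; names and the sample scan are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def smoothness(scan: Sequence[Point], half_window: int = 5) -> List[Optional[float]]:
    """LOAM-style curvature per point; None where the window does not fit."""
    n = len(scan)
    values: List[Optional[float]] = [None] * n
    for i in range(half_window, n - half_window):
        p = scan[i]
        diff = [0.0, 0.0, 0.0]
        for j in range(i - half_window, i + half_window + 1):
            if j != i:
                for k in range(3):
                    diff[k] += scan[j][k] - p[k]
        rng = math.sqrt(sum(c * c for c in p))  # distance of the point from the sensor
        if rng > 0.0:
            values[i] = math.sqrt(sum(d * d for d in diff)) / (2 * half_window * rng)
    return values


def edge_points(scan: Sequence[Point], threshold: float,
                half_window: int = 5) -> List[Point]:
    """Select points whose curvature exceeds the smoothness threshold."""
    return [p for p, c in zip(scan, smoothness(scan, half_window))
            if c is not None and c > threshold]


# Hypothetical L-shaped scan line at 1 m depth: the corner is index 10.
scan = ([(i * 0.01, 0.0, 1.0) for i in range(11)]
        + [(0.1, j * 0.01, 1.0) for j in range(1, 11)])
edges = edge_points(scan, threshold=0.02)
```

On straight segments the neighbour offsets cancel and the value is near zero, while at the corner they add up, so a threshold separates edge points from smooth ones.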
[0036] Furthermore, the triangle descriptor includes the length information of the three vertices, the direction vector information, and the length information of the three sides.
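A triangle descriptor with these three components can be sketched as a small record type. The sketch assumes each cluster has already been summarised by its center, PCA direction vector, and length; the `Cluster` and `TriangleDescriptor` types and the sample values are hypothetical.

```python
import math
from itertools import combinations
from typing import List, NamedTuple, Sequence, Tuple

Vec = Tuple[float, float, float]


class Cluster(NamedTuple):
    center: Vec
    direction: Vec   # principal direction of the cluster
    length: float    # extent of the cluster along its direction


class TriangleDescriptor(NamedTuple):
    vertex_lengths: Tuple[float, float, float]
    directions: Tuple[Vec, Vec, Vec]
    side_lengths: Tuple[float, float, float]


def make_descriptor(a: Cluster, b: Cluster, c: Cluster) -> TriangleDescriptor:
    """Build one descriptor from three clusters used as triangle vertices."""
    sides = (math.dist(a.center, b.center),
             math.dist(b.center, c.center),
             math.dist(c.center, a.center))
    return TriangleDescriptor((a.length, b.length, c.length),
                              (a.direction, b.direction, c.direction), sides)


def descriptors_for(anchor: Cluster,
                    neighbours: Sequence[Cluster]) -> List[TriangleDescriptor]:
    """One triangle per unique pair of neighbours, with `anchor` as a vertex."""
    return [make_descriptor(anchor, p, q) for p, q in combinations(neighbours, 2)]


x_axis = (1.0, 0.0, 0.0)
anchor = Cluster((0.0, 0.0, 0.0), x_axis, 0.2)
neighbours = [Cluster(c, x_axis, 0.1) for c in
              [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 1.0),
               (1.0, 1.0, 1.0), (2.0, 2.0, 0.0)]]
descs = descriptors_for(anchor, neighbours)
```

With 5 nearest neighbour clusters, each anchor cluster yields 10 descriptors, one per unique pair of neighbours.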
[0037] Furthermore, the triangular descriptors are paired according to a feature association algorithm and a similarity scoring function to obtain a set of descriptor pairs. Coarse registration is then performed on the set of descriptor pairs to obtain target descriptor pairs and their coarse transformed poses, including:
[0038] The first phase involves traversing the set of descriptors, using the sum of the side lengths of the descriptors as the key and the triangular descriptors as the target values to construct a hash table.
[0039] Traverse the length cluster set of the target point cloud, and for each triangle descriptor, query the hash table of the source point cloud for the number of triangle descriptors whose length difference is within the range. At the same time, obtain the potential descriptor pair set. The number is the first stage score of each triangle descriptor. Sort the scores from largest to smallest to obtain the score set and the total descriptor pair set.
[0040] Based on the total number of triangular descriptors, the top 20% of the descriptors by score are retained for feature association in the second stage, while the other descriptors are removed from the set, resulting in a new set of descriptor pairs.
[0041] Traverse the new set of descriptor pairs. For a pair of descriptors, subtract the length information of the vertices, the direction vector information, and the length information of the three sides to obtain the difference in vertex length, the difference in direction vector, and the difference in side length.
[0042] For each paired descriptor, determine whether it satisfies the condition that all three differences are less than the threshold. If not, remove it. Finally, retain the descriptor pair in each set of paired descriptors that has the minimum sum of the three differences.
[0043] Traverse the set of retained descriptor pairs, calculate the pose transformation between each pair of descriptors, perform pose transformation on the remaining descriptor pairs, calculate whether the distance between the three vertices of the transformed descriptor is within the threshold range, if the threshold requirement is met, increment the score by 1, and finally obtain the score set. The descriptor pair corresponding to the maximum score is the target descriptor pair, and the corresponding pose transformation is the coarse transformation pose.
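The first association stage (hash table keyed by the sum of side lengths, scoring by candidate count, and retaining the top 20%) can be sketched as follows. Quantising the key into bins of width `bin_size` and the tolerance `tol` are assumptions of this sketch, since the text does not say how the "within the range" query is realised; descriptors are reduced here to their three side lengths.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Sides = Tuple[float, float, float]
Result = Tuple[int, int, List[int]]  # (score, target index, candidate source indices)


def build_table(source: Sequence[Sides], bin_size: float) -> Dict[int, List[int]]:
    """Hash source descriptors by the quantised sum of their side lengths."""
    table: Dict[int, List[int]] = defaultdict(list)
    for idx, sides in enumerate(source):
        table[int(sum(sides) // bin_size)].append(idx)
    return table


def first_stage(target: Sequence[Sides], source: Sequence[Sides],
                table: Dict[int, List[int]], bin_size: float, tol: float) -> List[Result]:
    """Score each target descriptor by its number of source candidates, best first."""
    reach = int(tol // bin_size) + 1  # neighbouring bins that may hold matches
    results: List[Result] = []
    for t_idx, t_sides in enumerate(target):
        total = sum(t_sides)
        key = int(total // bin_size)
        cands = [i for k in range(key - reach, key + reach + 1)
                 for i in table.get(k, ())
                 if abs(sum(source[i]) - total) <= tol]
        results.append((len(cands), t_idx, cands))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


def keep_top(results: Sequence[Result], fraction: float = 0.2) -> List[Result]:
    """Retain the highest-scoring fraction for the second stage."""
    return list(results[:max(1, math.ceil(len(results) * fraction))])


source = [(1.0, 1.0, 1.0), (1.0, 1.0, 1.02), (2.0, 2.0, 2.0)]
target = [(1.0, 1.0, 1.01), (5.0, 5.0, 5.0)]
table = build_table(source, bin_size=0.05)
ranked = first_stage(target, source, table, bin_size=0.05, tol=0.03)
kept = keep_top(ranked)
```

The hash lookup restricts each query to a few bins instead of the whole source set, which is what makes the first stage cheap enough to run on every descriptor.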
[0044] Furthermore, a nonlinear optimization function is used to perform fine registration on the coarse transformed pose of the target descriptor pair to obtain the precise pose of the tanker truck's filling port, including:
[0045] The target point cloud is transformed according to the coarse transformation pose, and a least squares relationship is established between the transformed target point cloud and the source point cloud.
[0046] The least squares relation is solved using the Gauss-Newton method to obtain a high-precision pose transformation.
[0047] The product of the high-precision transformed pose and the source point cloud pose is the precise pose of the filling port of the tanker truck.
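The Gauss-Newton refinement can be illustrated with a deliberately simplified sketch: a planar rigid transform (rotation angle plus 2D translation) with known point correspondences. The method itself operates on 3D poses, so this is not the invention's formulation, only the same least squares pattern in three parameters; all names and values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

P2 = Tuple[float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(a)
    out = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        out.append(det3(m) / d)
    return out


def gauss_newton_2d(moving: Sequence[P2], fixed: Sequence[P2], theta: float = 0.0,
                    tx: float = 0.0, ty: float = 0.0,
                    iters: int = 10) -> Tuple[float, float, float]:
    """Minimise sum ||R(theta) p + t - q||^2 over (theta, tx, ty)."""
    for _ in range(iters):
        c, s = math.cos(theta), math.sin(theta)
        h = [[0.0] * 3 for _ in range(3)]  # J^T J
        g = [0.0] * 3                      # J^T r
        for (px, py), (qx, qy) in zip(moving, fixed):
            rx = c * px - s * py + tx - qx
            ry = s * px + c * py + ty - qy
            jx = (-s * px - c * py, 1.0, 0.0)  # d(rx)/d(theta, tx, ty)
            jy = (c * px - s * py, 0.0, 1.0)   # d(ry)/d(theta, tx, ty)
            for a in range(3):
                g[a] += jx[a] * rx + jy[a] * ry
                for b in range(3):
                    h[a][b] += jx[a] * jx[b] + jy[a] * jy[b]
        step = solve3(h, [-v for v in g])
        theta, tx, ty = theta + step[0], tx + step[1], ty + step[2]
    return theta, tx, ty


moving = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 0.5)]
c0, s0 = math.cos(0.3), math.sin(0.3)
fixed = [(c0 * x - s0 * y + 0.5, s0 * x + c0 * y - 0.2) for x, y in moving]
# Start from a hypothetical coarse pose close to the true one.
theta, tx, ty = gauss_newton_2d(moving, fixed, theta=0.25, tx=0.45, ty=-0.15)
```

Starting from the coarse transformation keeps the initial residuals small, which is the regime where Gauss-Newton converges in a handful of iterations.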
[0048] On the other hand, this application also provides a computer storage medium storing executable program code; the executable program code is used to execute the above-described method for locating the position of a filling port on a tanker truck.
[0049] Compared with the prior art, the advantages and positive effects of the present invention are as follows:
[0050] 1) This invention provides a method for locating the position of a methanol filling port on a tanker truck. It integrates 3D point cloud and 2D image data to achieve the positioning of the filling port, and can be applied to the field of automatic filling technology for tanker trucks. This method enables fast, accurate, and robust positioning of the methanol filling port, while significantly improving the safety of the methanol filling process on tanker trucks.
[0051] 2) This invention provides a method for locating the methanol filling port of a tanker truck. It utilizes a depth camera to acquire real-time image data of the target area, including surrounding environmental information and potential hazards of the methanol filling port. The acquired images are preprocessed, including noise removal, missing value filling, and image denoising, to ensure the accuracy and reliability of subsequent processing. Features of the methanol filling port, including shape, color, and texture, are extracted, and the surrounding environmental features are analyzed to achieve effective identification and location of the filling port. Multi-scale information fusion technology is used, combining image data of different resolutions to comprehensively utilize image and point cloud information, improving the robustness and accuracy of methanol filling port detection. Temporal information learning and attention mechanisms are incorporated into the YOLOv5 model training process to learn and focus on visual features at different time steps, improving recognition speed and coping with factors such as light changes and interference from debris, thereby enhancing the robustness and adaptability of the filling port detection system. Through target detection and feature extraction techniques, coarse visual localization of the methanol filling port is achieved, and its coordinate information is converted into a recognizable data format for subsequent point cloud registration.
[0052] 3) This invention provides a method for locating the position of a filling port on a tanker truck. It combines the physical model of the filling port with the visual positioning results to design a target point cloud segmentation algorithm suitable for methanol filling ports, which obtains a local point cloud containing only the filling port, reducing the interference caused by redundant point clouds. At the same time, it designs an adaptive filtering threshold function that considers depth and distance, avoiding the need to adjust the filtering parameters, and achieving optimal processing of the target point cloud, thus accelerating the 3D point cloud processing speed while ensuring accuracy.
[0053] 4) This invention provides a method for locating the filling port of a tanker truck. It extracts edge points based on the smoothness of the point cloud, uses edge points with more information for point cloud registration, and obtains multiple edge clusters based on Euclidean clustering. It proposes a triangle descriptor with spatial and temporal invariance by leveraging the stability of triangles. At the same time, it designs a two-stage feature association algorithm and a similarity scoring function for pairing triangle descriptors, which accelerates the pairing process and improves the pairing success rate.
[0054] In addition to the objectives, features, and advantages described above, the present invention has other objectives, features, and advantages. The invention will now be described in further detail with reference to the figures.
Description of the Drawings
[0055] The accompanying drawings, which form part of this application, are used to provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:
[0056] Figure 1 is a flowchart of the method for locating the filling port of a tanker truck according to an embodiment of the present invention;
[0057] Figure 2 is a partial image data acquisition diagram according to an embodiment of the present invention;
[0058] Figure 3 is a visual detection result diagram according to an embodiment of the present invention;
[0059] Figure 4 is a schematic diagram of point cloud data according to an embodiment of the present invention;
[0060] Figure 5 is a partial point cloud image after target point cloud segmentation according to an embodiment of the present invention;
[0061] Figure 6 is a point cloud registration result diagram according to an embodiment of the present invention.
Detailed Implementation
[0062] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0063] This invention provides a method for locating the position of a filling port on a tanker truck, which can be applied to, but is not limited to, locating the filling port of a methanol tanker truck. Specifically, it includes the following steps:
[0064] Step 101: Obtain image data and initial point cloud of the filling port of the tanker truck.
[0065] Specifically, a depth camera can be used to acquire real-time image data of the target area; this target area preferably includes the methanol filling port and its surroundings, to obtain environmental information about the methanol filling port and its surroundings, and to identify the port's pose and potential hazards. The specific implementation process is as follows:
[0066] S1.1 Using the onboard visual sensor device (e.g., a high-resolution camera), the robot perceives its surrounding environment, acquires high-definition environmental image information of the location of the filling port, and stores it as a variable.
[0067] S1.2 The obtained environmental images may include various environmental features such as the filling port, surrounding vegetation, buildings, and terrain. They have complex scene composition and rich information content, and are stored as variables.
[0068] S1.3 The robot's surrounding environment is scanned and monitored in real time using a vision sensor to obtain precise information about the filling port location, which is then fed back to the system and stored as a variable.
[0069] S1.4 Image processing algorithms and deep learning techniques are used to perform scene analysis and target detection on the acquired environmental images, in order to identify the precise location of the filling port in the environment and store it as a variable.
[0070] S1.5 In the environmental image of the filling port location, target detection and feature extraction techniques are used to accurately locate the filling port, and its coordinate information is converted into a data format that the robot can recognize and stored as variables.
[0071] Step 102: Based on the image data of the filling port of the tanker truck, construct a visual detection model, and use the visual detection model to coarsely locate the filling port of the tanker truck to obtain the approximate three-dimensional pose of the filling port of the tanker truck.
[0072] Specifically, a depth camera is used to acquire real-time image data of the methanol filling port; the acquired image data is preprocessed and features are extracted, and combined with image data of different resolutions, multi-scale information fusion technology is used to process the data to obtain a training dataset;
[0073] The training dataset is input into the YOLOv5 model based on temporal information learning and attention mechanism for training, resulting in the YOLO visual detection model;
[0074] The environmental image of the filling port location is input into the YOLO vision detection model for target detection and feature extraction, and the two-dimensional pixel coordinates of the center point of the filling port are output; the two-dimensional pixel coordinates of the center point are then converted into the approximate three-dimensional pose of the filling port of the tanker truck.
[0075] The process of converting the two-dimensional pixel coordinates of the center point into the approximate three-dimensional pose of the filling port of the tanker truck includes:
[0076] Obtain the two-dimensional pixel coordinates of the center point of the filling port of the tanker truck from the visual inspection model;
[0077] Based on the depth map captured by the depth camera, the pixel value of the two-dimensional pixel coordinates of the center point of the filling port is obtained, and the depth value of the center of the filling port of the tanker truck is obtained through an affine function.
[0078] Based on the depth value, the approximate three-dimensional pose of the tanker truck's filling port in the camera coordinate system is obtained using the camera's three-dimensional projection model.
[0079] Furthermore, the acquired image undergoes preprocessing, including noise removal, missing value filling, and image denoising; the specific implementation process is as follows:
[0080] S2.1 Image Denoising: First, the acquired image information is denoised using methods such as Gaussian filters or median filters to reduce interference and noise in the image. Specifically, for an image $I$, the Gaussian denoising operation can be expressed as $I_{d} = G_{\sigma} * I$, where $G_{\sigma}$ is the Gaussian kernel, $*$ denotes convolution, and $\sigma$ is the standard deviation of the Gaussian kernel.
[0081] S2.2 Edge Enhancement: Next, edge enhancement techniques are used to process the image to highlight edge features. Common edge enhancement methods include the Sobel operator, Prewitt operator, and Canny edge detection, which can effectively detect edge information in the image. Taking the Sobel operator as an example, its formula for calculating the edge intensity in the horizontal direction is:
[0082] $G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * I$
[0083] S2.3 Other processing: Finally, some other preprocessing operations are performed, such as adjusting the image brightness and contrast, and converting the color space, to further optimize the image quality and improve the detection of the filling port.
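The Gaussian smoothing and horizontal Sobel steps of the preprocessing can be sketched in plain Python as below. Images are represented as nested lists, and the filter is applied as a "valid" cross-correlation, which is how the Sobel kernel is usually applied in practice; the helper names and the test image are hypothetical.

```python
import math
from typing import List, Sequence

Image = List[List[float]]

# Horizontal Sobel kernel: responds to intensity changes along x.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def gaussian_kernel(sigma: float, radius: int) -> Image:
    """Normalised (2 * radius + 1)-square Gaussian kernel."""
    ax = range(-radius, radius + 1)
    k = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma)) for x in ax] for y in ax]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]


def correlate(img: Sequence[Sequence[float]],
              kernel: Sequence[Sequence[float]]) -> Image:
    """'Valid' cross-correlation: output shrinks by kernel size minus one."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img), len(img[0])
    return [[sum(kernel[i][j] * img[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]


# Hypothetical 5x6 image with a vertical step edge between columns 2 and 3.
img = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
gx = correlate(img, SOBEL_X)                       # strong response at the step
blurred = correlate(img, gaussian_kernel(1.0, 1))  # denoised copy of the image
```

The response is zero over flat regions and peaks at the two output columns whose 3x3 window straddles the step.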
[0084] Furthermore, deep learning technology is used to extract features of the methanol filling port, including its shape, color, and texture, while simultaneously analyzing the surrounding environmental features to achieve effective identification and localization of the filling port; the specific implementation process is as follows:
[0085] S3.1 Feature Extraction: Feature extraction is performed on the preprocessed image using image processing algorithms. This includes edge detection, corner detection, and color histogram calculation. Let the image be $I$, the edge image be $E$, the corner image be $C$, and the color histogram be $H$. Then it can be expressed as:
[0086] $E = f_{edge}(I), \quad C = f_{corner}(I), \quad H = f_{hist}(I)$
[0087] where $f_{edge}$, $f_{corner}$, and $f_{hist}$ represent the functions for edge detection, corner detection, and color histogram calculation, respectively.
[0088] S3.2 Feature Analysis: Further analysis is performed on the extracted features. By calculating the correlations among the edge image $E$, the corner image $C$, and the color histogram $H$, richer feature information can be obtained. Let the correlation calculation function be $\rho$. Then it can be expressed as:
[0089] $R_{EC} = \rho(E, C), \quad R_{EH} = \rho(E, H), \quad R_{CH} = \rho(C, H)$
[0090] where $R_{EC}$ indicates the correlation between the edge image and the corner image, $R_{EH}$ indicates the correlation between the edge image and the color histogram, and $R_{CH}$ indicates the correlation between the corner image and the color histogram.
[0091] S3.3 Target Localization: Target localization is performed based on the extracted feature information. This involves comprehensively considering the correlations between the edge image $E$, the corner image $C$, and the color histogram $H$ (denoted $R_{EC}$, $R_{EH}$, and $R_{CH}$), combined with a machine learning model for target recognition and localization. Let the machine learning model be $M$. Then it can be expressed as: $P = M(E, C, H, R_{EC}, R_{EH}, R_{CH})$
[0092] where $P$ indicates the location of the filling port and $M$ represents the machine learning model.
[0093] Furthermore, by combining image data of different resolutions and employing multi-scale information fusion technology, image and point cloud information are comprehensively utilized to improve the robustness and accuracy of methanol refueling port detection; the specific implementation process includes:
[0094] S4.1 Resolution unification and calibration: The resolution of image data acquired by explosion-proof cameras is unified and calibrated to ensure the comparability and compatibility of data at the same scale.
[0095] S4.2 Feature Weighting and Combination: Based on the importance and reliability of different features, features are weighted and combined to improve the robustness and accuracy of the methanol filling port detection algorithm.
[0096] Furthermore, temporal information learning and attention mechanisms are incorporated into the YOLOv5 model training process to learn and focus on visual features at different time steps, thereby improving recognition speed and coping with factors such as lighting changes and interference from clutter, thus enhancing the robustness and adaptability of the filling port detection system. The specific implementation process is as follows:
[0097] S5.1 In the training data, extract the temporal information of the image data through timestamps or sequence numbers, so as to introduce temporal learning during the model training process.
[0098] S5.2 Design a suitable temporal feature representation method to combine temporal information with image features to form an input data format suitable for model learning.
[0099] S5.3 For the time series features, design a time series model structure based on LSTM to learn and represent the visual features at different time steps.
[0100] S5.4 Introduce an attention mechanism into the time series model to enable the model to dynamically focus on and learn important information at different time steps, thereby improving the adaptability of the filling port detection system to factors such as changes in light and interference from debris.
[0101] S5.5 Model Training and Optimization: Based on the designed temporal model and attention mechanism, the training data is used to train and continuously optimize the YOLOv5 model parameters and structure to achieve accurate identification and location of the filling port.
[0102] S5.6 Performance Evaluation and Adjustment: Evaluate the performance of the trained YOLO vision detection model, and adjust and optimize the model based on the evaluation results to ensure the robustness and adaptability of the filling port detection system under different environmental conditions.
[0103] Furthermore, through target detection and feature extraction techniques, the filling port is coarsely located, and its coordinate information is converted into a data format recognizable for subsequent point cloud registration; the specific implementation process is as follows:
[0104] S6.1 Use the YOLOv5 algorithm for methanol filling port target detection on the environmental image at the location of the filling port, ensuring that the position of the filling port in the image can be accurately located.
[0105] S6.2 Combining the target detection and feature extraction results, the localization algorithm is used to accurately locate the filling port, determine its exact position in the environmental image, and calculate its coordinate information.
[0106] S6.3 Data Format Conversion: The coordinate information of the filling port is converted into a format that can be recognized by subsequent point cloud registration, ensuring that the location information of the filling port can be accurately identified and processed by the subsequent point cloud registration algorithm.
[0107] Step 103: Segment the initial point cloud according to the physical model of the filling port and the approximate three-dimensional pose to obtain local point cloud data containing only the filling port; filter the local point cloud data according to the adaptive filtering threshold function to obtain target point cloud data with optimal density.
[0108] The process involves segmenting the initial point cloud based on the physical model of the filling port and the approximate three-dimensional pose to obtain local point cloud data containing only the filling port. This includes: first, modeling the actual physical model of the filling port of the tanker truck as a cylindrical object of fixed size, and obtaining the physical parameters and contour model of the filling port based on the MSAC fitting algorithm; then, segmenting the initial point cloud with the approximate three-dimensional pose as the center and the filling port contour model as the contour to obtain a local point cloud containing only the methanol filling port.
[0109] The local point cloud data is filtered using an adaptive filtering threshold function to obtain target point cloud data with optimal density. This process includes: first, obtaining the pixel value at the center of the filling port based on the depth map acquired by the camera, and then obtaining the depth value at the center of the filling port of the tanker truck through an affine function; next, constructing an adaptive filtering threshold function based on the depth value to obtain the optimal voxel filtering parameters; and finally, performing voxel downsampling on the local point cloud data based on the optimal voxel filtering parameters to obtain target point cloud data with optimal density.
[0110] Specifically, after the camera reaches the target area, it collects point cloud data of the tanker truck's filling port. A target point cloud segmentation algorithm is designed, combining the physical model of the filling port with visual positioning results. An adaptive filtering threshold function, using depth and distance as factors, is also designed to perform voxel filtering on the target point cloud. The specific implementation process is as follows:
[0111] S7.1 Obtain the pixel coordinates of the center of the filling port of the tanker truck from the visual recognition stage. Based on the depth map captured by the camera, obtain The pixel value at that location is used to obtain the depth value of the center of the tanker truck's filling port through an affine function. The approximate 3D pose in the camera coordinate system is obtained using the camera's 3D projection model. The relevant formulas are as follows:
[0112] ;
[0113] where the remaining quantities are the camera's intrinsic parameters.
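As an illustration of S7.1, the following minimal sketch back-projects a pixel with known depth into camera coordinates using the standard pinhole model. The names `fx`, `fy`, `cx`, `cy`, the affine coefficients `scale` and `offset`, and all sample values are assumptions chosen for illustration; they are not taken from this specification.

```python
def raw_to_depth(raw_value, scale=1.0, offset=0.0):
    """Affine map from a raw depth-map pixel value to metric depth (assumed form)."""
    return scale * raw_value + offset


def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at the given depth into camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics and measurement, for illustration only.
z = raw_to_depth(1400, scale=1.0, offset=0.0)
center_cam = back_project(u=960, v=540, depth=z, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

A pixel at the principal point maps onto the optical axis, so only the depth component is non-zero in this example.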
[0114] S7.2. Based on its actual physical structure, the methanol filling port is modeled as a cylindrical object of fixed dimensions. The physical parameters of the tanker truck's filling port (radius, cylinder axis vector, and cylinder height) and the filling port contour model are obtained with the MSAC fitting algorithm;
[0115] S7.3. With the approximate position as the center and the fitted model as the contour, segment the initial point cloud to obtain a local point cloud containing only the methanol filling port;
[0116] S7.4. Based on the depth value, design an adaptive filtering threshold function to calculate the optimal voxel filtering parameter. The relevant calculation formula is as follows:
[0117] ;
[0118] where aim is the ideal accuracy.
[0119] S7.5. Perform voxel downsampling on the local point cloud using the optimal voxel filtering parameter to obtain the target point cloud with optimal density;
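The voxel downsampling of S7.5 can be sketched as follows. The adaptive threshold function of S7.4 is not reproduced in this text, so the sketch simply takes the voxel size as a given input; the function name `voxel_downsample` and the sample points are illustrative assumptions.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same cubic voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index of the point along each axis
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    return [
        tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for pts in buckets.values()
    ]


# Two nearby points share a voxel; the third lies in a different one.
cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (5.2, 0.0, 0.0)]
sparse = voxel_downsample(cloud, voxel_size=1.0)
```

A larger voxel size gives a sparser cloud, which is why tying it to the measured depth lets the point density stay comparable across working distances.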
[0120] Furthermore, in step S7.2, the specific implementation process of obtaining the physical parameters of the tanker truck's filling port based on the MSAC fitting algorithm is as follows:
[0121] 1) For each point on the cylindrical surface, select several neighboring points and fit a plane to them; the unit normal vector of that plane is taken as the normal vector at the point. Repeat this for every point on the cylindrical surface to obtain a unit normal vector per point. Then treat each unit normal vector as a point and fit a plane to these points; the normal vector of that plane is the initial value of the cylinder axis vector;
[0122] 2) After obtaining the axis, apply a coordinate transformation to the cylinder so that the cylinder axis vector becomes parallel to the Z-axis of the camera coordinate system. The projected coordinates of the points on the cylindrical surface then form a planar circle, which is fitted to obtain the initial values of the circle center and the radius.
[0123] 3) The distance from any point on the cylindrical surface to the cylinder axis is constant and equal to the radius, so an error equation can be established from the initial values above. The specific formula for solving the fitting model of the filling port is as follows:
[0124] ;
[0125] In the formula, the components of the cylinder axis direction form a unit direction vector, i.e. they satisfy the unit-norm constraint; this vector represents the orientation of the central axis of the cylinder in three-dimensional space.
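The error term described in item 3) can be illustrated with a short sketch: the residual of a point is its perpendicular distance to the axis minus the radius. The error equation itself is not reproduced in this text, so the function names and the plain sum-of-squares cost below are assumptions; an MSAC fit would additionally cap each squared residual at a threshold when scoring candidate models.

```python
import math


def cylinder_residual(point, axis_point, axis_dir, radius):
    """Distance from a point to the cylinder axis minus the radius (zero on the surface)."""
    norm = math.sqrt(sum(a * a for a in axis_dir))
    ax = [a / norm for a in axis_dir]  # enforce a unit direction vector
    d = [p - c for p, c in zip(point, axis_point)]
    # |d x a| is the perpendicular distance from the point to the axis line
    cross = (
        d[1] * ax[2] - d[2] * ax[1],
        d[2] * ax[0] - d[0] * ax[2],
        d[0] * ax[1] - d[1] * ax[0],
    )
    return math.sqrt(sum(c * c for c in cross)) - radius


def sum_squared_residuals(points, axis_point, axis_dir, radius):
    """Least-squares cost that a cylinder fit would try to minimise."""
    return sum(cylinder_residual(p, axis_point, axis_dir, radius) ** 2 for p in points)
```

For example, with an axis along Z through the origin and a radius of 75, the point (75, 0, 10) has a residual of zero and the point (0, 80, -3) has a residual of 5.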
[0126] Step 104: Extract edge point features from the target point cloud data with the optimal density based on smoothness to obtain an edge feature point set; cluster the edge feature point set using Euclidean clustering to obtain a cluster set; construct a triangle descriptor for the cluster set based on the stability of triangles.
[0127] The specific implementation process is as follows:
[0128] S8.1. For each point in the target point cloud, compute the curvature from the five points before and the five points after it, and filter out the set of edge feature points using the smoothness threshold. The specific calculation formula is as follows:
[0129] ;
[0130] where S is the neighborhood set formed by the five points before and after the point; the two coordinate terms are the three-dimensional coordinate vectors of the point under consideration and of a neighboring point in the same point cloud frame; in the subscript, the first index is the point cloud frame number and the second is the point number; and the superscript L marks coordinates expressed in the local reference frame formed by the neighborhood set S.
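The curvature formula of S8.1 is not reproduced in this text. The sketch below assumes the widely used LOAM-style smoothness measure, which matches the description of a neighborhood S of five points on each side; the function names and the direction of the threshold test (larger value means edge point) are assumptions.

```python
import math


def smoothness(points, i, half_window=5):
    """LOAM-style smoothness of point i using `half_window` neighbours on each side."""
    xi = points[i]
    neighbours = points[i - half_window:i] + points[i + 1:i + 1 + half_window]
    # Norm of the summed difference vectors, normalised by neighbourhood size and range
    diff = [sum(xi[k] - xj[k] for xj in neighbours) for k in range(3)]
    norm_xi = math.sqrt(sum(c * c for c in xi))
    return math.sqrt(sum(c * c for c in diff)) / (len(neighbours) * norm_xi)


def edge_points(points, threshold, half_window=5):
    """Indices of points whose smoothness value exceeds the threshold."""
    return [
        i for i in range(half_window, len(points) - half_window)
        if smoothness(points, i, half_window) > threshold
    ]
```

On a straight scan line the difference vectors cancel and the value is zero; a point that sticks out of its neighborhood yields a large value and is kept as an edge feature.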
[0131] S8.2. Construct the functional relationship between the clustering threshold and the optimal voxel filtering parameter, and apply the Euclidean clustering algorithm to the set of edge feature points to obtain the cluster set. The specific formula for calculating the clustering threshold is as follows:
[0132] ;
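Euclidean clustering as used in S8.2 can be sketched with a simple region-growing loop. The clustering threshold formula is not reproduced in this text, so `tolerance` is taken as a given input; the brute-force neighbor search is an assumption made for brevity, and practical implementations usually use a k-d tree instead.

```python
import math
from collections import deque


def euclidean_clusters(points, tolerance, min_size=1):
    """Group point indices so each point is within `tolerance` of another point in its cluster."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue, members = deque([seed]), [seed]
        while queue:
            current = queue.popleft()
            near = [j for j in unvisited
                    if math.dist(points[current], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                members.append(j)
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return sorted(clusters)
```

Because the tolerance is tied to the voxel filtering parameter, points that were neighbors before downsampling remain in the same cluster afterwards.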
[0133] S8.3. For each cluster in the cluster set, compute the centroid coordinates and obtain the principal direction, i.e. the direction vector of the cluster, by principal component analysis (PCA). Project the points of each cluster onto its principal direction and take the distance between the two farthest projected points as the length of the cluster. The specific calculation formula is as follows:
[0134] ;
[0135] where the count term denotes the number of points in the cluster.
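The per-cluster quantities of S8.3 can be sketched with NumPy as follows. The function name `cluster_descriptor` is an assumption, and the covariance-based PCA shown here is the textbook formulation rather than a formula taken from this specification.

```python
import numpy as np


def cluster_descriptor(points):
    """Return centroid, unit principal direction, and extent along that direction."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # The eigenvector of the covariance matrix with the largest eigenvalue
    # is the principal direction of the cluster.
    eigvals, eigvecs = np.linalg.eigh(centered.T @ centered / len(pts))
    direction = eigvecs[:, np.argmax(eigvals)]
    # Length = distance between the two farthest projections onto that direction
    projections = centered @ direction
    length = projections.max() - projections.min()
    return centroid, direction, length
```

The sign of the principal direction is arbitrary, so any later comparison of direction vectors should tolerate a flipped vector.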
[0136] S8.4. Traverse the cluster set and, for each cluster, search for its 5 nearest-neighbor clusters. Traverse all unique pairwise combinations of these nearest-neighbor clusters and, with the current cluster and each pair as vertices, construct a stable triangle descriptor and store it in the descriptor set. Each triangle descriptor contains the length information of the three vertices, their direction vector information, and the lengths of the three sides.
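The triangle construction of S8.4 can be sketched as below. For brevity the sketch stores only the vertex indices and the sorted side lengths; the vertex length and direction information described above would be attached in the same way. The dictionary layout and the de-duplication of triangles reached from different vertices are assumptions.

```python
import math
from itertools import combinations


def triangle_descriptors(centroids, k=5):
    """Build one triangle per cluster and per unordered pair of its k nearest neighbour clusters."""
    seen, descriptors = set(), []
    for i, c in enumerate(centroids):
        others = sorted((j for j in range(len(centroids)) if j != i),
                        key=lambda j: math.dist(c, centroids[j]))[:k]
        for a, b in combinations(others, 2):
            vertices = tuple(sorted((i, a, b)))
            if vertices in seen:  # same triangle already reached from another vertex
                continue
            seen.add(vertices)
            p, q, r = (centroids[v] for v in vertices)
            sides = tuple(sorted((math.dist(p, q), math.dist(q, r), math.dist(r, p))))
            descriptors.append({"vertices": vertices, "sides": sides})
    return descriptors
```

Sorting the side lengths makes the descriptor independent of vertex order, which is what allows two triangles from different point clouds to be compared directly.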
[0137] Step 105: Pair the triangular descriptors according to the feature association algorithm and similarity scoring function to obtain a set of descriptor pairs, and perform coarse registration on the set of descriptor pairs to obtain the target descriptor pairs and their coarse transformed poses.
[0138] The specific implementation process is as follows:
[0139] S9.1. First stage: traverse the descriptor set of the source point cloud and build a hash table with the sum of the side lengths of each descriptor as the key and the triangle descriptor as the value;
[0140] S9.2. Traverse the triangle descriptors of the target point cloud. For each triangle descriptor, query the hash table of the source point cloud for the triangle descriptors whose side-length sum differs from its own by no more than the threshold. These form the set of potential descriptor pairs, and their number is the first-stage score of that triangle descriptor. Sort the scores in descending order to obtain the score set and the total set of descriptor pairs;
[0141] S9.3. Based on the total number of triangle descriptors, retain the top 20% of descriptors by score for second-stage feature association and remove the remaining descriptors from the set, yielding the filtered set of descriptor pairs;
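The first stage (S9.1 to S9.3) can be sketched as follows, assuming each descriptor is a dictionary with a `"sides"` tuple. The quantization of the side-length sum into bins of width `bin_width`, and the function names, are assumptions introduced so that the tolerance query becomes a hash lookup.

```python
import math
from collections import defaultdict


def build_hash_table(descriptors, bin_width):
    """Index source descriptors by their quantised side-length sum."""
    table = defaultdict(list)
    for d in descriptors:
        table[math.floor(sum(d["sides"]) / bin_width)].append(d)
    return table


def first_stage(target_descriptors, source_table, bin_width, keep_ratio=0.2):
    """Score each target descriptor by its number of candidates and keep the top share."""
    scored = []
    for d in target_descriptors:
        total = sum(d["sides"])
        key = math.floor(total / bin_width)
        # Neighbouring bins are checked so sums close to a bin boundary are not missed
        candidates = [s for k in (key - 1, key, key + 1)
                      for s in source_table.get(k, [])
                      if abs(sum(s["sides"]) - total) <= bin_width]
        scored.append((len(candidates), d, candidates))
    scored.sort(key=lambda item: item[0], reverse=True)
    keep = max(1, int(len(scored) * keep_ratio)) if scored else 0
    return scored[:keep]
```

Each returned entry holds the score, the target descriptor, and its candidate source descriptors, which together form the descriptor pairs passed to the second stage.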
[0142] S9.4. Traverse the filtered set. For each descriptor pair, subtract the vertex length information, direction vector information, and three side lengths of the two descriptors to obtain the vertex length difference, the direction vector difference, and the side length difference;
[0143] S9.5. For each descriptor pair, check whether all three differences are below their thresholds; if not, remove the pair. Finally, for each descriptor, retain only the descriptor pair whose sum of the three differences is smallest. The calculation formula is as follows:
[0144] ;
[0145] S9.6. Traverse the retained set of descriptor pairs and compute the pose transformation between the two descriptors of each pair. Apply this transformation to the remaining descriptor pairs and check whether the distances between the three transformed vertices and their counterparts fall within the threshold; each pair that meets the threshold adds 1 to the score. This yields the final score set. The descriptor pair with the highest score is the target descriptor pair, and its pose transformation is the coarse transformation pose.
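The voting step of S9.6 can be sketched as below. How the pose transformation between two descriptors is computed is not specified in this text; the sketch assumes a frame-alignment construction (an orthonormal frame is built on each triangle and one is rotated onto the other), which is exact for congruent triangles. An SVD-based least-squares fit is a common alternative. All function names are illustrative.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return [x / n for x in a]


def _frame(p0, p1, p2):
    """Right-handed orthonormal frame attached to a triangle."""
    e1 = _unit(_sub(p1, p0))
    e3 = _unit(_cross(e1, _sub(p2, p0)))
    return [e1, _cross(e3, e1), e3]


def rigid_from_triangles(src, dst):
    """Rotation R and translation t that map triangle src onto triangle dst."""
    es, ed = _frame(*src), _frame(*dst)
    R = [[sum(ed[k][i] * es[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    t = [dst[0][i] - sum(R[i][j] * src[0][j] for j in range(3)) for i in range(3)]
    return R, t


def transform(R, t, p):
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def vote(R, t, pairs, threshold):
    """Count descriptor pairs whose three vertices land within `threshold` after the transform."""
    return sum(
        all(math.dist(transform(R, t, s), d) <= threshold for s, d in zip(src, dst))
        for src, dst in pairs
    )
```

Running `vote` once per candidate transformation and keeping the transformation with the highest count gives the coarse pose.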
[0146] Step 106: Use a nonlinear optimization function to perform fine registration on the coarse transformation pose of the target descriptor pair to obtain the accurate pose of the filling port of the tanker truck.
[0147] The specific implementation process is as follows:
[0148] S10.1. Apply the coarse transformation pose to obtain the transformed target point cloud;
[0149] S10.2. Establish the least-squares relationship between the transformed target point cloud and the source point cloud, expressed as follows:
[0150] ;
[0151] S10.3. Solve the least-squares function to obtain the high-precision transformation pose, and use it to compute the center position of the tanker truck's filling port. The calculation formula is as follows:
[0152] ;
[0153] where the remaining term is the pose of the source point cloud.
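The least-squares relationship of S10.2 is not reproduced in this text. As a rough illustration, the sketch below assumes the nearest-neighbor point-to-point objective used in ICP-style refinement: the cost is the sum of squared distances from each transformed target point to its closest source point. The function name and this choice of objective are assumptions.

```python
import math


def registration_cost(transformed_target, source):
    """Sum of squared distances from each transformed target point to its nearest source point."""
    return sum(min(math.dist(p, q) for q in source) ** 2 for p in transformed_target)


# A perfectly aligned cloud has zero cost; a small offset raises it.
source_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
aligned = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)]
costs = (registration_cost(aligned, source_cloud), registration_cost(shifted, source_cloud))
```

A nonlinear optimizer started from the coarse pose would adjust the transformation until such a cost stops decreasing.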
[0154] The present invention will be explained and described below with reference to specific embodiments.
[0155] Figure 1 is an overall flowchart of the fast, high-precision method for locating the pose of a tanker truck's filling port provided by an embodiment of the present invention. The method is divided into two main processes, visual detection and point cloud registration, and specifically includes the following steps:
[0156] Step One: Use a depth camera to acquire real-time image data of the target area, including environmental information and potential hazards around the methanol filling port. This includes the following steps:
[0157] Step 1: Using a high-resolution explosion-proof camera, perceive the surrounding environment and acquire high-definition environmental images of the location of the filling port for subsequent analysis and processing. Image quality is optimized by adjusting parameters such as camera exposure time and focal length, further improving the efficiency and accuracy of image information acquisition. Store high-resolution image information of the location of the filling port.
[0158] Step 2: The obtained environmental image may include various environmental features of the refueling port and its surroundings, with complex scene composition and rich information content, which are stored as variables. In project practice, methods for analyzing and processing image information can be further explored to extract and utilize more environmental features. For example, the filling port and its surrounding objects or areas can be marked and identified to enable the location and recognition of different objects.
[0159] Step 3: Perform a comprehensive scan and real-time monitoring of the environment surrounding the filling port to accurately obtain the location information of the filling port, and feed it back to the system in real time, storing it as a variable. Optimize the parameters and scanning logic of the explosion-proof camera to improve the accuracy and real-time performance of the filling port location information. Simultaneously, an efficient data storage method can be employed to store the real-time acquired location information as variables.
[0160] Step 4: Perform complex scene analysis on the acquired environmental images to identify the true location of the filling port in the environment, and store it as a variable.
[0161] Step 5: In the environmental image of the filling port location, convert the filling port coordinate information into a data format that the robot can recognize, and store it as a variable.
[0162] Step Two: A subset of the images and image information acquired by the explosion-proof camera is shown in Figure 2 (400 sets of images and related data in total). Preprocess the acquired images, including noise removal, missing-value filling, and image denoising. The steps include:
[0163] Step 1: First, the acquired image information is denoised using methods such as Gaussian or median filtering to reduce interference and noise in the image. Specifically, for an image I, the denoising operation can be represented as I_denoised = GaussianBlur(I, σ), where σ is the standard deviation of the Gaussian kernel, GaussianBlur is the Gaussian filter function, and I_denoised is the denoised image.
[0164] Step 2: Take the denoised image, which is stored as a variable and denoted I. Define the horizontal convolution kernel of the Sobel operator and convolve image I with it to compute the horizontal edge intensity. Define the vertical convolution kernel of the Sobel operator and convolve image I with it to compute the vertical edge intensity. Compute the total edge intensity of each pixel from these two values, and apply edge enhancement to the image based on the total edge intensity so that edge features become more prominent.
[0165] In this embodiment , :
[0166] ;
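The Sobel step can be illustrated with the standard 3×3 kernels. The kernels used in this embodiment are not reproduced in the text, so the values below are the textbook ones, and combining the two responses as a Euclidean magnitude is an assumption (the sum of absolute values is another common choice).

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to intensity changes along x
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to intensity changes along y


def filter3x3(image, kernel):
    """Apply a 3x3 kernel to a 2-D list image, leaving the 1-pixel border at zero."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out


def sobel_magnitude(image):
    """Total edge intensity per pixel from the horizontal and vertical responses."""
    gx, gy = filter3x3(image, SOBEL_X), filter3x3(image, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

The kernels are applied without flipping; for the Sobel kernels this changes only the sign of each response, so the magnitude is unaffected.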
[0167] Step 3: For some images, other preprocessing operations are performed based on the image data, such as adjusting the image brightness (img_Brightness) and contrast, and converting the color space (RGB / HSV) to optimize image quality and improve the detection effect of subsequent injection ports.
[0168] Step Three: Extract the characteristics of the methanol filling port, including its shape, color, and texture, and analyze the surrounding environmental features to achieve effective identification and location of the filling port. This includes the following steps:
[0169] Step 1: First, the preprocessed image I is analyzed further to extract important feature information through edge detection, corner detection, and color histogram calculation. Edge detection identifies edges and boundaries in the image: an image processing algorithm is applied to image I to generate an edge image E containing the edge information. Corner detection identifies prominent feature points, which typically represent key locations or structures of an object: a corresponding algorithm generates a corner image C containing the corner information. Finally, the color histogram H(I), a statistic of how frequently each color occurs, is calculated to describe the color distribution of the image.
[0170] Here the three operators applied to image I denote the functions for edge detection, corner detection, and color histogram calculation, respectively.
[0171] Step 2: Calculate the correlation among the edge image E, the corner image C, and the color histogram H(I); analyzing the correlation between features yields richer feature information. The correlation is computed by a correlation calculation function.
[0172] Step 3: After the correlation is calculated, target recognition and localization are performed by jointly considering the correlation among the edge image, corner image, and color histogram, combined with a machine learning model that maps these features to the location of the filling port;
[0173] where the output denotes the location of the filling port and the mapping denotes the machine learning model.
[0174] Step Four: Combining image data at different resolutions, multi-scale information fusion is employed to make comprehensive use of image and point cloud information, improving the robustness and accuracy of methanol filling port detection. This includes the following steps:
[0175] Step 1: First, for the image data acquired by the explosion-proof camera, record the original resolution of each image. Then unify the resolution of all image data so that the images are compared at the same spatial scale, for example by resizing all images to 1920×1080 pixels. At the same time, perform camera calibration to determine the camera's intrinsic parameters (such as focal length and principal point) and extrinsic parameters (such as camera position and attitude). This can be done by photographing a calibration board or using a dedicated calibration tool: from the pixel coordinates detected on the calibration board and the corresponding world coordinates, a calibration algorithm calculates the intrinsic and extrinsic parameters. These are typically represented by the intrinsic parameter matrix K and the extrinsic parameter matrix [R|t]. The intrinsic parameter matrix K contains the focal lengths (f_x, f_y) and the principal point (c_x, c_y); the distortion parameters (k_1, k_2, p_1, p_2) are estimated alongside it during calibration. K is typically expressed as:
[0176] $$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}$$;
[0177] In this embodiment, K:
[0178] ;
[0179] The extrinsic parameter matrix [R|t] describes the camera's external parameters, including the camera's rotation matrix R and translation vector t, which are used to describe the camera's position and orientation in the world coordinate system.
[0180] The projection transformation formula for a camera can be expressed as:
[0181] ;
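The projection transformation is not reproduced in this text. The sketch below assumes the standard pinhole relation s·[u, v, 1]^T = K·(R·X + t), ignoring skew and lens distortion; the function name and the sample intrinsics are illustrative assumptions, not the calibration values of this embodiment.

```python
def project(point_world, K, R, t):
    """Project a 3-D world point to pixel coordinates with the pinhole model."""
    # World -> camera coordinates using the extrinsic parameters [R|t]
    cam = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Camera -> pixel coordinates using the intrinsic matrix K (skew ignored)
    u = K[0][0] * cam[0] / cam[2] + K[0][2]
    v = K[1][1] * cam[1] / cam[2] + K[1][2]
    return u, v


# Hypothetical intrinsics; identity extrinsics place the world frame at the camera.
K_example = [[1000.0, 0.0, 960.0], [0.0, 1000.0, 540.0], [0.0, 0.0, 1.0]]
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project((100.0, 0.0, 1000.0), K_example, R_identity, [0.0, 0.0, 0.0])
```

Dividing by the camera-frame depth is what makes the scale factor s disappear from the final pixel coordinates.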
[0182] Ultimately, the calibration results will be used in subsequent image processing and analysis to ensure the accuracy of the experimental data and to effectively unify and calibrate the resolution of the image data acquired by the explosion-proof camera.
[0183] Step 2: Weight and combine the image features acquired from the explosion-proof camera. Taking the edge image E, corner image C, and color histogram H(I) obtained in Step Three as an example, the resulting data are as follows: the edge image E contains 1200 edge points, the corner image C identifies 25 corner points, and the color histogram H(I) contains 256 bins. The features are then weighted and combined according to their importance and reliability to improve the robustness and accuracy of the detection algorithm; linear or nonlinear weighting can be applied to the edge, corner, and color features to obtain a comprehensive feature vector. In this example the weighted comprehensive feature vector is F = [0.6·E, 0.3·C, 0.1·H(I)], where each weight reflects the importance of that feature in the detection algorithm.
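The linear weighting F = [0.6·E, 0.3·C, 0.1·H(I)] can be sketched as a scale-and-concatenate operation. Representing each feature as a flat list of numbers is an assumption made for illustration; the text does not specify how E, C, and H(I) are vectorized.

```python
def fuse_features(edge, corner, hist, weights=(0.6, 0.3, 0.1)):
    """Scale each feature vector by its weight and concatenate the results into F."""
    fused = []
    for w, feature in zip(weights, (edge, corner, hist)):
        fused.extend(w * x for x in feature)
    return fused


# Toy feature vectors, for illustration only.
F = fuse_features([1.0, 2.0], [10.0], [100.0, 200.0])
```

The length of F is the sum of the three feature lengths, so the relative influence of each block depends on both its weight and its dimensionality.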
[0184] Step Five: Combining the parameters and data processing results obtained in Steps Three and Four, temporal information learning and attention mechanisms are incorporated into the YOLOv5 model training process. Visual features at different time steps are learned and attended to, which improves recognition speed and handles factors such as lighting changes and interference from debris, thereby enhancing the speed and accuracy of the filling port detection system. This includes the following steps:
[0185] Steps 1, 2, and 3: For the training data, the temporal information of the image data is first extracted by recording the timestamps of image acquisition or assigning sequence numbers to introduce the concept of temporal learning. Subsequently, a suitable temporal feature representation method is designed to combine the temporal information with image features, forming an input data format suitable for model learning. This includes combining a series of consecutive image frames into a temporal sequence and combining each temporal sequence with its corresponding image features to construct a temporal feature representation. For the temporal features, a Long Short-Term Memory (LSTM) network is designed as the model structure to learn and represent visual features at different time steps. LSTM, as a recurrent neural network structure suitable for modeling sequential data, can effectively capture long-term dependencies in sequential data.
[0186] Step 4: Design a temporal filling port detection algorithm based on an attention mechanism. First, use a convolutional neural network (CNN) to extract features from each image frame, obtaining an image feature sequence {x_1, x_2, ..., x_T}, where T is the length of the sequence. Then design an attention model that dynamically calculates an attention weight for each time step so that the model can focus on key time steps. Specifically, a soft attention mechanism can be used: the attention weights are computed from the attention energies, where e_t is the attention energy of the t-th time step and can be calculated with a multilayer perceptron (MLP) or a fully connected layer. Next, the image feature sequence is weighted and summed according to the attention weights to obtain the weighted feature representation z. Finally, the weighted feature z is input into the LSTM model.
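The soft attention step can be sketched as follows, assuming the usual softmax normalization of the energies e_t. The energies are taken as given inputs here; in the described design they would come from an MLP or fully connected layer. The function name is an assumption.

```python
import math


def soft_attention(features, energies):
    """Softmax the energies e_t into weights and return (weights, weighted sum z of the x_t)."""
    peak = max(energies)                       # subtract the max for numerical stability
    exps = [math.exp(e - peak) for e in energies]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(features[0])
    z = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]
    return weights, z
```

Equal energies give a plain average of the frame features, while one dominant energy makes z follow that single frame almost exactly.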
[0187] Step 5: Based on the designed temporal model and attention mechanism, train on the training data and continuously optimize the YOLOv5 model parameters and structure to achieve accurate identification and location of the filling port.
[0188] Step 6: Evaluate the performance of the trained YOLO visual detection model, and adjust and optimize the model based on the evaluation results to ensure the robustness and adaptability of the filling port detection system under different environmental conditions.
[0189] Step Six: Using target detection and feature extraction techniques, coarsely locate the filling port and convert its coordinate information into a data format recognizable for subsequent point cloud registration; this includes the following steps:
[0190] Step 1: Use a methanol filling port target detection algorithm to detect the filling port in the environmental image of its location. The detection results are shown in Figure 3.
[0191] Step 2: Obtain the coordinates of the coarsely located center point for precise detection, in TXT document data format (including labels, detection results, and confidence scores). In this embodiment:
[0192] ('exit',0.385,0.523,0.98) ('entrance',0.756,0.533,0.98)
[0193] Step 3: Share the detection data position_exit and position_entrance via dynamic link library or file sharing. In this embodiment:
[0194] position_exit (-215.6, 3.5, 1417.5); position_entrance (203.7, 7.3, 1441.4).
[0195] Step Seven: After the depth camera reaches the target area, it collects point cloud data of the tanker truck's filling port, as shown in Figure 4. A target point cloud segmentation algorithm combining the physical model of the filling port with the visual positioning results is designed, and an adaptive filtering threshold function with depth and distance as factors is designed to perform voxel filtering on the target point cloud:
[0196] Step 1: Obtain the visual detection results, the approximate coordinates of the filling port in the camera coordinate system, which in this embodiment are: (-215.6, 3.5, 1417.5) and (203.7, 7.3, 1441.4).
[0197] Step 2: Based on its actual physical structure, the methanol filling port is modeled as a cylindrical object of fixed dimensions. The physical parameters of the tanker truck's filling port (radius, cylinder axis vector, and cylinder height) and the filling port contour model are obtained with the MSAC fitting algorithm;
[0198] Step 2.1: For each point on the cylindrical surface, select several neighboring points and fit a plane to them; the unit normal vector of that plane is taken as the normal vector at the point. Repeat this for every point on the cylindrical surface to obtain a unit normal vector per point. Then treat each unit normal vector as a point and fit a plane to these points; the normal vector of that plane is the initial value of the cylinder axis vector;
[0199] In this embodiment, the initial value of the cylinder axis vector is:
[0200] (-0.000993, -0.014894, -0.999889);
[0201] Step 2.2: After obtaining the axis, apply a coordinate transformation to the cylinder so that the cylinder axis vector becomes parallel to the Z-axis of the camera coordinate system. The projected coordinates of the points on the cylindrical surface then form a planar circle, which is fitted to obtain the initial values of the circle center and the radius;
[0202] In this embodiment, the circle centers and radii of the two filling ports are:
[0203] centers: (-215.6, 3.5, 1417.5) and (203.7, 7.3, 1441.4);
[0204] radii: 75 mm and 58 mm;
[0205] Step 2.3: The distance from any point on the cylindrical surface to the cylinder axis is constant and equal to the radius, so an error equation can be established from the initial values above. The specific formula for solving the fitting model of the filling port is as follows:
[0206] ;
[0207] Step 3: With the approximate pose as the center and the fitted model as the contour, segment the initial point cloud to obtain a local point cloud containing only the methanol filling port, as shown in Figure 5;
[0208] Step 4: Based on the depth value, design an adaptive filtering threshold function to calculate the optimal voxel filtering parameter. The relevant calculation formula is as follows:
[0209] ;
[0210] In this embodiment, aim = 5mm, therefore:
[0211] ;
[0212] Step 5: Perform voxel downsampling on the local point cloud using the optimal voxel filtering parameter to obtain the target point cloud with optimal density.
[0213] Step Eight: Extract edge point features from the target point cloud based on smoothness, use Euclidean clustering to obtain multiple edge point clusters, and construct triangle descriptors from the edge point clusters by leveraging the stability of triangles.
[0214] Step 1: For each point in the target point cloud, take the 5 points before and the 5 points after it as the basis for calculating the curvature at that point, and filter out the set of edge feature points using the smoothness threshold. The specific calculation formula is as follows:
[0215] ;
[0216] where S represents the neighborhood set of the 5 points before and after the point.
[0217] Step 2: Construct the functional relationship between the clustering threshold and the optimal voxel filtering parameter, and apply the Euclidean clustering algorithm to the set of edge feature points to obtain the cluster set. The specific formula for calculating the clustering threshold is as follows:
[0218] ;
[0219] In this embodiment =20;
[0220] Step 3: For each cluster in the cluster set, compute the centroid coordinates and obtain the principal direction, i.e. the direction vector of the cluster, by principal component analysis (PCA). Project the points of each cluster onto its principal direction and take the distance between the two farthest projected points as the length of the cluster. The specific calculation formula is as follows:
[0221] ;
[0222] where the count term denotes the number of points in the cluster.
[0223] In this embodiment, the centroid coordinates of one cluster, Cluster_i, are (394.918, -15.270, 1410.790).
[0224] Step 4: Traverse the cluster set and, for each cluster, search for its 5 nearest-neighbor clusters. Traverse all unique pairwise combinations of these nearest-neighbor clusters and, with the current cluster and each pair as vertices, construct a stable triangle descriptor and store it in the descriptor set. Each triangle descriptor contains the length information of the three vertices, their direction vector information, and the lengths of the three sides.
[0225] In this embodiment, the length information, direction vector information, and three side lengths for a certain cluster Cluster_i are, respectively: (20.853, 17.697, 23.544), {(0.684429, -0.223967, -0.693827), (0.137216, 0.032886, -0.989995), (0.061187, 0.041865, -0.997248)}, and (267.545, 197.256, 103.936).
[0226] Step Nine: Design a two-stage feature association algorithm and similarity scoring function based on the vertex and side length information of the triangle descriptors;
[0227] Step 1: First stage: traverse the descriptor set of the source point cloud and build a hash table with the sum of the side lengths of each descriptor as the key and the triangle descriptor as the value;
[0228] Step 2: Traverse the triangle descriptors of the target point cloud. For each triangle descriptor, query the hash table of the source point cloud for the triangle descriptors whose side-length sum differs from its own by no more than the threshold. These form the set of potential descriptor pairs, and their number is the first-stage score of that triangle descriptor. Sort the scores in descending order to obtain the score set and the total set of descriptor pairs;
[0229] Step 3: Based on the total number of triangle descriptors, retain the top 20% of descriptors by score for second-stage feature association and remove the remaining descriptors from the set, yielding the filtered set of descriptor pairs;
[0230] Step 4: Traverse the filtered set. For each descriptor pair, subtract the vertex length information, direction vector information, and three side lengths of the two descriptors to obtain the vertex length difference, the direction vector difference, and the side length difference;
[0231] In this embodiment, for one descriptor pair the vertex length difference, direction vector difference, and side length difference are, respectively:
[0232] 7.821, 3.459, and 12.478;
[0233] Step 5: For each descriptor pair, check whether all three differences are below their thresholds; if not, remove the pair. Finally, for each descriptor, retain only the descriptor pair whose sum of the three differences is smallest. The calculation formula is as follows:
[0234] ;
[0235] In this embodiment, the minimum sum of differences for one descriptor is 10.724.
[0236] Step 6: Traverse the retained set of descriptor pairs and compute the pose transformation between the two descriptors of each pair. Apply this transformation to the remaining descriptor pairs and check whether the distances between the three transformed vertices and their counterparts fall within the threshold; each pair that meets the threshold adds 1 to the score. This yields the final score set. The descriptor pair with the highest score is the target descriptor pair, and its pose transformation is the coarse transformation pose.
[0237] In this embodiment, the coarse transformation pose value is:
[0238] ;
[0239] Step Ten: Design a nonlinear optimization function to obtain the precise pose of the filling port of the tanker truck.
[0240] Step 1: Apply the coarse transformation pose to obtain the transformed target point cloud;
[0241] Step 2: Establish the least-squares relationship between the transformed target point cloud and the source point cloud, expressed as follows:
[0242] ;
[0243] Step 3: Solve the least-squares function to obtain the high-precision transformation pose, and use it to compute the center position of the tanker truck's filling port. The calculation formula is as follows:
[0244] ;
[0245] where the remaining term is the pose of the source point cloud.
[0246] In this embodiment, Figure 6 shows the registration results for the point clouds of the two filling ports:
[0247] ;
[0248] Methanol filling port 1:
[0249] ;
[0250] Methanol filling port 2:
[0251] ;
[0252] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
[0253] In another aspect, the present invention also provides a computer storage medium storing executable program code; the executable program code is used to execute any of the above-mentioned methods for positioning the pose of a tanker truck filling port.
[0254] For example, the program code may be divided into one or more modules / units, which are stored in the memory and executed by the processor to complete the present invention.
[0255] The memory can be an internal storage unit of the terminal device, such as a hard drive or RAM. The memory can also be an external storage device of the terminal device, such as a plug-in hard drive, SmartMedia Card (SMC), Secure Digital (SD) card, or Flash Card. Furthermore, the memory can include both internal and external storage units of the terminal device. The memory is used to store the program code and other programs and data required by the terminal device. The memory can also be used to temporarily store data that has been output or will be output.
[0256] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0257] The embodiments described above merely illustrate several implementations of this application. Although the descriptions are relatively specific and detailed, they should not be construed as limiting the scope of the patent. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.
Claims
1. A method for locating the position of the filling port of a tanker truck, characterized in that the method includes:
acquiring image data and an initial point cloud of the filling port of the tanker truck;
constructing a visual detection model based on the image data of the filling port of the tanker truck, and coarsely locating the filling port using the visual detection model to obtain an approximate three-dimensional pose of the filling port of the tanker truck;
segmenting the initial point cloud based on the physical model of the filling port and the approximate three-dimensional pose to obtain local point cloud data containing only the filling port, which includes: modeling the actual filling port of the tanker truck as a cylindrical object of fixed size; obtaining the physical parameters and contour model of the filling port based on the MSAC fitting algorithm; and segmenting the initial point cloud, with the approximate three-dimensional pose as the center and the contour model of the filling port as the contour, to obtain the local point cloud data containing only the methanol filling port;
filtering the local point cloud data according to an adaptive filtering threshold function to obtain target point cloud data with optimal density;
extracting edge point features from the target point cloud data with optimal density based on smoothness to obtain a set of edge feature points;
clustering the set of edge feature points using Euclidean clustering to obtain a cluster set;
constructing triangle descriptors for the cluster set based on the stability of triangles;
pairing the triangle descriptors according to a feature association algorithm and a similarity scoring function to obtain a set of descriptor pairs, and performing coarse registration on the set of descriptor pairs to obtain a target descriptor pair and its coarse transformation pose; and
performing fine registration on the coarse transformation pose of the target descriptor pair using a nonlinear optimization function to obtain the accurate pose of the filling port of the tanker truck.
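The segmentation step of claim 1 can be illustrated with a minimal sketch. It assumes the cylinder axis, radius, and height are already known (for example from an MSAC fit, which is not shown here) and simply keeps the points that fall inside that cylinder around the coarse center. The function name `crop_cylinder` and all parameter names and values are illustrative assumptions, not taken from the patent.

```python
import math

def crop_cylinder(points, center, axis, radius, half_height):
    """Keep the points that lie inside a cylinder centred at `center`.

    `axis` is the cylinder axis direction (it need not be unit length).
    All names and values here are illustrative assumptions.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    u = [a / norm for a in axis]
    kept = []
    for p in points:
        d = [p[i] - center[i] for i in range(3)]
        along = sum(d[i] * u[i] for i in range(3))          # signed offset along the axis
        radial_sq = sum(c * c for c in d) - along * along   # squared distance from the axis
        if abs(along) <= half_height and radial_sq <= radius * radius:
            kept.append(p)
    return kept

# Toy cloud: two points near the coarse centre, one too far sideways, one too far along the axis.
cloud = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.05), (0.5, 0.0, 1.0), (0.0, 0.0, 2.0)]
local = crop_cylinder(cloud, center=(0.0, 0.0, 1.0), axis=(0.0, 0.0, 1.0),
                      radius=0.25, half_height=0.2)
print(local)
```

In practice the crop would usually be slightly larger than the fitted cylinder so that the rim of the port is not cut off by errors in the coarse pose.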
2. The method for locating the position of the filling port of a tanker truck according to claim 1, characterized in that constructing a visual detection model based on the image data of the filling port of the tanker truck, and coarsely locating the filling port using the visual detection model to obtain its approximate three-dimensional pose, includes:
acquiring real-time image data of the methanol filling port using a depth camera, the image data including information about the methanol filling port and its surrounding environment;
preprocessing the acquired image data and extracting features, and processing image data of different resolutions with multi-scale information fusion technology to obtain a training dataset;
inputting the training dataset into a YOLOv5 model based on temporal information learning and an attention mechanism for training, to obtain the YOLO visual detection model;
inputting an environmental image of the filling port location into the YOLO visual detection model for target detection and feature extraction, and outputting the two-dimensional pixel coordinates of the center point of the filling port; and
converting the two-dimensional pixel coordinates of the center point into the approximate three-dimensional pose of the filling port of the tanker truck.
3. The method for locating the position of the filling port of a tanker truck according to claim 2, characterized in that converting the two-dimensional pixel coordinates of the center point into the approximate three-dimensional pose of the filling port of the tanker truck includes:
obtaining the two-dimensional pixel coordinates of the center point of the filling port of the tanker truck from the visual detection model;
obtaining, from the depth map captured by the depth camera, the pixel value at the two-dimensional pixel coordinates of the center point of the filling port, and obtaining the depth value of the center of the filling port of the tanker truck through an affine function; and
obtaining, based on the depth value, the approximate three-dimensional pose of the filling port of the tanker truck in the camera coordinate system using the camera's three-dimensional projection model.
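The conversion in claim 3 can be sketched as follows, assuming a standard pinhole camera model and an affine depth mapping of the form depth = scale × raw + offset. The intrinsics `fx`, `fy`, `cx`, `cy`, the affine coefficients, and the pixel values are hypothetical placeholders. Only the position part of the pose is computed.

```python
def depth_from_raw(raw_value, scale=0.001, offset=0.0):
    """Affine map from a raw depth-map pixel value to metres (assumed coefficients)."""
    return scale * raw_value + offset

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at `depth` into camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics and detector output, for illustration only.
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
u, v = 380, 210      # centre pixel reported by the detection model
raw = 1500           # depth-map value at (u, v), assumed to be in millimetres
z = depth_from_raw(raw)
point = backproject(u, v, z, fx, fy, cx, cy)
print(point)
```

A real depth camera SDK normally supplies the intrinsics and the depth scale, so the constants above would be read from the device rather than hard-coded.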
4. The method for locating the position of the filling port of a tanker truck according to claim 1, characterized in that filtering the local point cloud data according to an adaptive filtering threshold function to obtain target point cloud data with optimal density includes:
obtaining, from the depth map captured by the camera, the pixel value at the center of the filling port, and obtaining the depth value at the center of the filling port of the tanker truck through an affine function;
constructing an adaptive filtering threshold function based on the depth value to obtain the optimal voxel filtering parameters; and
performing voxel downsampling on the local point cloud data based on the optimal voxel filtering parameters to obtain the target point cloud data with optimal density.
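The following sketch illustrates the idea behind claim 4. The claim does not state the form of the adaptive threshold function, so a clamped linear function of depth is assumed here purely for illustration. The coefficient `k` and the clamp limits are hypothetical. Voxel downsampling is implemented as a plain centroid-per-voxel reduction.

```python
import math
from collections import defaultdict

def adaptive_voxel_size(depth, k=0.004, min_size=0.002, max_size=0.02):
    """Hypothetical threshold function: voxel edge grows linearly with depth, clamped."""
    return max(min_size, min(max_size, k * depth))

def voxel_downsample(points, voxel):
    """Replace all points that fall into the same voxel by their centroid."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel) for c in p)
        buckets[key].append(p)
    return [tuple(sum(c) / len(b) for c in zip(*b)) for b in buckets.values()]

points = [(0.001, 0.001, 1.001), (0.002, 0.001, 1.002), (0.050, 0.000, 1.001)]
voxel = adaptive_voxel_size(depth=1.0)   # 0.004 m under the assumed coefficients
reduced = voxel_downsample(points, voxel)
print(voxel, len(reduced))
```

The first two points share a voxel and are merged into one centroid, while the third is kept on its own.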
5. The method for locating the position of the filling port of a tanker truck according to claim 1, characterized in that extracting edge point features from the target point cloud data with optimal density based on smoothness to obtain a set of edge feature points, clustering the set of edge feature points using Euclidean clustering to obtain a cluster set, and constructing triangle descriptors for the cluster set based on the stability of triangles, includes:
selecting the 5 points before and after each point in the target point cloud data with optimal density, calculating the curvature of each point, and selecting the set of edge feature points according to a smoothness threshold;
constructing a functional relationship between the clustering threshold and the optimal voxel filtering parameters, and clustering the set of edge feature points using the Euclidean clustering algorithm to obtain the cluster set;
obtaining the coordinates of the center point of each cluster, obtaining the direction vector of each cluster through principal component analysis (PCA), and projecting the coordinates of the center point of each cluster onto the direction vector to obtain the interval between two adjacent points, that is, the length value of each cluster; and
traversing the cluster set, searching for the 5 nearest neighbor clusters of each cluster, constructing stable triangle descriptors by taking the cluster as one vertex and traversing all unique pairwise combinations of its nearest neighbor clusters, and then removing the cluster from the cluster set.
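The edge extraction and clustering steps of claim 5 can be sketched as below. The curvature measure is an assumption: it is the norm of the summed offsets to the 5 neighbours on each side of a point in an ordered scan, in the style of LOAM-type smoothness. The clustering threshold is taken as twice the voxel size, which is also an assumed relationship. The clustering is a brute-force version meant for small inputs.

```python
import math

def smoothness(points, i, k=5):
    """Norm of the summed offsets from point i to its k neighbours on each side."""
    acc = [0.0, 0.0, 0.0]
    for j in range(i - k, i + k + 1):
        if j != i:
            for a in range(3):
                acc[a] += points[j][a] - points[i][a]
    return math.sqrt(sum(c * c for c in acc))

def edge_points(points, threshold, k=5):
    """Indices of points whose curvature exceeds the smoothness threshold."""
    return [i for i in range(k, len(points) - k)
            if smoothness(points, i, k) > threshold]

def euclidean_clusters(points, tol):
    """Group points so that each one is within `tol` of another member of its cluster."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            cur = frontier.pop()
            near = [j for j in unvisited if math.dist(points[cur], points[j]) <= tol]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return sorted(clusters)

# Toy ordered scan: a staircase polyline with two right-angle corners.
seg1 = [(i * 0.1, 0.0, 0.0) for i in range(11)]
seg2 = [(1.0, j * 0.1, 0.0) for j in range(1, 11)]
seg3 = [(1.0 + i * 0.1, 1.0, 0.0) for i in range(1, 11)]
scan = seg1 + seg2 + seg3

edge_idx = edge_points(scan, threshold=1.0)
voxel = 0.1                                   # assumed voxel size of the filtered cloud
clusters = euclidean_clusters([scan[i] for i in edge_idx], tol=2.0 * voxel)
print(edge_idx, clusters)
```

On this toy scan the points on the straight runs have near-zero curvature, so only the points around the two corners pass the threshold, and they fall into two separate clusters.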
6. The method for locating the position of the filling port of a tanker truck according to claim 5, characterized in that the triangle descriptor contains the length information of the three vertices, the direction vector information, and the length information of the three sides.
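A minimal sketch of how such descriptors might be assembled is shown below, following the traversal described in claim 5 (each cluster as a vertex, all unique pairs from its nearest neighbours, then removal of the cluster). The dictionary layout with `center`, `direction`, and `length` keys is an assumed representation, and the cluster values are made up for the example.

```python
import math
from itertools import combinations

def build_descriptors(clusters, k=5):
    """Build triangle descriptors from clusters (assumed keys: center, direction, length)."""
    remaining = list(range(len(clusters)))
    descriptors = []
    while len(remaining) >= 3:
        v = remaining.pop(0)                      # current vertex cluster, removed from the set
        cv = clusters[v]["center"]
        nearest = sorted(remaining,
                         key=lambda j: math.dist(cv, clusters[j]["center"]))[:k]
        for a, b in combinations(nearest, 2):     # all unique pairs of neighbour clusters
            ids = (v, a, b)
            c = [clusters[i]["center"] for i in ids]
            descriptors.append({
                "vertices": ids,
                "vertex_lengths": tuple(clusters[i]["length"] for i in ids),
                "directions": tuple(clusters[i]["direction"] for i in ids),
                "side_lengths": (math.dist(c[0], c[1]),
                                 math.dist(c[1], c[2]),
                                 math.dist(c[2], c[0])),
            })
    return descriptors

clusters = [
    {"center": (0.0, 0.0, 0.0), "direction": (1.0, 0.0, 0.0), "length": 0.10},
    {"center": (3.0, 0.0, 0.0), "direction": (0.0, 1.0, 0.0), "length": 0.20},
    {"center": (0.0, 4.0, 0.0), "direction": (0.0, 0.0, 1.0), "length": 0.15},
    {"center": (0.0, 0.0, 12.0), "direction": (1.0, 0.0, 0.0), "length": 0.30},
]
descs = build_descriptors(clusters)
print(len(descs), descs[0]["side_lengths"])
```

Removing each vertex cluster after it has been processed prevents the same triangle from being generated again from a different starting vertex.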
7. The method for locating the position of the filling port of a tanker truck according to claim 1, characterized in that pairing the triangle descriptors according to the feature association algorithm and the similarity scoring function to obtain a set of descriptor pairs, and performing coarse registration on the set of descriptor pairs to obtain the target descriptor pair and its coarse transformation pose, includes:
in a first stage, traversing the set of descriptors and constructing a hash table with the sum of the side lengths of each descriptor as the key and the triangle descriptor as the value; traversing the length cluster set of the target point cloud and, for each triangle descriptor, querying the hash table of the source point cloud for the number of triangle descriptors whose length difference is within the range, while obtaining the set of potential descriptor pairs, the number being the first-stage score of each triangle descriptor; and sorting the scores from largest to smallest to obtain the score set and the total set of descriptor pairs;
based on the total number of triangle descriptors, retaining the top 20% of the descriptors by score for feature association in a second stage and removing the other descriptors from the set, to obtain a new set of descriptor pairs;
traversing the new set of descriptor pairs and, for each pair of descriptors, subtracting the length information of the vertices, the direction vector information, and the length information of the three sides to obtain the vertex length difference, the direction vector difference, and the side length difference; removing any paired descriptor for which the three differences are not all less than the threshold; and retaining, in each set of paired descriptors, the descriptor pair with the minimum sum of the three differences; and
traversing the set of retained descriptor pairs, calculating the pose transformation between each pair of descriptors, applying the pose transformation to the remaining descriptor pairs, determining whether the distances between the three vertices of each transformed descriptor are within the threshold range and, if so, incrementing the score by 1, to finally obtain the score set, wherein the descriptor pair corresponding to the maximum score is the target descriptor pair and the corresponding pose transformation is the coarse transformation pose.
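The first stage of claim 7 can be sketched as a hash-table lookup on the perimeter of each triangle. For brevity, a descriptor is reduced here to its three side lengths. The bin width, the tolerance `tol`, and the rule of keeping at least one descriptor are assumptions made for the example. The second-stage difference test and the final transformation voting are not shown.

```python
import math
from collections import defaultdict

def build_table(source_descs, bin_size):
    """Hash source descriptors by their quantised perimeter (sum of side lengths)."""
    table = defaultdict(list)
    for idx, sides in enumerate(source_descs):
        table[math.floor(sum(sides) / bin_size)].append(idx)
    return table

def first_stage(target_descs, source_descs, tol):
    """Score each target descriptor by the number of source perimeters within `tol`."""
    table = build_table(source_descs, bin_size=tol)
    scored = []
    for t_idx, sides in enumerate(target_descs):
        perim = sum(sides)
        key = math.floor(perim / tol)
        # Check the neighbouring bins too, so matches near a bin boundary are not missed.
        cands = [s for k in (key - 1, key, key + 1) for s in table.get(k, [])
                 if abs(sum(source_descs[s]) - perim) <= tol]
        scored.append((len(cands), t_idx, cands))
    scored.sort(key=lambda item: -item[0])        # largest score first
    keep = max(1, math.ceil(0.2 * len(scored)))   # top 20 %, at least one descriptor
    return scored[:keep]

source = [(3.0, 4.0, 5.0), (3.0, 4.0, 5.05), (6.0, 8.0, 10.0), (1.0, 1.0, 1.0)]
target = [(3.0, 4.0, 5.02), (2.0, 2.0, 2.0), (6.0, 8.0, 10.3),
          (1.0, 1.0, 1.01), (9.0, 9.0, 9.0)]
top = first_stage(target, source, tol=0.1)
print(top)
```

Each returned tuple holds the first-stage score, the index of the target descriptor, and the indices of its candidate source descriptors, which together form the potential descriptor pairs passed to the second stage.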
8. The method for locating the position of the filling port of a tanker truck according to any one of claims 1-7, characterized in that performing fine registration on the coarse transformation pose of the target descriptor pair using a nonlinear optimization function to obtain the precise pose of the filling port of the tanker truck includes:
transforming the target point cloud according to the coarse transformation pose, and establishing a least squares relationship between the transformed target point cloud and the source point cloud;
solving the least squares relationship using the Gauss-Newton method to obtain a high-precision transformation pose; and
taking the product of the high-precision transformation pose and the source point cloud pose as the precise pose of the filling port of the tanker truck.
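The Gauss-Newton refinement in claim 8 can be illustrated with a deliberately simplified sketch. It works in the plane (one rotation angle and two translations) rather than in full three-dimensional pose space, and it assumes that point correspondences are already known. Both simplifications, and all names and values below, are assumptions made to keep the example short. The coarse pose plays the role of the initial guess.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for k in range(c, 4):
                M[r][k] -= f * M[c][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (M[r][3] - sum(M[r][k] * x[k] for k in range(r + 1, 3))) / M[r][r]
    return x

def gauss_newton_2d(src, dst, theta=0.0, tx=0.0, ty=0.0, iters=10):
    """Refine (theta, tx, ty) so that R(theta) * src + t matches dst in least squares."""
    for _ in range(iters):
        c, s = math.cos(theta), math.sin(theta)
        H = [[0.0] * 3 for _ in range(3)]          # Gauss-Newton approximation J^T J
        g = [0.0] * 3                              # gradient J^T r
        for (x, y), (u, v) in zip(src, dst):
            rx = c * x - s * y + tx - u            # residual, x component
            ry = s * x + c * y + ty - v            # residual, y component
            Jx = (-s * x - c * y, 1.0, 0.0)        # d(rx) / d(theta, tx, ty)
            Jy = (c * x - s * y, 0.0, 1.0)         # d(ry) / d(theta, tx, ty)
            for a in range(3):
                g[a] += Jx[a] * rx + Jy[a] * ry
                for b in range(3):
                    H[a][b] += Jx[a] * Jx[b] + Jy[a] * Jy[b]
        step = solve3(H, [-gi for gi in g])        # solve H * step = -g
        theta, tx, ty = theta + step[0], tx + step[1], ty + step[2]
        if max(abs(d) for d in step) < 1e-12:
            break
    return theta, tx, ty

# Synthetic data: rotate and shift a unit square by a known transform.
true_theta, true_t = 0.3, (0.5, -0.2)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ct, st = math.cos(true_theta), math.sin(true_theta)
dst = [(ct * x - st * y + true_t[0], st * x + ct * y + true_t[1]) for x, y in src]

# Start from a coarse estimate, as coarse registration would provide.
est = gauss_newton_2d(src, dst, theta=0.2, tx=0.4, ty=-0.1)
print(est)
```

A three-dimensional version follows the same pattern with six parameters, and the correspondences would typically be re-estimated by nearest-neighbour search between iterations.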
9. A computer storage medium storing executable program code, characterized in that the executable program code is used to execute the method for locating the position of the filling port of a tanker truck as described in any one of claims 1-8.