Method for monitoring status of grain wagon by fusing laser and visual image data
By fusing laser and visual data, combined with inertial measurement and deep learning, a dense point cloud is generated and error compensation is performed. This solves the problem of insufficient accuracy in the condition monitoring of grain transport vehicles caused by environment and vibration, and realizes high-precision calculation of loading volume and surface flatness, supporting the intelligent management of agricultural transport vehicles.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LINYI UNIVERSITY
- Filing Date
- 2025-11-17
- Publication Date
- 2026-06-12
AI Technical Summary
In the current technology for monitoring the condition of grain transport vehicles, the single-sensor method is easily affected by changes in ambient light, vehicle vibration and grain surface characteristics, resulting in insufficient monitoring accuracy and robustness, and making it difficult to achieve high-precision three-dimensional model construction in complex agricultural environments.
The monitoring method integrates laser and visual image data. It acquires texture and stripe images through a visual perception module, performs motion blur compensation by combining an inertial measurement unit, generates a dense point cloud, and uses Kalman filtering and deep learning networks for error compensation to calculate the loading volume and surface flatness.
It achieves high-precision and stable monitoring of grain transport vehicles in complex agricultural environments, accurately calculates loading volume and identifies abnormal conditions such as uneven loading, thus improving the stability and reliability of the monitoring system.
Figure CN121527708B_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent monitoring of agricultural machinery, specifically to a method for monitoring the status of grain transport vehicles that fuses laser and visual image data.
Background Technology
[0002] In the field of intelligent management of agricultural transport vehicles, real-time and accurate monitoring of the grain loading status of grain trucks is crucial for improving transportation efficiency and ensuring driving safety. Currently, common status monitoring methods mainly rely on a single type of sensor, such as using only a vision camera or only LiDAR. While vision methods are low-cost and can acquire rich texture information, they are easily affected by factors such as changes in ambient lighting and uneven reflectivity of the grain surface in actual operations, leading to decreased image quality and unstable feature extraction. In addition, vehicle vibration during transportation can introduce severe motion blur, further reducing the reliability of image analysis. Although LiDAR methods are not sensitive to lighting, when dealing with objects like grains that have complex surface textures and different reflectivity, the accuracy and completeness of the point cloud are easily affected by surface characteristics, and it is difficult to directly obtain texture information for auxiliary matching.
[0003] Existing technologies have attempted to combine visual and laser data, but these methods largely remain at a simple data overlay level, failing to deeply integrate the advantages of both types of data to construct high-precision 3D models. Particularly in dynamic operating environments, key issues remain unresolved: how to effectively compensate for image quality degradation and point cloud sequence errors caused by vehicle vibrations, and how to perform refined processing and error correction of point cloud data to address the specific characteristics of grain surfaces. These shortcomings limit the accuracy, robustness, and applicability of existing monitoring systems in complex agricultural environments.
Summary of the Invention
[0004] To address the technical problems mentioned in the background section, this invention proposes a method for monitoring the status of grain transport vehicles by fusing laser and visual image data.
[0005] Therefore, the technical solution adopted by the present invention is as follows:
[0006] S1: Based on a preset visual perception module, a texture image and a stripe image with laser stripes are acquired on the surface of the grain; the visual perception module is equipped with an inertial measurement unit for acquiring the attitude data of the visual perception module; motion blur compensation is performed on the texture image and stripe image based on the attitude data.
[0007] S2: Extract the center line of the laser stripe in the compensated stripe image, and combine it with the parallax information of binocular stereo vision to generate the first point cloud data of the grain surface; based on the compensated texture image, perform feature point matching and densification on the first point cloud data to obtain the second point cloud data;
[0008] S3: Input the second point cloud data into the preset compensation model and output the third point cloud data after error compensation; the compensation model is composed of Kalman filtering and deep learning network.
[0009] S4: Based on the third point cloud data, calculate the loading volume and surface flatness of the grain, compare them with a preset threshold, and output the status of the grain transport vehicle.
[0010] Furthermore, the visual perception module consists of a binocular stereo vision camera, a laser line projector, and the inertial measurement unit;
[0011] The binocular stereo vision camera is calibrated to obtain epipolar constraints;
[0012] The stripe image includes a left-eye stripe image and a right-eye stripe image; the texture image includes a left-eye texture image and a right-eye texture image.
[0013] Furthermore, the motion blur compensation includes point spread function modeling and image deconvolution restoration;
[0014] The point spread function modeling models the image degradation process using the attitude data. The point spread function can be represented as a linear motion blur model, expressed as follows:
[0015] $$h(x, y) = \begin{cases} \dfrac{1}{L}, & \sqrt{x^{2} + y^{2}} \le \dfrac{L}{2} \ \text{and} \ \dfrac{y}{x} = \tan\theta \\ 0, & \text{otherwise} \end{cases}$$
[0016] where $x$ and $y$ are coordinates on the image plane, $L$ is the blur length, and $\theta$ is the blur direction;
[0017] The image deconvolution restoration applies the Wiener filtering algorithm to the texture image and the stripe image, using the point spread function as the kernel function to restore a clear image. The formula is as follows:
[0018] $$\hat{f} = \mathcal{F}^{-1}\left[ \frac{\overline{\mathcal{F}(h)}}{\left|\mathcal{F}(h)\right|^{2} + K} \, \mathcal{F}(g) \right]$$
[0019] where $\hat{f}$ is the clear image after motion compensation, $\mathcal{F}^{-1}$ is the inverse Fourier transform operator, $\mathcal{F}$ is the Fourier transform operator, $g$ is the blurred image, $h$ is the point spread function, and $K$ is the regularization parameter.
[0020] Furthermore, the center line of the laser stripe is extracted using the Steger algorithm based on the Hessian matrix. The specific steps are as follows:
[0021] The compensated stripe image is then subjected to Gaussian filtering to smooth the noise.
[0022] Construct the Hessian matrix $H(x, y)$ for each pixel in the stripe image:
[0023] $$H(x, y) = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{bmatrix}$$
[0024] where $I_{xx}$ and $I_{yy}$ are the second-order partial derivatives of the stripe image at the pixel along the $x$ direction and the $y$ direction, respectively, and $I_{xy}$ is the mixed partial derivative of the stripe image at the pixel along the $x$ and $y$ directions;
[0025] Calculate the eigenvalues and corresponding eigenvectors of the Hessian matrix. The eigenvector corresponding to the eigenvalue with the larger absolute value indicates the local normal direction of the laser stripe at the pixel.
[0026] In the local normal direction, one-dimensional quadratic interpolation is performed on the pixel and its neighboring pixels to fit a continuous light intensity distribution curve.
[0027] The position at which the first derivative of the light intensity distribution curve is zero is the light intensity maximum point. By solving for the light intensity maximum point, the center line coordinates with sub-pixel precision can be obtained.
[0028] By traversing all the pixels in the laser stripe region of the stripe image, a set of light stripe centerline points can be obtained.
[0029] The light stripe centerline point set includes a left centerline point set and a right centerline point set.
[0030] Furthermore, the steps for acquiring the first point cloud data are as follows:
[0031] Based on the epipolar constraint and the preset matching rules, the matching point of each point in the left centerline point set is found from the right centerline point set, and the horizontal pixel displacement between the two points is the disparity.
[0032] Based on the disparity and the triangulation method, each point in the light stripe centerline point set is converted into three-dimensional coordinates. The conversion formula is as follows:
[0033] $$Z = \frac{f B}{d}, \qquad X = \frac{(u_{l} - c_{x})\, Z}{f}, \qquad Y = \frac{(v_{l} - c_{y})\, Z}{f}$$
[0034] where $X$, $Y$ and $Z$ are the three-dimensional coordinates of each point in the light stripe centerline point set, $f$ is the focal length, $B$ is the baseline distance, $(c_{x}, c_{y})$ are the principal point coordinates of the camera, $(u_{l}, v_{l})$ are the coordinates of the left centerline point, and $d$ is the disparity.
[0035] Furthermore, the specific steps of feature point matching are as follows:
[0036] The FAST corner detector is used to locate feature points in the texture image and calculate the principal orientation of the feature points; a binary string is generated for each feature point using the BRIEF descriptor;
[0037] The binary strings of the feature points in the left-eye texture image are compared with the binary strings of all the feature points in the right-eye texture image using the Hamming distance;
[0038] Nearest neighbor search is used to identify the feature points with the smallest Hamming distance in the right eye texture image as the initial matching pair;
[0039] The initial matching pairs are processed using the RANSAC algorithm based on epipolar geometry constraints to obtain matching pairs;
[0040] The three-dimensional coordinates of the matching points are calculated using triangulation, and these three-dimensional coordinates are added to the first point cloud data to obtain the intermediate point cloud.
[0041] Furthermore, the specific steps of the densification are as follows:
[0042] Each point in the intermediate point cloud is used as a seed point, and a tiny rectangular patch is generated in three-dimensional space around each seed point, the rectangular patch having a normal direction;
[0043] The three-dimensional position and normal direction of the rectangular patch are fine-tuned through an optimization algorithm;
[0044] The optimization algorithm generates new patches between the spatially adjacent rectangular patches that have high photometric consistency.
[0045] Extract the center points of all the rectangular patches to form an initial dense point cloud;
[0046] The outliers in the initial dense point cloud are removed to obtain the second point cloud data.
[0047] Furthermore, the compensation model is a serial two-stage architecture, consisting of a first stage and a second stage.
[0048] The first stage first defines the state vector, state equation, and observation equation for each point in the second point cloud data;
[0049] The state equation and observation equation are as follows:
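[0050] $$\mathbf{x}_{k} = \mathbf{F}\,\mathbf{x}_{k-1} + \mathbf{w}_{k-1}$$
[0051] $$\mathbf{z}_{k} = \mathbf{H}\,\mathbf{x}_{k} + \mathbf{v}_{k}$$
[0052] where $\mathbf{x}_{k}$ is the state vector, $\mathbf{F}$ is the state transition matrix, $\mathbf{H}$ is the observation matrix, $\mathbf{w}_{k-1}$ and $\mathbf{v}_{k}$ are the process noise and the observation noise, respectively, and $\mathbf{z}_{k}$ is the observation vector at time $k$;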
[0053] Then, Kalman filtering iterations are performed, and the iteration process is as follows:
[0054] Based on the state equation, predict the current state vector according to the state vector of the previous time step;
[0055] The predicted current state vector is corrected by combining the observed vector to obtain the optimal state vector at the current moment;
[0056] Finally, position information is extracted from the optimal state vector to obtain denoised point cloud data;
[0057] The second stage employs a deep learning network based on the PointNet++ architecture to further process the denoised point cloud data. The specific operations are as follows:
[0058] The deep learning network outputs the correction amount for each point in the denoised point cloud data, and corrects each point in the denoised point cloud data based on the correction amount to obtain the third point cloud data.
[0059] Furthermore, the calculation steps for the loading volume are as follows:
[0060] The third point cloud data is projected onto a horizontal plane, and the projection boundary of the grain surface on the horizontal plane is determined.
[0061] Obtain the grain box cavity model of the grain transport vehicle, and discretize the grain box cavity model into several layers in the vertical direction. At the same time, calculate the intersection area between the projection area of the third point cloud data of each layer and the cross-section of the grain box cavity model.
[0062] The loading volume $V$ is calculated as follows:
[0063] $$V = \sum_{i=1}^{N} S_{i}\, \Delta h$$
[0064] where $\Delta h$ is the thickness of each layer of the grain box cavity model, $N$ is the total number of layers of the grain box cavity model, and $S_{i}$ is the intersection area of the $i$-th layer of the grain box cavity model;
[0065] The loading rate of the grain transport vehicle is calculated based on the loading volume;
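By way of a non-limiting illustration, the layered summation of the loading volume can be sketched in Python as follows. The sketch assumes a simple rectangular grain box (a real cavity model would supply its own cross-section per layer), a height map built on a regular grid, and point heights measured upward from the box floor; the function name `loading_volume` and all of its parameters are hypothetical and do not appear in the method itself.

```python
import math

def loading_volume(points, length, width, height, cell=0.1, n_layers=50):
    """Approximate V = sum_i S_i * dh for a rectangular box (assumption).

    points : iterable of (x, y, z), z measured up from the box floor
    length, width, height : inner dimensions of the assumed rectangular box
    cell   : approximate grid cell size used to project the surface
    """
    nx = max(round(length / cell), 1)
    ny = max(round(width / cell), 1)
    dx, dy = length / nx, width / ny
    # Height map: highest grain point seen in each grid cell (0 if empty).
    surface = [[0.0] * ny for _ in range(nx)]
    for x, y, z in points:
        i, j = math.floor(x / dx), math.floor(y / dy)
        if 0 <= i < nx and 0 <= j < ny:
            surface[i][j] = max(surface[i][j], min(z, height))
    dh = height / n_layers
    volume = 0.0
    for layer in range(n_layers):
        z_mid = (layer + 0.5) * dh
        # S_i: area of the cells whose grain surface reaches this layer.
        s_i = dx * dy * sum(h >= z_mid for row in surface for h in row)
        volume += s_i * dh
    return volume
```

Under the same assumption, the loading rate would be the returned volume divided by `length * width * height`.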
[0066] The calculation steps for the surface flatness are as follows:
[0067] By performing plane fitting on each point in the third point cloud data, a reference plane equation representing the overall surface trend of the grain is obtained:
[0068] $$A x + B y + C z + D = 0$$
[0069] where $A$, $B$, $C$ and $D$ are the coefficients of the plane equation, and $x$, $y$ and $z$ are the three-dimensional coordinates of each point in the third point cloud data;
[0070] Calculate the vertical distance from each point in the third point cloud data to the reference plane;
[0071] The surface flatness is defined as the standard deviation of the vertical distance of all points in the third point cloud data.
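As an illustrative sketch only, the plane fitting and flatness calculation may be written in Python with NumPy as follows. The sketch assumes a total least squares fit via singular value decomposition (the fitting technique is not fixed by the method) and interprets the distance to the reference plane as the perpendicular distance; the function name `surface_flatness` is hypothetical.

```python
import numpy as np

def surface_flatness(points):
    """Fit A*x + B*y + C*z + D = 0 and return (flatness, (A, B, C, D)).

    Flatness is the standard deviation of the point-to-plane distances.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # unit normal of the best-fit plane through the centroid.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    normal = vt[-1]
    d = -float(normal @ centroid)
    # Signed perpendicular distances (the normal has unit length).
    dist = pts @ normal + d
    return float(dist.std()), (float(normal[0]), float(normal[1]), float(normal[2]), d)
```

A flatness value compared against a preset threshold could then flag uneven loading.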
[0072] Furthermore, the status of the grain transport vehicle includes at least the fully loaded status, the partially loaded status, and the current loading rate.
[0073] Compared with the prior art, the advantages of the present invention are as follows:
[0074] 1. This invention generates a more complete and refined dense point cloud of grain surface by deeply fusing the precise three-dimensional contour information of laser stripes with the rich texture features of visual images, and by using feature matching and patch densification technology. This effectively overcomes the limitations of single sensor technology and provides a more reliable data foundation for volume and flatness calculation.
[0075] 2. This invention innovatively introduces motion blur compensation and a two-stage point cloud error compensation model based on inertial measurement, which can effectively suppress image quality degradation and point cloud temporal noise caused by vehicle vibration, and correct complex nonlinear errors, thereby improving the stability and reliability of the system in real and dynamic agricultural operation environments.
[0076] 3. Based on three-dimensional point clouds, this invention can not only accurately calculate the grain loading volume, but also introduce surface flatness as a key evaluation indicator. It can automatically identify abnormal states such as uneven loading, realizing more comprehensive and intelligent monitoring and early warning of the loading status of grain transport vehicles, and providing strong support for the intelligent management of agricultural transport vehicles.
Attached Figure Description
[0077] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0078] Figure 1 is a schematic diagram of the method flow of the present invention;
[0079] Figure 2 is a schematic diagram of the visual perception module of the present invention;
[0080] Figure 3 is a schematic diagram of the process for extracting the center line of the light stripe according to the present invention.
Detailed Implementation
[0081] To achieve the above objectives, the present invention provides a method for monitoring the status of grain transport vehicles by fusing laser and visual image data. Referring to Figure 1, the method includes:
[0082] S1: Based on a preset visual perception module, a texture image and a stripe image with laser stripes are acquired on the surface of the grain; the visual perception module is equipped with an inertial measurement unit for acquiring the attitude data of the visual perception module; motion blur compensation is performed on the texture image and stripe image based on the attitude data.
[0083] The visual perception module consists of a binocular stereo vision camera, a structured laser line projection device, and an inertial measurement unit, as shown in Figure 2;
[0084] The binocular stereo vision camera and laser projection device are coaxially mounted on a rigid bracket and fixed at the optimal observation point above the grain tank of the grain truck to ensure that the field of view covers the entire grain loading surface. The inertial measurement unit is rigidly connected to the binocular vision camera and is usually directly mounted on the housing of the binocular vision camera or on a rigid bracket fixed to it.
[0085] Before acquiring data with the binocular stereo vision camera, the parameters of the two cameras need to be calibrated. The parameters that need to be calibrated are as follows:
[0086] Focal length and principal point coordinates: determine the mapping relationship between pixel coordinates and the direction of light in the real world;
[0087] Distortion coefficient: Used to correct image distortion caused by the lens itself, ensuring that straight lines in the image are also straight lines in reality, which is a prerequisite for obtaining accurate measurement results;
[0088] Rotation matrix and translation vector: The horizontal component of the translation vector T is the key baseline distance. This set of parameters defines the spatial position of the right camera relative to the left camera.
[0089] Stereo rectification: The imaging planes of the two cameras are transformed onto a common plane so that they are strictly parallel, and the position of the same physical point in the left and right images differs only in the horizontal direction, that is, the vertical coordinates are the same. After stereo rectification, the epipolar constraint is obtained.
[0090] The specific steps for data collection are as follows:
[0091] First, turn off the laser projection device and then simultaneously capture a frame of texture image of the grain surface using the left and right cameras of the binocular stereo vision camera.
[0092] Then the laser projection device is turned on and a visible laser stripe of a specific wavelength is projected. The left and right cameras of the binocular stereo vision camera simultaneously acquire a frame of stripe image with laser stripes.
[0093] This time-sharing acquisition mode avoids the mutual interference between natural light and laser, and ensures strict spatial correspondence between texture and stripe images through hardware synchronization.
[0094] While acquiring images, the inertial measurement unit acquires attitude data from the visual perception module in real time at a frequency higher than that of the binocular stereo vision camera. The attitude data includes triaxial angular velocity and triaxial linear acceleration.
[0095] In this embodiment, Kalman filtering is used to filter and fuse the attitude data to calculate the attitude angle and motion displacement vector of the visual perception module at the moment of image exposure. The attitude angle includes roll angle, pitch angle and yaw angle.
[0096] Motion blur compensation includes point spread function modeling and image deconvolution restoration;
[0097] Point spread function modeling models the image degradation process using the attitude data. The point spread function can be characterized as a linear motion blur model, and the blur direction $\theta$ and blur length $L$ are calculated as follows:
[0098] $$\theta = \arctan\left(\frac{\Delta y}{\Delta x}\right), \qquad L = \sqrt{\Delta x^{2} + \Delta y^{2}}$$
[0099] where $\Delta x$ and $\Delta y$ are the displacement differences of the visual perception module along the $x$ axis and the $y$ axis during the image exposure time, obtained by time integration of the linear acceleration data measured by the inertial measurement unit, and $\arctan$ is the arctangent function;
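The calculation of the blur direction and blur length from inertial data can be illustrated by the following minimal Python sketch. It assumes acceleration samples taken at a fixed interval during the exposure, a known velocity at the start of exposure, and a hypothetical scale factor `px_per_m` that converts displacement into pixels on the image plane; none of these names come from the method itself. The two-argument arctangent is used so that the case of zero displacement along the x axis does not cause a division by zero.

```python
import math

def blur_parameters(acc_x, acc_y, dt, v0=(0.0, 0.0), px_per_m=1.0):
    """Integrate linear acceleration twice over the exposure window.

    acc_x, acc_y : acceleration samples along the image x and y axes
    dt           : sampling interval of the inertial measurement unit
    v0           : velocity at the start of exposure (assumed known)
    px_per_m     : hypothetical displacement-to-pixel scale factor
    Returns (theta, length): blur direction (radians) and blur length.
    """
    vx, vy = v0
    dx = dy = 0.0
    for ax, ay in zip(acc_x, acc_y):
        # Update the velocity first, then accumulate the displacement.
        vx += ax * dt
        vy += ay * dt
        dx += vx * dt
        dy += vy * dt
    theta = math.atan2(dy, dx)
    length = math.hypot(dx, dy) * px_per_m
    return theta, length
```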
[0100] The mathematical representation of the point spread function is as follows:
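[0101] $$h(x, y) = \begin{cases} \dfrac{1}{L}, & \sqrt{x^{2} + y^{2}} \le \dfrac{L}{2} \ \text{and} \ \dfrac{y}{x} = \tan\theta \\ 0, & \text{otherwise} \end{cases}$$
[0102] where $x$ and $y$ are coordinates on the image plane;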
[0103] Image deconvolution restoration processes the texture image and the stripe image using the Wiener filtering algorithm, employing the point spread function as the kernel function to restore a clear image. The formula is as follows:
[0104] $$\hat{f} = \mathcal{F}^{-1}\left[ \frac{\overline{\mathcal{F}(h)}}{\left|\mathcal{F}(h)\right|^{2} + K} \, \mathcal{F}(g) \right]$$
[0105] where $\hat{f}$ is the clear image after motion compensation, $\mathcal{F}^{-1}$ is the inverse Fourier transform operator, $\mathcal{F}$ is the Fourier transform operator, $g$ is the blurred image, $h$ is the point spread function, and $K$ is the regularization parameter that balances the degree of deblurring against noise suppression. It is a positive constant that is either preset or adaptively adjusted according to the image noise characteristics, and it is usually derived from the image signal-to-noise ratio to prevent noise amplification during the restoration process.
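For illustration only, a linear motion blur kernel and the Wiener restoration can be sketched in Python with NumPy as follows. The sketch assumes a grayscale image, circular (FFT-based) convolution, and a blur length expressed in pixels; the function names `motion_psf` and `wiener_deblur` and the default value of `k` are hypothetical choices, not values given by the method.

```python
import numpy as np

def motion_psf(shape, length, theta):
    """Linear motion blur kernel: a normalised line segment of the given
    length (pixels) and direction theta (radians), centred in the array."""
    h = np.zeros(shape, dtype=float)
    cy, cx = shape[0] // 2, shape[1] // 2
    n = max(int(np.ceil(length)) * 4, 1)
    for t in np.linspace(-length / 2.0, length / 2.0, n):
        x = int(round(cx + t * np.cos(theta)))
        y = int(round(cy + t * np.sin(theta)))
        if 0 <= y < shape[0] and 0 <= x < shape[1]:
            h[y, x] = 1.0
    return h / h.sum()

def wiener_deblur(g, h, k=1e-3):
    """Restore an image from blurred image g and kernel h (same shape).

    Computes IFFT( conj(H) / (|H|^2 + k) * G ), where k is the
    regularization parameter."""
    # Move the kernel centre to the origin before transforming.
    H = np.fft.fft2(np.fft.ifftshift(h))
    G = np.fft.fft2(g)
    F_hat = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F_hat))
```

A larger `k` suppresses noise more strongly at the cost of a softer result, which matches the trade-off described for the regularization parameter.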
[0106] S2: Extract the center line of the laser stripe in the compensated stripe image, and combine it with the parallax information of binocular stereo vision to generate the first point cloud data of the grain surface; based on the compensated texture image, perform feature point matching and densification on the first point cloud data to obtain the second point cloud data;
[0107] After motion blur compensation, the laser lines in the stripe image become clearer. For accurate measurement, the center line of the laser stripe in the stripe image must be extracted with sub-pixel accuracy. This invention uses the Steger algorithm based on the Hessian matrix for this operation. Referring to Figure 3, the specific steps are as follows:
[0108] Gaussian filtering is applied to the compensated stripe image to smooth the noise;
[0109] Calculate the first-order gradient and second-order derivatives of each pixel in the stripe image, and construct the Hessian matrix $H(x, y)$ of the pixel:
[0110] $$H(x, y) = \begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{bmatrix}$$
[0111] where $I_{xx}$ and $I_{yy}$ are the second-order partial derivatives of the stripe image at the pixel along the $x$ direction and the $y$ direction, respectively, and $I_{xy}$ is the mixed partial derivative of the stripe image at the pixel along the $x$ and $y$ directions;
[0112] The eigenvalues and corresponding eigenvectors of the Hessian matrix are calculated to quantify and locate the shape and orientation of local regions in the image. The eigenvectors corresponding to eigenvalues with larger absolute values indicate the local normal direction of the laser stripes at the pixel.
[0113] In the local normal direction, one-dimensional quadratic interpolation is performed on the pixel and its neighboring pixels to fit a continuous light intensity distribution curve.
[0114] The position at which the first derivative of the light intensity distribution curve is zero is the light intensity maximum point. By solving for the light intensity maximum point, the center line coordinates with sub-pixel precision can be obtained.
[0115] By traversing all the pixels in the laser stripe region of the stripe image, a continuous and precise set of sub-pixel level light stripe centerline points can be obtained.
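The extraction steps above can be illustrated by the following minimal Python sketch using NumPy. It assumes a grayscale stripe image with a bright stripe, computes the derivatives by convolution with Gaussian derivative kernels, and uses a crude intensity threshold (half of the maximum) to select the laser stripe region; the threshold, the kernel radius and the function names are assumptions made only for this sketch.

```python
import numpy as np

def gaussian_kernels(sigma):
    """1-D Gaussian and its first and second derivatives."""
    r = int(4 * sigma + 0.5)
    x = np.arange(-r, r + 1, dtype=float)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return g, -x / sigma ** 2 * g, (x ** 2 - sigma ** 2) / sigma ** 4 * g

def sep_filter(img, k_x, k_y):
    """Separable convolution: k_x along the x axis, then k_y along y."""
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k_x, mode="same"), 1, img)
    return np.apply_along_axis(lambda v: np.convolve(v, k_y, mode="same"), 0, tmp)

def steger_centerline(img, sigma=2.0):
    """Return sub-pixel (x, y) centerline points of a bright stripe."""
    img = np.asarray(img, dtype=float)
    g, g1, g2 = gaussian_kernels(sigma)
    rx, ry = sep_filter(img, g1, g), sep_filter(img, g, g1)
    rxx, ryy = sep_filter(img, g2, g), sep_filter(img, g, g2)
    rxy = sep_filter(img, g1, g1)
    points = []
    mask = img > 0.5 * img.max()  # crude stripe region (assumption)
    for y, x in zip(*np.nonzero(mask)):
        hessian = np.array([[rxx[y, x], rxy[y, x]], [rxy[y, x], ryy[y, x]]])
        w, v = np.linalg.eigh(hessian)
        i = int(np.argmax(np.abs(w)))  # eigenvalue with larger magnitude
        if w[i] >= 0:                  # a bright ridge needs negative curvature
            continue
        nx, ny = v[0, i], v[1, i]      # local normal direction of the stripe
        # Zero of the first derivative of the quadratic fit along the normal.
        t = -(rx[y, x] * nx + ry[y, x] * ny) / w[i]
        if abs(t * nx) <= 0.5 and abs(t * ny) <= 0.5:
            points.append((x + t * nx, y + t * ny))
    return points
```

Only offsets that stay within the current pixel are accepted, so each point on the center line is reported once by the pixel that contains it.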
[0116] After acquiring the centerline point sets of the laser stripes from the perspectives of the left and right cameras, a 3D point cloud, i.e., the first point cloud data, is generated using the principle of binocular stereo vision. The specific steps are as follows:
[0117] For each centerline point in the left centerline point set, a candidate point set for its matching point is searched within the right centerline point set based on the epipolar constraint. The matching point is then determined from the candidate point set based on preset matching rules, including the continuity and curvature of the centerline. This matching point is the projection of the left centerline point onto the right-eye stripe image, and the horizontal pixel displacement between the two points is the disparity $d$:
[0118] $$d = u_{l} - u_{r}$$
[0119] where $u_{l}$ is the x-coordinate of the left centerline point and $u_{r}$ is the x-coordinate of the matching point;
[0120] The epipolar constraint restricts the possible locations, in the right-eye image, of a point in the left-eye image to a finite range along a straight line.
[0121] Parallax is inversely proportional to physical depth;
[0122] The center line point of the light stripe and its parallax are converted into three-dimensional point coordinates (X, Y, Z) using triangulation. The conversion formula is as follows:
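[0123] $$Z = \frac{f B}{d}, \qquad X = \frac{(u_{l} - c_{x})\, Z}{f}, \qquad Y = \frac{(v_{l} - c_{y})\, Z}{f}$$
[0124] where $f$ is the focal length, $B$ is the baseline distance, $(c_{x}, c_{y})$ are the principal point coordinates of the camera, $(u_{l}, v_{l})$ are the coordinates of the left centerline point, and $d$ is the disparity.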
[0125] By traversing all successfully matched centerline point pairs of the light stripes, sparse but precise first point cloud data covering the area scanned by the laser lines can be generated. The first point cloud data directly reflects the three-dimensional contour of the grain surface.
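As a simple illustration, the conversion of one matched pair of centerline points into three-dimensional coordinates can be sketched in Python as follows. The sketch assumes rectified images, a focal length expressed in pixels, and coordinates in the left camera frame; the function name `triangulate` and the parameter names are hypothetical.

```python
def triangulate(u_l, v_l, u_r, f, baseline, cx, cy):
    """Convert a left centerline point and its match to (X, Y, Z).

    u_l, v_l : pixel coordinates of the left centerline point
    u_r      : x-coordinate of the matching point in the right image
    f        : focal length in pixels
    baseline : baseline distance between the two cameras
    cx, cy   : principal point coordinates
    """
    d = u_l - u_r  # disparity
    if d <= 0:
        raise ValueError("disparity must be positive for a valid match")
    z = f * baseline / d  # depth is inversely proportional to disparity
    x = (u_l - cx) * z / f
    y = (v_l - cy) * z / f
    return x, y, z
```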
[0126] The specific steps of feature point matching are as follows:
[0127] The ORB algorithm is used to process both the left and right eye texture images simultaneously. First, the FAST corner detector is used to locate the feature points with obvious characteristics in the image (such as the edges and corners of grain particles). Then, the principal orientation of the feature points is calculated to make the key points rotation invariant (so that they can be matched correctly even if the vehicle is slightly tilted). Finally, the BRIEF descriptor is used to generate a binary string for each feature point. This string compactly represents the unique pattern of the pixel block around the key point. The advantage of the binary descriptor is that the calculation and matching speed is extremely fast.
[0128] For feature points in the left-eye texture image, the binary string is compared with the binary strings of all feature points in the right-eye texture image using Hamming distance, which is the number of corresponding bits that are different between the two binary strings. The smaller the Hamming distance, the higher the similarity.
[0129] For each feature point in the left-eye texture image, nearest neighbor search is used to find the feature point with the smallest Hamming distance in the right-eye texture image, and the two form an initial matching pair;
[0130] Since the initial matching pairs contain a large number of erroneous matches, the RANSAC algorithm based on epipolar geometry constraints is applied to the initial matching pairs to remove the erroneous matches and obtain the final matching pairs. The specific operation is as follows:
[0131] A minimal subset is randomly selected from the initial matching pairs to estimate a fundamental matrix, which encodes the epipolar geometry between the binocular cameras. All initial matching pairs are tested against this fundamental matrix, and the number of matching pairs that conform to the epipolar geometry is counted. This process is repeated iteratively, and the fundamental matrix supported by the largest number of conforming matching pairs is finally adopted. All matching pairs that do not conform to its epipolar geometry are discarded as outliers.
[0132] The three-dimensional coordinates of the matching points are calculated using triangulation, and these three-dimensional coordinates are added to the first point cloud data to obtain the intermediate point cloud.
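The initial matching by Hamming distance can be illustrated by the following minimal Python sketch. It assumes that each binary descriptor is stored as a Python integer and performs a brute-force nearest neighbor search; the optional `max_distance` rejection threshold and the function names are assumptions added for the sketch, and the subsequent RANSAC filtering is not shown.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary descriptors differ."""
    return bin(a ^ b).count("1")

def match_nearest(left_desc, right_desc, max_distance=64):
    """For each left descriptor, find the right descriptor with the
    smallest Hamming distance. Returns (left_index, right_index, distance)."""
    matches = []
    if not right_desc:
        return matches
    for i, dl in enumerate(left_desc):
        j, dist = min(
            ((j, hamming(dl, dr)) for j, dr in enumerate(right_desc)),
            key=lambda pair: pair[1],
        )
        if dist <= max_distance:  # hypothetical rejection threshold
            matches.append((i, j, dist))
    return matches
```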
[0133] The intermediate point cloud is densified using a patch-based multi-view stereo matching algorithm. The specific steps are as follows:
[0134] Using each point in the intermediate point cloud as a seed point, a tiny rectangular patch is generated in three-dimensional space around each seed point. This patch has a normal direction, which can be estimated from the neighboring points of the seed point.
[0135] For each patch, its 3D position and normal direction are fine-tuned through an optimization algorithm to maximize the photometric consistency of the patch across all visible viewpoints. If a patch fails to meet the consistency criteria (e.g., due to its location on an object boundary or in an occluded area), it is discarded.
[0136] Between spatially adjacent patches with high photometric consistency, the optimization algorithm generates new patches. For example, between two existing patches, a new patch may be generated to fill the blank area. This process is like seed germination, iterating continuously over the surface area of the grain until no new patches that meet the consistency constraints can be generated.
[0137] Extract the center points of all the patches to form an initial dense point cloud;
[0138] In this embodiment, statistical filtering and other methods are used to remove isolated outliers caused by noise, ensuring the cleanliness of point cloud data.
[0139] The final output is high-density, high-precision second point cloud data, which fully and meticulously reflects the three-dimensional topological structure of the grain surface.
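One possible form of the statistical filtering mentioned above is sketched below in Python with NumPy, for illustration only. It assumes a brute-force neighbor search suitable for small point clouds and rejects points whose mean distance to their `k` nearest neighbors exceeds the global mean by more than `std_ratio` standard deviations; the function name and both parameters are hypothetical.

```python
import numpy as np

def remove_statistical_outliers(points, k=8, std_ratio=2.0):
    """Keep points whose mean k-nearest-neighbor distance is not unusually large.

    The pairwise distance matrix is O(n^2) in memory, so this sketch is
    only intended for small point clouds."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # After sorting each row, column 0 is the zero distance to the point itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_d = knn.mean(axis=1)
    threshold = mean_d.mean() + std_ratio * mean_d.std()
    return pts[mean_d <= threshold]
```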
[0140] S3: Input the second point cloud data into the preset compensation model and output the third point cloud data after error compensation; the compensation model is composed of Kalman filtering and deep learning network.
[0141] The second point cloud data may contain the following errors:
[0142] Temporal correlation error: Due to the continuous vibration of the grain transport vehicle during operation, the point cloud data collected by the visual perception module is correlated in the time series, which manifests as high-frequency jitter or low-frequency drift.
[0143] Systematic nonlinear error: Affected by factors such as temperature changes in the actual working environment, uneven reflection characteristics of grain surface, and residual distortion of camera lens, point cloud data may have nonlinear distortions that are difficult to be completely corrected by traditional geometric models.
[0144] Therefore, the second point cloud data needs further processing.
[0145] The compensation model is a serial two-stage architecture, consisting of a first stage and a second stage.
[0146] The first stage models the generation process of the second point cloud as a dynamic system, which independently tracks each point in the second point cloud data.
[0147] Based on the assumption that the grain transport vehicle is in a stable driving state or briefly stationary during monitoring, a constant-velocity model is used as the basis for Kalman filtering.
[0148] The state vector of each point in the second point cloud data is defined to include the position and velocity information of the point. The velocity here does not represent the physical speed of motion, but rather the trend of spatial position change between adjacent points in the point cloud. This trend helps Kalman filtering establish the assumption that the point cloud coordinates change continuously, thereby smoothing out abnormal jumps in the coordinates of individual points and achieving a noise reduction effect.
[0149] The state equation and observation equation for each point in the second point cloud data are defined as follows:
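[0150] $$\mathbf{x}_{k} = \mathbf{F}\,\mathbf{x}_{k-1} + \mathbf{w}_{k-1}$$
[0151] $$\mathbf{z}_{k} = \mathbf{H}\,\mathbf{x}_{k} + \mathbf{v}_{k}$$
[0152] where $\mathbf{x}_{k}$ is the state vector at time $k$; $\mathbf{F}$ is the state transition matrix, which describes how the dynamic system evolves naturally from the state at time $k-1$ to the state at time $k$ based on physical assumptions about its dynamic characteristics; $\mathbf{H}$ is the observation matrix; $\mathbf{w}_{k-1}$ and $\mathbf{v}_{k}$ are the process noise and the observation noise, respectively; and $\mathbf{z}_{k}$ is the observation vector at time $k$, taken from the measured values of the second point cloud data;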
[0153] The process noise and observation noise are modeled as zero-mean Gaussian white noise. Their covariance matrices are determined by experimental calibration, for example, by calculating the variance of the point cloud coordinates when the same static target is measured repeatedly.
[0154] Kalman filtering recursively performs prediction and update, with the following iterative process:
[0155] Predict the current state vector based on the state equation and the state vector of the previous time step.
[0156] The predicted current state vector is corrected by combining the current observation vector to obtain the optimal state vector at the current moment;
[0157] Position information is extracted from the optimal state vector to obtain denoised point cloud data;
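The predict, update, and position-extraction steps of the first stage can be sketched as follows. This is a minimal illustration rather than the patented implementation: the time step `dt`, the isotropic noise levels `q` and `r`, the identity initial covariance, and the zero initial velocity are all assumed values chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def make_cv_model(dt, q, r):
    """Constant-velocity model for one 3D point; state is [x, y, z, vx, vy, vz]."""
    I3 = np.eye(3)
    F = np.block([[I3, dt * I3], [np.zeros((3, 3)), I3]])  # state transition matrix
    H = np.hstack([I3, np.zeros((3, 3))])                  # only position is observed
    Q = q * np.eye(6)  # process noise covariance (assumed isotropic)
    R = r * np.eye(3)  # observation noise covariance (assumed isotropic)
    return F, H, Q, R

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle for a single tracked point."""
    # Predict the current state from the previous state.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the current observation.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

def denoise_track(observations, dt=1.0, q=1e-4, r=1e-2):
    """Filter the observed positions of one point over time, shape (T, 3)."""
    F, H, Q, R = make_cv_model(dt, q, r)
    x = np.concatenate([observations[0], np.zeros(3)])  # assumed zero initial velocity
    P = np.eye(6)                                       # assumed initial covariance
    out = [observations[0].copy()]
    for z in observations[1:]:
        x, P = kalman_step(x, P, z, F, H, Q, R)
        out.append(x[:3].copy())  # keep only the position part of the state
    return np.array(out)
```

In practice `q` and `r` would come from the experimental calibration described above, and each point of the cloud would carry its own state and covariance.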
[0158] The second stage uses a deep learning network based on the PointNet++ architecture to process the denoised point cloud data. The specific steps are as follows:
[0159] The deep learning network first divides the denoised point cloud data into local blocks and embeds features for each point through a multilayer perceptron. It then selects key points by farthest point sampling and aggregates neighborhood information at multiple scales through set abstraction layers, thereby extracting local geometric features with a hierarchical structure. These features effectively capture key information such as the surface properties and normal direction of the grain surface.
[0160] Based on the local geometric features, the network outputs a correction amount for each point in the denoised point cloud data, and each point is corrected by its correction amount to obtain the third point cloud data.
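The key-point selection used by PointNet++-style networks, farthest point sampling, can be illustrated in isolation. The sketch below is a plain NumPy version under the assumption that the cloud fits in memory as an (N, 3) array; the function name and the `start` index are hypothetical, and a real network would run an equivalent routine inside its set abstraction layers.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Return indices of k points chosen so that each new point is the one
    farthest from all points selected so far."""
    selected = np.empty(k, dtype=int)
    selected[0] = start
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, k):
        nxt = int(np.argmax(min_dist))
        selected[i] = nxt
        min_dist = np.minimum(min_dist, np.linalg.norm(points - points[nxt], axis=1))
    return selected
```

Because each pick maximizes the distance to the already selected set, the key points spread evenly over the grain surface instead of clustering where the cloud is densest.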
[0161] The training of deep learning networks relies on a large dataset of paired point clouds that is synchronously acquired by a high-precision 3D ground scanner (as ground truth) and the system's visual perception module.
[0162] The loss function combines chamfer distance and point-to-point mean square error to ensure that the compensated point cloud is highly consistent with the real surface in both overall shape and local detail.
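A combined loss of this kind might look like the sketch below. The weights `alpha` and `beta`, the use of squared distances in the Chamfer term, and the assumption that predicted and ground-truth points are paired by index for the point-to-point term are illustrative choices, not details given in the text.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance (squared-distance variant) between two clouds."""
    # Pairwise squared distances, shape (len(a), len(b)).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def pointwise_mse(pred, target):
    """Mean squared distance between points paired by index."""
    return ((pred - target) ** 2).sum(axis=1).mean()

def combined_loss(pred, target, alpha=1.0, beta=1.0):
    """Weighted sum of the overall-shape term and the local-detail term."""
    return alpha * chamfer_distance(pred, target) + beta * pointwise_mse(pred, target)
```

The Chamfer term does not need correspondences and constrains the overall shape, while the point-to-point term penalizes each paired point individually and so constrains local detail.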
[0163] S4: Based on the third point cloud data, calculate the loading volume and surface flatness of the grain, compare them with a preset threshold, and output the status of the grain transport vehicle.
[0164] The loading volume is obtained by calculating the spatial volume enclosed by the 3D model of the grain bin cavity and the point cloud of the grain surface. The specific steps are as follows:
[0165] When the grain transport vehicle is empty, point cloud data of the inside of the grain bin is collected through the visual perception module to build the grain bin cavity model;
[0166] The third point cloud data is projected onto the horizontal plane, and the projection boundary of the grain surface on the horizontal plane is determined;
[0167] The grain bin cavity model is discretized into several layers in the vertical direction;
[0168] For each layer, the intersection area between the projection area of the grain surface point cloud and the cross-section of the grain bin cavity model at that layer height is calculated;
[0169] The loading volume $V$ is calculated as follows:
[0170] $$V = \sum_{i=1}^{N} S_i \, \Delta h$$
[0171] where $\Delta h$ is the thickness of each layer of the grain bin cavity model, $N$ is the total number of layers in the grain bin cavity model, and $S_i$ is the intersection area at the $i$-th layer of the grain bin cavity model;
[0172] The loading rate of a grain transport vehicle is calculated based on the loading volume and the volume of the grain bin.
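As a small worked example of the layered sum and the loading rate, the sketch below takes the per-layer intersection areas as already computed. The function names, the units, and the sample numbers are hypothetical.

```python
def loading_volume(layer_areas, layer_thickness):
    """Sum of per-layer intersection areas times the common layer thickness."""
    return sum(area * layer_thickness for area in layer_areas)

def loading_rate(volume, bin_volume):
    """Fraction of the grain bin volume occupied by grain."""
    if bin_volume <= 0:
        raise ValueError("bin_volume must be positive")
    return volume / bin_volume

# Three 0.5 m layers with intersection areas of 6, 6 and 3 square metres.
volume = loading_volume([6.0, 6.0, 3.0], 0.5)
rate = loading_rate(volume, 15.0)
```

Thinner layers approximate a sloped grain surface more closely, at the cost of more cross-section computations.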
[0173] The calculation steps for surface flatness are as follows:
[0174] Plane fitting is performed on the points in the third point cloud data to obtain the reference plane that best represents the overall trend of the grain surface.
[0175] The reference plane equation for the surface trend is:
[0176] $$A x + B y + C z + D = 0$$
[0177] where $A$, $B$, $C$ and $D$ are the coefficients of the plane equation, obtained by performing plane fitting on the points in the third point cloud data using the RANSAC algorithm or the least squares method, and $x$, $y$ and $z$ are the three-dimensional coordinates of a point;
[0178] For each point in the third point cloud data, calculate its perpendicular distance $d_i$ to the reference plane:
[0179] $$d_i = \frac{\lvert A x_i + B y_i + C z_i + D \rvert}{\sqrt{A^2 + B^2 + C^2}}$$
[0180] where $d_i$ is the perpendicular distance from the $i$-th point in the third point cloud data to the reference plane, and $x_i$, $y_i$ and $z_i$ are the three-dimensional coordinates of the $i$-th point;
[0181] The surface flatness $\sigma$ is defined as the standard deviation of the distances from all points to the reference plane:
[0182] $$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(d_i - \bar{d}\right)^2}$$
[0183] where $n$ is the number of points in the third point cloud data and $\bar{d}$ is the average of all $d_i$;
[0184] The smaller the value, the smoother the surface and the more uniform the distribution.
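The flatness computation can be sketched end to end as below. The plane is fitted here by a total least squares fit via SVD, which stands in for the RANSAC or least squares fit mentioned in the text, and the standard deviation is taken over all points with a 1/n normalization; both choices, along with the function names, are assumptions of this example.

```python
import numpy as np

def fit_plane(points):
    """Fit A*x + B*y + C*z + D = 0 to an (N, 3) array; (A, B, C) has unit length."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, that is, the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -float(normal @ centroid)
    return normal[0], normal[1], normal[2], d

def surface_flatness(points):
    """Standard deviation of the perpendicular point-to-plane distances."""
    a, b, c, d = fit_plane(points)
    dist = np.abs(points @ np.array([a, b, c]) + d) / np.sqrt(a * a + b * b + c * c)
    return float(np.sqrt(np.mean((dist - dist.mean()) ** 2)))
```

A perfectly planar surface, even a tilted one, gives a flatness of approximately zero, while a local heap or hollow raises the value.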
[0185] If the loading rate is greater than or equal to the preset full load threshold, the full load status signal will be output as the status of the grain transport vehicle.
[0186] If the loading rate is lower than the full load threshold but higher than the preset safety threshold, the current loading rate will be output as the status of the grain transport vehicle.
[0187] If the surface flatness is greater than the preset flatness threshold, it indicates that the surface height of the grain is significantly different and there is a risk of uneven loading. In this case, the uneven loading warning signal will be output as the status of the grain transport vehicle.
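The three threshold rules can be combined as in the sketch below. The text does not state how the rules interact or what is reported when the loading rate is at or below the safety threshold, so this example assumes that every applicable signal is returned and that nothing is reported in that last case; the signal names and threshold values are hypothetical.

```python
def vehicle_status(loading_rate, flatness, full_threshold,
                   safety_threshold, flatness_threshold):
    """Return the list of status signals that apply to the current measurement."""
    signals = []
    if flatness > flatness_threshold:
        signals.append("UNEVEN_LOAD_WARNING")
    if loading_rate >= full_threshold:
        signals.append("FULL_LOAD")
    elif loading_rate > safety_threshold:
        signals.append(f"LOADING_RATE={loading_rate:.2f}")
    return signals

# Hypothetical thresholds: full at 95 %, safety at 30 %, flatness limit 0.05 m.
status = vehicle_status(0.60, 0.08, 0.95, 0.30, 0.05)
```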
[0188] The proposed method for monitoring the status of grain transport vehicles by fusing laser and visual image data integrates the precise three-dimensional contour information of laser stripes with the rich texture features of visual images, and combines this with motion blur compensation using an inertial measurement unit. This constructs a complete processing flow from image acquisition and point cloud generation to error compensation, effectively solving the problems of inaccurate and unstable monitoring data caused by vehicle vibration, changes in lighting, and uneven grain surface characteristics in complex agricultural operating environments. As a result, it achieves high-precision and robust monitoring of the grain loading volume and surface flatness of grain transport vehicles, providing reliable technical support for the intelligent management of agricultural transport vehicles.
[0189] In summary, this invention significantly improves the accuracy and environmental adaptability of grain transport vehicle status monitoring through deep fusion of laser and visual data, motion blur compensation based on inertial measurement, and point cloud optimization combining Kalman filtering and deep learning. It effectively overcomes the unstable monitoring under vibration interference, complex surface characteristics, and changes in illumination that is common in traditional methods. This provides a reliable solution for the intelligent and precise management of agricultural transport vehicles.
[0190] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A method for monitoring the status of a grain transport vehicle by fusing laser and visual image data, characterized in that it comprises: S1: Based on a preset visual perception module, a texture image of the grain surface and a stripe image with laser stripes are acquired; the visual perception module is equipped with an inertial measurement unit for acquiring the attitude data of the visual perception module; motion blur compensation is performed on the texture image and the stripe image based on the attitude data; the texture image includes a left-eye texture image and a right-eye texture image; the motion blur compensation includes point spread function modeling and image deconvolution restoration; the point spread function modeling models the image degradation process using the attitude data, and the point spread function is represented as a linear motion blur model: $h(x, y) = \begin{cases} 1/L, & \sqrt{x^2 + y^2} \le L/2 \text{ and } y/x = \tan\theta \\ 0, & \text{otherwise} \end{cases}$ where $x$ and $y$ are the coordinates on the image plane, $L$ is the blur length, and $\theta$ is the blur direction; the blur length and blur direction are calculated as $L = \sqrt{\Delta x^2 + \Delta y^2}$ and $\theta = \arctan(\Delta y / \Delta x)$, where $\Delta x$ and $\Delta y$ are the displacement differences of the visual perception module in the $x$ and $y$ directions during the image exposure time, and $\arctan$ is the arctangent function; the image deconvolution restoration uses the Wiener filtering algorithm to process the texture image and the stripe image, with the point spread function as the kernel function, to restore the sharp image: $\hat{f} = \mathcal{F}^{-1}\left[\mathcal{F}(g) \cdot \frac{\overline{\mathcal{F}(h)}}{\lvert\mathcal{F}(h)\rvert^2 + K}\right]$ where $\hat{f}$ is the sharp image after motion compensation, $\mathcal{F}^{-1}$ is the inverse Fourier transform operator, $\mathcal{F}$ is the Fourier transform operator, $g$ is the blurred image, $h$ is the point spread function, and $K$ is the regularization parameter; S2: The center line of the laser stripes in the compensated stripe image is extracted using the Steger algorithm based on the Hessian matrix, with the following specific steps: Gaussian filtering is applied to the compensated stripe image to smooth the noise;
the Hessian matrix $\begin{bmatrix} I_{xx} & I_{xy} \\ I_{xy} & I_{yy} \end{bmatrix}$ is constructed for each pixel in the stripe image, where $I_{xx}$ and $I_{yy}$ are the second-order partial derivatives of the stripe image at the pixel along the $x$ direction and the $y$ direction, and $I_{xy}$ is the mixed partial derivative of the stripe image at the pixel along the $x$ and $y$ directions; the eigenvalues and corresponding eigenvectors of the Hessian matrix are calculated, and the eigenvector corresponding to the eigenvalue with the larger absolute value indicates the local normal direction of the laser stripe at the pixel; along the local normal direction, one-dimensional quadratic interpolation is performed on the pixel and its neighboring pixels to fit a continuous light intensity distribution curve; the point where the first derivative of the light intensity distribution curve is zero is the light intensity maximum point, and solving for this point yields center line coordinates with sub-pixel precision; traversing all pixels in the laser stripe region of the stripe image yields a light stripe centerline point set, which includes a left centerline point set and a right centerline point set; the first point cloud data of the grain surface is generated by combining the disparity information of binocular stereo vision; based on the compensated texture image, feature point matching and densification are performed on the first point cloud data to obtain the second point cloud data; the specific steps of the feature point matching are as follows: the FAST corner detector is used to locate feature points in the texture image, and the principal orientation of each feature point is calculated; a binary string is generated for each feature point using the BRIEF descriptor; the binary string of each feature point in the left-eye texture image is compared with the binary strings of all feature points in the right-eye texture image by Hamming distance;
nearest neighbor search is used to identify the feature point with the smallest Hamming distance in the right-eye texture image, which forms an initial matching pair; the initial matching pairs are processed using the RANSAC algorithm based on epipolar geometry constraints to obtain the matching pairs; the three-dimensional coordinates of the matching points are calculated using triangulation and added to the first point cloud data to obtain an intermediate point cloud; the specific steps of the densification are as follows: each point in the intermediate point cloud is used as a seed point, and a small rectangular patch with a normal direction is generated in three-dimensional space around each seed point; the three-dimensional position and normal direction of the rectangular patch are fine-tuned by an optimization algorithm; the optimization algorithm generates new patches between spatially adjacent rectangular patches that have high photometric consistency; the center points of all the rectangular patches are extracted to form an initial dense point cloud; the second point cloud data is obtained by removing outliers from the initial dense point cloud; S3: Input the second point cloud data into the preset compensation model and output the third point cloud data after error compensation; the compensation model is composed of a Kalman filter and a deep learning network; the compensation model is a serial two-stage architecture, consisting of a first stage and a second stage; the first stage first defines the state vector, state equation, and observation equation for each point in the second point cloud data; the state equation and observation equation are as follows: $X_k = F X_{k-1} + w_k$ and $Z_k = H X_k + v_k$, where $X_k$ is the state vector, $F$ is the state transition matrix, $H$ is the observation matrix, and $w_k$ and $v_k$ are the process noise and observation noise, respectively;
$Z_k$ is the observation vector at time $k$; then, Kalman filtering iterations are performed, and the iteration process is as follows: based on the state equation, the current state vector is predicted from the state vector of the previous time step; the predicted current state vector is corrected by combining the observation vector to obtain the optimal state vector at the current moment; finally, position information is extracted from the optimal state vector to obtain denoised point cloud data; the second stage employs a deep learning network based on the PointNet++ architecture to further process the denoised point cloud data, with the following specific operations: the deep learning network outputs the correction amount for each point in the denoised point cloud data, and each point in the denoised point cloud data is corrected based on the correction amount to obtain the third point cloud data; S4: Based on the third point cloud data, calculate the loading volume and surface flatness of the grain, compare them with preset thresholds, and output the status of the grain transport vehicle; the status of the grain transport vehicle includes at least the full load status, the off-center load status, and the current loading rate; the steps for calculating the loading volume are as follows: the third point cloud data is projected onto a horizontal plane, and the projection boundary of the grain surface on the horizontal plane is determined; the grain bin cavity model of the grain transport vehicle is obtained and discretized into several layers in the vertical direction, and, for each layer, the intersection area between the projection area of the third point cloud data and the cross-section of the grain bin cavity model at that layer is calculated; the loading volume $V$ is calculated as $V = \sum_{i=1}^{N} S_i \, \Delta h$, where $\Delta h$ is the thickness of each layer of the grain bin cavity model and $N$ is the total number of layers in the grain bin cavity model;
$S_i$ is the intersection area at the $i$-th layer of the grain bin cavity model; the loading rate of the grain transport vehicle is calculated based on the loading volume; if the loading rate is greater than or equal to the preset full load threshold, the full load status is taken as the status of the grain transport vehicle; if the loading rate is lower than the full load threshold but higher than the preset safety threshold, the current loading rate is taken as the status of the grain transport vehicle; the calculation steps for the surface flatness are as follows: by performing plane fitting on the points in the third point cloud data, a reference plane equation representing the overall surface trend of the grain is obtained: $A x + B y + C z + D = 0$, where $A$, $B$, $C$ and $D$ are the coefficients of the plane equation, and $x$, $y$ and $z$ are the three-dimensional coordinates of each point in the third point cloud data; the perpendicular distance from each point in the third point cloud data to the reference plane is calculated; the surface flatness is defined as the standard deviation of the perpendicular distances of all points in the third point cloud data; if the surface flatness is greater than the preset flatness threshold, the off-center load status is taken as the status of the grain transport vehicle.
2. The method according to claim 1, characterized in that the visual perception module consists of a binocular stereo vision camera, a laser line projector, and the inertial measurement unit; the binocular stereo vision camera is calibrated to obtain epipolar constraints; and the stripe image includes a left-eye stripe image and a right-eye stripe image.
3. The method according to claim 2, characterized in that the steps for obtaining the first point cloud data are as follows: based on the epipolar constraint and preset matching rules, the matching point of each point in the left centerline point set is found in the right centerline point set, and the horizontal pixel displacement between the two points is the disparity; based on the disparity and triangulation, each point in the light stripe centerline point set is converted into three-dimensional coordinates using the following conversion formulas: $Z = \frac{f B}{d}$, $X = \frac{(u - c_x) Z}{f}$, $Y = \frac{(v - c_y) Z}{f}$, where $X$, $Y$ and $Z$ are the three-dimensional coordinates of each point in the light stripe centerline point set, $f$ is the focal length, $B$ is the baseline distance, $(c_x, c_y)$ are the principal point coordinates of the camera, $(u, v)$ are the coordinates of the left centerline point, and $d$ is the disparity.