A rice field early seedling row precision acquisition method based on leaf sheath identification

By constructing an augmented set of field image data of early rice seedlings and an improved YOLOv8 network model, combined with perspective correction and cluster fitting techniques, the problem of insufficient accuracy caused by wind disturbance and growth changes in rice seedling identification was solved, achieving efficient and accurate seedling row acquisition, which is suitable for intelligent agricultural machinery equipment.

CN120495892B · Active · Publication Date: 2026-06-16 · NANJING AGRICULTURAL UNIVERSITY +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
NANJING AGRICULTURAL UNIVERSITY
Filing Date
2025-05-13
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing technologies for identifying rice seedlings are easily affected by weeds, algae, and leaf swaying, resulting in insufficient accuracy. Furthermore, traditional models have high computational complexity or a large number of parameters, making it difficult to meet real-time requirements. In particular, the errors are significant under the early wind disturbance of the seedlings, affecting navigation accuracy.

Method used

By constructing a field image data augmentation set of early rice seedlings, we used an improved YOLOv8 network model to identify key points of the seedling leaf sheaths, and combined perspective correction and cluster fitting techniques to extract seedling rows, thereby reducing the impact of wind disturbance and changes in growth stages.

Benefits of technology

It significantly reduces wind disturbance error by about 70%, improves detection accuracy and computational efficiency, reduces the number of model parameters, and has a single-frame detection time of less than 30ms. It is suitable for intelligent agricultural machinery and equipment, improving navigation accuracy and operational stability.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a rice field early seedling row precision acquisition method based on leaf sheath identification, and steps include: constructing a field image data enhancement set of rice field early seedlings; constructing a target detection model for positioning seedling leaf sheath key points; identifying seedling leaf sheath key points in a seedling image to be identified by using the target detection model; performing perspective correction and clustering fitting on the seedling leaf sheath key points in the seedling image to be identified; and mapping extracted seedling rows back to a coordinate system on the seedling image to be identified. The rice field early seedling row precision acquisition method can effectively inhibit the influence of wind disturbance and a complex background on the early weak seedling row acquisition precision, provides reliable navigation for rice field early facility precision operation, and is especially suitable for low-power embedded devices.

Description

Technical Field

[0001] This invention belongs to the field of intelligent agricultural machinery and equipment technology, specifically relating to a method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition.

Background Technology

[0002] As the application of intelligent agricultural machinery in paddy field planting and management deepens, the requirements for the accuracy of automatic identification of paddy seedling rows in refined operations are becoming increasingly stringent. Traditional methods for detecting paddy seedling rows rely on image processing techniques such as color segmentation and edge detection, which are easily affected by weeds, algae, and leaf swaying, resulting in insufficient accuracy. Semantic segmentation methods based on deep learning have high computational complexity and cannot meet real-time requirements; while target detection methods are more efficient, traditional models have limited ability to extract features from slender and curved seedlings. Especially for early-stage paddy seedlings, existing leaf localization methods have significant errors under wind disturbances and have a large number of model parameters, making them difficult to deploy on low-power devices of intelligent agricultural machinery.

[0003] For example, in the early stages of robotic weeding in rice paddies, the seedlings are less resistant to bending, and their leaves are easily displaced by wind. As a result, the navigation line generated by the traditional whole-plant detection method deviates significantly from the ideal navigation line (i.e., the central axis between rows of seedlings), as shown in Figure 3d. Furthermore, as rice seedlings progress through their growth stages, the extension direction, morphology, and structural characteristics of their leaves change dynamically, further increasing the difficulty of stable leaf localization. However, compared to the leaves, the contact point between the leaf sheath and the soil is relatively stable and less affected by environmental disturbances and changes in growth stages.

Summary of the Invention

[0004] The purpose of this invention is to provide a method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition. By accurately extracting key feature points of the seedling leaf sheath, the method reduces the impact of wind disturbance and changes in growth stage on positioning accuracy, thereby improving the navigation accuracy and operational stability of weeding robots in complex field environments.

[0005] Technical solution: The method for accurately acquiring early rice seedling rows based on leaf sheath recognition described in this invention includes the following steps:

[0006] Step 1: Construct an augmented set of field image data of early rice seedlings in paddy fields;

[0007] Step 2: Construct a target detection model for locating key points of seedling leaf sheaths, and train the target detection model using field image data augmentation set;

[0008] Step 3: Read the image of the seedling to be identified, establish a coordinate system on the image of the seedling to be identified, and then use the target detection model to identify the key points of the seedling leaf sheath in the image of the seedling to be identified, and output the key points of the seedling leaf sheath in the image of the seedling to be identified.

[0009] Step 4: Perform perspective correction and cluster fitting on the key points of the leaf sheaths of the seedlings in the image of the seedlings to be identified, and extract the rows of seedlings in the image of the seedlings to be identified.

[0010] Step 5: Map the extracted seedling rows back to the coordinate system on the seedling image to be identified, and use the seedling image to be identified along with the seedling rows as the final detection result.

[0011] Furthermore, in step 1, the specific steps for constructing the field image data augmentation set of early rice seedlings are as follows:

[0012] Step 1.1: Collect images of early rice seedlings in paddy fields under different weather conditions, at different times, and at different locations, according to the set image resolution;

[0013] Step 1.2: Perform data augmentation processing on the collected early rice seedling images from various paddy fields, including cropping, rotation, brightness adjustment, and Gaussian noise injection.

[0014] Step 1.3: Construct a field image data augmentation set using early rice seedling images from various paddy fields after data augmentation processing, and divide the field image data augmentation set into a training set, a validation set, and a test set according to a set ratio.

[0015] Furthermore, in step 2, the specific steps for constructing the target detection model for locating key points of the seedling leaf sheath are as follows:

[0016] Step 2.1: Set the positioning range of the key points of the seedling leaf sheath to the seedling area where the leaf ring extends to the soil surface;

[0017] Step 2.2: Construct an improved YOLOv8 network model as the object detection model;

[0018] Step 2.3: Train the object detection model using the training set, then validate the trained object detection model using the validation set. After successful validation, test the object detection model using the test set. Once the test is passed, the object detection model is complete.

[0019] Furthermore, in step 2.2, the constructed improved YOLOv8 network model includes a backbone network module, a neck module, and a head module. The backbone network module includes a first Conv layer, a second Conv layer, a first C2f layer, a third Conv layer, a second C2f layer, a fourth Conv layer, a third C2f layer, a fifth Conv layer, a DSC-C2f layer, a CBAM layer, an SPPF layer, and a C3Ghost layer, all connected in sequence. The neck module includes a first Upsample layer, a first Concat layer, a fourth C2f layer, a second Upsample layer, a second Concat layer, a fifth C2f layer, a sixth Conv layer, a third Concat layer, a sixth C2f layer, a seventh Conv layer, a fourth Concat layer, and a seventh C2f layer, all connected in sequence. The head module includes a first Detect layer, a second Detect layer, and a third Detect layer.

[0020] The first, second, third, fourth, fifth, sixth, and seventh Conv layers are all used for feature extraction; the first, second, third, fourth, fifth, sixth, and seventh C2f layers are all used for feature extraction and information fusion on the input feature map to obtain a higher-level feature representation; the DSC-C2f layer is used to enhance feature extraction capability through adaptive deformable convolution; the CBAM layer is used to suppress background noise interference; the SPPF layer is used to capture multi-scale information; the C3Ghost layer is used to reduce the number of network parameters and computation through a redundant feature generation strategy; the first and second Upsample layers are both used for upsampling; the first, second, third, and fourth Concat layers are all used for feature fusion; the first, second, and third Detect layers are all used for bounding box and category prediction.

[0021] The input of the first Conv layer is used to input the seedling image to be identified. The output of the second C2f layer is also connected to the input of the second Concat layer. The output of the third C2f layer is also connected to the input of the first Concat layer. The output of the C3Ghost layer is connected to the input of the second Upsample layer and the input of the fourth Concat layer, respectively. The output of the fifth C2f layer is also connected to the input of the first Detect layer. The output of the sixth C2f layer is also connected to the input of the second Detect layer. The output of the seventh C2f layer is connected to the input of the third Detect layer. The outputs of the first Detect layer, the second Detect layer, and the third Detect layer respectively output seedling images marked with key points of the seedling leaf sheath.

[0022] Furthermore, the DSC-C2f layer includes an eighth Conv layer, a Split layer, n DSC-Bneck layers, a fifth Concat layer, and a ninth Conv layer connected in series. The outputs of the eighth Conv layer, the Split layer, and the n DSC-Bneck layers are all connected to the input of the fifth Concat layer. The DSC-Bneck layer includes a tenth Conv layer, a DySnakeConv layer, an eleventh Conv layer, and an Add layer connected in series. The input of the tenth Conv layer is also connected to the input of the Add layer. The DySnakeConv layer includes a twelfth Conv layer, a DSConv layer, and a sixth Concat layer. The input of the twelfth Conv layer serves as the input of the DySnakeConv layer. The output of the twelfth Conv layer is connected to the input of the DSConv layer. The output of the DSConv layer is connected to the input of the sixth Concat layer. The output of the sixth Concat layer serves as the output of the DySnakeConv layer.

[0023] The eighth, ninth, tenth, eleventh, and twelfth Conv layers are all used for feature extraction; the Split layer is used to segment the input tensor; the fifth and sixth Concat layers are used for feature fusion; the Add layer is used to add the input tensor element by element; and the DSConv layer is used to decompose the standard convolution into two steps: depthwise convolution and pointwise convolution, reducing the amount of computation and parameters.
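As a non-limiting illustration of the parameter saving described above, the following Python sketch counts the weights of a standard convolution and of its depthwise-plus-pointwise decomposition. The channel counts and kernel size are assumed values chosen only for the example, not figures from this invention, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def separable_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed example sizes: 64 input channels, 128 output channels, 3 x 3 kernel.
    std = standard_conv_params(64, 128, 3)
    sep = separable_conv_params(64, 128, 3)
    print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.2f}")
```

With these assumed sizes the decomposition needs roughly one eighth of the weights, which is the kind of reduction the DSConv layer is intended to provide.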

[0024] Furthermore, in step 4, the specific steps for extracting the rows of seedlings from the image of the seedlings to be identified are as follows:

[0025] Step 4.1: Locate the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath;

[0026] Step 4.2: Correct the top view of the seedling image to be identified and the key points of the seedling leaf sheath based on the seedling vanishing point to obtain the corrected top view of the seedling and each corrected key point.

[0027] Step 4.3: Perform row fitting on each correction key point to obtain the corrected seedling rows.

[0028] Furthermore, in step 4.1, the specific steps for locating the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath are as follows:

[0029] Step 4.1.1: Set the diameter of the dense-region circle and the point-count threshold. The diameter of the region circle ranges from 8 to 12 pixels, and the point-count threshold ranges from 2 to 4.

[0030] Step 4.1.2: Generate initial line segments for each row on the seedling image to be identified based on the obtained key points of the seedling leaf sheath, so that the key points of each seedling leaf sheath are distributed close to each other along the corresponding initial line segments.

[0031] Step 4.1.3: Represent each generated initial line segment in two-dimensional homogeneous coordinates and extend each initial line segment to the far end;

[0032] Step 4.1.4: Obtain the intersection points of the extended initial line segments of the rows, then establish a region circle of the set diameter on the seedling image to be identified and move it over the image. When the number of intersection points inside the region circle reaches its maximum and is greater than the point-count threshold, stop moving the circle. Take the region circle at this position as the dense intersection region, and set the center point of the dense intersection region as the seedling vanishing point.
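Steps 4.1.1 to 4.1.4 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the initial line segments are assumed to be already available as endpoint pairs, the candidate circle centers are restricted to the intersection points themselves rather than swept over every pixel, and all function names are hypothetical.

```python
from itertools import combinations
from math import hypot


def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def line_through(p, q):
    """Homogeneous line through two image points; it extends the segment to infinity."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def intersections(segments):
    """Pairwise intersection points of the extended segments (parallel pairs are skipped)."""
    lines = [line_through(p, q) for p, q in segments]
    points = []
    for l1, l2 in combinations(lines, 2):
        x, y, w = cross(l1, l2)
        if abs(w) > 1e-9:
            points.append((x / w, y / w))
    return points


def vanishing_point(segments, diameter=10.0, point_threshold=2):
    """Center of the circle holding the most intersections, or None if too few."""
    points = intersections(segments)
    radius = diameter / 2.0
    best_center, best_count = None, 0
    for center in points:  # assumption: only intersection points are tried as centers
        count = sum(1 for p in points
                    if hypot(p[0] - center[0], p[1] - center[1]) <= radius)
        if count > best_count:
            best_center, best_count = center, count
    return best_center if best_count > point_threshold else None


if __name__ == "__main__":
    # Three synthetic row segments that all converge at (100, 50).
    rows = [((0, 150), (50, 100)), ((200, 150), (150, 100)), ((100, 150), (100, 100))]
    print(vanishing_point(rows))
```

The default diameter and threshold fall inside the ranges given in step 4.1.1; a denser candidate grid could replace the loop over intersection points if a finer search is wanted.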

[0033] Furthermore, in step 4.2, the specific steps for top-view correction of the seedling image to be identified and the key points of the seedling leaf sheath based on the seedling vanishing point are as follows:

[0034] Step 4.2.1: Establish a homography matrix H to describe the mapping relationship between the seedling image to be identified and the corrected top view of the seedlings. Then, solve the homography matrix H based on the vanishing point coordinates of the seedlings and the measured row spacing of the seedlings.

[0035] Step 4.2.2: Establish the coordinate projection relationship between the pixel coordinates in the seedling image to be identified and the pixel coordinates in the corrected top view of the seedling based on the homography matrix H;

[0036] Step 4.2.3: Correct the key points of the seedling leaf sheath in the seedling image to be identified to the corrected key points in the top view of the seedling, based on the coordinate projection relationship.

[0037] Step 4.2.4: Correct the initial line segments to the corrected line segments in the top view of the seedlings according to the coordinate projection relationship.
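The coordinate projection in steps 4.2.2 to 4.2.4 can be illustrated with the short Python sketch below. It assumes the homography matrix H has already been solved in step 4.2.1 and is supplied as a 3 x 3 nested list; how H is solved from the vanishing point and the row spacing is not reproduced here. The function names and the example matrix are hypothetical.

```python
def apply_homography(h, point):
    """Project an image point (x, y) through the 3 x 3 homography h."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def correct_keypoints(h, keypoints):
    """Step 4.2.3: map every leaf sheath key point into the corrected top view."""
    return [apply_homography(h, p) for p in keypoints]


def correct_segment(h, segment):
    """Step 4.2.4: map an initial line segment by mapping its two endpoints."""
    start, end = segment
    return (apply_homography(h, start), apply_homography(h, end))


if __name__ == "__main__":
    # Assumed example matrix with a perspective term in the third row.
    H = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.5, 1.0]]
    print(correct_keypoints(H, [(10.0, 2.0), (4.0, 0.0)]))
    print(correct_segment(H, ((10.0, 2.0), (4.0, 0.0))))
```

Mapping a segment through its endpoints is valid because a homography maps straight lines to straight lines.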

[0038] Furthermore, in step 4.3, the specific steps for obtaining the corrected seedling rows by performing row fitting on each key correction point are as follows:

[0039] Step 4.3.1: Perform vertical ground projection on each key correction point in the top view of the corrected seedlings to obtain each seedling projection point. Then, put each seedling projection point that is adjacent to the same correction line segment into the same point set to obtain each projection point set.

[0040] Step 4.3.2: Set the reference search point, search angle, and search radius. Based on the search angle and search radius, establish two fan-shaped regions with opposite vertical angles centered on the reference search point. The angle bisector of the central angle of the fan-shaped region is parallel to the correction line segment adjacent to the reference search point.

[0041] Step 4.3.3: Select a set of projection points, take out a seedling projection point as the reference search point, establish two fan-shaped regions of the seedling projection point according to the search angle and search radius, translate the correction line segment adjacent to the reference search point, and draw two boundary lines when translating to the two ends of the arc of the fan-shaped region. Take the area between the two boundary lines as the search range, and take each seedling projection point in the search range as the same type of search result for the current seedling projection point.

[0042] Step 4.3.4: Determine whether each seedling projection point in the current projection point set has obtained the same type of search result. If all of them have obtained the same type of search result, proceed to step 4.3.5; otherwise, return to step 4.3.3.

[0043] Step 4.3.5: Compare the size of each similar search result corresponding to the current set of projection points, find the similar search result with the most seedling projection points, and use the least squares method to fit a straight line for each seedling projection point in the found similar search result to obtain the fitted straight line corresponding to the current set of projection points.

[0044] Step 4.3.6: Determine whether each set of projection points has obtained a fitted line. If no fitted line has been obtained, return to step 4.3.3. If the corresponding fitted line has been obtained, then each fitted line is used as the corrected seedling row.
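Steps 4.3.2 to 4.3.5 can be sketched in Python as follows. The sketch rests on two assumptions that are not stated in the text: the half-width of the search band is taken as r·sin(θ/2), which is the perpendicular distance of the arc ends of a sector with radius r and central angle θ from its bisector, and each row is fitted as x = a·y + b because rows are close to vertical in the top view. All function names are hypothetical.

```python
from math import hypot, radians, sin


def band_members(points, ref, direction, search_angle_deg, search_radius):
    """Points lying between the two boundary lines drawn around the reference point."""
    dx, dy = direction
    norm = hypot(dx, dy)
    half_width = search_radius * sin(radians(search_angle_deg) / 2.0)
    members = []
    for x, y in points:
        # Perpendicular distance to the line through `ref` parallel to `direction`.
        dist = abs((x - ref[0]) * dy - (y - ref[1]) * dx) / norm
        if dist <= half_width:
            members.append((x, y))
    return members


def fit_row(points):
    """Least-squares fit of x = a * y + b; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points)
    if syy == 0:
        raise ValueError("need at least two points with different y values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / syy
    return a, mean_x - a * mean_y


def fit_point_set(points, direction, search_angle_deg, search_radius):
    """Search around every point, keep the largest result, and fit a line to it."""
    results = [band_members(points, p, direction, search_angle_deg, search_radius)
               for p in points]
    return fit_row(max(results, key=len))


if __name__ == "__main__":
    # Four points on the row x = 10 plus one outlier; the row direction is vertical.
    pts = [(10, 0), (10, 10), (10, 20), (10, 30), (25, 15)]
    print(fit_point_set(pts, direction=(0, 1), search_angle_deg=30, search_radius=20))
```

Keeping only the largest search result before fitting is what rejects the outlier in this example, mirroring step 4.3.5.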

[0045] Furthermore, in step 5, the specific steps for mapping the extracted seedling rows back to the coordinate system on the seedling image to be identified are as follows:

[0046] Step 5.1: Calculate the inverse homography matrix $H^{-1}$ of the homography matrix $H$;

[0047] Step 5.2: Calculate the equations of the straight lines corresponding to each corrected seedling row, and then use the inverse homography matrix $H^{-1}$ to map each straight-line equation back to the coordinate system on the seedling image to be identified.
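A minimal Python sketch of steps 5.1 and 5.2 is given below. It assumes the homography is a 3 x 3 nested list, that each corrected row is parameterized as x = a·y + b in the top view, and that a row is mapped back by projecting two of its points through the inverse matrix. The function names, the parameterization, and the example matrix are illustrative assumptions.

```python
def inverse_3x3(m):
    """Inverse of a 3 x 3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def project(h, point):
    """Project (x, y) through the 3 x 3 matrix h and dehomogenize."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)


def map_row_back(h, a, b, y0, y1):
    """Map the top-view row x = a * y + b back to the image as two endpoints."""
    h_inv = inverse_3x3(h)
    top_view_points = [(a * y0 + b, y0), (a * y1 + b, y1)]
    return [project(h_inv, p) for p in top_view_points]


if __name__ == "__main__":
    # Assumed example: a pure translation by (+5, -3) stands in for H.
    H = [[1.0, 0.0, 5.0],
         [0.0, 1.0, -3.0],
         [0.0, 0.0, 1.0]]
    print(map_row_back(H, a=0.0, b=7.0, y0=1.0, y1=11.0))
```

The two returned endpoints define the seedling row in the original image coordinate system and can be drawn directly on the image to be identified.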

[0048] Compared with the prior art, the beneficial effects of this invention are: (1) This invention makes full use of the growth characteristics of early seedlings, namely that, in terms of tissue structure, the leaf sheath is more resistant to wind disturbance than the leaf blade, so that the early seedling row acquisition method based on leaf sheath recognition reduces the wind disturbance error by about 70% and improves detection accuracy and computational efficiency; (2) The improved YOLOv8 model proposed in this invention integrates dynamic snake convolution (DSC-C2f), the CBAM attention mechanism, and the C3Ghost lightweight module, which improves the mAP@50 index by 2.9%, with a model parameter count of 4.12M and a single-frame detection time of less than 30 ms, significantly improving detection accuracy and computational efficiency; (3) The VP-PSE strategy proposed in this invention reduces the lateral distance error by 46.3% (from 6.17 pixels to 3.31 pixels) and the angle error by 33.8% (from 5.76° to 3.82°) compared with the traditional K-Means and DBSCAN algorithms. The single-frame fitting time is 29.18 ms, which is suitable for low-power embedded devices of intelligent agricultural machinery and provides a more stable seedling row detection scheme for field automation operations.

Description of the Drawings

[0049] Figure 1 is a flowchart of the present invention;

[0050] Figure 2 illustrates early-stage rice seedlings under interference such as algae, weeds, and wind;

[0051] Figure 3 is a schematic diagram of autonomous navigation along early seedling rows based on two methods: leaf recognition and leaf sheath recognition;

[0052] Figure 4 is the architecture diagram of the improved YOLOv8 model;

[0053] Figure 5 is the structural block diagram of the DSC-C2f module in Figure 4;

[0054] Figure 6 is the block diagram of the CBAM channel attention module in Figure 4;

[0055] Figure 7 is the block diagram of the CBAM spatial attention module in Figure 4;

[0056] Figure 8 is a schematic diagram of the seedling row clustering and fitting method based on VP-PSE;

[0057] Figure 9 compares target detection results for leaf sheath recognition under interference scenarios such as algae and wind;

[0058] Figure 10 compares seedling row detection methods based on leaf sheath recognition and leaf blade recognition;

[0059] Figure 11 compares the errors of seedling row detection methods based on leaf sheath recognition and leaf recognition.

Detailed Implementation

[0060] The technical solution of the present invention will be described in detail below with reference to the accompanying drawings, but the scope of protection of the present invention is not limited to the embodiments described.

[0061] As shown in Figure 1, the method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition disclosed in this invention includes the following steps:

[0062] Step 1: Construct an augmented set of field image data of early rice seedlings in paddy fields;

[0063] Step 2: Construct a target detection model for locating key points of seedling leaf sheaths, and train the target detection model using field image data augmentation set;

[0064] Step 3: Read the image of the seedling to be identified, establish a coordinate system on the image of the seedling to be identified, and then use the target detection model to identify the key points of the seedling leaf sheath in the image of the seedling to be identified, and output the key points of the seedling leaf sheath in the image of the seedling to be identified.

[0065] Step 4: Based on the vanishing point and perspective geometry (VP-PSE) strategy, perform perspective correction and cluster fitting on the key points of the seedling leaf sheath in the seedling image to be identified, and extract the seedling rows in the seedling image to be identified.

[0066] Step 5: Map the extracted seedling rows back to the coordinate system on the seedling image to be identified, and use the seedling image to be identified along with the seedling rows as the final detection result.

[0067] Normally growing seedlings are shown in Figure 3a. By accurately extracting key feature points of rice seedling leaf sheaths, the impact of wind disturbance and changes in growth stage on positioning accuracy is reduced. Even when the outline of rice leaves is blurred in the image, this method can still accurately identify the leaf sheath position, ensuring the stability and reliability of the detection. This improves the navigation accuracy and operational stability of weeding robots in complex field environments, as shown in Figure 3e. In the horizontal direction, the positioning is adjusted based on the maximum tilt of the leaf sheath in the image; in the vertical direction, the detection range extends downwards from the leaf ring (the junction of the leaf sheath and the leaf blade), covering the portion of the seedling above the soil surface, as shown in Figure 3c.

[0068] Furthermore, in step 1, the specific steps for constructing the field image data augmentation set of early rice seedlings are as follows:

[0069] Step 1.1: Acquire images of early rice seedlings in paddy fields under different weather conditions, at different times, and at different locations, according to the set image resolution. As shown in Figure 2, to evaluate and verify the adaptability of the subsequent detection model to complex environments, early rice seedling images were collected under various weather conditions, including sunny, cloudy, and overcast days, with wind speeds ranging from 0 to 2 m/s, under which the dynamic deformation of the leaves due to wind was obvious. A total of 3,300 images were collected, covering scenes such as algae, weeds, and wind. The image resolution was 640×640. The images were acquired at different times and locations to ensure that there were no overlapping areas between images.

[0070] Step 1.2 involves performing data augmentation processing on the collected early rice seedling images from various paddy fields, including cropping, rotation, brightness adjustment, and Gaussian noise injection. Data augmentation processing can enhance the detection model's ability to recognize data in complex field environments.

[0071] Step 1.3: Construct a field image data augmentation set using early rice seedling images from various paddy fields after data augmentation processing. Divide the field image data augmentation set into a training set, a validation set, and a test set according to a set ratio. Through cropping, rotation, brightness adjustment, and Gaussian noise injection, the 3300 augmented images are divided into the training set, validation set, and test set in a 7:2:1 ratio.
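As a non-limiting illustration of steps 1.2 and 1.3, the Python sketch below applies brightness adjustment and Gaussian noise injection to a small grayscale image stored as nested lists, and splits a list of samples in the 7:2:1 ratio. Cropping and rotation are omitted for brevity. The function names, the noise level, the brightness factor, and the random seed are assumptions made for the example.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel by `factor` and clamp to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to every pixel."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle the samples and split them into training, validation, and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[100, 200], [50, 0]]
    print(adjust_brightness(image, 1.5))
    print(add_gaussian_noise(image, sigma=5.0, rng=rng))
    train, val, test = split_dataset(range(3300))
    print(len(train), len(val), len(test))
```

For 3,300 samples the 7:2:1 ratio yields 2,310 training, 660 validation, and 330 test images.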

[0072] Furthermore, in step 2, the specific steps for constructing the target detection model for locating key points of the seedling leaf sheath are as follows:

[0073] Step 2.1: Define the positioning range of the key points of the seedling leaf sheath as the area of the seedling extending from the leaf ring down to the soil surface, as shown in Figure 3. Because early, fragile seedlings are easily disturbed by wind, restricting the key point positioning range to this area avoids positioning deviation caused by the disturbance and improves the detection accuracy of seedlings in the initial growth stage.

[0074] Step 2.2: Construct an improved YOLOv8 network model as the object detection model, as shown in Figure 4. Given the high real-time requirements of early precision operation equipment in paddy fields, an improved YOLOv8 was constructed based on the YOLOv8n and YOLOv8s network models to obtain a lightweight model with efficient processing, balancing real-time performance and accuracy.

[0075] Step 2.3: Train the object detection model using the training set, then validate the trained object detection model using the validation set. After successful validation, test the object detection model using the test set. Once the test is passed, the object detection model is complete.

[0076] Furthermore, in step 2.3, when training the target detection model using the training set, the Focal Loss loss function is used to alleviate the class imbalance problem between seedlings and weeds. The initial learning rate (0.001) is dynamically adjusted by cosine annealing learning rate scheduling, and mixed precision training (FP16 / FP32) is used to accelerate the calculation. The batch size is set to 32. The trained target detection model can accurately output the key points of the seedling leaf sheath, providing reliable input for subsequent seedling row extraction.
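The cosine annealing schedule and the Focal Loss mentioned above can be written out as the following Python sketch. The initial learning rate of 0.001 comes from the text; the minimum learning rate, the total number of steps, and the Focal Loss parameters alpha and gamma are assumed values, since the source does not state them.

```python
from math import cos, log, pi


def cosine_annealing_lr(step, total_steps, lr_init=0.001, lr_min=0.0):
    """Learning rate at `step`, decaying from lr_init to lr_min along a half cosine."""
    return lr_min + 0.5 * (lr_init - lr_min) * (1.0 + cos(pi * step / total_steps))


def focal_loss(p_true, alpha=0.25, gamma=2.0):
    """Focal Loss for one sample, where p_true is the predicted probability of the true class.

    alpha and gamma are assumed defaults, not values taken from the source.
    """
    return -alpha * (1.0 - p_true) ** gamma * log(p_true)


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, cosine_annealing_lr(step, total_steps=100))
    # Well-classified samples contribute far less loss than hard ones.
    print(focal_loss(0.9), focal_loss(0.1))
```

The (1 - p)^gamma factor is what down-weights easy samples, which is how the loss counters the imbalance between seedling and weed examples.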

[0077] Furthermore, as shown in Figure 4, in step 2.2 the constructed improved YOLOv8 network model includes a backbone network module, a neck module, and a head module. Adaptive deformable convolution enhances the feature extraction capability for curved leaf sheath structures. The backbone network module comprises, in sequence, a first Conv layer, a second Conv layer, a first C2f layer, a third Conv layer, a second C2f layer, a fourth Conv layer, a third C2f layer, a fifth Conv layer, a DSC-C2f layer, a CBAM layer, an SPPF layer, and a C3Ghost layer. The neck module comprises, in sequence, a first Upsample layer, a first Concat layer, a fourth C2f layer, a second Upsample layer, a second Concat layer, a fifth C2f layer, a sixth Conv layer, a third Concat layer, a sixth C2f layer, a seventh Conv layer, a fourth Concat layer, and a seventh C2f layer. The head module comprises a first Detect layer, a second Detect layer, and a third Detect layer.

[0078] The first, second, third, fourth, fifth, sixth, and seventh Conv layers are all used for feature extraction, and each Conv layer is a standard convolutional layer. The first, second, third, fourth, fifth, sixth, and seventh C2f layers are used for feature extraction and information fusion of the input feature map to obtain a higher-level feature representation. The DSC-C2f layer is a dynamic snake-shaped convolutional layer used to enhance feature extraction capabilities through adaptive deformable convolution. The CBAM layer is a convolutional block attention layer used to suppress background noise interference. The SPPF (Spatial Pyramid Pooling Fast) layer is used to capture multi-scale information. The C3Ghost layer is used to reduce the number of network parameters and computational cost through a redundant feature generation strategy. The first Upsample layer and the second Upsample layer are both used for upsampling; the first Concat layer, the second Concat layer, the third Concat layer, and the fourth Concat layer are all used for feature fusion; the first Detect layer, the second Detect layer, and the third Detect layer are all used for predicting bounding boxes and categories.

[0079] The DSC-C2f layer enhances the network's ability to capture multi-scale features in rice seedling row recognition. By introducing a serpentine convolutional structure, the DSC-C2f layer can adaptively adjust the path of the convolutional kernel according to different regions of the input image to adapt to targets of different shapes and sizes. This structure effectively improves the recognition accuracy of complex backgrounds in rice paddies and subtle differences between rice seedlings, especially in cluttered backgrounds or densely packed rows, enabling more accurate positioning of seedling rows. The CBAM layer automatically adjusts the weight distribution of different channels and spatial locations in the feature map, highlighting key leaf sheath regions in the image. The C3Ghost layer reduces the number of network parameters and computation through a redundant feature generation strategy. By using a single-repetition structure, the total number of parameters is compressed to below 5M, significantly reducing floating-point operations and achieving a real-time inference efficiency of less than 30ms per frame detection time. This provides a lightweight, high-precision, and low-latency detection solution for precision agriculture equipment.
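To make the parameter saving of a redundant feature generation strategy concrete, the Python sketch below compares the weight count of a standard convolution with that of a generic Ghost-style module, in which a smaller ordinary convolution produces part of the output channels and cheap depthwise operations generate the rest. The source does not state the internal settings of the C3Ghost layer, so the ratio, kernel sizes, and channel counts here are assumptions for illustration only.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in: int, c_out: int, k: int, ratio: int = 2, cheap_k: int = 3) -> int:
    """Weights in a generic Ghost-style module (bias ignored).

    A primary k x k convolution produces c_out / ratio intrinsic feature maps;
    each intrinsic map then yields (ratio - 1) extra maps through a cheap
    cheap_k x cheap_k depthwise operation.
    """
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Assumed example sizes: 64 input channels, 128 output channels, 3 x 3 kernels.
    std = standard_conv_params(64, 128, 3)
    ghost = ghost_module_params(64, 128, 3)
    print(f"standard: {std}, ghost-style: {ghost}, ratio: {std / ghost:.2f}")
```

With a ratio of 2 the weight count is roughly halved, which shows the direction of the saving but is not a reproduction of the 4.12M total reported for the full model.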

[0080] The input of the first Conv layer is used to input the seedling image to be identified. The output of the second C2f layer is also connected to the input of the second Concat layer. The output of the third C2f layer is also connected to the input of the first Concat layer. The output of the C3Ghost layer is connected to the input of the second Upsample layer and the input of the fourth Concat layer, respectively. The output of the fifth C2f layer is also connected to the input of the first Detect layer. The output of the sixth C2f layer is also connected to the input of the second Detect layer. The output of the seventh C2f layer is connected to the input of the third Detect layer. The outputs of the first Detect layer, the second Detect layer, and the third Detect layer respectively output seedling images marked with key points of the seedling leaf sheath.

[0081] The CBAM (Convolutional Block Attention Module) layer includes a channel attention module and a spatial attention module. The channel attention module includes a max pooling layer, an average pooling layer, and a shared multilayer perceptron (MLP), as shown in Figure 6; the spatial attention module includes a max pooling layer, an average pooling layer, a convolutional layer, and a sigmoid activation function, as shown in Figure 7. The max pooling layer max-pools each channel of the input feature map, extracting the most salient feature of each channel to form global information; the average pooling layer average-pools each channel of the input feature map, generating global average information for each channel. The multilayer perceptron applies a nonlinear transformation to the pooled features through fully connected layers, further capturing the dependencies between channels and generating a weight for each channel; an MLP usually consists of one or more fully connected layers followed by an activation function (ReLU). The convolutional layer (1x1 convolution) processes the pooled features to generate the spatial attention map, helping the model learn the relationships between different spatial regions in the image. The sigmoid activation function processes the convolution output, restricting the values of the spatial attention map to the range [0, 1], which represents the importance of each spatial location.
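The attention flow described above can be illustrated with a small, dependency-free sketch. It follows the commonly used CBAM formulation (channel attention first, then spatial attention, each ending in a sigmoid). The weight arguments `w1`, `w2`, `w_max`, `w_avg`, and `bias` are hypothetical stand-ins for trained parameters, and the 1x1 convolution is reduced to a per-pixel weighted sum; none of these names come from the patent.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_attention(fmap, w1, w2):
    """fmap: list of C channels, each an H x W nested list.
    w1 (hidden x C) and w2 (C x hidden) are hypothetical shared-MLP weights."""
    def mlp(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    # Global max pooling and global average pooling per channel.
    max_desc = [max(max(r) for r in ch) for ch in fmap]
    avg_desc = [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in fmap]
    # Both descriptors pass through the same MLP, are summed, then squashed.
    return [sigmoid(a + b) for a, b in zip(mlp(max_desc), mlp(avg_desc))]

def spatial_attention(fmap, w_max, w_avg, bias):
    """Pool across channels at every pixel, then apply a 1x1 'convolution'
    (a weighted sum of the two pooled maps) followed by a sigmoid."""
    height, width = len(fmap[0]), len(fmap[0][0])
    att = []
    for i in range(height):
        row = []
        for j in range(width):
            vals = [ch[i][j] for ch in fmap]
            pooled = w_max * max(vals) + w_avg * sum(vals) / len(vals) + bias
            row.append(sigmoid(pooled))
        att.append(row)
    return att

def cbam(fmap, w1, w2, w_max, w_avg, bias):
    ca = channel_attention(fmap, w1, w2)
    scaled = [[[ca[c] * v for v in r] for r in ch] for c, ch in enumerate(fmap)]
    sa = spatial_attention(scaled, w_max, w_avg, bias)
    return [[[sa[i][j] * v for j, v in enumerate(r)] for i, r in enumerate(ch)]
            for ch in scaled]
```

Because every attention weight lies strictly between 0 and 1, the module can only rescale features; it never changes the shape of the feature map.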

[0082] Furthermore, as shown in Figure 5, the DSC-C2f (Dynamic Snake Convolutional Module) layer includes an eighth Conv layer, a Split layer, n DSC-Bneck layers, a fifth Concat layer, and a ninth Conv layer connected in sequence. The outputs of the eighth Conv layer, the Split layer, and the n DSC-Bneck layers are all connected to the input of the fifth Concat layer. The DSC-Bneck layer includes a tenth Conv layer, a DySnakeConv layer, an eleventh Conv layer, and an Add layer connected in sequence. The input of the tenth Conv layer is also connected to the input of the Add layer. The DySnakeConv layer includes a twelfth Conv layer, a DSCConv layer, and a sixth Concat layer. The input of the twelfth Conv layer serves as the input of the DySnakeConv layer, the output of the twelfth Conv layer is connected to the input of the DSCConv layer, the output of the DSCConv layer is connected to the input of the sixth Concat layer, and the output of the sixth Concat layer serves as the output of the DySnakeConv layer.

[0083] The eighth, ninth, tenth, eleventh, and twelfth Conv layers are all used for feature extraction; the Split layer is used to segment the input tensor; the fifth and sixth Concat layers are used for feature fusion; the Add layer is used to add the input tensor element by element; the DSConv (Dynamic Snake Convolutional Network) layer is used to decompose the standard convolution into two steps: depthwise convolution and pointwise convolution, reducing the amount of computation and parameters.
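The saving obtained by splitting a standard convolution into a depthwise step and a pointwise step can be checked with simple parameter counting. The sketch below ignores bias terms, and the channel counts and kernel size are hypothetical values chosen only for illustration; they are not taken from the patent.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 mixing across channels
    return depthwise + pointwise

# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
std = standard_conv_params(128, 256, 3)
sep = separable_conv_params(128, 256, 3)
print(std, sep, round(sep / std, 3))
```

For this hypothetical layer the decomposed form keeps roughly one ninth of the weights, which is the kind of reduction the paragraph refers to.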

[0084] Furthermore, in step 4, the specific steps for extracting the seedling rows from the seedling image by performing perspective correction and cluster fitting on the key points of the seedling leaf sheath in the seedling image based on the vanishing point and perspective geometry (VP-PSE) strategy are as follows:

[0085] Step 4.1: Locate the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath;

[0086] Step 4.2: Perform top-view correction on the seedling image to be identified and on the key points of the seedling leaf sheath, based on the seedling vanishing point, to obtain the corrected top view of the seedlings and the corrected key points.

[0087] Step 4.3: Perform row fitting on the corrected key points to obtain the corrected seedling rows.

[0088] Furthermore, in step 4.1, the specific steps for locating the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath are as follows:

[0089] Step 4.1.1: Set the diameter of the dense region circle and the point-number threshold. The diameter of the region circle is 8 to 12 pixels, and the point-number threshold is 2 to 4. The diameter is preferably set to 10 pixels, and the point-number threshold is preferably set to 3.

[0090] Step 4.1.2: Based on the obtained key points of the seedling leaf sheath $\{(x_i, y_i)\}$, generate an initial line segment for each row on the image of the seedling to be identified. The key points $\{(x_i, y_i)\}$ are shown in Figure 8a and the generated initial line segments are shown in Figure 8b; the key points of each row are distributed close to the corresponding initial line segment.

[0091] Step 4.1.3: Represent each generated initial line segment in two-dimensional homogeneous coordinates in the form $l_i = (a_i, b_i, c_i)^T$, and extend each initial line segment to the far end, as shown in Figure 8c;

[0092] Step 4.1.4: Obtain the intersection points of the extended initial line segments of the rows, then establish a region circle of the set diameter on the seedling image to be identified and move it across the image. When the number of intersection points inside the region circle reaches its maximum and is greater than the point-number threshold, stop moving. Take the region circle at this position as the dense intersection region, and set the center point of the dense intersection region as the seedling vanishing point.
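Steps 4.1.3 and 4.1.4 can be sketched as follows. Each segment becomes a homogeneous line via the cross product of its endpoints, and pairwise intersections are again cross products. As a simplification, the moving region circle is only tried at the intersection points themselves, which is an assumption of this sketch rather than the patent's search procedure; the function names are likewise hypothetical.

```python
import math
from itertools import combinations

def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-9:
        return None
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)

def locate_vanishing_point(segments, diameter=10.0, min_points=3):
    """segments: list of ((x1, y1), (x2, y2)) initial row segments."""
    lines = [line_through(p, q) for p, q in segments]
    crossings = []
    for l1, l2 in combinations(lines, 2):
        pt = intersect(l1, l2)
        if pt is not None:
            crossings.append(pt)
    radius = diameter / 2.0
    best_center, best_count = None, 0
    # Simplification: candidate circle centers are the crossings themselves.
    for center in crossings:
        count = sum(1 for pt in crossings if math.dist(center, pt) <= radius)
        if count > best_count:
            best_center, best_count = center, count
    # The densest circle must hold more crossings than the threshold.
    return best_center if best_count > min_points else None
```

The defaults mirror the preferred values in step 4.1.1 (10-pixel diameter, threshold of 3). With four rows there are six pairwise intersections, so a well-converging set of rows clears the threshold comfortably.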

[0093] Furthermore, in step 4.2, the specific steps for top-view correction of the seedling image to be identified and the key points of the seedling leaf sheath based on the seedling vanishing point are as follows:

[0094] Step 4.2.1: Establish a homography matrix H to describe the mapping relationship between the seedling image to be identified and the corrected top view of the seedling. The homography matrix H is a 3×3 matrix. Then, solve the homography matrix H according to the coordinates of the seedling vanishing point and the measured row spacing of the seedlings.

[0095] Step 4.2.2: Based on the homography matrix H, establish the coordinate projection relationship between a pixel coordinate $p = (x, y, 1)^T$ in the image of the seedling to be identified and the corresponding pixel coordinate $p' = (x', y', 1)^T$ in the corrected top view of the seedlings. The coordinate projection relationship is $p' \sim Hp$, and the top view of the seedlings is shown in Figure 8d;

[0096] Step 4.2.3: Based on the coordinate projection relationship, correct the key points $\{(x_i, y_i)\}$ of the seedling leaf sheath in the seedling image to be identified to the corrected key points $\{(x'_i, y'_i)\}$ in the top view of the seedlings;

[0097] Step 4.2.4: Correct the initial line segments to the corrected line segments in the top view of the seedlings according to the coordinate projection relationship.
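The projection relationship p′ ~ Hp in steps 4.2.2 and 4.2.3 amounts to a matrix-vector product followed by division by the third homogeneous component. The sketch below applies a given H to key points. The matrix `H_EXAMPLE` is hand-picked for illustration so that rows converging at a vanishing point (100, 0) become vertical lines; it is not derived by the solving procedure of step 4.2.1, which the text does not spell out.

```python
def apply_homography(H, point):
    """Map (x, y) through the 3 x 3 homography H and dehomogenize."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under H")
    return (xh / w, yh / w)

def rectify_points(H, points):
    """Correct every key point into the top view."""
    return [apply_homography(H, p) for p in points]

# Hypothetical homography: sends the horizon line y = 0 to infinity, so that
# image rows meeting at the vanishing point (100, 0) become parallel.
H_EXAMPLE = [[200.0, 100.0, -20000.0],
             [0.0, 400.0, -40000.0],
             [0.0, 1.0, 0.0]]

# Two key points on the same converging row share one x' after correction.
print(rectify_points(H_EXAMPLE, [(40.0, 200.0), (70.0, 100.0)]))
```

Points lying on one converging row in the original image end up with the same x′ coordinate, which is what makes the row fitting in the top view straightforward.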

[0098] Furthermore, in step 4.3, the specific steps for obtaining the corrected seedling rows by performing row fitting on each key correction point are as follows:

[0099] Step 4.3.1: Perform vertical ground projection on each key correction point in the top view of the corrected seedlings to obtain each seedling projection point. Then, put each seedling projection point that is adjacent to the same correction line segment into the same point set to obtain each projection point set.

[0100] Step 4.3.2: Set the reference search point, search angle, and search radius. The preferred search angle is 5°, and the preferred search radius is 5 pixels. Based on the search angle and search radius, establish two fan-shaped regions that form vertically opposite angles, centered on the reference search point. The bisector of the central angle of each fan-shaped region is parallel to the correction line segment adjacent to the reference search point, as shown in Figures 8d and 8e;

[0101] Step 4.3.3: Select a set of projection points, take out a seedling projection point as the reference search point, establish two fan-shaped regions of the seedling projection point according to the search angle and search radius, translate the correction line segment adjacent to the reference search point, and draw two boundary lines when translating to the two ends of the arc of the fan-shaped region. Take the area between the two boundary lines as the search range, and take each seedling projection point in the search range as the same type of search result for the current seedling projection point.

[0102] Step 4.3.4: Determine whether each seedling projection point in the current projection point set has obtained the same type of search result. If all of them have obtained the same type of search result, proceed to step 4.3.5; otherwise, return to step 4.3.3.

[0103] Step 4.3.5: Compare the size of each similar search result corresponding to the current set of projection points, find the similar search result with the most seedling projection points, and use the least squares method to fit a straight line for each seedling projection point in the found similar search result to obtain the fitted straight line corresponding to the current set of projection points.

[0104] Step 4.3.6: Determine whether each set of projection points has obtained a fitted line. If no fitted line has been obtained, return to step 4.3.3. If the corresponding fitted line has been obtained, then each fitted line is used as the corrected seedling row.
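Steps 4.3.2 to 4.3.5 can be sketched for a single projection point set as below. The sketch assumes the search angle is the full central angle of each fan-shaped region, so that translating the correction line to the arc ends gives a band of half-width r·sin(θ/2) around the reference point. It also assumes rows are close to vertical in the top view and therefore fits x = m·y + b. Both are assumptions of this illustration, and all function names are hypothetical.

```python
import math

def band_members(ref, points, direction, angle_deg, radius):
    """Points lying between the two boundary lines drawn through the arc ends
    of the fan-shaped regions around ref (the search range of step 4.3.3)."""
    half_width = radius * math.sin(math.radians(angle_deg) / 2.0)
    dx, dy = direction
    norm = math.hypot(dx, dy)
    nx, ny = -dy / norm, dx / norm  # unit normal of the correction line
    return [p for p in points
            if abs((p[0] - ref[0]) * nx + (p[1] - ref[1]) * ny) <= half_width]

def fit_row(points):
    """Least-squares line x = m * y + b (suits near-vertical rows)."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    syy = sum((p[1] - mean_y) ** 2 for p in points)
    if syy == 0:
        raise ValueError("need at least two points with distinct y")
    m = sum((p[1] - mean_y) * (p[0] - mean_x) for p in points) / syy
    return m, mean_x - m * mean_y

def fit_point_set(points, direction, angle_deg=5.0, radius=5.0):
    """Use every point as reference, keep the largest search result, fit it."""
    best = max((band_members(ref, points, direction, angle_deg, radius)
                for ref in points), key=len)
    return fit_row(best)
```

Choosing the search result with the most members before fitting means isolated stray points (for example a seedling between rows) do not pull the least-squares line away from the row.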

[0105] Furthermore, in step 5, the specific steps for mapping the extracted seedling rows back to the coordinate system on the seedling image to be identified are as follows:

[0106] Step 5.1: Calculate the inverse homography matrix $H^{-1}$ of the homography matrix H. For a corrected coordinate p′, the corresponding original image coordinate p satisfies $p \sim H^{-1}p'$;

[0107] Step 5.2: Calculate the straight-line equation corresponding to each corrected seedling row, and then use the inverse homography matrix $H^{-1}$ to map each straight-line equation back to the coordinate system on the image of the seedling to be identified, as shown in Figure 8f.
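A minimal sketch of step 5 follows. Points return to the original image through the inverse matrix. For a line written in homogeneous form l′ = (a, b, c), the equivalent operation in projective geometry is l ~ Hᵀl′, which gives the same result as mapping two points of the line back and joining them. The example matrix and line are hypothetical illustration values, not data from the patent.

```python
def inverse_3x3(H):
    """Inverse of a 3 x 3 matrix via the adjugate."""
    a, b, c = H[0]
    d, e, f = H[1]
    g, h, i = H[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]

def map_point_back(H_inv, point):
    """p ~ H^-1 p': corrected top-view point to original image point."""
    x, y = point
    xh = H_inv[0][0] * x + H_inv[0][1] * y + H_inv[0][2]
    yh = H_inv[1][0] * x + H_inv[1][1] * y + H_inv[1][2]
    w = H_inv[2][0] * x + H_inv[2][1] * y + H_inv[2][2]
    return (xh / w, yh / w)

def map_line_back(H, line):
    """l ~ H^T l': corrected line (a, b, c) to a line in the original image."""
    return tuple(sum(H[r][col] * line[r] for r in range(3)) for col in range(3))

# Hypothetical homography and a fitted top-view row x' = 40, i.e. (1, 0, -40).
H = [[200.0, 100.0, -20000.0],
     [0.0, 400.0, -40000.0],
     [0.0, 1.0, 0.0]]
row_in_image = map_line_back(H, (1.0, 0.0, -40.0))
print(row_in_image, map_point_back(inverse_3x3(H), (40.0, 0.0)))
```

Mapping the line coefficients directly avoids sampling points along each fitted row, although either route yields the same row in the original image.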

[0108] A verification experiment was conducted on the method for accurate acquisition of early rice seedling rows based on leaf sheath recognition disclosed in this invention. Two interference environments, algae and wind, were used as examples. The experimental results are shown in Figures 9 to 11.

[0109] To verify the seedling detection performance of the early rice seedling row acquisition method based on leaf sheath recognition under different field conditions, experiments were conducted under two interference environments: algae and wind. The results are shown in Figure 9. In the algae-disturbed environment, as shown in Figure 9a, the model accurately detects most seedlings, but identifying certain individual seedlings remains challenging, as indicated by the white circle in Figure 9. This may be because the similarity in color between algae and leaves, or the morphology of withered leaves, interferes with the model's ability to extract features from the leaf sheath region. In the wind-affected environment, as shown in Figure 9f, the visibility of some seedlings decreased due to leaf movement, which affected the model's detection accuracy. The model's detection accuracy was therefore somewhat affected by both algae and wind interference, but overall it still maintained good recognition capability. The leaf sheath recognition method proposed in this invention thus demonstrates strong robustness and good detection performance in various field environments.

[0110] To verify the accuracy of fitting early seedling rows based on VP-PSE, manually labeled seedling rows were used as the benchmark, and the average lateral distance and the angle between the fitted rows and the manually labeled rows were calculated. The search angle was set to 5°, and the search radius was set to 5 pixels. K denotes the accuracy of fitting early seedling rows; K, the average lateral distance error, and the average angular error are calculated using formula (1).

[0111]

[0112] In the formula, N is the number of images in which the early seedling rows are correctly fitted, M is the total number of images participating in the test, U is the total number of point pairs selected at the same horizontal position on the two straight lines, $L_n$ is the lateral distance between the n-th pair of points, and $\alpha_i$ is the angle error of the i-th image.
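Formula (1) itself is not reproduced in this text. Based only on the symbol definitions above, a plausible form is sketched below. The symbols $\bar{L}$ and $\bar{\alpha}$ are placeholder names for the average lateral distance error and the average angular error, and averaging the angle error over the N correctly fitted images is an assumption. The expression for K agrees with the reported 112 correctly fitted images out of 120 giving 93.33%.

```latex
K = \frac{N}{M} \times 100\%, \qquad
\bar{L} = \frac{1}{U} \sum_{n=1}^{U} L_n, \qquad
\bar{\alpha} = \frac{1}{N} \sum_{i=1}^{N} \alpha_i
```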

[0113] The early seedling row fitting test dataset contains 120 seedling images taken in complex scenarios, covering various complex situations such as missing seedlings, interstitial seedlings, and irregular seedling rows. These complex scenarios do not occur independently but often alternate or coexist within the same seedling row, significantly increasing the difficulty of seedling row fitting. The early seedling rows were obtained using the VP-PSE algorithm combined with typical clustering algorithms (K-Means, DBSCAN) and the least squares method, and the results were compared with manually labeled images. The results are shown in Table 1.

[0114] Table 1 Comparison of Detection Accuracy Results of VP-PSE Algorithm, K-Means Algorithm, and DBSCAN Algorithm

[0115]

[0116] Table 1 shows that the VP-PSE algorithm achieved a fitting accuracy of 93.33% on the 120 test images, successfully fitting 112 images. Its average lateral distance error was 3.31 and its average angle error was 3.82°, both significantly lower than those of the K-Means and DBSCAN algorithms (average lateral distance errors of 6.17 and 4.83, and average angle errors of 4.67° and 5.76°, respectively). This indicates that the VP-PSE algorithm outperforms the K-Means and DBSCAN algorithms in fitting accuracy (K), average lateral distance error, and average angle error. Regarding the fitting time for a single image, the VP-PSE algorithm (29.18 ms) and the K-Means algorithm (28.33 ms) are similar, while the DBSCAN algorithm requires the longest fitting time, 49.27 ms. This indicates that the VP-PSE algorithm not only provides high-precision fitting results but also has high computational efficiency.

[0117] Under the two interference environments, algae and wind, the results of acquiring early seedling rows based on leaf sheath recognition and on leaf blade recognition were compared, as shown in Figures 10 and 11. In Figure 10, the yellow line represents the manually marked early seedling rows, the red line represents the early seedling rows identified based on leaf sheaths, and the white line represents the early seedling rows identified based on leaves.

[0118] Under algae interference, Figure 11 shows that the row errors for early seedling row acquisition based on leaf sheath recognition and leaf blade recognition are 3.26 and 6.73, respectively. The larger row error for leaf blade recognition may be due to the reduced visible portion of leaves in the edge area and the change in leaf angle, which make the row line harder to identify. Figure 10 also shows that the differences between the rows gradually increase, especially as the seedling rows approach the edge of the image.

[0119] Under wind interference, Figure 11 shows that the errors in obtaining early seedling rows based on leaf sheath identification and leaf identification are 3.38 and 11.57, respectively. Wind interference therefore has a greater influence on the row acquisition error than algae interference, especially for early seedling rows obtained from leaf identification.

[0120] It is evident that the error in obtaining early seedling rows based on leaf sheath identification does not change significantly under both algae and wind interference environments. This identification method can accurately identify relatively stable key points of early seedling leaf sheaths, effectively suppress interference from seedling leaves, and well solve the problem of accurate identification of early seedling rows.
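As a worked check using only the reported row errors, the relative reduction obtained by switching from leaf-based to leaf-sheath-based identification can be computed for each environment, along with the change in each method's error between the algae and wind conditions. The wind figure is in line with the roughly 70% reduction stated in the summary; the choice of the leaf-based error as the baseline is an assumption of this calculation.

```latex
\text{wind: } \frac{11.57 - 3.38}{11.57} \approx 0.708, \qquad
\text{algae: } \frac{6.73 - 3.26}{6.73} \approx 0.516

\text{leaf sheath: } 3.38 - 3.26 = 0.12, \qquad
\text{leaf: } 11.57 - 6.73 = 4.84
```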

[0121] As described above, although the invention has been shown and described with reference to specific preferred embodiments, it should not be construed as limiting the invention itself. Various changes in form and detail may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition, characterized in that, Includes the following steps: Step 1: Construct an augmented set of field image data of early rice seedlings in paddy fields; Step 2: Construct a target detection model for locating key points of seedling leaf sheaths, and train the target detection model using field image data augmentation set; Step 3: Read the image of the seedling to be identified, establish a coordinate system on the image of the seedling to be identified, and then use the target detection model to identify the key points of the seedling leaf sheath in the image of the seedling to be identified, and output the key points of the seedling leaf sheath in the image of the seedling to be identified. Step 4: Perform perspective correction and cluster fitting on the key points of the leaf sheaths of the seedlings in the image of the seedlings to be identified, and extract the rows of seedlings in the image of the seedlings to be identified. Step 5: Map the extracted seedling rows back to the coordinate system on the seedling image to be identified, and use the seedling image to be identified along with the seedling rows as the final detection result; In step 2, the specific steps for constructing a target detection model for locating key points of the seedling leaf sheath are as follows: Step 2.1: Set the positioning range of the key points of the seedling leaf sheath to the seedling area where the leaf ring extends to the soil surface; Step 2.2: Construct an improved YOLOv8 network model as the object detection model; Step 2.3: Train the object detection model using the training set, then validate the trained object detection model using the validation set, and test the object detection model using the test set after the validation is passed. After the test is passed, the object detection model is completed. 
In step 2.2, the constructed improved YOLOv8 network model includes a backbone network module, a neck module, and a head module. The backbone network module includes the following layers connected in sequence: a first Conv layer, a second Conv layer, a first C2f layer, a third Conv layer, a second C2f layer, a fourth Conv layer, a third C2f layer, a fifth Conv layer, a DSC-C2f layer, a CBAM layer, an SPPF layer, and a C3Ghost layer. The neck module includes the following layers connected in sequence: a first Upsample layer, a first Concat layer, a fourth C2f layer, a second Upsample layer, a second Concat layer, a fifth C2f layer, a sixth Conv layer, a third Concat layer, a sixth C2f layer, a seventh Conv layer, a fourth Concat layer, and a seventh C2f layer. The head module includes a first Detect layer, a second Detect layer, and a third Detect layer. The DSC-C2f layer is used to enhance feature extraction capabilities through adaptive deformable convolution; the CBAM layer is used to suppress background noise interference; the SPPF layer is used to capture multi-scale information; and the C3Ghost layer is used to reduce the number of network parameters and computational cost through a redundant feature generation strategy. The output of the C3Ghost layer is connected to the input of the first Upsample layer and the input of the fourth Concat layer, respectively; the outputs of the first Detect layer, the second Detect layer and the third Detect layer output seedling images marked with key points of the seedling leaf sheath, respectively.

2. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 1, characterized in that, In step 1, the specific steps for constructing the field image data augmentation set of early rice seedlings are as follows: Step 1.1: Collect images of early rice seedlings in paddy fields under different weather conditions and at different locations according to the set image resolution; Step 1.2: Perform data augmentation processing on the collected early rice seedling images from various paddy fields, including cropping, rotation, brightness adjustment, and Gaussian noise injection. Step 1.3: Construct a field image data augmentation set using early rice seedling images from various paddy fields after data augmentation processing, and divide the field image data augmentation set into a training set, a validation set, and a test set according to a set ratio.

3. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 1, characterized in that, The first, second, third, fourth, fifth, sixth, and seventh Conv layers are all used for feature extraction; the first, second, third, fourth, fifth, sixth, and seventh C2f layers are all used for feature extraction and information fusion of the input feature map to obtain a higher-level feature representation; the first and second Upsample layers are both used for upsampling. The first, second, third, and fourth Concat layers are all used for feature fusion; the first, second, and third Detect layers are all used for bounding box and category prediction. The input of the first Conv layer is used to input the seedling image to be identified. The output of the second C2f layer is also connected to the input of the second Concat layer. The output of the third C2f layer is also connected to the input of the first Concat layer. The output of the fifth C2f layer is also connected to the input of the first Detect layer. The output of the sixth C2f layer is also connected to the input of the second Detect layer. The output of the seventh C2f layer is connected to the input of the third Detect layer.

4. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 1, characterized in that, The DSC-C2f layer comprises an eighth Conv layer, a Split layer, n DSC-Bneck layers, a fifth Concat layer, and a ninth Conv layer, all connected in series. The outputs of the eighth Conv layer, the Split layer, and the n DSC-Bneck layers are all connected to the input of the fifth Concat layer. The DSC-Bneck layer comprises a tenth Conv layer, a DySnakeConv layer, an eleventh Conv layer, and an Add layer, all connected in series. The input of the tenth Conv layer is also connected to the input of the Add layer. The DySnakeConv layer comprises a twelfth Conv layer, a DSCConv layer, and a sixth Concat layer. The input of the twelfth Conv layer serves as the input of the DySnakeConv layer. The output of the twelfth Conv layer is connected to the input of the DSCConv layer. The output of the DSCConv layer is connected to the input of the sixth Concat layer. The output of the sixth Concat layer serves as the output of the DySnakeConv layer. The eighth, ninth, tenth, eleventh, and twelfth Conv layers are all used for feature extraction; the Split layer is used to segment the input tensor; the fifth and sixth Concat layers are used for feature fusion; the Add layer is used to add the input tensor element by element; and the DSConv layer is used to decompose the standard convolution into two steps: depthwise convolution and pointwise convolution, reducing the amount of computation and parameters.

5. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 1, characterized in that, In step 4, the specific steps for extracting the rows of seedlings from the image of the seedlings to be identified are as follows: Step 4.1: Locate the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath; Step 4.2: Correct the top view of the seedling image to be identified and the key points of the seedling leaf sheath based on the seedling vanishing point to obtain the corrected top view of the seedling and each corrected key point. Step 4.3: Perform row fitting on each correction key point to obtain the corrected seedling rows.

6. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 5, characterized in that, In step 4.1, the specific steps for locating the vanishing point of the seedling in the image of the seedling to be identified based on the key points of the seedling leaf sheath are as follows: Step 4.1.1: Set the diameter of the dense region circle and the point-number threshold. The diameter of the region circle is 8 to 12 pixels, and the point-number threshold is 2 to 4. Step 4.1.2: Generate initial line segments for each row on the seedling image to be identified based on the obtained key points of the seedling leaf sheath, so that the key points of each row are distributed close to the corresponding initial line segment. Step 4.1.3: Represent each generated initial line segment in two-dimensional homogeneous coordinates and extend each initial line segment to the far end; Step 4.1.4: Obtain the intersection points of the extended initial line segments of each row, and then establish a region circle on the seedling image to be identified according to the diameter of the region circle. Move the region circle on the seedling image to be identified. When the number of intersection points in the region circle reaches the maximum and is greater than the point-number threshold, stop moving. Take the region circle at this time as the dense intersection region, and then set the center point of the dense intersection region as the seedling vanishing point.

7. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 6, characterized in that, In step 4.2, the specific steps for top-view correction of the seedling image to be identified and the key points of the seedling leaf sheath based on the seedling vanishing point are as follows: Step 4.2.1: Establish a homography matrix H to describe the mapping relationship between the seedling image to be identified and the corrected top view of the seedlings. Then, solve the homography matrix H based on the vanishing point coordinates of the seedlings and the measured row spacing of the seedlings. Step 4.2.2: Establish the coordinate projection relationship between the pixel coordinates in the seedling image to be identified and the pixel coordinates in the corrected top view of the seedling based on the homography matrix H; Step 4.2.3: Correct the key points of the seedling leaf sheath in the seedling image to be identified to the corrected key points in the top view of the seedling, based on the coordinate projection relationship. Step 4.2.4: Correct the initial line segments to the corrected line segments in the top view of the seedlings according to the coordinate projection relationship.

8. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 7, characterized in that, In step 4.3, the specific steps for obtaining the corrected seedling rows by performing row fitting on each key correction point are as follows: Step 4.3.1: Perform vertical ground projection on each key correction point in the top view of the corrected seedlings to obtain each seedling projection point. Then, put each seedling projection point that is adjacent to the same correction line segment into the same point set to obtain each projection point set. Step 4.3.2: Set the reference search point, search angle, and search radius. Based on the search angle and search radius, establish two fan-shaped regions that form vertically opposite angles, centered on the reference search point. The angle bisector of the central angle of the fan-shaped region is parallel to the correction line segment adjacent to the reference search point. Step 4.3.3: Select a set of projection points, take out a seedling projection point as the reference search point, establish two fan-shaped regions of the seedling projection point according to the search angle and search radius, translate the correction line segment adjacent to the reference search point, and draw two boundary lines when translating to the two ends of the arc of the fan-shaped region. Take the area between the two boundary lines as the search range, and take each seedling projection point in the search range as the same type of search result for the current seedling projection point. Step 4.3.4: Determine whether each seedling projection point in the current projection point set has obtained the same type of search result. If all of them have obtained the same type of search result, proceed to step 4.3.5; otherwise, return to step 4.3.3. Step 4.3.5: Compare the size of each similar search result corresponding to the current set of projection points, find the similar search result with the most seedling projection points, and use the least squares method to fit a straight line for each seedling projection point in the found similar search result to obtain the fitted straight line corresponding to the current set of projection points. Step 4.3.6: Determine whether each set of projection points has obtained a fitted line. If no fitted line has been obtained, return to step 4.3.3. If the corresponding fitted line has been obtained, then each fitted line is used as the corrected seedling row.

9. The method for accurately acquiring early rice seedling rows in paddy fields based on leaf sheath recognition according to claim 7, characterized in that, In step 5, the specific steps for mapping the extracted seedling rows back to the coordinate system on the seedling image to be identified are as follows: Step 5.1: Calculate the inverse homography matrix $H^{-1}$ of the homography matrix H; Step 5.2: Calculate the straight-line equation of each corrected seedling row, and then map each straight-line equation back to the coordinate system on the seedling image to be identified according to the inverse homography matrix $H^{-1}$.