Inference optimization method for a lane line detection model trained with generalized data

By training a lane detection model with generalized data, using a deep learning vision model to repair and predict lane lines, and combining it with temporal information for evaluation, the problem of inaccurate lane line recognition is solved, and the robustness and accuracy of the autonomous driving system are improved.

CN119478869B · Active · Publication Date: 2026-06-23 · ANHUI JIANGHUAI AUTOMOBILE GRP CORP LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: ANHUI JIANGHUAI AUTOMOBILE GRP CORP LTD
Filing Date: 2024-10-30
Publication Date: 2026-06-23


Abstract

The application discloses an inference optimization method for a lane line detection model trained with generalized data. Its main design concept is a lane line recognition scheme, based on a deep learning vision model, that combines associative learning with temporal control. Specifically, in the training stage, visual generation models are used to repair abnormal lane line images and to predict rare or hard-to-capture lane line images, so that the accuracy and efficiency of lane line labeling are improved through human-machine coupling. In the actual inference stage, a temporal concept is introduced, and a mechanism is proposed for evaluating the detection result using the temporal information of the preceding and following frames. The application effectively enhances and generalizes the training data of the detection model and improves the robustness and accuracy of the lane line detection result, so that many problems of lane line recognition in autonomous driving applications can be solved.

Description

Technical Field

[0001] This invention relates to the field of autonomous driving technology, and in particular to an inference optimization method for lane detection models trained using generalized data.

Background Technology

[0002] Lane recognition is crucial in autonomous driving, helping the system monitor and track lane markings in real time to ensure the vehicle stays centered in the lane, thus improving driving safety. By accurately recognizing lane lines, the system can effectively prevent lane departure, assist vehicle decisions such as changing lanes or overtaking, and collaborate with other autonomous driving functions to optimize the driving experience and overall safety.

[0003] However, in reality, lane markings may not be clear, especially in urban areas. Some lane markings may be obscured by other vehicles, or there may be no lane markings on some roads. These factors can all affect the accuracy and reliability of lane marking recognition. Currently, to address the problems in lane marking recognition, most technical approaches combine high-precision maps with real-time positioning systems (such as GPS) to obtain road geometry and topology information.

[0004] However, practical experience has revealed the following problems with combining high-precision maps and real-time positioning:

[0005] (1) Untimely map updates. High-precision maps need to be updated regularly. If the map data is outdated, it may lead to inaccurate lane line recognition.

[0006] (2) Limited map coverage. For some areas, such as industrial parks and country roads, the map does not include lane markings.

[0007] (3) High dependence on real-time positioning accuracy. Positioning errors propagate into the lane line information, resulting in inaccurate lane line recognition.

[0008] (4) Road data updates. High-precision maps cannot reflect changes such as temporary roads, construction, or accidents in real time, resulting in discrepancies between the map and the actual situation.

Summary of the Invention

[0009] In view of the above, the present invention aims to provide an inference optimization method for lane detection models trained with generalized data, so as to solve the aforementioned technical problems.

[0010] The technical solution adopted in this invention is as follows:

[0011] This invention provides an inference optimization method for a lane detection model trained with generalized data, comprising:

[0012] Constructing a first visual generation model and a second visual generation model using a first training set and a second training set, respectively;

[0013] Obtaining an image to be labeled and determining whether the image contains abnormal lane lines;

[0014] Inputting an image containing abnormal lane lines into the first visual generation model to obtain a lane line extension image in which the abnormal lane lines are repaired;

[0015] Inputting images without abnormal lane lines and / or the lane line extension images into the second visual generation model to generate several lane line prediction images;

[0016] Annotating the lane line extension images and the lane line prediction images;

[0017] Building a lane detection model using the labeled images, and using the lane detection model to detect lane lines in actual input images;

[0018] Evaluating the detection results of the current frame in combination with the detection results of the previous and next frames.
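The data-generation part of the steps above (routing each image through the first and then the second visual generation model) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the callables `has_abnormal_lanes`, `repair`, and `predict` are hypothetical placeholders standing in for the anomaly check, the first visual generation model, and the second visual generation model, and the function name is not taken from the disclosure.

```python
from typing import Callable, List, TypeVar

Image = TypeVar("Image")


def generate_annotation_candidates(
    images: List[Image],
    has_abnormal_lanes: Callable[[Image], bool],  # hypothetical anomaly check
    repair: Callable[[Image], Image],             # placeholder for the first model
    predict: Callable[[Image], List[Image]],      # placeholder for the second model
) -> List[Image]:
    """Collect the extension and prediction images that go on to annotation."""
    candidates: List[Image] = []
    for image in images:
        if has_abnormal_lanes(image):
            # Abnormal lane lines are repaired first; the repaired
            # (extension) image is itself an annotation candidate.
            base = repair(image)
            candidates.append(base)
        else:
            base = image
        # Normal images and extension images both feed the second model.
        candidates.extend(predict(base))
    return candidates


# Usage with strings standing in for images.
result = generate_annotation_candidates(
    ["ok", "bad"],
    has_abnormal_lanes=lambda s: s == "bad",
    repair=lambda s: s + "_fixed",
    predict=lambda s: [s + "_t1", s + "_t2"],
)
print(result)
```

The sketch follows the "and / or" wording by sending both normal images and repaired images to the second model; an implementation that uses only one of the two sources would drop the corresponding branch.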

[0019] In at least one possible implementation, the output of the second visual generation model is fed back into the first visual generation model for repair processing.

[0020] In at least one possible implementation, the output of the first visual generation model is checked for defects. If a defect is found, the output is fed back into the first visual generation model for extension.

[0021] In at least one possible implementation, the first training set is constructed using real lane line images under multiple scenarios and lighting conditions, and the images in the first training set include lane line anomalies.

[0022] In at least one possible implementation, the second training set is constructed using real lane line images containing time-series information, wherein the images in the second training set contain lane line conditions that change state over consecutive time periods.

[0023] In at least one possible implementation, the evaluation combines several of the following scoring dimensions: distance consistency score, shape consistency score, smoothness score, and feature matching score.

[0024] Compared with existing technologies, the main design concept of this invention lies in proposing a lane line recognition scheme based on associative learning and temporal control, using a deep learning visual model. Specifically, during the training phase, a visual generative model is used to repair abnormal lane line images and predict rare or hard-to-capture lane line images, improving the accuracy and efficiency of lane line annotation through human-machine coupling. During the actual inference phase, a temporal concept is introduced, combining temporal information from previous and subsequent frames to propose a mechanism for evaluating the detection results. This invention effectively enhances and generalizes the training data of the detection model, improving the robustness and accuracy of lane line detection results, thereby solving many problems in lane line recognition in autonomous driving applications.

Attached Figure Description

[0025] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described below with reference to the accompanying drawings, wherein:

[0026] Figure 1 is a schematic diagram of an inference optimization method for a lane detection model trained using generalized data, provided in an embodiment of the present invention.

Detailed Implementation

[0027] Embodiments of the present invention are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.

[0028] This invention proposes an embodiment of an inference optimization method for a lane detection model trained with generalized data. Specifically, as shown in Figure 1, the method includes:

[0029] Step S1: Pre-construct the first visual generation model and the second visual generation model;

[0030] Step S2: Obtain the image to be labeled and determine whether there are abnormal lane lines in the image;

[0031] Step S3: Input the image containing abnormal lane lines into the first visual generation model to obtain a lane line extension image, in which the abnormal lane lines are repaired;

[0032] Step S4: Input the image without abnormal lane lines and / or the lane line extension image into the second visual generation model to generate several lane line prediction images;

[0033] In practice, a high-quality dataset can be prepared in advance, including lane line images under various scenarios and lighting conditions. This dataset can contain labeled data of real lane lines and cover different occlusion situations, such as occlusion by vehicles, trees, or road obstacles. This labeled data serves as the basis for training the first visual generation model, ensuring that the model can learn the features and occlusion states of lane lines.

[0034] Next, a suitable first visual generative model architecture is selected, such as, but not limited to, deep generative models like Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), which can learn features from the input image and generate the corresponding lane line image. After selecting a suitable model architecture, the first model needs to be trained using the aforementioned dataset.

[0035] Furthermore, in practice, a temporal generation model architecture, among others, can be chosen as the architecture for the second visual generation model. This ensures the generation of predicted images for multiple future time periods based on the current lane line image. Specifically, during the training phase of the second model, a dataset containing time-series information is used to train the model. The dataset includes lane line images over consecutive time periods, allowing the model to learn the patterns and regularities of lane line changes over time. During training, the second model starts from the current image and generates future image sequences that encompass various possible variations, such as lane line deformation, background changes, and the movement of other elements on the road.
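To make the time-series training data concrete, the following Python sketch shows one possible way to turn a sequence of consecutive lane line frames into (current frame, future frames) training pairs for the second visual generation model. The helper name `make_temporal_pairs` and the `horizon` parameter (number of future frames per sample) are assumptions introduced for illustration; the description does not specify how the pairs are formed.

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_temporal_pairs(
    frames: Sequence[Frame], horizon: int
) -> List[Tuple[Frame, List[Frame]]]:
    """Pair each frame with the `horizon` frames that immediately follow it."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    pairs: List[Tuple[Frame, List[Frame]]] = []
    # Stop early enough that every sample has a full set of future frames.
    for i in range(len(frames) - horizon):
        future = list(frames[i + 1 : i + 1 + horizon])
        pairs.append((frames[i], future))
    return pairs


# Usage with integers standing in for frames of one continuous recording.
print(make_temporal_pairs([0, 1, 2, 3], horizon=2))
```

Pairs should be built per recording, so that a sample never spans two unrelated sequences.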

[0036] After the two visual generation models are trained and built, they are deployed to data annotation tools or platforms. When annotators process images:

[0037] On the one hand, defective lane lines are automatically extended by the first visual generation model to assist the subsequent data supplementation process.

[0038] On the other hand, the second visual generation model can be used to predict lane lines from a single image, generating multiple predicted images that illustrate possible changes in the lane lines over time. Annotators can then select representative predicted images for subsequent annotation processing.

[0039] Step S4: The lane line extension images and lane line prediction images are manually labeled;

[0040] The above-described solution of this invention can improve annotation efficiency and reduce errors caused by manual intervention. Based on this, a feedback mechanism can be designed to correct and adjust the outputs of the first and second visual generation models, iterating the performance of the visual generation models. For example, the output of the second visual generation model can be filtered and then fed back into the first visual generation model to repair images predicted to have defects; alternatively, the output of the first visual generation model can be directly judged for defects (defects are the anomalies mentioned above, such as occlusion, unclear images, missing parts, etc.). If a defect is determined, it is fed back into the first visual generation model for expansion; if no defect is determined, it proceeds to the subsequent manual annotation stage.
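The second feedback path described above (check the output of the first visual generation model for defects and feed defective outputs back in) can be sketched as a bounded loop. This is a hedged illustration: `has_defect` and `repair` are hypothetical callables, and the `max_rounds` limit and the `None` return for images that never pass the check are assumptions added so the loop terminates; the description does not state a stopping rule.

```python
from typing import Callable, Optional, TypeVar

Image = TypeVar("Image")


def repair_until_clean(
    image: Image,
    has_defect: Callable[[Image], bool],  # hypothetical defect check
    repair: Callable[[Image], Image],     # placeholder for the first model
    max_rounds: int = 3,                  # assumed bound, not from the source
) -> Optional[Image]:
    """Return a defect-free image for manual annotation, or None if the
    image still has defects after `max_rounds` repair passes."""
    current = image
    for _ in range(max_rounds):
        if not has_defect(current):
            return current
        current = repair(current)
    return None if has_defect(current) else current


# Usage: an integer stands in for an image, counting its remaining defects.
print(repair_until_clean(2, lambda n: n > 0, lambda n: n - 1))  # passes after 2 rounds
print(repair_until_clean(5, lambda n: n > 0, lambda n: n - 1))  # still defective
```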

[0041] Step S5: Construct a lane line detection model using the labeled images, and use the lane line detection model to detect lane lines in the actual input images;

[0042] Step S6: Evaluate the detection results of the current frame by combining the detection results of the previous and next frames.

[0043] Understandably, this invention recognizes that the inference results from a single frame image are easily affected by factors such as occlusion, missing lane lines, or blurring, leading to inaccurate detection results. Therefore, to improve the reliability of lane line detection, this invention proposes a method that combines the results from consecutive frames to construct an evaluation model, thereby significantly enhancing the detection results.

[0044] In practice, the main technical concept for constructing a time-series-based inference result evaluation model is to utilize time-series data to assess and optimize lane detection results. Specifically, after obtaining the lane detection results for the current frame, the detection results from previous and subsequent frames are input together with the current frame's detection results into the inference result evaluation model. Based on the consistency between previous and subsequent frames and the time-series patterns, the evaluation model provides a score or correction suggestions for the current frame's detection results. This method allows for adjustments to the lane detection results of the current frame, thereby improving detection accuracy.
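Because the evaluation uses the next frame as well as the previous one, one way to organize its inputs is a three-frame sliding window over the stream of detection results. The sketch below is an assumption about how this could be arranged rather than something the description specifies; `sliding_triples` is a hypothetical helper name.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple, TypeVar

D = TypeVar("D")


def sliding_triples(detections: Iterable[D]) -> Iterator[Tuple[D, D, D]]:
    """Yield (previous, current, next) detection results for each frame
    that has both a predecessor and a successor."""
    window: Deque[D] = deque(maxlen=3)
    for detection in detections:
        window.append(detection)
        if len(window) == 3:
            # The middle element is the frame being evaluated.
            yield window[0], window[1], window[2]


# Usage with letters standing in for per-frame detection results.
for previous, current, following in sliding_triples("abcd"):
    print(previous, current, following)
```

Under this arrangement the evaluation of a frame becomes available only after the following frame has been processed, and the first and last frames of a sequence receive no triple; how those boundary frames are handled is left open here.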

[0045] The specific evaluation dimensions and process of the inference result evaluation model are as follows:

[0046] Distance consistency score: For each lane line in the current frame, find the most similar lane line in each of the previous and next frames, and calculate the distance between them;

[0047] Shape consistency score: Shape consistency is calculated using a shape similarity metric (Hausdorff distance);

[0048] Smoothness score: The smoothness of lane line changes is assessed by using a smoothness metric (such as differential variance) to calculate the magnitude of change between the lane lines in the current frame and those in the preceding and following frames.

[0049] Feature matching score: Extract the feature vector of the lane line and calculate the cosine similarity between the feature vector of the lane line in the current frame and the feature vectors of the lane lines in the previous and next frames.

[0050] Weighted composite score: Weights are assigned to different scoring dimensions, and then a weighted average score is calculated. For example, the weights of distance consistency score, shape consistency score, smoothness score, and feature matching score are all 25%.
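The scoring dimensions above can be illustrated with a small, dependency-free Python sketch. Several details are assumptions made only for the example: a lane line is represented as a list of (x, y) points sampled at the same rows in every frame, the matching rule picks the candidate with the smallest mean point distance, "differential variance" is read as the variance of per-point horizontal displacement across the two frame transitions, and distance-like values are mapped to a score with 1 / (1 + value). None of these choices is stated in the description, and all function names are hypothetical.

```python
import math
from statistics import pvariance
from typing import List, Sequence, Tuple

Lane = List[Tuple[float, float]]  # (x, y) points, same sampling in every frame


def mean_point_distance(a: Lane, b: Lane) -> float:
    """Mean Euclidean distance between corresponding points of two lane lines."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def most_similar(lane: Lane, candidates: Sequence[Lane]) -> Lane:
    """Pick the candidate lane line closest to `lane` (assumed matching rule)."""
    return min(candidates, key=lambda other: mean_point_distance(lane, other))


def hausdorff(a: Lane, b: Lane) -> float:
    """Symmetric Hausdorff distance between two point sets."""
    def directed(u: Lane, v: Lane) -> float:
        return max(min(math.dist(p, q) for q in v) for p in u)
    return max(directed(a, b), directed(b, a))


def differential_variance(prev: Lane, cur: Lane, nxt: Lane) -> float:
    """Variance of the per-point x displacement over both frame transitions."""
    deltas = [c[0] - p[0] for p, c in zip(prev, cur)]
    deltas += [n[0] - c[0] for c, n in zip(cur, nxt)]
    return pvariance(deltas)


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0


def to_score(value: float) -> float:
    """Map a non-negative distance-like value to (0, 1] (assumed mapping)."""
    return 1.0 / (1.0 + value)


def evaluate_frame(prev, cur, nxt, feat_prev, feat_cur, feat_next, weights=None):
    """Weighted composite score for one already-matched lane line."""
    weights = weights or {
        "distance": 0.25, "shape": 0.25, "smoothness": 0.25, "feature": 0.25,
    }
    scores = {
        "distance": to_score(
            (mean_point_distance(cur, prev) + mean_point_distance(cur, nxt)) / 2
        ),
        "shape": to_score(max(hausdorff(cur, prev), hausdorff(cur, nxt))),
        "smoothness": to_score(differential_variance(prev, cur, nxt)),
        "feature": (
            cosine_similarity(feat_cur, feat_prev)
            + cosine_similarity(feat_cur, feat_next)
        ) / 2,
    }
    # Weighted average; dividing by the weight sum tolerates unnormalized weights.
    return sum(weights[k] * scores[k] for k in weights) / sum(weights.values())


# Usage: a lane line that jumps sideways by one unit for a single frame.
lane = [(0.0, 0.0), (0.0, 1.0)]
jumped = [(1.0, 0.0), (1.0, 1.0)]
feature = [1.0, 0.0]
print(evaluate_frame(lane, lane, lane, feature, feature, feature))
print(evaluate_frame(lane, jumped, lane, feature, feature, feature))
```

With the equal 25% weights from the example in the text, a lane line that is identical across the three frames receives the maximum composite score, while the one-frame sideways jump lowers the distance, shape, and smoothness terms and leaves only the feature term unchanged.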

[0051] In summary, the main design concept of this invention lies in proposing a lane line recognition scheme based on associative learning and temporal control, using a deep learning visual model. Specifically, during the training phase, a visual generative model is used to repair abnormal lane line images and predict rare or hard-to-capture lane line images, improving the accuracy and efficiency of lane line annotation through human-machine coupling. During the actual inference phase, a temporal concept is introduced, combining temporal information from previous and subsequent frames to propose a mechanism for evaluating the detection results. This invention effectively enhances and generalizes the training data of the detection model, improving the robustness and accuracy of lane line detection results, thereby solving many problems in lane line recognition in autonomous driving applications.

[0052] In this invention, when directional terms are mentioned, they are relative concepts based on the embodiments. Furthermore, "at least one" refers to one or more, and "more than one" refers to two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent the existence of A alone, A and B simultaneously, or B alone. A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects have an "or" relationship. "At least one of the following" and similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, and c can represent: a, b, c, a and b, a and c, b and c, or a and b and c, where a, b, and c can be single or multiple.

[0053] The above description of the structure, features, and effects of the present invention is based on the embodiments shown in the figures. However, the above are only preferred embodiments of the present invention. It should be noted that the technical features involved in the above embodiments and their preferred methods can be reasonably combined and matched by those skilled in the art to form a variety of equivalent solutions without departing from or changing the design concept and technical effects of the present invention. Therefore, the present invention is not limited to the scope of implementation shown in the figures. Any changes made in accordance with the concept of the present invention, or modifications to equivalent embodiments, that do not exceed the spirit covered by the specification and figures, should be within the protection scope of the present invention.

Claims

1. An inference optimization method for a lane detection model trained with generalized data, characterized in that the method comprises:
constructing a first visual generation model and a second visual generation model using a first training set and a second training set, respectively;
obtaining an image to be labeled and determining whether the image contains abnormal lane lines;
inputting an image containing abnormal lane lines into the first visual generation model to obtain a lane line extension image in which the abnormal lane lines are repaired;
inputting the image without abnormal lane lines and the lane line extension image output by the first visual generation model into the second visual generation model to generate several lane line prediction images;
filtering the output of the second visual generation model and feeding it back into the first visual generation model to repair predicted images that have defects;
annotating the lane line extension images and the lane line prediction images;
building a lane detection model using the labeled images, and using the lane detection model to detect lane lines in actual input images; and
evaluating the detection results of the current frame in combination with the detection results of the previous and next frames.

2. The inference optimization method for lane detection models trained with generalized data according to claim 1, characterized in that the output of the first visual generation model is checked for defects, and if a defect is found, the output is sent back to the first visual generation model for extension.

3. The inference optimization method for lane detection models trained with generalized data according to claim 1, characterized in that the first training set is constructed using real lane line images under multiple scenarios and lighting conditions, and the images in the first training set include lane line anomalies.

4. The inference optimization method for lane detection models trained with generalized data according to claim 1, characterized in that the second training set is constructed using real lane line images containing time-series information, and the images in the second training set contain lane line conditions that change in state over consecutive time periods.

5. The inference optimization method for lane detection models trained with generalized data according to any one of claims 1 to 4, characterized in that the evaluation combines several of the following scoring dimensions: distance consistency score, shape consistency score, smoothness score, and feature matching score.