Lane line detection model training method
By extracting features and clustering lane line image samples, and combining local constraints of Bézier curves and gradient clustering methods, the problem of insufficient overall consideration in existing lane line detection methods is solved, thereby improving the performance and accuracy of the lane line detection model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- ALIBABA DAMO (HANGZHOU) TECH CO LTD
- Filing Date
- 2022-12-26
- Publication Date
- 2026-06-30
AI Technical Summary
Existing lane detection methods based on key points have insufficient overall considerations, resulting in poor lane detection performance and affecting the accuracy of subsequent lane prediction.
By extracting features from lane line image samples, first and second feature images are obtained, lane line clustering information and initial lane line key points are determined, and predicted lane lines are obtained through clustering. The lane line detection model is trained, and the overall consideration of network learning is enhanced by combining local constraints of Bézier curves and gradient clustering methods.
It improves the overall performance of the lane detection model, increases the accuracy and efficiency of lane detection, and enhances the clustering ability and segmentation accuracy of lane key points.
Figure CN115995069B_ABST
Description
Technical Field
[0001] The embodiments in this specification relate to the field of computer technology, and in particular to a method for training a lane detection model.
Background Technology
[0002] In the field of autonomous driving, lane line detection is an integral part of visual road environment perception, and having superior lane line detection capabilities is crucial.
[0003] Currently, lane detection methods based on key points do not adequately consider the overall situation during key point prediction, resulting in poor lane detection performance. Therefore, it is of great significance to obtain a high-performance lane detection model to improve the accuracy of subsequent lane prediction.
Summary of the Invention
[0004] In view of the above, embodiments of this specification provide a lane line detection model training method. One or more embodiments of this specification also relate to a lane line detection method, a target object detection model training method, a lane line detection model training device, a lane line detection device, a target object detection model training device, a computing device, a computer-readable storage medium, and a computer program, to address the technical deficiencies existing in the prior art.
[0005] According to a first aspect of the embodiments of this specification, a lane line detection model training method is provided, comprising:
[0006] Determine lane line image samples and sample labels, wherein the sample label is the target lane line in the lane line image sample;
[0007] Feature extraction is performed on the lane line image sample to obtain a first feature image and a second feature image of the lane line image sample;
[0008] Based on the first feature image and the second feature image, determine the lane line clustering information and the initial lane line key points;
[0009] Clustering is performed based on the lane line clustering information and the initial lane line key points to obtain the predicted lane lines;
[0010] A lane detection model is trained based on the predicted lane lines and the target lane lines.
[0011] According to a second aspect of the embodiments of this specification, a lane line detection model training apparatus is provided, comprising:
[0012] The first determining module is configured to determine lane line image samples and sample labels, wherein the sample label is the target lane line in the lane line image sample;
[0013] The first acquisition module is configured to perform feature extraction on the lane line image sample to obtain a first feature image and a second feature image of the lane line image sample.
[0014] The second determining module is configured to determine lane line clustering information and initial lane line key points based on the first feature image and the second feature image.
[0015] The second acquisition module is configured to perform clustering based on the lane line clustering information and the initial lane line key points to obtain the predicted lane line;
[0016] The third acquisition module is configured to train a lane detection model based on the predicted lane line and the target lane line.
[0017] According to a third aspect of the embodiments of this specification, a lane line detection method is provided, comprising:
[0018] The target lane line image is input into the lane line detection model, wherein the lane line detection model is obtained by the lane line detection model training method described above.
[0019] Based on the lane line detection model, feature extraction is performed on the target lane line image to obtain a first feature image and a second feature image of the target lane line image;
[0020] Based on the first feature image and the second feature image, determine the target lane line clustering information and the initial lane line key points;
[0021] Clustering is performed based on the lane line clustering information and the key points of the initial lane lines to obtain the target lane lines.
[0022] According to a fourth aspect of the embodiments of this specification, a lane line detection device is provided, comprising:
[0023] The image input module is configured to input the target lane line image into the lane line detection model, wherein the lane line detection model is obtained by the lane line detection model training method described above.
[0024] The fourth acquisition module is configured to extract features from the target lane line image based on the lane line detection model to obtain a first feature image and a second feature image of the target lane line image.
[0025] The third determining module is configured to determine the target lane line clustering information and the initial lane line key points based on the first feature image and the second feature image.
[0026] The fifth acquisition module is configured to perform clustering based on the lane line clustering information and the initial lane line key points to obtain the target lane line.
[0027] According to a fifth aspect of the embodiments of this specification, a method for training a target object detection model is provided, comprising:
[0028] Identify target object image samples and sample labels, wherein the sample label is the target object in the target object image sample;
[0029] Feature extraction is performed on the target object image sample to obtain a first feature image and a second feature image of the target object image sample;
[0030] Based on the first feature image and the second feature image, determine the target object clustering information and the initial target object key points;
[0031] Clustering is performed based on the target object clustering information and the key points of the initial target object to obtain the predicted target object;
[0032] A target object detection model is trained based on the predicted target object and the target object.
[0033] According to a sixth aspect of the embodiments of this specification, a target object detection model training apparatus is provided, comprising:
[0034] The fourth determining module is configured to determine a target object image sample and a sample label, wherein the sample label is the target object in the target object image sample;
[0035] The sixth acquisition module is configured to extract features from the target object image sample to obtain a first feature image and a second feature image of the target object image sample;
[0036] The fifth determining module is configured to determine the target object clustering information and the initial target object key points based on the first feature image and the second feature image;
[0037] The seventh acquisition module is configured to perform clustering based on the target object clustering information and the initial target object key points to obtain the predicted target object;
[0038] The eighth acquisition module is configured to train a target object detection model based on the predicted target object and the target object.
[0039] According to a seventh aspect of the embodiments of this specification, a computing device is provided, comprising:
[0040] Memory and processor;
[0041] The memory is used to store computer-executable instructions, and the processor is used to execute the computer-executable instructions. When the computer-executable instructions are executed by the processor, they implement the steps of the above-described lane line detection model training method, or the steps of the above-described lane line detection method, or the steps of the above-described target object detection model training method.
[0042] According to an eighth aspect of the embodiments of this specification, a computer-readable storage medium is provided that stores computer-executable instructions, which, when executed by a processor, implement the steps of the lane line detection model training method described above, or implement the steps of the lane line detection method described above, or implement the steps of the target object detection model training method described above.
[0043] According to a ninth aspect of the embodiments of this specification, a computer program is provided, wherein when the computer program is executed in a computer, the computer is instructed to perform the steps of the lane line detection model training method, or to perform the steps of the lane line detection method, or to perform the steps of the target object detection model training method.
[0044] The lane detection model training method provided in this specification includes: determining lane line image samples and sample labels, wherein the sample labels are target lane lines in the lane line image samples; performing feature extraction on the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples; determining lane line clustering information and initial lane line key points based on the first feature image and the second feature image; performing clustering based on the lane line clustering information and the initial lane line key points to obtain predicted lane lines; and training a lane line detection model based on the predicted lane lines and the target lane lines.
[0045] Specifically, this method extracts global and local features from the input lane line sample image to determine a first feature image (i.e., local features) and a second feature image (i.e., global features). Then, it predicts the lane lines based on the first and second feature images to obtain lane line clustering information and initial lane line key points. Subsequently, based on this lane line clustering information and initial lane line key points, considering both global and local features, the lane line key points are clustered to obtain predicted lane lines. This enables network training of the lane line detection model. By integrating the clustering process of lane line key points into the entire lane line detection model training process, the clustering of lane line key points is performed simultaneously with network training, thereby improving the overall performance of the lane line detection model.
Attached Figure Description
[0046] Figure 1 is a schematic diagram of a scenario for a lane detection model training method provided in one embodiment of this specification;
[0047] Figure 2 is a flowchart of a lane detection model training method provided in one embodiment of this specification;
[0048] Figure 3 is a flowchart illustrating the processing steps of a lane detection model training method provided in one embodiment of this specification;
[0049] Figure 4 is a flowchart of a lane line detection method provided in one embodiment of this specification;
[0050] Figure 5 is a flowchart of a target object detection model training method provided in one embodiment of this specification;
[0051] Figure 6 is a schematic diagram of the structure of a lane line detection model training device provided in one embodiment of this specification;
[0052] Figure 7 is a schematic diagram of the structure of a lane line detection device provided in one embodiment of this specification;
[0053] Figure 8 is a schematic diagram of the structure of a target object detection model training device provided in one embodiment of this specification;
[0054] Figure 9 is a structural block diagram of a computing device provided in one embodiment of this specification.
Detailed Implementation
[0055] Many specific details are set forth in the following description to provide a full understanding of this specification. However, this specification can be implemented in many other ways than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of this specification. Therefore, this specification is not limited to the specific implementations disclosed below.
[0056] The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of this specification. The singular forms “a,” “said,” and “the” as used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and/or” as used in one or more embodiments of this specification refers to and includes any or all possible combinations of one or more associated listed items.
[0057] It should be understood that although the terms first, second, etc., may be used to describe various information in one or more embodiments of this specification, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, first may also be referred to as second without departing from the scope of one or more embodiments of this specification, and similarly, second may also be referred to as first. Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," or "in response to a determination."
[0058] First, the terms and concepts used in one or more embodiments of this specification will be explained.
[0059] Lane detection: Input a single-frame image from a single camera, and the network outputs discrete keypoint coordinates to describe the lane line regions in the image.
[0060] This specification provides one or more embodiments of a lane line detection model training method. This specification also relates to a lane line detection method, a target object detection model training method, a lane line detection model training device, a lane line detection device, a target object detection model training device, a computing device, and a computer-readable storage medium, which will be described in detail in the following embodiments.
[0061] Referring to Figure 1, Figure 1 shows a specific implementation scenario of a lane line detection model training method according to an embodiment of this specification.
[0062] As shown in Figure 1, the system includes a server 102 and a client device 104. The server 102 can be understood as a cloud server or a physical server, etc. The client device 104 includes, but is not limited to, desktop computers, laptops, tablets, mobile phones, etc.
[0063] For ease of understanding, the embodiments of this specification take as an example the case where the server 102 is a physical server and the client device 104 is a laptop computer, and describe in detail the training process of the lane line detection model based on the lane line detection model training method.
[0064] In specific implementation, on the server 102, multiple lane line image samples and sample labels corresponding to each lane line image sample are collected or received. The multiple lane line image samples can be any type of image containing lane lines obtained from any scene. The sample label corresponding to each lane line image sample can be understood as a lane line composed of discrete key point coordinates. That is, these discrete key point coordinates are used to describe the lane lines in each lane line image sample.
[0065] During specific model training, on server 102, each lane line image sample and its corresponding sample label are input into the backbone network of the lane line detection model for downsampling to obtain downsampled feature images. Then, the downsampled feature images are input into FPN (Feature Pyramid Networks) for upsampling a corresponding number of times to obtain upsampled multi-size feature images. These multi-size feature images are then fused to obtain the initial feature image. In practical applications, to improve the feature extraction effect of small-size feature images, a self-attention mechanism (SA) can be introduced to capture global information and obtain a larger receptive field and contextual information. Furthermore, to further improve the prediction effect of lane line starting coordinates, length, slope or gradient, and Bezier control points, a Local Feature Analysis (LFA) module can be introduced to enhance local features of the initial feature image, thereby enhancing the correlation between adjacent key points, supplementing local information, and improving subsequent prediction effects.
[0066] After the initial feature image and the locally enhanced target feature image are determined, different detection heads (HEADS) can be used to predict the following from the initial feature image and the target feature image: the heatmap corresponding to the lane line image sample; the distance (offset) from each key point of each lane line in the lane line image sample to its corresponding lane line; and the starting coordinates, length, and slope or gradient of the lane line together with the Bézier control points (i.e., the lane line clustering information).
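The outputs of the detection heads listed above can be grouped in a simple container. The following is a minimal sketch only: the class name `LaneHeadOutputs`, its field names, and the choice of Python types are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (x, y) in feature-map coordinates (assumed convention)


@dataclass
class LaneHeadOutputs:
    """Hypothetical container for the per-head predictions named in the text."""
    heatmap: List[List[float]]                 # key point confidence per cell
    offsets: Dict[Point, float]                # distance from key point to its lane line
    start_points: List[Point]                  # predicted lane line start coordinates
    lengths: List[float]                       # predicted lane line lengths
    slopes: Dict[Point, float]                 # slope or gradient at each key point
    bezier_control_points: Dict[Point, List[Tuple[float, float]]]

    def clustering_info(self) -> dict:
        """Return only the fields the text calls lane line clustering information."""
        return {
            "start_points": self.start_points,
            "lengths": self.lengths,
            "slopes": self.slopes,
            "bezier_control_points": self.bezier_control_points,
        }


outputs = LaneHeadOutputs(
    heatmap=[[0.0, 0.9], [0.1, 0.0]],
    offsets={(1, 0): 0.2},
    start_points=[(1, 0)],
    lengths=[1.0],
    slopes={(1, 0): 0.5},
    bezier_control_points={(1, 0): [(1.0, 0.0), (1.5, -1.0)]},
)
info = outputs.clustering_info()
```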
[0067] Next, the heatmap is processed using the Non-maximum Suppression (NMS) algorithm to obtain the initial lane line key points, i.e., the target heatmap. Then, based on the predicted lane line start coordinates, starting from the bottom of the target heatmap, key points belonging to the same lane line are continuously searched upwards. In this process, the slope or gradient is mainly used for continuous searching during clustering (i.e., the position of a key point and its corresponding slope or gradient can describe a ray, and the distance (offset) of other key points to this ray can be used to determine whether other key points belong to the same lane line). Gradient clustering is completed in this way, resulting in multiple lane lines composed of multiple key points. Then, outliers are removed based on the lane line start coordinates of the lane lines in the clustering results to optimize the clustering results. At the same time, each lane line after clustering can generate a length, and by judging the difference between its length and the predicted lane line length, the clustering results can be further filtered to determine the final predicted lane lines.
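The key point extraction and gradient clustering described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the 3x3 neighbourhood for NMS, the confidence threshold, the convention that the slope is the change in x per row moved upwards, the `max_offset` and `tolerance` values, and all function names are hypothetical. Outlier removal based on the start coordinates is omitted.

```python
import math


def heatmap_nms(heatmap, threshold=0.5):
    """Keep cells above `threshold` that are the maximum of their 3x3 neighbourhood.

    Cells that tie with a neighbour are both kept.
    """
    h, w = len(heatmap), len(heatmap[0])
    keypoints = []
    for y in range(h):
        for x in range(w):
            v = heatmap[y][x]
            if v < threshold:
                continue
            neighbours = [heatmap[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))]
            if v >= max(neighbours):
                keypoints.append((x, y))
    return keypoints


def point_to_ray_distance(anchor, slope, point):
    """Perpendicular distance from `point` to the line through `anchor`.

    Assumption: `slope` is the change in x per row moved upwards, so the
    direction vector is (slope, -1) in image coordinates. The caller only
    passes points above the anchor, which makes this a distance to the ray.
    """
    dx, dy = point[0] - anchor[0], point[1] - anchor[1]
    return abs(-dx - dy * slope) / math.hypot(slope, 1.0)


def cluster_lane(start, keypoints, slopes, max_offset=1.5):
    """Search upwards from `start`, collecting key points close to the current ray."""
    lane, current = [start], start
    for p in sorted(keypoints, key=lambda q: -q[1]):  # bottom of image first
        if p[1] >= current[1]:
            continue  # only look at rows above the current key point
        if point_to_ray_distance(current, slopes[current], p) <= max_offset:
            lane.append(p)
            current = p
    return lane


def lane_length(lane):
    """Polyline length of a clustered lane."""
    return sum(math.dist(p, q) for p, q in zip(lane, lane[1:]))


def matches_predicted_length(lane, predicted_length, tolerance=2.0):
    """Filter step: compare the clustered length with the predicted length."""
    return abs(lane_length(lane) - predicted_length) <= tolerance


# Two toy lanes that converge towards the top of the image.
keypoints = [(2, 9), (10, 9), (3, 7), (9, 7), (4, 5), (8, 5), (5, 3)]
slopes = {p: 0.5 for p in [(2, 9), (3, 7), (4, 5), (5, 3)]}
slopes.update({p: -0.5 for p in [(10, 9), (9, 7), (8, 5)]})
left = cluster_lane((2, 9), keypoints, slopes)
right = cluster_lane((10, 9), keypoints, slopes)
```

Updating the ray at every accepted key point lets the search follow a curved lane, since each key point carries its own predicted slope.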
[0068] Simultaneously, based on the Bezier control points corresponding to each key point in the predicted lane line, a predicted Bezier curve corresponding to the predicted lane line is generated.
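Generating a curve from Bézier control points can be illustrated with de Casteljau's algorithm. The specification does not state the curve degree or how many points are sampled, so the cubic example and the names `bezier_point` and `sample_bezier` below are assumptions made for illustration.

```python
def bezier_point(control_points, t):
    """Evaluate a Bézier curve of any degree at parameter t in [0, 1].

    Uses de Casteljau's algorithm: repeatedly interpolate between
    neighbouring points until a single point remains.
    """
    pts = [tuple(p) for p in control_points]
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def sample_bezier(control_points, num_points=5):
    """Sample `num_points` evenly spaced parameter values along the curve."""
    return [bezier_point(control_points, i / (num_points - 1))
            for i in range(num_points)]


# Hypothetical cubic control points for one lane line.
control = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
curve = sample_bezier(control, num_points=5)
```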
[0069] Finally, a lane detection model is trained using the predicted lane lines, the corresponding sample labels of the lane line image samples (i.e., the target lane lines), the predicted Bézier curves, and the real Bézier curves generated based on the sample labels of the lane line image samples. Specifically, loss1 (loss function 1) is calculated based on the predicted lane lines and the corresponding sample labels of the lane line image samples; loss2 (loss function 2) is calculated based on the predicted Bézier curves and the real Bézier curves generated based on the sample labels of the lane line image samples. The network parameters of the lane detection model are adjusted based on loss1 and loss2, and the above steps are iterated again until the preset model training conditions are met (e.g., the number of iterations meets an iteration threshold, such as exceeding 200 iterations, or the accuracy of the lane detection model meets a preset accuracy threshold, etc.). The model training then ends, and the trained lane detection model is obtained.
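The combination of loss1 and loss2 and the stopping conditions can be sketched as below. The specification does not define the form of either loss or how they are weighted, so the mean point distance, the weights `w1` and `w2`, the accuracy threshold of 0.95, and the function names are all assumptions; only the example threshold of 200 iterations comes from the text.

```python
import math


def mean_point_distance(pred, target):
    """Mean Euclidean distance between paired points (assumed loss form)."""
    if len(pred) != len(target):
        raise ValueError("point lists must have the same length")
    return sum(math.dist(p, q) for p, q in zip(pred, target)) / len(pred)


def total_loss(pred_lane, target_lane, pred_curve, real_curve, w1=1.0, w2=1.0):
    """Weighted sum of the key point loss (loss1) and the Bézier curve loss (loss2)."""
    loss1 = mean_point_distance(pred_lane, target_lane)
    loss2 = mean_point_distance(pred_curve, real_curve)
    return w1 * loss1 + w2 * loss2


def should_stop(iteration, accuracy, max_iterations=200, accuracy_threshold=0.95):
    """Stop when the iteration count exceeds the threshold or accuracy is reached."""
    return iteration > max_iterations or accuracy >= accuracy_threshold


loss = total_loss(
    pred_lane=[(0, 0), (0, 2)], target_lane=[(0, 1), (0, 2)],
    pred_curve=[(1, 1)], real_curve=[(4, 5)],
)
```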
[0070] In addition, networks that predict lane line start coordinates, length, slope or gradient, Bezier control points, heatmaps, and offsets can also be trained. For example, the real heatmap can be determined based on lane line image samples, and the loss function can be calculated by combining the predicted heatmap. Then, the network parameters can be adjusted according to the loss function to complete the training process. Similarly, other prediction parameters can be adjusted in the same way, which will not be elaborated here.
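As an illustration of training the heatmap branch, the sketch below builds a real heatmap from labeled key points and compares it with a predicted heatmap. The specification does not say how the real heatmap is constructed or which loss is used; the Gaussian kernel, the `sigma` value, and the mean squared error are assumptions chosen because they are common in key point detection.

```python
import math


def gaussian_heatmap(width, height, keypoints, sigma=1.0):
    """Render each labeled key point as a Gaussian peak (assumed construction)."""
    heatmap = [[0.0] * width for _ in range(height)]
    for kx, ky in keypoints:
        for y in range(height):
            for x in range(width):
                value = math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2 * sigma ** 2))
                heatmap[y][x] = max(heatmap[y][x], value)  # keep the strongest peak
    return heatmap


def mse(pred, target):
    """Mean squared error over all cells of two equally sized maps."""
    total, count = 0.0, 0
    for prow, trow in zip(pred, target):
        for p, t in zip(prow, trow):
            total += (p - t) ** 2
            count += 1
    return total / count


real = gaussian_heatmap(5, 5, [(2, 2)])
blank = [[0.0] * 5 for _ in range(5)]
heatmap_loss = mse(blank, real)
```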
[0071] Furthermore, to improve the feature extraction accuracy of FPN, the loss function can be calculated by combining the predicted segmentation mask corresponding to the extracted target feature image with the real segmentation mask determined based on the lane line image samples. Then, the network parameters can be adjusted based on the loss function to optimize the network parameters of FPN, thereby improving the lane line segmentation accuracy of FPN and thus improving the feature extraction accuracy of lane line.
[0072] When the client device 104 needs to use the lane line detection model, the server 102 can send the lane line detection model to the client device 104 for deployment and application. Of course, if the client device 104 has limited computing resources, it can also call the lane line detection model on the server 102 without deploying it locally. The specific arrangement depends on the actual application, and the embodiments of this specification do not impose any limitation on this.
[0073] The lane detection model training method provided in this specification enhances the overall consideration of network learning by introducing local constraints using Bézier curves and constraining clustering results during training. Furthermore, the gradient clustering method for lane key points provided in this specification can easily and quickly achieve the clustering process, greatly improving the clustering ability and enhancing the performance of the lane detection model. Simultaneously, a segmentation auxiliary loss (i.e., a loss calculated through a segmentation mask) can be added to the network during lane model training to enhance the lane segmentation accuracy of the FPN network, further improving the performance of the lane detection model. Subsequently, in applying this lane detection model, based on an input lane image, the model can predict the key points of the lane lines contained in the image; that is, based on the predicted key points, the lane lines contained in the image can be determined.
[0074] Referring to Figure 2, Figure 2 shows a flowchart of a lane detection model training method provided in one embodiment of this specification, which specifically includes the following steps.
[0075] Step 202: Determine lane line image samples and sample labels, wherein the sample label is the target lane line in the lane line image sample.
[0076] Here, lane line image samples can be understood as image samples of any type, from any scene, that contain one or more lane lines. Sample labels can be understood as the expected prediction results for the lane line image samples, such as the target lane line in the lane line image sample, which is composed of multiple pixels with defined coordinates.
[0077] Specifically, determining lane line image samples and sample labels can be understood as: acquiring or receiving multiple lane line image samples of any scene and any type, as well as the sample label corresponding to each lane line image sample, that is, the target lane line in each lane line image sample.
[0078] In the embodiments of this specification, lane line image samples and corresponding sample labels are determined to provide a basis for subsequent feature extraction of lane line image samples. The loss function is calculated for the subsequently predicted lane lines using the sample labels, and then the lane line detection model is trained based on the loss function.
[0079] Step 204: Perform feature extraction on the lane line image sample to obtain the first feature image and the second feature image of the lane line image sample.
[0080] In this context, both the first feature image and the second feature image can be understood as feature images obtained after feature extraction from lane line image samples; the difference is that the first feature image is generated by applying local feature enhancement processing to the second feature image.
[0081] In practical applications, to improve the prediction performance of local lane line information (such as the starting coordinates, length, slope / gradient, etc.), when predicting local lane line information, the second feature image can be processed through local feature enhancement to obtain the first feature image. This makes the first feature image present deeper feature information, and subsequent prediction is performed using the locally enhanced first feature image to obtain better prediction results. The specific implementation method is as follows:
[0082] The step of extracting features from the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples includes:
[0083] Feature extraction is performed on the lane line image sample to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image sample;
[0084] The initial feature image is subjected to local feature enhancement processing to obtain the first feature image of the lane line image sample.
[0085] In practical applications, the lane line image sample can be downsampled using the backbone network layer in the lane line detection model to obtain a multi-scale lane line feature image. This downsampled multi-scale lane line feature image is then input into the Feature Pyramid Network (FPN) layer. The FPN layer upsamples the downsampled multi-scale lane line feature image to obtain an initial feature image corresponding to the lane line image sample. This initial feature image is then determined as the second feature image. Finally, this initial feature image is input into the Local Feature Analysis (LFA) network layer of the lane line detection model to perform local feature enhancement processing, thereby obtaining the first feature image.
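The upsampling and fusion performed by the FPN layer can be illustrated with a simplified top-down merge. This sketch makes several assumptions not found in the specification: single-channel maps, each level half the resolution of the previous one, nearest-neighbour upsampling, and element-wise addition. A real FPN also applies learned lateral and output convolutions, which are omitted here.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling of a 2D map by a factor of two."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def add_maps(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(pyramid):
    """Fuse maps ordered from largest to smallest resolution.

    Starts at the smallest map, upsamples it, and adds it to the next
    larger map until the largest resolution is reached.
    """
    fused = pyramid[-1]
    for fmap in reversed(pyramid[:-1]):
        fused = add_maps(fmap, upsample2x(fused))
    return fused


# Toy downsampled feature maps at 4x4, 2x2 and 1x1 resolution.
pyramid = [
    [[1] * 4 for _ in range(4)],
    [[10, 20], [30, 40]],
    [[100]],
]
initial_feature = top_down_fuse(pyramid)
```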
[0086] Of course, in practical applications, the local feature enhancement method of the second feature image corresponding to the lane line image sample can not only be implemented through the LFA network layer, but also through the unsharp masking algorithm, etc. The embodiments in this specification only use the LFA network layer as an example for introduction, but do not make any limitations.
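For reference, unsharp masking sharpens an image by adding back the difference between the image and a blurred copy of it. The sketch below applies it to a single-channel map. The 3x3 box blur (standing in for the Gaussian blur that is more commonly used), the `amount` parameter, and the function names are assumptions; the specification only names the algorithm as an alternative to the LFA network layer.

```python
def box_blur(image):
    """3x3 mean filter; border cells average only the in-bounds neighbours."""
    h, w = len(image), len(image[0])
    blurred = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


def unsharp_mask(image, amount=1.0):
    """Return image + amount * (image - blurred), which boosts local detail."""
    blurred = box_blur(image)
    return [[v + amount * (v - b) for v, b in zip(row, brow)]
            for row, brow in zip(image, blurred)]


impulse = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
sharpened = unsharp_mask(impulse)
flat = unsharp_mask([[5.0] * 3 for _ in range(3)])
```

A flat region is left unchanged, while an isolated peak is amplified relative to its surroundings.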
[0087] Furthermore, since the lane line image samples are downsampled through the backbone network layer, the small-scale lane line feature images obtained after downsampling do not contain rich features. In order to improve the overall effect of the lane line feature images, a self-attention mechanism can be applied to the small-scale lane line feature images. This reduces the dependence of the small-scale lane line feature images on external information, makes them better at capturing the internal correlation of features, and increases the feature receptive field.
[0088] Specifically, after the lane line image samples are downsampled through the backbone network layer to obtain the downsampled multi-scale lane line feature images, the small-scale lane line feature image among them is input into the self-attention (SA) network layer, which performs self-attention processing on it to obtain a self-attention-processed small-scale lane line feature image. This processed small-scale lane line feature image is then input, together with the lane line feature images of the other scales, into the FPN network layer for upsampling processing.
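The self-attention step can be illustrated with scaled dot-product attention over the spatial positions of a small feature map. This is a simplified sketch: it uses the features directly as queries, keys, and values, whereas an SA network layer would normally apply learned projections, and the function names and the height x width x channels layout are assumptions.

```python
import math


def flatten_feature_map(fmap):
    """Turn an H x W x C feature map into a list of H*W feature vectors."""
    return [cell for row in fmap for cell in row]


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention without learned projections.

    Each output vector is a weighted average of all input vectors, so every
    position can draw on information from the whole feature map.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    outputs = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        outputs.append([sum(w * v[c] for w, v in zip(weights, tokens))
                        for c in range(d)])
    return outputs


small_map = [[[1.0, 0.0], [0.0, 1.0]]]  # 1 x 2 map with 2 channels
tokens = flatten_feature_map(small_map)
attended = self_attention(tokens)
```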
[0089] The lane detection model training method provided in this specification extracts features from lane line image samples to obtain a second feature image of the lane line image samples. Then, it performs local feature enhancement processing on the second feature image to obtain a corresponding first feature image. This greatly improves the prediction accuracy of lane line parameters based on the enhanced first and second feature images, thereby enhancing the performance of the lane line detection model trained on the lane line parameters.
[0090] As shown above, feature extraction from lane line image samples can be achieved using feature extraction networks such as backbone networks and FPN networks. To improve the feature extraction performance of these networks, during the training of the lane line detection model, a loss function can be calculated between the segmentation mask determined by the sample labels of the lane line image samples and the segmentation mask output by the FPN network layer. This allows for adjustment of the FPN network parameters, further improving the accuracy of subsequent lane line prediction parameters and enhancing the performance of the lane line detection model trained based on these parameters. The specific implementation method is as follows:
[0091] The step of extracting features from the lane line image samples to obtain an initial feature image, and determining the initial feature image as the second feature image of the lane line image samples, includes:
[0092] Based on the feature extraction network of the lane line detection model, features are extracted from the lane line image samples to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image samples, wherein the feature extraction network includes a feature prediction layer;
[0093] Accordingly, after determining the initial feature image as the second feature image of the lane line image sample, the method further includes:
[0094] Generate a target feature image based on the target lane line in the lane line image sample;
[0095] Based on the target feature image and the second feature image, a third loss function is determined, and the network parameters of the feature prediction layer are adjusted according to the third loss function.
[0096] In this context, the feature extraction network can be understood as the network in the lane detection model that extracts image features, for example, a feature extraction network formed by a backbone network layer and an FPN network layer. The FPN network layer can be understood as the feature prediction layer in the feature extraction network.
[0097] Specifically, based on the feature extraction network of the lane line detection model, features are extracted from the lane line image samples to obtain an initial feature image, which is then determined as the second feature image of the lane line image sample. This can be understood as extracting features from the lane line image samples using the backbone network layer and FPN network layer in the lane line detection model to obtain an initial feature image, which is then determined as the second feature image of the lane line image sample. In practical applications, after obtaining the second feature image, a local feature enhancement network, such as an LFA network layer, can be used to perform local feature enhancement processing on the second feature image to obtain a first feature image.
[0098] In practice, the Feature Prediction Layer (FPN network layer) upsamples the downsampled feature image and then predicts the feature image of the lane line image sample to obtain the segmentation mask corresponding to the predicted feature image. To improve the subsequent feature image prediction effect of the feature prediction layer, a target feature image can be generated based on the target lane line in the lane line image sample, that is, the segmentation mask (i.e., mask) corresponding to the real feature image generated from the real lane line in the lane line image sample. Then, a third loss function (i.e., mask loss function) is calculated based on the predicted segmentation mask and the real segmentation mask corresponding to the lane line image sample. The network parameters of the feature prediction layer are then tuned based on the third loss function to improve the subsequent prediction accuracy of the feature prediction layer, thereby improving the overall training efficiency and performance of the subsequent lane line detection model.
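The mask loss described above can be illustrated with a minimal sketch. The specification does not state which form the third loss function takes, so the mean binary cross-entropy used here is an assumption, and the function name `mask_bce_loss` and the toy masks are hypothetical.

```python
import math

def mask_bce_loss(pred_mask, true_mask, eps=1e-7):
    """Mean binary cross-entropy between a predicted and a real segmentation mask.

    Both masks are 2D lists of equal shape. pred_mask holds probabilities in
    [0, 1]; true_mask holds 0/1 labels. (Assumed loss form, not from the source.)
    """
    total, count = 0.0, 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count

# Toy 2x3 masks: the middle column is the lane line.
true_mask = [[0, 1, 0],
             [0, 1, 0]]
good_pred = [[0.1, 0.9, 0.1],
             [0.1, 0.9, 0.1]]
bad_pred = [[0.9, 0.1, 0.9],
            [0.9, 0.1, 0.9]]

loss_good = mask_bce_loss(good_pred, true_mask)
loss_bad = mask_bce_loss(bad_pred, true_mask)
```

A prediction that agrees with the real mask yields the smaller loss, which is the signal used to tune the feature prediction layer.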
[0099] In this embodiment, a feature extraction network extracts features from the lane line image samples to obtain the second feature image of the lane line image samples. Local feature enhancement processing is then performed on the second feature image to obtain the corresponding first feature image, which further improves the detection accuracy and performance of the model. In addition, a loss function is calculated based on the second feature image and the target feature image, and the network parameters of the corresponding feature extraction network are optimized accordingly, further improving the detection performance of the model.
[0100] Step 206: Determine lane line clustering information and initial lane line key points based on the first feature image and the second feature image.
[0101] Here, lane line clustering information can be understood as predicted information about the lane lines, such as the starting coordinates of a lane line and the length of the lane line.
[0102] Specifically, determining lane line clustering information and initial lane line key points based on the first feature image and the second feature image can be understood as: determining lane line clustering information in the lane line image sample based on the first feature image, and determining initial lane line key points in the lane line image sample based on the second feature image.
[0103] In one embodiment, lane line clustering information may include lane line start coordinates, lane line length, and lane line slope (i.e., lane line gradient). By extracting the corresponding lane line clustering information from the first feature image, a basis is provided for subsequent filtering of initial lane line key points obtained from the second feature image, further accelerating the prediction of lane line key points. The specific implementation is as follows:
[0104] The lane line clustering information includes the lane line start coordinates, lane line length, and lane line slope.
[0105] Accordingly, determining lane line clustering information and initial lane line key points based on the first feature image and the second feature image includes:
[0106] The starting coordinates of the lane line, the length of the lane line, and the slope of the lane line in the lane line image sample are determined based on the first feature image.
[0107] The initial lane line key points corresponding to the lane line image sample are determined based on the second feature image.
[0108] Specifically, prediction is performed based on the first feature image to obtain the starting coordinates of the lane line, the length of the lane line, and the slope of the lane line corresponding to the predicted lane line image sample; prediction is performed based on the second feature image to obtain the initial lane line key points corresponding to the predicted lane line image sample.
[0109] In one implementation, the detection head can predict the first feature image to obtain the lane line start coordinates, lane line length, and lane line slope corresponding to the predicted lane line image sample; the detection head can predict the second feature image to obtain the heatmap corresponding to the lane line sample image, and the initial lane line key points in the second feature image can be determined based on the heatmap.
[0110] Furthermore, the initial lane line key points obtained through heatmaps may contain a lot of noise, affecting the accuracy of subsequent training of the lane line detection model. Therefore, it is necessary to filter the obtained initial lane line key points to reduce noise and improve the training accuracy of the model. The specific implementation method is as follows:
[0111] The lane line clustering information includes the reference distances of key points of the lane lines in the lane line image samples;
[0112] Accordingly, after determining the initial lane line key points corresponding to the lane line image sample based on the second feature image, the process includes:
[0113] The initial lane line key points are processed using a preset filtering algorithm, along with the lane line starting coordinates, lane line length, lane line slope, and reference distance, to obtain the target lane line key points.
[0114] The reference distance of the key point can be understood as the distance from each key point of each lane line in the lane line image sample to the lane line corresponding to that key point, i.e., the offset in the above embodiment; and the preset filtering algorithm includes, but is not limited to, the non-maximum suppression algorithm.
[0115] Specifically, the initial lane line key points are processed using a preset filtering algorithm and the lane line starting coordinates, lane line length, lane line slope, and reference distance to obtain the target lane line key points. This can be understood as follows: first, the initial lane line key points are initially processed using a preset filtering algorithm to filter out unnecessary points; then, the initial lane line key points are further filtered using the lane line length, lane line slope, and reference distance to remove other noise points, thus obtaining the target lane line key points.
[0116] In practical implementation, this can be understood as using a non-maximum suppression algorithm to initially filter key points in the heatmap that are not maxima or are unnecessary. At the same time, the key points after the initial filtering are further filtered by combining the obtained lane line starting coordinates, lane line length, lane line slope and reference distance to obtain the target heatmap. Then, based on the target heatmap, the target lane line key points corresponding to the lane line image sample are obtained.
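The first filtering stage can be sketched as follows. This is a minimal illustration of non-maximum suppression on a heatmap, assuming a 3x3 neighbourhood and a fixed score threshold; neither value is given in the specification, and `heatmap_nms` is a hypothetical helper. The second stage, which uses the starting coordinates, length, slope and reference distance, is not shown.

```python
def heatmap_nms(heatmap, threshold=0.5):
    """Return (row, col, score) for cells that are 3x3 local maxima.

    heatmap is a 2D list of scores. Cells below `threshold` are discarded.
    Ties are kept: a cell survives if no neighbour has a strictly higher score.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    keypoints = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score >= n for n in neighbours):
                keypoints.append((r, c, score))
    return keypoints

heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.2, 0.6],
    [0.0, 0.0, 0.1, 0.4],
]
keypoints = heatmap_nms(heatmap)
```

Only the two peaks at (1, 1) and (2, 3) survive; every other cell is either below the threshold or dominated by a neighbour.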
[0117] Furthermore, after obtaining the clustering information of lane lines through prediction, a loss function can be calculated based on this clustering information and the actual clustering information of the target lane lines determined based on the sample labels. Then, the parameters of the clustering information can be adjusted and optimized based on these loss functions.
[0118] Taking the lane line length in the clustering information as an example: if the predicted lane line length is 'a', and the actual length of the corresponding target lane line in the sample label is 'b', the loss function between 'a' and 'b' is calculated. If the loss function does not meet the preset lane line length loss threshold, the network parameters of the detection head corresponding to the lane line length are adjusted.
[0119] Similarly, in the embodiments of this specification, the loss function calculations for lane line starting coordinates, lane line slope, reference distance, etc. in the above lane line clustering information can all refer to the loss function calculation method with lane line length as an example, and the network parameters of the corresponding detection head can be adjusted based on the calculated loss function, thereby improving the accuracy of subsequent lane line clustering information.
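As a rough sketch of the per-attribute check described above, the following code computes a loss for each item of clustering information and reports which detection heads would be adjusted. The L1 (mean absolute error) loss, the threshold values and the names `l1_loss` and `heads_to_update` are assumptions for illustration; the specification does not fix any of them.

```python
def l1_loss(predicted, actual):
    """Mean absolute error between two equal-length sequences of numbers."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def heads_to_update(predicted_info, actual_info, thresholds):
    """Return the names of clustering attributes whose loss exceeds its threshold."""
    return [
        name
        for name in predicted_info
        if l1_loss(predicted_info[name], actual_info[name]) > thresholds[name]
    ]

# Hypothetical values for a single lane line.
predicted_info = {"length": [120.0], "slope": [0.52], "start": [64.0, 287.0]}
actual_info = {"length": [100.0], "slope": [0.50], "start": [63.0, 288.0]}
thresholds = {"length": 5.0, "slope": 0.05, "start": 2.0}

to_update = heads_to_update(predicted_info, actual_info, thresholds)
```

In this example only the length prediction is far enough from the label to trigger an adjustment of its detection head.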
[0120] The embodiments in this specification predict lane line clustering information by predicting the first feature image, and then predicting the second feature image to obtain the initial lane line key points, which provides a basis for subsequent lane line prediction and further accelerates the training speed of the lane line model.
[0121] Step 208: Perform clustering based on the lane line clustering information and the initial lane line key points to obtain the predicted lane lines.
[0122] Specifically, based on the acquired lane line clustering information, key points of the same type of initial lane lines are clustered, and the clustering results are filtered based on the lane line clustering information to generate target clustering results. Predicted lane lines are then obtained based on the target clustering results.
[0123] In one embodiment, to improve the accuracy of predicted lane line generation, it is necessary to cluster key points of the same initial lane line based on the acquired clustering information, and then filter the generated clustering results based on the clustering information to improve the accuracy of the predicted lane lines. The specific implementation method is as follows:
[0124] The step of clustering based on the lane line clustering information and the initial lane line key points to obtain predicted lane lines includes:
[0125] Based on the lane line slope, clustering is performed from the bottom of the key points of the initial lane line to obtain clustering results;
[0126] Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
[0127] Specifically, based on the lane slope in the predicted clustering information, starting from the bottom of the initial lane key points (i.e., heatmap), key points belonging to the same lane line are continuously searched upwards to complete the gradient clustering of lane lines and obtain the clustering results. Then, the clustering results are filtered based on the lane starting coordinates of the lane lines in the clustering results to remove outliers. At the same time, based on the lane length of each lane line generated after clustering, the difference between the generated lane length and the predicted lane length is judged, and the clustering results are further filtered to generate predicted lane lines.
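The filtering of clustering results can be sketched as below. The sketch assumes that each cluster is a list of (x, y) key points ordered from the bottom of the image upwards, that lane length is measured as polyline length, and that the tolerances `start_tol` and `length_tol` are free parameters; these choices and the function names are hypothetical, not taken from the specification.

```python
import math

def polyline_length(points):
    """Sum of the distances between consecutive key points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def filter_clusters(clusters, predicted, start_tol=5.0, length_tol=0.3):
    """Keep clusters that agree with the predicted start point and length.

    clusters maps lane id -> key points ordered bottom-up.
    predicted maps lane id -> (start_xy, length).
    """
    kept = {}
    for lane_id, points in clusters.items():
        if len(points) < 2:
            continue  # a single point cannot form a lane line
        start, length = predicted[lane_id]
        if math.dist(points[0], start) > start_tol:
            continue  # bottom point is too far from the predicted start
        if abs(polyline_length(points) - length) > length_tol * length:
            continue  # clustered length disagrees with the predicted length
        kept[lane_id] = points
    return kept

clusters = {
    "left": [(10.0, 100.0), (20.0, 80.0), (30.0, 60.0)],
    "ghost": [(80.0, 100.0), (81.0, 95.0)],
}
predicted = {
    "left": ((10.0, 100.0), 45.0),
    "ghost": ((80.0, 100.0), 60.0),
}
kept = filter_clusters(clusters, predicted)
```

The "ghost" cluster is dropped because its clustered length (about 5.1) is far from its predicted length of 60.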
[0128] In another embodiment, in addition to clustering directly on the initial lane line key points to generate the corresponding predicted lane lines, clustering can also be performed after denoising the initial lane line key points, that is, on the resulting target lane line key points with lower noise. This makes the generated predicted lane lines more accurate. The specific implementation method is as follows:
[0129] After obtaining the key points of the target lane line, the process also includes:
[0130] Clustering is performed based on the lane line clustering information and the key points of the target lane line to obtain the predicted lane line;
[0131] Accordingly, the step of clustering based on the lane line clustering information and the target lane line key points to obtain the predicted lane line includes:
[0132] Based on the lane line slope, clustering is performed from the bottom of the key points of the target lane line to obtain the clustering results;
[0133] Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
[0134] Specifically, based on the lane slope in the predicted clustering information, starting from the bottom of the target lane key point (i.e., the target heatmap), key points belonging to the same lane line are continuously searched upwards to complete the gradient clustering of the lane lines and obtain the clustering results. Then, the clustering results are filtered based on the lane line starting coordinates in the clustering results to remove outliers. At the same time, based on the lane line length generated after clustering, the difference between the generated lane line length and the predicted lane line length is judged, and the clustering results are further filtered to generate predicted lane lines.
[0135] In practical applications, based on the predicted starting coordinates of the lane line, a ray is drawn from the bottom of the target lane line key point, according to the obtained position of the target lane line key point and its corresponding slope. Based on the reference distance (i.e., the aforementioned offset) from other lane line key points to this ray, it is determined whether other key points belong to the same lane line, thereby realizing gradient clustering and obtaining multiple lane lines composed of multiple key points (i.e., clustering results). Then, the clustering results are filtered based on the starting coordinates of the lane lines and the length of the lane lines in the clustering results to determine the final predicted lane lines.
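The ray-based assignment can be sketched as follows. This is a simplified illustration under several assumptions that the specification does not state: key points are (x, y) pixel coordinates with y increasing downwards, one slope is used per lane line and expressed as the change in x per unit step upwards, and a point joins a lane when its perpendicular distance to the ray is at most `max_offset`. All names are hypothetical.

```python
import math

def distance_to_ray(point, start, slope):
    """Perpendicular distance from a point to the line through `start`
    that moves `slope` units in x for every unit it climbs in the image."""
    x, y = point
    x0, y0 = start
    return abs((x - x0) - slope * (y0 - y)) / math.sqrt(1.0 + slope * slope)

def gradient_cluster(keypoints, lanes, max_offset=2.0):
    """Assign each key point to the nearest lane ray within `max_offset`.

    lanes maps lane id -> (start_xy, slope). Unmatched points are dropped.
    """
    clusters = {lane_id: [] for lane_id in lanes}
    for point in keypoints:
        best_id, best_dist = None, max_offset
        for lane_id, (start, slope) in lanes.items():
            if point[1] > start[1]:
                continue  # point lies below the start of the ray
            dist = distance_to_ray(point, start, slope)
            if dist <= best_dist:
                best_id, best_dist = lane_id, dist
        if best_id is not None:
            clusters[best_id].append(point)
    for points in clusters.values():
        points.sort(key=lambda p: -p[1])  # bottom of the image first
    return clusters

lanes = {
    "left": ((10.0, 100.0), 0.5),
    "right": ((60.0, 100.0), -0.5),
}
keypoints = [(20.0, 80.0), (60.0, 100.0), (30.5, 60.0),
             (10.0, 100.0), (90.0, 50.0), (50.0, 80.0)]
clusters = gradient_cluster(keypoints, lanes)
```

The point (90, 50) is far from both rays and is discarded as noise, while (30.5, 60) is accepted into the left lane because it lies within the offset tolerance.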
[0136] The embodiments in this specification utilize a gradient clustering method for lane line key points, which can easily and quickly achieve the clustering process of lane line key points, greatly improving the clustering capability of lane line key points and further enhancing the performance of the lane line detection model.
[0137] Step 210: Train a lane detection model based on the predicted lane lines and the target lane lines.
[0138] Specifically, training a lane detection model based on the predicted lane line and the target lane line can be understood as follows: calculating a loss function based on the predicted lane line and the target lane line in the sample labels corresponding to the lane line sample images, optimizing the lane detection model based on the loss function, and training to obtain the lane detection model.
[0139] In one implementation, to enhance the overall learning of the network, the corresponding Bézier curve control point can be determined based on each key point of each lane line in the lane line image sample. Then, the corresponding Bézier curve is generated based on the Bézier curve control point. The specific implementation method is as follows:
[0140] After obtaining the first feature image and the second feature image of the lane line image sample, the method further includes:
[0141] Based on the first feature image, determine the first Bézier curve control point corresponding to the key point of the lane line in the lane line image sample;
[0142] Based on the first Bézier curve control point and the predicted lane line, determine the first Bézier curve corresponding to the predicted lane line.
[0143] The first Bézier curve control point can be understood as the control point that controls the generation of the Bézier curve.
[0144] Specifically, based on the first feature image, the first Bézier curve control points corresponding to each key point of each lane line in the lane line image sample are determined. Based on these first Bézier curve control points and the coordinates of the predicted lane line, the Bézier key points corresponding to the predicted lane line coordinates are determined, and the first Bézier curve corresponding to the predicted lane line coordinates is determined based on these key points. Each key point corresponding to each lane line generates four corresponding Bézier curve control points, and then the first Bézier curve corresponding to each lane line is generated based on these four control points.
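Since four control points are generated per lane line, the curve is presumably a cubic Bézier curve, B(t) = (1-t)^3 P0 + 3(1-t)^2 t P1 + 3(1-t) t^2 P2 + t^3 P3 for t in [0, 1]. The sketch below evaluates this standard formula; the function names, the sample count and the control point values are illustrative assumptions.

```python
def cubic_bezier(control_points, t):
    """Evaluate a cubic Bézier curve at parameter t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = control_points
    s = 1.0 - t
    # Bernstein basis polynomials of degree 3.
    b0, b1, b2, b3 = s ** 3, 3 * s * s * t, 3 * s * t * t, t ** 3
    return (b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
            b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3)

def sample_bezier(control_points, num_samples=5):
    """Sample the curve at evenly spaced parameter values, endpoints included."""
    return [cubic_bezier(control_points, i / (num_samples - 1))
            for i in range(num_samples)]

control_points = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
curve = sample_bezier(control_points)
midpoint = cubic_bezier(control_points, 0.5)
```

The curve starts at the first control point and ends at the last one; the two inner control points only pull the curve towards them.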
[0145] After determining the first Bézier curve for the predicted lane line, a second Bézier curve can be generated based on the target lane line in the sample labels. The loss function is then calculated using both the first and second Bézier curves, and the lane detection model is trained based on this loss function. Local constraints on the Bézier curves are introduced, transforming the constraint from "point-to-point" to "line-to-line," enhancing the overall holistic consideration of the lane detection model. The specific implementation is as follows:
[0146] After determining the first Bézier curve corresponding to the predicted lane line, the method further includes:
[0147] Determine the second Bézier curve control points corresponding to the key points of the target lane line in the lane line image sample, and determine the second Bézier curve corresponding to the target lane line based on the second Bézier curve control points and the target lane line.
[0148] The lane detection model is trained based on the first and second Bézier curves.
[0149] Specifically, the second Bézier curve control points corresponding to each key point of the target lane line in the lane line image sample are determined. Based on these second Bézier curve control points and the target lane line coordinates contained in the sample label, the Bézier key points corresponding to the target lane line coordinates are determined, and the second Bézier curve corresponding to the target lane line coordinates is determined based on these Bézier key points. Then, the loss function is calculated based on the first Bézier curve and the second Bézier curve, and the lane line detection model is trained based on this loss function.
[0150] In one embodiment, to enhance the overall quality and accuracy of lane line model training, the loss function between the predicted lane line and the target lane line, and the loss function between the first Bézier curve and the second Bézier curve, can be calculated. The lane line detection model can then be trained based on these two loss functions. The specific implementation is as follows:
[0151] The step of training a lane detection model based on the predicted lane line and the target lane line includes:
[0152] A first loss function is determined based on the predicted lane line and the target lane line;
[0153] Determine the second loss function based on the first and second Bézier curves;
[0154] A lane detection model is trained based on the first loss function and the second loss function.
[0155] Specifically, a first loss function is obtained by calculating the loss function based on the predicted lane line coordinates and the target lane line coordinates in the sample labels. A second loss function is obtained by calculating the loss function based on the first Bézier curve of the predicted lane line coordinates and the second Bézier curve of the target lane line coordinates. The network parameters of the lane line detection model are then adjusted based on these first and second loss functions. This process is repeated iteratively until the preset model training conditions are met, at which point the training of the lane line detection model ends, resulting in a trained lane line model. The preset model training conditions can be that the number of iterations meets an iteration threshold, or that the accuracy of the lane line detection model meets a preset accuracy threshold, etc., and can be set according to the actual application. This specification does not specify any particular limitation.
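A minimal sketch of how the two loss functions and the stopping conditions might be combined is given below. The specification does not say how the losses are measured or merged, so the mean point distance, the weighted sum with `curve_weight`, and the threshold defaults are all assumptions, as are the function names.

```python
import math

def mean_point_distance(points_a, points_b):
    """Average Euclidean distance between corresponding points."""
    return sum(math.dist(a, b) for a, b in zip(points_a, points_b)) / len(points_a)

def total_loss(pred_lane, target_lane, pred_curve, target_curve, curve_weight=1.0):
    """First loss (point-to-point) plus weighted second loss (curve-to-curve)."""
    first_loss = mean_point_distance(pred_lane, target_lane)
    # The curves are compared at matching sample positions along each curve.
    second_loss = mean_point_distance(pred_curve, target_curve)
    return first_loss + curve_weight * second_loss

def should_stop(iteration, accuracy, max_iterations=1000, accuracy_threshold=0.95):
    """Stop when either preset training condition is met."""
    return iteration >= max_iterations or accuracy >= accuracy_threshold

# Hypothetical key points and curve samples.
pred_lane = [(0.0, 0.0), (3.0, 4.0)]
target_lane = [(0.0, 0.0), (0.0, 0.0)]
pred_curve = [(1.0, 0.0)]
target_curve = [(0.0, 0.0)]

loss = total_loss(pred_lane, target_lane, pred_curve, target_curve, curve_weight=0.5)
```

Comparing sampled curves rather than only individual key points is one way to express the "line-to-line" constraint: an error anywhere along the curve contributes to the loss.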
[0156] The embodiments in this specification train the lane detection model by calculating the loss function between the predicted lane line and the target lane line in the sample label. At the same time, by combining the constraint of Bézier curve, the "point-to-point" constraint is transformed into a "line-to-line" constraint, which makes the model training more accurate and enhances the overall consideration of network learning.
[0157] The lane detection model training method provided in this specification includes: determining lane line image samples and sample labels, wherein the sample labels are target lane lines in the lane line image samples; performing feature extraction on the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples; determining lane line clustering information and initial lane line key points based on the first feature image and the second feature image; performing clustering based on the lane line clustering information and the initial lane line key points to obtain predicted lane lines; and training a lane line detection model based on the predicted lane lines and the target lane lines.
[0158] Specifically, this method extracts global and local features from the input lane line sample image to determine a first feature image (i.e., local features) and a second feature image (i.e., global features). Then, it predicts the lane lines based on the first and second feature images to obtain lane line clustering information and initial lane line key points. Subsequently, based on this lane line clustering information and initial lane line key points, considering both global and local features, the lane line key points are clustered to obtain predicted lane lines. This enables network training of the lane line detection model. By integrating the clustering process of lane line key points into the entire lane line detection model training process, the clustering of lane line key points is performed simultaneously with network training, thereby improving the overall performance of the lane line detection model.
[0159] See Figure 3. Figure 3 shows a flowchart of a lane detection model training method according to an embodiment of this specification, which specifically includes the following steps:
[0160] Step 302: Determine the lane line image samples and sample labels.
[0161] Step 304: Feature extraction.
[0162] Specifically, lane line image samples and their corresponding labels are input into the backbone network of the lane line detection model for image downsampling to obtain downsampled feature images. Then, the downsampled feature images are input into Feature Pyramid Networks (FPN) for upsampling a corresponding number of times to obtain upsampled multi-size feature images. These multi-size feature images are then fused to obtain the initial feature image. In practical applications, to improve the feature extraction effect of small-size feature images, a self-attention mechanism (SA) can be introduced to capture global information and obtain a larger receptive field and contextual information.
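The upsample-and-fuse step can be illustrated with a toy sketch. A real FPN also applies learned convolutions to each level, which are omitted here; the nearest-neighbour upsampling, the element-wise addition and the function names are simplifying assumptions rather than details from the specification.

```python
def upsample2x(feature):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in feature:
        wide = [value for value in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))  # repeat each row
    return out

def fuse(coarse, fine):
    """Upsample the coarse map and add it element-wise to the finer map."""
    upsampled = upsample2x(coarse)
    return [
        [u + f for u, f in zip(up_row, fine_row)]
        for up_row, fine_row in zip(upsampled, fine)
    ]

coarse = [[1, 2]]              # 1x2 downsampled feature map
fine = [[0, 0, 0, 0],
        [1, 1, 1, 1]]          # 2x4 higher-resolution feature map
fused = fuse(coarse, fine)
```

The fused map keeps the resolution of the finer level while carrying the context of the coarser one.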
[0163] In one optional embodiment, in order to improve the feature extraction accuracy of FPN, after obtaining the target lane line feature image, a corresponding predictive segmentation mask can be obtained based on the target lane line feature image, and then a true segmentation mask can be determined based on the target lane line in the lane line sample image. A loss function is calculated based on the predictive segmentation mask of the target lane line feature image and the true segmentation mask of the target feature image, so as to adjust the network parameters of FPN according to the loss function value, thereby improving the lane line segmentation accuracy of FPN.
[0164] Step 306: Local feature enhancement.
[0165] Specifically, in order to improve the prediction efficiency of subsequent clustering information, the initial lane line feature image can be input into the local feature analysis network layer to enhance the local features of the initial feature image, enhance the correlation between adjacent key points, supplement local information, obtain the target lane line feature image, and improve the subsequent prediction effect.
[0166] Step 308: Feature extraction.
[0167] Specifically, after obtaining the initial lane line feature image, different detection heads can be used, based on the initial lane line feature image and the target feature image, to predict the heatmap corresponding to the lane line sample image, the distance (offset) from each key point of each lane line in the lane line image sample to its corresponding lane line, and the lane line clustering information (i.e., lane line starting coordinates, length, slope or gradient, and Bézier control points).
[0168] Step 310: Clustering.
[0169] Specifically, the heatmap is suppressed by a non-maximum suppression algorithm to remove some key points from the heatmap, thus obtaining the target heatmap (i.e., the initial lane line key points). Starting from the bottom of the target heatmap, key points belonging to the same lane line are searched based on the lane line slope. The key points of the same lane line are clustered to obtain the clustering results. The clustering results are then filtered based on the starting coordinates and lane line length of each lane line to obtain the clustered predicted lane lines.
[0170] Step 312: Obtain the target lane line.
[0171] Specifically, the target lane line is obtained based on the sample label corresponding to the lane line image sample.
[0172] Step 314: Model training.
[0173] Specifically, the Bézier curve of the predicted lane line is obtained from the Bézier curve control points obtained above and the predicted lane line. A loss function is then calculated between this curve and the Bézier curve of the target lane line, which is generated from the target lane line in the sample label, to obtain the first loss function. Next, the second loss function between the predicted lane line and the target lane line in the sample label is calculated. The lane line detection model is trained using the first loss function and the second loss function to obtain the trained lane line detection model.
[0174] The specific implementation of steps 302-314 above is consistent with the specific implementation of the lane line detection model training method in the above embodiment, and will not be discussed in detail here. For details, please refer to the lane line detection model training method in the above embodiment.
[0175] This specification provides a lane detection model training method according to one embodiment, which involves determining lane line image samples and sample labels, wherein the sample labels are target lane lines in the lane line image samples; extracting features from the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples; determining lane line clustering information and initial lane line key points based on the first feature image and the second feature image; performing clustering based on the lane line clustering information and the initial lane line key points to obtain predicted lane lines; and training a lane line detection model based on the predicted lane lines and the target lane lines.
[0176] Specifically, this method extracts global and local features from the input lane line sample image to determine a first feature image (i.e., local features) and a second feature image (i.e., global features). Then, it predicts the lane lines based on the first and second feature images to obtain lane line clustering information and initial lane line key points. Subsequently, based on this lane line clustering information and initial lane line key points, considering both global and local features, the lane line key points are clustered to obtain predicted lane lines. This enables network training of the lane line detection model. By integrating the clustering process of lane line key points into the entire lane line detection model training process, the clustering of lane line key points is performed simultaneously with network training, thereby improving the overall performance of the lane line detection model.
[0177] See Figure 4. Figure 4 shows a flowchart of a lane line detection method according to an embodiment of this specification, which specifically includes the following steps:
[0178] Step 402: Input the target lane line image into the lane line detection model, wherein the lane line detection model is obtained through the lane line detection model training method described above;
[0179] The target lane line image can be understood as any image containing lane lines in any scenario and of any type.
[0180] Step 404: Extract features from the target lane line image based on the lane line detection model to obtain a first feature image and a second feature image of the target lane line image;
[0181] Specifically, the feature detection network layer in the lane line detection model is used to extract features from the target lane line image to obtain the second feature image of the target lane line image. Then, local feature enhancement processing is performed on the second feature image to obtain the first feature image corresponding to the target lane line image.
[0182] Step 406: Determine the target lane line clustering information and the initial lane line key points based on the first feature image and the second feature image;
[0183] Specifically, the first feature image is detected to obtain the initial lane line key points of the target lane line image, as well as the lane line start coordinates, lane line length, lane line slope (i.e. lane line clustering information) of each lane line, and the reference distance from the lane line key points to their corresponding lane lines.
[0184] Step 408: Perform clustering based on the lane line clustering information and the initial lane line key points to obtain the target lane line.
[0185] Specifically, starting from the bottom of the initial lane line key points obtained by the aforementioned method, key points belonging to the same lane line are searched according to the lane line slope. The initial lane line key points of the same lane line are clustered, and the clustering results are filtered according to the starting coordinates of each lane line and the length of the lane line to obtain the clustered lane lines. The target lane line is determined based on the clustered lane lines.
[0186] The specific implementation of steps 402-408 above is consistent with the specific implementation of the lane line detection model training method in the above embodiment, and will not be discussed in detail here. For details, please refer to the lane line detection model training method in the above embodiment.
[0187] This specification provides a lane line detection method according to one embodiment, which involves inputting a target lane line image into a lane line detection model, wherein the lane line detection model is obtained through the lane line detection model training method described above; extracting features from the target lane line image based on the lane line detection model to obtain a first feature image and a second feature image of the target lane line image; determining target lane line clustering information and initial lane line key points based on the first feature image and the second feature image; and performing clustering based on the lane line clustering information and the initial lane line key points to obtain the target lane line.
[0188] Specifically, by extracting features from the input lane line sample image, corresponding first feature image and second feature image are obtained. Then, based on the first feature image and second feature image, lane line clustering information and initial lane line key points are determined. Furthermore, lane line clustering is performed during the detection process to improve the accuracy of lane line detection.
[0189] See Figure 5. Figure 5 shows a flowchart of a target object detection model training method according to an embodiment of this specification, which specifically includes the following steps:
[0190] Step 502: Determine the target object image sample and the sample label, wherein the sample label is the target object in the target object image sample;
[0191] Step 504: Perform feature extraction on the target object image sample to obtain a first feature image and a second feature image of the target object image sample;
[0192] Step 506: Determine the target object clustering information and initial target object key points based on the first feature image and the second feature image;
[0193] Step 508: Perform clustering based on the target object clustering information and the key points of the initial target object to obtain the predicted target object;
[0194] Step 510: Train a target object detection model based on the predicted target object and the target object.
[0195] The specific implementation of steps 502-510 above is similar to that in the lane line detection model training method of the above embodiment, and will not be discussed in detail here. For details, please refer to the lane line detection model training method of the above embodiment.
[0196] This specification provides a target object detection model training method according to one embodiment, which involves determining target object image samples and sample labels, wherein the sample labels are target objects in the target object image samples; extracting features from the target object image samples to obtain a first feature image and a second feature image of the target object image samples; determining target object clustering information and initial target object key points based on the first feature image and the second feature image; performing clustering based on the target object clustering information and the initial target object key points to obtain predicted target objects; and training a target object detection model based on the predicted target objects and the target objects.
[0197] Specifically, this method extracts global and local features from the input target object sample image to determine the first feature image (i.e., local features) and the second feature image (i.e., global features). Then, it predicts the target object clustering information and initial target object key points by using the first and second feature images. Subsequently, based on the target object clustering information and the initial target object key points, and considering both global and local features, the target object key points are clustered to obtain the predicted target object. This enables the network training of the target object detection model. By integrating the clustering process of target object key points into the entire target object detection model training process, the clustering of target object key points is performed simultaneously with network training, thereby improving the overall performance of the target object detection model.
[0198] Corresponding to the above method embodiments, this specification also provides embodiments of a lane line detection model training device. Figure 6 shows a schematic diagram of a lane line detection model training device according to one embodiment of this specification. As shown in Figure 6, the device includes:
[0199] The first determining module 602 is configured to determine lane line image samples and sample labels, wherein the sample label is the target lane line in the lane line image sample;
[0200] The first acquisition module 604 is configured to perform feature extraction on the lane line image sample to obtain a first feature image and a second feature image of the lane line image sample.
[0201] The second determining module 606 is configured to determine lane line clustering information and initial lane line key points based on the first feature image and the second feature image.
[0202] The second acquisition module 608 is configured to perform clustering based on the lane line clustering information and the initial lane line key points to obtain the predicted lane line.
[0203] The third acquisition module 610 is configured to train a lane detection model based on the predicted lane line and the target lane line.
[0204] Optionally, the first acquisition module 604 is further configured to:
[0205] Feature extraction is performed on the lane line image sample to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image sample;
[0206] The initial feature image is subjected to local feature enhancement processing to obtain the first feature image of the lane line image sample.
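As a non-limiting illustration of the relationship between the two feature images, the sketch below keeps the initial feature image unchanged as the second feature image and derives the first feature image from it. The enhancement operator used here (adding the mean of each 3x3 neighborhood back onto the cell) is only a stand-in chosen for brevity; the actual local feature enhancement processing is not specified in this passage, and the names `enhance_local` and `split_features` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]  # a single-channel feature image as nested lists


def enhance_local(feature: Grid) -> Grid:
    """Stand-in enhancement: add each cell's 3x3 neighborhood mean to it."""
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Neighborhood is clipped at the image border.
            window = [feature[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = feature[r][c] + sum(window) / len(window)
    return out


def split_features(initial: Grid) -> Tuple[Grid, Grid]:
    """Return (first feature image, second feature image)."""
    second = initial                 # the initial feature image, unchanged
    first = enhance_local(initial)   # locally enhanced copy
    return first, second
```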
[0207] Optionally, the second determining module 606 is further configured to:
[0208] The starting coordinates of the lane line, the length of the lane line, and the slope of the lane line in the lane line image sample are determined based on the first feature image.
[0209] The initial lane line key points corresponding to the lane line image sample are determined based on the second feature image.
[0210] Optionally, the device further includes:
[0211] The key point processing module is configured as follows:
[0212] The initial lane line key points are processed using a preset filtering algorithm, along with the lane line starting coordinates, lane line length, lane line slope, and reference distance, to obtain the target lane line key points.
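As a non-limiting illustration, one possible filtering rule is sketched below: a key point is kept only if it lies within the predicted lane length above the starting coordinates and within the reference distance of the straight line implied by the start and the slope. This rule is an assumption made for illustration; the preset filtering algorithm itself is not defined in this passage. The function name `filter_key_points`, the convention that y grows downward, and the slope convention (change in x per unit step upward) are likewise hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), y grows downward


def filter_key_points(points: List[Point], start: Point, length: float,
                      slope: float, ref_dist: float) -> List[Point]:
    """Keep key points consistent with one lane's clustering information."""
    sx, sy = start
    kept = []
    for x, y in points:
        rise = sy - y  # how far above the starting point this key point lies
        if rise < 0 or rise > length:
            continue  # below the start or beyond the predicted lane length
        expected_x = sx + slope * rise
        if abs(x - expected_x) <= ref_dist:
            kept.append((x, y))
    return kept
```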
[0213] Optionally, the second acquisition module is further configured to:
[0214] Based on the lane line slope, clustering is performed from the bottom of the key points of the initial lane line to obtain clustering results;
[0215] Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
[0216] Optionally, the device further includes:
[0217] The target lane line key point clustering module is configured as follows:
[0218] Clustering is performed based on the lane line clustering information and the key points of the target lane line to obtain the predicted lane line;
[0219] Accordingly, the target lane line key point clustering module is further configured as follows:
[0220] Based on the lane line slope, clustering is performed from the bottom of the key points of the target lane line to obtain the clustering results;
[0221] Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
[0222] Optionally, the device further includes:
[0223] The first Bézier curve determination module is configured as follows:
[0224] Based on the first feature image, determine the first Bézier curve control point corresponding to the key point of the lane line in the lane line image sample;
[0225] Based on the first Bézier curve control point and the predicted lane line, determine the first Bézier curve corresponding to the predicted lane line.
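As a non-limiting illustration of how a Bézier curve follows from its control points, the sketch below evaluates the curve with the standard Bernstein polynomial form. The number of control points, the sampling density, and the names `bezier_point` and `sample_bezier` are assumptions; how the predicted lane line enters the determination of the first Bézier curve is not detailed in this passage and is not modelled here.

```python
from math import comb
from typing import List, Tuple

Point = Tuple[float, float]


def bezier_point(ctrl: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve of degree len(ctrl) - 1 at parameter t in [0, 1]."""
    n = len(ctrl) - 1
    # Bernstein basis weights for each control point.
    weights = [comb(n, i) * (1 - t) ** (n - i) * t ** i for i in range(n + 1)]
    x = sum(w * p[0] for w, p in zip(weights, ctrl))
    y = sum(w * p[1] for w, p in zip(weights, ctrl))
    return (x, y)


def sample_bezier(ctrl: List[Point], num: int = 5) -> List[Point]:
    """Sample num evenly spaced parameter values, endpoints included (num >= 2)."""
    return [bezier_point(ctrl, k / (num - 1)) for k in range(num)]
```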
[0226] Optionally, the device further includes:
[0227] The second Bézier curve determination module is configured as follows:
[0228] Determine the second Bézier curve control points corresponding to the key points of the target lane line in the lane line image sample, and determine the second Bézier curve corresponding to the target lane line based on the second Bézier curve control points and the target lane line.
[0229] The lane detection model is trained based on the first and second Bézier curves.
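As a non-limiting illustration of how second Bézier curve control points might be derived from the key points of a target lane line, the sketch below fits a cubic curve by least squares. The cubic degree, the chord-length parameterization, the choice to pin the end control points to the first and last key points, and the name `fit_cubic_bezier` are all assumptions; this passage does not state how the control points are obtained.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def fit_cubic_bezier(points: List[Point]) -> List[Point]:
    """Least-squares cubic Bezier fit with endpoints pinned to the data."""
    if len(points) < 4:
        raise ValueError("need at least four key points")
    # Chord-length parameterization: t grows with distance along the polyline.
    dists = [hypot(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(points, points[1:])]
    total = sum(dists)
    if total == 0:
        raise ValueError("key points must not all coincide")
    ts = [0.0]
    for d in dists:
        ts.append(ts[-1] + d / total)

    p0, p3 = points[0], points[-1]
    a11 = a12 = a22 = 0.0
    r1, r2 = [0.0, 0.0], [0.0, 0.0]
    for t, q in zip(ts, points):
        b0, b1 = (1 - t) ** 3, 3 * (1 - t) ** 2 * t
        b2, b3 = 3 * (1 - t) * t ** 2, t ** 3
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for axis in range(2):
            # Part of the data not explained by the two pinned endpoints.
            resid = q[axis] - b0 * p0[axis] - b3 * p3[axis]
            r1[axis] += b1 * resid
            r2[axis] += b2 * resid

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("key points do not constrain the inner control points")
    # Solve the 2x2 normal equations separately for x and y.
    p1 = tuple((a22 * r1[k] - a12 * r2[k]) / det for k in range(2))
    p2 = tuple((a11 * r2[k] - a12 * r1[k]) / det for k in range(2))
    return [p0, p1, p2, p3]
```

For evenly spaced key points on a straight line, this fit places the two inner control points at one third and two thirds of the way along the segment.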
[0230] Optionally, the third acquisition module 610 is further configured to:
[0231] A first loss function is determined based on the predicted lane line and the target lane line;
[0232] Determine the second loss function based on the first and second Bézier curves;
[0233] A lane detection model is trained based on the first loss function and the second loss function.
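As a non-limiting illustration of training on two loss terms, the sketch below computes a first loss from the predicted and target lane lines, a second loss from points sampled on the first and second Bézier curves, and adds them. The mean point-to-point distance used for both terms, the weighting factor `curve_weight`, and the function names are assumptions; the concrete loss functions are not given in this passage.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def mean_point_distance(pred: List[Point], target: List[Point]) -> float:
    """Average Euclidean distance between corresponding points."""
    if len(pred) != len(target) or not pred:
        raise ValueError("point lists must be non-empty and of equal length")
    return sum(hypot(px - tx, py - ty)
               for (px, py), (tx, ty) in zip(pred, target)) / len(pred)


def total_loss(pred_lane: List[Point], target_lane: List[Point],
               pred_curve: List[Point], target_curve: List[Point],
               curve_weight: float = 1.0) -> float:
    """Sum of the lane line term and the weighted Bezier curve term."""
    first_loss = mean_point_distance(pred_lane, target_lane)
    second_loss = mean_point_distance(pred_curve, target_curve)
    return first_loss + curve_weight * second_loss
```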
[0234] Optionally, the first acquisition module 604 is further configured to:
[0235] Based on the feature extraction network of the lane line detection model, features are extracted from the lane line image samples to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image samples, wherein the feature extraction network includes a feature prediction layer;
[0236] Accordingly, the device further includes:
[0237] The parameter adjustment module is configured as follows:
[0238] Generate a target feature image based on the target lane line in the lane line image sample;
[0239] Based on the target feature image and the second feature image, a third loss function is determined, and the network parameters of the feature prediction layer are adjusted according to the third loss function.
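As a non-limiting illustration of the third loss function, the sketch below renders the target lane line into a binary target feature image and compares it with the second feature image using a mean squared error. The binary rendering, the choice of mean squared error, and the names `render_target` and `mse` are assumptions; this passage does not specify how the target feature image is generated or which error measure is used.

```python
from typing import List, Tuple

Grid = List[List[float]]


def render_target(points: List[Tuple[int, int]], height: int, width: int) -> Grid:
    """Mark each target key point (x, y) with 1.0 on an otherwise zero grid."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y in points:
        if 0 <= y < height and 0 <= x < width:  # ignore points off the image
            grid[y][x] = 1.0
    return grid


def mse(a: Grid, b: Grid) -> float:
    """Mean squared error over all cells of two equally sized grids."""
    cells = [(va - vb) ** 2 for ra, rb in zip(a, b) for va, vb in zip(ra, rb)]
    return sum(cells) / len(cells)
```

In this sketch, the value returned by `mse(render_target(...), second_feature)` plays the role of the third loss, which would then drive the adjustment of the feature prediction layer.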
[0240] This specification provides a lane line detection model training apparatus in one embodiment, which determines lane line image samples and sample labels, wherein the sample labels are target lane lines in the lane line image samples; performs feature extraction on the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples; determines lane line clustering information and initial lane line key points based on the first feature image and the second feature image; performs clustering based on the lane line clustering information and the initial lane line key points to obtain predicted lane lines; and trains a lane line detection model based on the predicted lane lines and the target lane lines.
[0241] Specifically, the device extracts global and local features from the input lane line sample image to determine a first feature image (i.e., local features) and a second feature image (i.e., global features). Then, it predicts the lane lines based on the first and second feature images to obtain lane line clustering information and initial lane line key points. Subsequently, based on this lane line clustering information and initial lane line key points, considering both global and local features, the lane line key points are clustered to obtain predicted lane lines. This enables network training of the lane line detection model. By integrating the clustering process of lane line key points into the entire lane line detection model training process, the clustering of lane line key points is performed simultaneously with network training, thereby improving the overall performance of the lane line detection model.
[0242] The above is a schematic scheme of a lane line detection model training device according to this embodiment. It should be noted that the technical solution of this lane line detection model training device and the technical solution of the lane line detection model training method described above belong to the same concept. For details not described in detail in the technical solution of the lane line detection model training device, please refer to the description of the technical solution of the lane line detection model training method described above.
[0243] Corresponding to the above method embodiments, this specification also provides embodiments of a lane line detection device. Figure 7 shows a schematic diagram of a lane line detection device according to one embodiment of this specification. As shown in Figure 7, the device includes:
[0244] The image input module 702 is configured to input the target lane line image into the lane line detection model, wherein the lane line detection model is obtained by the lane line detection model training method described above.
[0245] The fourth acquisition module 704 is configured to extract features from the target lane line image based on the lane line detection model to obtain a first feature image and a second feature image of the target lane line image.
[0246] The third determining module 706 is configured to determine target lane line clustering information and initial lane line key points based on the first feature image and the second feature image.
[0247] The fifth acquisition module 708 is configured to perform clustering based on the lane line clustering information and the initial lane line key points to obtain the target lane line.
[0248] One embodiment of this specification provides a lane line detection device that inputs a target lane line image into a lane line detection model, wherein the lane line detection model is obtained through the lane line detection model training method described above; features are extracted from the target lane line image based on the lane line detection model to obtain a first feature image and a second feature image of the target lane line image; target lane line clustering information and initial lane line key points are determined based on the first feature image and the second feature image; and clustering is performed based on the lane line clustering information and the initial lane line key points to obtain the target lane line.
[0249] Specifically, features are extracted from the input target lane line image to obtain the corresponding first feature image and second feature image. Then, based on the first feature image and the second feature image, the target lane line clustering information and the initial lane line key points are determined. Lane line clustering is then performed during the detection process, which improves the accuracy of lane line detection.
[0250] The above is a schematic scheme of a lane line detection device according to this embodiment. It should be noted that the technical solution of this lane line detection device and the technical solution of the lane line detection method described above belong to the same concept. For details not described in detail in the technical solution of the lane line detection device, please refer to the description of the technical solution of the lane line detection method described above.
[0251] Corresponding to the above method embodiments, this specification also provides embodiments of a target object detection model training device. Figure 8 shows a schematic diagram of a target object detection model training device according to one embodiment of this specification. As shown in Figure 8, the device includes:
[0252] The fourth determining module 802 is configured to determine a target object image sample and a sample label, wherein the sample label is the target object in the target object image sample;
[0253] The sixth acquisition module 804 is configured to extract features from the target object image sample to obtain a first feature image and a second feature image of the target object image sample;
[0254] The fifth determining module 806 is configured to determine the target object clustering information and the initial target object key points based on the first feature image and the second feature image;
[0255] The seventh acquisition module 808 is configured to perform clustering based on the target object clustering information and the initial target object key points to obtain the predicted target object;
[0256] The eighth acquisition module 810 is configured to train a target object detection model based on the predicted target object and the target object.
[0257] This specification provides a target object detection model training apparatus in one embodiment, which determines target object image samples and sample labels, wherein the sample labels are target objects in the target object image samples; performs feature extraction on the target object image samples to obtain a first feature image and a second feature image of the target object image samples; determines target object clustering information and initial target object key points based on the first feature image and the second feature image; performs clustering based on the target object clustering information and the initial target object key points to obtain predicted target objects; and trains a target object detection model based on the predicted target objects and the target objects.
[0258] Specifically, the apparatus extracts global and local features from the input target object sample image to determine the first feature image (i.e., local features) and the second feature image (i.e., global features). Then, it predicts the target object clustering information and the initial target object key points from the first and second feature images. Subsequently, based on the target object clustering information and the initial target object key points, and considering both global and local features, the target object key points are clustered to obtain the predicted target object. This enables the network training of the target object detection model. By integrating the clustering of target object key points into the entire training process of the target object detection model, the clustering is performed simultaneously with network training, thereby improving the overall performance of the target object detection model.
[0259] The above is a schematic scheme of a target object detection model training device according to this embodiment. It should be noted that the technical solution of this target object detection model training device and the technical solution of the target object detection model training method described above belong to the same concept. For details not described in detail in the technical solution of the target object detection model training device, please refer to the description of the technical solution of the target object detection model training method described above.
[0260] Figure 9 shows a structural block diagram of a computing device 900 according to one embodiment of this specification. The components of the computing device 900 include, but are not limited to, a memory 910 and a processor 920. The processor 920 is connected to the memory 910 via a bus 930, and a database 950 is used to store data.
[0261] The computing device 900 also includes an access device 940, which enables the computing device 900 to communicate via one or more networks 960. Examples of these networks include Public Switched Telephone Network (PSTN), Local Area Network (LAN), Wide Area Network (WAN), Personal Area Network (PAN), or combinations of communication networks such as the Internet. The access device 940 may include one or more of any type of wired or wireless network interface (e.g., a network interface card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Wi-MAX (Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, or a Near Field Communication (NFC) interface.
[0262] In one embodiment of this specification, the above-described components of the computing device 900, as well as other components not shown in Figure 9, can also be connected to each other, for example, via a bus. It should be understood that the block diagram of the computing device shown in Figure 9 is for illustrative purposes only and is not intended to limit the scope of this specification. Those skilled in the art can add or replace other components as needed.
[0263] The computing device 900 can be any type of stationary or mobile computing device, including mobile computers or mobile computing devices (e.g., tablet computers, personal digital assistants, laptop computers, notebook computers, netbooks, etc.), mobile phones (e.g., smartphones), wearable computing devices (e.g., smartwatches, smart glasses, etc.) or other types of mobile devices, or stationary computing devices such as desktop computers or personal computers (PCs). The computing device 900 can also be a mobile or stationary server.
[0264] The processor 920 is configured to execute computer-executable instructions which, when executed by the processor, implement the steps of the aforementioned lane line detection model training method, lane line detection method, or target object detection model training method. The above is an illustrative scheme of a computing device according to this embodiment. It should be noted that the technical solution of this computing device belongs to the same concept as the aforementioned lane line detection model training method, lane line detection method, or target object detection model training method. Details not described in detail in the technical solution of the computing device can be found in the descriptions of the aforementioned lane line detection model training method, lane line detection method, or target object detection model training method.
[0265] An embodiment of this specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the lane detection model training method, lane detection method, or target object detection model training method described above.
[0266] The above is an illustrative scheme of a computer-readable storage medium according to this embodiment. It should be noted that the technical solution of this storage medium belongs to the same concept as the technical solution of the lane detection model training method, lane detection method, or target object detection model training method described above. For details not described in detail in the technical solution of the storage medium, please refer to the description of the technical solution of the lane detection model training method described above.
[0267] An embodiment of this specification also provides a computer program, wherein when the computer program is executed in a computer, it causes the computer to perform the steps of the lane line detection model training method, the lane line detection method, or the target object detection model training method described above.
[0268] The above is an illustrative scheme of a computer program according to this embodiment. It should be noted that the technical solution of this computer program belongs to the same concept as the technical solution of the lane line detection model training method, lane line detection method, or target object detection model training method described above. For details not described in detail in the technical solution of the computer program, please refer to the description of the technical solution of the lane line detection model training method, lane line detection method, or target object detection model training method described above.
[0269] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.
[0270] The computer instructions include computer program code, which may be in the form of source code, object code, executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording media, USB flash drive, portable hard drive, magnetic disk, optical disk, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media, etc. It should be noted that the content included in the computer-readable medium may be appropriately added to or subtracted according to the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media may not include electrical carrier signals and telecommunication signals.
[0271] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments in this specification are not limited to the described order of actions, because according to the embodiments in this specification, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily essential to the embodiments in this specification.
[0272] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0273] The preferred embodiments disclosed above are merely illustrative of this specification. The optional embodiments do not exhaustively describe all details, nor do they limit the invention to the specific implementations described. Clearly, many modifications and variations can be made based on the embodiments described herein. These embodiments are selected and specifically described in this specification to better explain the principles and practical applications of the embodiments, thereby enabling those skilled in the art to better understand and utilize this specification. This specification is limited only by the claims and their full scope and equivalents.
Claims
1. A method for training a lane detection model, comprising: Determine lane line image samples and sample labels, wherein the sample label is the target lane line in the lane line image sample; Feature extraction is performed on the lane line image sample to obtain a second feature image of the lane line image sample, and local feature enhancement processing is performed on the second feature image to generate a first feature image; The lane line clustering information in the lane line image sample is determined based on the first feature image, and the initial lane line key points corresponding to the lane line image sample are determined based on the second feature image; the lane line clustering information includes: lane line start coordinates, lane line length, lane line slope, and first Bézier curve control points; Clustering is performed based on the lane line slope and the key points of the initial lane line in the lane line clustering information to obtain clustering results. The clustering results are then filtered based on the lane line starting point coordinates and the lane line length to obtain predicted lane lines. The clustering results include multiple lane lines composed of multiple key points. Based on the first Bézier curve control point and the predicted lane line, determine the first Bézier curve corresponding to the predicted lane line; Determine the second Bézier curve control points corresponding to the key points of the target lane line, and determine the second Bézier curve corresponding to the target lane line based on the second Bézier curve control points and the target lane line; Based on the predicted lane line and the target lane line, a first loss function is determined, and based on the first Bézier curve and the second Bézier curve, a second loss function is determined. Based on the first loss function and the second loss function, a lane line detection model is trained to obtain the model.
2. The lane line detection model training method according to claim 1, wherein the step of extracting features from the lane line image samples to obtain a first feature image and a second feature image of the lane line image samples includes: Feature extraction is performed on the lane line image sample to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image sample; The initial feature image is subjected to local feature enhancement processing to obtain the first feature image of the lane line image sample.
3. The lane line detection model training method according to claim 1, wherein the lane line clustering information includes lane line start coordinates, lane line length, and lane line slope; Accordingly, based on the first feature image and the second feature image, lane line clustering information and initial lane line key points are determined, including: The starting coordinates of the lane line, the length of the lane line, and the slope of the lane line in the lane line image sample are determined based on the first feature image. The initial lane line key points corresponding to the lane line image sample are determined based on the second feature image.
4. The lane line detection model training method according to claim 3, wherein the lane line clustering information includes the reference distance of key points of the lane lines in the lane line image samples; Accordingly, after determining the initial lane line key points corresponding to the lane line image sample based on the second feature image, the process includes: The initial lane line key points are processed using a preset filtering algorithm, along with the lane line starting coordinates, lane line length, lane line slope, and reference distance, to obtain the target lane line key points.
5. The lane line detection model training method according to claim 3, wherein the step of clustering based on the lane line clustering information and the initial lane line key points to obtain the predicted lane line includes: Based on the lane line slope, clustering is performed from the bottom of the key points of the initial lane line to obtain clustering results; Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
6. The lane line detection model training method according to claim 4, further comprising, after obtaining the target lane line key points: Clustering is performed based on the lane line clustering information and the key points of the target lane line to obtain the predicted lane line; Accordingly, the step of clustering based on the lane line clustering information and the target lane line key points to obtain the predicted lane line includes: Based on the lane line slope, clustering is performed from the bottom of the key points of the target lane line to obtain the clustering results; Based on the starting coordinates of the lane line and the length of the lane line, the clustering results are filtered to obtain the predicted lane line.
7. The lane line detection model training method according to claim 1, further comprising, after obtaining the first feature image and the second feature image of the lane line image sample: Based on the first feature image, determine the first Bézier curve control point corresponding to the key point of the lane line in the lane line image sample; Based on the first Bézier curve control point and the predicted lane line, determine the first Bézier curve corresponding to the predicted lane line.
8. The lane detection model training method according to claim 7, further comprising, after determining the first Bézier curve and the second Bézier curve corresponding to the predicted lane line: The lane detection model is trained based on the first and second Bézier curves.
9. The lane line detection model training method according to claim 2, wherein the step of extracting features from the lane line image samples to obtain an initial feature image, and determining the initial feature image as the second feature image of the lane line image samples, comprises: Based on the feature extraction network of the lane line detection model, features are extracted from the lane line image samples to obtain an initial feature image, and the initial feature image is determined as the second feature image of the lane line image samples, wherein the feature extraction network includes a feature prediction layer; Accordingly, after determining the initial feature image as the second feature image of the lane line image sample, the method further includes: Generate a target feature image based on the target lane line in the lane line image sample; Based on the target feature image and the second feature image, a third loss function is determined, and the network parameters of the feature prediction layer are adjusted according to the third loss function.
10. A lane line detection method, comprising: inputting a target lane line image into a lane line detection model, wherein the lane line detection model is obtained by the lane line detection model training method according to any one of claims 1 to 9; performing, based on the lane line detection model, feature extraction on the target lane line image to obtain a second feature image of the target lane line image, and performing local feature enhancement processing on the second feature image to generate a first feature image; determining the lane line clustering information in the lane line image sample based on the first feature image, and determining the initial lane line key points corresponding to the lane line image sample based on the second feature image, wherein the lane line clustering information includes: lane line start coordinates, lane line length, lane line slope, and first Bézier curve control points; and clustering based on the lane line slope in the lane line clustering information and the initial lane line key points to obtain clustering results, and filtering the clustering results based on the lane line start coordinates and the lane line length to obtain the target lane line, wherein the clustering results include multiple lane lines composed of multiple key points.
11. A method for training a target object detection model, comprising: determining a target object image sample and a sample label, wherein the sample label is the target object in the target object image sample; performing feature extraction on the target object image sample to obtain a second feature image of the target object image sample, and performing local feature enhancement processing on the second feature image to generate a first feature image; determining the target object clustering information in the target object image sample based on the first feature image, and determining the initial target object key points corresponding to the target object image sample based on the second feature image, wherein the target object clustering information includes: lane line start coordinates, lane line length, lane line slope, and first Bézier curve control points; clustering based on the lane line slope in the target object clustering information and the initial target object key points to obtain clustering results, and filtering the clustering results based on the lane line start coordinates and the lane line length to obtain a predicted target object, wherein the clustering results include multiple lane lines composed of multiple key points; determining, based on the first Bézier curve control points and the predicted target object, the first Bézier curve corresponding to the predicted target object; determining the second Bézier curve control points corresponding to the key points of the target object, and determining the second Bézier curve corresponding to the target object based on the second Bézier curve control points and the target object; and determining a first loss function based on the predicted target object and the target object, determining a second loss function based on the first Bézier curve and the second Bézier curve, and training based on the first loss function and the second loss function to obtain the target object detection model.
12. A computing device, comprising: a memory and a processor; wherein the memory is configured to store computer-executable instructions, the processor is configured to execute the computer-executable instructions, and the computer-executable instructions, when executed by the processor, implement the steps of the lane line detection model training method according to any one of claims 1 to 9, or the steps of the lane line detection method according to claim 10, or the steps of the target object detection model training method according to claim 11.
13. A computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the steps of the lane line detection model training method according to any one of claims 1 to 9, or the steps of the lane line detection method according to claim 10, or the steps of the target object detection model training method according to claim 11.
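Illustrative sketch (not part of the claims): the slope-based clustering from the bottom and the subsequent filtering by start coordinates and length recited in claims 6 and 10 could be approximated as follows. The claims do not specify the data layout or the matching rule, so everything here is an assumption made for illustration: key points are `(x, y, slope)` tuples with `slope` taken as dx/dy in pixels, image `y` grows downward, and the function names `cluster_from_bottom` and `filter_lanes` as well as the tolerances `x_tol`, `start_tol`, and `min_ratio` are hypothetical.

```python
def cluster_from_bottom(keypoints, x_tol=5.0):
    """Group (x, y, slope) key points into lanes, scanning bottom-up.

    Assumption: slope is dx/dy in pixels and image y grows downward,
    so the bottom of the image has the largest y.
    """
    lanes = []
    # Visit key points from the bottom row upward (largest y first).
    for x, y, slope in sorted(keypoints, key=lambda p: -p[1]):
        best, best_err = None, x_tol
        for lane in lanes:
            lx, ly, ls = lane[-1]  # topmost point collected so far
            if y >= ly:
                continue  # a lane is only extended upward
            # x position that the lane's last slope predicts at this row
            err = abs(x - (lx + ls * (y - ly)))
            if err <= best_err:
                best, best_err = lane, err
        if best is None:
            lanes.append([(x, y, slope)])  # start a new cluster
        else:
            best.append((x, y, slope))
    return lanes


def filter_lanes(lanes, starts, lengths, start_tol=10.0, min_ratio=0.5):
    """Keep clusters that begin near a predicted start and are long enough.

    Assumption: starts[i] is an (x, y) start coordinate and lengths[i] is
    the predicted vertical length in pixels of the same lane line.
    """
    kept = []
    for lane in lanes:
        bx, by, _ = lane[0]          # bottommost point of the cluster
        extent = by - lane[-1][1]    # vertical extent of the cluster
        for (sx, sy), length in zip(starts, lengths):
            near = abs(bx - sx) <= start_tol and abs(by - sy) <= start_tol
            if near and extent >= min_ratio * length:
                kept.append(lane)
                break
    return kept


# Two synthetic lane lines plus one stray key point.
keypoints = [
    (100, 300, 0.5), (90, 280, 0.5), (80, 260, 0.5), (70, 240, 0.5),
    (300, 300, -0.5), (310, 280, -0.5), (320, 260, -0.5), (330, 240, -0.5),
    (200, 150, 0.0),
]
lanes = cluster_from_bottom(keypoints)
kept = filter_lanes(lanes, starts=[(100, 300), (300, 300)], lengths=[60, 60])
```

In this sketch the stray key point forms its own single-point cluster, which the filtering step discards because it does not begin near any predicted start coordinate. A greedy nearest-extrapolation rule is only one possible reading of "clustering based on the lane line slope"; the patent text does not confirm it.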
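Illustrative sketch (not part of the claims): claims 7, 8, and 11 recite a first Bézier curve for the predicted lane line, a second Bézier curve for the labelled lane line, and a second loss function determined from the two curves. The claims state neither the curve order nor the form of the loss, so the following is a minimal sketch under assumptions: control points are `(x, y)` tuples, the curve is evaluated with Bernstein polynomials for any number of control points, and the loss is taken as a mean L1 distance between points sampled at matching parameter values. The names `bezier_point`, `sample_bezier`, and `bezier_l1_loss` are hypothetical.

```python
from math import comb


def bezier_point(ctrl, t):
    """Evaluate a Bézier curve with control points ctrl at parameter t in [0, 1]."""
    n = len(ctrl) - 1
    x = y = 0.0
    for i, (px, py) in enumerate(ctrl):
        # Bernstein basis polynomial B_{i,n}(t)
        b = comb(n, i) * (1 - t) ** (n - i) * t ** i
        x += b * px
        y += b * py
    return x, y


def sample_bezier(ctrl, num=20):
    """Sample num points at evenly spaced parameter values from 0 to 1."""
    return [bezier_point(ctrl, k / (num - 1)) for k in range(num)]


def bezier_l1_loss(pred_ctrl, target_ctrl, num=20):
    """Mean L1 distance between two curves sampled at the same parameters.

    Assumption: this is one plausible stand-in for the 'second loss
    function'; the claims do not define its form.
    """
    pred = sample_bezier(pred_ctrl, num)
    target = sample_bezier(target_ctrl, num)
    total = sum(abs(px - tx) + abs(py - ty)
                for (px, py), (tx, ty) in zip(pred, target))
    return total / num


# Cubic example: a labelled curve and a prediction shifted 2 px to the right.
target_ctrl = [(100.0, 300.0), (95.0, 260.0), (85.0, 220.0), (70.0, 180.0)]
pred_ctrl = [(x + 2.0, y) for x, y in target_ctrl]
loss = bezier_l1_loss(pred_ctrl, target_ctrl)
```

Comparing curves sampled along their whole length, rather than individual key points, is one way a loss of this kind could reflect the overall shape of a lane line.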