Data labeling method and device, computer device, storage medium and program product
By acquiring target labeling data from multiple frames of environmental images during autonomous driving and determining the mapping relationship of detected targets, the problem of time-consuming traditional data labeling processes is solved, achieving efficient and accurate data labeling.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: 苏州万集车联网技术有限公司
- Filing Date: 2022-10-19
- Publication Date: 2026-06-16
AI Technical Summary
In existing technologies, the data labeling process for multi-sensor data is time-consuming and inefficient, making it difficult to effectively improve the perception capabilities of autonomous driving models.
By acquiring target label data from multiple environmental images at the same acquisition time in the target scene, the target mapping relationship between each environmental image is determined, and the label data is corrected based on the mapping relationship and the target label data, thus achieving one-time correction of target labels in multiple environmental images.
This eliminates the need for manual frame-by-frame label correction, avoids human error, greatly reduces labeling time, improves labeling efficiency, and ensures labeling accuracy.
Figure CN115797257B_ABST
Description
Technical Field
[0001] This application relates to the field of data labeling technology, and in particular to a data labeling method, apparatus, computer device, storage medium, and program product.
Background Technology
[0002] With the implementation and application of autonomous driving technology, many problems have emerged, such as insufficient perception capabilities and the need to improve model recognition capabilities.
[0003] In traditional technologies, to improve the accuracy of detection results, developers often use multiple sensor data (such as point cloud data and image data) to determine the detection result. Before this, it is often necessary to manually label a large number of data samples with high precision in order to train a machine learning model with high recognition accuracy and strong generalization. Then, the model outputs the detection results of the corresponding sensor data, which are then fused to determine the final detection result.
[0004] However, the amount of data from the various sensors is very large, and the number of samples that developers need to label is also very large, resulting in a long labeling process and low efficiency.
Summary of the Invention
[0005] Therefore, it is necessary to provide a data labeling method, apparatus, computer device, computer-readable storage medium, and computer program product to address the aforementioned technical problems.
[0006] In a first aspect, this application provides a data labeling method, which includes:
[0007] Acquire target labeling data from multiple frames of environmental images captured at the same time in the target scene;
[0008] Based on the target labeling data of each environmental image, the mapping relationship between the detected targets in each environmental image is obtained; wherein the detected targets that have a mapping relationship across the environmental images represent the same target in the target scene;
[0009] Based on the mapping relationship and the target label data of each environmental image, the detection results of the target in the target scene are determined;
[0010] Based on the detection results and mapping relationships of each target in the target scene, the target labeling data in each environmental image is corrected.
[0011] In one embodiment, the multi-frame environment image includes a point cloud image and a two-dimensional image, and the target labeling data includes point cloud target labeling data in the point cloud image and image target labeling data in the two-dimensional image; the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image.
[0012] Based on the target labeling data of each environmental image, the mapping relationship of the detected targets between the environmental images is obtained, including:
[0013] Obtain the first position of each point cloud detected target in the point cloud target labeling data, and the second position of each image detected target in the image target labeling data;
[0014] The first mapping relationship between the point cloud detection target and the image detection target is determined based on the first position and the second position.
[0015] In one embodiment, determining a first mapping relationship between a point cloud detection target and an image detection target based on a first position and a second position includes:
[0016] Transform the first and second positions to the same coordinate system;
[0017] Obtain the first position error between the converted first position and the second position;
[0018] The first mapping relationship between the point cloud detection target and the image detection target is determined based on the first position error.
[0019] In one embodiment, the detection result of the target in the target scene is determined based on the mapping relationship and the target label data of each environmental image, including:
[0020] Identify point cloud detection targets and image detection targets that have a first mapping relationship; wherein, the point cloud detection targets and image detection targets that have a first mapping relationship represent the same target in the target scene;
[0021] For the point cloud detection target and the image detection target that have the first mapping relationship, obtain the first confidence level of the point cloud target label data and the second confidence level of the image target label data in the same data dimension;
[0022] The detection results of the target in the target scene are determined based on the first confidence level and the second confidence level.
[0023] In one embodiment, determining the detection result of the target in the target scene based on a first confidence level and a second confidence level includes:
[0024] The first weight of the point cloud target label data and the second weight of the image target label data are determined based on the data dimensions; wherein the sum of the first weight and the second weight is 1.
[0025] The detection results of the target in the target scene are determined based on the first weight, the first confidence level, the second weight, and the second confidence level.
[0026] In one embodiment, determining a first weight for point cloud target labeling data and a second weight for image target labeling data based on data dimensions includes:
[0027] When the data dimension is the type dimension, the first weight is less than the second weight;
[0028] When the data dimension is the position dimension, the first weight is greater than the second weight.
[0029] In one embodiment, determining the detection result of the target in the target scene based on a first weight, a first confidence level, a second weight, and a second confidence level includes:
[0030] Obtain the first product between the first weight and the first confidence level, and the second product between the second weight and the second confidence level;
[0031] Compare the first product with the second product;
[0032] If the first product is greater than the second product, the detection result of the corresponding target in the target scene includes the point cloud target label data;
[0033] If the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes the image target label data.
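The weighted comparison in the preceding embodiments can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `Label` type, the `fuse` function, and the concrete weight values are assumptions. The text only requires that the two weights sum to 1, that the point cloud weight is smaller for the type dimension, and that it is larger for the position dimension.

```python
from typing import Any, NamedTuple

# Hypothetical weights per data dimension: (point cloud weight, image weight).
# Only the ordering and the sum of 1 come from the text; the numbers are assumed.
WEIGHTS = {"type": (0.3, 0.7), "position": (0.7, 0.3)}


class Label(NamedTuple):
    value: Any         # e.g. a class name or a box position
    confidence: float  # detector confidence in [0, 1]


def fuse(dimension: str, pc_label: Label, img_label: Label) -> Label:
    """Return the label whose weight * confidence product is larger."""
    w_pc, w_img = WEIGHTS[dimension]
    first_product = w_pc * pc_label.confidence
    second_product = w_img * img_label.confidence
    # "Less than or equal to" selects the image label, so ties go to the image.
    return pc_label if first_product > second_product else img_label


if __name__ == "__main__":
    fused_type = fuse("type", Label("vehicle", 0.70), Label("human", 0.60))
    fused_pos = fuse("position",
                     Label((12.0, 3.5, 0.0), 0.80),
                     Label((11.6, 3.9, 0.0), 0.85))
    print(fused_type.value, fused_pos.value)
```

With these assumed weights, the image label wins the type dimension even though its raw confidence is lower, while the point cloud label wins the position dimension.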
[0034] In one embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the mapping relationship also includes a second mapping relationship between the image detection targets in each two-dimensional image; the above method further includes:
[0035] Obtain the target position of each image detection target from the image target labeling data of each two-dimensional image frame;
[0036] A second mapping relationship between image detection targets in multiple frames of two-dimensional images is determined based on the location of each target; wherein, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0037] In a second aspect, this application also provides a data labeling device, which includes:
[0038] The target labeling module is used to acquire target labeling data from multiple frames of environmental images acquired at the same time in the target scene;
[0039] The mapping and association module is used to obtain the mapping relationship between detected targets in each environmental image based on the target label data of each environmental image; wherein, the detected targets with mapping relationship between each environmental image represent the same target in the target scene;
[0040] The result determination module is used to determine the detection results of targets in the target scene based on the mapping relationship and the target label data of each environmental image; the label correction module is used to correct the target label data in each environmental image based on the detection results and mapping relationship of each target in the target scene.
[0041] In a third aspect, this application also provides a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:
[0042] Acquire target labeling data from multiple frames of environmental images captured at the same time in the target scene;
[0043] Based on the target labeling data of each environmental image, the mapping relationship between the detected targets in each environmental image is obtained; wherein the detected targets that have a mapping relationship across the environmental images represent the same target in the target scene;
[0044] Based on the mapping relationship and the target label data of each environmental image, the detection results of the target in the target scene are determined;
[0045] Based on the detection results and mapping relationships of each target in the target scene, the target labeling data in each environmental image is corrected.
[0046] In a fourth aspect, this application also provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the following steps:
[0047] Acquire target labeling data from multiple frames of environmental images captured at the same time in the target scene;
[0048] Based on the target labeling data of each environmental image, the mapping relationship between the detected targets in each environmental image is obtained; wherein the detected targets that have a mapping relationship across the environmental images represent the same target in the target scene;
[0049] Based on the mapping relationship and the target label data of each environmental image, the detection results of the target in the target scene are determined;
[0050] Based on the detection results and mapping relationships of each target in the target scene, the target labeling data in each environmental image is corrected.
[0051] In a fifth aspect, this application also provides a computer program product, including a computer program that, when executed by a processor, performs the following steps:
[0052] Acquire target labeling data from multiple frames of environmental images captured at the same time in the target scene;
[0053] Based on the target labeling data of each environmental image, the mapping relationship between the detected targets in each environmental image is obtained; wherein the detected targets that have a mapping relationship across the environmental images represent the same target in the target scene;
[0054] Based on the mapping relationship and the target label data of each environmental image, the detection results of the target in the target scene are determined;
[0055] Based on the detection results and mapping relationships of each target in the target scene, the target labeling data in each environmental image is corrected.
[0056] The aforementioned data labeling method, apparatus, computer equipment, storage medium, and computer program product acquire target labeling data from multiple frames of environmental images acquired at the same time in a target scene. Based on the target labeling data of each environmental image, a mapping relationship between detected targets is obtained. Then, based on the mapping relationship and the target labeling data of each environmental image, the detection result of the target in the target scene is determined. Finally, based on the detection result of each target in the target scene and the mapping relationship, the target labeling data in each environmental image is corrected. Detected targets with mapping relationships between environmental images represent the same target in the target scene. In this method, after obtaining the mapping relationship between detected targets, the target labeling data of detected targets with mapping relationships can be corrected synchronously, achieving one-time correction of target labeling data for multiple frames of environmental images. This eliminates the need for manual frame-by-frame label correction, avoids human error, significantly reduces labeling time, and ensures labeling accuracy while improving efficiency.
Attached Figure Description
[0057] Figure 1 is an internal structural diagram of a computer device in one embodiment;
[0058] Figure 2 is a flowchart illustrating a data labeling method in one embodiment;
[0059] Figure 3 is a flowchart illustrating the process of determining the first mapping relationship in one embodiment;
[0060] Figure 4 is a flowchart illustrating the process of determining the first mapping relationship in another embodiment;
[0061] Figure 5 is a flowchart illustrating the process of determining the detection result of a target in a target scene in one embodiment;
[0062] Figure 6 is a flowchart illustrating the process of determining the detection result of a target in a target scene in another embodiment;
[0063] Figure 7 is a flowchart illustrating the process of determining the detection result of a target in a target scene in another embodiment;
[0064] Figure 8 is a flowchart illustrating the process of determining the second mapping relationship in one embodiment;
[0065] Figure 9 is a schematic diagram of the structure of an image acquisition device in one embodiment;
[0066] Figure 10 is a structural block diagram of a data labeling device in one embodiment.
Detailed Implementation
[0067] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0068] In one embodiment, a data labeling method is provided. The following description takes the application of the method to the computer device shown in Figure 1 as an example. This computer device includes a processor, memory, communication interface, display screen, and input devices connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides the environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The communication interface is used for wired or wireless communication with external terminals; wireless communication can be achieved through Wi-Fi, mobile cellular networks, NFC (Near Field Communication), or other technologies. When the computer program is executed by the processor, it implements a data labeling method.
[0069] Those skilled in the art will understand that the structure shown in Figure 1 is merely a block diagram of part of the structure related to the embodiments of this application and does not constitute a limitation on the computer devices to which the embodiments of this application are applied. A specific computer device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.
[0070] In one embodiment, as shown in Figure 2, a data labeling method is provided. Taking its application to the computer device in Figure 1 as an example, the method includes the following steps:
[0071] S210. Acquire target labeling data from multiple frames of environmental images acquired at the same time in the target scene.
[0072] The target labeling data consists of relevant information about the detected targets marked in the environmental image. For example, target labeling data may include target location (which can be represented by bounding box location), target type, confidence level of target location or target type, etc.
[0073] Optionally, the environmental image can be a point cloud image obtained from radar, a two-dimensional image obtained from a camera, or other forms of image, as long as it can be used for target detection. Optionally, the acquired multi-frame environmental images may include images of the same type, such as only point cloud images, or only two-dimensional images, or may include images of different types, such as both point cloud images and two-dimensional images.
[0074] Optionally, the computer device can receive multiple frames of images acquired by the sensor and synchronize them in time to obtain multiple frames of environmental data acquired at the same time. Alternatively, it can directly receive multiple frames of environmental images acquired at the same time after time synchronization. Then, target detection is performed on the multiple frames of environmental images to obtain target label data for each frame. The computer device can input the multiple frames of environmental images into a target detection model, and the model outputs target label data for each frame.
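The text does not say how the frames are synchronized in time. One common approach, shown here purely as an assumption, is to pair each point cloud frame with the camera frame whose timestamp is nearest, provided the gap is within a tolerance. The function names, the timestamp lists, and the 0.05 s tolerance are all hypothetical.

```python
from bisect import bisect_left


def nearest_frame(timestamps, t, tolerance):
    """Return the index of the timestamp closest to t, or None if the
    closest one is further away than tolerance.
    `timestamps` must be sorted in ascending order."""
    i = bisect_left(timestamps, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(timestamps)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(timestamps[j] - t))
    return best if abs(timestamps[best] - t) <= tolerance else None


def synchronize(lidar_ts, camera_ts, tolerance=0.05):
    """Pair each lidar frame index with the nearest camera frame index."""
    pairs = []
    for li, t in enumerate(lidar_ts):
        ci = nearest_frame(camera_ts, t, tolerance)
        if ci is not None:
            pairs.append((li, ci))
    return pairs


if __name__ == "__main__":
    # The third lidar frame has no camera frame within 0.05 s and is dropped.
    print(synchronize([0.00, 0.10, 0.20], [0.01, 0.12, 0.31]))
```

Frames without a partner inside the tolerance are left out, so only frames that can reasonably be treated as acquired at the same time go on to detection.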
[0075] Optionally, when the multi-frame environmental images include point cloud images, the computer device inputs the point cloud images into a point cloud detection model to output target label data of the point cloud images; when the multi-frame environmental images include two-dimensional images, the computer device inputs the two-dimensional images into an image detection model to output target label data of the two-dimensional images.
[0076] S220. Based on the target label data of each environmental image, obtain the mapping relationship between the detected targets in each environmental image.
[0077] In this context, the detected targets that have a mapping relationship between the various environmental images represent the same target in the target scene. In other words, the mapping relationship between the detected targets in the various environmental images indicates the detected targets that represent the same target in the target scene. For example, if the mapping relationship includes a one-to-one mapping between detected target A1 in environmental image FigA and detected target B2 in environmental image FigB, then detected target A1 and detected target B2 represent the same target in the target scene.
[0078] Optionally, the computer device can perform feature matching between target label data of each environmental image to determine the detection targets that have successfully matched features between different environmental images, and then associate the detection targets that have successfully matched features between different environmental images to obtain a mapping relationship. For example, if the detection target A1 in environmental image FigA and the detection target B2 in environmental image FigB have successfully matched features, then the detection target A1 and the detection target B2 are associated to obtain a one-to-one mapping relationship between detection target A1 and detection target B2.
[0079] Optionally, the computer device extracts target label data of the same data dimension from each environmental image and compares them. It then associates the detected targets whose label data in that dimension are identical, or differ by an error within a preset error range, to obtain the mapping relationship.
[0080] S230. Based on the mapping relationship and the target label data of each environmental image, determine the detection result of the target in the target scene.
[0081] Optionally, the computer device acquires target label data from various environmental images, extracts target label data for each detected target with a mapping relationship, and integrates the target label data of each detected target with a mapping relationship to obtain the detection result of the corresponding target in the target scene. For example, as illustrated above, the mapping relationship includes a one-to-one mapping between detected target A1 in environmental image FigA and detected target B2 in environmental image FigB. Detected target A1 and detected target B2 represent the same target in the target scene. The computer device can integrate the target label data of detected target A1 and the target label data of detected target B2 to determine the detection result of the corresponding target in the target scene.
[0082] Optionally, the computer device can determine the detection result of the corresponding target in the target scene based on the detection confidence in the target label data of the detected targets that have a mapping relationship. Specifically, for each data dimension, it can take the target label data with the higher confidence as the detection result of the corresponding target in the target scene. For example, continuing the above example, the target label data of detected target A1 includes: target location P1 with a location confidence of 80%, and target type vehicle with a type confidence of 70%; the target label data of detected target B2 includes: target location P2 with a location confidence of 70%, and target type human with a type confidence of 85%. The confidence of target location P1 (80%) is greater than that of target location P2 (70%), and the confidence of target type vehicle (70%) is less than that of target type human (85%). Therefore, the computer device can determine that the detection result of the target in the target scene corresponding to detected target A1 and detected target B2 is target location P1 and target type human.
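The per-dimension selection can be illustrated with the confidences compared in the example: 80% against 70% for location, and 70% against 85% for type. The sketch below is only an illustration. The dictionary layout and the function name `select_by_confidence` are assumptions, and tie handling is not specified in the text.

```python
def select_by_confidence(labels_a, labels_b):
    """For each shared data dimension, keep the (value, confidence) pair
    with the higher confidence.

    Both arguments map a dimension name to a (value, confidence) tuple.
    Ties keep the entry from labels_a (an assumption; the text is silent).
    """
    result = {}
    for dim in labels_a.keys() & labels_b.keys():
        a, b = labels_a[dim], labels_b[dim]
        result[dim] = a if a[1] >= b[1] else b
    return result


# Detected target A1 in FigA and detected target B2 in FigB.
a1 = {"location": ("P1", 0.80), "type": ("vehicle", 0.70)}
b2 = {"location": ("P2", 0.70), "type": ("human", 0.85)}

fused = select_by_confidence(a1, b2)
print(fused["location"][0], fused["type"][0])
```

The fused result takes the location from A1 and the type from B2, which matches the conclusion of the example.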
[0083] S240. Based on the detection results and mapping relationships of each target in the target scene, correct the target label data in each environmental image.
[0084] Optionally, after determining the detection results of each target in the target scene, the computer device can traverse each detected target in each environmental image in a preset order or a random order, and correct the target label data of the detected targets in the remaining environmental images that have a mapping relationship with the detected target based on the detection results of each target.
[0085] Optionally, after determining the detection results of each target in the target scene, the computer device can receive the target to be corrected from any frame of the multi-frame environmental images selected by the user, receive the user's correction operation for the target to be corrected, and perform corresponding corrections on the target to be corrected. Simultaneously, while correcting the target to be corrected, the target label data of the detected targets that have a mapping relationship with the target to be corrected are also corrected. The target to be corrected can be any detected target in the multi-frame environmental images.
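A rough sketch of how one correction could be propagated to every mapped target is given below. It assumes that each detected target is identified by an (image id, target id) pair and that the mapping relationship is available as a list of such pairs. The helpers `build_groups` and `apply_correction` are hypothetical names, not part of the application.

```python
from collections import defaultdict


def build_groups(mapping_pairs):
    """Group detected targets linked by mapping pairs (small union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in mapping_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return list(groups.values())


def apply_correction(labels, groups, target, field, new_value):
    """Correct `target` and every detected target mapped to it.
    Returns the set of targets that were updated."""
    linked = next((g for g in groups if target in g), {target})
    for t in linked:
        labels[t][field] = new_value
    return linked


if __name__ == "__main__":
    labels = {
        ("FigA", "A1"): {"type": "vehicle"},
        ("FigB", "B2"): {"type": "human"},
        ("FigB", "B3"): {"type": "vehicle"},
    }
    groups = build_groups([(("FigA", "A1"), ("FigB", "B2"))])
    # Correcting A1 also corrects B2; B3 is unmapped and stays unchanged.
    apply_correction(labels, groups, ("FigA", "A1"), "type", "human")
    print(labels)
```

Grouping by connected pairs means a correction still reaches all linked targets when more than two environmental images are involved.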
[0086] Optionally, after obtaining the above-mentioned corrected environmental image, the target label data of the corrected environmental image can be further manually checked and corrected to obtain an environmental image with high-confidence labels. Then, the environmental image with high-confidence labels can be used to train a target detection model with improved recognition accuracy and strong generalization.
[0087] In this embodiment, a computer device acquires target label data from multiple frames of environmental images captured at the same time in a target scene. Based on the target label data of each environmental image, a mapping relationship between detected targets is obtained. Then, based on the mapping relationship and the target label data of each environmental image, the detection result of the target in the target scene is determined. Finally, based on the detection results of each target in the target scene and the mapping relationship, the target label data in each environmental image is corrected. Detected targets with a mapping relationship between environmental images represent the same target in the target scene. In this method, after obtaining the mapping relationship between detected targets, the target label data of detected targets with a mapping relationship can be corrected synchronously, achieving one-time correction of target label data for multiple frames of environmental images. This eliminates the need for manual frame-by-frame label correction, avoids human error, greatly reduces labeling time, and ensures labeling accuracy while improving efficiency.
[0088] The aforementioned multi-frame environmental images include point cloud images and two-dimensional images, and the target labeling data correspondingly includes point cloud target labeling data in the point cloud images and image target labeling data in the two-dimensional images. The aforementioned mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud images and image detection targets in the two-dimensional images. In one embodiment, as shown in Figure 3, S220 above, obtaining the mapping relationship between the detected targets in each environmental image based on the target label data of each environmental image, includes:
[0089] S310. Obtain the first position of each point cloud detection target in the point cloud target labeling data, and the second position of each image detection target in the image target labeling data.
[0090] In this context, point cloud detection targets refer to the detected targets obtained by performing target detection based on point cloud images, while image detection targets refer to the detected targets obtained by performing target detection based on two-dimensional images. The first position is the target position of the point cloud detection target marked in the point cloud labeling data, and the second position is the target position of the image detection target marked in the image labeling data.
[0091] Optionally, the computer device extracts the target position of each point cloud detection target in the obtained point cloud target labeling data as the first position, and obtains the target position of each image detection target in the image target labeling data as the second position.
[0092] S320. Determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
[0093] Optionally, after obtaining the first position of each point cloud detection target and the second position of each image detection target, the computer device can match each first position and each second position to determine the point cloud detection target and image detection target that match the target position, and then associate the point cloud detection target and image detection target that match the target position to obtain the first mapping relationship.
[0094] In practical applications, the first position is a three-dimensional spatial coordinate, and the second position is a two-dimensional planar coordinate. Therefore, as shown in Figure 4, S320 above, determining the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position, includes:
[0095] S410. Convert the first and second positions to the same coordinate system.
[0096] Optionally, the computer device can determine the coordinate transformation relationship between the radar and the camera based on the calibration relationship between them, and transform the first position and the second position to the same coordinate system according to the coordinate transformation relationship. Specifically, the first position can be transformed to the coordinate system used by the second position, or the second position can be transformed to the coordinate system used by the first position.
[0097] Optionally, the computer equipment can also transform the first position to the world coordinate system according to the transformation relationship between the radar coordinate system and the world coordinate system, and at the same time transform the second position to the world coordinate system according to the transformation relationship between the camera coordinate system and the world coordinate system, so as to realize the coordinate system unification of the first position and the second position.
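As one possible way to bring the first position into the coordinate system of the second position, the sketch below applies a rigid radar-to-camera transform and then a pinhole projection. The pinhole model, the function names, and the calibration values are assumptions for illustration only. The text says only that the transformation is derived from the calibration between the radar and the camera.

```python
def transform_point(point, rotation, translation):
    """Apply a rigid transform: rotation (3x3, row-major) then translation."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def project_to_pixel(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates.
    Returns None when the point is not in front of the camera."""
    x, y, z = point_cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    # Hypothetical extrinsics: axes aligned, camera origin offset 2 m along z.
    rotation = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    translation = (0.0, 0.0, 2.0)
    first_position = (1.0, 0.5, 8.0)  # 3D position in the radar frame
    point_cam = transform_point(first_position, rotation, translation)
    # Hypothetical intrinsics: focal lengths and principal point in pixels.
    print(project_to_pixel(point_cam, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0))
```

After this step the projected first position and the second position are both expressed in pixels, so a distance between them can be computed directly.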
[0098] S420. Obtain the first position error between the converted first position and second position.
[0099] The first position error is the distance between the first position and the second position.
[0100] Optionally, after obtaining the converted first and second positions, the computer device calculates the distance between each first position and each second position as the first position error.
[0101] S430. Determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
[0102] Optionally, after obtaining the first position error, the computer device determines whether the point cloud detection target and the image detection target to which that first position error corresponds represent the same target in the target scene. If so, the two are associated to obtain the first mapping relationship; if not, no association is made, and there is no mapping relationship between that point cloud detection target and that image detection target.
[0103] Optionally, the computer device can determine whether the point cloud detection target and the image detection target with the first position error represent the same target in the target scene based on the first position error and a preset distance threshold. Specifically, if the first position error is less than the preset distance threshold, the computer device can determine that the point cloud detection target and the image detection target with the first position error represent the same target in the target scene; conversely, if the first position error is greater than or equal to the preset distance threshold, the computer device can determine that the point cloud detection target and the image detection target with the first position error represent different targets in the target scene.
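The threshold test can be sketched as follows, assuming the positions have already been converted to a common coordinate system and that the first position error is the Euclidean distance. The function name, the target identifiers, and the threshold of 15 are illustrative assumptions.

```python
import math


def match_by_position(first_positions, second_positions, threshold):
    """Return (point cloud target id, image target id) pairs whose first
    position error is strictly below the preset distance threshold.

    Both inputs map target ids to coordinates in the same coordinate system.
    """
    mapping = []
    for pc_id, p in first_positions.items():
        for img_id, q in second_positions.items():
            error = math.dist(p, q)  # first position error
            if error < threshold:
                mapping.append((pc_id, img_id))
    return mapping


if __name__ == "__main__":
    first = {"F1": (740.0, 410.0), "F2": (200.0, 300.0)}
    second = {"f1": (742.0, 408.0), "f2": (205.0, 310.0), "f3": (900.0, 100.0)}
    print(match_by_position(first, second, threshold=15.0))
```

Because every pair is tested independently, a loose threshold can associate one target with several candidates. A practical system might keep only the closest candidate, but the text does not describe that step.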
[0104] In this embodiment, the computer device acquires the first position of each point cloud detection target in the point cloud target labeling data and the second position of each image detection target in the image target labeling data. Then, based on the first and second positions, a first mapping relationship between the point cloud detection targets and the image detection targets is determined. Specifically, the first and second positions are transformed to the same coordinate system, and then the first position error between the transformed first and second positions is obtained. This method enables the determination of the first mapping relationship between the point cloud detection targets and the image detection targets, allowing for the simultaneous correction of both the labeled point cloud image and the 2D image based on this first mapping relationship, saving correction time and improving data labeling efficiency.
[0105] To improve the accuracy of target detection in a target scene, the detection result can be determined by combining point cloud target labeling data and image target labeling data. On this basis, as shown in Figure 5, S230 above, determining the detection result of the target in the target scene based on the mapping relationship and the target label data of each environmental image, includes:
[0106] S510. Determine the point cloud detection target and the image detection target that have the first mapping relationship.
[0107] Here, the point cloud detection target and the image detection target with the first mapping relationship represent the same target in the target scene.
[0108] Specifically, when determining the detection results of targets in a target scene, the computer device determines the point cloud detection targets and image detection targets in the point cloud image and the two-dimensional image that have the aforementioned first mapping relationship, based on the first mapping relationship. For example, the point cloud image includes point cloud detection targets F1, F2, and F3, and the two-dimensional image includes image detection targets f1, f2, and f3. The first mapping relationship includes a one-to-one mapping between point cloud detection target F1 and image detection target f1, point cloud detection target F2 and image detection target f2, and point cloud detection target F3 and image detection target f3. The computer device can then determine that point cloud detection target F1 and image detection target f1 have the first mapping relationship, point cloud detection target F2 and image detection target f2 have the first mapping relationship, and point cloud detection target F3 and image detection target f3 have the first mapping relationship.
[0109] S520. Obtain the first confidence level of the point cloud target label data and the second confidence level of the image target label data of the same data dimension in the point cloud detection target and the image detection target with the first mapping relationship.
[0110] In this context, data dimensions are used to represent data types. For example, it could be a type dimension used to detect the type of a target, or a location dimension used to represent the position of the target.
[0111] Specifically, after identifying the point cloud detection target and the image detection target with a first mapping relationship, the computer device further acquires the first confidence level of the point cloud target label data and the second confidence level of the image target label data for the same data dimension between the point cloud detection target and the image detection target. For example, when both the point cloud target label data and the image target label data for the point cloud detection target F1 and the image detection target f1 include data in the type dimension and the position dimension, the computer device acquires the first confidence level $C_{\text{location-1}}$ of the target position in the point cloud target label data and the second confidence level $C_{\text{location-2}}$ of the target position in the image target label data, and/or the first confidence level $C_{\text{type-1}}$ of the target type in the point cloud target label data and the second confidence level $C_{\text{type-2}}$ of the target type in the image target label data.
[0112] S530. Determine the detection result of the target in the target scene based on the first confidence level and the second confidence level.
[0113] Optionally, the computer device can directly compare the magnitudes of the first confidence level and the second confidence level and determine the detection result of the target in the target scene from the comparison. Specifically, if the first confidence level is greater than the second confidence level, the detection result of the target in the target scene is determined to include the point cloud target label data of the corresponding data dimension for the point cloud detection target; conversely, if the first confidence level is less than or equal to the second confidence level, the detection result of the target in the target scene is determined to include the image target label data of the corresponding data dimension for the image detection target.
[0114] Continuing with the example above, if $C_{\text{location-1}} > C_{\text{location-2}}$, the detection result of the same target in the target scene represented by point cloud detection target F1 and image detection target f1 includes the target location in the point cloud target label data; conversely, if $C_{\text{location-1}} \le C_{\text{location-2}}$, the detection result of that target includes the target location in the image target label data. Similarly, if $C_{\text{type-1}} > C_{\text{type-2}}$, the detection result of the same target in the target scene represented by point cloud detection target F1 and image detection target f1 includes the target type in the point cloud target label data; conversely, if $C_{\text{type-1}} \le C_{\text{type-2}}$, the detection result of that target includes the target type in the image target label data.
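The direct comparison above can be illustrated with a short sketch. The data layout (each label maps a data dimension to a value and a confidence level), the function name `fuse_by_confidence`, and the sample values are assumptions made for illustration; they are not taken from the application.

```python
def fuse_by_confidence(cloud_label, image_label):
    """Per data dimension, keep the value from the side with higher confidence.

    Each label maps a dimension name to a (value, confidence) pair.
    Ties go to the image label, matching the "less than or equal" rule.
    """
    result = {}
    for dimension in cloud_label.keys() & image_label.keys():
        cloud_value, cloud_conf = cloud_label[dimension]
        image_value, image_conf = image_label[dimension]
        result[dimension] = cloud_value if cloud_conf > image_conf else image_value
    return result

# Hypothetical label data for the mapped pair F1 / f1.
cloud_F1 = {"location": ((10.0, 2.0), 0.9), "type": ("car", 0.6)}
image_f1 = {"location": ((10.3, 2.1), 0.7), "type": ("truck", 0.8)}
fused = fuse_by_confidence(cloud_F1, image_f1)
```

With these values the location is taken from the point cloud label and the type from the image label, so a single detection result can mix data from both sources.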
[0115] Optionally, target detection based on either point cloud images or 2D images may produce false positives or false negatives. This means that the point cloud target labeling data may contain point cloud detection targets that have no corresponding image detection target, or vice versa. In both cases, the computer device needs to determine whether a target exists in the target scene, and the detection result of that target, based on the point cloud target labeling data of the point cloud detection targets without a corresponding image detection target, or the image target labeling data of the image detection targets without a corresponding point cloud detection target.
[0116] Optionally, the computer device can determine whether a corresponding target exists in the target scene based on the confidence level of point cloud target label data for point cloud detection targets that do not have corresponding image detection targets, or the confidence level of image target label data for image detection targets that do not have corresponding point cloud detection targets. Specifically, the confidence level of the point cloud target label data / image target label data can be multiplied by an appropriate weight, and the resulting value can be compared with a preset threshold. If the obtained value is greater than the preset threshold, it is determined that a corresponding target exists in the target scene, and the corresponding point cloud target label data / image target label data is obtained as the detection result of that target; conversely, if the obtained value is less than or equal to the preset threshold, it is determined that a corresponding target does not exist in the target scene.
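A minimal sketch of this weighted threshold check for unmatched detection targets follows. The weight, the threshold, the data layout, and the name `resolve_unmatched` are hypothetical; the text only says that an "appropriate weight" and a "preset threshold" are used.

```python
def resolve_unmatched(labels, weight, threshold):
    """Keep unmatched labels whose weighted confidence exceeds the threshold.

    `labels` maps a target id to a (label_data, confidence) pair for detection
    targets that have no counterpart in the other image.
    """
    return {
        target_id: data
        for target_id, (data, confidence) in labels.items()
        if weight * confidence > threshold  # equal to the threshold means "no target"
    }

# Hypothetical unmatched point cloud detection targets.
unmatched_cloud = {
    "F4": ({"type": "pedestrian"}, 0.9),
    "F5": ({"type": "car"}, 0.4),
}
kept = resolve_unmatched(unmatched_cloud, weight=0.8, threshold=0.5)
```

Only F4 is kept as a detection result here; F5 is treated as not corresponding to a real target in the scene.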
[0117] In this embodiment, the computer device can pre-determine point cloud detection targets and image detection targets with a first mapping relationship, and then obtain the first confidence level of point cloud target label data and the second confidence level of image target label data of the same data dimension in the point cloud detection targets and image detection targets with the first mapping relationship. Then, the detection result of the target in the target scene is determined based on the first confidence level and the second confidence level. Here, the point cloud detection targets and image detection targets with the first mapping relationship represent the same target in the target scene. Confidence level accurately reflects the credibility of the target label data. The method described above, which comprehensively determines the detection result of the target in the target scene based on the confidence levels of target label data obtained from different images, can effectively improve the credibility of the determined target detection result in the target scene and improve the accuracy of the detection result.
[0118] The first confidence level is obtained from the point cloud detection model, and the second confidence level is obtained from the image detection model. Different weights therefore need to be configured for the first and second confidence levels in order to comprehensively determine the detection result of the target in the target scene. In one embodiment, as shown in Figure 6, S530 above, determining the detection result of the target in the target scene based on the first confidence level and the second confidence level, includes:
[0119] S610. Determine the first weight of the point cloud target label data and the second weight of the image target label data based on the data dimension.
[0120] The sum of the first weight and the second weight is 1.
[0121] It should be noted that, because point cloud images and 2D images have different inherent characteristics, the reliability of the target label data obtained from each of them varies with the data dimension.
[0122] Optionally, the computer device determines a first weight for the point cloud target label data and a second weight for the image target label data based on the data dimension, and the sum of the first weight and the second weight is 1.
[0123] Generally, point cloud images are more reliable than 2D images for target location; however, 2D images are more reliable than point cloud images for target type. Therefore, when the data dimension is type, the first weight is less than the second weight; when the data dimension is location, the first weight is greater than the second weight.
[0124] S620. Determine the detection result of the target in the target scene based on the first weight, the first confidence level, the second weight, and the second confidence level.
[0125] Optionally, the computer device weights the first confidence level and the second confidence level according to the first weight and the second weight respectively, and compares the weighted results to determine the detection result of the target in the target scene. The detection result of the target in the target scene includes the target label data from the side with the larger weighted result.
[0126] In an alternative embodiment, as shown in Figure 7, S620 above, determining the detection result of the target in the target scene based on the first weight, the first confidence level, the second weight, and the second confidence level, includes:
[0127] S710. Obtain the first product between the first weight and the first confidence level, and the second product between the second weight and the second confidence level.
[0128] Specifically, the computer device weights the first confidence level and the second confidence level according to the first weight and the second weight, respectively. That is, it calculates the product of the first weight and the first confidence level to obtain the first product, and calculates the product of the second weight and the second confidence level to obtain the second product. For example, continuing the above example, for point cloud detection target F1 and image detection target f1, in the data dimension of target location, the first weight $\gamma_1$ is greater than the second weight $\gamma_2$, the first product is $F_1 = \gamma_1 \cdot C_{\text{location-1}}$, and the second product is $F_2 = \gamma_2 \cdot C_{\text{location-2}}$. In the data dimension of target type, the first weight $\lambda_1$ is less than the second weight $\lambda_2$, the first product is $F_1' = \lambda_1 \cdot C_{\text{type-1}}$, and the second product is $F_2' = \lambda_2 \cdot C_{\text{type-2}}$.
[0129] S720. Compare the size of the first product and the second product.
[0130] Specifically, the computer device further compares the magnitudes of the obtained first product and second product, and then determines the detection result of the target in the target scene based on the comparison result.
[0131] S730. If the first product is greater than the second product, the detection result of the corresponding target in the target scene includes point cloud target label data.
[0132] S740. If the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes the image target label data.
[0133] Specifically, if the first product is greater than the second product, the computer device can determine that the detection result of the corresponding target in the target scene includes point cloud target label data; conversely, if the first product is less than or equal to the second product, the computer device can determine that the detection result of the corresponding target in the target scene includes image target label data. Continuing the example above, if $F_1 > F_2$, the detection result of the same target in the target scene represented by point cloud detection target F1 and image detection target f1 includes the target position in the point cloud target label data; if $F_1 \le F_2$, the detection result of that target includes the target position in the image target label data. Similarly, if $F_1' > F_2'$, the detection result of the same target in the target scene represented by point cloud detection target F1 and image detection target f1 includes the target type in the point cloud target label data; conversely, if $F_1' \le F_2'$, the detection result of that target includes the target type in the image target label data.
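Steps S710 to S740 can be sketched as below. The concrete weight values are hypothetical; they are chosen only so that each pair sums to 1, the point cloud weight is larger for location, and the image weight is larger for type, as the embodiment requires. The data layout and the name `fuse_weighted` are likewise assumptions.

```python
# Hypothetical (first weight, second weight) per data dimension; each pair sums to 1.
WEIGHTS = {
    "location": (0.7, 0.3),  # point cloud assumed more reliable for position
    "type": (0.3, 0.7),      # 2D image assumed more reliable for type
}

def fuse_weighted(cloud_label, image_label, weights=WEIGHTS):
    """Choose label data per dimension by comparing weighted confidence levels."""
    result = {}
    for dimension, (w_cloud, w_image) in weights.items():
        if dimension not in cloud_label or dimension not in image_label:
            continue
        cloud_value, cloud_conf = cloud_label[dimension]
        image_value, image_conf = image_label[dimension]
        first_product = w_cloud * cloud_conf    # S710
        second_product = w_image * image_conf   # S710
        # S720 to S740: the larger product wins; ties go to the image label data.
        result[dimension] = cloud_value if first_product > second_product else image_value
    return result

# Hypothetical label data: the raw confidence levels alone would pick the opposite side.
cloud_F1 = {"location": ((10.0, 2.0), 0.6), "type": ("car", 0.9)}
image_f1 = {"location": ((10.3, 2.1), 0.8), "type": ("truck", 0.5)}
weighted = fuse_weighted(cloud_F1, image_f1)
```

In this example the weights reverse the outcome of a plain confidence comparison in both dimensions, which is the effect the per-dimension weighting is meant to have.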
[0134] In this embodiment, the computer device determines a first weight for the point cloud target label data and a second weight for the image target label data based on the data dimension. The detection result of the target in the target scene is then determined based on the first weight, first confidence level, second weight, and second confidence level. The sum of the first weight and the second weight is 1. By adding weights as described above, the reliability of target label data from different images across different data dimensions can be considered, thus determining the target detection result in the target scene with higher reliability, thereby improving the accuracy of subsequent data correction and labeling.
[0135] In the case where the multi-frame environmental image includes multiple two-dimensional images with different coverage areas, the above mapping relationship also includes a second mapping relationship between the image detection targets in each two-dimensional image. Therefore, as shown in Figure 8, the above method also includes:
[0136] S810. Obtain the target position in the image target label data of each frame of two-dimensional image.
[0137] Optionally, the image acquisition device for acquiring the multi-frame environmental images can be a fusion kit integrating radar and cameras; for details, refer to Figure 9. The multiple frames of two-dimensional images can be obtained from the multiple cameras in the image acquisition device.
[0138] Optionally, the computer device can receive point cloud images and two-dimensional images acquired by the image acquisition device, split them into frames, and synchronize and align them in time to obtain multi-frame environmental images including multiple frames of two-dimensional images acquired at the same time. Then, an image detection model is used to perform target detection on the multi-frame two-dimensional images to obtain image target label data of the image-detected target in each frame of the two-dimensional image, and extract the target location from it.
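The time synchronization and alignment step is not detailed in the text. One plausible approach, shown purely as an assumption, pairs each point cloud frame with the image frame nearest in time and drops pairs that are too far apart. The function name `align_frames`, the millisecond timestamps, and the tolerance are hypothetical.

```python
def align_frames(cloud_stamps, image_stamps, tolerance):
    """Pair each point cloud timestamp with the nearest image timestamp.

    Timestamps are integers in milliseconds. Pairs whose time difference
    exceeds `tolerance` are dropped instead of being treated as simultaneous.
    """
    if not image_stamps:
        return []
    pairs = []
    for t_cloud in cloud_stamps:
        t_image = min(image_stamps, key=lambda t: abs(t - t_cloud))
        if abs(t_image - t_cloud) <= tolerance:
            pairs.append((t_cloud, t_image))
    return pairs

# Hypothetical capture times in milliseconds.
aligned = align_frames([0, 100, 200], [10, 120, 350], tolerance=50)
```

The point cloud frame at 200 ms has no image frame within the tolerance, so it is left out of the aligned multi-frame set.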
[0139] S820. Determine the second mapping relationship between image detection targets in multiple frames of two-dimensional images based on the target positions.
[0140] Here, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0141] It should be noted that, due to the different orientations of the multiple cameras in the image acquisition device, the coverage areas of the multiple frames of 2D images acquired at the same acquisition time will also be different, but there may be overlapping areas. The image detection target that matches the image target label data in the overlapping area represents the same target in the target scene. Therefore, the above mapping relationship may also include a second mapping relationship between the image detection targets of multiple frames of 2D images.
[0142] Optionally, the computer device can acquire the target position in the image target label data of each frame of two-dimensional image, calculate the second position error between each pair, and determine the second mapping relationship between the image detection targets based on the second position error.
[0143] Optionally, after obtaining the second position error, the computer device determines whether the image detection targets for which the second position error was obtained represent the same target in the target scene. If so, those image detection targets are associated to obtain the second mapping relationship; if not, no association is made, and no mapping relationship exists between those image detection targets.
[0144] Optionally, the computer device can determine whether the image detection target with the second position error represents the same target in the target scene based on the second position error and a preset distance threshold. Specifically, if the second position error is less than the preset distance threshold, the computer device can determine that the image detection target with the second position error represents the same target in the target scene; conversely, if the second position error is greater than or equal to the preset distance threshold, the computer device can determine that the image detection target with the second position error does not represent the same target in the target scene.
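The pairwise matching between two-dimensional images can be sketched as follows. As before, this assumes that target positions from all cameras are already in one shared coordinate system and that the position error is a Euclidean distance; neither detail is specified in the text. The camera ids, target ids, and the name `match_across_images` are hypothetical.

```python
import math
from itertools import combinations

def match_across_images(positions_by_image, threshold):
    """Pair image detection targets from different 2D images by position error.

    `positions_by_image` maps an image id to {target id: (x, y)}.
    Returns a list of ((image_a, target_a), (image_b, target_b)) pairs.
    """
    pairs = []
    for (img_a, targets_a), (img_b, targets_b) in combinations(positions_by_image.items(), 2):
        for id_a, pos_a in targets_a.items():
            for id_b, pos_b in targets_b.items():
                if math.dist(pos_a, pos_b) < threshold:  # second position error
                    pairs.append(((img_a, id_a), (img_b, id_b)))
    return pairs

# Hypothetical targets seen by two cameras with overlapping coverage.
frames = {
    "cam_front": {"f1": (5.0, 0.0), "f2": (8.0, 4.0)},
    "cam_left": {"g1": (8.2, 4.1), "g2": (1.0, 9.0)},
}
second_mapping = match_across_images(frames, threshold=0.5)
```

Only f2 and g1 fall within the threshold, which corresponds to a target located in the overlapping area of the two images.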
[0145] Optionally, after obtaining the second mapping relationship, the computer device can simultaneously correct the point cloud target label data of the point cloud image and the image target label data of the two-dimensional image based on the detection results of each target in the target scene and by using the first mapping relationship and the second mapping relationship.
[0146] Optionally, for image detection targets that have the second mapping relationship across multiple frames of two-dimensional images, the computer device can determine the image target labeling data directly from the confidence levels of the image target labeling data in those two-dimensional images. That is, the image target labeling data with the higher confidence level is taken as the image target labeling data of the image detection targets with the second mapping relationship.
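Selecting the label data with the higher confidence level can be written in a few lines. The pair layout and the name `pick_most_confident` are assumptions for illustration.

```python
def pick_most_confident(labels):
    """Return the label data with the highest confidence level.

    `labels` is a non-empty list of (label_data, confidence) pairs, one per
    2D image in which the same target was detected.
    """
    data, _ = max(labels, key=lambda pair: pair[1])
    return data

# Hypothetical labels for one target seen by two cameras.
shared = pick_most_confident([
    ({"type": "car", "location": (8.0, 4.0)}, 0.72),
    ({"type": "van", "location": (8.2, 4.1)}, 0.88),
])
```

The selected label data can then be applied to every image detection target linked by the second mapping relationship.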
[0147] In this embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the above mapping relationship also includes a second mapping relationship between the image detection targets of each two-dimensional image. Specifically, the target position in the image target labeling data of each frame of two-dimensional image can be obtained, and then the second mapping relationship between the image detection targets of multiple frames of two-dimensional images can be determined based on the target position. The image detection targets with the second mapping relationship represent the same target in the target scene. This method can determine the second mapping relationship between image detection targets, so that the two-dimensional images can be corrected simultaneously based on this second mapping relationship. This saves correction labeling time, improves data labeling efficiency, reduces the probability of missed labels, and improves the comprehensiveness of data labeling.
[0148] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts of the embodiments described above may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. Nor are they necessarily executed one after another; they can be executed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
[0149] In one embodiment, as shown in Figure 10, a data tagging device is provided, including: a target labeling module 1001, a mapping association module 1002, a result determination module 1003, and a correction labeling module 1004, wherein:
[0150] The target labeling module 1001 is used to acquire target labeling data of multiple frames of environmental images at the same acquisition time in the target scene;
[0151] The mapping and association module 1002 is used to obtain the mapping relationship between detected targets in each environmental image based on the target label data of each environmental image; wherein, the detected targets with mapping relationship between each environmental image represent the same target in the target scene;
[0152] The result determination module 1003 is used to determine the detection result of the target in the target scene based on the mapping relationship and the target label data of each environmental image;
[0153] The correction labeling module 1004 is used to correct the target labeling data in each environmental image based on the detection results and mapping relationship of each target in the target scene.
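The data flow between the four modules can be pictured with a small skeleton. This is only a structural sketch under the assumption that each module is supplied as a callable; the class name `DataLabelingDevice`, the argument names, and the stub functions are hypothetical and do not reflect how the modules are actually implemented.

```python
class DataLabelingDevice:
    """Chains four callables that stand in for modules 1001 to 1004."""

    def __init__(self, target_labeling, mapping_association,
                 result_determination, correction_labeling):
        self.target_labeling = target_labeling            # module 1001
        self.mapping_association = mapping_association    # module 1002
        self.result_determination = result_determination  # module 1003
        self.correction_labeling = correction_labeling    # module 1004

    def run(self, scene):
        labels = self.target_labeling(scene)                  # acquire target label data
        mapping = self.mapping_association(labels)            # map detected targets
        results = self.result_determination(mapping, labels)  # decide detection results
        return self.correction_labeling(results, mapping, labels)  # correct the labels

# Stub modules with hypothetical data, used only to show the call order.
device = DataLabelingDevice(
    target_labeling=lambda scene: {"cloud": {"F1": "car"}, "image": {"f1": "truck"}},
    mapping_association=lambda labels: {"F1": "f1"},
    result_determination=lambda mapping, labels: {"F1": "truck"},
    correction_labeling=lambda results, mapping, labels: {
        "cloud": dict(results),
        "image": {mapping[k]: v for k, v in results.items()},
    },
)
corrected = device.run(scene=None)
```

Because the correction step receives both the detection results and the mapping relationship, one decision per target updates the labels in every environmental image at once.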
[0154] In one embodiment, the multi-frame environment image includes a point cloud image and a two-dimensional image, and the target labeling data includes point cloud target labeling data in the point cloud image and image target labeling data in the two-dimensional image; the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image.
[0155] The mapping and association module 1002 is specifically used for:
[0156] Obtain the first position of each point cloud detection target in the point cloud target labeling data and the second position of each image detection target in the image target labeling data; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
[0157] In one embodiment, the mapping association module 1002 is specifically used for:
[0158] Transform the first position and the second position to the same coordinate system; obtain the first position error between the transformed first position and the second position; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
[0159] In one embodiment, the result determination module 1003 is specifically used for:
[0160] Identify point cloud detection targets and image detection targets with a first mapping relationship; wherein, the point cloud detection targets and image detection targets with the first mapping relationship represent the same target in the target scene; obtain the first confidence score of the point cloud target label data and the second confidence score of the image target label data of the same data dimension in the point cloud detection targets and image detection targets with the first mapping relationship; determine the detection result of the target in the target scene based on the first confidence score and the second confidence score.
[0161] In one embodiment, the result determination module 1003 is specifically used for:
[0162] The first weight of the point cloud target label data and the second weight of the image target label data are determined according to the data dimension; wherein the sum of the first weight and the second weight is 1; the detection result of the target in the target scene is determined according to the first weight, the first confidence level, the second weight and the second confidence level.
[0163] In one embodiment, the result determination module 1003 is specifically used for:
[0164] When the data dimension is a type dimension, the first weight is less than the second weight; when the data dimension is a position dimension, the first weight is greater than the second weight.
[0165] In one embodiment, the result determination module 1003 is specifically used for:
[0166] Obtain the first product between the first weight and the first confidence level, and the second product between the second weight and the second confidence level; compare the size of the first product and the second product; if the first product is greater than the second product, the detection result of the corresponding target in the target scene includes point cloud target label data; if the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes image target label data.
[0167] In one embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the mapping relationship also includes a second mapping relationship between the image detection targets of each two-dimensional image; the mapping association module 1002 is further configured to:
[0168] Obtain the target position in the image target label data of each frame of two-dimensional image; determine the second mapping relationship of image detection targets between multiple frames of two-dimensional images based on each target position; wherein, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0169] Each module in the aforementioned data tagging device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.
[0170] In one embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to perform the following steps:
[0171] Acquire target label data from multiple frames of environmental images acquired at the same time in the target scene; based on the target label data of each environmental image, obtain the mapping relationship of the detected targets between the environmental images, wherein the detected targets with a mapping relationship between the environmental images represent the same target in the target scene; based on the mapping relationship and the target label data of each environmental image, determine the detection result of the target in the target scene; based on the detection result of each target in the target scene and the mapping relationship, correct the target label data in each environmental image.
[0172] In one embodiment, the multi-frame environment image includes a point cloud image and a two-dimensional image; the target labeling data includes point cloud target labeling data in the point cloud image and image target labeling data in the two-dimensional image; the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image; when the processor executes the computer program, it also implements the following steps:
[0173] Obtain the first position of each point cloud detection target in the point cloud target labeling data and the second position of each image detection target in the image target labeling data; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
[0174] In one embodiment, the processor further performs the following steps when executing the computer program:
[0175] Transform the first position and the second position to the same coordinate system; obtain the first position error between the transformed first position and the second position; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
[0176] In one embodiment, the processor further performs the following steps when executing the computer program:
[0177] Identify point cloud detection targets and image detection targets with a first mapping relationship; wherein, the point cloud detection targets and image detection targets with the first mapping relationship represent the same target in the target scene; obtain the first confidence score of the point cloud target label data and the second confidence score of the image target label data of the same data dimension in the point cloud detection targets and image detection targets with the first mapping relationship; determine the detection result of the target in the target scene based on the first confidence score and the second confidence score.
[0178] In one embodiment, the processor further performs the following steps when executing the computer program:
[0179] The first weight of the point cloud target label data and the second weight of the image target label data are determined according to the data dimension; wherein the sum of the first weight and the second weight is 1; the detection result of the target in the target scene is determined according to the first weight, the first confidence level, the second weight and the second confidence level.
[0180] In one embodiment, the processor further performs the following steps when executing the computer program:
[0181] When the data dimension is a type dimension, the first weight is less than the second weight; when the data dimension is a position dimension, the first weight is greater than the second weight.
[0182] In one embodiment, the processor further performs the following steps when executing the computer program:
[0183] Obtain the first product between the first weight and the first confidence level, and the second product between the second weight and the second confidence level; compare the size of the first product and the second product; if the first product is greater than the second product, the detection result of the corresponding target in the target scene includes point cloud target label data; if the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes image target label data.
[0184] In one embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the mapping relationship also includes a second mapping relationship between the image detection targets of each two-dimensional image; the processor also performs the following steps when executing the computer program:
[0185] Obtain the target position in the image target label data of each frame of two-dimensional image; determine the second mapping relationship of image detection targets between multiple frames of two-dimensional images based on each target position; wherein, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0186] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program performing the following steps when executed by a processor:
[0187] Acquire target label data from multiple frames of environmental images acquired at the same time in the target scene; obtain the mapping relationship between detected targets in each environmental image based on the target label data of each environmental image; wherein, detected targets with mapping relationships between environmental images represent the same target in the target scene; determine the detection results of targets in the target scene based on the mapping relationship and the target label data of each environmental image; and correct the target label data in each environmental image based on the detection results and mapping relationship of each target in the target scene.
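The final correction step can be sketched as follows. This is a simplified illustration, not the implementation of the application: the dictionary layouts and the function name `correct_labels` are hypothetical, and only the type label is propagated. Propagating a fused position would additionally require converting it back into each image's own coordinate system, which is omitted here.

```python
def correct_labels(labels, mapping, results):
    """Write each target's detection result back to every mapped detection.

    labels:  {image_id: {detection_id: {field: value}}}
    mapping: {target_id: [(image_id, detection_id), ...]}
    results: {target_id: {field: value}}  (fields to overwrite)
    Returns a corrected copy; the input labels are left unchanged.
    """
    corrected = {
        image_id: {det_id: dict(fields) for det_id, fields in detections.items()}
        for image_id, detections in labels.items()
    }
    for target_id, members in mapping.items():
        for image_id, detection_id in members:
            corrected[image_id][detection_id].update(results[target_id])
    return corrected


labels = {
    "lidar": {"p0": {"type": "truck"}},
    "camera": {"i3": {"type": "car"}},
}
mapping = {"target_1": [("lidar", "p0"), ("camera", "i3")]}
results = {"target_1": {"type": "car"}}
corrected = correct_labels(labels, mapping, results)
```

Because both detections map to the same target, a single detection result corrects the labels in both environmental images at once.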
[0188] In one embodiment, the multiple frames of environmental images include a point cloud image and a two-dimensional image; the target label data includes point cloud target label data in the point cloud image and image target label data in the two-dimensional image; the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image; when the computer program is executed by the processor, it further implements the following steps:
[0189] Obtain the first position of each point cloud detection target in the point cloud target label data and the second position of each image detection target in the image target label data; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
[0190] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0191] Transform the first position and the second position to the same coordinate system; obtain the first position error between the transformed first position and the second position; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
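One possible reading of the coordinate transformation and first position error is sketched below in Python. The application does not specify which common coordinate system is used; this sketch assumes the point cloud position is projected into the pixel coordinates of the two-dimensional image with a 3x4 projection matrix. The matrix values and the function names `project_to_pixel` and `first_position_error` are hypothetical.

```python
from math import hypot


def project_to_pixel(point_xyz, projection):
    """Project a 3D point into pixel coordinates with a 3x4 matrix.

    Returns None when the point lies behind the camera.
    """
    homogeneous = (point_xyz[0], point_xyz[1], point_xyz[2], 1.0)
    u, v, w = (
        sum(row[i] * homogeneous[i] for i in range(4)) for row in projection
    )
    if w <= 0:
        return None
    return (u / w, v / w)


def first_position_error(point_xyz, pixel_uv, projection):
    """Pixel distance between the projected first position and the second position."""
    projected = project_to_pixel(point_xyz, projection)
    if projected is None:
        return float("inf")
    return hypot(projected[0] - pixel_uv[0], projected[1] - pixel_uv[1])


# Hypothetical pinhole projection: focal length 1000 px, principal point (640, 360).
P = [
    [1000.0, 0.0, 640.0, 0.0],
    [0.0, 1000.0, 360.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
error = first_position_error((1.0, 0.5, 10.0), (743.0, 414.0), P)
```

Here the point cloud position projects to pixel (740, 410), so the first position error relative to the image detection at (743, 414) is 5 pixels.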
[0192] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0193] Identify point cloud detection targets and image detection targets with a first mapping relationship; wherein, the point cloud detection targets and image detection targets with the first mapping relationship represent the same target in the target scene; obtain the first confidence level of the point cloud target label data and the second confidence level of the image target label data of the same data dimension in the point cloud detection targets and image detection targets with the first mapping relationship; determine the detection result of the target in the target scene based on the first confidence level and the second confidence level.
[0194] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0195] The first weight of the point cloud target label data and the second weight of the image target label data are determined according to the data dimension; wherein the sum of the first weight and the second weight is 1; the detection result of the target in the target scene is determined according to the first weight, the first confidence level, the second weight and the second confidence level.
[0196] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0197] When the data dimension is a type dimension, the first weight is less than the second weight; when the data dimension is a position dimension, the first weight is greater than the second weight.
[0198] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0199] Obtain a first product of the first weight and the first confidence level, and a second product of the second weight and the second confidence level; compare the first product with the second product; if the first product is greater than the second product, the detection result of the corresponding target in the target scene includes the point cloud target label data; if the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes the image target label data.
[0200] In one embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the mapping relationship also includes a second mapping relationship between the image detection targets in each two-dimensional image; when the computer program is executed by the processor, it further implements the following steps:
[0201] Obtain the target position in the image target label data of each frame of two-dimensional image; determine the second mapping relationship of image detection targets between multiple frames of two-dimensional images based on each target position; wherein, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0202] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, performs the following steps:
[0203] Acquire target label data from multiple frames of environmental images acquired at the same time in the target scene; obtain the mapping relationship between detected targets in each environmental image based on the target label data of each environmental image; wherein, detected targets with mapping relationships between environmental images represent the same target in the target scene; determine the detection results of targets in the target scene based on the mapping relationship and the target label data of each environmental image; and correct the target label data in each environmental image based on the detection results and mapping relationship of each target in the target scene.
[0204] In one embodiment, the multiple frames of environmental images include a point cloud image and a two-dimensional image; the target label data includes point cloud target label data in the point cloud image and image target label data in the two-dimensional image; the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image; when the computer program is executed by the processor, it further implements the following steps:
[0205] Obtain the first position of each point cloud detection target in the point cloud target label data and the second position of each image detection target in the image target label data; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
[0206] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0207] Transform the first position and the second position to the same coordinate system; obtain the first position error between the transformed first position and the second position; determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
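Determining the first mapping relationship from the first position errors can be sketched as a thresholded association, following the preset distance threshold of claim 3. The one-to-one greedy matching used here (smallest error first) is an assumption added for illustration; the application only states that pairs whose error is below the threshold are associated. The function name `first_mapping` and the sample values are hypothetical.

```python
def first_mapping(errors, threshold):
    """Associate point cloud detections with image detections.

    errors: {(point_cloud_id, image_id): first position error}
    Returns {point_cloud_id: image_id} for pairs below the threshold,
    taking the smallest errors first and using each detection at most once
    (the one-to-one constraint is an assumption of this sketch).
    """
    mapping = {}
    used_images = set()
    for (pc_id, img_id), err in sorted(errors.items(), key=lambda item: item[1]):
        if err >= threshold:
            break  # remaining pairs are at least this far apart
        if pc_id in mapping or img_id in used_images:
            continue
        mapping[pc_id] = img_id
        used_images.add(img_id)
    return mapping


errors = {
    ("p0", "i0"): 4.0,
    ("p0", "i1"): 30.0,
    ("p1", "i0"): 6.0,
    ("p1", "i1"): 9.0,
    ("p2", "i1"): 50.0,
}
mapping = first_mapping(errors, threshold=10.0)
```

Detection `p2` has no image detection within the threshold, so it remains unmapped and its labels would not be fused with any image label.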
[0208] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0209] Identify point cloud detection targets and image detection targets with a first mapping relationship; wherein, the point cloud detection targets and image detection targets with the first mapping relationship represent the same target in the target scene; obtain the first confidence level of the point cloud target label data and the second confidence level of the image target label data of the same data dimension in the point cloud detection targets and image detection targets with the first mapping relationship; determine the detection result of the target in the target scene based on the first confidence level and the second confidence level.
[0210] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0211] The first weight of the point cloud target label data and the second weight of the image target label data are determined according to the data dimension; wherein the sum of the first weight and the second weight is 1; the detection result of the target in the target scene is determined according to the first weight, the first confidence level, the second weight and the second confidence level.
[0212] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0213] When the data dimension is a type dimension, the first weight is less than the second weight; when the data dimension is a position dimension, the first weight is greater than the second weight.
[0214] In one embodiment, when the computer program is executed by the processor, it further performs the following steps:
[0215] Obtain a first product of the first weight and the first confidence level, and a second product of the second weight and the second confidence level; compare the first product with the second product; if the first product is greater than the second product, the detection result of the corresponding target in the target scene includes the point cloud target label data; if the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes the image target label data.
[0216] In one embodiment, when the multi-frame environmental image includes multiple frames of two-dimensional images with different coverage areas, the mapping relationship also includes a second mapping relationship between the image detection targets in each two-dimensional image; when the computer program is executed by the processor, it further implements the following steps:
[0217] Obtain the target position in the image target label data of each frame of two-dimensional image; determine the second mapping relationship of image detection targets between multiple frames of two-dimensional images based on each target position; wherein, the image detection targets with the second mapping relationship represent the same target in the target scene.
[0218] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments described above. Any reference to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory can include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases involved in the embodiments provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be, but are not limited to, general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, and quantum computing-based data processing logic devices.
[0219] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0220] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A data labeling method, characterized in that the method includes: acquiring target label data of multiple frames of environmental images at the same acquisition time in a target scene, wherein the multiple frames of environmental images include a point cloud image and a two-dimensional image, and the target label data includes point cloud target label data in the point cloud image and image target label data in the two-dimensional image; obtaining, based on the target label data of each environmental image, a mapping relationship between the detection targets in the environmental images, wherein detection targets having the mapping relationship between the environmental images represent the same target in the target scene, and the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image; identifying point cloud detection targets and image detection targets that have the first mapping relationship, wherein the point cloud detection targets and image detection targets that have the first mapping relationship represent the same target in the target scene; obtaining a first confidence level of the point cloud target label data and a second confidence level of the image target label data of the same data dimension in the point cloud detection targets and image detection targets that have the first mapping relationship; determining a first weight of the point cloud target label data and a second weight of the image target label data according to the data dimension, wherein the sum of the first weight and the second weight is 1, the first weight is less than the second weight when the data dimension is a type dimension, and the first weight is greater than the second weight when the data dimension is a position dimension; determining the detection result of the target in the target scene based on the first weight, the first confidence level, the second weight, and the second confidence level; and correcting the target label data in each of the environmental images based on the detection result of each target in the target scene and the mapping relationship; wherein obtaining the mapping relationship between the detection targets in the environmental images based on the target label data of each environmental image includes: obtaining a first position of each point cloud detection target in the point cloud target label data and a second position of each image detection target in the image target label data; and determining the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
2. The method according to claim 1, characterized in that determining the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position includes: transforming the first position and the second position into the same coordinate system; obtaining a first position error between the transformed first position and the transformed second position; and determining the first mapping relationship between the point cloud detection target and the image detection target based on the first position error.
3. The method according to claim 2, characterized in that determining the first mapping relationship between the point cloud detection target and the image detection target based on the first position error includes: comparing the first position error with a preset distance threshold; and when the first position error is less than the preset distance threshold, associating the point cloud detection target and the image detection target that have the first position error to obtain the first mapping relationship.
4. The method according to claim 1, characterized in that determining the detection result of the target in the target scene based on the first weight, the first confidence level, the second weight, and the second confidence level includes: obtaining a first product of the first weight and the first confidence level, and a second product of the second weight and the second confidence level; comparing the first product with the second product; if the first product is greater than the second product, the detection result of the corresponding target in the target scene includes the point cloud target label data; and if the first product is less than or equal to the second product, the detection result of the corresponding target in the target scene includes the image target label data.
5. The method according to any one of claims 1-4, characterized in that, when the multiple frames of environmental images include multiple frames of two-dimensional images with different coverage areas, the mapping relationship further includes a second mapping relationship between the image detection targets in the two-dimensional images; the method further includes: obtaining the target position in the image target label data of each frame of two-dimensional image; and determining the second mapping relationship between the image detection targets in the multiple frames of two-dimensional images based on the target positions, wherein image detection targets having the second mapping relationship represent the same target in the target scene.
6. The method according to claim 5, characterized in that determining the second mapping relationship between the image detection targets in the multiple frames of two-dimensional images based on the target positions includes: calculating a second position error between each pair of target positions; comparing each second position error with a preset distance threshold; and if the second position error is less than the preset distance threshold, associating the image detection targets that have the second position error to obtain the second mapping relationship.
7. A data labeling device, characterized in that the device includes: a target labeling module, used to acquire target label data of multiple frames of environmental images at the same acquisition time in a target scene, wherein the multiple frames of environmental images include a point cloud image and a two-dimensional image, and the target label data includes point cloud target label data in the point cloud image and image target label data in the two-dimensional image; a mapping and association module, used to obtain, based on the target label data of each environmental image, a mapping relationship between the detection targets in the environmental images, wherein detection targets having the mapping relationship between the environmental images represent the same target in the target scene, and the mapping relationship includes a first mapping relationship between point cloud detection targets in the point cloud image and image detection targets in the two-dimensional image; a result determination module, used to identify point cloud detection targets and image detection targets that have the first mapping relationship, wherein the point cloud detection targets and image detection targets that have the first mapping relationship represent the same target in the target scene; to obtain a first confidence level of the point cloud target label data and a second confidence level of the image target label data of the same data dimension in the point cloud detection targets and image detection targets that have the first mapping relationship; to determine a first weight of the point cloud target label data and a second weight of the image target label data according to the data dimension, wherein the sum of the first weight and the second weight is 1, the first weight is less than the second weight when the data dimension is a type dimension, and the first weight is greater than the second weight when the data dimension is a position dimension; and to determine the detection result of the target in the target scene according to the first weight, the first confidence level, the second weight, and the second confidence level; and a correction labeling module, used to correct the target label data in each of the environmental images based on the detection result of each target in the target scene and the mapping relationship; wherein the mapping and association module is further configured to: obtain a first position of each point cloud detection target in the point cloud target label data and a second position of each image detection target in the image target label data; and determine the first mapping relationship between the point cloud detection target and the image detection target based on the first position and the second position.
8. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 6.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.