Intelligent driving sensor data processing method and device

By filtering out invalid data and extracting synchronization frames from sensor data of autonomous vehicles, the problem of invalid data in sensor data processing is solved, enabling efficient and accurate data quality assessment, ensuring the quality of data used later, and improving data processing efficiency and utilization.

CN122266176A · Pending · Publication Date: 2026-06-23 · SHANGHAI ZAOFU INTELLIGENT TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHANGHAI ZAOFU INTELLIGENT TECHNOLOGY CO LTD
Filing Date
2026-03-20
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

The sensor data generated by autonomous vehicles during road testing suffers from several problems, including a high proportion of invalid data, incomplete statistics on multi-sensor synchronization frames, a lack of scientific methods for calculating the qualified frame ratio, unreasonable quality threshold settings, and low efficiency of manual quality inspection. These issues lead to a waste of computing resources and low data utilization.

Method used

By acquiring vehicle sensor configurations and business scenario requirements, the raw sensor data is filtered for invalidity and segmented. Synchronization frames are extracted based on the forward LiDAR timestamp using a differentiated time window, and qualification is determined according to sensor configuration to generate a quality assessment report, ensuring the completeness and synchronization of the data.

Benefits of technology

It achieves efficient filtering of invalid data, accurate statistical analysis of the completeness and synchronization of sensor data, quantitative evaluation of data quality, ensures the quality of data used later, and improves data processing efficiency and utilization.

✦ Generated by Eureka AI based on patent content.

[Figure: CN122266176A_ABST]

Abstract

The purpose of the present application is to provide an intelligent driving sensor data processing method and device, which effectively pre-selects the original sensor data collected by the autonomous vehicle, accurately counts the completeness and synchronization of the original sensor data, and evaluates the data quality. Not only does it filter out obvious invalid data and quantify the collaborative quality of each sensor data, but it also quantitatively evaluates the overall quality level of the original sensor data, ensuring that the original sensor data used in specific practical application scenarios in the later stage are high-quality data.

Description

Technical Field

[0001] This application relates to the field of computer technology, and in particular to a method and device for processing sensor data in intelligent driving.

Background Technology

[0002] In the prior art, autonomous vehicles generate massive amounts of sensor data during road testing, including images from multiple cameras (e.g., 7 to 11), point clouds from multiple LiDARs (e.g., 4 to 8), GPS/IMU positioning data, and CAN bus vehicle status data. This data is typically stored in ROS 2 bag format. Current quality control of autonomous driving data faces several prominent problems.
First, there is no systematic data pre-selection mechanism. The raw data collection contains a large number of invalid bag files with missing topics (missing camera, missing LiDAR, etc.). Traditional methods parse every bag file directly, which is inefficient; with an invalid data ratio of 20-30%, computing resources are severely wasted.
Second, the methods for counting multi-sensor synchronization frames are imperfect. Autonomous vehicles carry multiple sensors (e.g., 7 to 11 cameras, 4 to 8 LiDARs, positioning, etc.), each with a different acquisition frequency and inconsistent timestamps. Without a unified reference timeline for aligning multi-sensor data, the completeness of sensor data within synchronization frames cannot be counted accurately.
Third, there is no scientific method for calculating the qualified frame ratio. Traditional methods only judge "pass" or "fail". The resulting problems include: 1) no quantitative indicator, so the overall quality level of a raw data packet cannot be assessed; 2) no pass/fail criterion based on sensor integrity, so the specific source of a quality problem (e.g., which sensor's data is missing) cannot be traced; 3) no scientific basis for setting quality thresholds: traditional methods lack clear pass/fail thresholds (e.g., 80%, 90%), which leads to unreasonable screening, makes it impossible to set differentiated thresholds according to application needs, and results in low data utilization and wasted high-quality data; 4) inefficient and costly manual quality inspection: judging data quality by manual sampling is slow (e.g., 1 hour per packet), expensive (e.g., 500 yuan per packet), highly subjective, and inconsistent in its standards, and it cannot cope with large-scale processing needs (e.g., a daily volume that can reach hundreds of packets).

Summary of the Invention

[0003] One objective of this application is to provide a method and device for processing sensor data in intelligent driving. The method pre-selects the raw sensor data collected by autonomous vehicles, accurately counts the completeness and synchronization of that data, and evaluates its quality. It not only filters out obviously invalid data and quantifies the collaborative quality of each sensor's data, but also quantitatively assesses the overall quality level of the raw sensor data, ensuring that the raw sensor data later used in specific practical application scenarios is high-quality data.

[0004] According to one aspect of this application, a method for processing sensor data in intelligent driving is provided, wherein the method includes:
Acquire raw sensor data collected by an autonomous vehicle, and obtain the business scenario requirements and the sensor configuration of the vehicle;
Based on the vehicle's sensor configuration or the business scenario requirements, filter out invalid data from the raw sensor data to obtain valid raw sensor data;
Segment the valid raw sensor data according to the business scenario requirements to obtain the raw sensor data corresponding to each segment time period;
Using the forward LiDAR timestamp as the base time, adopt differentiated time windows for sensors of different topics, and take all data frames of the different topics that fall within their respective time windows in the raw sensor data of the segment time period as the synchronization frame corresponding to that base time, where a synchronization frame is a set of multiple data frames; determine the synchronization frame corresponding to each time point of the raw sensor data for each segment time period, where the time points are all the forward LiDAR timestamps;
Based on the vehicle's sensor configuration, determine whether the synchronization frame corresponding to each time point is qualified, and generate a quality assessment report for the raw sensor data of each segment time period. The quality assessment report includes the qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data of the segment time period divided by the total number of frames.

[0005] Furthermore, in the above method, the step of filtering out invalid data from the raw sensor data according to the vehicle's sensor configuration to obtain valid raw sensor data includes: if the vehicle's sensor configuration is 4 LiDARs and 7 cameras, check whether the raw sensor data contains complete data for the 4 LiDAR topics, the 7 camera topics, and the vehicle localization topic, and if so, determine the topic-complete raw sensor data to be valid raw sensor data; or, if the vehicle's sensor configuration is 8 LiDARs and 11 cameras, check whether the raw sensor data contains complete data for the 8 LiDAR topics, the 11 camera topics, and the vehicle localization topic, and if so, determine the topic-complete raw sensor data to be valid raw sensor data.

[0006] Furthermore, in the above method, after checking whether the raw sensor data contains complete data for the 4 LiDAR topics, 7 camera topics, and vehicle localization topic, or for the 8 LiDAR topics, 11 camera topics, and vehicle localization topic, the method further includes: if not, record the missing-topic information for the raw sensor data and mark that data as invalid.

[0007] Furthermore, in the above method, the differentiated time windows adopted for sensors of different topics include: a window from 15 ms before to 15 ms after the base time for LiDAR topics, from 50 ms before to 50 ms after for camera topics, from 75 ms before to 25 ms after for vehicle localization topics, and from 50 ms before to 50 ms after for other topics. The step of using the forward LiDAR timestamp as the base time and determining the synchronization frame corresponding to each time point of the raw sensor data for each segment time period includes: obtain all forward LiDAR timestamps and use each one as a reference time. For each reference time, perform the following operation until the synchronization frame at every time point of the raw sensor data for each segment time period has been obtained: take the raw LiDAR data frames within 15 ms before and after the reference time, the raw camera data frames within 50 ms before and after the reference time, the raw vehicle localization data frames from 75 ms before to 25 ms after the reference time, and the other raw sensor data frames within 50 ms before and after the reference time, together as the synchronization frame corresponding to that reference time. Each reference time thus yields exactly one synchronization frame.
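The window-based extraction described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes messages are available as (topic, timestamp in ms) pairs, that a topic's kind can be inferred from its name, and that window boundaries are inclusive. The names `topic_kind` and `extract_sync_frames` are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# (before_ms, after_ms) per topic kind, as stated in paragraph [0007].
WINDOWS_MS = {
    "lidar": (15, 15),
    "camera": (50, 50),
    "localization": (75, 25),
    "other": (50, 50),
}


def topic_kind(topic):
    """Classify a topic by name (substring matching is an assumption)."""
    for kind in ("lidar", "camera", "localization"):
        if kind in topic:
            return kind
    return "other"


def extract_sync_frames(messages, forward_lidar_topic):
    """Build one synchronization frame per forward-LiDAR timestamp.

    messages: iterable of (topic, timestamp_ms) pairs.
    Returns a list of (reference_time_ms, {topic: [timestamps]}) tuples.
    """
    by_topic = defaultdict(list)
    for topic, ts in messages:
        by_topic[topic].append(ts)
    for stamps in by_topic.values():
        stamps.sort()

    frames = []
    for ref in by_topic.get(forward_lidar_topic, []):
        frame = {}
        for topic, stamps in by_topic.items():
            before, after = WINDOWS_MS[topic_kind(topic)]
            # Inclusive window [ref - before, ref + after] (assumption).
            lo = bisect_left(stamps, ref - before)
            hi = bisect_right(stamps, ref + after)
            if lo < hi:
                frame[topic] = stamps[lo:hi]
        frames.append((ref, frame))
    return frames
```

Sorting each topic's timestamps once and using binary search keeps the per-reference lookup logarithmic, which matters when a segment holds many forward-LiDAR timestamps.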

[0008] Furthermore, in the above method, the step of determining whether the synchronization frame corresponding to each time point is qualified according to the vehicle's sensor configuration includes: if the vehicle's sensor configuration is 4 LiDARs and 7 cameras, perform the following operation on the synchronization frame at each time point until the determination has been completed for all time points: check whether the synchronization frame contains complete data for the 4 LiDAR topics, the 7 camera topics, and the vehicle localization topic, and whether it contains at least 4 raw vehicle localization data frames. If so, the synchronization frame is determined to be a qualified frame; otherwise, it is determined to be an unqualified frame. Or, if the vehicle's sensor configuration is 8 LiDARs and 11 cameras, perform the same operation on the synchronization frame at each time point: check whether the synchronization frame contains complete data for the 8 LiDAR topics, the 11 camera topics, and the vehicle localization topic, and whether it contains at least 4 raw vehicle localization data frames. If so, the synchronization frame is determined to be a qualified frame; otherwise, it is determined to be an unqualified frame.
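As a rough sketch of this qualification check, assume each synchronization frame is a dictionary mapping a topic name to the list of data frames that fell inside that topic's window. The function names, the default localization topic name, and the reading of "total number of frames" as the total number of synchronization frames are assumptions made for illustration.

```python
def is_qualified(frame, lidar_topics, camera_topics,
                 localization_topic="localization", min_localization=4):
    """Return True if a synchronization frame passes the integrity check.

    frame maps topic name -> list of data frames inside that topic's window.
    """
    has_lidars = all(frame.get(t) for t in lidar_topics)
    has_cameras = all(frame.get(t) for t in camera_topics)
    # At least `min_localization` localization data frames are required.
    has_localization = len(frame.get(localization_topic, [])) >= min_localization
    return has_lidars and has_cameras and has_localization


def qualified_frame_ratio(frames, lidar_topics, camera_topics):
    """Qualified synchronization frames divided by all synchronization frames."""
    if not frames:
        return 0.0
    passed = sum(is_qualified(f, lidar_topics, camera_topics) for f in frames)
    return passed / len(frames)
```

Passing the topic lists as arguments lets the same check serve both the 4-LiDAR/7-camera and the 8-LiDAR/11-camera configurations.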

[0009] Furthermore, the above method further includes: determine the qualification rate threshold based on the business scenario requirements; determine whether the qualified frame ratio is greater than or equal to the qualification rate threshold. If so, the raw sensor data of the segment time period whose qualified frame ratio is greater than or equal to the threshold is taken as raw sensor data submitted for labeling. If not, the raw sensor data of the segment time period whose qualified frame ratio is below the threshold is discarded.
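The threshold decision can be illustrated with a short sketch. It assumes the qualified frame ratio of each segment has already been computed and that the threshold is supplied by the caller according to the business scenario; the function name and segment identifiers are hypothetical.

```python
def select_segments(segment_ratios, pass_rate_threshold):
    """Split segments into submitted and discarded by qualified frame ratio.

    segment_ratios: dict mapping segment id -> qualified frame ratio (0.0-1.0).
    A segment is kept when its ratio is greater than or equal to the threshold.
    """
    submitted, discarded = [], []
    for segment_id, ratio in segment_ratios.items():
        if ratio >= pass_rate_threshold:
            submitted.append(segment_id)
        else:
            discarded.append(segment_id)
    return submitted, discarded
```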

[0010] Furthermore, the above method further includes: identify all candidate marker frames and their positions in the raw sensor data submitted for labeling; for each candidate marker frame, perform the following operation until the final marker frames for the raw sensor data submitted for labeling are obtained: determine whether the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration and the image quality required by the first actual application scenario. If so, determine the current candidate marker frame to be the final marker frame. If not, determine whether the frame following the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration. If it does, discard the marker of the current candidate marker frame and determine the following frame to be the final marker frame. If it does not, determine whether the frame preceding the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration. If so, discard the markers of the current candidate marker frame and the following frame, and determine the preceding frame to be the final marker frame; otherwise, discard the markers of the current, following, and preceding frames.
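The current/next/previous fallback can be sketched as below. The sketch assumes "next" and "previous" mean the frames adjacent to the candidate by position, and that the integrity and image-quality results are already available as per-frame boolean lists; both function names are hypothetical.

```python
def resolve_marker_frame(index, frame_ok, image_ok):
    """Pick the final marker frame index for one candidate, or None.

    frame_ok[i]: frame i meets the sensor-configuration integrity requirement.
    image_ok[i]: frame i meets the image-quality requirement.
    """
    # 1) Keep the current candidate if it passes both checks.
    if frame_ok[index] and image_ok[index]:
        return index
    # 2) Otherwise fall back to the next frame if it is complete.
    nxt = index + 1
    if nxt < len(frame_ok) and frame_ok[nxt]:
        return nxt
    # 3) Otherwise fall back to the previous frame if it is complete.
    prev = index - 1
    if prev >= 0 and frame_ok[prev]:
        return prev
    # 4) All three failed: the candidate's marker is discarded.
    return None


def resolve_all(candidates, frame_ok, image_ok):
    """Resolve every candidate and return the distinct final frame indices."""
    finals = []
    for index in candidates:
        chosen = resolve_marker_frame(index, frame_ok, image_ok)
        if chosen is not None and chosen not in finals:
            finals.append(chosen)
    return finals
```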

[0011] Furthermore, the above method further includes: obtain the requirements of the second actual application scenario, and determine from them the application qualification rate threshold, the sliding window size, and the starting frame number of the sliding window search; determine in advance the initial mark of every synchronization frame in the raw sensor data submitted for labeling; starting from the synchronization frame at the starting frame number, slide the window and compute the qualified frame ratio of the raw sensor data within each window of the given size. The optimal sliding window is the one with the highest qualified frame ratio. Synchronization frames in the raw sensor data submitted for labeling that lie outside the optimal sliding window are marked as unqualified, while the original marks of the synchronization frames inside the optimal sliding window are retained. It is then determined whether the qualified frame ratio of the raw sensor data in the optimal sliding window is greater than the application qualification rate threshold. If so, the raw sensor data in the optimal sliding window is used in the actual application scenario; if not, failure information is recorded to indicate that the raw sensor data in the optimal sliding window could not be used.
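A possible reading of this sliding window search is sketched below. It assumes the initial marks are a list of booleans (True for a qualified frame), that only full-size windows are considered, and that ties go to the earliest window; `best_window` and `apply_window` are hypothetical names.

```python
def best_window(flags, window_size, start=0):
    """Return (best_start, best_ratio) over all full windows from `start`.

    flags: list of booleans, True for a qualified synchronization frame.
    Returns (None, 0.0) when no full window fits.
    """
    if window_size <= 0 or start + window_size > len(flags):
        return None, 0.0
    count = sum(flags[start:start + window_size])
    best_start, best_count = start, count
    for s in range(start + 1, len(flags) - window_size + 1):
        # Slide by one frame: add the entering flag, drop the leaving one.
        count += flags[s + window_size - 1] - flags[s - 1]
        if count > best_count:
            best_start, best_count = s, count
    return best_start, best_count / window_size


def apply_window(flags, window_size, start, threshold):
    """Mark frames outside the best window unqualified and test the threshold.

    Returns (new_flags, usable); usable is True only when the best window's
    qualified frame ratio is strictly greater than `threshold`.
    """
    best_start, ratio = best_window(flags, window_size, start)
    if best_start is None:
        return [False] * len(flags), False
    end = best_start + window_size
    new_flags = [f if best_start <= i < end else False
                 for i, f in enumerate(flags)]
    return new_flags, ratio > threshold
```

Updating the count incrementally avoids recounting each window, so the search is linear in the number of synchronization frames.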

[0012] According to another aspect of this application, a non-volatile storage medium is also provided, on which computer-readable instructions are stored, which, when executed by a processor, cause the processor to implement the intelligent driving sensor data processing method described above.

[0013] According to another aspect of this application, an intelligent driving sensor data processing device is also provided, wherein the device includes: One or more processors; Computer-readable medium for storing one or more computer-readable instructions. When the one or more computer-readable instructions are executed by the one or more processors, the one or more processors implement the intelligent driving sensor data processing method described above.

[0014] Compared with the prior art, this application first acquires raw sensor data collected by autonomous vehicles and obtains the business scenario requirements and the vehicle's sensor configuration. Then, based on the vehicle's sensor configuration or the business scenario requirements, it filters out invalid data from the raw sensor data to obtain valid raw sensor data, so that obviously invalid data is removed early in processing and overall processing efficiency improves. Next, it segments the valid raw sensor data according to the business scenario requirements to obtain the raw sensor data corresponding to each segment time period. Using the forward LiDAR timestamp as the base time, it adopts differentiated time windows for sensors of different topics and takes all data frames of the different topics within their respective time windows as the synchronization frame corresponding to that base time, where a synchronization frame is a set of multiple data frames. The synchronization frame at each time point of the raw sensor data for each segment time period is determined in this way, the time points being all the forward LiDAR timestamps. This enables accurate statistics on the completeness and synchronization of each sensor's data and a quantitative evaluation of the collaborative quality of each sensor's data. Finally, based on the vehicle's sensor configuration, it determines whether the synchronization frame at each time point is qualified and generates a quality assessment report for the raw sensor data of each segment time period. The quality assessment report includes the qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data of the segment time period divided by the total number of frames. The qualified frame ratio provides a quantitative assessment of the overall quality level of the raw sensor data, ensuring that the raw sensor data used in specific practical application scenarios is of high quality.

Attached Figure Description

[0015] Other features, objects, and advantages of this application will become more apparent from the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Figure 1 shows a flowchart of a method for processing sensor data for intelligent driving according to one aspect of this application.
Figure 2 shows the overall system architecture of an intelligent driving sensor data processing method according to one aspect of this application.

Detailed Implementation

[0016] The present application will now be described in further detail with reference to the accompanying drawings.

[0017] In a typical configuration of this application, the terminal, the device of the service network, and the trusted party all include one or more processors (CPU), input / output interfaces, network interfaces, and memory.

[0018] Memory may include non-persistent storage in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash RAM. Memory is an example of computer-readable media.

[0019] Computer-readable media includes both permanent and non-permanent, removable and non-removable media that can store information by any method or technology. Information can be computer-readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media, such as modulated data signals and carrier waves.

[0020] Existing autonomous driving data quality control suffers from several shortcomings: no rapid pre-selection mechanism (traditional methods treat all data equally and cannot quickly filter out obviously incomplete data through an early topic integrity check); no unified synchronization reference (timestamps are inconsistent across sensors, and there is no clear reference timeline for extracting synchronization frames); no quantitative qualification indicator (only a binary "pass" or "fail" judgment, which cannot quantify the overall quality level of a data packet); no sensor integrity verification (the integrity of the LiDAR, camera, and localization topics is not checked, so incomplete data enters downstream applications); and inefficient manual quality inspection (judging data quality by manual sampling is slow, costly, and lacks standardized criteria). To address these technical problems, this application proposes an intelligent driving sensor data processing method, a flowchart of which is shown in Figure 1. The method is applied in the field of autonomous driving technology and includes steps S11, S12, S13, S14, and S15, as follows:
Step S11: Acquire the raw sensor data collected by the autonomous vehicle, and obtain the business scenario requirements and the sensor configuration of the vehicle.
Here, the business scenario requirements may include, but are not limited to, any need to process and analyze raw sensor data for later use in a specific business scenario, such as using the raw sensor data for mapping, filtering or labeling the raw sensor data for value assessment of autonomous driving, performing navigation analysis for autonomous vehicles, and using the raw sensor data acquired by autonomous vehicles for training.

[0021] The vehicle's sensor configuration can be, but is not limited to, 4 LiDARs, 7 cameras, and a vehicle localization sensor, or 8 LiDARs, 11 cameras, and a vehicle localization sensor. For example, in a preferred embodiment of this application, the vehicle's sensor configuration can be obtained through the following code:

    # Input:  sensor_config (sensor configuration type)
    # Output: required_sensors (list of required sensors)

    # Configuration 1: 4 LiDARs + 7 cameras
    required_cameras = [
        'camera_backward',         # Rear-view camera
        'camera_forward_far',      # Forward-facing telephoto camera
        'camera_forward_wide',     # Forward-facing wide-angle camera
        'camera_pano_leftfront',   # Left front panoramic camera
        'camera_pano_leftrear',    # Left rear panoramic camera
        'camera_pano_rightfront',  # Right front panoramic camera
        'camera_pano_rightrear',   # Right rear panoramic camera
    ]
    required_lidars = 4
    required_localization_min = 4  # At least 4 localization records

    # Configuration 2: 8 LiDARs + 11 cameras
    required_cameras = [
        'camera_backward',
        'camera_forward_far',
        'camera_forward_wide',
        'camera_pano_leftfront',
        'camera_pano_leftrear',
        'camera_pano_rightfront',
        'camera_pano_rightrear',
        'camera_surr_front',       # Front blind-spot (surround) camera
        'camera_surr_left',        # Left blind-spot (surround) camera
        'camera_surr_rear',        # Rear blind-spot (surround) camera
        'camera_surr_right',       # Right blind-spot (surround) camera
    ]
    required_lidars = 8
    required_localization_min = 4

Step S12: Based on the vehicle's sensor configuration or the business scenario requirements, filter out invalid data from the raw sensor data to obtain valid raw sensor data.
Step S13: Segment the valid raw sensor data according to the business scenario requirements to obtain the raw sensor data corresponding to each segment time period.
Step S14: Using the forward LiDAR timestamp as the base time, adopt differentiated time windows for sensors of different topics, and take all data frames of the different topics within their respective time windows in the raw sensor data of the segment time period as the synchronization frame corresponding to that base time, where a synchronization frame is a set of multiple data frames. Determine the synchronization frame corresponding to each time point of the raw sensor data for each segment time period, where the time points are all the forward LiDAR timestamps.
Step S15: Based on the vehicle's sensor configuration, determine whether the synchronization frame corresponding to each time point is qualified, and generate a quality assessment report for the raw sensor data of each segment time period. The quality assessment report includes the qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data of the segment time period divided by the total number of frames.

[0022] Through steps S11 to S15 above, the raw sensor data collected by autonomous vehicles is effectively pre-selected, the completeness and synchronization of the raw sensor data are accurately statistically analyzed, and the data quality is evaluated. This not only filters out obvious invalid data and quantitatively evaluates the collaborative quality of each sensor data, but also quantitatively evaluates the overall quality level of the raw sensor data, ensuring that the raw sensor data used in specific practical application scenarios in the later stages are all high-quality data.

[0023] Following the above embodiments of this application, step S12, which filters out invalid data from the raw sensor data based on the vehicle's sensor configuration to obtain valid raw sensor data, specifically includes: if the vehicle's sensor configuration is 4 LiDARs and 7 cameras, check whether the raw sensor data contains complete data for the 4 LiDAR topics, the 7 camera topics, and the vehicle localization topic, and if so, determine the topic-complete raw sensor data to be valid raw sensor data; or, if the vehicle's sensor configuration is 8 LiDARs and 11 cameras, check whether the raw sensor data contains complete data for the 8 LiDAR topics, the 11 camera topics, and the vehicle localization topic, and if so, determine the topic-complete raw sensor data to be valid raw sensor data.

[0024] In the embodiments of this application, as shown in the overall system architecture diagram of Figure 2, once the raw sensor data has been acquired, invalid bag files must be filtered out quickly at the start of processing by checking the vehicle's required sensor configuration, so that invalid data does not enter later processing stages (this corresponds to Phase 1, Data Pre-selection, in Figure 2). Before the required sensors are checked, each sensor of the vehicle must be defined. In a preferred embodiment of this application, the required sensors of the vehicle can be defined by the following code:

    # Input:  None
    # Output: required_topics (a list of required sensor topics)

    # Required camera topics (either h264 or h265 format is acceptable)
    required_camera_topics = [
        'sensor_camera_forward_far_orig',      # Forward-facing long-range camera
        'sensor_camera_forward_wide_orig',     # Forward-facing wide-angle camera
        'sensor_camera_pano_leftfront_orig',   # Left front panoramic camera
        'sensor_camera_pano_leftrear_orig',    # Left rear panoramic camera
        'sensor_camera_pano_rightfront_orig',  # Right front panoramic camera
        'sensor_camera_pano_rightrear_orig',   # Right rear panoramic camera
        'sensor_camera_backward_orig',         # Rear-view camera
    ]

    # Required LiDAR topics
    required_lidar_topics = [
        'sensor_lidar_front_orig_pcloud',  # Front LiDAR
        'sensor_lidar_right_orig_pcloud',  # Right LiDAR
        'sensor_lidar_left_orig_pcloud',   # Left LiDAR
        'sensor_lidar_rear_orig_pcloud',   # Rear LiDAR
    ]

    # Required localization topic
    required_localization_topic = 'localization'

After the vehicle's sensor configuration has been determined, invalid data must be filtered out of the raw sensor data to obtain valid raw sensor data. In a preferred embodiment of this application, the topic integrity check of the raw sensor data can be implemented using the following code:

    # Input:  bag_path (path to the bag file)
    # Output: is_valid (boolean), missing_topics (list of missing topics)
    def check_necessary_topic_valid(bag_path):
        # Get information about the bag file (uses the `ros2 bag info` command)
        bag_info = get_rosbag_info(bag_path)
        topics_in_bag = bag_info['Topics'].keys()

        # Check camera topics (either the h264 or the h265 variant is fine)
        missing_camera_topics = []
        for required_camera in required_camera_topics:
            found_h264 = (required_camera + '_h264') in topics_in_bag
            found_h265 = (required_camera + '_h265') in topics_in_bag
            if not (found_h264 or found_h265):
                missing_camera_topics.append(required_camera)
        # Output information about the missing camera topics here.

        # Check LiDAR topics
        missing_lidar_topics = []
        for required_lidar in required_lidar_topics:
            if required_lidar not in topics_in_bag:
                missing_lidar_topics.append(required_lidar)
        # Output information about the missing LiDAR topics here.

        # Check the localization topic
        localization_exists = required_localization_topic in topics_in_bag

        # Overall judgment: the localization topic counts as missing only
        # when it is absent from the bag
        missing_topics = missing_camera_topics + missing_lidar_topics
        if not localization_exists:
            missing_topics.append(required_localization_topic)

        if len(missing_topics) > 0:
            return (False, missing_topics)
        return (True, [])

The above code implements the topic integrity check for a vehicle configured with 4 LiDARs and 7 cameras. It checks whether the collected raw sensor data contains complete data for the 4 LiDAR topics, the 7 camera topics, and the vehicle localization topic. If it does, the raw sensor data is topic-complete and is identified as valid raw sensor data, which achieves effective pre-selection of the raw sensor data and improves overall processing efficiency.

[0025] In this embodiment, after checking whether the raw sensor data contains complete data for the 4 LiDAR topics, 7 camera topics, and vehicle localization topic, or for the 8 LiDAR topics, 11 camera topics, and vehicle localization topic, the method further includes: if not, record the missing-topic information for the raw sensor data and mark that data as invalid.

[0026] For example, in the topic integrity check for a vehicle whose sensors are configured as 4 lidars and 7 cameras, if the vehicle's raw sensor data lacks data for any one of the 4 lidar topics, the 7 camera topics, or the vehicle positioning topic, the raw sensor data is missing a topic. To support quality assurance in subsequent data processing, the missing-topic information of such raw sensor data is recorded and the data is marked as invalid. That is, raw sensor data with missing topics is determined to be invalid raw sensor data, which filters invalid data out of the raw sensor data and improves overall processing efficiency.
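The topic integrity check described above can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the topic names are hypothetical placeholders, the list of topics is passed in directly rather than read from a bag file, and the sketch assumes that the localization topic is reported as missing when it is absent from the bag.

```python
from typing import Iterable, List, Tuple

# Hypothetical topic names, for illustration only.
REQUIRED_CAMERAS = ["/sensor/camera_forward_far", "/sensor/camera_backward"]
REQUIRED_LIDARS = ["/sensor/lidar_front", "/sensor/lidar_rear"]
LOCALIZATION_TOPIC = "/localization"


def check_necessary_topics(topics_in_bag: Iterable[str]) -> Tuple[bool, List[str]]:
    """Return (is_valid, missing_topics) for the topics found in one bag."""
    present = set(topics_in_bag)
    # A camera counts as present if either its h264 or its h265 stream exists.
    missing = [
        cam for cam in REQUIRED_CAMERAS
        if f"{cam}_h264" not in present and f"{cam}_h265" not in present
    ]
    missing += [lidar for lidar in REQUIRED_LIDARS if lidar not in present]
    # Assumption: an absent localization topic is a missing topic.
    if LOCALIZATION_TOPIC not in present:
        missing.append(LOCALIZATION_TOPIC)
    return (not missing, missing)
```

A bag is treated as valid only when the returned list is empty; otherwise the list can be written to the missing-information record.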

[0027] In a preferred embodiment of this application, invalid filtering and valid pre-selection of raw sensor data can be implemented using the following code, which yields the valid raw sensor data as well as the invalid raw sensor data, marked as invalid, together with its missing-topic information:

Input: bag_path (path to the bag file), check_rosbag (whether to enable pre-selection)
Output: should_process (whether to continue processing)
Algorithm:
    If not check_rosbag:
        # Pre-selection disabled, process all bags
        Return True
    # Perform topic integrity check
    is_valid, missing_topics = check_necessary_topic_valid(bag_path)
    If not is_valid:
        logger.info(f"Bag {bag_path} is missing a necessary topic: {missing_topics}, skip it.")
        # Record missing information to the check_result file
        write_check_result(output_dir, {bag_path: missing_topics})
        # Create NOT_VALID tag file
        create_file(output_dir + "/NOT_VALID")
        Return False
    Else:
        Return True

Here, quickly identifying bag files in the raw sensor data that are missing essential topics prevents invalid data packets from entering the subsequent parsing stage, saving 20% to 30% of processing time. It also provides detailed missing-topic information for the invalid raw sensor data, which makes problems easier to trace.
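The pre-selection step above can be sketched as a small Python function. This is an illustrative sketch under stated assumptions: the result of the topic integrity check is passed in as `missing_topics`, and the record file name `check_result.json` is hypothetical (the text only speaks of a check_result file and a NOT_VALID tag file).

```python
import json
import tempfile
from pathlib import Path
from typing import List


def preselect_bag(bag_path: str, missing_topics: List[str], output_dir: Path,
                  check_rosbag: bool = True) -> bool:
    """Return True if the bag should continue to the parsing stage."""
    # With pre-selection disabled, or with no missing topics, keep the bag.
    if not check_rosbag or not missing_topics:
        return True
    output_dir.mkdir(parents=True, exist_ok=True)
    # Record which topics are missing (hypothetical file name).
    record = {bag_path: missing_topics}
    (output_dir / "check_result.json").write_text(json.dumps(record, indent=2))
    # An empty marker file flags the bag as invalid for later stages.
    (output_dir / "NOT_VALID").touch()
    return False
```

Keeping the marker file separate from the record means a later stage can skip an invalid bag by checking for one file, without parsing the record.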

[0028] In this embodiment, the fast pre-selection rules of step S12 can filter obviously invalid data through topic integrity checks, single-frame integrity checks, or extreme outlier detection. Invalid data is filtered out and marked as invalid at an early stage of data processing, leaving the valid raw sensor data and improving the overall processing efficiency of the raw sensor data.

[0029] Following the above embodiments of this application, after the valid pre-selection in step S12, completeness and synchronization statistics must be computed for the valid raw sensor data (corresponding to Phase Two, synchronization frame statistics, in Figure 2). Before that, the valid raw sensor data is segmented according to the business scenario requirements to obtain the raw sensor data corresponding to each segment time period. In a preferred embodiment of this application, the segment time period is preferably 15 seconds, and the valid raw sensor data can be clipped using the following code:

Input: complete bag data (e.g., valid raw sensor data with a duration of 25 seconds to be segmented)
Output: multiple clip segments
Parameters:
    clip_duration = 15 seconds  # Duration of a single clip segment
    skip_start = 1 second  # Time skipped at the beginning
Algorithm:
    total_duration = 25 seconds
    # Clip 1: from second 1 to second 16 (15 seconds)
    clip1_start = skip_start = 1 second
    clip1_end = clip1_start + clip_duration = 16 seconds
    # Clip 2: from second 17 to second 25 (8 seconds, less than 15 seconds)
    clip2_start = 17 seconds
    clip2_end = total_duration = 25 seconds
    clip2_duration = clip2_end - clip2_start = 8 seconds
    # Judgment rule
    If clip2_duration < clip_duration:
        # Less than 15 seconds, discard clip2
        valid_clips = [clip1]
    Else:
        valid_clips = [clip1, clip2]
    Return valid_clips

To compute completeness and synchronization statistics for the raw sensor data of each sensor, the embodiments of this application use the forward lidar timestamp as the base time and search for data from the other sensors within a specified time range, thereby extracting the synchronization frame (corresponding to Phase Two, synchronization frame statistics, in Figure 2).

The synchronization frame in this embodiment is defined as follows. In step S14, different time windows are used for sensors of different topics: 15 ms before and after for lidar sensors, 50 ms before and after for camera sensors, 75 ms before to 25 ms after for the vehicle positioning sensor, and 50 ms before and after for other topics. In a preferred embodiment of this application, these time windows can be defined using the following code:

Key parameters:
    # Use the forward lidar (front_lidar) timestamp as the base time
    reference_sensor = 'lidar_front'
    # Time window range for different sensors
    time_windows = {
        'localization': (-75ms, +25ms),  # 75ms before to 25ms after
        'lidar': (-15ms, +15ms),  # 15ms before and after
        'camera': (-50ms, +50ms),  # 50ms before and after
        'other_topics': (-50ms, +50ms)  # 50ms before and after
    }

In step S14, with the forward lidar timestamp as the base time and a differentiated time window for each topic, all data frames of the different topics that fall within their time windows in the raw sensor data of a segment time period are taken as the synchronization frame corresponding to that base time. A synchronization frame is therefore a set of multiple data frames. The synchronization frames of each segment time period are determined at each time point, where the time points are the timestamps of the forward lidar sensor. This includes: obtaining all forward lidar timestamps and using them as reference times.

The following operation is performed at each reference time until the synchronization frames of the raw sensor data corresponding to each segment time period have been obtained at every time point: the raw lidar sensor data frames within 15 ms before and after the reference time, the raw camera sensor data frames within 50 ms before and after the reference time, the raw vehicle positioning sensor data frames from 75 ms before to 25 ms after the reference time, and the raw data frames of other sensors within 50 ms before and after the reference time are together taken as the synchronization frame corresponding to that reference time. The set of data frames collected for one reference time is recorded as one synchronization frame.
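The 15-second segmentation rule from the worked example earlier in this paragraph can be generalized as a short Python sketch. It is an illustration only: the function name is hypothetical, and the 1-second gap between consecutive clips is an assumption inferred from the example (clip 1 ends at 16 s and clip 2 starts at 17 s), not something the text states as a rule.

```python
from typing import List, Tuple


def split_into_clips(total_duration: float, clip_duration: float = 15.0,
                     skip_start: float = 1.0, gap: float = 1.0) -> List[Tuple[float, float]]:
    """Return (start, end) pairs in seconds; clips shorter than clip_duration are dropped."""
    clips = []
    start = skip_start  # Skip the beginning of the recording.
    while start + clip_duration <= total_duration:
        clips.append((start, start + clip_duration))
        # Assumed 1 s gap between clips, inferred from the worked example.
        start += clip_duration + gap
    return clips
```

For a 25-second bag this yields a single clip from second 1 to second 16, and the 8-second remainder is discarded, matching the example.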

[0030] In a preferred embodiment of this application, the extraction of synchronization frames based on the forward lidar timestamp can be achieved using the following code:

Input: clip data (i.e., the raw sensor data corresponding to each segment time period)
Output: sync_frames (a list of synchronization frames, i.e., the synchronization frame at each time point of the raw sensor data of each segment time period)
Key parameters:
    # Use the front_lidar timestamp as the base time
    reference_sensor = 'lidar_front'
    # Time window range for different sensors
    time_windows = {
        'localization': (-75ms, +25ms),  # 75ms before to 25ms after
        'lidar': (-15ms, +15ms),  # 15ms before and after
        'camera': (-50ms, +50ms),  # 50ms before and after
        'other_topics': (-50ms, +50ms)  # 50ms before and after
    }
Algorithm:
    sync_frames = []
    # Get all front_lidar timestamps as base times
    For each t_ref in front_lidar_timestamps:
        sync_frame = {
            'reference_timestamp': t_ref,
            'frame_data': {}
        }
        # First, find the vehicle positioning (localization) data: all data frames from 75ms before to 25ms after
        loc_window_start = t_ref - 75ms
        loc_window_end = t_ref + 25ms
        localization_data = find_all_in_range('localization', loc_window_start, loc_window_end)
        sync_frame['frame_data']['localization'] = localization_data
        # Second, find the lidars other than the forward lidar (15ms before and after)
        For each lidar in ['lidar_rear', 'lidar_left', 'lidar_right']:
            lidar_data = find_closest_in_range(lidar, t_ref - 15ms, t_ref + 15ms)
            sync_frame['frame_data'][lidar] = lidar_data
        # Third, find the cameras (50ms before and after)
        For each camera in camera_list:
            camera_data = find_closest_in_range(camera, t_ref - 50ms, t_ref + 50ms)
            sync_frame['frame_data'][camera] = camera_data
        # Fourth, find the sensors of other topics (50ms before and after)
        For each topic in other_topics:
            topic_data = find_closest_in_range(topic, t_ref - 50ms, t_ref + 50ms)
            sync_frame['frame_data'][topic] = topic_data
        sync_frames.append(sync_frame)
    Return sync_frames

Here, for a selected reference time, the above code takes the raw lidar sensor data frames within 15 ms before and after, the raw camera sensor data frames within 50 ms before and after, the raw vehicle positioning sensor data frames from 75 ms before to 25 ms after, and the raw data frames of other sensors within 50 ms before and after, all from the raw sensor data of the segment time period, as the synchronization frame corresponding to that reference time. In all embodiments of this application, a synchronization frame belongs to exactly one reference time and is a frame set containing multiple data frames; whether a synchronization frame is qualified or complete refers to the frame set of that reference time. Since the raw sensor data of each segment time period contains many forward lidar timestamps, repeating the above steps for each of them yields the synchronization frame for every forward lidar timestamp, and thus the synchronization frames at every time point of the raw sensor data of each segment time period.
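The window-based extraction above can be sketched in runnable Python. This is a minimal sketch under assumptions that are not in the original: timestamps are integer milliseconds held in sorted lists, the window bounds are inclusive, the topic kind is guessed from a substring of the topic name, and the helper bodies shown here are illustrative stand-ins for `find_all_in_range` and `find_closest_in_range`.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Optional

# Window bounds in milliseconds relative to the reference timestamp (values from the text).
TIME_WINDOWS_MS = {"localization": (-75, 25), "lidar": (-15, 15),
                   "camera": (-50, 50), "other": (-50, 50)}


def find_all_in_range(stamps: List[int], lo: int, hi: int) -> List[int]:
    """All timestamps in the closed interval [lo, hi]; stamps must be sorted."""
    return stamps[bisect_left(stamps, lo):bisect_right(stamps, hi)]


def find_closest_in_range(stamps: List[int], t_ref: int, lo: int, hi: int) -> Optional[int]:
    """Timestamp in [lo, hi] nearest to t_ref, or None if the window is empty."""
    candidates = find_all_in_range(stamps, lo, hi)
    return min(candidates, key=lambda t: abs(t - t_ref)) if candidates else None


def kind_of(topic: str) -> str:
    """Assumed naming convention: the topic name contains its sensor kind."""
    for kind in ("localization", "lidar", "camera"):
        if kind in topic:
            return kind
    return "other"


def extract_sync_frames(reference_stamps: List[int],
                        topics: Dict[str, List[int]]) -> List[dict]:
    frames = []
    for t_ref in reference_stamps:
        frame_data = {}
        for topic, stamps in topics.items():
            kind = kind_of(topic)
            before, after = TIME_WINDOWS_MS[kind]
            lo, hi = t_ref + before, t_ref + after
            if kind == "localization":
                # Localization keeps every frame in its window.
                frame_data[topic] = find_all_in_range(stamps, lo, hi)
            else:
                # Other topics keep only the frame closest to the reference time.
                frame_data[topic] = find_closest_in_range(stamps, t_ref, lo, hi)
        frames.append({"reference_timestamp": t_ref, "frame_data": frame_data})
    return frames
```

Using binary search on sorted timestamp lists keeps each lookup logarithmic, which matters when a clip holds thousands of messages per topic.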

[0031] Following the above embodiments of this application, step S15 determines, based on the vehicle's sensor configuration, whether the synchronization frame corresponding to each time point is qualified. Specifically: if the vehicle's sensor configuration is 4 lidars and 7 cameras, the following operation is performed for the synchronization frame at each time point until the synchronization frames at all time points have been judged: check whether the synchronization frame corresponding to the time point contains complete data for the 4 lidar topics, the 7 camera topics, and the vehicle positioning topic, and whether it contains at least 4 raw vehicle positioning sensor data frames. If so, the synchronization frame corresponding to the time point is determined to be a qualified frame; otherwise, it is determined to be an unqualified frame. Alternatively, if the vehicle's sensor configuration is 8 lidars and 11 cameras, the determination includes: check whether the synchronization frame corresponding to the time point contains complete data for the 8 lidar topics, the 11 camera topics, and the vehicle positioning topic, and whether it contains at least 4 raw vehicle positioning sensor data frames. If so, the synchronization frame corresponding to the time point is determined to be a qualified frame; otherwise, it is determined to be an unqualified frame.

[0032] In a preferred embodiment of this application, when the vehicle's sensors are configured as 4 lidars and 7 cameras, the qualification determination in step S15 for the synchronization frame of a single time point can be implemented by the following code:

Input: sync_frame (a single synchronization frame, i.e., the synchronization frame corresponding to one time point)
Output: is_qualified (whether it is qualified)
Qualified frame standard:
    4 lidars complete: lidar_front, lidar_rear, lidar_left, lidar_right
    7 cameras complete:
        - camera_backward
        - camera_forward_far
        - camera_forward_wide
        - camera_pano_leftfront
        - camera_pano_leftrear
        - camera_pano_rightfront
        - camera_pano_rightrear
    Localization requires at least 4 frames (since the sampling rate of the vehicle positioning sensor is 50Hz, at least 4 frames are required within the 100ms time window from 75ms before to 25ms after the reference time)
Qualified frame determination algorithm:
    # Check the 4 lidars
    required_lidars = ['lidar_front', 'lidar_rear', 'lidar_left', 'lidar_right']
    lidar_complete = all(sync_frame['frame_data'][lidar] is not None for lidar in required_lidars)
    # Check the 7 cameras
    required_cameras = [
        'camera_backward', 'camera_forward_far', 'camera_forward_wide',
        'camera_pano_leftfront', 'camera_pano_leftrear',
        'camera_pano_rightfront', 'camera_pano_rightrear'
    ]
    camera_complete = all(sync_frame['frame_data'][camera] is not None for camera in required_cameras)
    # Check localization and the number of raw vehicle positioning sensor data frames
    localization_data = sync_frame['frame_data']['localization']
    localization_sufficient = (localization_data is not None AND len(localization_data) >= 4)
    # Qualification
    is_qualified = lidar_complete AND camera_complete AND localization_sufficient
    Return is_qualified

In this embodiment, the forward lidar timestamp is used as the reference time to avoid cyclic dependencies between multiple sensors. Different time windows are used for sensors of different topics to suit their respective sampling characteristics. In particular, the vehicle positioning (localization) topic uses a 100 ms window (75 ms before and 25 ms after) so that at least 4 raw vehicle positioning sensor data frames are available, which ensures the completeness of the synchronization frame.
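The single-frame qualification rule for the 4-lidar, 7-camera configuration can be sketched in Python as below. The sketch assumes the `frame_data` layout used in the extraction step and assumes that an entry for `lidar_front` is stored in `frame_data` alongside the other lidars. Using `dict.get` so that an absent key counts as missing data is a design choice of this sketch, not something the text specifies.

```python
from typing import Any, Dict, Sequence

REQUIRED_LIDARS = ("lidar_front", "lidar_rear", "lidar_left", "lidar_right")
REQUIRED_CAMERAS = (
    "camera_backward", "camera_forward_far", "camera_forward_wide",
    "camera_pano_leftfront", "camera_pano_leftrear",
    "camera_pano_rightfront", "camera_pano_rightrear",
)
MIN_LOCALIZATION_FRAMES = 4


def is_frame_qualified(sync_frame: Dict[str, Any],
                       lidars: Sequence[str] = REQUIRED_LIDARS,
                       cameras: Sequence[str] = REQUIRED_CAMERAS,
                       min_localization: int = MIN_LOCALIZATION_FRAMES) -> bool:
    """A frame qualifies when every lidar and camera has data and
    the localization window holds at least min_localization frames."""
    data = sync_frame.get("frame_data", {})
    lidar_ok = all(data.get(name) is not None for name in lidars)
    camera_ok = all(data.get(name) is not None for name in cameras)
    localization = data.get("localization") or []
    return lidar_ok and camera_ok and len(localization) >= min_localization
```

The 8-lidar, 11-camera configuration can reuse the same function by passing longer `lidars` and `cameras` sequences.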

[0033] In step S14, an index is first built with all forward lidar timestamps as reference times, and a differentiated time window is applied to the sensors of each topic. Under this time reference, completeness and coverage statistics of the synchronization frames are then computed for each topic within its own time window, thereby extracting the synchronization frame corresponding to each time point.

[0034] Finally, in step S15, after the synchronization frames corresponding to all forward lidar timestamps have been judged as qualified or unqualified for the applicable vehicle sensor configuration, a quality assessment report is generated for the raw sensor data of each segment time period. The quality assessment report includes the qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data of the segment time period divided by the total number of synchronization frames, where the synchronization frame corresponding to one time point counts as one synchronization frame. Computing this report for each segment time period quantifies the qualified frame ratio of the data packet and thereby quantitatively assesses its overall quality level (corresponding to Phase Three, calculation of the qualified frame ratio, in Figure 2).

[0035] In a preferred embodiment of this application, the calculation of the qualified frame ratio in the raw sensor data of each segment time period can be implemented using the following code, based on the vehicle's sensor configuration:

Vehicle sensor configuration identification
Input: sync_frame (single synchronization frame)
Output: sensor_config (sensor configuration type)
Algorithm:
    # Count the number of lidars
    lidar_count = 0
    For each file_path in sync_frame['file path']:
        If file_path is not None AND file_path contains 'lidar':
            lidar_count += 1
    # Determine the configuration based on the number of lidars
    If lidar_count == 4:
        Return 'config_4lidar'  # 4-lidar configuration
    Else if lidar_count == 8:
        Return 'config_8lidar'  # 8-lidar configuration
    Else:
        Return 'unknown'

Required sensor definitions (based on the vehicle's sensor configuration)
Input: sensor_config (sensor configuration type)
Output: required_sensors (list of required sensors)
Configuration 1 (4 lidars), requiring 7 cameras:
    required_cameras = [
        'camera_backward',  # Rear-view camera
        'camera_forward_far',  # Forward telephoto camera
        'camera_forward_wide',  # Forward wide-angle camera
        'camera_pano_leftfront',  # Left front panoramic camera
        'camera_pano_leftrear',  # Left rear panoramic camera
        'camera_pano_rightfront',  # Right front panoramic camera
        'camera_pano_rightrear'  # Right rear panoramic camera
    ]
    required_lidars = 4
    required_localization_min = 4  # At least 4 localization records
Configuration 2 (8 lidars), requiring 11 cameras:
    required_cameras = [
        'camera_backward', 'camera_forward_far', 'camera_forward_wide',
        'camera_pano_leftfront', 'camera_pano_leftrear',
        'camera_pano_rightfront', 'camera_pano_rightrear',
        'camera_surr_front',  # Front blind-spot camera
        'camera_surr_left',  # Left blind-spot camera
        'camera_surr_rear',  # Rear blind-spot camera
        'camera_surr_right'  # Right blind-spot camera
    ]
    required_lidars = 8
    required_localization_min = 4

Single-frame qualification determination (for the synchronization frame corresponding to a single time point)
Input: sync_frame (single synchronization frame), sensor_config (sensor configuration type)
Output: is_qualified (whether it is qualified)
Algorithm:
    file_paths = sync_frame['file path']
    # Count the sensor data actually present
    lidar_count = 0
    camera_types = set()  # Use a set to record camera types
    localization_count = 0
    For each file_path in file_paths:
        If file_path is None:
            continue
        # Count lidars
        If 'lidar' in file_path AND 'pcloud' in file_path:
            lidar_count += 1
        # Record camera types
        If 'camera' in file_path:
            # Extract the camera type (e.g., camera_forward_far)
            camera_type = extract_camera_type(file_path)
            camera_types.add(camera_type)
        # Count localization records
        If 'localization' in file_path:
            If isinstance(file_path, list):
                localization_count = len(file_path)
            Else:
                localization_count = 1
    # Get the list of required sensors
    required_cameras, required_lidars, required_loc_min = get_required_sensors(sensor_config)
    # Determine whether the frame is qualified
    lidar_ok = (lidar_count == required_lidars)
    cameras_ok = all(camera in camera_types for camera in required_cameras)
    localization_ok = (localization_count >= required_loc_min)
    is_qualified = lidar_ok AND cameras_ok AND localization_ok
    Return is_qualified

Qualified frame ratio calculation
Input: all_sync_frames (all synchronization frames in the raw sensor data corresponding to each segment time period)
Output: qualification_report (quality assessment report, including the qualified frame ratio)
Parameter:
    pass_ratio_threshold = 0.8  # Pass rate threshold of 80%
Algorithm:
    total_frames = len(all_sync_frames)
    qualified_count = 0
    # Frame-by-frame determination
    For each sync_frame in all_sync_frames:
        # Identify the sensor configuration
        sensor_config = identify_sensor_config(sync_frame)
        # Determine whether this frame is qualified
        If check_frame_qualified(sync_frame, sensor_config):
            qualified_count += 1
    # Calculate the qualified frame ratio
    qualification_rate = qualified_count / total_frames
    # Determine whether the overall result passes
    overall_pass = (qualification_rate >= pass_ratio_threshold)
    # Generate the quality assessment report, including the qualified frame ratio
    qualification_report = {
        'total_frames': total_frames,
        'qualified_frames': qualified_count,
        'qualification_rate': qualification_rate * 100,  # Convert to percentage
        'pass_threshold': pass_ratio_threshold * 100,
        'overall_pass': overall_pass,
        'recommendation': generate_recommendation(qualification_rate)
    }
    Return qualification_report

Following the above embodiments of this application, one aspect of this application provides a method for processing sensor data in intelligent driving, which further includes: determining the pass rate threshold based on the requirements of the business scenario; and determining whether the qualified frame ratio is greater than or equal to the pass rate threshold. If so, the raw sensor data of the segment time period whose qualified frame ratio is greater than or equal to the pass rate threshold is taken as the raw sensor data to be submitted for annotation. If not, the raw sensor data of the segment time period whose qualified frame ratio is less than the pass rate threshold is discarded.
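The ratio calculation and threshold decision above can be sketched in Python as follows. The sketch takes the per-frame qualification check as a callable so that it stays self-contained. Returning a ratio of 0 and a failing result for a clip with no synchronization frames is an assumption added here to avoid a division by zero; the text does not cover that case.

```python
from typing import Any, Callable, Dict, Iterable


def build_qualification_report(sync_frames: Iterable[Any],
                               is_qualified: Callable[[Any], bool],
                               pass_ratio_threshold: float = 0.8) -> Dict[str, Any]:
    """Summarize one clip: counts, qualified frame ratio (percent), pass/fail."""
    frames = list(sync_frames)
    total = len(frames)
    qualified = sum(1 for frame in frames if is_qualified(frame))
    # Assumption: an empty clip has ratio 0 and does not pass.
    rate = qualified / total if total else 0.0
    return {
        "total_frames": total,
        "qualified_frames": qualified,
        "qualification_rate": rate * 100,
        "pass_threshold": pass_ratio_threshold * 100,
        "overall_pass": total > 0 and rate >= pass_ratio_threshold,
    }
```

A clip with 8 qualified frames out of 10 reaches the 80% threshold exactly and therefore passes, because the comparison is greater than or equal to the threshold.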

[0036] Following the preferred embodiment of this application in which the pass rate threshold is 80% (pass_ratio_threshold = 0.8), the following code can be used to decide whether the raw sensor data is submitted for annotation:

Input: qualification_report (a quality assessment report containing the qualified frame ratio)
Output: should_send_for_annotation (whether to submit for annotation)
Judgment rules:
    If qualification_report['qualification_rate'] >= 80:
        # The qualified frame ratio meets the standard; create a clip and submit it for annotation
        should_send_for_annotation = True
        logger.info(f"Qualified frame ratio {qualification_report['qualification_rate']}% meets the standard, create a clip and submit it for annotation")
    Else:
        # The qualified frame ratio does not meet the standard; do not create a clip, discard the data
        should_send_for_annotation = False
        logger.info(f"Qualified frame ratio {qualification_report['qualification_rate']}% is less than 80%, not submitted for annotation")
    Return should_send_for_annotation

Furthermore, the following code can be used to calculate the qualified frame ratio and make the annotation submission decision for all synchronization frames in the raw sensor data of each segment time period:

Input: clip_sync_frames (all synchronization frames within a clip)
Output: should_send_for_annotation (whether to submit for annotation)
Parameter:
    qualification_threshold = 0.8  # 80% pass rate threshold
Algorithm:
    total_frames = len(clip_sync_frames)
    qualified_count = 0
    # Count the number of qualified frames
    For each sync_frame in clip_sync_frames:
        If is_frame_qualified(sync_frame):
            qualified_count += 1
    # Calculate the qualified frame ratio
    qualification_rate = qualified_count / total_frames
    # Annotation submission decision
    should_send_for_annotation = (qualification_rate >= qualification_threshold)
    If not should_send_for_annotation:
        logger.info(f"Qualified frame ratio {qualification_rate*100}% is less than 80%, not submitted for annotation")
    Return {
        'qualification_rate': qualification_rate * 100,
        'qualified_frames': qualified_count,
        'total_frames': total_frames,
        'should_send': should_send_for_annotation
    }

In this preferred embodiment, an 80% pass rate threshold determines whether raw sensor data is submitted for annotation, which ensures the overall quality of the submitted raw sensor data. The screening decision is binary: data that meets the standard is submitted for annotation, and data that fails to meet it is discarded, which further ensures the quality of the raw sensor data selected for annotation (corresponding to the step in Figure 2 in which a clip is generated and submitted if the standard is met, and the data is discarded otherwise).

[0037] Following the above embodiments of this application, one aspect of this application provides a method for processing sensor data in intelligent driving, which further includes: identifying all candidate marker frames and their positions in the raw sensor data submitted for annotation; and, for each candidate marker frame, performing the following operation until the final marker frames of the raw sensor data submitted for annotation are obtained. Determine whether the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration and the image quality required by the first actual application scenario. If so, the current candidate marker frame is determined to be a final marker frame. If not, determine whether the next candidate marker frame meets the integrity requirements of the vehicle's sensor configuration. If it does, the mark of the current candidate marker frame is discarded and the next candidate marker frame is determined to be the final marker frame. If it does not, determine whether the candidate marker frame preceding the current one meets the integrity requirements of the vehicle's sensor configuration. If so, the marks of the current and the next candidate marker frames are discarded and the previous candidate marker frame is determined to be the final marker frame; otherwise, the marks of the current, the next, and the previous candidate marker frames are all discarded.

[0038] In a preferred embodiment of this application, the selection and adjustment of all marker frames in the raw sensor data submitted for annotation can be achieved through the following code (corresponding to Phase Two, synchronization frame statistics, in Figure 2) to obtain the final marked frames:

Input: clip_sync_frames (150 frames of raw sensor data corresponding to one segment time period, assuming 10Hz acquisition for 15 seconds)
Output: marked_frames (a list of final marked frames)
Parameters:
    marking_interval = 10  # Extract 1 frame out of every 10 (per the requirements of the first actual application scenario)
    marking_range_start = 20  # First frame number of the marking range
    marking_range_end = 110  # Last frame number of the marking range
Algorithm:
    candidate_mark_frames = []
    # First, identify the candidate marker frames
    For frame_n in range(0, len(clip_sync_frames)):
        # Determine whether it is a candidate marker frame
        is_candidate = (frame_n % marking_interval == 0) AND (marking_range_start <= frame_n <= marking_range_end)
        If is_candidate:
            candidate_mark_frames.append(frame_n)
    # Next, adjust the position of each candidate marker frame (±1 frame tolerance)
    final_marked_frames = []
    For each candidate_frame_n in candidate_mark_frames:
        # Check whether the current candidate marker frame itself meets the integrity requirements
        If check_frame_completeness(clip_sync_frames[candidate_frame_n]):
            final_marked_frames.append(candidate_frame_n)
            continue
        # The current candidate does not meet the requirements, try the next frame (n+1)
        If candidate_frame_n + 1 < len(clip_sync_frames):
            If check_frame_completeness(clip_sync_frames[candidate_frame_n + 1]):
                final_marked_frames.append(candidate_frame_n + 1)
                continue
        # The next frame also fails, try the previous frame (n-1)
        If candidate_frame_n - 1 >= 0:
            If check_frame_completeness(clip_sync_frames[candidate_frame_n - 1]):
                final_marked_frames.append(candidate_frame_n - 1)
                continue
        # Neither the next nor the previous frame meets the requirements, so the candidate is discarded
        logger.warning(f"Marker frame {candidate_frame_n} and its preceding and following frames do not meet the integrity requirements, so the mark is abandoned")
    # Then, update the mark state of all synchronization frames
    For frame_n in range(len(clip_sync_frames)):
        If frame_n in final_marked_frames:
            clip_sync_frames[frame_n]["whether to mark"] = True
        Else:
            clip_sync_frames[frame_n]["whether to mark"] = False
    Return final_marked_frames  # These are the final marked frames

The function that checks whether a candidate marker frame meets the integrity requirements of the vehicle's sensor configuration can be the following:

Function check_frame_completeness(sync_frame):
    required_topics = [
        "localization",
        "sensor_camera_backward_orig_h264",
        "sensor_camera_forward_far_orig_h264",
        "sensor_camera_forward_wide_orig_h264",
        "sensor_camera_pano_leftfront_orig_h264",
        "sensor_camera_pano_leftrear_orig_h264",
        "sensor_camera_pano_rightfront_orig_h264",
        "sensor_camera_pano_rightrear_orig_h264",
        "sensor_lidar_front_orig_pcloud",
        "sensor_lidar_left_orig_pcloud",
        "sensor_lidar_rear_orig_pcloud",
        "sensor_lidar_right_orig_pcloud"
    ]
    For each topic in required_topics:
        If sync_frame['frame_data'][topic] is None or the data is empty:
            Return False
    Return True

In this preferred embodiment, a ±1 frame tolerance mechanism is adopted: if the current candidate marker frame does not meet the requirements, the next frame and then the previous frame are checked, which improves the utilization of marker frames. Within the marking range of frames 20 to 110, one frame is extracted every 10 frames (frame_n % 10 == 0) for marker frame screening and adjustment, so that only final marker frames worth annotating are selected, thereby reducing the manual annotation workload.
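The ±1 frame tolerance described above can be sketched compactly in Python. The sketch is illustrative: the completeness check is passed in as a callable, and the fallback order (current frame, then n+1, then n-1) follows the order given in the text.

```python
from typing import Any, Callable, List, Sequence


def select_marked_frames(frames: Sequence[Any],
                         is_complete: Callable[[Any], bool],
                         interval: int = 10,
                         range_start: int = 20,
                         range_end: int = 110) -> List[int]:
    """Return the indices of the final marked frames."""
    final = []
    for n in range(len(frames)):
        # Candidates: every interval-th frame inside the marking range.
        if n % interval != 0 or not (range_start <= n <= range_end):
            continue
        # Try the candidate itself, then the next frame, then the previous one.
        for candidate in (n, n + 1, n - 1):
            if 0 <= candidate < len(frames) and is_complete(frames[candidate]):
                final.append(candidate)
                break
        # If none of the three is complete, the candidate is dropped.
    return final
```

With 150 frames and the default parameters there are 10 candidates (frames 20, 30, up to 110), and each one is either kept, shifted by one frame, or dropped.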

[0039] In another preferred embodiment of this application, for actual application scenarios that require annotation, the following code can be used to extract 1 frame out of every 10 from all synchronization frames in the raw sensor data of the selected segment time period, and to select and mark the valid frames:

Input: all_sync_frames (all synchronization frames of the raw sensor data that meets the standard)
Output: selected_frames (selected frames), sync_frame_info_clip_all.json
Algorithm:
    selected_frames = []
    frame_counter = 0
    For each sync_frame in all_sync_frames:
        frame_counter += 1
        # Select 1 frame every 10 frames
        If frame_counter % 10 == 0:
            # Check frame quality
            If check_frame_quality(sync_frame):
                sync_frame["whether to mark"] = True
                sync_frame["marked content"] = detect_scene_type(sync_frame)
                selected_frames.append(sync_frame)
            Else:
                sync_frame["whether to mark"] = False
    Save as sync_frame_info_clip_all.json
    Return selected_frames

The function for frame quality checking is:

Function check_frame_quality(sync_frame):
    # Check whether the required sensor data exists
    If sync_frame['lidar_front'] is None:
        Return False
    If sync_frame['camera_forward_far'] is None:
        Return False
    If sync_frame['lidar_front'].point_count < minimum point count threshold:
        Return False
    # Check image quality (e.g., brightness)
    image = load_image(sync_frame['camera_forward_far'])
    If calculate_brightness(image) < brightness threshold:
        Return False
    Return True

Following the above embodiments of this application, one aspect of this application provides a method for processing sensor data in intelligent driving, which further includes: obtaining the requirements of the second actual application scenario, and determining from them the application pass rate threshold, the sliding window size, and the start frame number of the sliding window search; pre-determining the initial marks of all synchronization frames in the raw sensor data submitted for annotation; and, starting from the synchronization frame at the start frame number, sliding the window and computing the qualified frame ratio of each window of the sliding window size in the raw sensor data submitted for annotation. The window with the highest qualified frame ratio is determined to be the optimal sliding window. The synchronization frames outside the optimal sliding window are marked as unqualified, while the synchronization frames inside it keep their original marks. It is then determined whether the qualified frame ratio of the raw sensor data corresponding to the optimal sliding window is greater than the application pass rate threshold. If so, the raw sensor data corresponding to the optimal sliding window is used in the actual application scenario. If not, failure information is recorded, indicating that the raw sensor data corresponding to the optimal sliding window could not be used.
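The sliding window search described in the preceding paragraph can be sketched in Python. The sketch rests on assumptions not fixed by the text: the initial marks are given as a list of booleans, frame numbers are 1-based, and ties are resolved in favor of the earliest window. The default values (window of 50 frames, start frame 20, step 1) are example parameters only.

```python
from typing import List, Optional, Tuple


def best_window(marks: List[bool], window_size: int = 50, start_frame: int = 20,
                step: int = 1) -> Tuple[Optional[int], Optional[int], float]:
    """marks[i] is the initial mark of frame number i + 1 (1-based numbering).

    Returns (first frame, last frame, qualified ratio) of the best window,
    or (None, None, 0.0) if no window fits or no frame is marked."""
    total = len(marks)
    best_start, best_end, best_ratio = None, None, 0.0
    start = start_frame
    while start + window_size - 1 <= total:
        window = marks[start - 1:start - 1 + window_size]
        ratio = sum(window) / window_size  # True counts as 1
        # Strict comparison keeps the earliest window among equal ratios.
        if ratio > best_ratio:
            best_start, best_end, best_ratio = start, start + window_size - 1, ratio
        start += step
    return best_start, best_end, best_ratio
```

The returned ratio can then be compared with the application pass rate threshold to decide whether the frames of the best window are used or a failure is recorded.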

[0040] In the embodiments of this application, the pass rate threshold and the application pass rate threshold can take different values and be adjusted according to the business scenario, thereby adapting to the application scenario in which the compliant raw sensor data is used. In a preferred embodiment for the high-precision map building scenario, the 50 consecutive frames with the highest quality can be found through a sliding window search and used for map building, ensuring the quality of the map building data. In this preferred embodiment, the application pass rate threshold can preferably be 90%, the sliding window size can preferably be 50 frames, the starting frame number can preferably be the 20th frame, and the sliding window step size can preferably be 1 (corresponding to the extension in the code of Figure 2: sliding window search for the best 50 frames). This can be specifically implemented using the following pseudocode.

First, the initial mark pre-determination:

Input: all_sync_frames (all synchronization frames)
Output: preliminary_marked_frames

Algorithm:
    For frame_n in range(1, len(all_sync_frames) + 1):
        sync_frame = all_sync_frames[frame_n - 1]
        # Starting from frame 20, check whether the integrity requirements are met
        If frame_n >= 20:
            If check_required_topics_complete(sync_frame):
                sync_frame["whether to mark"] = True
            Else:
                sync_frame["whether to mark"] = False
        Else:
            sync_frame["whether to mark"] = False

Then, a sliding window is used to search for the best 50 frames:

Input: one_clip_json_dict (containing all frames and their initial marks), window_size = 50
Output: best_window_start, best_window_end, best_true_ratio

Parameters:
    window_size = 50    # Sliding window size
    start_frame = 20    # Starting frame number
    step = 1            # Sliding window step size

Algorithm:
    best_window_start = None
    best_window_end = None
    best_true_ratio = 0.0
    total_frames = number of frames in one_clip_json_dict
    # Start sliding the window from frame 20
    current_start = start_frame
    While current_start + window_size - 1 <= total_frames:
        window_end = current_start + window_size - 1
        # Count the number of frames marked True in the current sliding window
        true_count = 0
        total_count = 0
        For frame_num in range(current_start, current_start + window_size):
            frame_key = f"Frame_{frame_num}"
            If frame_key in one_clip_json_dict:
                frame_data = one_clip_json_dict[frame_key]
                If frame_data.get("whether to mark") == True:
                    true_count += 1
                total_count += 1
        # Calculate the qualified frame ratio
        If total_count > 0:
            true_ratio = true_count / total_count
            logger.info(f"Sliding window: Frame_{current_start} to Frame_{window_end}, "
                        f"Number of qualified frames: {true_count} / {total_count}, Qualified frame ratio: {true_ratio:.2%}")
            # Update the best sliding window
            If true_ratio > best_true_ratio:
                best_true_ratio = true_ratio
                best_window_start = current_start
                best_window_end = window_end
                logger.info(f"Found new best sliding window: Frame_{best_window_start} to Frame_{best_window_end}, "
                            f"Best true ratio: {best_true_ratio:.2%}")
        # Move to the next sliding window position
        current_start += step
    Return (best_window_start, best_window_end, best_true_ratio)

Next, the frame marks outside the optimal sliding window are adjusted:

Input: one_clip_json_dict, best_window_start, best_window_end
Output: adjusted one_clip_json_dict

Algorithm:
    # Change the marks of all frames outside the optimal sliding window to False, i.e., deem them unqualified
    For each frame_key in one_clip_json_dict.keys():
        frame_num = int(frame_key.split('_')[1])
        # Check whether the frame is outside the optimal sliding window
        If frame_num < best_window_start OR frame_num > best_window_end:
            one_clip_json_dict[frame_key]["whether to mark"] = False
        # Frames within the optimal sliding window retain their original mark (True or False)
    Return one_clip_json_dict

Finally, the quality judgment within the optimal sliding window:

Input: best_true_ratio (the qualified frame ratio of the optimal sliding window)
Output: should_generate_map_build_clip (whether to generate a map build clip)

Parameters:
    best_ratio_limit = 0.9    # 90% is the application pass rate threshold

Algorithm:
    If best_true_ratio >= best_ratio_limit:
        # Generate a JSON file for the map build clip
        output_json_path = f"{bag_name}_clip_{clip_num}_map_build.json"
        save_json(one_clip_json_dict, output_json_path)
        logger.info(f"Optimal sliding window ratio {best_true_ratio:.2%} has been achieved, generating map clip")
        Return True
    Else:
        # Do not generate a map clip; log failure information
        output_json_path = "not_pass_for_map_build.json"
        failure_info = {
            "limit": f"{best_ratio_limit:.2%}",
            "data_ratio": f"{best_true_ratio:.2%}"
        }
        save_json(failure_info, output_json_path)
        logger.warning(f"Optimal sliding window ratio {best_true_ratio:.2%} < {best_ratio_limit:.2%}, no map clip generated")
        Return False

The following example illustrates how the optimal sliding window is determined among the sliding window positions. Assume a clip has 150 frames (Frame_1 to Frame_150):

    Sliding window 1: Frame_20~Frame_69 (50 frames), qualified frame ratio 85%
    Sliding window 2: Frame_21~Frame_70 (50 frames), qualified frame ratio 88%
    Sliding window 3: Frame_22~Frame_71 (50 frames), qualified frame ratio 92% ← Optimal sliding window
    Sliding window 4: Frame_23~Frame_72 (50 frames), qualified frame ratio 83%
    ......
    Sliding window 82: Frame_101~Frame_150 (50 frames), qualified frame ratio 80%

Therefore, the selection result based on the qualified frame ratio is:
- Frame_22~Frame_71 retain their original marks (mostly True).
- The marks of Frame_1~Frame_21 are changed to False.
- The marks of Frame_72~Frame_150 are changed to False.
- Finally, a map build clip containing the optimal 50-frame window is generated from Frame_22~Frame_71.

In this preferred embodiment, the 50 consecutive frames with the highest quality are selected automatically and checked against the 90% application pass rate threshold. This ensures a high quality level for the mapping data, avoids manual selection and improves efficiency by a factor of 100, improves mapping accuracy, and reduces problems such as mapping failure caused by poor data quality.
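As a non-limiting illustration, the sliding window search, mark adjustment, and threshold judgment of paragraph [0040] might be sketched in Python as follows. The sketch assumes that the clip dictionary is keyed `Frame_1` through `Frame_N` without gaps and that every frame already carries a boolean "whether to mark" entry; the function names are chosen for this sketch, and logging and JSON output are omitted.

```python
from typing import Dict, Optional, Tuple

MARK_KEY = "whether to mark"
Clip = Dict[str, Dict[str, bool]]


def find_best_window(
    clip: Clip, window_size: int = 50, start_frame: int = 20, step: int = 1
) -> Tuple[Optional[int], Optional[int], float]:
    """Return (start, end, ratio) of the window with the highest qualified frame ratio."""
    total_frames = len(clip)  # assumes keys Frame_1..Frame_N with no gaps
    best_start: Optional[int] = None
    best_end: Optional[int] = None
    best_ratio = 0.0
    start = start_frame
    while start + window_size - 1 <= total_frames:
        end = start + window_size - 1
        marks = [
            clip[f"Frame_{n}"][MARK_KEY]
            for n in range(start, end + 1)
            if f"Frame_{n}" in clip
        ]
        if marks:
            ratio = sum(marks) / len(marks)
            # Strict comparison keeps the earliest window when ratios tie.
            if ratio > best_ratio:
                best_start, best_end, best_ratio = start, end, ratio
        start += step
    return best_start, best_end, best_ratio


def mask_outside_window(clip: Clip, best_start: Optional[int], best_end: Optional[int]) -> Clip:
    """Set the mark to False for every frame outside the optimal window."""
    for key, frame in clip.items():
        n = int(key.split("_")[1])
        outside = best_start is None or best_end is None or n < best_start or n > best_end
        if outside:
            frame[MARK_KEY] = False
    return clip


def passes_map_build(best_ratio: float, limit: float = 0.9) -> bool:
    """Apply the application pass rate threshold (90% in the preferred embodiment)."""
    return best_ratio >= limit
```

With a 150-frame clip, a window size of 50, a start frame of 20, and a step of 1, the loop visits 82 window positions, matching the worked example above.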

[0041] In the embodiments of this application, a topic integrity pre-selection mechanism quickly checks that the necessary camera, LiDAR, and localization topics are present. It assesses the completeness of the sensor data for each topic and filters out bag files with obviously missing data, improving overall processing efficiency. Using the forward LiDAR timestamp as the baseline, the data integrity of each sensor is counted within its respective time window, so that multi-sensor synchronization frame statistics are based on a single reference. This accurately measures the completeness and synchronization of multi-sensor data and quantitatively evaluates how well the data of the sensors align. The qualified frame ratio is calculated from sensor completeness (LiDAR + camera + localization), and a detailed quality assessment report is generated that traces the specific sources of quality issues and guides improvements to the data acquisition process. A binary decision is made based on a pass rate threshold (e.g., 80%): data meeting the threshold is taken as raw sensor data submitted for annotation, while data failing it is discarded, which establishes a clear quality threshold and ensures overall data quality. Frame selection strategies for different application scenarios are supported, such as annotation (1 frame selected out of every 10, plus an error tolerance mechanism) and mapping (sliding window search for the best 50 frames), meeting the needs of various application scenarios (annotation, training, mapping) and allowing quality standards and screening strategies to be adjusted dynamically. Processing is fully automated without manual intervention, significantly reducing quality inspection cost and time.
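As a non-limiting illustration, the qualified frame determination and the binary pass rate decision summarized in paragraph [0041] might be sketched in Python as follows. The topic names are hypothetical placeholders, the representation of a synchronization frame as a mapping from topic name to a list of data frames is an assumption of this sketch, and the "radar" channels of the claims are assumed here to be LiDAR topics. The requirement of at least 4 localization data frames per synchronization frame follows claim 5, and the 80% threshold is the example value given above.

```python
from typing import Dict, List, Set

SyncFrame = Dict[str, List[int]]  # topic name -> data frames (here: timestamps)

LOCALIZATION_TOPIC = "/localization"  # hypothetical topic name


def required_topics(num_lidar: int, num_camera: int) -> Set[str]:
    """Build the set of required topics for a sensor configuration (names are hypothetical)."""
    topics = {f"/lidar_{i}" for i in range(num_lidar)}
    topics |= {f"/camera_{i}" for i in range(num_camera)}
    topics.add(LOCALIZATION_TOPIC)
    return topics


def is_qualified(frame: SyncFrame, required: Set[str], min_loc_frames: int = 4) -> bool:
    """A frame qualifies when every required topic has data and localization has enough frames."""
    if any(not frame.get(topic) for topic in required):
        return False
    return len(frame.get(LOCALIZATION_TOPIC, [])) >= min_loc_frames


def qualified_frame_ratio(sync_frames: List[SyncFrame], required: Set[str]) -> float:
    """Number of qualified synchronization frames divided by the total number of frames."""
    if not sync_frames:
        return 0.0
    return sum(is_qualified(f, required) for f in sync_frames) / len(sync_frames)


def decide(ratio: float, pass_rate_threshold: float = 0.8) -> str:
    """Binary decision: keep the segment for submission or discard it."""
    return "submit" if ratio >= pass_rate_threshold else "discard"
```

For the 4-channel plus 7-camera configuration, `required_topics(4, 7)` yields 12 topics; a segment in which 8 of 10 synchronization frames qualify has a ratio of 0.8 and is kept under the 80% example threshold.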

[0042] In the embodiments of this application, a four-layer quality assurance system of "data pre-selection - synchronization frame statistics - qualified frame ratio calculation - data filtering based on frame mark adjustment" is established to ensure that the data selected for later use is of high quality, thus ensuring the overall quality level of the selected data.

[0043] According to another aspect of this application, a non-volatile storage medium is also provided, on which computer-readable instructions are stored, which, when executed by a processor, cause the processor to implement the intelligent driving sensor data processing method as described above.

[0044] According to another aspect of this application, an intelligent driving sensor data processing device is also provided, wherein the device includes: One or more processors; Computer-readable medium for storing one or more computer-readable instructions. When the one or more computer-readable instructions are executed by the one or more processors, the one or more processors implement the intelligent driving sensor data processing method described above.

[0045] For details of the various embodiments of the intelligent driving sensor data processing device, please refer to the corresponding parts of the above-described embodiments of the intelligent driving sensor data processing method, which will not be repeated here.

[0046] In summary, this application first acquires raw sensor data collected by an autonomous vehicle, and obtains the business scenario requirements and the vehicle's sensor configuration. Next, based on the vehicle's sensor configuration or the business scenario requirements, it performs invalid-data filtering on the raw sensor data to obtain valid raw sensor data. This filters out obviously invalid data in the early stages of data processing, improving overall processing efficiency. Then, based on the business scenario requirements, the valid raw sensor data is segmented to obtain the raw sensor data corresponding to each segment time period. Using the forward LiDAR timestamp as the base time, and employing differentiated time windows for sensors of different topics, all data frames of the different topics within each time window of the raw sensor data corresponding to the segment time period are taken as the synchronization frame corresponding to the base time. Each synchronization frame is a set of multiple data frames. The synchronization frame corresponding to each time point of the raw sensor data is determined for each segment time period, the time points being all forward LiDAR timestamps. This enables accurate statistics on the completeness and synchronization of the sensor data and a quantitative evaluation of how well the data of the sensors align. Finally, based on the vehicle's sensor configuration, it is determined whether the synchronization frame corresponding to each time point is qualified, and a quality assessment report is generated for the raw sensor data corresponding to each segment time period. The quality assessment report includes a qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data corresponding to the segment time period divided by the total number of frames. The qualified frame ratio enables a quantitative assessment of the overall quality level of the raw sensor data, ensuring that the raw sensor data used in specific practical application scenarios is of high quality.
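As a non-limiting illustration, the synchronization frame extraction summarized in paragraph [0046] might be sketched in Python as follows, using the differentiated time windows recited in claim 4 (15 ms before and after for radar sensors, assumed here to be LiDAR; 50 ms before and after for cameras; 75 ms before and 25 ms after for localization; 50 ms before and after for other topics). Integer millisecond timestamps, inclusive window bounds, sorted timestamp lists, and the topic names in the usage note are assumptions of this sketch.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List, Tuple

# (milliseconds before, milliseconds after) the base time, per sensor kind.
WINDOWS_MS: Dict[str, Tuple[int, int]] = {
    "lidar": (15, 15),
    "camera": (50, 50),
    "localization": (75, 25),
    "other": (50, 50),
}

# topic name -> (sensor kind, sorted list of timestamps in ms)
Topics = Dict[str, Tuple[str, List[int]]]


def extract_sync_frames(base_timestamps: List[int], topics: Topics) -> List[dict]:
    """Build one synchronization frame per forward LiDAR timestamp."""
    sync_frames = []
    for base in base_timestamps:
        collected: Dict[str, List[int]] = {}
        for name, (kind, stamps) in topics.items():
            before, after = WINDOWS_MS.get(kind, WINDOWS_MS["other"])
            # Binary search for the inclusive range [base - before, base + after].
            lo = bisect_left(stamps, base - before)
            hi = bisect_right(stamps, base + after)
            collected[name] = stamps[lo:hi]
        sync_frames.append({"base_time": base, "topics": collected})
    return sync_frames
```

For example, with a base time of 1000 ms and a hypothetical `/localization` topic holding timestamps 930, 950, 975, 1000, 1020, and 1030, the frame collects the first five entries, since the localization window spans 925 ms to 1025 ms.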

[0047] It should be noted that this application can be implemented in software and/or a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of this application can be executed by a processor to implement the steps or functions described above. Similarly, the software program of this application (including related data structures) can be stored in a computer-readable recording medium, such as RAM, a magnetic or optical drive, a floppy disk, or similar devices. Furthermore, some steps or functions of this application can be implemented in hardware, for example, as circuitry that cooperates with a processor to perform the various steps or functions.

[0048] Furthermore, a portion of this application can be applied as a computer program product, such as computer program instructions, which, when executed by a computer, can invoke or provide the methods and/or technical solutions according to this application through the operation of the computer. The program instructions invoking the methods of this application may be stored in a fixed or removable recording medium, and/or transmitted via data streams in broadcast or other signal-carrying media, and/or stored in the working memory of a computer device operating according to the program instructions. Here, one embodiment of this application includes an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to operate the methods and/or technical solutions based on the foregoing embodiments of this application.

[0049] It will be apparent to those skilled in the art that this application is not limited to the details of the exemplary embodiments described above, and that this application can be implemented in other specific forms without departing from the spirit or essential characteristics of this application. Therefore, the embodiments should be considered exemplary and non-limiting in all respects, and the scope of this application is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be embraced within this application. No reference numerals in the claims should be construed as limiting the scope of the claims. Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in the apparatus claims may also be implemented by a single unit or device in software or hardware. The terms "first," "second," etc., are used to indicate names and do not indicate any particular order.

Claims

1. A method for processing sensor data in intelligent driving, wherein the method includes: acquiring raw sensor data collected by an autonomous vehicle, and obtaining business scenario requirements and the sensor configuration of the vehicle; performing invalid-data filtering on the raw sensor data based on the sensor configuration of the vehicle or the business scenario requirements to obtain valid raw sensor data; segmenting the valid raw sensor data according to the business scenario requirements to obtain the raw sensor data corresponding to each segment time period; using the forward LiDAR timestamp as the base time and adopting differentiated time windows for sensors of different topics, taking all data frames of the different topics within each time window of the raw sensor data corresponding to the segment time period as the synchronization frame corresponding to the base time, the synchronization frame being a set of multiple data frames, and determining the synchronization frame corresponding to each time point of the raw sensor data corresponding to each segment time period, the time points being all forward LiDAR timestamps; and determining, based on the sensor configuration of the vehicle, whether the synchronization frame corresponding to each time point is qualified, and generating a quality assessment report for the raw sensor data corresponding to each segment time period, the quality assessment report including a qualified frame ratio, which is the number of qualified synchronization frames in the raw sensor data corresponding to the segment time period divided by the total number of frames.

2. The method according to claim 1, wherein the step of performing invalid-data filtering on the raw sensor data based on the sensor configuration of the vehicle to obtain valid raw sensor data includes: if the sensors of the vehicle are configured as 4 radar channels and 7 cameras, checking whether the raw sensor data contains complete data corresponding to the 4 radar topics, the 7 camera topics, and the vehicle positioning topic, and if so, determining the raw sensor data with complete topics as valid raw sensor data; or, if the sensors of the vehicle are configured as 8 radar channels and 11 cameras, checking whether the raw sensor data contains complete data corresponding to the 8 radar topics, the 11 camera topics, and the vehicle positioning topic, and if so, determining the raw sensor data with complete topics as valid raw sensor data.

3. The method according to claim 2, wherein, after checking whether the raw sensor data contains complete data corresponding to the 4 radar topics, the 7 camera topics, and the vehicle positioning topic, or after checking whether the raw sensor data contains complete data corresponding to the 8 radar topics, the 11 camera topics, and the vehicle positioning topic, the method further includes: if not, recording the missing-topic information and marking the raw sensor data in which a topic is missing as invalid data.

4. The method according to claim 1, wherein the differentiated time windows adopted for sensors of different topics include: 15 ms before and after for radar sensors, 50 ms before and after for camera sensors, 75 ms before and 25 ms after for vehicle positioning sensors, and 50 ms before and after for other topics; and wherein the step of using the forward LiDAR timestamp as the base time, adopting differentiated time windows for sensors of different topics, taking all data frames of the different topics within each time window of the raw sensor data corresponding to the segment time period as the synchronization frame corresponding to the base time, the synchronization frame being a set of multiple data frames, and determining the synchronization frame corresponding to each time point of the raw sensor data corresponding to each segment time period, the time points being all forward LiDAR timestamps, includes: obtaining all forward LiDAR timestamps and using them as reference times; and performing the following operation at each reference time until the synchronization frame at each time point of the raw sensor data corresponding to each segment time period is obtained: taking the raw radar sensor data frames within 15 ms before and after the reference time, the raw camera sensor data frames within 50 ms before and after the reference time, the raw vehicle positioning sensor data frames from 75 ms before to 25 ms after the reference time, and the raw data frames of other sensors within 50 ms before and after the reference time together as the synchronization frame corresponding to the reference time, each reference time corresponding to one synchronization frame.

5. The method according to claim 3 or 4, wherein the step of determining, based on the sensor configuration of the vehicle, whether the synchronization frame corresponding to each time point is qualified includes: if the sensor configuration of the vehicle includes 4 radar channels and 7 cameras, performing the following operation for the synchronization frame at each time point until the qualification of the synchronization frames at all time points has been determined: checking whether the synchronization frame corresponding to the time point contains complete data corresponding to the 4 radar topics, the 7 camera topics, and the vehicle positioning topic, and whether it contains at least 4 raw vehicle positioning sensor data frames; if so, determining the synchronization frame corresponding to the time point as a qualified frame; otherwise, determining it as an unqualified frame; or, if the sensor configuration of the vehicle includes 8 radar channels and 11 cameras, checking whether the synchronization frame corresponding to the time point contains complete data corresponding to the 8 radar topics, the 11 camera topics, and the vehicle positioning topic, and whether it contains at least 4 raw vehicle positioning sensor data frames; if so, determining the synchronization frame corresponding to the time point as a qualified frame; otherwise, determining it as an unqualified frame.

6. The method according to claim 1, wherein the method further includes: determining a pass rate threshold based on the business scenario requirements; and determining whether the qualified frame ratio is greater than or equal to the pass rate threshold. If so, the raw sensor data corresponding to the segment time period whose qualified frame ratio is greater than or equal to the pass rate threshold is taken as the raw sensor data submitted for annotation. If not, the raw sensor data corresponding to the segment time period whose qualified frame ratio is less than the pass rate threshold is discarded.

7. The method according to claim 6, wherein the method further includes: identifying all candidate marker frames and their positions in the raw sensor data submitted for annotation; and, for each candidate marker frame, performing the following operations until the final marker frames corresponding to the raw sensor data submitted for annotation are obtained: determining whether the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration and the image quality required by the first actual application scenario. If so, the current candidate marker frame is determined as the final marker frame. If not, it is determined whether the next candidate marker frame meets the integrity requirements of the vehicle's sensor configuration. If so, the marker of the current candidate marker frame is discarded, and the next candidate marker frame is determined as the final marker frame. If not, it is determined whether the candidate marker frame preceding the current candidate marker frame meets the integrity requirements of the vehicle's sensor configuration. If so, the markers of the current candidate marker frame and the next candidate marker frame are discarded, and the previous candidate marker frame is determined as the final marker frame; otherwise, the markers of the current candidate marker frame, the next candidate marker frame, and the previous candidate marker frame are all discarded.

8. The method according to claim 6, wherein the method further includes: obtaining the requirements of a second actual application scenario, and determining the application pass rate threshold, the sliding window size, and the starting frame number of the sliding window search based on the requirements of the second actual application scenario; pre-determining the initial mark of every synchronization frame in the raw sensor data submitted for annotation; starting from the synchronization frame corresponding to the starting frame number, sliding the window and counting the qualified frame ratio of the raw sensor data within each sliding window of the raw sensor data submitted for annotation, and determining the sliding window with the highest qualified frame ratio as the optimal sliding window; marking the synchronization frames in the raw sensor data submitted for annotation that are outside the optimal sliding window as unqualified, while keeping the original marks of the synchronization frames within the optimal sliding window; and determining whether the qualified frame ratio of the raw sensor data corresponding to the optimal sliding window is greater than the application pass rate threshold. If so, the raw sensor data corresponding to the optimal sliding window is used in the second actual application scenario; if not, failure information is recorded indicating that the raw sensor data corresponding to the optimal sliding window could not be used.

9. A non-volatile storage medium having stored computer-readable instructions thereon, which, when executed by a processor, cause the processor to perform the method as described in any one of claims 1 to 8.

10. An intelligent driving sensor data processing device, wherein, The device includes: One or more processors; Computer-readable medium for storing one or more computer-readable instructions. When the one or more computer-readable instructions are executed by the one or more processors, the one or more processors perform the method as described in any one of claims 1 to 8.