Edge-Cloud Collaborative Driving Testing and Evaluation Methods and Systems for Intelligent Vehicles

By adopting an edge-cloud collaborative architecture and a multimodal time-series model, the system collaboration, multi-vehicle adaptation, and operational reliability issues of the intelligent connected vehicle driving test system were resolved, realizing intelligent and unified driving tests and evaluations, and improving the real-time performance and stability of the evaluation results.

CN122312349A · Pending · Publication Date: 2026-06-30 · UNIV OF SCI & TECH OF CHINA +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
UNIV OF SCI & TECH OF CHINA
Filing Date
2026-06-03
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing driving test systems suffer from insufficient system architecture coordination, poor compatibility with multiple vehicle models, and low operational reliability in intelligent connected vehicle scenarios. This makes it difficult to achieve unified planning, differentiated evaluation, and efficient management, resulting in inconsistent evaluation results and poor system stability.

Method used

An edge-cloud collaborative architecture is constructed. Through unified timing and time alignment of multimodal sensor data, the edge-side multimodal time series model identifies driving behavior, and quantitative scoring is performed by combining vehicle model parameters and membership functions. Model training and rule optimization are carried out on the cloud side to achieve cross-vehicle scoring consistency and fault-tolerant takeover.

Benefits of technology

It improves the real-time nature, stability, and fairness of driving tests, achieves fair scoring for multiple vehicle types and reliable system operation, and ensures the continuity of the testing process and the consistency of evaluation results.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122312349A_ABST
Patent Text Reader

Abstract

This invention relates to the field of intelligent driving examination technology and discloses an edge-cloud collaborative driving examination and evaluation method and system for intelligent vehicles. The method includes: collecting multimodal sensor data of the vehicle on the end side (the on-vehicle device); identifying driving behavior categories on the edge side through a multimodal time-series model and outputting interpretable evaluation results; and, on the cloud side, collecting historical data to train and optimize the multimodal time-series model, using an evolutionary algorithm to optimize scoring rule parameters to achieve cross-vehicle scoring consistency, and monitoring the edge-side status to execute remote fault-tolerant takeover in case of anomalies. The invention provides an intelligent and scalable integration scheme for driving examination and evaluation of intelligent connected vehicles. It can serve both traditional manually driven vehicles and future intelligent driving vehicles under a unified examination scenario and evaluation framework, maintain the consistency of evaluation standards, and improve the real-time performance, stability, and fairness of the system.

Description

Technical Field

[0001] This invention relates to the field of intelligent driving test technology, specifically to a method and system for edge-cloud collaborative driving test and evaluation for intelligent vehicles.

Background Technology

[0002] With the rapid development of intelligent connected vehicles and intelligent driving technologies, driver qualification examinations and competency assessments are gradually evolving from the traditional model, which is mainly based on manual proctoring and single-point equipment judgment, towards automation, intelligence, and systematization. Especially in the context of future intelligent driving vehicles and human-machine collaborative driving, driver examinations not only need to judge human driving behavior, but also need to objectively evaluate the capabilities of different driving subjects and vehicles within a unified evaluation system.

[0003] Existing driving test systems typically collect vehicle operating status, student driving behavior, and test site environment information through onboard monitoring equipment. The test site server then identifies and judges typical driving behaviors such as crossing lines, abnormal speed, and deviation from the route. For behaviors involving complex traffic elements such as traffic lights and intersections, some systems still rely on roadside signal systems or manual assistance. While these systems automate the testing process to some extent, they are still designed around traditional testing scenarios and are ill-suited to the more complex and consistent testing and evaluation requirements of intelligent connected vehicles.

[0004] First, from a system architecture perspective, existing driving test systems mostly adopt a centralized or weakly collaborative architecture, lacking a unified system plan and collaboration mechanism among the end-side data acquisition devices, the test-site edge servers, and the backend management platform. In large-scale testing or multi-vehicle concurrent scenarios, problems such as unreasonable scheduling of computing resources, increased data transmission latency, and limited system scalability can easily arise, making it difficult to fully leverage the advantages of an edge-cloud collaborative architecture in intelligent connected vehicle scenarios.

[0005] Secondly, in terms of driving behavior recognition and scoring, existing systems typically rely on fixed threshold rules or simple logical judgments, applying uniform standards to indicators such as start-up time, acceleration / deceleration response, and braking distance. This makes it difficult to fully consider the differences in dynamic characteristics, control response, and execution characteristics among different vehicle models. This evaluation method based on a single rule system is prone to inconsistencies in scoring standards in testing scenarios involving multiple vehicle models or future intelligent driving vehicles, affecting the fairness and comparability of the evaluation results.

[0006] Thirdly, regarding system reliability, driving test systems inevitably encounter issues during actual deployment such as malfunctions in end-side data acquisition devices, communication link interruptions, excessive load on edge-side computing nodes, or unstable recognition results. However, existing systems generally lack robust fault-tolerance mechanisms and task migration strategies. When a local node malfunctions, it is difficult to guarantee the continuity of the testing process and the consistency of evaluation results, a problem that is particularly prominent in evaluation scenarios for intelligent connected vehicles.

[0007] Furthermore, in terms of managing examination rules, vehicle parameters, and scoring logic, existing systems largely rely on manual configuration or local maintenance, lacking centralized management, dynamic updates, and unified distribution mechanisms. This makes it difficult to achieve consistent management of rule versions, parameter configurations, and evaluation strategies across different test centers, vehicle models, and testing scenarios. Such a coarse-grained management approach can no longer meet the actual needs of collaborative testing and evaluation across multiple vehicle models and scenarios under intelligent connected vehicle conditions.

[0008] In summary, existing driving test systems still have significant shortcomings in terms of system architecture coordination, intelligent recognition capabilities, multi-vehicle adaptation, and operational reliability. There is an urgent need for an edge-cloud driving test and evaluation method and system for intelligent connected vehicles, which can achieve collaborative processing, differentiated evaluation, and centralized management under a unified architecture, thereby improving the real-time performance, stability, and fairness of the driving test and evaluation process, and providing support for the future testing and competency evaluation of intelligent driving vehicles.

Summary of the Invention

[0009] To address the aforementioned technical problems, this invention provides a method and system for edge-cloud collaborative driving testing and evaluation for intelligent vehicles. By constructing a unified collaborative architecture and evaluation framework, real-time recognition, differentiated evaluation, and adaptive optimization of driving behavior are achieved, ensuring the continuity, stability, and reliable operation of the testing and evaluation process.

[0010] To solve the above-mentioned technical problems, the present invention adopts the following technical solution. In a first aspect, the present invention provides a method for edge-cloud collaborative driving testing and evaluation for intelligent vehicles, comprising:
The end side collects multimodal sensor data from the vehicle, performs unified timing and time alignment, constructs data frames, calculates the health of each data frame, and then uploads the data frames to the edge side. The multimodal sensor data includes images, inertial data, positioning data, and vehicle controller area network (CAN) data.
The edge side receives and reassembles the multimodal sensor data, extracts visual features and vehicle dynamics features, and identifies driving behavior categories through a multimodal time-series model.
Based on the dynamic parameter set of the current vehicle model, the edge side constructs behavioral quantification indicators for the driving behavior category and fuzzifies them through membership functions to obtain a comprehensive quantitative score for the current driving behavior category. Combined with a rule engine driven by the examination state machine, a scoring judgment is made and an interpretable evaluation result is output.
The cloud side collects historical data to train and optimize the multimodal time-series model, uses evolutionary algorithms to optimize scoring rule parameters to achieve cross-vehicle-model scoring consistency, monitors the edge-side status, and performs remote fault-tolerant takeover in case of anomalies.

[0011] In one embodiment, the end side collecting multimodal sensor data from the vehicle, performing unified timing and time alignment, constructing data frames, calculating the health of each data frame, and uploading it to the edge side specifically includes: The end side receives a standard time signal from the edge side to correct the local clock deviation, and aligns the sensor data of the different modalities to a unified time reference through interpolation to construct data frames containing timestamps. The health of each data frame is calculated as $H(t_k) = \{\,q_{\mathrm{img}}(t_k),\ q_{\mathrm{imu}}(t_k),\ q_{\mathrm{can}}(t_k),\ q_{\mathrm{gps}}(t_k)\,\}$, where $q_{\mathrm{img}}$ indicates image sharpness, $q_{\mathrm{imu}}$ indicates the stability of the inertial data, $q_{\mathrm{can}}$ indicates the continuity of the vehicle controller area network (CAN) data, and $q_{\mathrm{gps}}$ indicates the validity of the positioning data; $H(t_k)$ expresses the health of the data frame at time $t_k$, and $t_k$ represents the timestamp of the $k$-th sampling moment after unified time synchronization correction.

[0012] In one embodiment, the edge side receiving and reassembling the multimodal sensor data, extracting visual features and vehicle dynamics features, and identifying driving behavior categories through a multimodal time-series model specifically includes: The edge side performs windowed recombination on multiple consecutive data frames, extracts visual features using a visual encoder, and enhances geometric consistency in combination with optical flow estimation. The edge side reconstructs the vehicle's dynamic feature sequence using vehicle controller area network (CAN) data and inertial data, and then concatenates the enhanced visual features with the dynamic features before inputting them into the multimodal time-series model to identify driving behavior categories.

[0013] In one embodiment, the edge side performing windowed recombination of multiple consecutive data frames, extracting visual features using a visual encoder, and performing geometric consistency enhancement in combination with optical flow estimation specifically includes:
The edge side receives the multimodal sensor data transmitted by the end side in the form of data frames, and buffers and reassembles it according to a preset inference period. Arranged in chronological order, every $T$ data frames constitute a time-series window $W_t = \{F_{t-T+1}, \dots, F_t\}$, where $F_i$ denotes the data frame at time $i$, $T$ is the window length, and $t$ is the time index.
When a missing data frame is detected, it is filled using nearest-neighbor interpolation or model prediction, and the health reference value of the time-series window is calculated: $H_W = \frac{1}{T}\sum_{i=t-T+1}^{t} H_i$, where $H_i$ denotes the health of the data frame at time $i$. The health reference value $H_W$ characterizes the overall reliability of the image data, inertial data, CAN data, and positioning data within the window: $H_W = [\,\bar{h}_{\mathrm{img}},\ \bar{h}_{\mathrm{imu}},\ \bar{h}_{\mathrm{can}},\ \bar{h}_{\mathrm{gps}}\,]$, where the components respectively represent the average health of the image data, inertial data, CAN data, and positioning data within $W_t$.
Visual features are extracted from the image in each data frame of $W_t$: $v_i = E_v(I_i)$, where $E_v$ is the visual encoder, $I_i$ is the image in $F_i$, and $v_i$ is the visual feature at time $i$.
Geometric consistency enhancement of the visual features is performed through optical flow estimation, yielding the enhanced visual features $\tilde{v}_i$. The enhancement combines an optical-flow-guided feature correction operation with feature concatenation, based on the estimated optical flow field.
The visual feature sequence $V = \{\tilde{v}_{t-T+1}, \dots, \tilde{v}_t\}$ is constructed, serving as the visual input for the multimodal time-series model.

[0014] In one embodiment, reconstructing the vehicle's dynamic feature sequence using CAN data and inertial data, concatenating the enhanced visual features with the dynamic features, and inputting the result into the multimodal time-series model to identify driving behavior categories specifically includes:
Fusion weights are assigned to the sensor data of the different modalities based on the health reference value: $w_m = \bar{h}_m \big/ \left(\sum_{n} \bar{h}_n + \epsilon\right)$, where $m \in \{\mathrm{img}, \mathrm{imu}, \mathrm{gps}, \mathrm{can}\}$, $w_m$ is the fusion weight of the sensor data of modality $m$, $\bar{h}_m$ is the average health of the sensor data of modality $m$ within the time-series window $W_t$, and $\epsilon$ is a constant that prevents the denominator from being zero; $\mathrm{img}$, $\mathrm{imu}$, $\mathrm{gps}$, and $\mathrm{can}$ represent image data, inertial data, positioning data, and CAN data, respectively.
The CAN data $c_i$ generated at time $i$, the inertial data $a_i$ collected by the inertial measurement unit at time $i$, and the positioning data $g_i$ collected by the positioning module at time $i$ are combined into the dynamic feature vector $d_i = [\,c_i,\ a_i,\ g_i\,]$, and the dynamic feature sequence $D = \{d_{t-T+1}, \dots, d_t\}$ is constructed.
The edge-side multimodal feature $x_i$ is constructed from the modal health, the enhanced visual feature, and the dynamic feature vector: $x_i = \Psi(w, \tilde{v}_i, d_i)$, where $\Psi$ denotes a feature concatenation or fusion operation.
The multimodal temporal feature sequence is constructed from the multimodal features at consecutive time points: $X_t = \{x_{t-T+1}, \dots, x_t\}$. $X_t$ is input to the multimodal time-series model $\mathcal{T}$, which outputs the temporal embedding feature of the current window: $z_t = \mathcal{T}(X_t)$.
The probability distribution over driving behavior categories is then obtained through the classification output layer: $p_t = \sigma(W_{\mathrm{cls}} z_t + b_{\mathrm{cls}})$, where $W_{\mathrm{cls}}$ and $b_{\mathrm{cls}}$ are the classification parameters and $\sigma$ is the softmax function. The driving behavior category with the maximum probability in $p_t$ is output.

[0015] In one embodiment, the edge side constructing behavioral quantification indicators for the driving behavior category based on the dynamic parameter set of the current vehicle model, and fuzzifying them with membership functions to obtain a comprehensive quantitative score for the current driving behavior category, specifically includes:
The dynamic parameters of the vehicle model include wheelbase, width, allowable longitudinal acceleration range, allowable steering angle range, and steering wheel response coefficient.
The identified driving behavior category is converted into behavioral quantification indicators. These indicators are used as continuous inputs and fuzzified using one or more of a triangular membership function, a trapezoidal membership function, or a Gaussian membership function, to obtain the behavior fuzzy vector $\mu = [\,\mu_1, \mu_2, \dots, \mu_n\,]$, where $\mu_i$ is the $i$-th fuzzy membership degree and $n$ is the number of fuzzy membership degrees in the behavior fuzzy vector.
The comprehensive quantitative score for the current driving behavior category is calculated from the behavior fuzzy vector: $S = \sum_{i=1}^{n} \omega_i \mu_i$, where $\omega_i$ indicates the scoring weight corresponding to the $i$-th fuzzy membership degree.
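As an illustration of the fuzzification and weighted scoring described in this embodiment, the following is a minimal Python sketch using triangular membership functions only. The indicator (a lateral offset normalized by vehicle width), the three fuzzy sets, and the scoring weights are hypothetical values chosen for the example; the source does not specify them.

```python
def triangular(x, a, b, c):
    """Triangular membership degree of x for a set with feet a, c and peak b.

    Assumes a < b < c. Returns 0 outside (a, c) and 1 at the peak b.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def behavior_fuzzy_vector(indicator, fuzzy_sets):
    """Map one continuous behavior indicator to its membership degrees."""
    return [triangular(indicator, a, b, c) for (a, b, c) in fuzzy_sets]


def comprehensive_score(fuzzy_vector, weights):
    """Weighted sum of membership degrees: S = sum_i w_i * mu_i."""
    if len(fuzzy_vector) != len(weights):
        raise ValueError("one scoring weight is required per membership degree")
    return sum(w * mu for w, mu in zip(weights, fuzzy_vector))


# Hypothetical fuzzy sets over a normalized indicator in [0, 1]:
# "good", "marginal", "poor". Values are illustrative only.
FUZZY_SETS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]
SCORE_WEIGHTS = [100.0, 60.0, 0.0]

if __name__ == "__main__":
    mu = behavior_fuzzy_vector(0.25, FUZZY_SETS)
    print(mu, comprehensive_score(mu, SCORE_WEIGHTS))
```

Adapting the scoring to another vehicle model would then amount to changing the normalization of the indicator and the set boundaries, without touching the scoring code.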

[0016] In one embodiment, making the scoring judgment in combination with the rule engine driven by the examination state machine, and outputting interpretable evaluation results, specifically includes:
An examination state machine is constructed to determine the current test subject based on vehicle position, speed, and examination process control signals, and to automatically activate the corresponding rule subset. The rule engine performs reasoning based on the activated rule subset and the behavior fuzzy vector to obtain a judgment result of deduction, failure, or prompt.
A time-series sliding-window smoothing mechanism is used to smooth the comprehensive quantitative score: $\bar{S}_t = \frac{1}{N}\sum_{j=t-N+1}^{t} S_j$, where $N$ is the length of the sliding window, $S_j$ is the comprehensive quantitative score at time $j$, $t$ is the current moment, $j$ is the time index within the sliding window with $t-N+1 \le j \le t$, and $\bar{S}_t$ is the driving behavior score after sliding-window smoothing.
The scoring conclusion is determined by comparing $\bar{S}_t$ with preset thresholds, and is one of three outcomes: normal, deduction, or unqualified.
The interpretable evaluation results include the driving behavior category, scoring conclusion, rule subset number, vehicle model version number, and behavior fuzzy vector.
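The sliding-window smoothing and threshold-based conclusion can be sketched as follows. This is a minimal Python illustration that assumes a uniform moving average, assumes that a higher score is better, and uses two hypothetical thresholds (`deduct_below`, `fail_below`); the threshold values and their direction are not recoverable from the source.

```python
from collections import deque


class ScoreSmoother:
    """Smooths comprehensive scores over the last `window_len` moments."""

    def __init__(self, window_len, deduct_below, fail_below):
        if fail_below > deduct_below:
            raise ValueError("fail_below must not exceed deduct_below")
        # deque(maxlen=...) discards the oldest score automatically.
        self._scores = deque(maxlen=window_len)
        self._deduct_below = deduct_below
        self._fail_below = fail_below

    def update(self, score):
        """Add the newest score and return (smoothed_score, conclusion)."""
        self._scores.append(score)
        smoothed = sum(self._scores) / len(self._scores)
        if smoothed < self._fail_below:
            conclusion = "unqualified"
        elif smoothed < self._deduct_below:
            conclusion = "deduction"
        else:
            conclusion = "normal"
        return smoothed, conclusion


if __name__ == "__main__":
    smoother = ScoreSmoother(window_len=3, deduct_below=80.0, fail_below=60.0)
    for raw in (90.0, 60.0, 30.0, 30.0):
        print(raw, smoother.update(raw))
```

Smoothing before thresholding means a single noisy frame does not immediately flip the conclusion, at the cost of a delay of up to one window length.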

[0017] In one embodiment, the cloud side collecting historical data to train and optimize the multimodal time-series model specifically includes:
The multimodal time-series model is trained with the total loss $\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \mathcal{L}_{\mathrm{reg}}$, the sum of the cross-entropy loss $\mathcal{L}_{\mathrm{CE}}$ and the modal regularization term $\mathcal{L}_{\mathrm{reg}}$, where $\mathcal{L}_{\mathrm{CE}} = -\sum_{c} y_c \log p_c$; $p_c$ represents the probability output by the multimodal time-series model for category $c$, $y_c$ is the corresponding real driving behavior category label, and $c$ is the index of driving behavior categories. The modal regularization term penalizes the 2-norm $\lVert f_m \rVert_2$ of the feature representations over all modalities, where $M$ is the total number of modalities, $m = 1, \dots, M$, $f_m$ is the feature representation of the sensor data of the $m$-th modality, and $\lambda$ is the regularization weight.
The feature representation of each modality is extracted by a modal encoder: $f_m = E_m(x_m)$, where $x_m$ indicates the sensor data of the $m$-th modality and $E_m$ indicates the feature encoder corresponding to the $m$-th modality.
During training of the multimodal time-series model, the dynamic parameters of the vehicle model are used as conditional inputs to achieve cross-vehicle adaptation.
After training, the cloud side quantizes and prunes the multimodal time-series model to generate an inference version adapted to the edge-side hardware and pushes it to the edge side.

[0018] In one embodiment, using an evolutionary algorithm to optimize scoring rule parameters to achieve cross-vehicle scoring consistency, monitoring the edge-side status, and executing remote fault-tolerant takeover in case of anomalies specifically includes: The lane-crossing tolerance, speed deviation threshold, duration threshold, and membership function parameters are encoded into an optimizable vector and optimized using a genetic algorithm or a differential evolution algorithm to minimize the scoring differences between different test sites and vehicle types; the optimized rule parameters are then pushed to all edge-side servers. The cloud side uses heartbeat packets to detect whether the edge side is in a degraded state, which includes excessive inference latency, model crash, memory overflow, data accumulation, and network disconnection. When the edge side is determined to be in a degraded state, the cloud side automatically takes over the data stream proxied from the end side, uses the cloud-side inference model to identify the driving behavior category, and sends the result back to the edge side until the edge side returns to normal.
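The heartbeat-based degradation check can be sketched as below. This is a minimal Python illustration, not the patented implementation: the heartbeat fields, the limit values, and the function names `degraded_reasons` and `inference_target` are hypothetical, chosen only to cover the five degraded conditions listed in this embodiment.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Status report sent periodically by an edge-side server (assumed fields)."""
    timestamp: float            # send time in seconds, on the shared time base
    inference_latency_ms: float
    model_alive: bool
    memory_used_ratio: float    # 0.0 .. 1.0
    queued_frames: int


def degraded_reasons(hb, now, *, timeout_s=3.0, max_latency_ms=200.0,
                     max_memory_ratio=0.9, max_queue=500):
    """Return the list of degraded conditions; an empty list means healthy."""
    # A missing or stale heartbeat is treated as a network disconnection.
    if hb is None or now - hb.timestamp > timeout_s:
        return ["network disconnection"]
    reasons = []
    if hb.inference_latency_ms > max_latency_ms:
        reasons.append("inference latency")
    if not hb.model_alive:
        reasons.append("model crash")
    if hb.memory_used_ratio > max_memory_ratio:
        reasons.append("memory overflow")
    if hb.queued_frames > max_queue:
        reasons.append("data accumulation")
    return reasons


def inference_target(hb, now):
    """Choose where recognition runs: the cloud takes over while degraded."""
    return "cloud" if degraded_reasons(hb, now) else "edge"


if __name__ == "__main__":
    hb = Heartbeat(10.0, 50.0, True, 0.5, 10)
    print(inference_target(hb, now=11.0), inference_target(hb, now=20.0))
```

Because `inference_target` is re-evaluated on every heartbeat, control returns to the edge side as soon as a healthy heartbeat arrives, which matches the "until the edge side returns to normal" condition.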

[0019] In a second aspect, the present invention provides an edge-cloud collaborative driving examination and evaluation system for intelligent vehicles, used to execute the method of any embodiment of the first aspect, the system comprising:
The end-side device, deployed in the test vehicle, is used to collect multimodal sensor data of the vehicle, perform unified time synchronization and time alignment, and upload the data together with the health of each modality's sensor data to the edge-side server.
The edge-side server, deployed at the test site, is used to receive and reassemble the multimodal sensor data, extract visual features and vehicle dynamic features, identify driving behavior categories through a multimodal time-series model, construct behavioral quantification indicators for the driving behavior category based on the dynamic parameter set of the current vehicle model, fuzzify them through membership functions, and, in combination with the rule engine driven by the examination state machine, make scoring judgments and output interpretable evaluation results.
The cloud-side platform, deployed in the monitoring center, is used to train and optimize the multimodal time-series model, use evolutionary algorithms to optimize scoring rule parameters to achieve cross-vehicle-model scoring consistency, monitor the edge-side status, and perform remote fault-tolerant takeover in case of anomalies.

[0020] Compared with the prior art, the beneficial technical effects of the present invention are: This invention constructs an edge-cloud collaborative architecture to achieve layered intelligent processing and improve real-time performance and scalability.

[0021] This invention introduces multi-source data fusion and edge-side inference to improve the accuracy and robustness of driving behavior recognition.

[0022] This invention achieves fair scoring for multiple vehicle models by using a set of vehicle model parameters and an adjustable membership function.

[0023] This invention improves the overall reliability of the system by providing model training, rule optimization, and fault-tolerant takeover capabilities on the cloud side.

[0024] This invention enables unified, standardized, and intelligent management of the driving test process.

Description of the Drawings

[0025] Figure 1 is a flowchart of the method of the present invention; Figure 2 is a schematic diagram of the system modules of the present invention.

Detailed Implementation

[0026] A preferred embodiment of the present invention will now be described in detail with reference to the accompanying drawings.

[0027] As shown in Figure 1, the edge-cloud collaborative driving test and evaluation method for intelligent vehicles in this invention includes the following steps:
S1: The end side collects multimodal sensor data of the vehicle, performs unified timing and time alignment, constructs data frames, and uploads the data frames to the edge side after calculating their health. The multimodal sensor data includes images, inertial data, positioning data, and vehicle controller area network (CAN) data.
S2: The edge side receives and reassembles the multimodal sensor data, extracts visual features and vehicle dynamics features, and identifies driving behavior categories through a multimodal time-series model.
S3: Based on the dynamic parameter set of the current vehicle model, the edge side constructs behavioral quantification indicators for the driving behavior category and fuzzifies them through membership functions to obtain a comprehensive quantitative score for the current driving behavior category. Combined with the rule engine driven by the examination state machine, a scoring judgment is made and an interpretable evaluation result is output.
S4: The cloud side collects historical data to train and optimize the multimodal time-series model, uses evolutionary algorithms to optimize the scoring rule parameters to achieve cross-vehicle-model scoring consistency, monitors the edge-side status, and performs remote fault-tolerant takeover in case of anomalies.

[0028] The present invention will be described in detail below in several parts.

[0029] 1. End-side data acquisition and processing.

[0030] (1) Initialization and Status Detection of the End-Side Acquisition Module: After the vehicle is started or the testing equipment is powered on, the end-side acquisition system starts up, sequentially loads the drivers for the camera, inertial measurement unit (IMU), GPS module / real-time kinematic (RTK) module, and controller area network (CAN) interface, and completes basic status detection. In this embodiment, the end-side embedded processor performs the following initialization process: Load the drivers and initialize configuration parameters, including camera image resolution, exposure parameters, IMU sampling rate, GPS positioning data format, and the CAN data mapping table.

[0031] Static environment benchmark verification: The end side acquires a short period of data while the vehicle is stationary, estimates the zero bias of the IMU acceleration, checks the camera brightness histogram, and verifies that the CAN data is continuous and stable.

[0032] Hardware link connectivity detection: The end side sends a handshake signal to the edge side. If the edge side returns an acknowledgment, an uplink channel is established for data transmission. If multiple handshake attempts fail, the end side enters a local buffer mode and waits for network recovery.

[0033] Buffer and clock source initialization: A circular buffer is allocated to store a certain length of raw data frames, and a local high-stability clock is initialized to provide a basis for subsequent time synchronization.

[0034] The above initialization steps ensure that each sensor on the end side is in a reliable and controllable working state before formal data acquisition.

[0035] (2) Unified timing and time alignment of multi-source sensors: Since the camera, IMU, GPS module, and CAN bus operate at different sampling frequencies and on different hardware clocks, their original timestamps are not consistent. If the data were uploaded directly to the edge side, the modalities would be misaligned during the inference stage. This invention maps the sensor data of all modalities to the same time reference through a unified timing mechanism.

[0036] Time synchronization mechanism: The end side periodically receives a standard time signal from the edge side and compares the local clock with the standard time to obtain the deviation $\Delta t$, which is used to uniformly correct the timestamps of all modalities: the corrected timestamp of each sample is obtained from its original timestamp by compensating for $\Delta t$.
Sampling period difference handling: Because the modalities are sampled at different rates, their sensor data cannot be aligned naturally. Therefore, this invention constructs a reference time sequence on the unified time base and aligns each modality to the same time points through interpolation.

[0037] Cross-modal synchronization frame construction: The aligned sensor data at each time $t_k$ form a unified data frame. Each data frame records a unique time base number so that the edge side can process the frames in chronological order.

[0038] By using unified time synchronization and synchronous frame construction, the timing consistency of subsequent fusion stages is ensured.
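The clock correction and interpolation-based alignment described in (2) can be sketched as follows. This is a minimal Python illustration under stated assumptions: the clock deviation is defined as local time minus standard time, each modality is reduced to a scalar signal, and linear interpolation is used (the source says only "interpolation"). The names `correct_timestamps`, `interpolate_at`, and `build_frames` are hypothetical.

```python
from bisect import bisect_left


def correct_timestamps(stamps, clock_offset):
    """Map raw local timestamps onto the standard time base.

    Assumes clock_offset = local_time - standard_time.
    """
    return [t - clock_offset for t in stamps]


def interpolate_at(stamps, values, t_ref):
    """Linearly interpolate a scalar signal at t_ref.

    `stamps` must be sorted in increasing order; values are held constant
    outside the sampled range.
    """
    if t_ref <= stamps[0]:
        return values[0]
    if t_ref >= stamps[-1]:
        return values[-1]
    i = bisect_left(stamps, t_ref)          # stamps[i-1] < t_ref <= stamps[i]
    t0, t1 = stamps[i - 1], stamps[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t_ref - t0) / (t1 - t0)


def build_frames(streams, clock_offset, t_start, period, count):
    """Build `count` synchronized frames on a shared reference time sequence.

    `streams` maps a modality name to (raw_timestamps, values).
    """
    corrected = {
        name: (correct_timestamps(stamps, clock_offset), values)
        for name, (stamps, values) in streams.items()
    }
    frames = []
    for k in range(count):
        t_k = t_start + k * period
        frame = {"k": k, "t": t_k}           # k is the unique time base number
        for name, (stamps, values) in corrected.items():
            frame[name] = interpolate_at(stamps, values, t_k)
        frames.append(frame)
    return frames


if __name__ == "__main__":
    streams = {"speed": ([0.5, 1.5], [10.0, 20.0]),
               "yaw_rate": ([0.5, 1.0, 1.5], [0.0, 2.0, 0.0])}
    for frame in build_frames(streams, clock_offset=0.5, t_start=0.0,
                              period=0.5, count=3):
        print(frame)
```

Images cannot be interpolated this way; a practical system would more likely attach the image whose corrected timestamp is nearest to each reference time.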

[0039] (3) Lightweight preprocessing of the images acquired by the camera, the inertial data acquired by the IMU, the CAN data acquired through the CAN interface, and the positioning data acquired by the GPS module: After timing is completed, the present invention performs lightweight preprocessing on the sensor data of each modality to meet the input conditions of edge-side inference.

[0040] (4) Health Calculation: The end side calculates quality indicators for the sensor data of each modality, which are used for edge-side modal weighting: $H(t_k) = \{\,q_{\mathrm{img}}(t_k),\ q_{\mathrm{imu}}(t_k),\ q_{\mathrm{can}}(t_k),\ q_{\mathrm{gps}}(t_k)\,\}$, where $q_{\mathrm{img}}$ indicates image sharpness, $q_{\mathrm{imu}}$ indicates the stability of the inertial data, $q_{\mathrm{can}}$ indicates the continuity of the CAN data, and $q_{\mathrm{gps}}$ indicates the validity of the GPS positioning data. The health scores, used as modal weights in the fusion phase, improve the robustness of edge-side inference.

[0041] Image sharpness $q_{\mathrm{img}}$ is calculated from the sharpness evaluation value of the current image frame, which can be computed from the image gradient energy or the Laplacian variance, and an image sharpness normalization constant.

[0042] Inertial data stability $q_{\mathrm{imu}}$ is calculated from the variance of the acceleration data within a preset time window, the variance of the angular velocity data within the same window, and a normalization constant for inertial data.

[0043] CAN data continuity $q_{\mathrm{can}}$ is calculated from the actual number of CAN messages received within a preset time window and the theoretical number of CAN messages that should be received within that window.

[0044] GPS positioning data validity $q_{\mathrm{gps}}$ is calculated from the positioning error and a positioning error normalization constant.

[0045] All of the above indicators are mapped to the interval $[0, 1]$. The closer a value is to 1, the higher the quality of the corresponding modal data. The indicators are used for weight allocation, anomaly detection, and fault-tolerance decision-making in the subsequent multimodal fusion process.
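The per-modality health indicators can be illustrated with the following Python sketch. The exact normalization formulas are not legible in the source, so every functional form here (clamped ratio, reciprocal of normalized variance, clamped linear error penalty) is an assumption chosen only to satisfy the stated properties: each indicator lies in [0, 1] and a value closer to 1 means better data quality.

```python
from statistics import pvariance


def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def image_sharpness_score(sharpness, norm_const):
    """Assumed form: sharpness value (e.g. Laplacian variance) over a constant."""
    return clamp01(sharpness / norm_const)


def imu_stability_score(accel_samples, gyro_samples, norm_const):
    """Assumed form: decreases as acceleration and angular-rate variance grow."""
    spread = pvariance(accel_samples) + pvariance(gyro_samples)
    return 1.0 / (1.0 + spread / norm_const)


def can_continuity_score(received, expected):
    """Assumed form: received CAN messages over the theoretical count."""
    if expected <= 0:
        return 0.0
    return clamp01(received / expected)


def gps_validity_score(error_m, norm_const):
    """Assumed form: linear penalty on positioning error, clamped at 0."""
    return clamp01(1.0 - error_m / norm_const)


def frame_health(sharpness, accel, gyro, can_received, can_expected, gps_error,
                 *, sharp_norm=200.0, imu_norm=1.0, gps_norm=2.0):
    """Collect the four indicators for one data frame (constants hypothetical)."""
    return {
        "img": image_sharpness_score(sharpness, sharp_norm),
        "imu": imu_stability_score(accel, gyro, imu_norm),
        "can": can_continuity_score(can_received, can_expected),
        "gps": gps_validity_score(gps_error, gps_norm),
    }


if __name__ == "__main__":
    print(frame_health(150.0, [0.1, 0.2, 0.1], [0.0, 0.01, 0.0], 95, 100, 0.5))
```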

[0046] (5) End-side upload strategy and cache management: To cope with network fluctuations, this invention adopts a circular buffer queue on the end side and supports an adaptive upload mechanism: when the network is abnormal, the upload frequency is reduced and data accumulates in the cache; when the network recovers, data frames for key time periods are automatically resent; after the edge side confirms receipt, the old cache entries are removed. This mechanism ensures the continuity of data reception on the edge side and avoids scoring errors caused by missing frames.

[0047] 2. Edge-side recognition of driving behavior categories.
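A minimal Python sketch of the circular buffer with acknowledgment-based cleanup follows. It simplifies the adaptive mechanism: sending is simply paused while the network is abnormal rather than slowed down, and the selection of "key time periods" is not modeled. The class `UploadBuffer` and its method names are hypothetical.

```python
from collections import deque


class UploadBuffer:
    """Fixed-capacity buffer of (sequence_number, payload) frames.

    When full, the oldest frame is overwritten, as in a circular buffer.
    """

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def push(self, seq, payload):
        """Store a newly built data frame."""
        self._frames.append((seq, payload))

    def pending(self):
        """Sequence numbers not yet acknowledged by the receiver."""
        return [seq for seq, _ in self._frames]

    def drain(self, send, network_ok, batch=4):
        """Send up to `batch` of the oldest frames; returns how many were sent.

        Frames stay buffered until acknowledged, so a later call resends
        anything the receiver has not confirmed.
        """
        if not network_ok:
            return 0
        sent = 0
        for seq, payload in list(self._frames)[:batch]:
            send(seq, payload)
            sent += 1
        return sent

    def ack(self, up_to_seq):
        """Drop every buffered frame with sequence number <= up_to_seq."""
        while self._frames and self._frames[0][0] <= up_to_seq:
            self._frames.popleft()


if __name__ == "__main__":
    buf = UploadBuffer(capacity=3)
    for seq in range(5):
        buf.push(seq, {"seq": seq})
    buf.drain(lambda seq, payload: print("sent", seq), network_ok=True, batch=2)
    buf.ack(3)
    print(buf.pending())
```

Keeping frames until they are acknowledged trades memory for delivery guarantees; the fixed capacity bounds that memory, so very long outages still lose the oldest frames.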

[0048] (1) Edge-side data reception and windowed reconstruction: The edge-side server receives the data frames and the set of health scores $H(t_k)$ from the end side, and buffers and reorganizes them according to a preset inference cycle. Arranged in chronological order, every $T$ data frames constitute a time-series window: $W_t = \{F_{t-T+1}, \dots, F_t\}$, where $T$ is the length of the window.

[0049] To avoid breaks in the edge-side processing chain caused by end-side network fluctuations, this invention implements a window-level compensation strategy on the edge side: if data is missing at a certain moment in the received data sequence, the missing frame is filled by nearest-neighbor interpolation or model prediction, thereby ensuring the integrity of the input structure of the behavior recognition model.

[0050] In addition, the edge side calculates the window-level health reference value as follows: $H_W = \frac{1}{T}\sum_{i=t-T+1}^{t} H_i$, where $H_i$ is the health of the data frame at time $i$. The health reference value is used to guide the weight configuration in the modality fusion stage.

[0051] The health reference value $H_W$ characterizes the overall reliability of the image data, inertial data, CAN data, and positioning data within the time window: $H_W = [\,\bar{h}_{\mathrm{img}},\ \bar{h}_{\mathrm{imu}},\ \bar{h}_{\mathrm{can}},\ \bar{h}_{\mathrm{gps}}\,]$, where the components respectively represent the average health of the image data, inertial data, CAN data, and positioning data within the time-series window $W_t$.
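The window assembly, nearest-neighbor filling, and window-level health averaging can be sketched as follows. This is a minimal Python illustration: frames are assumed to be dictionaries carrying a per-modality `"health"` mapping, only the nearest-neighbor option is shown (not model prediction), and the function names are hypothetical.

```python
def build_window(frames_by_index, end_index, length):
    """Assemble the window of `length` frames ending at `end_index`.

    `frames_by_index` maps a frame's time base number to the frame dict.
    A missing frame is replaced by a copy of the nearest received frame
    (ties resolved toward the earlier one) and marked with filled=True.
    """
    if not frames_by_index:
        raise ValueError("no frames received")
    window = []
    for i in range(end_index - length + 1, end_index + 1):
        if i in frames_by_index:
            window.append(frames_by_index[i])
        else:
            nearest = min(frames_by_index, key=lambda j: (abs(j - i), j))
            window.append(dict(frames_by_index[nearest], filled=True))
    return window


def window_health(window):
    """Average each modality's health over all frames in the window."""
    modalities = window[0]["health"].keys()
    return {
        m: sum(frame["health"][m] for frame in window) / len(window)
        for m in modalities
    }


if __name__ == "__main__":
    received = {
        0: {"health": {"img": 1.0, "can": 0.5}},
        2: {"health": {"img": 0.5, "can": 0.5}},
    }
    window = build_window(received, end_index=2, length=3)
    print(window_health(window))
```

Note that a filled frame inherits the health of its donor frame in this sketch; a stricter design might lower the health of filled frames so that fusion trusts them less.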

[0052] (2) Image feature extraction and geometric consistency enhancement: Side-to-side image sequence within the window Visual feature construction is performed to extract spatial structural features related to driving behavior. This invention employs a visual encoder based on a convolutional neural network (CNN) to extract the spatial feature representation of the image. Let the visual encoder be... Then the visual features of a single frame can be obtained: .

[0053] To improve the stability of edge-side features, this invention also introduces an image alignment mechanism that performs optical flow estimation and geometric consistency enhancement on neighboring frames, so that the same static environment remains spatially consistent over time. Let the optical flow field be $O_k$; the enhanced feature can then be represented as $\tilde v_k = v_k \oplus \mathcal{A}(v_k, O_k)$; where $\mathcal{A}(\cdot)$ denotes a feature correction operation guided by optical flow and $\oplus$ denotes feature concatenation.

[0054] The sequence of enhanced visual features $V = \{\tilde v_{k-L+1}, \dots, \tilde v_k\}$ serves as the visual input for the multimodal temporal model.

[0055] (3) Dynamic state reconstruction from CAN and inertial data: The system uses vehicle Controller Area Network (CAN) data and inertial data to construct a dynamic state representation of the vehicle, aiding the understanding of driving behavior. The signal vector $c_k$ formed by the CAN data at each time $t_k$ is combined with the state $a_k$ provided by the inertial data to form a dynamic feature vector $d_k$. Finally, the dynamic feature sequence $D = \{d_{k-L+1}, \dots, d_k\}$ is constructed.

[0056] (4) Multimodal temporal fusion and behavior recognition model inference: Fusion weights are assigned to the sensor data of different modalities based on the health reference values: $w_m = \dfrac{\bar h_m}{\sum_{j} \bar h_j + \epsilon}$; where $m \in \{img, imu, pos, can\}$, $w_m$ is the fusion weight of the sensor data of modality $m$, $\bar h_m$ is the average health of the sensor data of modality $m$ within the timing window $W$, and $\epsilon$ is a constant that prevents the denominator from being zero; $img$, $imu$, $pos$, and $can$ denote image data, inertial data, positioning data, and vehicle Controller Area Network (CAN) data, respectively.
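
The health-based weight assignment can be illustrated with the short sketch below. It assumes the weights are the per-modality average health values normalized by their sum plus a small constant; the function name `fusion_weights` and the value of `eps` are hypothetical.

```python
def fusion_weights(avg_health, eps=1e-6):
    """Normalize per-modality average health into fusion weights."""
    # eps keeps the denominator non-zero even if every modality is unhealthy.
    total = sum(avg_health.values()) + eps
    return {m: h / total for m, h in avg_health.items()}


w = fusion_weights({"img": 0.75, "imu": 1.0, "pos": 0.5, "can": 1.0})
w_all_bad = fusion_weights({"img": 0.0, "imu": 0.0, "pos": 0.0, "can": 0.0})
```

A modality whose health drops within the window therefore contributes proportionally less to the fused feature, and a window in which every modality is unhealthy yields zero weights rather than a division error.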

[0057] The CAN data $c_k$ generated at time $t_k$, the inertial data $a_k$ collected by the inertial measurement unit at time $t_k$, and the positioning data $g_k$ collected by the positioning module at time $t_k$ are combined into the dynamic feature vector $d_k$: $d_k = [c_k; a_k; g_k]$; the dynamic feature sequence $D = \{d_{k-L+1}, \dots, d_k\}$ is then constructed.

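[0058] The edge-side multimodal feature $x_k$ is constructed from the modal health, the enhanced visual features, and the dynamic feature vectors: $x_k = \Phi(w, \tilde v_k, d_k)$; where $w$ denotes the health-based fusion weights, $\tilde v_k$ the enhanced visual feature, $d_k$ the dynamic feature vector, and $\Phi(\cdot)$ a feature concatenation or fusion operation.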

[0059] A multimodal temporal feature sequence $X$ is constructed from the multimodal features at consecutive time points: $X = \{x_{k-L+1}, \dots, x_k\}$; $X$ is input to the multimodal time series model $f_\theta$, which outputs the temporal embedding feature $z$ of the current timing window: $z = f_\theta(X)$; the probability distribution $p$ over driving behavior categories is then obtained through the classification output layer: $p = \mathrm{softmax}(W_c z + b_c)$; where $W_c$ and $b_c$ are classification parameters and $\mathrm{softmax}(\cdot)$ is the softmax function. The driving behavior category with the maximum probability is output. The output driving behavior categories include lane departure, lane keeping, and abnormal speed.
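
The classification output layer can be sketched as a linear layer followed by a softmax, as below. The temporal model itself is not shown; the embedding `z`, the parameter values in `W` and `b`, and the function names are illustrative assumptions only.

```python
import math


def softmax(logits):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(z, W, b, labels):
    """Apply the linear classification layer, then pick the most probable category."""
    logits = [sum(w_ij * z_j for w_ij, z_j in zip(row, z)) + b_i for row, b_i in zip(W, b)]
    p = softmax(logits)
    best = max(range(len(p)), key=p.__getitem__)
    return labels[best], p


labels = ["lane departure", "lane keeping", "abnormal speed"]
z = [0.2, -1.0]                               # hypothetical temporal embedding
W = [[1.0, 0.0], [0.0, 1.0], [0.0, -2.0]]     # hypothetical classification weights
b = [0.0, 0.0, 0.0]
label, p = classify(z, W, b, labels)
```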

[0060] 3. Edge side outputs interpretable evaluation results.

[0061] (1) Mechanism for constructing and loading vehicle dynamics parameter sets: Different vehicle models differ significantly in dynamic characteristics, steering sensitivity, braking performance, and wheelbase length, so the same behavior may manifest differently in different models. For example, under the same trajectory deviation, compact cars typically show smaller steering angle changes, while larger vehicles show greater steering wheel angle changes. Therefore, this invention introduces a vehicle model parameter set to provide differentiated interpretation of the edge-side recognition results.

[0062] Suppose a vehicle model's parameter set includes: wheelbase, vehicle width, allowable longitudinal acceleration range, allowable steering angle range, and steering wheel response coefficient. When a vehicle goes online, the test center management platform sends the corresponding vehicle model version number to the edge, which then maps the version number to the scoring module, ensuring a consistent and fair scoring standard for different vehicle models.

[0063] For example, when the edge side detects a lateral offset of the vehicle, the scoring module converts it into a vehicle-model equivalent offset, that is, the offset after vehicle model normalization. The equivalent offset is used for subsequent fuzzification to ensure that the scoring standards for each vehicle model are consistent and to eliminate the impact of differences in vehicle body size on the evaluation results.
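
One possible form of this normalization is sketched below. The specific conversion is an assumption of this sketch: the offset is rescaled by the ratio of a reference vehicle width to the width of the current vehicle model. The version numbers, parameter values, and names such as `PARAM_SETS` and `equivalent_offset` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleParams:
    wheelbase_m: float
    width_m: float


# Hypothetical parameter sets keyed by vehicle model version number.
PARAM_SETS = {
    "compact-v1": VehicleParams(wheelbase_m=2.6, width_m=1.8),
    "truck-v1": VehicleParams(wheelbase_m=4.5, width_m=2.4),
}
REFERENCE_WIDTH_M = 1.8  # assumed width of the reference vehicle


def equivalent_offset(lateral_offset_m, version):
    # Assumed normalization: express the offset relative to the vehicle width,
    # rescaled to the reference vehicle.
    params = PARAM_SETS[version]
    return lateral_offset_m * REFERENCE_WIDTH_M / params.width_m


compact_eq = equivalent_offset(0.3, "compact-v1")
truck_eq = equivalent_offset(0.4, "truck-v1")
```

Under this assumed rule, a 0.4 m offset of the wider vehicle and a 0.3 m offset of the reference vehicle map to the same equivalent offset and are therefore scored alike.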

[0064] (2) Construction of membership functions and fuzzification of quantitative indicators of driving behavior: The results of driving behavior category recognition are transformed into scoreable continuous quantities, and the degree of behavior is fuzzified using membership functions. The purpose of fuzzification is to provide a smooth input space for the rule engine and avoid unstable deductions due to single-point fluctuations.

[0065] Construction of behavioral quantitative indicators: For example, for lane-crossing behavior, the distance between the vehicle's center and the lane edge can be used: $d(t) = |y_c(t) - y_e(t)|$, from which the degree of violation is defined. These continuous quantities provide the input basis for the membership functions. Here $y_c(t)$ denotes the lateral center position coordinate of the vehicle at time $t$, that is, the lateral position of the vehicle's geometric center in the vehicle coordinate system or global coordinate system; $y_e(t)$ denotes the lateral coordinate of the lane edge at time $t$; and $d(t)$ represents the lateral distance between the vehicle's center and the lane edge, describing the vehicle's current position relative to the lane boundary.

[0066] Membership function construction: This invention can use triangular membership functions, trapezoidal membership functions, and Gaussian membership functions.

[0067] Fuzzy vector generation: The behavior fuzzy vector at time $t$ is $\mu(t) = [\mu_1(t), \mu_2(t), \dots, \mu_n(t)]$, where each $\mu_i(t)$ is a fuzzy membership degree. The vector is passed to the next step, rule reasoning.
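
The three membership function types and the resulting behavior fuzzy vector can be sketched as follows. The fuzzy sets ("safe", "marginal", "violation"), their breakpoints, and the scoring weights are hypothetical values chosen for illustration, and the weighted sum used for the comprehensive score is an assumption consistent with a scoring weight being assigned to each membership degree.

```python
import math


def triangular(x, a, b, c):
    # Rises from a to the peak at b, then falls to c.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    # Rises from a to b, stays at 1 between b and c, falls from c to d.
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def fuzzy_vector(distance_m):
    """Membership of the center-to-lane-edge distance in three hypothetical fuzzy sets."""
    return [
        trapezoidal(distance_m, 0.2, 0.4, 10.0, 11.0),  # "safe": far from the lane edge
        triangular(distance_m, 0.0, 0.2, 0.4),          # "marginal": close to the lane edge
        gaussian(distance_m, 0.0, 0.1),                 # "violation": on the lane edge
    ]


mu = fuzzy_vector(0.2)
weights = [0.0, 0.5, 1.0]                                # hypothetical scoring weights
score = sum(a * m for a, m in zip(weights, mu))
```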

[0068] (3) Rule engine reasoning based on the examination state machine domain: Driving tests consist of multiple subjects, and the scoring logic differs for different behaviors in different subjects. Therefore, this invention constructs an examination state machine that automatically activates the corresponding subset of rules based on the subject.

[0069] 1. State machine construction: The state machine determines the current subject by means of vehicle position, speed, and examination process control signals.

[0070] 2. State-driven rule activation: Taking driving on a curve as an example, the tolerance for driving on a curve is smaller than that for driving on a straight line. Therefore, different rule sets are activated under different states. These rules all reside outside the model and are maintained by the cloud side.

[0071] The rule engine performs reasoning on the activated rule subset and the behavior fuzzy vector to obtain a judgment of deduction, disqualification, or warning.
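
The state-driven rule activation can be illustrated by the sketch below. It is a simplified assumption, not the rule set of the invention: the subjects, rule numbers, thresholds, and control signal names are hypothetical, and the fuzzy input is reduced to a single violation degree in which larger values are assumed to be worse.

```python
from enum import Enum


class Subject(Enum):
    STRAIGHT = "straight driving"
    CURVE = "curve driving"


# Hypothetical rule subsets: the curve subject tolerates less violation than the straight one.
RULES = {
    Subject.STRAIGHT: {"rule_id": "R-STRAIGHT-01", "warn": 0.3, "deduct": 0.6, "fail": 0.9},
    Subject.CURVE: {"rule_id": "R-CURVE-01", "warn": 0.2, "deduct": 0.4, "fail": 0.7},
}


def current_subject(zone, speed_mps, control_signal):
    # Hypothetical state logic using position (zone), speed, and a process control signal.
    if control_signal == "exam_stopped" or speed_mps <= 0.0:
        return None  # no subject is active, so no rule subset is activated
    return Subject.CURVE if zone == "curve" else Subject.STRAIGHT


def infer(subject, violation_degree):
    """Evaluate the rule subset activated by the current subject."""
    if subject is None:
        return None, "none"
    rule = RULES[subject]
    if violation_degree >= rule["fail"]:
        verdict = "disqualification"
    elif violation_degree >= rule["deduct"]:
        verdict = "deduction"
    elif violation_degree >= rule["warn"]:
        verdict = "warning"
    else:
        verdict = "none"
    return rule["rule_id"], verdict


on_curve = infer(current_subject("curve", 5.0, "running"), 0.5)
on_straight = infer(current_subject("straight", 5.0, "running"), 0.5)
stopped = infer(current_subject("curve", 0.0, "running"), 0.5)
```

With these illustrative thresholds, the same violation degree leads to a deduction on the curve but only a warning on the straight, reflecting the smaller tolerance of the curve subject.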

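[0072] (4) Scoring stability and judgment generation: Driving behavior may exhibit short-term anomalies. This invention employs a time-series sliding window smoothing mechanism so that the scoring output is not affected by transient noise: $\bar S_t = \frac{1}{N}\sum_{i=0}^{N-1} S_{t-i}$; where $N$ is the window length, $S_{t-i}$ is the comprehensive quantitative score at the $i$-th moment before the current moment $t$, and $\bar S_t$ is the smoothed score.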

[0073] The scoring conclusion is determined by comparing the smoothed score with preset thresholds, yielding one of three conclusions: normal, deduction of points, or unsatisfactory. The interpretable evaluation results include the driving behavior category, scoring conclusion, matching rule number, vehicle model version number, and membership vector, providing a basis for subsequent appeals and supervision.
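
A minimal sketch of the smoothing and threshold judgment follows. The window length, the threshold values `deduct_at` and `fail_at`, and the convention that a larger score indicates a more severe violation are assumptions of this sketch; while the window is still filling, the mean is taken over the scores received so far.

```python
from collections import deque


class ScoreSmoother:
    """Moving average over the most recent comprehensive quantitative scores."""

    def __init__(self, window=3):
        self.scores = deque(maxlen=window)

    def update(self, score):
        self.scores.append(score)
        return sum(self.scores) / len(self.scores)


def conclusion(smoothed, deduct_at=0.4, fail_at=0.8):
    # Hypothetical thresholds; larger scores are assumed to mean a worse violation.
    if smoothed >= fail_at:
        return "unsatisfactory"
    if smoothed >= deduct_at:
        return "deduction"
    return "normal"


smoother = ScoreSmoother(window=3)
history = [conclusion(smoother.update(s)) for s in [0.0, 0.0, 0.9, 0.9, 0.9]]
```

In this example a single high score does not change the conclusion; only a sustained anomaly moves the output to a deduction and then to an unsatisfactory result.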

[0074] 4. Cloud side trains the multimodal time series model and monitors edge states.

[0075] (1) Collection of historical data and construction of training samples on the cloud side: The cloud-based system periodically collects data from the edge servers at each test site throughout the entire testing process, including raw multimodal data uploaded from the device, driving behavior categories identified by the edge servers, scoring conclusions, and rule activation records. To ensure data availability and cross-site consistency, this invention constructs a data access and cleaning module on the cloud side to perform structural unification processing on all data from different vehicle models and test sites. Specifically, this includes: Data synchronization and archiving: Based on a unified timestamp, event records from the device, edge, and cloud are mapped to the same time series to form a structured log.

[0076] Abnormal data cleaning: Filter out missing frames, invalid GPS positioning data, abrupt noise spikes, and similar anomalies, and reconstruct samples through wavelet denoising, interpolation, and other methods.

[0077] Model training sample construction: Select sample pairs containing behavioral features, vehicle parameters, and subject information from each sample to construct the training dataset.

[0078] Through the above steps, the cloud side obtains a large-scale sample library that can be used to train deep models and optimize rules.
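
The synchronization and cleaning steps above can be sketched as follows. The sketch merges per-source event records by their unified timestamp and fills invalid samples by linear interpolation; wavelet denoising is not shown. The record layout (a `ts` field), the use of `None` for an invalid sample, and the function names are assumptions introduced for the illustration.

```python
import heapq


def merge_logs(*streams):
    """Merge per-source event lists (each already sorted by 'ts') into one timeline."""
    return list(heapq.merge(*streams, key=lambda e: e["ts"]))


def interpolate_gaps(values):
    """Linearly interpolate None entries; gaps at either end copy the nearest valid value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no valid sample to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out


end_log = [{"ts": 0.00, "src": "end"}, {"ts": 0.10, "src": "end"}]
edge_log = [{"ts": 0.05, "src": "edge"}]
cloud_log = [{"ts": 0.20, "src": "cloud"}]
timeline = [e["src"] for e in merge_logs(end_log, edge_log, cloud_log)]
speeds = interpolate_gaps([None, 1.0, None, None, 4.0, None])
```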

[0079] (2) Training and optimization of cloud-based deep models: The cloud side trains and updates the previously used multimodal time series model to ensure its generalization performance under different environmental and vehicle conditions.

[0080] Training Objectives and Loss Function Design: The objective of cloud-side model training is to accurately predict driving behavior categories. Let the model output be the probability distribution $p$ and the true label be $y$. The cross-entropy loss is $\mathcal{L}_{ce} = -\sum_{c} y_c \log p_c$; to improve the stability of the model under edge-side jitter and noise, this invention also introduces a modal regularization term $\mathcal{L}_{reg}$. The total loss is: $\mathcal{L} = \mathcal{L}_{ce} + \mathcal{L}_{reg}$.

[0081] where $\mathcal{L}_{reg} = \lambda \sum_{m=1}^{M} \|f_m\|_2^2$, $M$ is the total number of modalities, $m = 1, \dots, M$, $f_m$ is the feature representation corresponding to the $m$-th modality, $\lambda$ is the regularization weight, and $\|\cdot\|_2$ is the L2 norm; $f_m$ is extracted from the sensor data of the $m$-th modality by a modal encoder: $f_m = E_m(s_m)$; where $s_m$ denotes the sensor data of the $m$-th modality and $E_m$ denotes the feature encoder corresponding to the $m$-th modality; when the sensor data is image data, the feature encoder includes one of a convolutional neural network or a lightweight image coding network; when the sensor data is inertial data, vehicle Controller Area Network (CAN) data, or positioning data, the feature encoder includes one of a multilayer perceptron, a long short-term memory network, a gated recurrent unit network, or a temporal Transformer network.
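
The training objective can be illustrated numerically as below. The sketch assumes a one-hot true label, so the cross-entropy reduces to the negative log-probability of the true category, and it assumes the modal regularization term is the weighted sum of squared L2 norms of the per-modality features. The function names and the value of `lam` are hypothetical.

```python
import math


def cross_entropy(p, true_index, eps=1e-12):
    # With a one-hot label, only the probability of the true category contributes.
    return -math.log(p[true_index] + eps)


def modal_regularizer(features, lam):
    # Assumed form: lam times the sum of squared L2 norms of each modality's feature.
    return lam * sum(sum(x * x for x in f) for f in features.values())


def total_loss(p, true_index, features, lam=0.01):
    return cross_entropy(p, true_index) + modal_regularizer(features, lam)


features = {"img": [3.0, 4.0], "imu": [0.0, 0.0]}  # hypothetical encoder outputs
loss = total_loss([0.5, 0.5], 0, features, lam=0.01)
```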

[0082] Cross-vehicle adaptation optimization: Vehicle dynamics parameters are incorporated into the model training as conditional inputs, enabling the model to adapt to the dynamic characteristics of different vehicle models. This mechanism allows the model to exhibit consistent recognition capabilities across different vehicle models.

[0083] Model Deployment: After training is completed, the cloud side quantizes and prunes the model to generate an inference version adapted to the edge hardware, and pushes it to the edge servers of each examination room through the version management system.

[0084] (3) Rule parameter optimization and scoring consistency management: Rule parameters (thresholds, weights, membership functions, etc.) determine the fairness of the scoring. This invention uses an evolutionary algorithm on the cloud side to globally optimize the rule parameters, ensuring consistency across different test centers and vehicle models.

[0085] The rule parameters include: lane-crossing tolerance, speed deviation threshold, duration threshold, and membership function. The cloud side encodes the entire set of rule parameters into an optimizable vector.

[0086] Evolutionary algorithm optimization: The rule parameters are optimized using a genetic algorithm or differential evolution algorithm: initialize the population, calculate the scoring deviation of each set of parameters on the validation dataset, select the best-performing set of rule parameters, perform crossover and mutation operations to generate a new population, and iterate until the deviation converges, that is, minimize the scoring difference between different sites and different vehicle models.
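
The optimization loop described above can be sketched as a small genetic algorithm. This is an illustration under stated assumptions: elitist selection of the better half, uniform crossover, Gaussian mutation clipped to the parameter bounds, and a toy `toy_deviation` function standing in for the scoring deviation measured on a validation dataset. All names and numeric settings are hypothetical.

```python
import random


def evolve(deviation, bounds, pop_size=20, generations=40, mutation=0.1, seed=0):
    """Minimize `deviation` over a rule parameter vector with a simple genetic algorithm."""
    rng = random.Random(seed)
    # Initialize the population uniformly inside the parameter bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=deviation)
        parents = pop[: pop_size // 2]                     # selection: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]            # uniform crossover
            child = [min(hi, max(lo, g + rng.gauss(0.0, mutation * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]             # clipped mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=deviation)


def toy_deviation(params):
    # Stand-in objective: distance from a hypothetical "fair" tolerance and threshold.
    tolerance, speed_threshold = params
    return (tolerance - 0.3) ** 2 + (speed_threshold - 2.0) ** 2


bounds = [(0.0, 1.0), (0.0, 5.0)]
initial_best = evolve(toy_deviation, bounds, generations=0)
best = evolve(toy_deviation, bounds, generations=40)
```

Because the better half of each population is carried over unchanged, the best deviation found never increases from one generation to the next.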

[0087] Rule version management: After optimization, the cloud side generates a new version of the rule parameters and pushes it to the edge sides of all examination rooms to ensure that the scoring logic is consistent.

[0088] (4) Cloud-side fault-tolerant takeover and global consistency guarantee: The cloud monitors the operational status of each edge node and performs remote takeover when an anomaly is detected, ensuring that the examination process is uninterrupted and the scoring criteria do not drift.

[0089] Edge health monitoring: The cloud side uses a heartbeat mechanism to detect the status of edge servers, including inference latency, model crash, memory overflow, data backlog, and network disconnection. If any indicator exceeds the threshold, the cloud side considers the edge to have entered a degraded state.

[0090] Remote takeover mechanism: When the edge side fails to perform inference normally, the cloud side automatically receives the proxy data stream from the end side, uses the cloud-side inference model to compute the driving behavior category recognition results and evaluation results in real time, and sends the results back to the edge side and the test record database. During remote takeover, the edge side's scoring output is replaced by that of the cloud side until the edge side returns to normal.
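
The degraded-state check and the switch of the scoring source can be sketched as follows. The heartbeat fields and the threshold values are hypothetical; the description only states that a threshold exists for each monitored indicator.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    latency_ms: float      # inference latency reported by the edge node
    backlog_frames: int    # frames waiting in the edge queue
    model_alive: bool      # False after a model crash
    memory_ok: bool        # False after a memory overflow
    age_s: float           # seconds since the cloud side last heard from the node


# Hypothetical thresholds for this sketch.
MAX_LATENCY_MS = 200.0
MAX_BACKLOG = 50
MAX_AGE_S = 3.0


def is_degraded(hb):
    """An edge node is degraded as soon as any single indicator is out of range."""
    return (
        not hb.model_alive
        or not hb.memory_ok
        or hb.latency_ms > MAX_LATENCY_MS
        or hb.backlog_frames > MAX_BACKLOG
        or hb.age_s > MAX_AGE_S          # treated as a network disconnection
    )


def scoring_source(hb):
    # During takeover the cloud side scores; otherwise the edge side does.
    return "cloud" if is_degraded(hb) else "edge"


healthy = Heartbeat(latency_ms=120.0, backlog_frames=3, model_alive=True, memory_ok=True, age_s=0.5)
slow = Heartbeat(latency_ms=450.0, backlog_frames=3, model_alive=True, memory_ok=True, age_s=0.5)
silent = Heartbeat(latency_ms=120.0, backlog_frames=3, model_alive=True, memory_ok=True, age_s=10.0)
```

Evaluating the same check on every heartbeat also returns the scoring source to the edge side once all indicators are back within range.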

[0091] Consistency Verification: After the exam, the cloud side performs consistency verification on the outputs of the end, edge, and cloud sides, including consistency in behavior recognition, scoring rules, vehicle model adaptation, and time synchronization. If any inconsistencies are detected, the cloud side triggers an alert and saves evidence for administrators to review.

[0092] Example: 1. End side (test vehicle): In a typical driving test scenario, the test vehicle is equipped with the following hardware modules: Cameras: Front and side cameras are used to capture lane lines, road signs, and driving environment.

[0093] Inertial Measurement Unit: Used to acquire vehicle attitude changes, acceleration, and angular velocity.

[0094] GPS module / RTK module: Used to obtain the vehicle's position and heading in the test site coordinate system.

[0095] Vehicle Controller Area Network (CAN) interface: used to acquire vehicle Controller Area Network data such as vehicle speed, acceleration, steering angle, braking, and gear position.

[0096] End-side processing unit: Used to perform unified timing and time alignment, and to calculate the health status of data from each modal sensor before uploading it to the edge. The end-side hardware module is fixedly installed inside the testing vehicle.

[0097] 2. Edge side (examination room server): Edge nodes are typically deployed in the test center monitoring room or equipment room and are equipped with a GPU / CPU hybrid inference environment for identifying driving behavior categories and outputting interpretable evaluation results. Edge nodes include: Data receiving module: used to receive and reassemble the multimodal sensor data.

[0098] Visual encoding and dynamics modeling module: used to construct visual feature sequences and dynamic feature sequences, and to align and fuse them.

[0099] Multimodal temporal model: used to identify driving behavior categories such as lane departure, deviation, abnormal speed, and improper operation.

[0100] Scoring and Rule Engine Module: Constructs behavioral quantification indicators for driving behavior categories and performs fuzzification through membership functions. Combined with a rule engine driven by the examination state machine, it performs scoring and outputs interpretable evaluation results.

[0101] Anomaly monitoring and feedback module: used to return the evaluation results to the front-end system and synchronize the parameters and rules in the scoring process to the cloud side.

[0102] Edge-side computation typically requires behavior recognition and scoring to be completed within 100 to 200 ms to meet the real-time requirements of the examination system.

[0103] 3. Cloud side: The cloud-based platform is deployed in the monitoring center and is responsible for overall management and parameter optimization. Specifically, this includes: Model Training Center: Used to train multimodal temporal models and visual encoders based on historical data, and to generate lightweight versions that can be deployed on the edge side.

[0104] Rule Parameter Optimization Center: Optimizes rule parameters based on evolutionary algorithms.

[0105] System health monitoring platform: used to monitor the status of each edge node in real time and perform fault-tolerant takeover when an edge node is abnormal.

[0106] Scoring Consistency and Behavioral Audit Platform: Performs statistical analysis on the scoring conclusions of each examination room to identify possible rule drift or scoring bias.

[0107] The system modules of the present invention are shown in Figure 2.

[0108] Through cloud-based management, this invention achieves scoring consistency and rule standardization across vehicle models and test sites.

[0109] The following describes the collaborative process of the terminal side, edge side, and cloud side in actual operation, to demonstrate the complete implementation path of the system of the present invention.

[0110] At the start of the exam: After the vehicle is powered on, the end side completes hardware module initialization, clock synchronization, and self-test; the edge side obtains the current exam subject information and vehicle dynamic parameters, and pushes the corresponding exam vehicle version number to the end side.

[0111] During the driving phase: Each modality on the end side generates data frames according to a unified timing and sends them to the edge side in real time; the edge side performs visual and dynamic fusion reasoning to identify driving behavior categories; the scoring engine performs rule reasoning based on the current subject state machine and provides a score output.

[0112] Anomaly Phase: If the edge detects inference delays, model crashes, data backlogs, etc., it will proactively report to the cloud side; the cloud side will immediately initiate a takeover process and calculate the scoring conclusion through a proxy; the examination process will not be interrupted and the scoring standards will remain consistent.

[0113] At the end of the exam: the edge summarizes the driving behavior category sequence, scoring conclusion sequence, rule activation records, etc., and uploads them to the cloud; the cloud executes data archiving and updates the global training sample library to provide materials for subsequent training; the entire collaborative process ensures that the system maintains real-time performance, robustness and consistency throughout the entire link.

[0114] In summary, this invention achieves the following technical effects through multiple technical means, including end-side data acquisition and unified timing, edge-side behavior recognition and temporal fusion, vehicle model-based scoring and fuzzy rule reasoning, and cloud-side model training and rule optimization: Significantly improved scoring consistency: By managing versioned vehicle model parameter sets, membership functions, and unified rules, consistent scoring standards are achieved across different vehicle models and testing environments.

[0115] Real-time performance combined with robustness: lightweight end-side processing, high-performance edge inference, and fault-tolerant cloud takeover enable the system to maintain a stable response even in complex examination environments.

[0116] Explainable and auditable: By recording behavior sequences, membership vectors, rule numbers, etc., the scoring process is made traceable and verifiable.

[0117] Scalable and Evolvable: The cloud-based model training and rule optimization mechanism enables the system to continuously learn and update rapidly, adapting to future changes in examination systems and vehicle models.

[0118] Unified global management across sites and vehicles: Unified rules on the cloud side and distributed reasoning on the edge side ensure consistent scoring standards and improve the standardization of driving tests.

[0119] This invention can be widely applied to driving test systems, intelligent connected vehicle evaluation platforms, and intelligent driving testing and verification scenarios. It has good engineering applicability, scalability, and significant practical application value and promotional significance.

[0120] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.

[0121] It should be understood that although the steps in the flowcharts of the accompanying drawings are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts of the accompanying drawings may include multiple steps or stages, which are not necessarily completed at the same time, but may be executed at different times, and the execution order of these steps or stages is not necessarily sequential, but may be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0122] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0123] It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments described above, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should be considered in all respects as exemplary and non-limiting, and the scope of the invention is defined by the appended claims rather than the foregoing description. Thus, all variations falling within the meaning and scope of equivalents of the claims are intended to be included within the present invention, and no reference numerals in the claims should be construed as limiting the scope of the claims.

[0124] Furthermore, it should be understood that although this specification describes embodiments, not every embodiment contains only one independent technical solution. This narrative style is merely for clarity. Those skilled in the art should consider the specification as a whole, and the technical solutions in each embodiment can also be appropriately combined to form other embodiments that can be understood by those skilled in the art.

Claims

1. A method for edge-cloud collaborative driving testing and evaluation for intelligent vehicles, characterized in that it comprises: the end side collects multimodal sensor data from the vehicle, performs unified timing and time alignment, constructs data frames, calculates the health of each data frame, and uploads them to the edge side, the multimodal sensor data including images, inertial data, positioning data, and vehicle Controller Area Network (CAN) data; the edge side receives and reassembles the multimodal sensor data, extracts visual features and vehicle dynamics features, and identifies driving behavior categories through a multimodal time-series model; based on the dynamic parameter set of the current vehicle model, the edge side constructs quantitative indicators for the driving behavior category and fuzzifies them through membership functions to obtain a comprehensive quantitative score for the current driving behavior category, and then, in combination with a rule engine driven by the examination state machine, makes a scoring judgment and outputs an interpretable evaluation result; the cloud side collects historical data to train and optimize the multimodal time-series model, uses evolutionary algorithms to optimize scoring rule parameters to achieve scoring consistency across vehicle models, monitors the edge-side status, and performs remote fault-tolerant takeover in case of anomalies.

2. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that the end side collecting multimodal sensor data from the vehicle, performing unified timing and time alignment, constructing data frames, calculating the health of each data frame, and uploading them to the edge side specifically includes: the end side receives a standard time signal from the edge side to correct the local clock deviation, and aligns the sensor data of different modalities to a unified time reference through interpolation to construct data frames containing timestamps; the health calculation for each data frame specifically includes: $h_k = [q_{img}(t_k), q_{imu}(t_k), q_{can}(t_k), q_{pos}(t_k)]$; where $q_{img}$ indicates image sharpness, $q_{imu}$ indicates the stability of the inertial data, $q_{can}$ indicates the continuity of the vehicle Controller Area Network (CAN) data, and $q_{pos}$ indicates the validity of the positioning data; $h_k$ denotes the health of the data frame at time $t_k$; and $t_k$ represents the timestamp of the $k$-th sampling moment after unified time synchronization correction.

3. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that the edge side receiving and reassembling the multimodal sensor data, extracting visual features and vehicle dynamics features, and identifying driving behavior categories through a multimodal time-series model specifically includes: the edge side performs windowed recombination on multiple consecutive data frames, extracts visual features using a visual encoder, and enhances geometric consistency by combining optical flow estimation; the edge side reconstructs the vehicle's dynamic feature sequence using vehicle Controller Area Network (CAN) data and inertial data, and then concatenates the enhanced visual features with the dynamic features before inputting them into the multimodal time series model to identify driving behavior categories.

4. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 3, characterized in that the edge side performing windowed recombination of multiple consecutive data frames, extracting visual features using a visual encoder, and enhancing geometric consistency by combining optical flow estimation specifically includes: the edge side receives the multimodal sensor data transmitted in the form of data frames, and buffers and reassembles it according to a preset inference period; arranged in chronological order, every $L$ data frames constitute a timing window $W$: $W = \{F_{k-L+1}, \dots, F_k\}$; where $F_k$ denotes the data frame at time $t_k$, $L$ is the window length, and $k$ is the time index; when a missing data frame is detected, the missing data frame is filled using nearest neighbor interpolation or model prediction, and the health reference value $H_W$ of the timing window is calculated: $H_W = \frac{1}{L}\sum_{j=k-L+1}^{k} h_j$; where $h_j$ denotes the health of the data frame at time $t_j$; the health reference value $H_W$ characterizes the overall reliability of the image data, inertial data, vehicle Controller Area Network (CAN) data, and positioning data within the time window: $H_W = [\bar h_{img}, \bar h_{imu}, \bar h_{can}, \bar h_{pos}]$; where $\bar h_{img}$, $\bar h_{imu}$, $\bar h_{can}$, and $\bar h_{pos}$ denote the average health of the image data, inertial data, CAN data, and positioning data within the timing window $W$, respectively; visual features are extracted from the image in each data frame: $v_k = E_v(I_k)$; where $E_v$ is the visual encoder, $I_k$ is the image in $F_k$, and $v_k$ is the visual feature at time $t_k$; geometric consistency enhancement of the visual features is achieved through optical flow estimation, resulting in the enhanced visual feature $\tilde v_k$: $\tilde v_k = v_k \oplus \mathcal{A}(v_k, O_k)$; where $\mathcal{A}(\cdot)$ denotes a feature correction operation guided by optical flow, $\oplus$ denotes feature concatenation, and $O_k$ is the optical flow field; the visual feature sequence $V = \{\tilde v_{k-L+1}, \dots, \tilde v_k\}$ is constructed, serving as the visual input for the multimodal temporal model.

5. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 4, characterized in that reconstructing the vehicle's dynamic feature sequence using vehicle Controller Area Network (CAN) data and inertial data, concatenating the enhanced visual features with the dynamic features, and inputting the result into the multimodal time series model to identify driving behavior categories specifically includes: fusion weights are assigned to the sensor data of different modalities based on the health reference values: $w_m = \dfrac{\bar h_m}{\sum_{j} \bar h_j + \epsilon}$; where $m \in \{img, imu, pos, can\}$, $w_m$ is the fusion weight of the sensor data of modality $m$, $\bar h_m$ is the average health of the sensor data of modality $m$ within the timing window $W$, and $\epsilon$ is a constant that prevents the denominator from being zero; $img$, $imu$, $pos$, and $can$ denote image data, inertial data, positioning data, and CAN data, respectively; the CAN data $c_k$ generated at time $t_k$, the inertial data $a_k$ collected by the inertial measurement unit at time $t_k$, and the positioning data $g_k$ collected by the positioning module at time $t_k$ are combined into the dynamic feature vector $d_k$: $d_k = [c_k; a_k; g_k]$; the dynamic feature sequence $D = \{d_{k-L+1}, \dots, d_k\}$ is constructed; the edge-side multimodal feature $x_k$ is constructed from the modal health, the enhanced visual features, and the dynamic feature vectors: $x_k = \Phi(w, \tilde v_k, d_k)$; where $w$ denotes the fusion weights and $\Phi(\cdot)$ denotes a feature concatenation or fusion operation; a multimodal temporal feature sequence $X$ is constructed from the multimodal features at consecutive time points: $X = \{x_{k-L+1}, \dots, x_k\}$; $X$ is input to the multimodal time series model $f_\theta$, which outputs the temporal embedding feature $z$ of the current timing window: $z = f_\theta(X)$; the probability distribution $p$ over driving behavior categories is then obtained through the classification output layer: $p = \mathrm{softmax}(W_c z + b_c)$; where $W_c$ and $b_c$ are classification parameters and $\mathrm{softmax}(\cdot)$ is the softmax function; the driving behavior category with the maximum probability is output.

6. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that the edge side constructing quantitative indicators for the driving behavior category based on the dynamic parameter set of the current vehicle model and fuzzifying them using membership functions to obtain a comprehensive quantitative score for the current driving behavior category specifically includes: the vehicle's dynamic parameters include wheelbase, width, allowable longitudinal acceleration range, allowable steering angle range, and steering wheel response coefficient; the identified driving behavior categories are converted into behavioral quantitative indicators, which are used as continuous inputs and fuzzified using one or more of a triangular membership function, a trapezoidal membership function, or a Gaussian membership function to obtain the behavior fuzzy vector $\mu = [\mu_1, \mu_2, \dots, \mu_n]$, where $\mu_i$ is the $i$-th fuzzy membership degree and $n$ is the number of fuzzy membership degrees in the behavior fuzzy vector; the comprehensive quantitative score $S$ for the current driving behavior category is calculated from the behavior fuzzy vector: $S = \sum_{i=1}^{n} \alpha_i \mu_i$; where $\alpha_i$ denotes the scoring weight corresponding to the $i$-th fuzzy membership degree.

7. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that making the scoring judgment in combination with the rule engine driven by the examination state machine and outputting interpretable evaluation results specifically includes: an examination state machine is constructed to determine the current subject based on vehicle position, speed, and examination process control signals, and to automatically activate the corresponding rule subset; the rule engine performs reasoning based on the activated rule subset and the behavior fuzzy vector to obtain a judgment of deduction, failure, or prompt; a time-series sliding window smoothing mechanism is used to smooth the comprehensive quantitative score: $\bar S_t = \frac{1}{N}\sum_{i=0}^{N-1} S_{t-i}$; where $N$ is the length of the sliding window, $S_{t-i}$ is the comprehensive quantitative score at the $i$-th moment before the current moment $t$, $t$ is the current moment, $i$ is the time index within the sliding window with $0 \le i \le N-1$, and $\bar S_t$ is the driving behavior score after time-series sliding window smoothing; the scoring conclusion is determined by comparing the smoothed score with preset thresholds, yielding a conclusion of normal, deduction, or unsatisfactory; the interpretable evaluation results include the driving behavior category, scoring conclusion, rule subset number, vehicle model version number, and behavior fuzzy vector.

8. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that the collection of historical data on the cloud side for training and optimizing the multimodal time series model specifically includes: the multimodal time series model is trained using the sum of the cross-entropy loss $L_{CE}$ and the modal regularization term $L_{reg}$ as the total loss $L = L_{CE} + L_{reg}$; wherein $L_{CE} = -\sum_{c} y_c \log p_c$, where $p_c$ is the probability output by the multimodal time series model for category $c$, $y_c$ is the corresponding true driving behavior category label, and $c$ is the index of the driving behavior category; $L_{reg} = \lambda \sum_{m=1}^{M} \lVert f_m \rVert_2^2$, where $M$ is the total number of modalities, $f_m$ is the feature representation of the sensor data of the $m$-th modality, $\lambda$ is the regularization weight, and $\lVert \cdot \rVert_2$ is the 2-norm; the feature of the sensor data of each modality is extracted by a modal encoder: $f_m = E_m(x_m)$, where $x_m$ denotes the sensor data of the $m$-th modality and $E_m$ denotes the feature encoder corresponding to the $m$-th modality; during training of the multimodal time series model, the dynamic parameters of the vehicle model are used as conditional inputs to achieve cross-vehicle adaptation; after training, the cloud side quantizes and prunes the multimodal time series model to generate an inference version adapted to the edge hardware and pushes it to the edge side.
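
To make the training objective of claim 8 concrete, the following sketch evaluates the total loss for one sample in plain Python. It assumes a one-hot label (so the cross-entropy reduces to the negative log-probability of the true category) and assumes the modal regularizer is the squared 2-norm of each modality's feature vector, summed over modalities and scaled by the regularization weight. The function names and the numbers are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy for a one-hot label: -log of the probability of the true category."""
    return -math.log(probs[label])


def modal_regularizer(features, lam):
    """lam times the sum over modalities of the squared 2-norm of each feature vector.

    The squared form is an assumption; the claim only states that a 2-norm is used.
    """
    return lam * sum(sum(v * v for v in f) for f in features)


def total_loss(probs, label, features, lam):
    """Total loss: cross-entropy plus the modal regularization term."""
    return cross_entropy(probs, label) + modal_regularizer(features, lam)


# Hypothetical model output over three driving behavior categories.
probs = [0.5, 0.25, 0.25]
label = 0  # index of the true driving behavior category

# Hypothetical per-modality features, e.g. one visual and one vehicle-dynamics encoder output.
features = [[3.0, 4.0], [0.0, 0.0]]
loss = total_loss(probs, label, features, lam=0.01)
```

In an actual training setup the features would come from the per-modality encoders and the loss would be averaged over a batch and differentiated by a deep learning framework; this sketch only shows how the two terms combine.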

9. The edge-cloud collaborative driving test and evaluation method for intelligent vehicles according to claim 1, characterized in that using evolutionary algorithms to optimize scoring rule parameters to achieve cross-vehicle scoring consistency, monitoring the edge-side status, and executing remote fault-tolerant takeover in case of anomalies specifically includes: the lane-crossing tolerance, speed deviation threshold, duration threshold, and membership function parameters are encoded into an optimizable vector and optimized using a genetic algorithm or a differential evolution algorithm to minimize the scoring differences between different tracks and vehicle models; the optimized rule parameters are then pushed to all edge sides; the cloud side uses heartbeat packets to detect whether the edge side is in a degraded state, the degraded state including excessive inference latency, model crash, memory overflow, data accumulation, and network disconnection; when the edge side is determined to be in a degraded state, the cloud side automatically receives the proxied data stream from the end side, uses the cloud-side inference model to identify the driving behavior category, and sends the result back to the edge side until the edge side returns to normal.
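
The heartbeat-based degradation check and takeover decision of claim 9 might look like the sketch below. The heartbeat fields, the limit values, and the names `Heartbeat`, `Limits`, `degradation_reasons`, and `choose_inference_site` are all hypothetical; the claim names the degraded states but not how they are measured. A missing or stale heartbeat is treated here as a network disconnection.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Heartbeat:
    """Hypothetical status report sent by the edge side to the cloud side."""
    timestamp: float             # seconds; clocks assumed synchronized
    inference_latency_ms: float
    model_alive: bool
    memory_used_ratio: float     # fraction of memory in use, 0.0 to 1.0
    queue_depth: int             # number of samples waiting for inference


@dataclass
class Limits:
    """Hypothetical limits beyond which the edge side counts as degraded."""
    max_latency_ms: float = 200.0
    max_memory_ratio: float = 0.9
    max_queue_depth: int = 100
    heartbeat_timeout_s: float = 3.0


def degradation_reasons(hb: Optional[Heartbeat], now: float, limits: Limits) -> List[str]:
    """Return the degraded states detected from the latest heartbeat (empty if healthy)."""
    if hb is None or now - hb.timestamp > limits.heartbeat_timeout_s:
        return ["network_disconnection"]
    reasons = []
    if not hb.model_alive:
        reasons.append("model_crash")
    if hb.inference_latency_ms > limits.max_latency_ms:
        reasons.append("inference_latency")
    if hb.memory_used_ratio > limits.max_memory_ratio:
        reasons.append("memory_overflow")
    if hb.queue_depth > limits.max_queue_depth:
        reasons.append("data_accumulation")
    return reasons


def choose_inference_site(hb: Optional[Heartbeat], now: float, limits: Limits) -> str:
    """Route inference to the cloud while the edge side is degraded, otherwise to the edge."""
    return "cloud" if degradation_reasons(hb, now, limits) else "edge"


limits = Limits()
healthy = Heartbeat(timestamp=100.0, inference_latency_ms=40.0, model_alive=True,
                    memory_used_ratio=0.5, queue_depth=3)
slow = Heartbeat(timestamp=100.0, inference_latency_ms=450.0, model_alive=True,
                 memory_used_ratio=0.5, queue_depth=3)
```

Because the decision is recomputed on every heartbeat, inference returns to the edge side as soon as a healthy heartbeat arrives, which corresponds to the "until the edge side returns to normal" condition in the claim. A production system would likely add hysteresis to avoid switching back and forth.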

10. A collaborative driving test and evaluation system for intelligent vehicles, characterized in that the system, for performing the method as described in any one of claims 1 to 9, includes: an end-side device, deployed in the test vehicle, used to collect multimodal sensor data of the vehicle, perform unified time synchronization and time alignment, and upload the health status of the sensor data of each modality to the edge-side server; an edge-side server, deployed in the examination room, used to receive and reassemble the multimodal sensor data, extract visual features and vehicle dynamic features, identify driving behavior categories through a multimodal time series model, construct behavioral quantification indicators for the driving behavior categories based on the dynamic parameter set of the current vehicle model, perform fuzzification through membership functions, and, in combination with the rule engine driven by the examination state machine, make scoring judgments and output interpretable evaluation results; and a cloud-side platform, deployed in the monitoring center, used to train and optimize the multimodal time series model, use evolutionary algorithms to optimize scoring rule parameters to achieve cross-vehicle scoring consistency, monitor the edge-side status, and perform remote fault-tolerant takeover in case of anomalies.