Model training method, driving right allocation method, device, and storage medium

By acquiring flight sample data to predict driving actions and evaluate the driver, and by training a driving rights allocation model that uses convolutional neural networks and long short-term memory networks, the method addresses the safety hazards of manual driving rights allocation and improves the accuracy and safety of automatic driving rights allocation.

CN120348304B · Active · Publication Date: 2026-06-19 · CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
Filing Date
2025-04-18
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

In existing technologies, the allocation of driving rights between the intelligent driving system and the driver of flying cars relies on manual allocation by the driver, which is easily affected by the driver's mental state and skill level, posing safety hazards.

Method used

Flight sample data is acquired to predict driving actions and evaluate the driver. A preset model to be trained is then trained into a driving rights allocation model that allocates driving rights automatically. Convolutional neural networks and long short-term memory networks are combined for feature extraction and prediction to optimize the driving rights allocation strategy.

Benefits of technology

It enables automatic allocation of driving rights based on the driver's current status and flight environment, improving the flight safety and accuracy of driving rights allocation for flying cars.



Abstract

This application relates to the field of autonomous driving technology, and discloses a model training method, a driving rights allocation method, a device, and a storage medium. The model training method includes: acquiring flight sample data; predicting driving actions based on the flight sample data to obtain action prediction sample data; evaluating the driver based on driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data; inputting the flight sample data and driver evaluation sample data into a preset model to be trained to obtain driving rights allocation sample data; and training the model to be trained based on the driving rights allocation sample data to obtain a driving rights allocation model. Embodiments of this application can automatically allocate driving rights to improve the flight safety of flying cars.

Description

Technical Field

[0001] This application relates to the field of autonomous driving technology, and in particular to a model training method, a driving rights allocation method, a device, and a storage medium.

Background Technology

[0002] Flying cars, as an emerging mode of transportation, have received increasing attention in recent years.

[0003] For flying cars, intelligent driving functions are currently at a stage in which driving rights are shared between human and machine. Because driving rights are shared, the actions of the intelligent driving system and those of the driver are inevitably coupled and mutually constraining, and the two compete for driving rights in a game-like manner.

[0004] In related technologies, the allocation of driving rights between the intelligent driving system and the driver relies mainly on manual allocation by the driver. Manual allocation is easily affected by the driver's mental state and driving skills, which poses safety hazards.

Summary of the Invention

[0005] The purpose of this application is to provide a model training method, a driving rights allocation method, a device, and a storage medium that solve the technical problem of safety hazards arising when the driver must allocate driving rights manually, and that allocate driving rights automatically to improve the flight safety of flying cars.

[0006] This application provides a model training method, including:

[0007] Acquire flight sample data; the flight sample data includes flight environment sample data, driver state sample data, driver action sample data, and vehicle condition sample data;

[0008] Based on the flight sample data, driving actions are predicted to obtain action prediction sample data;

[0009] The driver is evaluated based on the driver state sample data, the driver action sample data, and the action prediction sample data to obtain driver evaluation sample data.

[0010] The flight sample data and the driver evaluation sample data are input into a preset model to be trained to obtain driving rights allocation sample data.

[0011] The model to be trained is trained based on the driving rights allocation sample data to obtain a driving rights allocation model.

[0012] In some embodiments, the step of predicting driving actions based on the flight sample data to obtain action prediction sample data includes:

[0013] The flight sample data is input into a pre-trained driving action prediction model to obtain the action prediction sample data. The driving action prediction model is trained based on the action prediction loss information of the action prediction sample data. The action prediction loss information is obtained by fitting a first strategy loss information, a first value loss information, and a first entropy loss information. The first strategy loss information represents the quality of the strategy that generates the action prediction sample data. The first value loss information represents the value deviation between the action prediction sample data and the real action data. The first entropy loss information represents the uncertainty of the action prediction sample data.

[0014] In some embodiments, evaluating the driver based on the driver state sample data, the driver action sample data, and the action prediction sample data to obtain driver evaluation sample data includes:

[0015] Based on the driver state sample data, generate first evaluation sample data characterizing the driving state;

[0016] Based on the action prediction sample data and the driver action sample data, a second evaluation sample data characterizing driving skills is generated;

[0017] The driver is evaluated based on the first evaluation sample data and the second evaluation sample data to obtain the driver evaluation sample data.

[0018] In some embodiments, inputting the flight sample data and the driver evaluation sample data into a preset model to be trained to obtain driving rights allocation sample data includes:

[0019] Based on the flight sample data and the driver evaluation sample data, the driving scenario at future moments is predicted, and multiple future scenario feature sample data are obtained.

[0020] Calculate the scene reward data corresponding to each of the future scene feature sample data;

[0021] Output the corresponding driving rights allocation sample data based on the scenario reward data.

[0022] In some embodiments, the step of predicting future driving scenarios based on the flight sample data and the driver evaluation sample data to obtain multiple future scenario feature sample data includes:

[0023] Features are extracted from the flight sample data and the driver evaluation sample data through continuous convolution to obtain current scene feature sample data.

[0024] Based on the transformation probability of the current scene feature sample data at future times, the driving scene at future times is predicted, and multiple future scene feature sample data are obtained.

[0025] In some embodiments, the scene reward data is obtained by fitting a first reward sample data, a second reward sample data, a third reward sample data, a fourth reward sample data, and a fifth reward sample data. The first reward sample data represents the flight efficiency of the driving scene corresponding to the future scene feature sample data, the second reward sample data represents the safety of the driving scene corresponding to the future scene feature sample data, the third reward sample data represents the stability of the driving scene corresponding to the future scene feature sample data, the fourth reward sample data represents the human-machine collaboration success rate of the driving scene corresponding to the future scene feature sample data, and the fifth reward sample data represents the quality of the strategy that generated the future scene feature sample data.

[0026] In some embodiments, training the model to be trained based on the driving rights allocation sample data to obtain the driving rights allocation model includes:

[0027] Determine the model loss information corresponding to the driving right allocation sample data; the model loss information is obtained by fitting the second strategy loss information, the second value loss information and the second entropy loss information, the second strategy loss information characterizes the quality of the strategy that generates the driving right allocation sample data, the second value loss information characterizes the value deviation between the driving right allocation sample data and the real driving right allocation data, and the second entropy loss information characterizes the uncertainty of the driving right allocation sample data.

[0028] The network parameters in the model to be trained are iteratively adjusted based on the action prediction loss information until the training termination condition is met, thus obtaining the driving rights allocation model.

[0029] This application also provides a method for allocating driving rights, including:

[0030] Acquire flight data; the flight data includes flight environment data, driver state data, driver action data, and vehicle condition data;

[0031] Based on the flight data, driving actions are predicted to obtain action prediction data;

[0032] The driver is evaluated based on the driver state data, the driver action data, and the action prediction data to obtain driver evaluation data;

[0033] The flight data and the driver evaluation data are input into the driving rights allocation model to obtain driving rights allocation data; the driving rights allocation model is trained by the model training method described above.

[0034] The driving rights allocation action is performed based on the driving rights allocation data.

[0035] This application also provides an electronic device, which includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to implement the above-described method.

[0036] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.

[0037] The beneficial effects of this application are as follows. Driving actions are predicted based on flight sample data to obtain action prediction sample data. The driver is then evaluated based on driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data, which serves as the basis for assessing the driver's driving ability. The flight sample data and driver evaluation sample data are input into a preset model to be trained, and the driving rights allocation sample data output by that model is used to train it, yielding a driving rights allocation model. Because the driver evaluation sample data is derived from the driver state sample data, driver action sample data, and action prediction sample data, the model to be trained learns during training how the driving rights allocation result depends on different driver states and driver actions. The trained driving rights allocation model can therefore accurately identify the driver's current driving ability and determine the driving rights allocation data. Automatically allocating driving rights based on this data improves the flight safety of the flying car.

Attached Figure Description

[0038] Figure 1 is a diagram illustrating the application environment of the model training method provided in the embodiments of this application.

[0039] Figure 2 is a flowchart of the model training method provided in the embodiments of this application.

[0040] Figure 3 is a flowchart of the driving rights allocation method provided in the embodiments of this application.

[0041] Figure 4 is an example of the hardware structure of the electronic device provided in the embodiments of this application.

Detailed Implementation

[0042] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0043] It should be noted that although functional modules are divided in the device's operation and a logical order is shown in the flowchart, in some cases, the steps shown may be performed in a different order than the module division in the device or the order in the flowchart. The terms "first," "second," etc., in the specification, claims, and drawings are used to distinguish similar objects and are not used to describe a specific order or sequence.

[0044] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of this application only and is not intended to limit this application.

[0045] The model training method and driving rights allocation method provided in this application can be executed by a computer device, which can be a terminal device or a server. The terminal device includes, but is not limited to, mobile phones, computers, intelligent voice interaction devices, smart home appliances, vehicle terminals, and aircraft. The server can be a standalone physical server, a server cluster consisting of multiple physical servers, a distributed system, or a cloud server.

[0046] Furthermore, the information, data, and signals involved in the embodiments of this application are all authorized by the relevant parties or fully authorized by all parties, and the collection, use, and processing of the relevant data comply with the relevant laws, regulations, and standards of the relevant countries and regions.

[0047] To facilitate understanding of the model training method provided in this application embodiment, the following example uses a server as the execution subject of the model training method to illustrate its application scenarios.

[0048] Figure 1 illustrates the application environment of the model training method provided in the embodiments of this application. Referring to Figure 1, the model training method is applied to a model training system that includes a terminal 110 and a server 120 connected via a network. The terminal 110 can be a desktop terminal or a mobile terminal; the mobile terminal can be at least one of a mobile phone, tablet, or laptop. The server 120 can be a standalone server or a server cluster consisting of multiple servers. The terminal 110 sends flight sample data, including flight environment sample data, driver state sample data, driver action sample data, and vehicle condition sample data, to the server 120. The server 120 acquires the flight sample data; predicts driving actions based on the flight sample data to obtain action prediction sample data; evaluates the driver based on the driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data; inputs the flight sample data and driver evaluation sample data into a preset model to be trained to obtain driving rights allocation sample data; and trains the model to be trained based on the driving rights allocation sample data to obtain a driving rights allocation model.

[0049] It should be understood that the application scenario shown in Figure 1 is merely an example. In practical applications, the model training method provided in this application embodiment can also be applied to other scenarios. For example, the model training method can be applied directly on the terminal 110. In that case the terminal 110 acquires flight sample data including flight environment sample data, driver state sample data, driver action sample data, and vehicle condition sample data; predicts driving actions based on the flight sample data to obtain action prediction sample data; evaluates the driver based on the driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data; inputs the flight sample data and driver evaluation sample data into a preset model to be trained to obtain driving rights allocation sample data; and trains the model to be trained based on the driving rights allocation sample data to obtain a driving rights allocation model.

[0050] Figure 2 is a flowchart of a model training method provided in an embodiment of this application. Referring to Figure 2, in one embodiment, the method includes, but is not limited to, steps S201 to S205.

[0051] Step S201: Obtain flight sample data.

[0052] Flight sample data can be obtained by retrieving historical flight data from the flying car.

[0053] The flight sample data includes flight environment sample data, driver state sample data, driver action sample data, and vehicle condition sample data. Flight environment sample data may include environmental sample data such as obstacle location, dynamic state, weather conditions, no-fly zone location, and air traffic density. Driver state sample data may include physiological sample data such as the driver's saccade rate and heart rate. Driver action sample data may include action sample data such as the driver's input speed, altitude, and direction. Vehicle condition sample data may include sample data such as the flying car's flight speed, flight altitude, pitch angle, heading angle, roll angle, and angular velocity.
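As an illustration only, the four groups of flight sample data described above could be held in a structure like the following. This is a minimal sketch: all class names, field names, types, and units are hypothetical and are not specified by this application, and the obstacle dynamic state is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]  # hypothetical (x, y, z) position


@dataclass
class FlightEnvironment:
    obstacle_positions: List[Point]
    weather_condition: str
    no_fly_zones: List[Point]
    air_traffic_density: float


@dataclass
class DriverState:
    saccade_rate: float
    heart_rate: float


@dataclass
class DriverAction:
    speed_input: float
    altitude_input: float
    direction_input: float


@dataclass
class VehicleCondition:
    flight_speed: float
    flight_altitude: float
    pitch_angle: float
    heading_angle: float
    roll_angle: float
    angular_velocity: float


@dataclass
class FlightSample:
    environment: FlightEnvironment
    driver_state: DriverState
    driver_action: DriverAction
    vehicle: VehicleCondition


# One hypothetical record taken from historical flight data.
sample = FlightSample(
    environment=FlightEnvironment([(120.0, 40.0, 300.0)], "clear", [], 0.2),
    driver_state=DriverState(saccade_rate=2.5, heart_rate=72.0),
    driver_action=DriverAction(speed_input=15.0, altitude_input=300.0, direction_input=90.0),
    vehicle=VehicleCondition(14.8, 298.0, 1.5, 90.0, 0.3, 0.01),
)
```

Grouping the fields this way mirrors the four categories named in the text, so each later step can take only the group it needs.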

[0054] Step S202: Predict driving actions based on flight sample data to obtain action prediction sample data.

[0055] Here, the action prediction sample data refers to predicted driver action data, obtained by predicting the driving actions the driver will perform at a later time step from the flight sample data of an earlier time step.

[0056] In one embodiment, driving actions can be predicted with a pre-trained driving action prediction model, which predicts the driving actions the driver will perform at a later time step from the flight sample data of an earlier time step, thereby obtaining the corresponding action prediction sample data.

[0057] Step S203: Evaluate the driver based on driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data.

[0058] Here, driver evaluation sample data refers to evaluation sample data that predicts the driver's driving ability at the prediction time based on driver state sample data, driver action sample data, and the corresponding action prediction sample data.

[0059] In one embodiment, the driver can be evaluated with a preset driver evaluation model, which predicts the driver's driving ability at the current moment from the driver state sample data, driver action sample data, and action prediction sample data of the same moment, and then evaluates that driving ability to obtain the driver evaluation sample data.

[0060] Step S204: Input the flight sample data and driver evaluation sample data into the preset model to be trained to obtain driving rights allocation sample data.

[0061] Here, the driving rights allocation sample data refers to sample data indicating the ratio in which driving rights are allocated between the driver and the autonomous driving system of the flying car.

[0062] In one embodiment, after the flight sample data and driver evaluation sample data are input into the preset model to be trained, the model determines the driver's current driving ability and the risk of the current flight environment, and predicts whether that ability is sufficient to cope with that risk. The driving rights allocation sample data is obtained from the prediction result: if the driver can cope, it allocates driving rights to the driver; if the driver cannot cope, it allocates driving rights to the autonomous driving system. The driving rights allocation sample data may include driving response schemes predicted based on flight sample data.

[0063] Step S205: Train the model to be trained based on the driving rights allocation sample data to obtain the driving rights allocation model.

[0064] In one embodiment, flight sample data and driver evaluation sample data are input into the preset model to be trained, and the network parameters of the model are adjusted each time driving rights allocation sample data is obtained, ultimately yielding the driving rights allocation model. Specifically, multiple sets of flight sample data are traversed. For each set, action prediction sample data is generated from the flight sample data; driver evaluation sample data is generated from the driver state sample data, driver action sample data, and action prediction sample data; and driving rights allocation sample data is generated by the model to be trained from the flight sample data and driver evaluation sample data. Training iterates until all model loss information meets the training termination condition, at which point the driving rights allocation model is obtained.
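The iteration described above can be sketched as a loop over samples that stops once every loss is small enough. The sketch below is a toy under stated assumptions: the model to be trained is replaced by a single-weight logistic function, the model loss information by a squared error against a hypothetical `target_share` label, and the predictor and evaluator by stub callables. None of these stand-ins come from this application.

```python
import math


def train_allocation_model(samples, predict_action, evaluate_driver,
                           lr=0.5, max_epochs=500, tol=1e-2):
    theta = 0.0  # single weight of a toy stand-in for the model to be trained
    for epoch in range(1, max_epochs + 1):
        worst_loss = 0.0
        for sample in samples:  # traverse the sets of flight sample data
            predicted = predict_action(sample)
            evaluation = evaluate_driver(sample, predicted)
            # Driver's share of driving rights output by the toy model.
            share = 1.0 / (1.0 + math.exp(-theta * evaluation))
            error = share - sample["target_share"]
            worst_loss = max(worst_loss, error ** 2)
            # Gradient step on the squared error (stand-in for the model loss).
            theta -= lr * 2.0 * error * share * (1.0 - share) * evaluation
        if worst_loss < tol:  # termination condition: every loss is small enough
            break
    return theta, epoch


# Hypothetical samples: an alert driver and an impaired one.
samples = [
    {"state_score": 1.0, "action": 0.4, "target_share": 1.0},
    {"state_score": -1.0, "action": 0.4, "target_share": 0.0},
]
theta, epochs = train_allocation_model(
    samples,
    predict_action=lambda s: 0.4,  # stub predictor
    evaluate_driver=lambda s, predicted: s["state_score"] - abs(predicted - s["action"]),
)
```

The loop keeps the order of the text: predict the action, evaluate the driver, produce an allocation, then adjust the parameters, with termination checked once per pass over the samples.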

[0065] In summary, the model training method provided in this application predicts driving actions based on flight sample data to obtain action prediction sample data. The driver is then evaluated based on driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data, which serves as the basis for assessing the driver's driving ability. The flight sample data and driver evaluation sample data are input into a preset model to be trained, and the driving rights allocation sample data output by that model is used to train it, yielding a driving rights allocation model. Because the driver evaluation sample data is derived from the driver state sample data, driver action sample data, and action prediction sample data, the model to be trained learns during training how the driving rights allocation result depends on different driver states and driver actions. The trained driving rights allocation model can therefore accurately identify the driver's current driving ability and determine the driving rights allocation data. Automatically allocating driving rights based on this data improves the flight safety of the flying car.

[0066] In one embodiment, step S202 includes: inputting flight sample data into a pre-trained driving action prediction model to obtain action prediction sample data.

[0067] The driving action prediction model is trained based on the action prediction loss information of the action prediction sample data. The action prediction loss information is obtained by fitting the first strategy loss information, the first value loss information, and the first entropy loss information. The first strategy loss information represents the quality of the strategy that generates the action prediction sample data, the first value loss information represents the value deviation between the action prediction sample data and the real action data, and the first entropy loss information represents the uncertainty of the action prediction sample data.

[0068] Specifically, the driving action prediction model is a model with convolutional neural networks and long short-term memory networks. It extracts local spatial features from the action prediction sample data by performing continuous convolution on the action prediction sample data. Then, the local spatial features and temporal features are input together into the long short-term memory network to capture the temporal dependency between the local spatial features and the temporal features, thereby predicting the driver's operation actions and obtaining action prediction sample data.
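A toy forward pass can make the convolution-then-LSTM structure concrete. The sketch below assumes that "continuous convolution" means stacked one-dimensional convolution layers, uses a scalar LSTM cell, and takes a short history of one input signal as its input. The weights, kernel values, and function names are all hypothetical, and the model is untrained.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation form, as used in CNN layers)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, gates):
    """One step of a scalar LSTM cell; gates[name] = (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, b = gates[name]
        return w_x * x + w_h * h + b
    i = sigmoid(pre("input"))
    f = sigmoid(pre("forget"))
    o = sigmoid(pre("output"))
    g = math.tanh(pre("candidate"))
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c


def predict_action(signal, kernels, gates, w_out=1.0, b_out=0.0):
    features = list(signal)
    for kernel in kernels:  # stacked convolutions, each followed by ReLU
        features = [max(0.0, v) for v in conv1d(features, kernel)]
    h = c = 0.0
    for v in features:      # the LSTM consumes the feature sequence in time order
        h, c = lstm_step(v, h, c, gates)
    return w_out * h + b_out


# Hypothetical toy weights and a short altitude-command history.
GATES = {name: (0.5, 0.1, 0.0) for name in ("input", "forget", "output", "candidate")}
history = [0.0, 0.2, 0.4, 0.5, 0.7, 0.8]
prediction = predict_action(history, kernels=[[0.5, 0.5], [1.0, -0.5]], gates=GATES)
```

The convolutions summarize local patterns in the signal, and the recurrent cell carries information across time steps before a linear readout produces the predicted action value.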

[0069] The driving action prediction model is trained based on the action prediction loss information from the action prediction sample data. The formula for calculating the action prediction loss information is as follows:

[0070] (formula not reproduced in the source text)

[0071] where the terms of the formula are the first strategy loss information, the first value loss information, and the first entropy loss information, and the two remaining symbols are hyperparameters.

[0072] The formula for calculating the loss information of the first strategy is:

[0073] (formula not reproduced in the source text)

[0074] The formula for calculating the first value loss information is:

[0075] to [0082] (formulas not reproduced in the source text)

[0083] The formula for calculating the first entropy loss information is:

[0084] to [0085] (formulas not reproduced in the source text)

[0086] where the symbols denote, in the order the source lists them: the ratio between the new and old strategies; the state; the action; the advantage function; the clipping threshold; the value estimated for the action prediction; the corresponding reward value; the value assigned by the strategy to taking the driving action in the given state; the driver action prediction reward function; the reward information for driver action prediction accuracy; the safety reward information for the flying car; the driver comfort reward information; the weighting coefficients of these three rewards; the collision risk reward information; the stability reward information; the number of sampling points within the time window T; the action deviation; the average action deviation within the time window; the minimum distance between the flying car and surrounding obstacles; the safe distance; the attitude angle deviation of the flying car; the stability threshold; two further weighting coefficients; the control acceleration; and the threshold for smooth operation.
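The quantities listed in paragraph [0086] (a ratio between new and old strategies, a threshold applied to that ratio, a value estimate, and an entropy term) resemble the clipped surrogate objective of proximal policy optimization. The formulas themselves are not available in this text, so the sketch below assumes the standard PPO forms and sign conventions. The function names, default coefficients, and the way the three terms are combined are assumptions, not the formulas of this application.

```python
import math


def clipped_policy_loss(ratios, advantages, clip_eps=0.2):
    """Negative mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    terms = []
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        terms.append(min(r * a, clipped * a))
    return -sum(terms) / len(terms)


def value_loss(estimates, targets):
    """Mean squared deviation between estimated values and reward targets."""
    return sum((v - g) ** 2 for v, g in zip(estimates, targets)) / len(estimates)


def entropy(probabilities):
    """Shannon entropy of an action distribution (higher means more uncertain)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def total_loss(policy, value, ent, c_value=0.5, c_entropy=0.01):
    # Two hyperparameters weight the value and entropy terms (assumed form).
    return policy + c_value * value - c_entropy * ent


loss = total_loss(
    clipped_policy_loss([1.5, 0.9], [1.0, -0.5]),
    value_loss([1.0, 2.0], [1.0, 4.0]),
    entropy([0.5, 0.5]),
)
```

Clipping the ratio limits how far one update can move the strategy away from the old one, while the entropy term keeps the predicted action distribution from collapsing too early.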

[0087] In one embodiment, step S203 includes: generating first evaluation sample data representing driving state based on driver state sample data; generating second evaluation sample data representing driving skills based on action prediction sample data and driver action sample data; and evaluating the driver based on the first evaluation sample data and the second evaluation sample data to obtain driver evaluation sample data.

[0088] The formula for calculating the first evaluation sample data is:

[0089] to [0092] (formulas not reproduced in the source text)

[0093] where the symbols denote, in the order the source lists them: the first evaluation sample data; the attention score; the tension score; the fatigue score; the three scoring weights; the deviations of the measured blink frequency, fixation time, gaze deviation, heart rate, and heart rate variability from their respective normal values; and the maximum permissible value of each of these five deviations.
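One plausible reading of the definitions above is that each physiological deviation is normalized by its maximum permissible deviation and the resulting scores are combined with weights. The sketch below follows that reading. The normalization rule, the grouping of indicators into attention, tension, and fatigue, the weights, and every numeric value are assumptions made for illustration, since the formulas are not available in this text.

```python
def indicator_score(deviation, max_deviation):
    """Map a deviation from the normal value to a score in [0, 1]."""
    return 1.0 - min(abs(deviation) / max_deviation, 1.0)


def category_score(indicators):
    """Average the scores of (deviation, max_deviation) pairs."""
    scores = [indicator_score(d, m) for d, m in indicators]
    return sum(scores) / len(scores)


def state_evaluation(attention, tension, fatigue, weights=(0.4, 0.3, 0.3)):
    w_attention, w_tension, w_fatigue = weights
    return w_attention * attention + w_tension * tension + w_fatigue * fatigue


# Hypothetical grouping of indicators into the three scores.
attention = category_score([(0.1, 0.5), (2.0, 10.0)])   # fixation time, gaze deviation
tension = category_score([(12.0, 40.0), (5.0, 25.0)])   # heart rate, heart rate variability
fatigue = category_score([(3.0, 10.0)])                 # blink frequency
first_evaluation = state_evaluation(attention, tension, fatigue)
```

With this rule a driver whose measurements all sit at their normal values scores 1, and any indicator at or beyond its permissible limit contributes 0.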

[0094] The formula for calculating the second evaluation sample data is:

[0095] to [0105] (formulas not reproduced in the source text)

[0106] where the symbols denote, in the order the source lists them: the second evaluation sample data; the driver skill scoring data; the driving environment complexity index (R); the airspace condition parameter and the traffic density parameter (both rendered as T in the source); the meteorological condition parameter (W); five scoring weights; the accuracy score; the stability score; the sensitivity score; the predicted action value at time t; the actual operation value at time t; the number of sampling points within the time window T; the maximum permissible deviation; the average action deviation within the time window; the action deviation; the action fluctuation within the time window; the maximum permissible fluctuation; the penalty coefficient applied to fluctuation in the score; the normalized rate of change of the deviation; the rate of change of the deviation; the maximum permissible rate of change of the deviation; and the sliding step of the sliding time window.
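The accuracy and stability scores described above both derive from the deviation between predicted and actual actions over a time window. A minimal sketch follows, assuming that accuracy falls linearly with the mean deviation and that stability falls linearly with the standard deviation of the deviations. The sensitivity score and the environment complexity index are left out, and all function names, weights, and limits are hypothetical.

```python
import math


def action_deviations(predicted, actual):
    """Absolute deviation at each sampling point in the time window."""
    return [abs(p - a) for p, a in zip(predicted, actual)]


def accuracy_score(deviations, max_deviation):
    mean_dev = sum(deviations) / len(deviations)
    return 1.0 - min(mean_dev / max_deviation, 1.0)


def stability_score(deviations, max_fluctuation, penalty=1.0):
    mean_dev = sum(deviations) / len(deviations)
    fluctuation = math.sqrt(
        sum((d - mean_dev) ** 2 for d in deviations) / len(deviations))
    return max(0.0, 1.0 - penalty * min(fluctuation / max_fluctuation, 1.0))


def skill_evaluation(predicted, actual, max_deviation, max_fluctuation,
                     weights=(0.6, 0.4)):
    devs = action_deviations(predicted, actual)
    w_accuracy, w_stability = weights
    return (w_accuracy * accuracy_score(devs, max_deviation)
            + w_stability * stability_score(devs, max_fluctuation))


# Hypothetical window of predicted versus actual altitude inputs.
second_evaluation = skill_evaluation(
    predicted=[300.0, 302.0, 305.0],
    actual=[300.0, 303.0, 309.0],
    max_deviation=10.0,
    max_fluctuation=5.0,
)
```

Separating the two scores distinguishes a driver who is consistently slightly off from one whose inputs swing widely around the predicted action.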

[0107] The formula for calculating driver evaluation sample data is as follows:

[0108] (formula not reproduced in the source text)

[0109] where the result is the driver evaluation sample data and the two remaining symbols are scoring weights.

[0110] In one embodiment, step S204 includes: predicting the driving scenario at future moments based on flight sample data and driver evaluation sample data to obtain multiple future scenario feature sample data; calculating the scenario reward data corresponding to each future scenario feature sample data; and outputting the corresponding driving right allocation sample data based on the scenario reward data.

[0111] Specifically, the model to be trained determines the driver's current driving ability and the risk of the current flight environment based on flight sample data and driver evaluation sample data, and from these predicts the driving operations the driver will take in response to the current flight environment and the driving scenarios at future moments. From the predicted driving operations and future driving scenarios, it generates multiple future scenario feature sample data, each representing a possible future driving scenario, and calculates the scenario reward data corresponding to each. Based on the calculated scenario reward data, it predicts whether the driver is able to cope with the risk of the most likely future scenario, and obtains the corresponding driving rights allocation sample data from the prediction result: if the driver can cope, driving rights are allocated to the driver; if the driver cannot cope, driving rights are allocated to the autonomous driving system. The driving rights allocation sample data may include driving response plans predicted based on flight sample data.
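The decision at the end of this step can be illustrated with a small rule: pick the most likely future scenario and compare its reward with a threshold. The threshold test, the field names, and the numeric values below are hypothetical. The source states only that the decision is based on the scenario reward data of the most likely future scenario.

```python
from dataclasses import dataclass


@dataclass
class FutureScene:
    name: str
    probability: float   # predicted likelihood of this future scenario
    reward: float        # scenario reward with the driver in control


def allocate_driving_rights(scenes, cope_threshold=0.5):
    """Return who receives driving rights and the scenario that decided it."""
    most_likely = max(scenes, key=lambda s: s.probability)
    if most_likely.reward >= cope_threshold:
        holder = "driver"
    else:
        holder = "autonomous_system"
    return holder, most_likely.name


scenes = [
    FutureScene("clear_cruise", probability=0.2, reward=0.9),
    FutureScene("dense_traffic_crosswind", probability=0.7, reward=0.3),
    FutureScene("obstacle_avoidance", probability=0.1, reward=0.1),
]
holder, deciding_scene = allocate_driving_rights(scenes)
```

In this example the most likely scenario has a low reward, so driving rights go to the autonomous driving system even though a less likely scenario would favor the driver.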

[0112] In one embodiment, predicting the driving scenario at future moments based on the flight sample data and the driver evaluation sample data to obtain multiple future scenario feature sample data includes: extracting features from the flight sample data and the driver evaluation sample data through continuous convolution to obtain current scenario feature sample data; and predicting the driving scenario at future moments based on the transformation probability of the current scenario feature sample data at future moments to obtain multiple future scenario feature sample data.

[0113] Specifically, the model to be trained includes a convolutional neural network and a probabilistic prediction network. Continuous convolution is performed on the flight sample data and the driver evaluation sample data to extract the local parameter features of both, yielding the current scene feature sample data. The current scene feature sample data and temporal features are then input into the probabilistic prediction network, which captures the temporal dependency between them. In this way, the transformation probability at future moments is predicted and used to predict the driving scene at future moments, resulting in multiple future scene feature sample data.
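The two stages described above, stacked convolutions for local feature extraction followed by a probability distribution over future scenes, can be illustrated with a small pure-Python sketch. Everything here is an assumption made for illustration: the kernels, the ReLU activation, the use of a softmax in place of the probabilistic prediction network, and the function names are hypothetical, and the step that maps features to per-scene scores is omitted.

```python
import math


def conv1d(signal, kernel):
    """Valid 1-D convolution (cross-correlation, as in most CNN layers)."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def extract_features(signal, kernels):
    """Apply successive convolutions, each followed by a ReLU."""
    out = list(signal)
    for kernel in kernels:
        out = [max(0.0, v) for v in conv1d(out, kernel)]
    return out


def softmax(scores):
    """Turn raw scene scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_future_scenes(scene_scores, k=2):
    """Return the k most probable future scene indices with probabilities."""
    probs = softmax(scene_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]
```

Keeping several candidate scenes instead of only the single most probable one matches the description's "multiple future scene feature sample data", each of which is later scored with its own reward.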

[0114] In one embodiment, the scene reward data is obtained by fitting the first reward sample data, the second reward sample data, the third reward sample data, the fourth reward sample data, and the fifth reward sample data.

[0115] Each reward sample data describes the driving scenario corresponding to the future scenario feature sample data: the first reward sample data represents its flight efficiency, the second reward sample data represents its safety, the third reward sample data represents its smoothness, and the fourth reward sample data represents its human-machine collaboration success rate. The fifth reward sample data represents the quality of the strategy that generated the future scenario feature sample data.

[0116] The formula for calculating scene reward data is:

[0117] ,

[0118] where the terms denote, in order: the scene reward data; the five weighting coefficients; and the first, second, third, fourth, and fifth reward sample data.
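Since five weighting coefficients are paired with five reward components, one plausible reading is a weighted sum. The sketch below assumes exactly that; the function name `scene_reward`, the argument order (efficiency, safety, smoothness, collaboration, strategy quality), and the example weights are hypothetical and are not taken from the formula in [0117].

```python
def scene_reward(rewards, weights):
    """Weighted sum of the five reward components for one future scene.

    Assumption: rewards and weights are given in the same order, namely
    efficiency, safety, smoothness, collaboration, strategy quality.
    """
    if len(rewards) != 5 or len(weights) != 5:
        raise ValueError("exactly five rewards and five weights are required")
    return sum(w * r for w, r in zip(weights, rewards))


def best_scene(scene_rewards, weights):
    """Return the index of the scene with the highest total reward."""
    totals = [scene_reward(r, weights) for r in scene_rewards]
    return max(range(len(totals)), key=lambda i: totals[i])
```

Weighting safety more heavily than efficiency, for example, would make a slower but safer scene score higher than a fast one with small obstacle clearance.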

[0119] The formula for calculating the first reward sample data is:

[0120] ,

[0121] ,

[0122] ,

[0123] where the terms denote, in order: the weighting coefficients; the shortest flight time estimated from maps and traffic conditions; the total flight time of the flying car during the flight; the average energy consumption of the flying car under standard operating conditions; and the energy consumption rate of the flying car during the flight.

[0124] The formula for calculating the second reward sample data is:

[0125] ,

[0126] ,

[0127] ,

[0128] where the terms denote, in order: the weighting coefficients; the minimum distance between the flying car and surrounding obstacles; the safe distance; the attitude angle deviation of the flying car; and the stability threshold.

[0129] The formula for calculating the third reward sample data is:

[0130] ,

[0131] where the terms denote, in order: the weighting coefficient; the control acceleration; and the smooth-operation threshold.

[0132] The formula for calculating the fourth reward sample data is:

[0133] ,

[0134] where the terms denote, in order: the weighting coefficient; and the success rate of human-machine collaboration.

[0135] The formula for calculating the fifth reward sample data is:

[0136] ,

[0137] ,

[0138] ,

[0139] where the terms denote, in order: the weighting coefficients; the comparison between the performance metrics under the current strategy and those of the previous iteration; and the success rate of the system's emergency response to sudden situations.

[0140] In one embodiment, step S205 includes: determining the model loss information corresponding to the driving rights allocation sample data; and iteratively adjusting the network parameters in the model to be trained based on the model loss information until the training termination condition is met, thereby obtaining the driving rights allocation model. The model loss information is obtained by fitting second policy loss information, second value loss information, and second entropy loss information. The second policy loss information characterizes the quality of the strategy used to generate the driving rights allocation sample data; the second value loss information characterizes the value deviation between the driving rights allocation sample data and the actual driving rights allocation data; and the second entropy loss information characterizes the uncertainty of the driving rights allocation sample data.
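One common way to combine a policy term, a value term, and an entropy term is the actor-critic style objective sketched below. This is an assumption, not the patent's stated formula: the squared-error value loss, the subtraction of an entropy bonus, the coefficients `c_value` and `c_entropy`, and the function names are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of an allocation distribution, e.g. [p_driver, p_auto]."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def model_loss(policy_loss, value_pred, value_target, action_probs,
               c_value=0.5, c_entropy=0.01):
    """Combine policy, value, and entropy terms into one scalar loss.

    Assumptions: the value loss is a squared error against the target,
    and entropy is subtracted so that more uncertain (exploratory)
    allocations are penalized less.
    """
    value_loss = (value_pred - value_target) ** 2
    return policy_loss + c_value * value_loss - c_entropy * entropy(action_probs)
```

The entropy of a two-way allocation is largest at `[0.5, 0.5]` and zero at `[1.0, 0.0]`, which is why it serves as a measure of the uncertainty of the allocation output.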

[0141] Specifically, the network parameters in the model to be trained are iteratively adjusted based on the model loss information. A loss threshold range and a reset threshold can be preset as the training termination conditions: when the model loss information is within the loss threshold range and the number of training resets reaches the reset threshold, the training ends, and the model to be trained from the last iteration is the driving rights allocation model. While the number of training resets has not reached the reset threshold, the network parameters of the model to be trained are adjusted according to how far the model loss information deviates from the loss threshold range, so that the model loss information gradually approaches the loss threshold range over the iterations and eventually falls within it. New flight sample data is then acquired; driving action prediction is performed based on the flight sample data to obtain action prediction sample data; the driver is evaluated based on the driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data, which serves as the basis for evaluating the driver's driving ability; and the flight sample data and driver evaluation sample data are input into the preset model to be trained to obtain new driving rights allocation sample data. This continues until the model loss information falls within the loss threshold range. The above steps are repeated until the number of training resets reaches the reset threshold, at which point the training ends and the driving rights allocation model is obtained. This training allows the driving rights allocation model to learn how the driving rights allocation results depend on different driver states and driver actions.
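The termination logic above can be sketched as two nested loops. This is one interpretation, under stated assumptions: each "training reset" is treated as one round that starts with freshly acquired sample data, the inner loop adjusts parameters until the loss enters the threshold range, and `max_steps_per_round` is an added safety cap that the source does not mention. The callables `sample_batch`, `compute_loss`, and `update` are hypothetical stand-ins for the data pipeline and the model.

```python
from typing import Callable, Tuple


def train(sample_batch: Callable[[], object],
          compute_loss: Callable[[object], float],
          update: Callable[[float], None],
          loss_range: Tuple[float, float],
          reset_threshold: int,
          max_steps_per_round: int = 1000) -> int:
    """Run training rounds until the reset threshold is reached.

    Returns the number of completed rounds (training resets).
    """
    low, high = loss_range
    rounds = 0
    while rounds < reset_threshold:
        batch = sample_batch()  # acquire new flight sample data
        for _ in range(max_steps_per_round):
            loss = compute_loss(batch)
            if low <= loss <= high:
                break  # loss is within the threshold range
            # Adjust parameters by the deviation from the nearest bound.
            deviation = loss - high if loss > high else low - loss
            update(deviation)
        rounds += 1
    return rounds
```

Requiring several rounds, rather than stopping the first time the loss enters the range, checks that the model stays within the range on newly acquired samples and not only on the data it was just fitted to.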

[0142] Figure 3 is a flowchart of a method for allocating driving rights according to an embodiment of this application. Referring to Figure 3, in one embodiment, the method includes, but is not limited to, steps S301 to S305.

[0143] Step S301: Acquire flight data.

[0144] The flight data includes flight environment data, driver status data, driver action data, and vehicle condition data. Flight environment data may include environmental data such as obstacle location, dynamic status, weather conditions, no-fly zone location, and air traffic density. Driver status data may include physiological data such as the driver's saccade rate and heart rate. Driver action data may include action data such as the speed, altitude, and direction input by the driver. Vehicle condition data may include data such as the flying car's flight speed, flight altitude, pitch angle, heading angle, roll angle, and angular velocity.
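The four categories of flight data listed above could be represented with a simple container such as the following. This is a sketch only: the class and field names, the types, and the implied units are hypothetical, and only a subset of the listed quantities is shown.

```python
from dataclasses import dataclass, field


@dataclass
class FlightEnvironment:
    obstacle_positions: list = field(default_factory=list)  # (x, y, z) tuples
    weather: str = "clear"
    no_fly_zones: list = field(default_factory=list)
    air_traffic_density: float = 0.0


@dataclass
class DriverStatus:
    saccade_rate: float  # eye saccades per unit time (unit assumed)
    heart_rate: float    # beats per minute (unit assumed)


@dataclass
class DriverAction:
    speed_input: float
    altitude_input: float
    direction_input: float


@dataclass
class VehicleCondition:
    speed: float
    altitude: float
    pitch: float
    heading: float
    roll: float
    angular_velocity: float


@dataclass
class FlightData:
    environment: FlightEnvironment
    driver_status: DriverStatus
    driver_action: DriverAction
    vehicle: VehicleCondition
```

Grouping the fields this way mirrors the later steps, where driver evaluation reads only the driver status and driver action parts while the allocation model receives the whole record.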

[0145] Step S302: Predict driving actions based on flight data to obtain action prediction data.

[0146] Action prediction data refers to driver action data obtained by predicting, from the flight data at an earlier time, the driving actions the driver will perform at a later time.

[0147] In one embodiment, driving actions can be predicted from the flight data by using a pre-trained driving action prediction model, which predicts the driving actions the driver will perform at a later time based on the flight data from an earlier time step, thereby obtaining the corresponding action prediction data.

[0148] Step S303: Evaluate the driver based on driver status data, driver action data, and action prediction data to obtain driver evaluation data.

[0149] Driver evaluation data refers to evaluation data that predicts the driver's driving ability at the predicted time based on the driver status data, the driver action data, and the corresponding action prediction data.

[0150] In one embodiment, evaluating a driver based on driver status data, driver action data, and action prediction data can be achieved by using a preset driver evaluation model to predict the driver's driving ability at the current moment based on driver status data, driver action data, and action prediction data at the same time, and then evaluating the driver's driving ability at the current moment to obtain driver evaluation data.

[0151] Step S304: Input the flight data and driver evaluation data into the driving rights allocation model to obtain driving rights allocation data.

[0152] The driving rights allocation model is trained using the model training method described above.

[0153] Driving rights allocation data refers to data that instructs whether driving rights are allocated to the driver or to the autopilot system of the flying car.

[0154] In one embodiment, after inputting flight data and driver evaluation data into a pre-trained driving rights allocation model, the driving rights allocation model determines the driver's current driving ability and the risk of the current flight environment based on the flight data and driver evaluation data, and predicts whether the driver's current driving ability can cope with the risk of the current flight environment. Based on the prediction result, corresponding driving rights allocation data is obtained. If the driver can cope, driving rights allocation data is obtained to allocate driving rights to the driver; if the driver cannot cope, driving rights allocation data is obtained to allocate driving rights to the autopilot system. The driving rights allocation data may include driving response plans predicted based on flight data.

[0155] Step S305: Perform the driving rights allocation action based on the driving rights allocation data.

[0156] The driving rights allocation method provided in this application acquires flight data in real time; predicts driving actions based on the flight data to obtain action prediction data; evaluates the driver based on the driver status data, driver action data, and action prediction data to obtain driver evaluation data as the basis for evaluating the driver's driving ability; inputs the flight data and driver evaluation data into a pre-trained driving rights allocation model to obtain driving rights allocation data; and finally performs the driving rights allocation action based on the driving rights allocation data. This method can accurately identify the driver's current driving ability and determine the driving rights allocation data. Automatically allocating driving rights based on this driving rights allocation data can improve the flight safety of flying cars.
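Steps S301 to S305 form a straight pipeline, which can be sketched as below. The function name and the five callables are hypothetical placeholders for the data acquisition, the pre-trained driving action prediction model, the driver evaluation model, the trained driving rights allocation model, and the actuator that hands over control; none of these interfaces are specified in the source.

```python
from typing import Any, Callable


def allocate_driving_rights(
    acquire: Callable[[], Any],
    predict_action: Callable[[Any], Any],
    evaluate_driver: Callable[[Any, Any], Any],
    allocation_model: Callable[[Any, Any], Any],
    execute: Callable[[Any], None],
) -> Any:
    """Run one pass of the allocation pipeline and return the allocation."""
    flight_data = acquire()                                 # S301
    predicted = predict_action(flight_data)                 # S302
    evaluation = evaluate_driver(flight_data, predicted)    # S303
    allocation = allocation_model(flight_data, evaluation)  # S304
    execute(allocation)                                     # S305
    return allocation
```

In a deployed system this function would be called repeatedly as new flight data arrives, so that control can move between the driver and the autopilot system as conditions change.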

[0157] Figure 4 is a block diagram illustrating an electronic device according to an exemplary embodiment.

[0158] An electronic device 400 according to such an embodiment of the present disclosure is described below with reference to Figure 4. The electronic device 400 shown in Figure 4 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments disclosed herein.

[0159] As shown in Figure 4, the electronic device 400 is presented in the form of a general-purpose computing device. The components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including storage unit 420 and processing unit 410), a display unit 440, etc.

[0160] The storage unit 420 stores program code that can be executed by the processing unit 410, causing the processing unit 410 to perform the steps of the various exemplary embodiments of this disclosure described in the method section of this specification above.

[0161] Storage unit 420 may include a readable medium in the form of a volatile storage unit, such as random access memory (RAM) 4201 and / or cache memory 4202, and may further include a read-only memory (ROM) 4203.

[0162] Storage unit 420 may also include a program / utility 4204 having a set of (at least one) program modules 4205. Such program modules 4205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.

[0163] Bus 430 can represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.

[0164] Electronic device 400 can also communicate with one or more external devices 400' (e.g., keyboard, pointing device, Bluetooth device, etc.), and with one or more devices that enable a user to interact with electronic device 400, and / or with any device that enables electronic device 400 to communicate with one or more other computing devices (e.g., router, modem, etc.). This communication can be performed via input / output (I / O) interface 450. Furthermore, electronic device 400 can also communicate with one or more networks (e.g., local area network (LAN), wide area network (WAN), and / or public networks, such as the Internet) via network adapter 460. Network adapter 460 can communicate with other modules of electronic device 400 via bus 430. It should be understood that, although not shown in the figures, other hardware and / or software modules can be used in conjunction with electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.

[0165] This application also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method.

[0166] The model training method, driving rights allocation method, device, and storage medium provided in the embodiments of this application predict driving actions based on flight sample data to obtain action prediction sample data. The driver is then evaluated based on driver state sample data, driver action sample data, and action prediction sample data to obtain driver evaluation sample data, which serves as the basis for evaluating the driver's driving ability. The flight sample data and driver evaluation sample data are input into a preset model to be trained, and the driving rights allocation sample data output by the model to be trained is used to train it, yielding the driving rights allocation model. Because the driver evaluation sample data is obtained by evaluating the driver based on driver state sample data, driver action sample data, and action prediction sample data, training in this way allows the model to learn how the driving rights allocation results depend on different driver states and driver actions. The trained driving rights allocation model can therefore accurately identify the driver's current driving ability and determine the driving rights allocation data. Automatically allocating driving rights based on this driving rights allocation data can improve the flight safety of the flying car.

[0167] From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, or network device, etc.) to execute the methods described above according to the embodiments of this disclosure.

[0168] The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of readable storage media include: electrical connections having one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0169] A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium other than a readable storage medium that can transmit, propagate, or transfer a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical fiber, RF, etc., or any suitable combination thereof.

[0170] Those skilled in the art will understand that the above modules can be distributed in the device as described in the embodiments, or they can be modified accordingly and located in one or more devices different from that of this embodiment. The modules in the above embodiments can be combined into one module, or they can be further divided into multiple sub-modules.

[0171] Exemplary embodiments of this disclosure have been specifically shown and described above. It should be understood that this disclosure is not limited to the detailed structures, arrangements, or implementation methods described herein; rather, this disclosure covers various modifications and equivalent arrangements contained within the spirit and scope of the appended claims.

Claims

1. A model training method, characterized by comprising:
acquiring flight sample data, the flight sample data including flight environment sample data, driver state sample data, driver action sample data, and vehicle condition sample data;
predicting driving actions based on the flight sample data to obtain action prediction sample data;
evaluating the driver based on the driver state sample data, the driver action sample data, and the action prediction sample data to obtain driver evaluation sample data;
inputting the flight sample data and the driver evaluation sample data into a preset model to be trained to obtain driving rights allocation sample data; and
training the model to be trained based on the driving rights allocation sample data to obtain a driving rights allocation model;
wherein evaluating the driver based on the driver state sample data, the driver action sample data, and the action prediction sample data to obtain driver evaluation sample data includes:
generating, based on the driver state sample data, first evaluation sample data characterizing the driving state;
generating, based on the action prediction sample data and the driver action sample data, second evaluation sample data characterizing driving skills; and
evaluating the driver based on the first evaluation sample data and the second evaluation sample data to obtain the driver evaluation sample data;
wherein inputting the flight sample data and the driver evaluation sample data into the preset model to be trained to obtain driving rights allocation sample data includes:
predicting the driving scenario at future moments based on the flight sample data and the driver evaluation sample data to obtain multiple future scenario feature sample data;
calculating the scenario reward data corresponding to each of the future scenario feature sample data; and
outputting the corresponding driving rights allocation sample data based on the scenario reward data;
wherein training the model to be trained based on the driving rights allocation sample data to obtain the driving rights allocation model includes:
determining the model loss information corresponding to the driving rights allocation sample data, wherein the model loss information is obtained by fitting second strategy loss information, second value loss information, and second entropy loss information, the second strategy loss information characterizes the quality of the strategy that generates the driving rights allocation sample data, the second value loss information characterizes the value deviation between the driving rights allocation sample data and the real driving rights allocation data, and the second entropy loss information characterizes the uncertainty of the driving rights allocation sample data; and
iteratively adjusting the network parameters in the model to be trained based on the model loss information until the training termination condition is met, thereby obtaining the driving rights allocation model.

2. The model training method of claim 1, wherein predicting driving actions based on the flight sample data to obtain action prediction sample data includes:
inputting the flight sample data into a pre-trained driving action prediction model to obtain the action prediction sample data;
wherein the driving action prediction model is trained based on action prediction loss information of the action prediction sample data; the action prediction loss information is obtained by fitting first strategy loss information, first value loss information, and first entropy loss information; the first strategy loss information characterizes the quality of the strategy that generates the action prediction sample data; the first value loss information characterizes the value deviation between the action prediction sample data and the real action data; and the first entropy loss information characterizes the uncertainty of the action prediction sample data.

3. The model training method of claim 1, wherein predicting the driving scenario at future moments based on the flight sample data and the driver evaluation sample data to obtain multiple future scenario feature sample data includes:
extracting features from the flight sample data and the driver evaluation sample data through continuous convolution to obtain current scenario feature sample data; and
predicting the driving scenario at future moments based on the transformation probability of the current scenario feature sample data at future moments to obtain multiple future scenario feature sample data.

4. The model training method of claim 1, wherein the scenario reward data is obtained by fitting first reward sample data, second reward sample data, third reward sample data, fourth reward sample data, and fifth reward sample data; the first reward sample data represents the flight efficiency of the driving scenario corresponding to the future scenario feature sample data; the second reward sample data represents the safety of the driving scenario corresponding to the future scenario feature sample data; the third reward sample data represents the stability of the driving scenario corresponding to the future scenario feature sample data; the fourth reward sample data represents the human-machine collaboration success rate of the driving scenario corresponding to the future scenario feature sample data; and the fifth reward sample data represents the quality of the strategy that generated the future scenario feature sample data.

5. A driving rights allocation method, characterized by comprising:
acquiring flight data, the flight data including flight environment data, driver status data, driver action data, and vehicle condition data;
predicting driving actions based on the flight data to obtain action prediction data;
evaluating the driver based on the driver status data, the driver action data, and the action prediction data to obtain driver evaluation data;
inputting the flight data and the driver evaluation data into a driving rights allocation model to obtain driving rights allocation data, the driving rights allocation model being trained by the model training method of any one of claims 1 to 4; and
performing the driving rights allocation action based on the driving rights allocation data.

6. An electronic device, comprising a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the method of any one of claims 1 to 5.

7. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 5.