A seat adjustment and autonomous driving decision coordination optimization system and method

By coordinating and optimizing the seat perception module, autonomous driving decision-making module, and seat control module, and combining deep learning and reinforcement learning algorithms, the problem of coordination between seat adjustment and driving decision-making in autonomous driving systems has been solved, thereby improving passenger comfort and safety.

CN117863982B · Active · Publication Date: 2026-06-23 · JILIN UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
JILIN UNIVERSITY
Filing Date
2024-02-04
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing autonomous driving systems lack coordinated optimization between seat adjustment and autonomous driving decisions, which fails to effectively alleviate passenger motion sickness symptoms and does not fully consider individual passenger differences and comfort needs.

Method used

Develop a seat adjustment and autonomous driving decision-making collaborative optimization system. By combining a seat perception module, an autonomous driving decision-making module, a seat control module, and a collaborative optimization algorithm module, the system can adjust seat settings and driving strategies in real time and optimize passenger comfort and safety using pressure sensors, deep learning, and reinforcement learning algorithms.

Benefits of technology

Significantly reduces motion sickness symptoms, improves passenger comfort and safety, provides a personalized riding experience, and ensures optimal comfort and safety under various driving conditions.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN117863982B_ABST
Patent Text Reader

Abstract

The application discloses a seat adjustment and automatic driving decision cooperation optimization system and method, which comprises a seat sensing module, an automatic driving decision module electrically connected with the seat sensing module, a seat control module electrically connected with the automatic driving decision module, and a cooperation optimization algorithm module bidirectionally electrically connected with the seat sensing module, the automatic driving module and the seat control module. The seat setting and the automatic driving strategy can be dynamically adjusted according to the motion sickness of passengers, and the driving comfort and safety are improved. The application also designs and develops a seat adjustment and automatic driving decision cooperation optimization method, which can automatically adjust the shape of the seat according to the state of passengers and the driving condition of the automobile, and improve the riding comfort.

Description

Technical Field

[0001] This invention relates to a seat adjustment and autonomous driving decision-making collaborative optimization system and method, belonging to the field of autonomous vehicle technology.

Background Technology

[0002] With the rapid development of autonomous vehicle technology, passenger comfort has become a critical issue. Passengers in autonomous vehicles are prone to motion sickness symptoms, including nausea, dizziness, and headaches, which seriously affects the acceptability and practical application of autonomous driving technology. Current autonomous driving systems do not adequately consider passenger motion sickness, resulting in a poor riding experience and potentially threatening passenger health and safety.

[0003] Motion sickness symptoms primarily stem from the mismatch between the movement of autonomous vehicles and the seating arrangement of passengers. Existing autonomous driving systems typically prioritize vehicle safety and planning decisions, often neglecting the physiological state of passengers. Furthermore, seating arrangements are often fixed or based solely on simple passenger weight adjustment, without adequately considering individual passenger differences and comfort needs.

[0004] There is some existing research on autonomous driving and passenger comfort, including seat airbag technology, intelligent seat adjustment systems, and biofeedback monitoring technology. However, these achievements are often isolated and do not achieve synergistic optimization between seat adjustment and autonomous driving decisions. Therefore, current solutions are still insufficient to effectively alleviate motion sickness symptoms in passengers.

[0005] Patent CN116552345A discloses a car seat and its intelligent control method. This car seat features a backrest, seat cushion, pressure sensors, curved protective arms, and possibly a unit airbag. Designed for adults and children of different ages and sizes, the seat can automatically adjust and deform according to the occupant's needs to provide better safety and comfort. This technology aims to provide a convenient travel experience for both adults and children.

[0006] Patent CN110696695A discloses a pneumatic control device for a car seat and a similar device. This device includes an airbag inside the seat, an excitation switch that senses the deformation of the airbag, a control circuit electrically connected to the excitation switch, and an air source module electrically connected to the control circuit. The excitation switch senses the deformation force of the airbag and controls the state of the control circuit, thereby controlling the air source module to start or stop inflating or deflating the airbag. This system can sense the state of the airbag in real time and make adjustments to achieve the adjustment of the comfort and support of the car seat.

[0007] Patent CN113741201A discloses an intelligent control method for actively mitigating motion sickness at the chassis control layer, specifically targeting motion sickness in vehicles. This method involves collecting data from onboard sensors and occupant head sensors to establish a vestibular-vehicle translational dynamics model, reducing the motion sickness index and ensuring the generated path is as smooth as possible. This technology, through intelligent driving vehicle control optimization algorithms, actively mitigates motion sickness at the chassis control layer, improving the riding experience and comfort, and reducing the likelihood of occupants experiencing motion sickness.

[0008] Patent CN113581181A proposes a method for intelligent vehicle behavior decision-making at intersections, emphasizing comfort considerations. This method uses a hierarchical reinforcement learning decision-making model, including an upper-level path strategy and a lower-level action strategy, to assist intelligent vehicles in making decisions at intersections. By observing environmental conditions, including the position and speed information of vehicles and obstacles, the method generates appropriate turning radii and longitudinal accelerations. Through a reward function and reinforcement learning, the method optimizes decisions to provide a more comfortable driving experience. This helps ensure smoother vehicle behavior at intersections and meets the comfort requirements of both the driver and passengers.

[0009] Patent CN116118655B proposes and authorizes a cabin data optimization method and related apparatus aimed at alleviating motion sickness that may occur when riding in autonomous vehicles. The method includes receiving cooperative instructions, identifying the user corresponding to the seat and the user's seating situation, and then, based on vehicle condition indicators, controlling the seat's associated devices to perform cooperative operations, such as adjusting the operation mode of the seat fan, simulating the interior sound of a traditional gasoline-powered vehicle, or controlling the seat vibration mode, to adapt to special vehicle conditions, thereby reducing the passenger's motion sickness. This technology helps improve the comfort and riding experience when riding in autonomous vehicles.

[0010] Patent CN115214672A discloses a human-like decision-making, planning, and control method for autonomous driving, taking driving comfort into account. This method identifies surrounding vehicle driving styles and establishes a decision cost function, using complete information non-cooperative game theory to build a lane-changing decision model that considers driving style and interaction behavior. Simultaneously, it constructs a driving risk field model and utilizes model predictive control algorithms to plan collision-free lane-changing paths in real time, thereby improving driving comfort. This technology can make reasonable human-like lane-changing decisions in complex traffic scenarios, improving passenger comfort and demonstrating broad application potential.

[0011] The main problem with existing technologies is the lack of coordination between seat adjustment and autonomous driving decisions. Seat adjustment technology often lacks awareness of the vehicle's status and autonomous driving decisions, while autonomous driving decisions also lack sufficient information to adapt to the passenger's physiological state. Furthermore, existing technologies typically neglect the individual needs of passengers and cannot provide highly customized solutions.

[0012] The challenge in solving these problems lies in comprehensively considering the complex relationship between seat adjustment, autonomous driving decisions, and passenger physiological states. This invention develops a collaborative optimization system that combines seat perception technology, autonomous driving decision-making algorithms, and machine learning methods to achieve a high degree of collaborative optimization between seat adjustment and autonomous driving strategies. This system can dynamically adjust seat settings and autonomous driving strategies based on passenger motion sickness symptoms to provide optimal comfort and safety. Therefore, the method of this invention will provide a more pleasant riding experience for passengers in autonomous vehicles, overcoming the limitations of existing technologies.

Summary of the Invention

[0013] This invention designs and develops a seat adjustment and autonomous driving decision-making collaborative optimization system, which can dynamically adjust seat settings and autonomous driving strategies according to the passenger's motion sickness symptoms, thereby improving driving comfort and safety.

[0014] The present invention also designed and developed a method for co-optimization of seat adjustment and autonomous driving decision-making, which can automatically adjust the shape of the seat according to the passenger's state and the driving situation of the car to improve the riding comfort.

[0015] The technical solution provided by this invention is as follows:

[0016] A system and method for collaborative optimization of seat adjustment and autonomous driving decision-making, comprising:

[0017] Seat sensing module;

[0018] An autonomous driving decision-making module, which is electrically connected to the seat perception module;

[0019] A seat control module, which is electrically connected to the autonomous driving decision module;

[0020] The collaborative optimization algorithm module is bidirectionally electrically connected to the seat perception module, the autonomous driving module, and the seat control module.

[0021] Preferably, the seat sensing module includes multiple pressure sensors, which are distributed on the passenger seats of the vehicle.

[0022] Preferably, the seat control module is electrically connected to the airbag on the seat to adjust the airbag inflation volume and response time.

[0023] A method for co-optimizing seat adjustment and autonomous driving decision-making, characterized in that it uses the aforementioned co-optimization system for seat adjustment and autonomous driving decision-making, comprising:

[0024] Step 1: The seat sensing module acquires passenger pressure distribution data through pressure sensors and converts it into passenger status information to determine the passenger's current status.

[0025] Step 2: Based on the current passenger status and vehicle status information, the autonomous driving decision module optimizes the vehicle's lateral and longitudinal movements.

[0026] Step 3: The seat control module adjusts the airbag inflation volume and response time;

[0027] Step 4: Perform collaborative optimization, establish a reinforcement learning model and environment, and obtain the optimal reinforcement learning decision model by adjusting hyperparameters;

[0028] The model input consists of the vehicle and seat states, and the output consists of the vehicle and seat decisions.

[0029] Preferably,

[0030] The passenger status information includes: seat surface pressure distribution and airbag inflation status;

[0031] The vehicle status information includes: vehicle speed, position, heading angle, relative speed and position of other vehicles.

[0032] Preferably, it also includes: constructing an autonomous driving control model:

[0033] Lateral dynamics model: $LSF = \sum_i (WSA_i \cdot P_i \cdot A_i)$;

[0034] In the formula, LSF is the lateral stability factor, $WSA_i$ is the weight of each airbag, $P_i$ is the inflation level of each airbag, and $A_i$ is the area of each airbag;

[0035] The longitudinal dynamic model is as follows:

[0036] $m \cdot a = \theta \cdot F_{max} - F_{air} - F_{roll}$;

[0037] $F_{air} = \frac{1}{2} C_d A \rho v^2$;

[0038] $F_{roll} = C_r m g$;

[0039] In the formula, θ is the accelerator pedal opening, a is the vehicle acceleration, $F_{max}$ is the maximum driving force that the engine can provide when the throttle is fully open, $F_{air}$ is the air resistance, and $F_{roll}$ is the rolling resistance; $C_d$ is the air resistance coefficient, A is the vehicle's frontal area, ρ is the air density, and v is the vehicle speed; $C_r$ is the rolling resistance coefficient, m is the vehicle mass, and g is the acceleration due to gravity;

[0040] The vehicle's turning radius is:

[0041]

[0042] In the formula, TR is the turning radius of the vehicle, which is the minimum turning radius that the vehicle can achieve under specific speed and steering conditions, and V is the speed of the vehicle when turning.
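As an illustration of the longitudinal model in [0036] to [0039], the following minimal Python sketch solves the force balance for the acceleration a. The function name, the default air density and rolling resistance coefficient, and the sample vehicle values are assumptions made for the example, and the drag term is written in the standard form implied by the symbols C_d, A, ρ, and v; none of these numbers come from the patent.

```python
def longitudinal_acceleration(theta, v, m, f_max, c_d, area,
                              rho=1.225, c_r=0.015, g=9.81):
    """Solve m*a = theta*F_max - F_air - F_roll for a (SI units)."""
    f_air = 0.5 * rho * c_d * area * v ** 2   # assumed standard aerodynamic drag
    f_roll = c_r * m * g                      # F_roll = C_r * m * g
    return (theta * f_max - f_air - f_roll) / m


# Illustrative values only: half throttle at 20 m/s for a 1500 kg vehicle.
a = longitudinal_acceleration(theta=0.5, v=20.0, m=1500.0,
                              f_max=6000.0, c_d=0.3, area=2.2)
print(f"acceleration: {a:.3f} m/s^2")
```

A decision module could call such a function to predict how a candidate throttle opening changes the longitudinal acceleration felt by the passenger.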

[0043] Preferably, the force range of the airbag is: $ForceRange = [F_{min}, F_{max}]$;

[0044] This gives the range of values for the force on the airbag.

[0045] The functional relationship between the force on the airbag and the inflation volume is:

[0046] $F = k \cdot IA^{b}$;

[0047] In the formula, F is the force on the airbag, IA is the inflation volume, and k and b are constants determined based on experimental data;

[0048] Airbag inflation adjustment formula:

[0049] $IA(t) = IA_{desired} \cdot f(t)$;

[0050] In the formula, $IA(t)$ is the airbag inflation volume at time t, $IA_{desired}$ is the required inflation volume, and $f(t)$ is the inflation time response function.

[0051] The formula for evaluating the seat's support is:

[0052]

[0053] Wherein, WRAP represents the seat's envelopment assessment, $P_i$ is the pressure value at the i-th contact point, $A_i$ is the area of the i-th contact point, and $C_i$ is a correction factor that takes into account the passenger's body shape and is used to adjust the feeling of enclosure for passengers of different body shapes. TSA represents the total area of the seat.

[0054] Preferably, step four includes: establishing an optimization objective function:

[0055] CC = α·LSF + β·CEA;

[0056] Here, CC represents passenger discomfort and the potential risk of the vehicle, and α and β are weighting coefficients that can be adjusted according to the importance of the optimization problem.

[0057] Based on the SAC model, the network parameters are backpropagated according to the rewards obtained from interaction with the environment, constructing four Q networks and one policy network.

[0058] The reward method is multi-objective reinforcement learning:

[0059] $r_t = \alpha_1 r_1 + \alpha_2 r_2 + \alpha_3 r_3 + \alpha_4 r_4$;

[0060] In the formula, $\alpha_i$ (i = 1, 2, 3, 4) are the weights of the different reward items, $r_1 = -CC$ is the reward based on passenger comfort, $r_2$ is the speed reward, $r_3$ is the decision time reward, and $r_4$ is the collision reward. These reward items together constitute the optimization objective of vehicle decision-making.

[0061] The Q-value network loss function of the SAC model satisfies:

[0062]

[0063] In the formula, $y_t$ is the target value for each tuple at time t, N is the number of sample groups in the experience replay pool, $Q_{\varphi_j}(s_t, a_t)$ is the Q-value of executing action $a_t$ in the current state $s_t$, and j is the Q-value category, j = 1, 2;

[0064] The policy network loss function of the SAC model satisfies:

[0065]

[0066]

[0067] In the formula, $L_\pi(\theta)$ is the loss function of the policy network, θ is the parameter of the neural network, $\lambda_\pi$ is the learning rate of the policy network, and $\nabla_\theta$ denotes the derivative with respect to the parameters of the neural network;

[0068] The entropy regularization coefficient of the policy network satisfies:

[0069]

[0070] In the formula, $L_\alpha(\alpha)$ is the loss function that depends on the entropy regularization coefficient, $\mathbb{E}$ denotes the expectation, D is the experience replay pool, $\pi(\cdot \mid s_t)$ is the probability distribution of the actions output by the policy π given the state $s_t$ at time step t, α is the coefficient of the entropy regularization term, and $H_0$ is the target entropy.

[0071] The input actions are classified and flattened. The output has two pairs of parameters. Based on the mean and variance, these two probability distributions are constructed and sampled to obtain the model's actions in these two dimensions. The probability distribution formulas are as follows:

[0072]

[0073] In the formula, $a_t$ is the action taken in state $s_t$ at time t, $\pi_\theta(a_t \mid s_t)$ is the output of the policy network, $\mu_t$ is the mean of the probability distribution used for action sampling, and $\sigma_t$ is the variance of that distribution.

[0074] The beneficial effects of this invention are as follows: This invention effectively improves passenger comfort and safety through a highly coordinated seat adjustment and autonomous driving decision-making system, particularly excelling in reducing motion sickness symptoms. The system-integrated data processing and reinforcement learning algorithms optimize the automatic seat adjustment, ensuring optimal passenger comfort under various driving conditions, while significantly improving driving safety through precise vehicle control. This comprehensive optimization solution provides an innovative approach to improving the passenger experience in autonomous vehicles.

Attached Figure Description

[0075] Figure 1 is a schematic diagram of the seat adjustment and autonomous driving decision-making collaborative optimization system described in this invention.

[0076] Figure 2 is a flowchart illustrating the operation of the seat sensing module described in this invention.

[0077] Figure 3 is a flowchart of the autonomous driving decision-making module described in this invention.

[0078] Figure 4 is a flowchart of the seat control module's workflow.

[0079] Figure 5 is an example flowchart of the collaborative optimization algorithm.

Detailed Implementation

[0080] The present invention will now be described in further detail with reference to the accompanying drawings, so that those skilled in the art can implement it based on the description.

[0081] As shown in Figures 1-5, the present invention provides a seat adjustment and autonomous driving decision-making collaborative optimization system, comprising: a seat perception module, an autonomous driving decision-making module, a seat control module, and a collaborative optimization algorithm module. The autonomous driving decision-making module is electrically connected to the seat perception module; the seat control module is electrically connected to the autonomous driving decision-making module; and the collaborative optimization algorithm module is bidirectionally electrically connected to the seat perception module, the autonomous driving decision-making module, and the seat control module.

[0082] As shown in Figure 2, the seat perception module is responsible for collecting images of the passenger's posture, force data, and autonomous driving decision-making information. This data is used to monitor the passenger's position, posture, and the vehicle's driving status in real time. The seat perception module processes this data with a CNN-LSTM network to obtain detailed information about the passenger's state.

[0083] As shown in Figure 3, the autonomous driving decision-making module is responsible for generating the car's driving strategy, including lateral movement decisions. This module combines data from the seat perception module to optimize driving decisions to meet passenger comfort needs. This means that during autonomous driving, the car considers passenger position, posture, and seat shape adjustments to provide the best riding experience.

[0084] As shown in Figure 4, the seat control module is responsible for actually controlling the shape and position of the seat. It uses the collaborative optimization algorithm to dynamically adjust the seat to adapt to the passenger's posture and provide maximum comfort. The seat control module can also reduce discomfort during acceleration and braking.

[0085] As shown in Figure 5, the collaborative optimization algorithm integrates seat perception, autonomous driving decision-making, and seat control to collaboratively optimize passenger comfort. It considers the interplay between seat shape adjustment and autonomous driving decisions, and automatically adjusts the seat shape based on the passenger's state and the vehicle's driving conditions to offer optimal comfort.

[0086] These modules work together to optimize passenger seating comfort while maintaining vehicle safety and stability. This allows autonomous vehicles to provide an excellent driving experience while ensuring passenger comfort and safety.

[0087] This invention also provides a method for co-optimizing seat adjustment and autonomous driving decision-making, using the seat adjustment and autonomous driving decision-making co-optimization system provided by this invention, including:

[0088] Step 1: The seat sensing module acquires passenger pressure distribution data through pressure sensors and converts it into passenger status information to determine the passenger's current status.

[0089] Step 2: Based on the current passenger status and vehicle status information, the autonomous driving decision module optimizes the vehicle's lateral and longitudinal movements.

[0090] Step 3: The seat control module adjusts the airbag inflation volume and response time;

[0091] Step 4: Perform collaborative optimization, establish a reinforcement learning model and environment, and obtain the optimal reinforcement learning decision model by adjusting hyperparameters;

[0092] The model input consists of the vehicle and seat states, and the output consists of the vehicle and seat decisions.

[0093] In this invention, preferably, a camera collects continuous video frames for detecting and identifying objects in the surrounding environment. A LiDAR (Light Detection and Ranging) system generates point cloud data, providing precise three-dimensional positions and shapes of objects around the vehicle. A GPS positioning system provides the vehicle's geographic location information. Accelerometers and gyroscopes collect the vehicle's motion and attitude data, including acceleration and rotational angular velocity.

[0094] In the data processing section, the YOLO model is used to process camera image data for object detection and classification. Dedicated point cloud processing algorithms are used to process LiDAR data to obtain 3D object information. Data from GPS, accelerometers, and gyroscopes are comprehensively utilized to analyze the vehicle's motion state and navigation path. Preprocessing of the sensor data, including filtering, normalization, and time synchronization, ensures data quality and consistency. Key features are extracted from the preprocessed data, including: 2D object position from camera data, 3D object position and shape from LiDAR data, and simultaneous analysis of GPS and IMU data to determine the vehicle's speed and orientation. Data fusion algorithms (such as extended Kalman filters) are used to integrate data from different sensors. This fusion not only combines the measurement results of each sensor but also considers their uncertainties and accuracy to obtain a more accurate and reliable representation of the vehicle and environmental state. The fused data is used to support system decisions, such as seat adjustment and autonomous driving operations. Through this comprehensive approach, the system can more accurately understand and respond to complex road conditions.
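The fusion step described in [0094] weights each sensor by its uncertainty. The sketch below shows that idea with a scalar linear Kalman filter that fuses IMU acceleration with GPS speed. It is a deliberate simplification of the extended Kalman filter named in the text, and the function name, the noise parameters q and r, and the sample readings are all hypothetical.

```python
def kalman_speed_step(v_est, p_est, accel, dt, gps_speed, q=0.05, r=0.5):
    """One predict/update cycle of a scalar Kalman filter for vehicle speed.

    v_est, p_est: previous speed estimate and its variance
    accel, dt:    IMU longitudinal acceleration and time step
    gps_speed:    speed measurement derived from GPS
    q, r:         assumed process and measurement noise variances
    """
    # Predict: integrate the IMU acceleration; uncertainty grows by q.
    v_pred = v_est + accel * dt
    p_pred = p_est + q
    # Update: blend in the GPS speed, weighted by the relative uncertainties.
    gain = p_pred / (p_pred + r)
    v_new = v_pred + gain * (gps_speed - v_pred)
    p_new = (1.0 - gain) * p_pred
    return v_new, p_new


v, p = 10.0, 1.0  # initial estimate and variance (illustrative)
for accel, gps in [(0.5, 10.1), (0.5, 10.0), (0.4, 10.2)]:
    v, p = kalman_speed_step(v, p, accel, dt=0.1, gps_speed=gps)
print(f"fused speed: {v:.2f} m/s, variance: {p:.3f}")
```

The variance shrinks with every update, which is the sense in which the fused estimate is more reliable than either sensor alone.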

[0095] The Soft Actor-Critic (SAC) reinforcement learning algorithm is used for vehicle control and seat adjustment decisions. SAC is an efficient reinforcement learning method suitable for problems in continuous action spaces, capable of optimizing behavioral strategies while maintaining exploratory nature. The decision-making process includes controlling the vehicle's steering angle and throttle opening, as well as adjusting the seat angle and firmness. The reward function is designed to consider both passenger comfort and vehicle driving safety.

[0096] The system is trained in the CARLA simulation environment to ensure good performance before actual deployment. After training, it can be deployed on experimental vehicles for further testing and optimization.

[0097] The seat perception module uses pressure sensors to detect the pressure distribution of passengers in their seats. This data can be used to determine the severity of motion sickness symptoms. Through a built-in network of pressure sensors, it acquires real-time data on passenger pressure distribution in the seats. The seat perception module then transmits the collected data to a deep learning model for processing. This data is transformed into detailed information about the passenger's state, including body position, posture, center of gravity distribution, and any abnormal body pressure distribution. Based on the analysis results of the deep learning model, the seat perception module determines the passenger's current state. This helps the system identify whether the passenger is experiencing discomfort or nausea, as well as the passenger's corresponding positional and posture information, thereby prompting the autonomous driving decision-making module to take appropriate action.
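To make the notion of "center of gravity distribution" in [0097] concrete, the following sketch computes the pressure-weighted centroid of a sensor grid and its lateral offset from the seat centerline. It is a hand-written feature computation for illustration only, not the deep learning model the text describes; the grid layout, function names, and readings are assumptions.

```python
def center_of_pressure(grid):
    """Return (total load, (row, col) pressure-weighted centroid) of a sensor grid."""
    total = sum(sum(row) for row in grid)
    if total == 0:
        return 0.0, None  # empty seat: no centroid defined
    row_c = sum(i * sum(row) for i, row in enumerate(grid)) / total
    col_c = sum(j * p for row in grid for j, p in enumerate(row)) / total
    return total, (row_c, col_c)


def lateral_offset(grid):
    """Signed centroid offset from the seat centerline, in sensor columns."""
    _, centroid = center_of_pressure(grid)
    if centroid is None:
        return 0.0
    return centroid[1] - (len(grid[0]) - 1) / 2.0


# Hypothetical 3x3 pressure readings; the passenger leans toward the right column.
readings = [
    [0.0, 1.0, 2.0],
    [0.0, 2.0, 4.0],
    [0.0, 1.0, 2.0],
]
total, centroid = center_of_pressure(readings)
offset = lateral_offset(readings)
print(total, centroid, offset)
```

A sustained non-zero offset during cornering is the kind of signal that could be passed on to the decision-making and seat control modules.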

[0098] The autonomous driving decision-making module uses seat perception data and vehicle status information to collaboratively optimize the vehicle's lateral and longitudinal movements, thereby improving passenger comfort. The data provided by the seat perception module includes passenger status information, seat shape adjustment, and potential motion sickness symptoms. Additionally, vehicle status information is acquired, including vehicle speed, road conditions, and traffic conditions. The autonomous driving decision-making module processes the input data in real time and generates a vehicle driving strategy. This includes decisions on lateral movements (such as turning or changing lanes) and longitudinal movements (such as acceleration or braking). The module uses a collaborative optimization algorithm to balance the vehicle's lateral movements and passenger seat shape adjustments. This ensures that passengers maintain an optimal seating posture when the vehicle is moving laterally, reducing the occurrence of motion sickness symptoms.

[0099] The autonomous driving decision-making module uses sensor data and vehicle state information, such as speed ($v$), lateral acceleration ($a_{lat}$), turning rate ($r$), and turning radius ($TR$), to make decisions. This data can be used to build an autonomous driving control model, as shown below:

[0100] Lateral Dynamics (LSF): The seat sensing module provides lateral support forces to maintain passenger position, which can be represented as:

[0101] $LSF = \sum_i (WSA_i \cdot P_i \cdot A_i)$;

[0102] Among them, $WSA_i$ represents the weight of each airbag, $P_i$ indicates the inflation level of each airbag, and $A_i$ indicates the area of each airbag.
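The LSF sum above is a weighted sum over the airbags and can be evaluated directly. A minimal sketch follows; the function name and the three airbag tuples are illustrative assumptions, not values from the patent.

```python
def lateral_stability_factor(airbags):
    """LSF = sum over airbags of WSA_i * P_i * A_i."""
    return sum(weight * level * area for weight, level, area in airbags)


# Each tuple is (weight WSA_i, inflation level P_i, area A_i in m^2); values are made up.
airbags = [(0.5, 0.8, 0.02), (0.3, 0.6, 0.03), (0.2, 0.4, 0.01)]
lsf = lateral_stability_factor(airbags)
print(f"LSF = {lsf:.4f}")
```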

[0103] Vehicle turning radius:

[0104]

[0105] Seat Control (SC): The seat control module is designed to adjust the airbag inflation volume and timing response to ensure that the airbag provides support within an acceptable range for the passenger to meet comfort requirements. This can be expressed using the following formula:

[0106] The functional relationship between the force on the airbag and the inflation volume is:

[0107] $F = k \cdot IA^{b}$;

[0108] In the formula, F is the force on the airbag, IA is the inflation volume, and k and b are constants determined based on experimental data;

[0109] Airbag inflation adjustment formula:

[0110] $IA(t) = IA_{desired} \cdot f(t)$;

[0111] In the formula, $IA(t)$ is the airbag inflation volume at time t, $IA_{desired}$ is the required inflation volume, and $f(t)$ is the inflation time response function.
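The two airbag relations above can be combined into a small sketch. The patent does not specify the response function f(t), so a first-order response 1 - exp(-t/τ) is assumed here purely for illustration; the time constant τ, the constants k and b, and the force limits are likewise hypothetical.

```python
import math


def airbag_force(ia, k, b):
    """F = k * IA**b, with k and b fitted from experimental data."""
    return k * ia ** b


def inflation_at(t, ia_desired, tau=0.2):
    """IA(t) = IA_desired * f(t), with an ASSUMED f(t) = 1 - exp(-t / tau)."""
    f_t = 1.0 - math.exp(-t / tau) if t > 0 else 0.0
    return ia_desired * f_t


def clamp_force(force, f_min, f_max):
    """Keep the commanded force inside the acceptable range [f_min, f_max]."""
    return max(f_min, min(f_max, force))


ia = inflation_at(t=0.2, ia_desired=1.0)           # inflation after one time constant
force = clamp_force(airbag_force(ia, k=100.0, b=1.5), f_min=10.0, f_max=80.0)
print(f"IA = {ia:.3f}, F = {force:.1f}")
```

With a first-order response, the airbag reaches about 63% of the required volume after one time constant, which gives a simple handle for tuning the response time.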

[0112] The formula for evaluating the seat's support is:

[0113]

[0114] Wherein, WRAP represents the seat's envelopment assessment, $P_i$ is the pressure value at the i-th contact point, $A_i$ is the area of the i-th contact point, and $C_i$ is a correction factor that takes into account the passenger's body shape and is used to adjust the feeling of enclosure for passengers of different body shapes. TSA represents the total area of the seat.

[0115] CEA (Containment Experience Assessment): CEA assesses the containment of a seat, i.e., how comfortably the seat encloses the passenger. This can be expressed using the following formula:

[0116]

[0117] Here, CEA represents the seat's envelopment assessment, and TSA represents the seat's total area.

[0118] Collaborative optimization: The collaborative optimization algorithm uses sensor data and autonomous driving decision data to collaboratively optimize seat adjustment and autonomous driving decisions through the following steps:

[0119] Establish an optimization objective function: The objective function comprehensively considers both passenger motion sickness symptoms (based on data from the seat perception module) and the vehicle's motion state (based on data from the autonomous driving decision module). An example of the objective function might be shown below:

[0120] CC = α·LSF + β·CEA;

[0121] Here, CC represents passenger discomfort and potential vehicle risk. α and β are weighting coefficients that can be adjusted according to the importance of the optimization problem.

[0122] These formulas enable a seat adjustment and autonomous driving decision-making co-optimization system and method to work collaboratively during autonomous driving to provide seat support that meets passenger comfort needs and ensure that the seat-passenger fit is maintained throughout the ride. This approach will improve passenger comfort and safety.

[0123] The decision-making part uses the SAC (soft actor critic) model, a reinforcement learning model. Its input is a combination of vehicle and environmental information with seat and human information to form the model's input state. The output is a combination of vehicle and seat actions to form the model's output action. The network parameters are backpropagated based on the rewards obtained from interacting with the environment. Five neural networks are constructed: four Q-networks and one policy network. Each Q-network evaluates the value of a given state-action pair, helping the algorithm determine which actions are advantageous. Two Q-networks are used to reduce overestimation, while the other two are their target networks, used to stabilize the learning process. The policy network generates actions that consider not only maximizing expected rewards but also exploring unknown states.
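The roles of the twin Q-networks and their target networks can be sketched as follows. This uses the standard SAC formulation (minimum of the two target Q-values, entropy bonus, and Polyak averaging of the target parameters), which is assumed here because the text does not spell out its target computation; the function names and numeric values are illustrative.

```python
def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  alpha=0.2, gamma=0.99):
    """y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s'))."""
    min_q = min(q1_next, q2_next)               # twin networks curb overestimation
    soft_value = min_q - alpha * log_prob_next  # entropy bonus encourages exploration
    return reward + gamma * (1.0 - done) * soft_value


def polyak_update(target_params, online_params, tau=0.005):
    """Move each target-network parameter a small step toward its online counterpart."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


y = soft_q_target(reward=1.0, done=0.0, q1_next=5.0, q2_next=4.0, log_prob_next=-1.0)
new_target = polyak_update([0.0, 1.0], [1.0, 1.0])
print(y, new_target)
```

The slow target update is what stabilizes learning: the value that the Q-networks regress toward changes only gradually between gradient steps.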

[0124] The inputs to the model require further processing, including classification and flattening, so that each input parameter occupies a fixed position. The output consists of two pairs of parameters: the mean and variance of the probability distribution of the vehicle action, and the mean and variance of the probability distribution of the airbag action. The two probability distributions are constructed from these means and variances and then sampled to obtain the model's actions in the two dimensions. The probability distribution formula is as follows:

[0125]

[0126] In the formula, a_t is the action taken in state s_t at time t, π_θ(a_t|s_t) is the output of the policy network, μ_t is the mean of the probability distribution used for action sampling, and σ_t is the variance of that distribution.
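The sampling step can be illustrated with a short sketch. It assumes the distribution built from the mean and variance is Gaussian, which is the usual choice in SAC but is an assumption here. The function name and the policy outputs shown are hypothetical.

```python
import math
import random


def sample_action(mean: float, variance: float, rng: random.Random) -> float:
    """Draw one action from a Gaussian with the given mean and variance."""
    # random.gauss expects a standard deviation, so take the square root.
    return rng.gauss(mean, math.sqrt(variance))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical policy outputs: one (mean, variance) pair per action dimension.
vehicle_mean, vehicle_var = 0.1, 0.04
airbag_mean, airbag_var = 0.5, 0.01

action = (
    sample_action(vehicle_mean, vehicle_var, rng),  # vehicle action
    sample_action(airbag_mean, airbag_var, rng),    # airbag action
)
```

Sampling rather than always returning the mean is what lets the policy explore, which matches the exploration role described for the policy network.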

[0127] Based on the objective function of human discomfort, a total reward function can be obtained. This reward method is called multi-objective reinforcement learning.

[0128] r_t = α1·r1 + α2·r2 + α3·r3 + α4·r4;

[0129] In the formula, α_i (i = 1, 2, 3, 4) are the weights of the reward terms; r1 = -CC is the reward based on passenger comfort, r2 is the speed reward, r3 is the decision-time reward, and r4 is the collision reward. Together, these reward terms constitute the optimization objective of vehicle decision-making.
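A minimal sketch of the weighted reward in [0128], with r1 = -CC as stated in [0129]. The function name, the weights, and the individual reward values are hypothetical.

```python
def total_reward(cc: float, speed_reward: float, time_reward: float,
                 collision_reward: float, weights: tuple) -> float:
    """r_t = a1*r1 + a2*r2 + a3*r3 + a4*r4, with r1 = -CC."""
    r1 = -cc  # comfort reward: lower discomfort gives a higher reward
    terms = (r1, speed_reward, time_reward, collision_reward)
    return sum(w * r for w, r in zip(weights, terms))


# Hypothetical values for a single time step.
r_t = total_reward(cc=0.3, speed_reward=1.0, time_reward=-0.1,
                   collision_reward=0.0, weights=(1.0, 0.5, 0.5, 2.0))
print(f"r_t = {r_t:.3f}")
```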

[0130] In the backpropagation part, the Q-value network loss function of the SAC network satisfies:

[0131]

[0132] In the formula, y_t is the target value for each tuple at time t; N is the number of sample groups drawn from the experience replay pool; Q_φj(s_t, a_t) is the Q-value of executing action a_t in the current state s_t; and j is the Q-value category, j = 1, 2;
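As a sketch of a Q-value loss consistent with the symbols defined in [0132], the mean-squared-error form commonly used in SAC implementations is shown below. The 1/N normalization and the function name are assumptions.

```python
def q_loss(targets: list, q_values: list) -> float:
    """Mean squared error between target values y_t and predictions Q(s_t, a_t)."""
    n = len(targets)  # N: number of sampled tuples
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / n


# Hypothetical mini-batch of two samples; the same function applies to j = 1 and j = 2.
loss_q1 = q_loss(targets=[1.0, 2.0], q_values=[0.0, 4.0])
print(loss_q1)
```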

[0133] The update of the Q-value network satisfies:

[0134]

[0135] In the formula, L_Q(φ1) is the loss function of the first Q-value network Q_φ1, L_Q(φ2) is the loss function of the second Q-value network Q_φ2, λ_Q is the learning rate of the Q-value networks, and φ denotes the parameters of the neural network;
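The update can be pictured as a gradient step scaled by the learning rate. The sketch below assumes plain gradient descent on a flat list of parameters. Practical implementations often use an adaptive optimizer instead, and the function name and numbers are hypothetical.

```python
def gradient_step(params: list, grads: list, learning_rate: float) -> list:
    """Return updated parameters: phi <- phi - lambda_Q * dL/dphi."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Hypothetical parameters and loss gradients for one Q-value network.
phi = [1.0, -2.0]
grad_phi = [0.5, -1.0]
phi_new = gradient_step(phi, grad_phi, learning_rate=0.1)
```

The same step would be applied separately to each of the two Q-value networks, each with the gradient of its own loss.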

[0136] The policy network loss function of the SAC network satisfies:

[0137]

[0138]

[0139] In the formula, L_π(θ) is the loss function of the policy network, θ denotes the parameters of the neural network, λ_π is the learning rate of the policy network, and ∇_θ denotes the derivative with respect to the parameters of the neural network;
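For orientation, a policy loss and update of the form commonly used in SAC, written with the symbols defined in [0139], would read as below. This is an assumed standard form and not a transcription of the formulas in [0137] and [0138]. The use of the minimum over the two Q-value networks is likewise an assumption.

```latex
\[
L_\pi(\theta) = \frac{1}{N} \sum_{t=1}^{N}
  \left[ \alpha \log \pi_\theta(a_t \mid s_t)
         - \min_{j=1,2} Q_{\phi_j}(s_t, a_t) \right],
\qquad
\theta \leftarrow \theta - \lambda_\pi \nabla_\theta L_\pi(\theta)
\]
```

Here a_t is sampled from the current policy, so minimizing the loss favors actions with high Q-values while the entropy term weighted by α keeps the action distribution from collapsing.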

[0140] The entropy regularization coefficient of the policy network satisfies:

[0141]

[0142] In the formula, L_α(α) is the loss function with respect to the entropy regularization coefficient, E denotes the expectation, D is the experience replay pool, π(·|s_t) is the probability distribution of the actions output by policy π given state s_t at time step t, α is the coefficient of the entropy regularization term, and H0 is the target entropy.
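A sketch of the entropy-coefficient loss in the form commonly used for automatic temperature tuning in SAC, with the expectation approximated by a mini-batch average. The exact form in [0141] may differ, and the function name and log-probabilities are hypothetical. The target entropy of -2 follows the value stated in [0147] for a two-dimensional output.

```python
def alpha_loss(alpha: float, log_probs: list, target_entropy: float) -> float:
    """Batch estimate of E[-alpha * (log pi(a_t|s_t) + H0)]."""
    n = len(log_probs)
    return sum(-alpha * (lp + target_entropy) for lp in log_probs) / n


action_dim = 2                        # vehicle action and airbag action
target_entropy = -float(action_dim)   # H0 = -2, as stated in [0147]

# Hypothetical log-probabilities of two sampled actions.
loss_alpha = alpha_loss(alpha=0.2, log_probs=[-1.0, -3.0],
                        target_entropy=target_entropy)
```

When the policy's entropy falls below the target, minimizing this loss increases α and so strengthens the exploration term.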

[0143] Training this model with SAC typically converges after 5000 rounds. During training, the model's hyperparameters need to be tuned repeatedly. The table below lists these hyperparameters and their default values:

[0144] Table 1

[0145]

[0146]

[0147] The target entropy is determined by the output dimension: the output dimension is 2 in this case, so the target entropy is set to -2. The other hyperparameters are generally empirical values.

[0148] The optimal value of the objective function is obtained through numerical optimization. The optimal solution will reflect the best seat adjustment and autonomous driving decision-making strategies to minimize passenger motion sickness while ensuring safety.

[0149] Although embodiments of the present invention have been disclosed above, they are not limited to the applications listed in the specification and embodiments. They can be applied to various fields suitable for the present invention. For those skilled in the art, other modifications can be easily made. Therefore, without departing from the general concept defined by the claims and their equivalents, the present invention is not limited to the specific details and illustrations shown and described herein.

Claims

1. A method for collaborative optimization of seat adjustment and autonomous driving decision-making, characterized in that collaborative optimization is performed using a seat adjustment and autonomous driving decision-making collaborative optimization system, which includes:
a seat sensing module;
an autonomous driving decision-making module, which is electrically connected to the seat sensing module;
a seat control module, which is electrically connected to the autonomous driving decision-making module; and
a collaborative optimization algorithm module, which is bidirectionally electrically connected to the autonomous driving decision-making module and the seat control module, and electrically connected to the seat sensing module;
the method for collaborative optimization of seat adjustment and autonomous driving decision-making includes:
Step 1: the seat sensing module acquires passenger pressure distribution data through pressure sensors and converts it into passenger status information to determine the passenger's current status;
Step 2: based on the current passenger status and the vehicle status information, the autonomous driving decision-making module optimizes the vehicle's lateral and longitudinal movements;
Step 3: the seat control module adjusts the airbag inflation volume and response time;
Step 4: collaborative optimization is performed: a reinforcement learning model and environment are established, and the optimal reinforcement learning decision model is obtained by adjusting hyperparameters; the model input consists of the vehicle and seat states, and the output consists of the vehicle and seat decisions;
an autonomous driving control model is constructed:
the lateral dynamics model is: ;
in the formula, the terms are, in order, the lateral stability factor, the weight of each airbag, the inflation level of each airbag, and the area of each airbag;
the longitudinal dynamics model is: ; ; ;
in the formula, the terms are, in order, the accelerator pedal opening, the vehicle acceleration, the maximum driving force that the engine can provide at full throttle, the air resistance, the rolling resistance, the air drag coefficient, the vehicle's frontal area, the air density, the vehicle speed, the rolling resistance coefficient, the vehicle mass, and the acceleration due to gravity;
the vehicle's turning radius is: ;
in the formula, the terms are, in order, the turning radius of the vehicle, which is the minimum turning radius the vehicle can achieve under specific speed and steering conditions, and the vehicle's speed when turning.

2. The method for collaborative optimization of seat adjustment and autonomous driving decision-making according to claim 1, characterized in that the seat sensing module includes multiple pressure sensors, which are distributed on the passenger seats of the car.

3. The method for collaborative optimization of seat adjustment and autonomous driving decision-making according to claim 2, characterized in that the seat control module is electrically connected to the airbag on the seat and is used to adjust the airbag inflation volume and response time.

4. The method for collaborative optimization of seat adjustment and autonomous driving decision-making according to claim 3, characterized in that the passenger status information includes the seat surface pressure distribution and the airbag inflation status, and the vehicle status information includes the vehicle's speed, position, and heading angle, and the relative speed and position of other vehicles.

5. The method for collaborative optimization of seat adjustment and autonomous driving decision-making according to claim 4, characterized in that:
the force-bearing range of the airbag is: ;
given this range of values, the functional relationship between the force on the airbag and the inflation volume is: ;
in the formula, the terms are, in order, the force borne by the airbag, the inflation volume, and two constants determined from experimental data;
the airbag inflation adjustment formula is: ;
in the formula, the terms are, in order, the air volume inside the airbag at a given time, the required inflation volume, and the inflation time response function;
the formula for evaluating the seat's support is: ;
in the formula, the terms are, in order, the assessment of the seat's envelopment, the pressure value at each contact point, the area of each contact point, a correction factor that takes passenger body shape into account and adjusts the feeling of envelopment for passengers of different body types, and the total area of the seat.

6. The method for collaborative optimization of seat adjustment and autonomous driving decision-making according to claim 5, characterized in that Step 4 includes:
establishing the optimization objective function: CC = α·LSF + β·CEA;
where CC indicates passenger discomfort and potential vehicle risk, and α and β are weighting coefficients that can be adjusted according to the importance of the optimization problem;
based on the SAC model, the network parameters are updated by backpropagation according to the rewards obtained from interaction with the environment, and four Q-networks and one policy network are constructed;
the reward method is multi-objective reinforcement learning: r_t = α1·r1 + α2·r2 + α3·r3 + α4·r4;
in the formula, α_i (i = 1, 2, 3, 4) are the weights of the reward terms, r1 = -CC is the reward based on passenger comfort, r2 is the speed reward, r3 is the decision-time reward, and r4 is the collision reward; together these terms constitute the optimization objective of vehicle decision-making;
the Q-value network loss function of the SAC model satisfies: ;
in the formula, y_t is the target value of each tuple at time t, N is the number of sample groups in the experience replay pool, Q_φj(s_t, a_t) is the Q-value of executing action a_t in the current state s_t, and j is the Q-value category, j = 1, 2;
the policy network loss function of the SAC model satisfies: ; ;
in the formula, L_π(θ) is the loss function of the policy network, θ denotes the parameters of the neural network, λ_π is the learning rate of the policy network, and ∇_θ denotes the derivative with respect to the parameters of the neural network;
the entropy regularization coefficient of the policy network satisfies: ;
in the formula, L_α(α) is the loss function with respect to the entropy regularization coefficient, E denotes the expectation, D is the experience replay pool, π(·|s_t) is the probability distribution of the actions output by policy π given state s_t at time step t, α is the coefficient of the entropy regularization term, and H0 is the target entropy;
the model inputs are classified and flattened, and the output has two pairs of parameters; based on the mean and variance, the two probability distributions are constructed and sampled to obtain the model's actions in these two dimensions; the probability distribution formula is as follows: ;
in the formula, a_t is the action taken in state s_t at time t, π_θ(a_t|s_t) is the output of the policy network, μ_t is the mean of the probability distribution used for action sampling, and σ_t is the variance of that distribution.