Intelligent cockpit interaction method and system based on multi-modal artificial intelligence

By collecting and analyzing driver and vehicle data in real time through multimodal artificial intelligence, driver intentions are identified and personalized assistance strategies are generated. This solves the problem that traditional cockpit systems cannot fully judge the driver's state, and achieves accurate risk prediction and personalized assisted driving, thereby improving driving safety and comfort.

CN122232656A · Pending · Publication Date: 2026-06-19 · SHANGHAI QIANQIAN INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
SHANGHAI QIANQIAN INFORMATION TECH CO LTD
Filing Date
2026-03-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Traditional cockpit systems rely on a single perception method, which cannot fully assess the driver's state and lacks proactive prediction and prevention mechanisms, resulting in a high risk of traffic accidents. Furthermore, the system's decisions do not meet the driver's needs and lack personalized driver assistance.

Method used

Based on multimodal artificial intelligence, the system collects driver biometrics and vehicle status data in real time, combines them with historical interaction logs, identifies driver intentions and generates personalized assistance strategies, integrates vehicle environment and passenger information, and proactively guides driver operations.

Benefits of technology

It achieves accurate risk prediction, reduces traffic accidents, improves driving safety and comfort, meets driver needs through personalized strategies, and reduces the impact of misoperation.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of driving interaction technology, specifically to an intelligent cockpit interaction method and system based on multimodal artificial intelligence. The method includes: real-time acquisition of multimodal biometric data of the driver and vehicle operating status data within the cockpit; determining whether the driver's current biometrics meet a preset characteristic change pattern based on the driver's multimodal biometric data and vehicle operating status data, combined with acquired historical cockpit interaction log data; and determining the driver's state characteristic parameters at a given time point when the driver's current biometrics meet the preset characteristic change pattern. This invention, by acquiring multimodal biometric data of the driver and vehicle operating status data, can determine in real-time whether the driver is fatigued, distracted, or experiencing emotional fluctuations. Combined with future driving path and environmental information, it calculates the probability of driving risks occurring at each driving location, enabling the system to detect potential dangers in advance.

Description

Technical Field

[0001] This invention relates to the field of driving interaction technology, specifically to an intelligent cockpit interaction method and system based on multimodal artificial intelligence.

Background Technology

[0002] Currently, most traditional methods rely on a single sensing method, such as detecting the driver's state solely through hardware devices like in-vehicle cameras or radar. This approach has a weak ability to perceive the driver's multimodal biometrics and cannot comprehensively determine whether the driver is fatigued, distracted, or experiencing emotional fluctuations. This makes it difficult to achieve real-time and accurate risk prediction. Moreover, traditional methods tend to issue warnings or alerts when the driver exhibits abnormal behavior, lacking proactive prediction and prevention mechanisms. Even if potential risks are detected by external sensors, it is difficult to predict the driver's behavior and the vehicle environment in advance, resulting in excessively long response times for preventative measures and an inability to effectively reduce the occurrence of traffic accidents.

[0003] Furthermore, traditional methods often have limited integration approaches when sensing the vehicle's surroundings and the driver's state. They frequently rely solely on in-vehicle cameras and sensors to assess the environment, failing to effectively integrate the driver's biometric data, vehicle dynamics data, and passenger voice information. This can lead to system decisions that do not align with the driver's actual needs, potentially resulting in misoperations or unnecessary interference. Moreover, many traditional intelligent cockpit systems only provide passive safety reminders or warnings. For example, the system might issue a warning when the driver fails to notice an obstacle or traffic sign. However, these methods cannot proactively provide driver assistance, such as adjusting navigation routes or adapting to the environment, thus failing to achieve truly personalized and proactive driver assistance.

Summary of the Invention

[0004] To achieve the above objectives, the present invention provides the following technical solution: an intelligent cockpit interaction method based on multimodal artificial intelligence, comprising: collecting, in real time, multimodal biometric data of the driver and vehicle operating status data in the cockpit, and determining, based on the multimodal biometric data and the vehicle operating status data combined with acquired historical cockpit interaction log data, whether the driver's current biometrics meet a preset characteristic change pattern; when the driver's current biometrics meet the preset characteristic change pattern, determining the driver's state characteristic parameters at that time point, and determining, based on the state characteristic parameters and the vehicle operating status data, whether the driver meets a preset interaction trigger condition; when the driver meets the preset interaction trigger condition, acquiring the driver's voice interaction command and visual interaction gesture, and obtaining the driver's intention recognition result from the voice interaction command and the visual interaction gesture; once the intention recognition result is obtained, acquiring environmental perception data around the vehicle, navigation path information, and passenger voice information in the cabin, and generating environmental fusion information of the cockpit from them; determining, based on the intention recognition result and the environmental fusion information, whether the driver's operating intention matches the current driving environment; and, when the driver's operating intention does not match the current driving environment, generating a target interaction strategy based on the driver's state characteristic parameters, the intention recognition result, and the environmental fusion information, and sending the target interaction strategy to the execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.

[0005] Preferably, before generating the target interaction strategy based on the driver's state characteristic parameters, intent recognition results, and environmental fusion information, the method further includes: The system obtains the target driver's future driving path and the environmental information set corresponding to the future driving path, and calculates the probability of the target driver encountering driving risks at each driving location based on the environmental information set corresponding to the future driving path. For each driving location, when it is determined that the probability of the target driver encountering driving risks at the driving location is greater than or equal to the first preset risk level, the driving location is determined as a driving scenario, and the environmental fusion information of the cockpit is updated according to the environmental perception data under the driving scenario. Sending the target interaction strategy to the execution device in the cockpit includes: Based on the updated cockpit environment fusion information and the urgency of the target interaction strategy, the signal output requirements corresponding to the target interaction strategy are determined, and based on the signal output requirements, the target execution device corresponding to the cockpit is determined, and the target interaction strategy is transmitted to the target execution device.

[0006] Preferably, based on the driver's multimodal biometric data and vehicle operating status data, combined with the acquired cockpit historical interaction log data, it is determined whether the driver's current biometrics meet a preset characteristic change pattern, including: Obtain the driver's personal user profile and historical fatigue driving characteristics information; The driver's multimodal biometric data is compared with the baseline biometric data in the personal user profile, and the driver's multimodal biometric data is compared with the historical fatigue driving feature information; Based on the biometric comparison results and fatigue characteristic comparison results, it is determined whether the driver's current biometric characteristics meet the preset characteristic change pattern.

[0007] Preferably, when it is determined that the driver's current biometrics meet the preset characteristic change pattern, the driver's state characteristic parameters at the specified time point are determined, and based on the driver's state characteristic parameters and the vehicle's operating status data, it is determined whether the driver meets the preset interaction trigger conditions, including: Based on the driver's multimodal biometric data, the driver's attention distraction level and emotional fluctuation parameters are determined. Based on the driver's distraction level value, determine whether the driver's distraction level value is greater than or equal to a preset attention threshold; When it is determined that the driver's distraction level is greater than or equal to the preset attention threshold, the driver's emotional fluctuation parameters and the vehicle speed are combined to determine whether they are within a preset danger correlation range. When it is determined that the driver's emotional fluctuation parameters and the vehicle's speed are within the preset danger-related range, it is determined that the driver meets the preset interaction triggering condition.

[0008] Preferably, when it is determined that the driver meets the preset interaction triggering conditions, the driver's voice interaction commands and visual interaction gestures are acquired, and the driver's intent recognition result is identified based on the driver's voice interaction commands and visual interaction gestures, including: Obtain a driver's personal user profile, wherein the personal user profile includes one or more of the driver's commonly used spoken vocabulary and specific gesture meanings; The driver's voice interaction commands are matched with commonly used spoken vocabulary information in the personal user profile, and the driver's visual interaction gestures are matched with specific gesture meaning information in the personal user profile. Based on the voice matching results and gesture matching results, the driver's intention recognition result is determined, wherein the intention recognition result includes one or more of the following: navigation adjustment intention, vehicle control intention, and entertainment adjustment intention.

[0009] Preferably, when the driver's intention recognition result is identified, environmental perception data of the vehicle's surroundings, navigation path information, and passenger voice information are acquired. Based on the environmental perception data of the vehicle's surroundings, navigation path information, and passenger voice information, environmental fusion information of the cockpit is generated, including: The passenger's voice information in the cabin is analyzed to obtain the passenger's interference level and the passenger's intended command; Based on the navigation path information, extract the curvature, slope, and lane line information of the current road; The environmental perception data around the vehicle, the passenger's interference level, the passenger's intention commands, and the current road information are fused to generate the cockpit's environmental fusion information.
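The fusion step can be pictured as assembling one record from three sources. The following is a minimal Python sketch, not the patented implementation: the class names, the `lane_lines` field encoding, the keyword-based passenger analysis, and the 0-to-1 interference scale are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RoadInfo:
    curvature: float       # hypothetical unit: 1/m
    slope_percent: float
    lane_lines: str        # hypothetical encoding, e.g. "solid" or "dashed"


@dataclass
class CockpitEnvironment:
    obstacles: List[str]
    passenger_interference: float   # assumed scale: 0.0 (quiet) to 1.0 (very disruptive)
    passenger_intent: Optional[str]
    road: RoadInfo


def analyze_passenger_speech(utterances: List[str]) -> Tuple[float, Optional[str]]:
    # Toy stand-in for speech analysis: interference grows with the number of
    # utterances, and the intent comes from a single hard-coded keyword.
    interference = min(1.0, len(utterances) / 5)
    intent = None
    for text in utterances:
        if "hurry" in text.lower():
            intent = "go_faster"
    return interference, intent


def fuse_environment(obstacles: List[str], utterances: List[str],
                     road: RoadInfo) -> CockpitEnvironment:
    # Combine perception data, passenger analysis, and road information.
    interference, intent = analyze_passenger_speech(utterances)
    return CockpitEnvironment(list(obstacles), interference, intent, road)


env = fuse_environment(["pedestrian"], ["Hurry up to the mall"],
                       RoadInfo(0.0, 1.5, "dashed"))
print(env.passenger_intent, env.passenger_interference)
```

A real system would replace the keyword check with speech recognition and derive the road fields from map data; the sketch only shows how the three inputs end up in one structure.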

[0010] Preferably, the future driving path of the target driver and the environmental information set corresponding to the future driving path are obtained, and the probability of the target driver encountering driving risks at each driving location is calculated based on the environmental information set corresponding to the future driving path, including: Obtain environmental information for each of the multiple driving locations along the future driving path; The environmental information of the driving location is input into a pre-built road condition complexity analysis model for analysis to obtain the road condition complexity of the driving location; Obtain vehicle operation information and vehicle control information corresponding to the future driving path of the target driver, and estimate the traffic hazard effect corresponding to each driving position based on the traffic flow information corresponding to the vehicle operation information. Based on the vehicle control information of the future driving path, the control delay coefficient required for the vehicle to be driven by the target driver to perform hazard avoidance operations at each driving position is estimated. For each driving location, the probability of the target driver encountering driving risks at that driving location is calculated based on the complexity of the road conditions at that driving location, the traffic hazard effect corresponding to that driving location, and the control delay coefficient corresponding to that driving location.
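The text names three inputs to the per-location risk probability but does not give a formula. As one possible reading, the sketch below combines them with a weighted sum passed through a logistic function. The logistic form, the weights, the bias, and the assumption that each input is normalized to [0, 1] are illustrative choices, not taken from the source.

```python
import math


def risk_probability(complexity: float, hazard_effect: float, control_delay: float,
                     weights=(1.5, 2.0, 1.0), bias: float = -3.0) -> float:
    """Map three scores in [0, 1] to a probability with a logistic function.

    The weights and bias are illustrative assumptions, not values from the source.
    """
    w_c, w_h, w_d = weights
    score = bias + w_c * complexity + w_h * hazard_effect + w_d * control_delay
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical (complexity, hazard effect, control delay) scores per location.
path = {
    "straight_road": (0.1, 0.1, 0.2),
    "school_zone_crossing": (0.9, 0.8, 0.7),
}
risks = {name: risk_probability(*scores) for name, scores in path.items()}
for name, p in risks.items():
    print(f"{name}: {p:.2f}")
```

Any monotonic combination would serve the same purpose; the logistic form is used here only because it keeps the result between 0 and 1.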

[0011] Preferably, for each driving location, when it is determined that the probability of the target driver encountering driving risk at that driving location is greater than or equal to a first preset risk level, the driving location is defined as a driving scenario, and the environmental fusion information of the cockpit is updated based on the environmental perception data under the driving scenario, including: For all target driving locations where the probability of a driving risk is less than the first preset risk level, obtain the historical driving violation information for each target driving location; For each target driving location, based on the historical driving violations of the target driving location and the likelihood of the target driver encountering driving risks at the target driving location, it is determined whether the target driving location meets the preset risk conditions; When it is determined that the target driving location meets the risk conditions, the target driving location is identified as a driving scenario; Acquire high-precision environmental perception data in the driving scenario, and use the high-precision environmental perception data to update the original cockpit environment fusion information.

[0012] Preferably, for each target driving location, based on the historical driving violations at the target driving location and the likelihood of the target driver posing a driving risk at the target driving location, it is determined whether the target driving location meets a preset risk condition, including: For each target driving location, the probability of the target driver's historical violations at that target driving location is assessed based on the historical driving violations at that target driving location; Determine whether the historical violation probability of the target driver at the target driving location is greater than or equal to a preset violation probability and whether the probability of the target driver experiencing driving risk at the target driving location is greater than or equal to a second preset risk level; When it is determined that the historical violation probability of the target driver at the target driving location is greater than or equal to the preset violation probability and the probability of the target driver incurring driving risks at the target driving location is greater than or equal to the second preset risk level, the target driving location is determined to meet the preset risk conditions.
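The two-stage screening described in the preceding two paragraphs can be sketched as follows. This is a minimal illustration: the threshold values, the ratio used to estimate the historical violation probability, and the assumption that the second risk level is lower than the first are not stated in the source.

```python
def violation_probability(violations: int, passes: int) -> float:
    # Hypothetical estimate: share of past passes through the location
    # that ended in a violation.
    return violations / passes if passes else 0.0


def is_driving_scenario(risk_prob: float, violation_prob: float,
                        first_level: float = 0.7, second_level: float = 0.4,
                        violation_threshold: float = 0.3) -> bool:
    # Stage 1: locations at or above the first risk level qualify directly.
    if risk_prob >= first_level:
        return True
    # Stage 2: lower-risk locations qualify only if both the historical
    # violation probability and the risk probability clear their thresholds.
    return violation_prob >= violation_threshold and risk_prob >= second_level


print(is_driving_scenario(0.8, 0.0))                           # high risk alone
print(is_driving_scenario(0.5, violation_probability(4, 10)))  # moderate risk, frequent violations
print(is_driving_scenario(0.5, 0.1))                           # moderate risk, few violations
```

The second stage catches locations whose computed risk is moderate but where violations have been frequent in the past.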

[0013] A multimodal artificial intelligence-based intelligent cockpit interaction system, applicable to the aforementioned multimodal artificial intelligence-based intelligent cockpit interaction method, including: a data acquisition unit configured to collect multimodal biometric data of the driver and vehicle operating status data in the cockpit in real time, and to determine, based on the multimodal biometric data and the vehicle operating status data combined with acquired historical cockpit interaction log data, whether the driver's current biometrics meet the preset characteristic change pattern; a trigger judgment unit configured to determine the driver's state characteristic parameters at that time point when the driver's current biometrics meet the preset characteristic change pattern, and to determine whether the driver meets the preset interaction trigger condition based on the state characteristic parameters and the vehicle operating status data; an intent recognition unit configured to acquire the driver's voice interaction command and visual interaction gesture when the driver meets the preset interaction trigger condition, and to obtain the driver's intention recognition result from them; an environment fusion unit configured to acquire, once the intention recognition result is obtained, environmental perception data around the vehicle, navigation path information, and passenger voice information in the cabin, and to generate environmental fusion information of the cockpit from them; an environment matching unit configured to determine whether the driver's operating intention matches the current driving environment based on the intention recognition result and the environmental fusion information; and a driving interaction unit configured to generate a target interaction strategy based on the driver's state characteristic parameters, the intention recognition result, and the environmental fusion information when the driver's operating intention does not match the current driving environment, and to send the target interaction strategy to the execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.

[0014] Compared with the prior art, the beneficial effects of the present invention are: This invention collects multimodal biometric data of drivers and vehicle operating status data, enabling real-time judgment of whether drivers are fatigued, distracted, or emotionally unstable. Combined with future driving routes and environmental information, it calculates the probability of driving risks occurring at each driving location, allowing the system to detect potential dangers in advance and reduce the probability of traffic accidents. By utilizing individual user profiles of drivers and historical fatigue driving characteristics, it compares current driving behavior with historical behavior to achieve more accurate driving risk prediction. This invention integrates the vehicle's surrounding environment, navigation path, and in-cabin passenger voice information to form a global cockpit environment perception, making interactive decisions more in line with actual scenarios. By generating personalized assisted driving strategies based on the driver's state, intent recognition results, and environmental information, it achieves proactive assistance rather than passive prompts. Furthermore, when the driver's operational intent does not match the current driving environment, the system automatically generates a target interaction strategy to guide the driver to operate safely. By analyzing in-cabin passenger voice interference and driver state, it reduces the impact of accidental triggering or inappropriate commands on driving, improving driver focus and comfort.

Attached Figure Description

[0015] Figure 1 is a schematic flowchart of the overall method in one embodiment of the present invention; Figure 2 is a schematic diagram of the overall system architecture in one embodiment of the present invention.

[0016] In the diagram: 1. Data acquisition unit; 2. Trigger judgment unit; 3. Intent recognition unit; 4. Environment fusion unit; 5. Environment matching unit; 6. Driving interaction unit.

Detailed Implementation

[0017] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0018] Example 1: referring to Figure 1, the present invention provides a technical solution: an intelligent cockpit interaction method based on multimodal artificial intelligence, comprising:
S1. Collect multimodal biometric data of the driver and vehicle operating status data in the cockpit in real time; based on the multimodal biometric data and the vehicle operating status data, combined with the acquired historical cockpit interaction log data, determine whether the driver's current biometrics meet the preset characteristic change pattern.
S2. When the driver's current biometrics meet the preset characteristic change pattern, determine the driver's state characteristic parameters at that time point, and determine whether the driver meets the preset interaction trigger condition based on the state characteristic parameters and the vehicle operating status data.
S3. When the driver meets the preset interaction trigger condition, acquire the driver's voice interaction command and visual interaction gesture, and obtain the driver's intention recognition result from them.
S4. Once the intention recognition result is obtained, acquire the environmental perception data around the vehicle, the navigation path information, and the passenger voice information in the cabin, and generate the environmental fusion information of the cockpit from them.
S5. Based on the intention recognition result and the environmental fusion information, determine whether the driver's operating intention matches the current driving environment.
S6. When the driver's operating intention does not match the current driving environment, generate a target interaction strategy based on the driver's state characteristic parameters, the intention recognition result, and the environmental fusion information, and send the target interaction strategy to the execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.

[0019] It should be noted that two types of information are collected: multimodal biometric data of the driver, such as heart rate, pupil size, facial expression, voice tone, and hand movements; and vehicle operating status data, such as vehicle speed, steering wheel angle, accelerator and brake status, and distance to surrounding obstacles. For example, while Xiao Wang is driving, the cockpit camera detects that his pupils are dilated, his heart rate is slightly elevated, and his hands are shaking slightly on the steering wheel; the vehicle is traveling on urban roads at 50 km/h. The collected data is compared with historical interaction logs and preset physiological state patterns to check whether the driver is in a specific state; for example, the system determines that Xiao Wang's heart rate and pupil changes conform to the characteristic pattern of the "early stage of fatigued driving". When the driver's behavior is consistent with such a pattern, the system further determines whether to trigger intelligent interaction, such as driving assistance through voice or gestures; for example, Xiao Wang's eyelids start to droop and his hands occasionally leave the steering wheel, so the system determines that the trigger condition "the driver needs to be reminded to pay attention" has been met. The system then reads the driver's voice commands and gestures to identify the desired operation. For example, if Xiao Wang raises his hand to press the air conditioning button and says "turn up the temperature," the system analyzes the gesture and the voice command and determines that he wants to raise the air conditioning temperature. The system integrates environmental perception data around the vehicle, the navigation route, and in-vehicle passenger speech to form global environmental information; for example, the system detects a pedestrian crossing the road ahead, the navigation indicates a traffic light 500 meters ahead, and the passenger says "hurry up to the mall"; the system merges this information into the current environmental state. The system then determines whether the driver's intention is safe, reasonable, and appropriate for the current driving environment. For example, Xiao Wang wants to accelerate before reaching the traffic light, but a pedestrian is crossing the road ahead, so the system determines that this intention does not match the environment. When the intention and the environment are mismatched, the system generates an assistance strategy based on the driver's state, intention, and environment, and sends it to the cockpit execution device; for example, the system plays a voice reminder, "There is a pedestrian ahead, please slow down," and at the same time automatically eases off the throttle slightly so that the vehicle decelerates smoothly and safely.
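The mismatch check and the resulting assistance strategy in this walkthrough can be sketched as below. This is a deliberately small illustration: the function name, the `"accelerate"` intent label, the single hard-coded conflict rule, and the strategy fields are assumptions made for the example, not part of the described method.

```python
from typing import Dict, List, Optional


def assist_if_mismatched(intent: str,
                         obstacles_ahead: List[str]) -> Optional[Dict[str, str]]:
    # Single hard-coded conflict rule taken from the worked example:
    # accelerating while a pedestrian is ahead does not match the environment.
    conflict = intent == "accelerate" and "pedestrian" in obstacles_ahead
    if not conflict:
        return None  # intention matches the environment, no strategy needed
    return {
        "voice": "There is a pedestrian ahead, please slow down",
        "throttle": "reduce_gently",
    }


strategy = assist_if_mismatched("accelerate", ["pedestrian", "traffic_light"])
print(strategy)
```

A fuller version would take the driver's state parameters into account when choosing the strategy, as step S6 requires.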

[0020] In an optional embodiment, before generating the target interaction strategy based on the driver's state characteristic parameters, intent recognition results, and environmental fusion information, the method further includes: Obtain the target driver's future driving path and the environmental information set corresponding to the future driving path, and calculate the probability of the target driver encountering driving risks at each driving location based on the environmental information set corresponding to the future driving path. For each driving location, when it is determined that the probability of the target driver encountering driving risks at the driving location is greater than or equal to the first preset risk level, the driving location is determined as a driving scenario, and the environmental fusion information of the cockpit is updated based on the environmental perception data under the driving scenario. Sending the target interaction strategy to the execution equipment in the cockpit, including: Based on the updated cockpit environment fusion information and the urgency of the target interaction strategy, the signal output requirements corresponding to the target interaction strategy are determined. Based on the signal output requirements, the target execution device corresponding to the cockpit is determined, and the target interaction strategy is transmitted to the target execution device.
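One way to read the device-selection step is as a mapping from urgency and cabin conditions to output channels. The sketch below assumes a 0-to-1 urgency score, a 0-to-1 cabin noise score taken from the fused environment information, and three hypothetical devices; none of these values or names appear in the source.

```python
from typing import List


def select_devices(urgency: float, cabin_noise: float) -> List[str]:
    # Highest urgency: use every available channel.
    if urgency >= 0.8:
        return ["display", "speaker", "seat_vibration"]
    # Medium urgency: add one attention-grabbing channel; prefer haptics
    # when the cabin is too noisy for a voice prompt to be heard.
    if urgency >= 0.5:
        second = "seat_vibration" if cabin_noise >= 0.6 else "speaker"
        return ["display", second]
    # Low urgency: a visual hint is enough.
    return ["display"]


print(select_devices(0.2, 0.1))
print(select_devices(0.6, 0.9))
print(select_devices(0.9, 0.0))
```

The thresholds are arbitrary; the point is that the same strategy can be routed to different execution devices depending on how urgent it is and what the cabin is like.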

[0021] In an optional embodiment, based on the driver's multimodal biometric data and the vehicle's operating status data, combined with the acquired cockpit historical interaction log data, it is determined whether the driver's current biometrics meet a preset characteristic change pattern, including: Obtain the driver's personal user profile and historical fatigue driving characteristics information; The driver's multimodal biometric data is compared with the baseline biometric data in the personal user profile, and the driver's multimodal biometric data is compared with historical fatigue driving characteristics. Based on the results of biometric and fatigue comparisons, it is determined whether the driver's current biometrics meet the preset pattern of change.
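The two comparisons can be sketched as relative gaps against a personal baseline and against a stored fatigue signature. The feature names, the numeric values, both thresholds, and the rule that the two comparison results are combined with a logical AND are assumptions for illustration; the source does not specify how the results are combined.

```python
from typing import Dict


def relative_gap(current: Dict[str, float],
                 reference: Dict[str, float]) -> Dict[str, float]:
    # Relative difference per feature; reference values are assumed non-zero.
    return {k: abs(current[k] - reference[k]) / abs(reference[k]) for k in reference}


def meets_change_pattern(current: Dict[str, float], baseline: Dict[str, float],
                         fatigue_signature: Dict[str, float],
                         deviation_threshold: float = 0.15,
                         match_tolerance: float = 0.10) -> bool:
    # Comparison 1: does any feature deviate noticeably from the personal baseline?
    deviates = any(g >= deviation_threshold
                   for g in relative_gap(current, baseline).values())
    # Comparison 2: do all features sit close to the historical fatigue signature?
    resembles = all(g <= match_tolerance
                    for g in relative_gap(current, fatigue_signature).values())
    return deviates and resembles


baseline = {"heart_rate": 70.0, "pupil_mm": 3.5}   # hypothetical profile values
fatigue = {"heart_rate": 84.0, "pupil_mm": 4.5}    # hypothetical historical signature
current = {"heart_rate": 82.0, "pupil_mm": 4.4}
print(meets_change_pattern(current, baseline, fatigue))
```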

[0022] In an optional embodiment, when it is determined that the driver's current biometrics meet a preset characteristic change pattern, the driver's state characteristic parameters at a given time point are determined, and based on the driver's state characteristic parameters and the vehicle's operating status data, it is determined whether the driver meets a preset interaction trigger condition, including: Based on the driver's multimodal biometric data, the driver's attention distraction level and emotional fluctuation parameters are determined; Based on the driver's distraction level value, determine whether the driver's distraction level value is greater than or equal to a preset attention threshold; When it is determined that the driver's distraction level is greater than or equal to the preset attention threshold, the driver's emotional fluctuation parameters and the vehicle speed are combined to determine whether they are within the preset danger correlation range. When it is determined that the driver's emotional fluctuation parameters and the vehicle's speed are within a preset danger correlation range, it is determined that the driver meets the preset interaction trigger conditions.

[0023] It should be noted that the system assesses the driver's distraction level and emotional fluctuation from biometric data (such as heart rate, pupil size, and facial expression). These two parameters are key indicators of whether the driver is in a dangerous driving state. For example, suppose Xiao Wang's heart rate is elevated, his pupils are dilated, and he appears somewhat anxious. From this data the system calculates his distraction level (e.g., "high") and his emotional fluctuation parameter (e.g., "high anxiety"). The system then compares the distraction level with a preset attention threshold. A distraction level at or above the threshold indicates that the driver may not be paying attention to driving, which is a safety hazard. For example, if Xiao Wang's distraction level is assessed as "high", he may be distracted (for instance, using his phone or talking to passengers), and the system checks whether this level reaches the threshold. If it does, the system next combines the driver's emotional fluctuation parameter with the vehicle speed to determine whether the two fall within the preset danger correlation range. In other words, the risk is higher when the driver is emotionally unstable and the vehicle is traveling fast. For example, if Xiao Wang's emotional fluctuation parameter is high (e.g., he is anxious or angry) and the vehicle speed is also high (e.g., 80 km/h), the system determines that he is more likely to react impulsively or irrationally, which increases the risk of an accident. When both checks are satisfied, the system determines that the driver meets the interaction trigger condition. The intelligent cockpit then takes action, such as reminding the driver to concentrate through voice prompts or vibration feedback, or intervening automatically when necessary (for example, by slowing the vehicle). For example, if the system detects that Xiao Wang is both distracted and anxious while the vehicle is traveling at high speed, it determines that he is in a dangerous driving state. The system may proactively remind him: "Please concentrate, the vehicle speed is too high." If the situation is particularly urgent, the system may slow the vehicle automatically to ensure safety.
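
The two-stage trigger check described above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the numeric scales, the threshold values, and the names TriggerConfig, in_danger_range, and meets_trigger_condition are all assumptions, since the text only states that the thresholds and the danger correlation range are "preset".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriggerConfig:
    # All values are illustrative assumptions; the text only calls them "preset".
    attention_threshold: float = 0.6    # distraction level, assumed normalised to [0, 1]
    emotion_threshold: float = 0.7      # emotional fluctuation, assumed normalised to [0, 1]
    speed_threshold_kmh: float = 80.0   # speed at which high emotion counts as dangerous


def in_danger_range(emotion: float, speed_kmh: float, cfg: TriggerConfig) -> bool:
    """Assumed danger correlation range: high emotional fluctuation AND high speed."""
    return emotion >= cfg.emotion_threshold and speed_kmh >= cfg.speed_threshold_kmh


def meets_trigger_condition(
    distraction: float,
    emotion: float,
    speed_kmh: float,
    cfg: TriggerConfig = TriggerConfig(),
) -> bool:
    # Stage 1: the distraction level must reach the preset attention threshold.
    if distraction < cfg.attention_threshold:
        return False
    # Stage 2: emotion and speed are evaluated only after stage 1 passes.
    return in_danger_range(emotion, speed_kmh, cfg)
```

Under these assumed values, a distracted and anxious driver at 80 km/h satisfies the condition, while the same driver at 40 km/h, or an attentive driver at any speed, does not.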

[0024] In an optional embodiment, acquiring the driver's voice interaction commands and visual interaction gestures when the driver meets the preset interaction trigger condition, and obtaining the driver's intent recognition result based on those commands and gestures, includes: obtaining the driver's personal user profile, which includes one or more of the driver's commonly used spoken vocabulary and the meanings of specific gestures; matching the driver's voice interaction commands against the commonly used spoken vocabulary in the personal user profile, and matching the driver's visual interaction gestures against the gesture meanings in the personal user profile; and determining the driver's intent recognition result based on the voice matching result and the gesture matching result, wherein the intent recognition result includes one or more of: a navigation adjustment intent, a vehicle control intent, and an entertainment adjustment intent.

[0025] It should be noted that the system enters the voice and gesture recognition stage only when the driver meets the interaction trigger condition (for example, the driver is distracted or shows excessive emotional fluctuation). Once the condition is met, the system begins capturing the driver's voice commands and visual gestures. For example, suppose Xiao Wang has been judged to be distracted. The system then begins recognizing his next command: he might say "Navigate home" or make a gesture, such as pointing toward the front of the car to indicate that he wants to adjust the navigation. Each driver has a personal user profile that records the driver's commonly used vocabulary and the meanings of specific gestures. The system builds this profile from the driver's historical interaction records so that it can make more accurate judgments in later interactions. For example, Xiao Wang may habitually say "Navigate home" to trigger the navigation function, or always use a specific gesture (such as palm up to turn up the interior temperature). This voice and gesture information constitutes his personal user profile. When the driver gives a command, the system matches the voice command against the commonly used spoken vocabulary in the profile to identify the driver's specific need. For example, if Xiao Wang says "Navigate home," the system matches this against the navigation command in his profile and concludes that he wants to set the navigation route to his home. The system also captures the driver's visual gestures (for example, a gesture recognition camera can observe the driver's finger movements and facial expressions) and matches them against the gesture meanings in the profile, which helps it understand the intent behind each gesture. For example, if Xiao Wang makes a palm-up gesture, the system checks whether it matches his established habit (Xiao Wang always uses this gesture to mean "raise the temperature"); if the match succeeds, the system raises the interior temperature. The system then combines the voice and gesture matching results to determine the driver's intent. Possible intents include: navigation adjustment intents (e.g., the driver wants to reset the route or update the destination); vehicle control intents (e.g., the driver wants to adjust the vehicle speed or change the air conditioning temperature); and entertainment adjustment intents (e.g., adjusting the volume or playing a specific song). For example, if Xiao Wang says "Navigate home" and makes a palm-up gesture, the voice matching result indicates that he wants to adjust the navigation route, and the gesture matching result indicates that he wants to raise the interior temperature. Combining the two, the system performs both operations.
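
The profile matching described above can be sketched as a pair of dictionary lookups. This is only an illustration under stated assumptions: the UserProfile structure, the gesture label "palm_up", and the simple substring match are hypothetical stand-ins, and a real system would sit behind speech and gesture recognisers that the text does not specify.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Intent(Enum):
    NAVIGATION = "navigation adjustment"
    VEHICLE_CONTROL = "vehicle control"
    ENTERTAINMENT = "entertainment adjustment"


@dataclass
class UserProfile:
    # Hypothetical structure: habitual phrase or gesture label -> intent.
    phrases: dict[str, Intent] = field(default_factory=dict)
    gestures: dict[str, Intent] = field(default_factory=dict)


def recognise_intents(
    profile: UserProfile,
    utterance: Optional[str],
    gesture: Optional[str],
) -> set[Intent]:
    """Combine the voice matching result and the gesture matching result."""
    intents: set[Intent] = set()
    if utterance is not None:
        text = utterance.strip().lower()
        # Voice matching: substring match against the driver's habitual phrases.
        for phrase, intent in profile.phrases.items():
            if phrase in text:
                intents.add(intent)
    # Gesture matching: exact lookup of the recognised gesture label.
    if gesture is not None and gesture in profile.gestures:
        intents.add(profile.gestures[gesture])
    return intents
```

With a profile that maps the phrase "navigate home" to the navigation intent and the gesture "palm_up" to the vehicle control intent, saying "Navigate home" while making a palm-up gesture yields both intents, mirroring the Xiao Wang example.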

[0026] In an optional embodiment, acquiring environmental perception data of the vehicle's surroundings, navigation path information, and in-cabin passenger voice information when the driver's intent recognition result is obtained, and generating the cockpit environment fusion information based on these data, includes: analyzing the passenger voice information in the cabin to obtain the passenger interference level and the passenger's intended command; extracting the curvature, slope, and lane line information of the current road from the navigation path information; and fusing the environmental perception data of the vehicle's surroundings, the passenger interference level, the passenger's intended command, and the current road information to generate the cockpit environment fusion information.
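
One way to picture the fused record is as a single immutable structure that bundles the three inputs. The sketch below is an assumption about the data layout only: the field names, units, and the use of a lane count as a stand-in for lane line information are hypothetical, and the text does not define how the fusion itself is computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoadInfo:
    curvature: float    # assumed unit: 1/m, extracted from navigation path information
    slope: float        # assumed unit: percent grade
    lane_count: int     # hypothetical stand-in for lane line information


@dataclass(frozen=True)
class PassengerAnalysis:
    interference_level: float       # assumed scale: 0 (quiet) to 1 (very disruptive)
    intent_command: Optional[str]   # passenger's intended command, or None


@dataclass(frozen=True)
class EnvironmentFusion:
    perception: dict
    passenger: PassengerAnalysis
    road: RoadInfo


def fuse_environment(
    perception: dict,
    passenger: PassengerAnalysis,
    road: RoadInfo,
) -> EnvironmentFusion:
    # Copy the perception dict so later sensor updates do not alter this record.
    return EnvironmentFusion(dict(perception), passenger, road)
```

Keeping the record immutable means that a later update (for example, with higher-precision perception data) produces a new record rather than silently changing one that a downstream step is already using.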

[0027] In an optional embodiment, obtaining the target driver's future driving path and the set of environmental information corresponding to that path, and calculating the probability of the target driver encountering driving risk at each driving location based on that set, includes: obtaining the environmental information for each of multiple driving locations along the future driving path; inputting the environmental information of each driving location into a pre-built road condition complexity analysis model to obtain the road condition complexity of that driving location; obtaining the vehicle operation information and vehicle control information corresponding to the target driver's future driving path, and predicting the traffic hazard effect corresponding to each driving location based on the traffic flow information corresponding to the vehicle operation information; estimating, based on the vehicle control information of the future driving path, the control delay coefficient required for the target driver to perform a hazard avoidance operation at each driving location; and, for each driving location, calculating the probability of the target driver encountering driving risk at that location based on the road condition complexity, the traffic hazard effect, and the control delay coefficient corresponding to that location.
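
The text names the three inputs to the per-location risk calculation but does not give the combining formula. The sketch below therefore substitutes a noisy-OR combination purely for illustration, and assumes each factor has already been normalised to [0, 1]; LocationFactors, risk_probability, and risk_along_path are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationFactors:
    complexity: float      # road condition complexity, assumed in [0, 1]
    hazard_effect: float   # traffic hazard effect, assumed in [0, 1]
    control_delay: float   # control delay coefficient, assumed in [0, 1]


def risk_probability(f: LocationFactors) -> float:
    """Assumed noisy-OR: risk is absent only if no factor gives rise to it."""
    for value in (f.complexity, f.hazard_effect, f.control_delay):
        if not 0.0 <= value <= 1.0:
            raise ValueError("factors must be normalised to [0, 1]")
    return 1.0 - (1.0 - f.complexity) * (1.0 - f.hazard_effect) * (1.0 - f.control_delay)


def risk_along_path(path: list[LocationFactors]) -> list[float]:
    # One probability per driving location along the future driving path.
    return [risk_probability(f) for f in path]
```

The noisy-OR form is chosen here only because it stays within [0, 1] and rises when any single factor rises; a weighted sum or a learned model would fit the description equally well.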

[0028] In an optional embodiment, determining, for each driving location, the driving location as a driving scenario when the probability of the target driver encountering driving risk at that location is greater than or equal to a first preset risk level, and updating the cockpit environment fusion information based on the environmental perception data under the driving scenario, includes: for all target driving locations where the probability of driving risk is less than the first preset risk level, obtaining the historical driving violation information of each target driving location; for each target driving location, determining whether it meets a preset risk condition based on its historical driving violation information and the probability of the target driver encountering driving risk there; when the target driving location meets the preset risk condition, determining the target driving location as a driving scenario; and acquiring high-precision environmental perception data in the driving scenario and using it to update the original cockpit environment fusion information.

[0029] In an optional embodiment, determining, for each target driving location, whether the target driving location meets the preset risk condition based on the historical driving violation information of the target driving location and the probability of the target driver encountering driving risk there, includes: for each target driving location, assessing the target driver's historical violation probability at that location based on its historical driving violation information; determining whether the historical violation probability is greater than or equal to a preset violation probability and whether the probability of the target driver encountering driving risk at that location is greater than or equal to a second preset risk level; and when the historical violation probability is greater than or equal to the preset violation probability and the probability of driving risk is greater than or equal to the second preset risk level, determining that the target driving location meets the preset risk condition.
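
The selection logic of paragraphs [0028] and [0029] amounts to a two-tier filter, which can be sketched as follows. The threshold values, the Location fields, and the estimator used for the historical violation probability (violations divided by passes) are assumptions for illustration; the text only calls the thresholds "preset" and does not say how the violation probability is assessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    name: str
    risk: float        # probability of driving risk at this location
    violations: int    # historical violations recorded here (assumed count)
    passes: int        # historical passes through here (assumed count)


def violation_probability(loc: Location) -> float:
    # Assumed estimator: share of historical passes that involved a violation.
    return loc.violations / loc.passes if loc.passes > 0 else 0.0


def select_driving_scenarios(
    locations: list[Location],
    first_level: float = 0.7,       # first preset risk level (assumed value)
    second_level: float = 0.4,      # second preset risk level (assumed value)
    violation_level: float = 0.2,   # preset violation probability (assumed value)
) -> list[str]:
    scenarios = []
    for loc in locations:
        if loc.risk >= first_level:
            # Tier 1: high risk alone makes the location a driving scenario.
            scenarios.append(loc.name)
        elif loc.risk >= second_level and violation_probability(loc) >= violation_level:
            # Tier 2: moderate risk plus a poor violation history also qualifies.
            scenarios.append(loc.name)
    return scenarios
```

The second tier only has an effect if the second preset risk level is lower than the first, which this sketch assumes.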

[0030] Embodiment 2. Referring to Figure 2, the present invention provides a technical solution: an intelligent cockpit interaction system based on multimodal artificial intelligence, applicable to the aforementioned intelligent cockpit interaction method based on multimodal artificial intelligence, comprising: a data acquisition unit 1, configured to collect multimodal biometric data of the driver in the cockpit and vehicle operating status data in real time, and to determine, based on the driver's multimodal biometric data and the vehicle operating status data, combined with the acquired cockpit historical interaction log data, whether the driver's current biometrics meet the preset characteristic change pattern; a trigger judgment unit 2, configured to determine the driver's state characteristic parameters at a given time point when it is determined that the driver's current biometrics meet the preset characteristic change pattern, and to determine whether the driver meets the preset interaction trigger condition based on the driver's state characteristic parameters and the vehicle operating status data; an intent recognition unit 3, configured to acquire the driver's voice interaction commands and visual interaction gestures when it is determined that the driver meets the preset interaction trigger condition, and to obtain the driver's intent recognition result based on the driver's voice interaction commands and visual interaction gestures; an environment fusion unit 4, configured to acquire environmental perception data around the vehicle, navigation path information, and passenger voice information in the cabin when the driver's intent recognition result is obtained, and to generate the cockpit environment fusion information based on the environmental perception data around the vehicle, the navigation path information, and the passenger voice information in the cabin; an environment matching unit 5, configured to determine whether the driver's operating intent matches the current driving environment based on the driver's intent recognition result and the cockpit environment fusion information; and a driving interaction unit 6, configured to generate a target interaction strategy based on the driver's state characteristic parameters, the intent recognition result, and the environment fusion information when it is determined that the driver's operating intent does not match the current driving environment, and to send the target interaction strategy to the execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.
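
The order in which the six units hand off to one another can be sketched as a short-circuiting pipeline. This is a structural illustration only: each unit is reduced to a hypothetical callable passed in by the caller, and run_interaction_cycle is an assumed name, not part of the described system.

```python
from typing import Callable, Optional


def run_interaction_cycle(
    pattern_met: Callable[[], bool],               # data acquisition unit 1
    trigger_met: Callable[[], bool],               # trigger judgment unit 2
    recognise_intent: Callable[[], str],           # intent recognition unit 3
    fuse_environment: Callable[[], dict],          # environment fusion unit 4
    intent_matches: Callable[[str, dict], bool],   # environment matching unit 5
    build_strategy: Callable[[str, dict], str],    # driving interaction unit 6
) -> Optional[str]:
    """Return a target interaction strategy, or None if no guidance is needed."""
    # Units 1 and 2 gate everything that follows.
    if not pattern_met():
        return None
    if not trigger_met():
        return None
    intent = recognise_intent()
    env = fuse_environment()
    # A strategy is generated only when intent and environment do NOT match.
    if intent_matches(intent, env):
        return None
    return build_strategy(intent, env)
```

Each later unit runs only if the earlier condition holds, which matches the "when it is determined that" wording used for units 2 through 6.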

[0031] The embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited thereto. Various changes can be made within the scope of knowledge possessed by those skilled in the art without departing from the spirit of the present invention.

Claims

1. An intelligent cockpit interaction method based on multimodal artificial intelligence, characterized in that it comprises: acquiring, in real time, multimodal biometric data of the driver in the cockpit and vehicle operating status data; determining, based on the driver's multimodal biometric data and the vehicle operating status data, combined with acquired cockpit historical interaction log data, whether the driver's current biometrics meet a preset characteristic change pattern; when it is determined that the driver's current biometrics meet the preset characteristic change pattern, determining the driver's state characteristic parameters at that time point, and determining, based on the driver's state characteristic parameters and the vehicle operating status data, whether the driver meets a preset interaction trigger condition; when it is determined that the driver meets the preset interaction trigger condition, acquiring the driver's voice interaction commands and visual interaction gestures, and obtaining the driver's intent recognition result based on the driver's voice interaction commands and visual interaction gestures; when the driver's intent recognition result is obtained, acquiring environmental perception data around the vehicle, navigation path information, and passenger voice information in the cabin, and generating environment fusion information of the cockpit based on the environmental perception data around the vehicle, the navigation path information, and the passenger voice information in the cabin; determining, based on the driver's intent recognition result and the environment fusion information of the cockpit, whether the driver's operating intent matches the current driving environment; and when it is determined that the driver's operating intent does not match the current driving environment, generating a target interaction strategy based on the driver's state characteristic parameters, the intent recognition result, and the environment fusion information, and sending the target interaction strategy to an execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.

2. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 1, characterized in that, Before generating the target interaction strategy based on the driver's state characteristic parameters, intent recognition results, and environmental fusion information, the method further includes: The system obtains the target driver's future driving path and the environmental information set corresponding to the future driving path, and calculates the probability of the target driver encountering driving risks at each driving location based on the environmental information set corresponding to the future driving path. For each driving location, when it is determined that the probability of the target driver encountering driving risks at the driving location is greater than or equal to the first preset risk level, the driving location is determined as a driving scenario, and the environmental fusion information of the cockpit is updated according to the environmental perception data under the driving scenario. Sending the target interaction strategy to the execution device in the cockpit includes: Based on the updated cockpit environment fusion information and the urgency of the target interaction strategy, the signal output requirements corresponding to the target interaction strategy are determined, and based on the signal output requirements, the target execution device corresponding to the cockpit is determined, and the target interaction strategy is transmitted to the target execution device.

3. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 2, characterized in that, Based on the driver's multimodal biometric data and vehicle operating status data, combined with the acquired cockpit historical interaction log data, it is determined whether the driver's current biometrics meet a preset characteristic change pattern, including: Obtain the driver's personal user profile and historical fatigue driving characteristics information; The driver's multimodal biometric data is compared with the baseline biometric data in the personal user profile, and the driver's multimodal biometric data is compared with the historical fatigue driving feature information; Based on the biometric comparison results and fatigue characteristic comparison results, it is determined whether the driver's current biometric characteristics meet the preset characteristic change pattern.

4. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 3, characterized in that, When it is determined that the driver's current biometrics meet the preset characteristic change pattern, the driver's state characteristic parameters at the specified time point are determined, and based on the driver's state characteristic parameters and the vehicle's operating status data, it is determined whether the driver meets the preset interaction trigger conditions, including: Based on the driver's multimodal biometric data, the driver's attention distraction level and emotional fluctuation parameters are determined. Based on the driver's distraction level value, determine whether the driver's distraction level value is greater than or equal to a preset attention threshold; When it is determined that the driver's distraction level is greater than or equal to the preset attention threshold, the driver's emotional fluctuation parameters and the vehicle speed are combined to determine whether they are within a preset danger correlation range. When it is determined that the driver's emotional fluctuation parameters and the vehicle's speed are within the preset danger-related range, it is determined that the driver meets the preset interaction triggering condition.

5. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 4, characterized in that, When it is determined that the driver meets the preset interaction triggering conditions, the driver's voice interaction commands and visual interaction gestures are acquired, and the driver's intent recognition result is obtained based on the driver's voice interaction commands and visual interaction gestures, including: Obtain a driver's personal user profile, wherein the personal user profile includes one or more of the driver's commonly used spoken vocabulary and specific gesture meanings; The driver's voice interaction commands are matched with commonly used spoken vocabulary information in the personal user profile, and the driver's visual interaction gestures are matched with specific gesture meaning information in the personal user profile. Based on the voice matching results and gesture matching results, the driver's intention recognition result is determined, wherein the intention recognition result includes one or more of the following: navigation adjustment intention, vehicle control intention, and entertainment adjustment intention.

6. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 5, characterized in that, When the driver's intent recognition result is obtained, environmental perception data of the vehicle's surroundings, navigation path information, and passenger voice information are acquired. Based on these data, environmental fusion information of the cockpit is generated, including: The passenger's voice information in the cabin is analyzed to obtain the passenger's interference level and the passenger's intended command; Based on the navigation path information, extract the curvature, slope, and lane line information of the current road; The environmental perception data around the vehicle, the passenger's interference level, the passenger's intention commands, and the current road information are fused to generate the cockpit's environmental fusion information.

7. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 6, characterized in that, Obtain the target driver's future driving path and the corresponding environmental information set, and calculate the probability of the target driver encountering driving risks at each driving location based on the environmental information set corresponding to the future driving path, including: Obtain environmental information for each of the multiple driving locations along the future driving path; The environmental information of the driving location is input into a pre-built road condition complexity analysis model for analysis to obtain the road condition complexity of the driving location; Obtain vehicle operation information and vehicle control information corresponding to the future driving path of the target driver, and estimate the traffic hazard effect corresponding to each driving position based on the traffic flow information corresponding to the vehicle operation information. Based on the vehicle control information of the future driving path, the control delay coefficient required for the vehicle to be driven by the target driver to perform hazard avoidance operations at each driving position is estimated. For each driving location, the probability of the target driver encountering driving risks at that driving location is calculated based on the complexity of the road conditions at that driving location, the traffic hazard effect corresponding to that driving location, and the control delay coefficient corresponding to that driving location.

8. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 7, characterized in that, For each driving location, when it is determined that the probability of the target driver encountering driving risk at that driving location is greater than or equal to a first preset risk level, the driving location is defined as a driving scenario, and the environmental fusion information of the cockpit is updated based on the environmental perception data under the driving scenario, including: For all target driving locations where the probability of a driving risk is less than the first preset risk level, obtain the historical driving violation information for each target driving location; For each target driving location, based on the historical driving violations of the target driving location and the likelihood of the target driver encountering driving risks at the target driving location, it is determined whether the target driving location meets the preset risk conditions; When it is determined that the target driving location meets the risk conditions, the target driving location is identified as a driving scenario; Acquire high-precision environmental perception data in the driving scenario, and use the high-precision environmental perception data to update the original cockpit environment fusion information.

9. The intelligent cockpit interaction method based on multimodal artificial intelligence according to claim 8, characterized in that, For each target driving location, based on the historical driving violations at the target driving location and the likelihood of the target driver posing a driving risk at the target driving location, it is determined whether the target driving location meets preset risk conditions, including: For each target driving location, the probability of the target driver's historical violations at that target driving location is assessed based on the historical driving violations at that target driving location; Determine whether the historical violation probability of the target driver at the target driving location is greater than or equal to a preset violation probability and whether the probability of the target driver experiencing driving risk at the target driving location is greater than or equal to a second preset risk level; When it is determined that the historical violation probability of the target driver at the target driving location is greater than or equal to the preset violation probability and the probability of the target driver incurring driving risks at the target driving location is greater than or equal to the second preset risk level, the target driving location is determined to meet the preset risk conditions.

10. An intelligent cockpit interaction system based on multimodal artificial intelligence, applicable to the intelligent cockpit interaction method based on multimodal artificial intelligence according to any one of claims 1-9, characterized in that it comprises: a data acquisition unit, configured to collect multimodal biometric data of the driver in the cockpit and vehicle operating status data in real time, and to determine, based on the driver's multimodal biometric data and the vehicle operating status data, combined with acquired cockpit historical interaction log data, whether the driver's current biometrics meet a preset characteristic change pattern; a trigger judgment unit, configured to determine the driver's state characteristic parameters at that time point when it is determined that the driver's current biometrics meet the preset characteristic change pattern, and to determine whether the driver meets a preset interaction trigger condition based on the driver's state characteristic parameters and the vehicle operating status data; an intent recognition unit, configured to acquire the driver's voice interaction commands and visual interaction gestures when it is determined that the driver meets the preset interaction trigger condition, and to obtain the driver's intent recognition result based on the driver's voice interaction commands and visual interaction gestures; an environment fusion unit, configured to acquire environmental perception data around the vehicle, navigation path information, and passenger voice information in the cabin when the driver's intent recognition result is obtained, and to generate environment fusion information of the cockpit based on the environmental perception data around the vehicle, the navigation path information, and the passenger voice information in the cabin; an environment matching unit, configured to determine whether the driver's operating intent matches the current driving environment based on the driver's intent recognition result and the environment fusion information of the cockpit; and a driving interaction unit, configured to generate a target interaction strategy based on the driver's state characteristic parameters, the intent recognition result, and the environment fusion information when it is determined that the driver's operating intent does not match the current driving environment, and to send the target interaction strategy to an execution device in the cockpit, wherein the target interaction strategy is used to provide assisted driving guidance to the driver.