A motorcycle off-line voice interaction system and method thereof
By monitoring the environmental signal-to-noise ratio and vehicle status in real time, and dynamically switching the command library and feedback parameters, the problem of low voice recognition rate and misoperation of motorcycles in high wind noise environment is solved, realizing reliable voice interaction and safety control across the entire speed range.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ZHEJIANG QIANJIANG MOTORCYCLE
- Filing Date
- 2025-12-31
- Publication Date
- 2026-06-19
AI Technical Summary
Motorcycle voice recognition rates are low in high-wind-noise, high-speed environments. Existing technology cannot distinguish between high-risk and low-risk commands, resulting in a high risk of misoperation. Furthermore, feedback information struggles to penetrate the noise and reach the driver.
By monitoring the environmental signal-to-noise ratio in real time, dynamically switching to a highly robust simplified instruction library and noise-resistant model, and combining a state permission mapping table with multi-dimensional data such as vehicle speed and tilt angle, a secondary confirmation logic and adaptive feedback mechanism are introduced to ensure that the voice interaction system accurately captures user intent and prevents misoperation in harsh environments.
It achieves full-speed-range voice interaction capabilities, reduces the false recognition rate, prevents misoperation, ensures that feedback information can penetrate noise and be perceived by the driver, and improves the reliability and safety of the system.
Figure CN122245306A_ABST
Description
Technical Field
[0001] This invention relates to the field of voice interaction technology, and in particular to an offline voice interaction system and method for motorcycles.
Background Technology
[0002] With the rapid development of intelligent motorcycles and vehicle networking technologies, traditional mechanical buttons and dashboard touch controls are increasingly unable to meet the growing information processing needs of riders. Taking the hands off the handlebars or shifting the gaze to operate navigation or music, or to answer phone calls, while riding greatly distracts the rider and increases the risk of traffic accidents. Therefore, voice recognition-based human-machine interaction technology, with its "hands-free, eyes-on-road" characteristics, has become an important direction for upgrading motorcycle smart cockpits. However, unlike the enclosed cockpit of a car, motorcycles operate in an open acoustic environment. Wind noise, tire noise, and engine vibration noise generated during vehicle operation rise steeply with speed, posing a significant challenge to the accuracy of voice recognition and the safety of command execution.
[0003] Currently, to address the issues of misidentification and misoperation in high wind noise environments, existing technologies typically employ a "one-size-fits-all" speed threshold defense strategy. For example, Chinese patent authorization announcement number CN214985822U discloses an instrument panel for a smart electric bicycle or motorcycle with voice interaction functionality. However, this patent explicitly states that "interaction can only be achieved via microphone when the speed is below 40 km/h and wind noise is minimal," and argues that "if the speed is too high, wind noise and ambient noise are too great, the recognition rate is insufficient, and misoperation is likely to occur." While this technical solution mitigates the risk of misidentification to some extent, it essentially adopts an overly cautious avoidance approach. This prevents users from using the function in high-speed cruising scenarios where voice assistance is most needed (such as adjusting navigation routes or answering emergency calls on highways), severely limiting the practicality of the voice interaction system. Furthermore, existing technologies lack refined safety control over command categories: they often fail to distinguish between low-risk commands like "switch music" and high-risk commands like "open the seat compartment" under different vehicle postures (such as cornering, acceleration, and deceleration), and therefore cannot maximize the usability of the function while ensuring riding safety.
Summary of the Invention
[0004] The purpose of this invention is to solve the technical problem of low recognition rate or even unusability of existing motorcycle voice systems in high wind noise and high speed environments due to the decrease in signal-to-noise ratio. By monitoring the environmental signal-to-noise ratio in real time and dynamically switching to a highly robust simplified instruction library and noise-resistant model, this invention can ensure that the vehicle can accurately capture and respond to the user's core control requests even when the vehicle is traveling at high speed or in harsh acoustic environments. This breaks through the limitation of existing technologies that can only interact at low speeds and achieves full-speed-range voice interaction capabilities.
[0005] A further purpose of this invention is to address the driving safety hazards caused by the lack of deep correlation between existing voice control logic and the dynamic driving status of the vehicle. By constructing a state permission mapping table that includes multi-dimensional data such as vehicle speed, tilt angle, and gear position, this invention establishes an interlocking decision-making mechanism between command intent and vehicle physical state. This mechanism can intercept high-risk physical opening commands (such as opening the seat compartment) at high speeds and trigger secondary confirmation logic for mode switching commands, thereby effectively preventing vehicle loss of control or mechanical accidents caused by voice misrecognition or user misoperation.
[0006] The purpose of this invention is to solve the technical problem that environmental noise during motorcycle riding makes it difficult for drivers to clearly receive system feedback information. By establishing an adaptive correlation between the environmental signal-to-noise ratio and speech synthesis parameters, this invention can dynamically adjust the gain value and speech rate of the feedback speech according to the real-time wind noise level. Combined with a closed-loop verification mechanism, it ensures that the feedback information after the system executes the command can be accurately perceived by the driver, thereby improving the closed-loop reliability of human-computer interaction and user experience.
[0007] This invention proposes an offline voice interaction system for motorcycles. The system includes: an acoustic front-end perception module connected to an offline semantic parsing module and an adaptive feedback execution module, outputting the real-time environmental signal-to-noise ratio; the offline semantic parsing module receiving the environmental signal-to-noise ratio and loading the corresponding recognition instruction library, parsing the audio to output the operation intention; a safety interlock decision module connected to the offline semantic parsing module and the vehicle status monitoring module, matching the operation intention with real-time driving status data in a status permission mapping table; and an adaptive feedback execution module connected to the safety interlock decision module and the acoustic front-end perception module, executing interlock control commands and adjusting the acoustic parameters of the feedback voice according to the environmental signal-to-noise ratio. By constructing a system architecture integrating acoustic perception, semantic parsing, safety interlocking, and adaptive feedback, reliable interaction with deep fusion of voice recognition and vehicle status is achieved in complex riding environments.
[0008] Preferably, the acoustic front-end perception module is configured with an environmental noise threshold. When the real-time environmental signal-to-noise ratio is lower than the environmental noise threshold, the acoustic front-end perception module sends a first control signal to the offline semantic parsing module to trigger it to switch to a preset high-noise robust simplified instruction library, and simultaneously sends a second control signal to the adaptive feedback execution module to trigger it to increase the gain value of speech synthesis and reduce the speech rate. Dynamically switching the instruction library and adjusting the speech synthesis parameters according to the real-time environmental signal-to-noise ratio significantly improves the recognition success rate and feedback audibility of the system in high wind noise environments.
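The threshold behaviour described above can be sketched as a single function that maps the real-time signal-to-noise ratio to the two control signals. This is a minimal illustration, not the patented implementation: the threshold of 10 dB, the gain of 6 dB, the rate of 0.8, and the names `ModeSignals` and `select_mode` are all assumptions, since the source gives no numeric values.

```python
from dataclasses import dataclass

# Hypothetical threshold; the source does not state a numeric value.
NOISE_THRESHOLD_DB = 10.0


@dataclass(frozen=True)
class ModeSignals:
    library: str        # first control signal: which instruction library to load
    tts_gain_db: float  # second control signal: speech synthesis gain
    tts_rate: float     # second control signal: speech rate (1.0 = normal)


def select_mode(snr_db: float, threshold_db: float = NOISE_THRESHOLD_DB) -> ModeSignals:
    """Return the control signals for the measured signal-to-noise ratio."""
    if snr_db < threshold_db:
        # High-noise case: simplified library, louder and slower feedback.
        return ModeSignals(library="simplified", tts_gain_db=6.0, tts_rate=0.8)
    # Quiet case: standard library, default feedback parameters.
    return ModeSignals(library="standard", tts_gain_db=0.0, tts_rate=1.0)
```

A real system would probably add hysteresis around the threshold so that the library does not toggle repeatedly when the signal-to-noise ratio hovers near it; the source does not discuss this.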
[0009] Preferably, the safety interlock decision module pre-stores a status permission mapping table, which defines the corresponding logic between operation intent categories and vehicle speed ranges and tilt angle thresholds. When the operation intent is a physical opening command and the vehicle speed is greater than zero, the safety interlock decision module generates an interception signal. When the operation intent is a driving mode switching command and the vehicle speed is in the high-speed range, the safety interlock decision module generates a secondary confirmation request signal. By using the status permission mapping table to match and intercept high-risk commands in real time, mechanical safety hazards caused by accidental voice triggering during driving are effectively prevented.
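One way to picture the status permission mapping table is as a lookup from intent category to a rule over speed and tilt angle. The sketch below is illustrative only: the intent names, the 80 km/h high-speed boundary, the 20 degree tilt limit, and the choice to reject unknown intents are assumptions that the source does not specify.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    INTERCEPT = "intercept"
    CONFIRM = "confirm"


# Hypothetical limits; the source names the concepts but gives no numbers.
HIGH_SPEED_KMH = 80.0
TILT_LIMIT_DEG = 20.0

# Hypothetical intent names mapped to the categories used in the text.
INTENT_CATEGORY = {
    "open_seat_compartment": "physical_opening",
    "switch_sport_mode": "mode_switch",
    "next_track": "low_risk",
}


def decide(intent: str, speed_kmh: float, tilt_deg: float) -> Decision:
    """Match an operation intent against the current driving status."""
    category = INTENT_CATEGORY.get(intent)
    if category is None:
        return Decision.INTERCEPT  # assumption: unknown intents are rejected
    if category == "physical_opening" and speed_kmh > 0:
        return Decision.INTERCEPT
    if category == "mode_switch" and (
        speed_kmh >= HIGH_SPEED_KMH or abs(tilt_deg) >= TILT_LIMIT_DEG
    ):
        return Decision.CONFIRM
    return Decision.ALLOW
```

For example, `decide("open_seat_compartment", 30.0, 0.0)` yields `Decision.INTERCEPT`, while the same intent at standstill is allowed.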
[0010] Preferably, after the adaptive feedback execution module sends the interlock control command via the CAN bus, it triggers the vehicle status monitoring module to collect the feedback status of the actuator within a preset time window. The adaptive feedback execution module compares the feedback status with the interlock control command; if they do not match, it generates a fault alarm audio. This post-command status feedback comparison mechanism achieves closed-loop verification, ensuring accurate system execution and timely alarm in case of faults.
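The closed-loop check can be sketched as polling the actuator feedback until it matches the commanded state or the time window expires. The function name `verify_execution`, the 0.5 s window, and the polling interval are hypothetical; `read_feedback` stands in for whatever the vehicle status monitoring module actually exposes.

```python
import time
from typing import Callable, Optional


def verify_execution(
    target_state: str,
    read_feedback: Callable[[], Optional[str]],
    window_s: float = 0.5,   # hypothetical preset time window
    poll_s: float = 0.01,    # hypothetical polling interval
) -> bool:
    """Return True if the actuator reports target_state within the window."""
    deadline = time.monotonic() + window_s
    while True:
        if read_feedback() == target_state:
            return True
        if time.monotonic() >= deadline:
            return False  # caller should generate the fault alarm audio
        time.sleep(poll_s)
```

A `False` result corresponds to the mismatch case in which the fault alarm audio is generated.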
[0011] Preferably, the acoustic front-end perception module includes a digital signal processing chip and its peripheral filtering circuit, and the offline semantic parsing module and the safety interlock decision module are integrated into the microcontroller unit. The digital signal processing chip transmits the noise-reduced audio data to the microcontroller unit via an I2S bus, and transmits the environmental signal-to-noise ratio to the microcontroller unit in real time via a UART serial port or SPI bus. This hardware architecture, which pairs the digital signal processing chip with the microcontroller unit, combines efficient audio data processing with system control.
[0012] Preferably, the CAN controller pins of the microcontroller are connected to the transceiver ports of the CAN bus transceiver, and the high and low level terminals of the CAN bus transceiver are connected to the vehicle control network to form the vehicle status monitoring module. By connecting to the vehicle network via the CAN bus transceiver, real-time and stable acquisition of vehicle driving status data is achieved, providing a reliable data source for safety decisions.
[0013] Preferably, the DAC output or PWM output of the microcontroller is connected to the input of the power amplifier circuit to form the adaptive feedback execution module; the microcontroller drives the power amplifier circuit to change the output voltage amplitude by adjusting the duty cycle of the output signal or the value of the gain register. By dynamically adjusting the power amplifier circuit using the DAC / PWM output, hardware-level adaptive control of the feedback voice volume and sound quality is achieved.
[0014] This invention proposes an offline voice interaction method for motorcycles, applied to the aforementioned offline voice interaction system. The method includes: simultaneously acquiring audio signals from a microphone array and vehicle driving status data from the CAN bus; calculating the environmental signal-to-noise ratio (SNR) based on the audio signals; when the SNR meets a preset high-noise judgment condition, loading a simplified instruction library to parse the audio signals and obtain the operation intent; otherwise, loading a full-function instruction library for parsing; matching the operation intent with the vehicle driving status data in a status permission mapping table to determine control permissions; generating an execution command when the control permission is allowed, and generating an interception command when the control permission is prohibited; controlling vehicle actions in response to the execution command, and adjusting the gain and speech rate parameters of the voice feedback based on the environmental SNR. Through this full-process method of simultaneously acquiring acoustic and vehicle data, dynamically loading the instruction library, and implementing permission matching and adaptive feedback, safe and reliable voice interaction across the entire speed range is achieved.
[0015] Preferably, the method controls the speech synthesis module to issue an inquiry audio and open a short-term listening window; within the short-term listening window, it receives and parses the user's response audio; if the response audio matches a positive confirmation instruction, it generates the execution instruction and exits the interaction loop; if the response audio matches a negative or invalid instruction, it terminates the process. This secondary confirmation mechanism, through voice inquiry and short-term listening before sensitive operations, further reduces the risk of accidental operation and enhances system security.
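The secondary confirmation step can be sketched as follows. This is an assumption-laden illustration: the prompt text, the 3 s listening window, the set of affirmative phrases, and the treatment of a timeout as an invalid response are not specified in the source, and `ask` and `listen` are placeholders for the speech synthesis and recognition interfaces.

```python
from typing import Callable, Optional

# Hypothetical positive confirmation phrases.
AFFIRMATIVE = {"yes", "confirm"}


def confirm(
    ask: Callable[[str], None],
    listen: Callable[[float], Optional[str]],
    window_s: float = 3.0,  # hypothetical short-term listening window
) -> bool:
    """Play an inquiry, listen briefly, and return True only on a positive reply."""
    ask("Please confirm the mode switch.")
    reply = listen(window_s)  # assumed to return None when nothing is heard
    if reply is None:
        return False  # assumption: a timeout counts as an invalid response
    return reply.strip().lower() in AFFIRMATIVE
```

Only a `True` result leads to the execution instruction; a negative, invalid, or missing reply terminates the process.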
[0016] Preferably, the method reads the feedback message from the corresponding actuator via the CAN bus; compares the physical state contained in the feedback message with the target state of the executed instruction; and generates and plays a fault prompt audio when the comparison result is inconsistent. By reading the actuator feedback via the CAN bus and comparing it with the instruction target, the system achieves self-checking and fault voice prompts, improving the reliability and maintainability of the interactive system.
[0017] The present invention has the following beneficial effects: 1. Unlike existing technologies that simply enable or disable voice functionality based on vehicle speed thresholds, this invention establishes a dynamic mapping mechanism between environmental signal-to-noise ratio and command library loading strategy. This mechanism provides full-featured natural language interaction in low-speed, quiet environments, while automatically and seamlessly switching to a highly robust simplified command library and noise-resistant model in high-speed, high-wind-noise environments. This mechanism effectively balances the conflict between "recognition accuracy" and "functional coverage" without increasing hardware costs, ensuring that drivers can still control the vehicle via core commands even in harsh acoustic environments, avoiding interruptions in human-machine interaction caused by sudden changes in environmental noise.
[0018] 2. This invention departs from the traditional voice interaction system's singular logic that focuses solely on the semantics of the command. It introduces a state permission mapping table that incorporates multi-source data such as vehicle speed, tilt angle, and gear position. By interlocking voice intent with the vehicle's real-time physical state, the system can accurately identify and block high-risk operations such as "opening the seat compartment at high speed" and distracting operations while leaning into a corner, and introduces a secondary confirmation mechanism for critically risky commands. This not only guards against mechanical accidents caused by misidentification or misoperation but also addresses the technical challenge of balancing "convenience" and "safety" when using in-vehicle voice technology in highly dynamic motorcycle driving scenarios.
[0019] 3. Addressing the unique scenario of motorcycles where riders wear helmets and wind noise masking is a significant concern, this invention employs an environmental adaptive feedback mechanism to dynamically adjust speech synthesis parameters (volume, speech rate) according to ambient noise intensity, ensuring that feedback information effectively penetrates background noise and is perceived by the driver. Simultaneously, coupled with a closed-loop verification logic of "command-execution-feedback," the system can detect the physical actions of the actuators (such as changes in light current) in real time and provide timely feedback when execution fails, overcoming the shortcomings of open-loop control systems that fail to recognize command loss in environments with strong electromagnetic interference.
[0020] 4. This invention employs a fully offline edge computing architecture, integrating semantic parsing and logical judgment into a local microcontroller. This eliminates reliance on network signal coverage and completely resolves the pain points of large cloud recognition latency and unstable response in weak network environments such as mountainous areas and tunnels, ensuring millisecond-level real-time response for vehicle control. Simultaneously, this invention completes acoustic feature extraction and matching locally without uploading raw audio data, naturally complying with personal information protection regulations at the system architecture level and mitigating the risk of user privacy leaks.
Attached Figure Description
[0021] Figure 1 is a system module architecture diagram of the present invention.
[0022] Figure 2 is a circuit diagram of the present invention.
[0023] Figure 3 is a schematic diagram of the method flow of the present invention.
Detailed Implementation
[0024] Example 1: As shown in Figure 1, this invention provides a motorcycle offline voice interaction system based on multi-source state fusion. Its hardware architecture is primarily built upon an embedded in-vehicle computing platform, with core logic implemented collaboratively by multiple functional modules integrated into the microcontroller unit. From an overall system architecture perspective, the system does not exist in isolation but is deeply embedded within the motorcycle's electronic and electrical architecture, interacting in real time with the vehicle control network via a CAN bus. The system mainly includes an acoustic front-end perception module, an offline semantic parsing module, a vehicle status monitoring module, a safety interlock decision module, and an adaptive feedback execution module. The acoustic front-end perception module, acting as the system's sensing antennae, employs a multi-microphone array positioned on the motorcycle's dashboard or helmet. It not only collects the driver's voice commands but, more importantly, monitors changes in the external sound field environment in real time. The offline semantic parsing module and the safety interlock decision module constitute the system's "dual-core brain," responsible for understanding intent in the language dimension and making safety decisions in the physical dimension, respectively. These two modules are tightly coupled through data flow, ensuring that the execution of any voice command undergoes dual verification of "clear audibility" and "feasibility."
[0025] First, the acoustic front-end perception module and its linkage mechanism with downstream modules are described in detail. This module integrates a digital signal processing chip and dedicated filtering circuitry, where the raw analog audio signal acquired by the microphone array is converted into a digital signal. Unlike traditional solutions that only perform noise reduction, the acoustic front-end perception module in this embodiment has environmental feature extraction capabilities. It calculates the environmental signal-to-noise ratio (SNR) of the audio stream in real time. This parameter is not only used for enhancing the audio signal itself but also serves as a global control signal, synchronously sent to the downstream offline semantic parsing module and adaptive feedback execution module. Specifically, when the vehicle is stationary or traveling at low speed, and the SNR remains at a high level, the acoustic front-end perception module outputs a first state signal, keeping the system in standard operating mode. However, when the vehicle travels at high speed, causing a sharp increase in wind noise and the SNR to drop below the preset environmental noise threshold, the module immediately triggers the system to enter a "high-noise robust mode." This low-level triggering mechanism based on the acoustic environment provides a physical basis for solving the problem of "unclear hearing" on motorcycles in high-speed scenarios.
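For reference, a signal-to-noise ratio in decibels is conventionally computed as ten times the base-10 logarithm of the ratio of signal power to noise power. The sketch below assumes that separate speech and noise-only frames are already available (for example from a voice activity detector), which the source does not describe; `mean_power` and `snr_db` are hypothetical names.

```python
import math
from typing import Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average power of a frame of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech_frame: Sequence[float], noise_frame: Sequence[float]) -> float:
    """SNR in dB from a speech frame and a noise-only frame (assumed inputs)."""
    # A small floor avoids division by zero and log of zero on silent frames.
    p_signal = max(mean_power(speech_frame), 1e-12)
    p_noise = max(mean_power(noise_frame), 1e-12)
    return 10.0 * math.log10(p_signal / p_noise)
```

With a unit-amplitude speech frame and a noise frame of amplitude 0.1, the power ratio is 100 and the result is about 20 dB.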
[0026] Next is the dynamic response logic of the offline semantic parsing module. This module receives noise-reduced audio data and environmental signal-to-noise ratio control signals from the acoustic front-end perception module. Internally, it pre-stores at least two different levels of recognition instruction libraries: one is a standard instruction library covering all functions of navigation, entertainment, communication, and vehicle control; the other is a simplified instruction library that has been trimmed, retaining only core high-frequency instructions and trained with a specific acoustic model. When a control signal representing a high-noise environment is received, the offline semantic parsing module automatically unloads the standard instruction library and seamlessly switches to the simplified instruction library. The advantage of this strategy is that shrinking the search space substantially lowers the false recognition rate in extremely low signal-to-noise ratio environments, ensuring that even when riding in strong wind, the system can still accurately capture key phrases such as "home" and "answer the call." This achieves full-speed-range interactive usability and overcomes the deficiency of existing technologies, which are forced to shut down voice functions at high speeds.
[0027] The core safety logic of the system is jointly implemented by the vehicle status monitoring module and the safety interlock decision module. The vehicle status monitoring module is physically connected to the motorcycle's body control network via a CAN bus transceiver, polling and parsing in real time all-dimensional driving parameters, including vehicle speed, gear position, tilt sensor data, and ABS status. These parameters are transmitted to the safety interlock decision module in real time. This module stores a preset status permission mapping table, which defines the execution permissions of each voice operation intention under different vehicle physical states. When the offline semantic parsing module outputs an operation intention (such as "open the seat compartment"), the safety interlock decision module does not immediately allow it, but immediately retrieves the current vehicle speed and gear data. If the vehicle is detected to be in motion (i.e., the speed is greater than zero), according to the logic of the mapping table, the instruction will be judged as a "high-risk physical opening operation," and the decision module will then generate an interception signal to prevent the instruction from being issued, thus preventing the seat compartment from popping open during driving due to misoperation and causing an accident.
[0028] Furthermore, a "secondary confirmation" mechanism for specific scenarios is also a crucial function of the safety interlock decision module. When a user issues commands that may alter the vehicle's dynamic characteristics, such as "switch sport mode" or "adjust power output," if the vehicle status monitoring module indicates that the current speed is in the high-speed range or the tilt sensor shows that the vehicle is in a cornering posture, the safety interlock decision module will determine the command's authorization as "requires confirmation." In this case, the system will not immediately execute a mode switch but will instead trigger the adaptive feedback execution module to initiate a query. This design fully considers the high risk inherent in motorcycle driving, preventing driver errors due to sudden power changes when the driver is highly focused or the vehicle's posture is unstable, thus achieving a perfect balance between intelligence and safety.
[0029] Finally, the adaptive feedback execution module forms the closed-loop terminal for human-machine interaction. This module is responsible for two aspects: firstly, converting commands approved by the safety interlock decision module into standard CAN control messages and sending them to the vehicle's actuators (such as the lighting controller and electronic fuel injection system); secondly, providing feedback to the driver via TTS (Text-to-Speech) technology. To address the issue of "inaudible" feedback during high-speed riding, this module also receives environmental signal-to-noise ratio signals from the acoustic front-end perception module. As ambient noise increases, the module automatically increases the output gain of the speech synthesis and simultaneously reduces the speech rate, even automatically simplifying long sentences into short phrases to ensure the information can effectively penetrate wind noise and be received by the driver. Simultaneously, this module executes closed-loop verification logic. After sending control commands, it verifies the success of the action by reading the actuator feedback messages returned by the vehicle status monitoring module. If the status remains unchanged after the command is sent, a fault warning is promptly broadcast, thus ensuring the reliability of the entire control chain.
[0030] Example 2: As shown in Figure 2, this embodiment focuses on the hardware circuit implementation structure of the aforementioned motorcycle offline voice interaction system. To meet the stringent requirements of real-time response and high reliability in the vehicle environment, the core control circuit of this system adopts a dual-core distributed architecture, consisting of a dedicated digital signal processing chip (DSP) for acoustic signal processing and a main control microcontroller unit (MCU) responsible for logic decision-making and vehicle communication. This dual-core design physically decouples "perception" and "decision-making." The DSP and its peripheral circuits correspond to the aforementioned acoustic front-end perception module, while the main control microcontroller unit integrates the functions of the offline semantic parsing module and the safety interlock decision module. The two are physically connected via a high-speed digital audio interface (I2S bus) and a universal asynchronous receiver/transmitter interface (UART serial port). Specifically, the I2S bus is dedicated to transmitting high-fidelity digital audio streams after noise reduction processing, ensuring that the semantic parsing module can obtain clear acoustic features; while the UART serial port or SPI bus serves as a control link, specifically used to transmit environmental signal-to-noise ratio values calculated by the DSP, wake-up interrupt signals, and mode switching instructions. This discrete bus topology ensures that the transmission of large amounts of audio data will not block the real-time arrival of critical control signals, providing underlying hardware support for the system to complete the "high noise mode" switching within milliseconds.
[0031] In the acoustic front-end sensing circuit, to overcome wind noise and engine electromagnetic interference during high-speed motorcycle operation, the circuit design employs a multi-channel analog microphone input stage and an independent power supply filtering network. The analog input pins of the digital signal processing chip are connected to microphone arrays deployed in specific locations on the helmet or dashboard. Each microphone input is equipped with a low-pass filter and impedance matching circuit consisting of capacitors and resistors to filter out high-frequency radio frequency interference. Crucially, this section's power supply circuit uses a low-dropout linear regulator (LDO) for independent power supply, physically isolating the fluctuating noise generated by engine ignition in the vehicle's power supply, ensuring the high purity of the analog audio signal input to the digital signal processing chip. Furthermore, the digital signal processing chip is connected to the audio output circuit via a specific reference signal pin, forming a hardware reference link for acoustic echo cancellation (AEC). This allows the chip to acquire the sound emitted by the speaker in real time at the hardware level and cancel it out from the mixed signal acquired by the microphones. This ensures that the system can accurately listen to the user's interruption commands during playback feedback, providing the circuit foundation for full-duplex interaction.
[0032] In terms of vehicle status monitoring and communication circuits, this embodiment achieves deep physical integration with the vehicle's electronic and electrical architecture through the Controller Area Network (CAN bus) interface circuit. The main control microcontroller integrates a CAN controller, whose transmit pin (TX) and receive pin (RX) are connected to an independent CAN bus transceiver chip. The outputs of this transceiver chip, namely the CAN high-level terminal (CAN_H) and CAN low-level terminal (CAN_L), are connected to the motorcycle's main wiring harness via twisted-pair cables. To prevent high-voltage pulses generated by the motorcycle's ignition coil from damaging the core circuitry, transient voltage suppression diodes and common-mode inductors are connected in parallel at the CAN bus interface. Through this physical interface, the main control microcontroller can directly read engine speed, vehicle speed data converted from wheel speed sensor pulses, and digital messages from the tilt sensor at extremely high frequencies. This hard-wired connection removes the reliance of traditional consumer-grade products on Bluetooth to forward vehicle data, eliminating data transmission delays and ensuring that the "safety interlock decision module" can obtain absolutely real-time vehicle dynamics status, providing zero-latency hardware protection for "intercepting dangerous commands."
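To make the data path concrete, the sketch below decodes a vehicle status payload into the fields used by the safety interlock decision module. The frame layout is entirely hypothetical (little-endian, unsigned 16-bit speed in 0.01 km/h, signed 16-bit tilt in 0.1 degree, unsigned 8-bit gear); real layouts are defined by the vehicle manufacturer and are not given in the source.

```python
import struct
from dataclasses import dataclass


@dataclass
class VehicleState:
    speed_kmh: float
    tilt_deg: float
    gear: int


def decode_status_frame(data: bytes) -> VehicleState:
    """Decode a status payload using a hypothetical 5-byte layout."""
    if len(data) < 5:
        raise ValueError("status frame too short")
    # "<HhB": little-endian uint16 speed, int16 tilt, uint8 gear (assumed layout).
    raw_speed, raw_tilt, gear = struct.unpack_from("<HhB", data)
    return VehicleState(
        speed_kmh=raw_speed / 100.0,
        tilt_deg=raw_tilt / 10.0,
        gear=gear,
    )
```

The decoded `speed_kmh` and `tilt_deg` values are what a permission lookup would consume when matching an operation intent.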
[0033] Finally, the design of the adaptive feedback execution circuit directly reflects the system's hardware-level response capability to environmental noise. This part of the circuit mainly consists of a Class D power amplifier chip and its peripheral gain control network. The audio output pin (DAC) of the main control microcontroller unit is connected to the signal input terminal of the power amplifier, while the pulse width modulation output pin (PWM) or general purpose input / output pin (GPIO) of the main control microcontroller unit is connected to the gain control terminal or mute control terminal of the power amplifier. Unlike traditional fixed-gain circuits, the circuit in this embodiment allows the main control microcontroller unit to dynamically adjust the duty cycle of the PWM pin according to the aforementioned obtained environmental signal-to-noise ratio value, thereby changing the bias voltage or gain factor inside the power amplifier. This means that when the ambient wind noise increases, the system does not simply increase the amplitude of the digital audio signal (which may lead to clipping distortion), but physically increases the amplification factor of the hardware circuit, thereby driving the high-power speaker connected to the output terminal to produce a feedback sound with a higher sound pressure level. Meanwhile, the output of the power amplifier also leads out a current detection feedback signal to the analog-to-digital converter (ADC) interface of the main control microcontroller unit, which is used to monitor the working status of the speaker at the circuit level. Once an open circuit or short circuit is detected, the circuit can immediately trigger the self-protection logic and report the hardware fault, completing a complete closed loop from instruction issuance to physical execution and then to circuit status readback.
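The relationship between the measured signal-to-noise ratio and the PWM duty cycle can be sketched as a clamped linear map. The source states only that the duty cycle is adjusted according to the signal-to-noise ratio; the linear shape, the 0 dB and 20 dB endpoints, and the 30% to 90% duty range below are assumptions for illustration.

```python
def pwm_duty_for_snr(
    snr_db: float,
    snr_quiet_db: float = 20.0,  # hypothetical: at or above this, minimum gain
    snr_noisy_db: float = 0.0,   # hypothetical: at or below this, maximum gain
    duty_min: float = 0.30,      # hypothetical duty cycle for quiet conditions
    duty_max: float = 0.90,      # hypothetical duty cycle for noisy conditions
) -> float:
    """Map SNR to a PWM duty cycle; lower SNR gives a higher duty cycle."""
    if snr_db >= snr_quiet_db:
        return duty_min
    if snr_db <= snr_noisy_db:
        return duty_max
    # Linear interpolation between the two endpoints.
    frac = (snr_quiet_db - snr_db) / (snr_quiet_db - snr_noisy_db)
    return duty_min + frac * (duty_max - duty_min)
```

Clamping at both ends keeps the amplifier gain within a fixed range regardless of how extreme the measured value is.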
[0034] Example 3. As shown in Figure 3, this embodiment details the specific execution flow of a motorcycle offline voice interaction method based on multi-source state fusion. This method relies on the hardware system described in the aforementioned embodiments. Its core logic lies in replacing the traditional linear single path of "recognition first, execution later" in voice interaction with a composite processing link of "parallel monitoring of environmental perception and vehicle status, and dynamic matching of recognition strategies and security permissions." After the process starts, the system does not wait idly, but immediately enters a parallel dual-thread data acquisition stage. The first thread focuses on real-time capture of the acoustic environment, continuously acquiring external audio signals through a microphone array; these signals are used not only for subsequent voice recognition but also for real-time environmental noise assessment. Simultaneously, the second thread reads the vehicle's real-time driving status data through the CAN bus interface by high-frequency polling, including but not limited to vehicle speed pulse signals, gear status, tilt sensor values, and battery power information. This parallel acquisition mechanism ensures that whenever the system receives a user's voice command, it already holds a current picture of the physical environment and the vehicle dynamics, providing an up-to-date data foundation for subsequent "fusion decision-making."
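The dual-thread acquisition stage can be sketched in Python as follows. This is an illustrative sketch only: the `LatestValue` holder, the polling period, and the `read_audio` / `read_vehicle` callables are hypothetical stand-ins for the microphone array driver and the CAN bus reader, neither of which is specified in this description.

```python
import threading
import time


class LatestValue:
    """Thread-safe holder for the most recent sample from one source."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value


def acquisition_loop(read_sample, holder, stop_event, period_s):
    """Poll read_sample() until stop_event is set, keeping only the newest value."""
    while True:
        holder.set(read_sample())
        # wait() returns True as soon as the stop flag is set.
        if stop_event.wait(period_s):
            break


def run_parallel_acquisition(read_audio, read_vehicle, duration_s=0.05):
    """Run both acquisition threads briefly and return the latest snapshot."""
    audio, vehicle = LatestValue(), LatestValue()
    stop = threading.Event()
    threads = [
        threading.Thread(target=acquisition_loop, args=(read_audio, audio, stop, 0.005)),
        threading.Thread(target=acquisition_loop, args=(read_vehicle, vehicle, stop, 0.005)),
    ]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    return {"audio": audio.get(), "vehicle": vehicle.get()}
```

Keeping only the newest sample per source means the decision step always reads current data rather than working through a backlog, which matches the intent of the parallel acquisition described above.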
[0035] As data acquisition proceeds, the system executes an environmental adaptive preprocessing step, which is key to addressing the "unclear hearing" problem in high wind noise environments on motorcycles. The system's internal processor calculates the environmental signal-to-noise ratio (SNR) of the current audio signal in real time and compares this value with a preset high-noise criterion. If the calculated SNR is higher than the preset threshold, indicating a relatively quiet environment at rest or low speed, the system loads a full-featured instruction library and a standard acoustic model, supporting complex natural language navigation queries or multi-turn dialogues. However, if vehicle acceleration causes a surge in wind noise and the SNR drops below the high-noise criterion, the system immediately triggers a defensive degradation strategy, unloading the standard instruction library and loading a preset high-noise robust simplified instruction library. This simplified instruction library contains only high-frequency phrases for core driving control (such as "home" and "answer"), and significantly reduces the size of the search space by eliminating easily confused words. Through this dynamic switching mechanism, although some non-essential entertainment interaction functions are sacrificed under extreme wind noise, the recognition channel for core control commands remains robust, achieving continuous interaction across the full speed range.
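The threshold comparison in this step can be sketched as below. The SNR estimator (speech-frame power minus a noise-only frame's power) is one common approach and is an assumption here, since the description does not state how the SNR is computed. The 10 dB default threshold and all names are likewise hypothetical.

```python
import math

FULL_LIBRARY = "full"
SIMPLIFIED_LIBRARY = "simplified"


def mean_power(samples):
    """Mean squared amplitude of a frame of audio samples."""
    if not samples:
        raise ValueError("frame must not be empty")
    return sum(s * s for s in samples) / len(samples)


def estimate_snr_db(speech_frame, noise_frame):
    """Estimate SNR in dB from a speech-plus-noise frame and a noise-only frame.

    Assumed estimator: signal power is the speech-frame power minus the
    noise power, floored to avoid taking the logarithm of zero.
    """
    noise_power = max(mean_power(noise_frame), 1e-12)
    signal_power = max(mean_power(speech_frame) - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)


def select_library(snr_db, high_noise_threshold_db=10.0):
    """Pick the instruction library; the 10 dB threshold is a placeholder."""
    if snr_db < high_noise_threshold_db:
        return SIMPLIFIED_LIBRARY
    return FULL_LIBRARY
```

In practice a small hysteresis band around the threshold would probably be needed so that the library does not flip back and forth when the SNR hovers near the criterion; that refinement is not described in the source and is left out of the sketch.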
[0036] Once the system has parsed the user's operational intent in the aforementioned adaptive mode, the process enters the most crucial step: the safety interlock decision. At this point, the system no longer generates instructions solely based on semantic content; instead, it looks up the parsed operational intent, together with the real-time vehicle driving status data synchronously acquired by the second thread, in a pre-defined state permission mapping table. The core of this step lies in establishing a logical interlock between semantic intent and physical state. For example, when a user issues an operational intent involving physical mechanisms (such as "open the seat cover" or "open the fuel tank cap"), the system checks the current vehicle speed data. If the speed is not zero, the system determines the control permission for the operation as "prohibited" according to the mapping table, generates an interception command to prevent the actuator from acting, and simultaneously plays a "No operation while driving" warning. Conversely, if the user issues a command needed for riding, such as operating the lights or horn, the mapping table determines the permission as "permitted" even at high speed, and an execution command is generated. This step separates "understanding the command" from "deciding whether to carry it out," preventing dangerous operations at the algorithmic level.
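One way to picture the state permission mapping table is as a lookup from intent to a rule over vehicle state. The sketch below is an assumption-laden illustration: the intent keys, the rule functions, the fail-closed handling of unknown intents, and the third "secondary confirmation" category (which the description introduces in the following step) are chosen for the example. The 80 km/h and 15 degree limits reuse example figures that appear elsewhere in this description.

```python
from enum import Enum

HIGH_SPEED_KMH = 80.0   # example figure; a real table would be calibrated
STEEP_TILT_DEG = 15.0   # example figure; a real table would be calibrated


class Permission(Enum):
    PERMITTED = "permitted"
    PROHIBITED = "prohibited"
    CONFIRM = "secondary_confirmation"


def physical_opening_rule(speed_kmh, tilt_deg):
    """Physical opening commands are only allowed at standstill."""
    return Permission.PROHIBITED if speed_kmh > 0 else Permission.PERMITTED


def driving_mode_rule(speed_kmh, tilt_deg):
    """Mode changes at high speed or steep lean need a second confirmation."""
    if speed_kmh >= HIGH_SPEED_KMH or abs(tilt_deg) > STEEP_TILT_DEG:
        return Permission.CONFIRM
    return Permission.PERMITTED


def always_permitted_rule(speed_kmh, tilt_deg):
    """Commands needed for riding (lights, horn) are allowed at any speed."""
    return Permission.PERMITTED


# Hypothetical intent names mapped to their rules.
PERMISSION_TABLE = {
    "open_seat_cover": physical_opening_rule,
    "open_fuel_cap": physical_opening_rule,
    "switch_sport_mode": driving_mode_rule,
    "disable_traction_control": driving_mode_rule,
    "lights_on": always_permitted_rule,
    "horn": always_permitted_rule,
}


def decide(intent, speed_kmh, tilt_deg):
    """Return the control permission for an intent under the current state."""
    rule = PERMISSION_TABLE.get(intent)
    if rule is None:
        # Assumption: unknown intents fail closed.
        return Permission.PROHIBITED
    return rule(speed_kmh, tilt_deg)
```

For instance, `decide("open_seat_cover", 40.0, 0.0)` yields `Permission.PROHIBITED`, while `decide("horn", 120.0, 20.0)` yields `Permission.PERMITTED`.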
[0037] For certain commands in a gray area between safe and dangerous, this method further introduces an interactive loop step to handle the permission category that requires secondary confirmation. When the matching result shows that the user's intent involves changing the vehicle's dynamic characteristics (such as "switching to Sport mode" or "disabling traction control"), and the vehicle is currently in a demanding condition such as high-speed cruising or a steep lean, the system neither executes nor rejects the command directly, but instead classifies it as a "sensitive operation." At this point, the process enters a human-machine negotiation loop: the system first controls the voice synthesis module to play a question audio (such as "The current speed is high, are you sure you want to switch?"), and then opens a short listening window lasting several seconds. During this window, the system focuses on capturing the user's response audio. If a clear affirmative confirmation command is parsed, the system generates the execution command and exits the loop; if the user remains silent or issues a negative command during the window, the system automatically cancels the operation request. This mechanism leaves the final decision to the driver in special conditions, while the added interaction cost guards against safety hazards caused by accidental triggering.
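The negotiation loop reduces to a prompt followed by a single time-limited listen. The following sketch assumes two hypothetical callables, `play_prompt(text)` for the voice synthesis module and `listen(window_s)`, which blocks for up to `window_s` seconds and returns the recognized reply or `None` on silence. The affirmative vocabulary and the 5 second window are also illustrative.

```python
AFFIRMATIVE_REPLIES = {"yes", "confirm"}  # hypothetical confirmation vocabulary
PROMPT_TEXT = "The current speed is high, are you sure you want to switch?"


def confirm_sensitive_operation(play_prompt, listen, window_s=5.0):
    """Return True only if an affirmative reply arrives inside the window.

    Silence, a negative reply, or anything unrecognized cancels the request.
    """
    play_prompt(PROMPT_TEXT)
    reply = listen(window_s)
    if reply is None:
        return False
    return reply.strip().lower() in AFFIRMATIVE_REPLIES
```

Treating every non-affirmative outcome as a cancellation keeps the default safe: the command is executed only on an explicit confirmation.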
[0038] The final stage of the process is the execution and feedback step, which also demonstrates the invention's adaptability to the environment and the rigor of its control results. While the execution command is sent to the vehicle's actuators via the CAN bus, the system again calls upon the real-time environmental signal-to-noise ratio data acquired by the first thread to adjust the acoustic parameters of the voice feedback. In noisy scenarios, the system automatically increases the gain of the speech synthesis and lengthens the pronunciation duration (reducing the speech rate) to ensure that the feedback information can penetrate wind noise. More importantly, this step also includes a closed-loop verification sub-stage: after issuing the command, the system reads the status message of the corresponding actuator via the CAN bus (e.g., reading the current status of the turn signal or the mode feedback from the instrument panel). The system compares this physical feedback status with the initial target status of the execution command; only when they match will it announce "execution successful"; if they do not match (e.g., a light not turning on due to a wiring fault), the system will immediately announce a fault warning. This closed-loop design ensures that the driver can not only control the vehicle via voice but also know for sure whether the vehicle has actually executed the command, greatly improving the credibility and safety of the interactive system.
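The feedback adjustment and the closed-loop verification sub-stage can be sketched together. Everything below is a hedged illustration: the gain and speech-rate values, the 5 dB noise criterion, the polling timeout, and the `read_actuator_state` callable (standing in for a CAN status-frame read) are assumptions rather than details taken from the embodiment.

```python
import time


def feedback_parameters(snr_db, noisy_snr_db=5.0):
    """Choose speech synthesis parameters from the environmental SNR.

    In noisy conditions the gain is raised and the speech rate lowered;
    the concrete numbers are placeholders.
    """
    if snr_db < noisy_snr_db:
        return {"gain_db": 6.0, "speech_rate": 0.8}
    return {"gain_db": 0.0, "speech_rate": 1.0}


def verify_execution(target_state, read_actuator_state,
                     timeout_s=0.5, poll_s=0.05):
    """Poll the actuator status until it matches the target or time runs out.

    Returns the announcement the system would make.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if read_actuator_state() == target_state:
            return "execution successful"
        if time.monotonic() >= deadline:
            return "fault warning"
        time.sleep(poll_s)
```

Polling within a bounded window, rather than checking once, allows for the delay between sending the command and the actuator reporting its new state.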
[0039] Example 4. This embodiment, combining the aforementioned system architecture, circuit principles, and method flow, further illustrates the specific integration and application of the present invention in a real motorcycle environment. In practical engineering implementation, the offline voice interaction system described in this invention is typically packaged and integrated inside the motorcycle's TFT smart instrument panel, or installed as a separate smart vehicle terminal box (T-BOX) inside the front fairing. To achieve optimal sound pickup, the hardware of the acoustic front-end perception module employs a linear array of four high signal-to-noise ratio MEMS microphones, either mounted directly on the driver-facing side of the upper edge of the instrument panel or extended wirelessly to the inside of the driver's helmet. The system obtains 12 V vehicle power through the motorcycle's standard OBD interface or body wiring harness connector, and connects in parallel to the vehicle's Controller Area Network bus via the CAN_H and CAN_L signal lines, thus becoming a standard node in the vehicle's electronic and electrical architecture.
[0040] When the driver turns the key or presses the keyless start button, the motorcycle performs a power-on self-test, and the system initializes accordingly. At this time, the main control microcontroller activates the digital signal processing chip via the I2S interface and begins continuously collecting background noise. In scenarios where the vehicle is idling or riding at low speeds in urban areas (e.g., below 30 km/h), the environmental signal-to-noise ratio calculated by the digital signal processing chip is typically at a high level (e.g., greater than 15 dB). The system then automatically loads a full-featured natural language understanding model containing tens of thousands of data points. The driver can issue complex compound commands conversationally, such as "navigate to the nearest gas station and play rock music." The system parses both the "navigation" and "music" intents within this long sentence, plans a route using a local offline map engine, and simultaneously plays locally stored media files. The entire process is smooth and natural, comparable to the experience of existing high-end car cockpits.
[0041] When the vehicle enters an expressway or highway and its speed climbs above 80 km/h, wind and tire noise rise steeply, and the environmental signal-to-noise ratio drops sharply (e.g., below 5 dB). The digital signal processing chip not only executes beamforming algorithms to suppress lateral wind noise but, more importantly, sends a "high noise warning" interrupt signal to the main control microcontroller unit in real time. Upon receiving the signal, the main control microcontroller unit immediately executes a "degradation and survival" strategy, switching the ASR engine from full-function mode to a "minimalist riding mode" containing only a few dozen core words. In this mode the system may not recognize complex sentences, but its recognition weight for high-frequency short commands such as "answer the phone," "increase the volume," and "go home" is greatly increased. This deliberate "function degradation" is highly practical in actual riding because it keeps the most fundamental riding-related functions working even in the worst conditions, addressing the engineering pain point of traditional vehicle infotainment systems becoming unresponsive to voice at high speeds.
[0042] Regarding driving safety management, this embodiment strictly implements interlock logic based on physical states. Take the motorcycle-specific "electronically opened seat compartment" function as an example. In actual circuits, this function is usually implemented by the Body Control Module (BCM) driving a relay that pulls a steel cable. When the driver accidentally presses the voice button while riding, or when the system misrecognizes an "open the seat compartment" command because of environmental noise, the system's safety interlock decision module reads the real-time wheel speed data on the CAN bus within milliseconds. As long as the vehicle speed pulse is detected to be non-zero, the system blocks the issuance of the command and the TTS announces "Do not open while driving". Similarly, for commands such as "Switch to Sport Mode", which may cause a sudden change in engine torque output, if the system detects that the vehicle tilt sensor value exceeds 15 degrees (a cornering state), it determines that changing the power characteristics at this moment may cause the tires to slip and the motorcycle to crash, so it temporarily suspends the command or requires the driver to straighten the vehicle and confirm again. This logic, which is deeply integrated with vehicle dynamics, cannot be achieved by general mobile phone voice assistants or aftermarket Bluetooth headsets.
[0043] Finally, in the closed-loop feedback stage of command execution, this system solves the "feedback confirmation" problem through a combination of hardware and software. When the system executes the "turn on left turn signal" command, the main control microcontroller not only sends a CAN message to the vehicle network but also continuously monitors feedback frames on the bus regarding the "left turn signal status," and can additionally detect the current load of the flasher circuit via the ADC pin. Only after confirming that the turn signal is physically flashing does the system issue a voice announcement. Furthermore, the volume of this announcement is dynamic: under high-speed wind noise, the gain of the power amplifier circuit is automatically increased and combined with a "high-penetration" tone library specifically tuned for riding, so that the rider can clearly hear the "left light is on" confirmation even through a helmet. This series of logical closed loops, designed around real-world operating conditions, forms the basis for the industrial applicability of this invention.
Claims
1. A motorcycle offline voice interaction system, characterized in that the system includes: an acoustic front-end perception module, which is signal-connected to an offline semantic parsing module and an adaptive feedback execution module, and which outputs the real-time environmental signal-to-noise ratio; the offline semantic parsing module, which receives the environmental signal-to-noise ratio and loads the corresponding recognition instruction library to parse the audio and output the operation intent; a safety interlock decision module, which is connected to the offline semantic parsing module and a vehicle status monitoring module respectively, and which matches the operation intent with the real-time driving status data in a state permission mapping table; and the adaptive feedback execution module, which is connected to the safety interlock decision module and the acoustic front-end perception module, executes interlock control commands, and adjusts the acoustic parameters of the feedback speech according to the environmental signal-to-noise ratio.
2. The motorcycle offline voice interaction system according to claim 1, characterized in that the acoustic front-end perception module is configured with an environmental noise threshold; when the real-time environmental signal-to-noise ratio is lower than the environmental noise threshold, the acoustic front-end perception module sends a first control signal to the offline semantic parsing module to trigger it to switch to a preset high-noise robust simplified instruction library, and simultaneously sends a second control signal to the adaptive feedback execution module to trigger it to increase the gain value of speech synthesis and reduce the speech rate.
3. The motorcycle offline voice interaction system according to claim 1 or 2, characterized in that the safety interlock decision module has a pre-stored state permission mapping table, which defines the corresponding logic between operation intent categories and vehicle speed ranges and tilt angle thresholds; when the operation intent is a physical opening command and the vehicle speed is greater than zero, the safety interlock decision module generates an interception signal; when the operation intent is a driving mode switching command and the vehicle speed is in the high-speed range, the safety interlock decision module generates a secondary confirmation request signal.
4. The motorcycle offline voice interaction system according to claim 1, characterized in that, after the adaptive feedback execution module sends the interlock control command via the CAN bus, it triggers the vehicle status monitoring module to collect the feedback status of the actuator within a preset time window; the adaptive feedback execution module compares the feedback status with the interlock control command and, if they are inconsistent, generates a fault broadcast audio.
5. The motorcycle offline voice interaction system according to claim 1, characterized in that the acoustic front-end perception module includes a digital signal processing chip and its peripheral filtering circuit; the offline semantic parsing module and the safety interlock decision module are integrated in the microcontroller unit; the digital signal processing chip transmits the noise-reduced audio data to the microcontroller unit through the I2S bus, and transmits the environmental signal-to-noise ratio to the microcontroller unit in real time through the UART serial port or SPI bus.
6. The motorcycle offline voice interaction system according to claim 1 or 5, characterized in that the CAN controller pin of the microcontroller unit is connected to the transceiver port of the CAN bus transceiver, and the high and low level terminals of the CAN bus transceiver are connected to the vehicle body control network to form the vehicle status monitoring module.
7. The motorcycle offline voice interaction system according to claim 1 or 5, characterized in that the DAC output or PWM output of the microcontroller unit is connected to the input of the power amplifier circuit to form the adaptive feedback execution module; the microcontroller unit drives the power amplifier circuit to change the output voltage amplitude by adjusting the duty cycle of the output signal or the value of the gain register.
8. A method for offline voice interaction on a motorcycle, the method being applied to the motorcycle offline voice interaction system according to any one of claims 1 to 7, characterized in that the method includes: simultaneously acquiring audio signals from the microphone array and vehicle driving status data from the CAN bus; calculating the environmental signal-to-noise ratio based on the audio signal and, when the environmental signal-to-noise ratio meets the preset high-noise judgment condition, loading a simplified instruction library to parse the audio signal and obtain the operation intent, and otherwise loading a full-function instruction library for parsing; matching the operation intent with the vehicle driving status data in the state permission mapping table to determine the control permission, generating an execution command when the control permission is permitted, and generating an interception command when the control permission is prohibited; and controlling the vehicle to act in response to the execution command, and adjusting the gain and speech rate parameters of the voice feedback according to the environmental signal-to-noise ratio.
9. The method for offline voice interaction on a motorcycle according to claim 8, characterized in that the method includes: controlling the speech synthesis module to play an inquiry audio and opening a short listening window; receiving and parsing the user's response audio within the short listening window; and, if the response audio matches an affirmative confirmation instruction, generating the execution command and exiting the interaction loop, or, if the response audio matches a negative or invalid instruction, terminating the process.
10. The method for offline voice interaction on a motorcycle according to claim 8, characterized in that the method includes: reading the feedback message of the corresponding actuator through the CAN bus; comparing the physical state contained in the feedback message with the target state of the execution command; and, when the comparison results are inconsistent, generating and playing a fault prompt audio.