Method for generating in-vehicle ambient music, system, and electronic device
By acquiring images of the vehicle exterior and the driver's status to generate personalized ambient music, the system solves the problem that existing in-vehicle audio systems cannot automatically adjust the music atmosphere, thus achieving a personalized and comfortable in-vehicle listening experience.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- AMPERE SAS
- Filing Date
- 2025-12-22
- Publication Date
- 2026-07-02
AI Technical Summary
Existing in-vehicle audio systems cannot automatically adjust the music atmosphere according to the driver's mood or changes in the environment, and the music recommendation system cannot accurately meet the user's needs, resulting in an inefficient user feedback mechanism.
By acquiring real-time images of the outside of the vehicle, parameters of the in-vehicle environment, and the driver's status, a generative AI algorithm is used to generate personalized ambient music, which is then matched with the ambient lighting configuration. The music experience is adjusted in conjunction with a user feedback optimization mechanism.
It enables the generation of personalized ambient music based on driving scenarios and driver status, enhancing the personalization and comfort of the in-car listening experience and strengthening the driver's immersive driving experience.
Figure CN2025144227_02072026_PF_FP_ABST
Abstract
Description
Methods, systems and electronic devices for generating in-vehicle ambient music
Technical Field
[0001] This invention relates to the field of intelligent cockpit control technology, specifically to a method, system, and electronic device for generating in-vehicle ambient music.
Background Technology
[0002] With the continuous development of technology, automotive intelligence has become an important trend in the industry. As an important component of intelligent vehicles, the intelligent cockpit integrates a variety of advanced electronic devices and functions, among which in-car ambient music plays a crucial role in enhancing the driving and riding experience.
[0003] Current in-vehicle audio systems typically only offer basic playback and volume control functions, lacking the ability to automatically adjust the music atmosphere based on the driver's mood or changes in the environment. Meanwhile, current music recommendation systems fail to fully meet diverse user needs, offering only broad music recommendations that are difficult to precisely match the user's requirements in specific driving scenarios. Furthermore, user feedback mechanisms are inefficient, and the system cannot promptly and accurately adjust recommendations based on user feedback.
[0004] Generative algorithms hold immense potential in personalized content creation, enabling them to more accurately uncover user information needs and overcome the limitations of existing recommendation systems. By generating personalized music, an immersive in-car atmosphere can be created, and users can also provide real-time feedback via voice and touch commands, allowing them to express their needs more precisely. Therefore, there is an urgent need to combine generative algorithms with personalized and intelligent ambient music in smart cockpits to provide drivers with an immersive driving experience.
Summary of the Invention
[0005] This invention proposes a method, system, and electronic device for generating in-vehicle ambient music, aiming to overcome one or more of the above-mentioned technical problems and / or other technical problems in the prior art.
[0006] According to one aspect of this application, a method for generating in-vehicle ambient music is proposed, comprising the following steps:
[0007] Acquire real-time images of the outside of the vehicle, parameters of the interior environment, and the driver's status;
[0008] Determine the driving scenario and external environment parameters based on the acquired real-time images of the vehicle exterior;
[0009] Generate feature vectors based at least on the driving scenario, the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state;
[0010] Determine core keywords based on the generated feature vectors;
[0011] Generate ambient music based on the core keywords using a generative AI algorithm;
[0012] Match the generated ambient music playlist with the ambient lighting configuration.
[0013] Another aspect of this application proposes an in-vehicle ambient music generation system, which includes:
[0014] The detection module is used to acquire real-time images of the outside of the vehicle, parameters of the interior environment, and the driver's status.
[0015] The feature vector generation module is used to generate feature vectors based at least on the driving scenario and the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state.
[0016] The core keyword determination module is used to determine core keywords based on the generated feature vectors.
[0017] An ambient music generation module is used to generate ambient music based on the core keywords using a generative AI algorithm.
[0018] The ambient light control module is used to match the ambient music playlist with the ambient light configuration and drive the ambient light accordingly.
[0019] This application also proposes an electronic device including multiple sensors, a storage unit, a processing unit, and an ambient light controller, wherein the sensors, storage unit, and ambient light controller are respectively signal-connected to the processing unit, and the processing unit is configured to execute any of the above-described methods for generating in-vehicle ambient music.
[0020] Finally, this application proposes a vehicle having any of the aforementioned in-vehicle ambient music generation systems and / or the aforementioned electronic device.
[0021] In the proposed solution, real-time data on the in-vehicle and out-of-vehicle environments, along with the driver's physiological or psychological state data (such as real-time external images, temperature, heart rate, steering wheel grip strength, etc.), are collected. This data, combined with the driving scenario and driver preferences, is used to determine core keywords based on predetermined rules. A generative algorithm is then employed to generate personalized ambient music. Furthermore, the generated ambient music is synchronized with the in-vehicle ambient lighting, whose effects are automatically adjusted according to the music's rhythm, mood, and the scene. Simultaneously, a user feedback optimization mechanism is used to provide users with the best listening experience, significantly enhancing the personalization and comfort of the in-vehicle listening experience.
Attached Figure Description
[0022] The above and other features and advantages of the present invention will become more apparent from a detailed description of exemplary embodiments thereof with reference to the accompanying drawings.
[0023] Figure 1 is a flowchart of a preferred embodiment of a method for generating in-vehicle ambient music according to the present invention;
[0024] Figure 2 is a flowchart of the method for generating in-vehicle ambient music according to the present invention from the perspective of data processing;
[0025] Figure 3 is a schematic diagram of the feedback optimization process in the method for generating in-vehicle ambient music according to the present invention;
[0026] Figure 4 is an architectural diagram of the electronic device according to the present invention.
Detailed Implementation
[0027] Exemplary embodiments will now be described more fully with reference to the accompanying drawings. However, these exemplary embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided to make the content of the invention comprehensive and complete, and to fully convey the concept of the exemplary embodiments to those skilled in the art. In the drawings, the dimensions of some elements may be exaggerated or modified for clarity. The same reference numerals in the drawings denote the same or similar structures, and therefore their detailed description will be omitted.
[0028] Furthermore, the described features, structures, or characteristics can be combined in any suitable manner in one or more embodiments. Numerous specific details are provided in the following description to give a full understanding of embodiments of the invention. However, those skilled in the art will recognize that the invention can be practiced without one or more of the specific details described, or other methods, elements, etc. In other instances, well-known structures, methods, or operations are not shown or described in detail to avoid obscuring aspects of the invention.
[0029] Figure 1 shows a flowchart of a preferred embodiment of the method for generating in-vehicle ambient music according to the present invention. In this embodiment, the method for generating in-vehicle ambient music mainly includes the following steps:
[0030] S1: Acquire real-time images of the outside of the vehicle, parameters of the in-vehicle environment, and the driver's status;
[0031] S2: Determine driving scenario and external environment parameters based on the acquired real-time images outside the vehicle;
[0032] S3: Generate a feature vector based at least on the driving scenario and the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state;
[0033] S4: Determine the core keywords based on the generated feature vectors;
[0034] S5: Use a generative AI algorithm to generate ambient music based on the core keywords;
[0035] S6: Match the generated ambient music list with the ambient light configuration.
[0036] In step S1, the detection module acquires real-time images of the outside of the vehicle, in-vehicle environmental parameters, and driver status. In-vehicle environmental parameters include, for example, in-vehicle temperature and in-vehicle noise levels. Driver status refers to the driver's physiological or psychological state, including driver heart rate, steering wheel grip strength, respiratory rate, and galvanic skin response (GSR). The detection module is a multimodal detection device, which includes, for example, multiple sensors, such as a camera for capturing real-time images of the outside environment, a temperature sensor for detecting in-vehicle temperature, a heart rate sensor for detecting driver heart rate, and a force sensor for detecting the driver's steering wheel grip strength. Other parameters inside and outside the vehicle, such as radar information, steering wheel angle, vehicle speed, braking frequency, respiratory rate, GSR, in-vehicle noise level, in-vehicle conversations, and real-time in-vehicle images, can also be collected or considered together. Therefore, the detection device also includes the corresponding sensors or is signal-connected to them.
[0037] In the next step S2, the data input to the detection module is processed. For example, driving scene recognition is used to determine the driving scene from real-time images outside the vehicle, and a convolutional neural network (CNN) is used to extract external environmental parameters from the real-time images, including at least weather features, road structure features, and traffic sign features. A convolutional neural network (CNN) is a deep learning model that excels in image recognition, video analysis, and natural language processing. CNNs use convolutional layers to extract local features from image data, then use pooling layers to reduce the spatial dimensionality of the features, and finally use fully connected layers for classification or regression tasks. Here, scene recognition and feature extraction can be performed solely based on real-time images outside the vehicle, or radar information can be considered additionally. Considering radar information allows for reliable acquisition of external vehicle information even in situations with poor visibility and limited camera coverage.
[0038] In step S3, feature vectors are generated based at least on the driving scene, real-time external images, in-vehicle environmental parameters, and driver state. For example, firstly, scene-specific weighting rules are determined based on the driving scene; then, feature vectors are generated based on these rules, according to the real-time external images, in-vehicle environmental parameters, and driver state. The driving scene is determined by recognizing real-time external images. Here, driving scene recognition is also based on deep learning image analysis technology, mainly used to automatically identify scenes in images, including object categories, positions, and poses. The core principle of the scene recognition algorithm is to extract and classify features from images using a deep learning model. Convolutional Neural Networks (CNNs) can also be used in the scene recognition algorithm, especially object detection methods based on Region Proposal Networks (RPNs), thus achieving a good balance between detection accuracy and speed.
[0039] Based on the scene recognition results, driving scenarios can generally be categorized into the following preset scenarios: night driving, highway driving, driving in severe weather, traffic congestion, urban driving, and mountain driving. For each driving scenario, a scene priority score is assigned to each input parameter. For undefined scenarios, the system will automatically classify them based on the current data input and match them to the closest preset scenario. In other words, based on the driving scene recognition results, the current driving scenario is matched to the closest preset scenario, and the scene priority score for each parameter is determined based on the closest preset scenario. Scene priority scores can be derived from experiments, empirical data, or computer simulations. Table 1 shows an optimal scheme for the priority scores of each parameter under various scenarios.
[0040] Table 1 Scene Priority Scoring
[0041] Scenario-specific priority scores are fixed, ensuring that for specific situations (high-speed driving, inclement weather, night driving), specific data types (heart rate, real-time images) consistently have high base priority scores. This guarantees that the driver's physiological, psychological, and environmental feedback is given priority consideration in these driving scenarios or situations. Here, priority scoring is a relative evaluation standard: a higher score reflects the greater importance of a data type or parameter, that is, the weight of individual data within the entire dataset. For each scenario:
- Night driving: Low visibility and dim lighting increase driver fatigue, so image data has a higher weight, and the proportion of upbeat music is increased to improve driver alertness. Heart rate data is used to assess fatigue and tension levels and to fine-tune volume and style.
- High-speed driving: Steering wheel grip strength and heart rate stability are crucial, especially since tension is easily aroused at high speeds, so these two data points have higher weights. Smoother, lower-pitched music is generated to help the driver remain calm and focused.
- Inclement weather: Visibility has a more significant impact, so image data is given higher weight, and more soothing music is generated to reduce driver anxiety caused by low visibility and poor weather conditions. Grip force data can be used to adjust the rhythm and volume of the music.
- Traffic congestion: Drivers are prone to anxiety or fatigue, especially while waiting at a stop. Changes in grip force better reflect driver mood swings, so more relaxing music is generated to alleviate driver anxiety.
- Urban driving: Real-time images are given higher weight for recognizing traffic signals and road signs. The music rhythm adjusts with traffic flow, and temperature and heart rate are used to fine-tune volume and style to suit driving comfort.
- Mountain driving: Real-time images and heart rate are given high weight, and music generation needs to adapt to complex road conditions and driver stress levels, including more focused / relaxed music styles.
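The scenario-to-priority lookup described above can be sketched roughly as follows. This is only an illustration: the contents of Table 1 are not reproduced in this text, so every score below is a placeholder, and the names `PRESET_PRIORITIES`, `closest_preset`, and `priority_scores` are hypothetical. The sketch also assumes that scene recognition outputs a confidence per preset scenario and that "closest" means "highest confidence", which the source does not specify.

```python
# Placeholder scene priority scores P_i in [0, 2]. Table 1's actual values are
# not reproduced in this text, so every number below is illustrative only.
PRESET_PRIORITIES = {
    "night_driving":      {"image": 1.8, "temperature": 0.6, "heart_rate": 1.6, "grip": 1.0},
    "highway_driving":    {"image": 1.2, "temperature": 0.5, "heart_rate": 1.7, "grip": 1.8},
    "severe_weather":     {"image": 1.9, "temperature": 0.8, "heart_rate": 1.3, "grip": 1.4},
    "traffic_congestion": {"image": 1.0, "temperature": 0.8, "heart_rate": 1.3, "grip": 1.7},
    "urban_driving":      {"image": 1.7, "temperature": 1.0, "heart_rate": 1.2, "grip": 0.9},
    "mountain_driving":   {"image": 1.8, "temperature": 0.6, "heart_rate": 1.7, "grip": 1.3},
}


def closest_preset(scene_confidences):
    """Pick the preset scenario with the highest recognition confidence.

    Assumption: the scene recognizer returns one confidence per preset name.
    """
    known = {k: v for k, v in scene_confidences.items() if k in PRESET_PRIORITIES}
    if not known:
        raise ValueError("no preset scenario among the recognition results")
    return max(known, key=known.get)


def priority_scores(scene_confidences):
    """Return the matched preset scenario and its per-parameter priority scores."""
    scene = closest_preset(scene_confidences)
    return scene, PRESET_PRIORITIES[scene]


scene, scores = priority_scores(
    {"night_driving": 0.55, "urban_driving": 0.30, "mountain_driving": 0.15}
)
print(scene, scores)
```

A fixed lookup table keeps the base priorities predictable per scenario, which matches the statement that scenario-specific scores are fixed; the dynamic part is handled separately by the trend and validity scores.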
[0042] During the feature vector generation process, additional considerations can be made regarding the changing trends of real-time external images, in-vehicle environmental parameters, and driver status, as well as / or data validity. In other words, weights are adjusted based on these factors. The following are suggested scoring criteria for parameter trends and / or data validity.
[0043] Table 2 Scoring Criteria Based on Data Type and Trend
[0044] In Table 2 above, the threshold values are derived from experiments or experience. By scoring the changing trends of each parameter and the validity of the data, the weights can be dynamically adjusted based on real-time changes in the data under specific circumstances. For example, detecting key points in the data (extreme values, ranges, fluctuation frequencies) ensures that sudden environmental changes or physiological reactions are responded to in a timely manner. If key data is detected to exceed a specific threshold, the priority score of that data will be automatically increased according to preset rules.
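As a rough illustration of the threshold-triggered adjustment described above, the sketch below scores a parameter's trend from its rate of change and raises its priority score when an extreme threshold is exceeded. Table 2 is not reproduced in this text, so the thresholds (`large`, `extreme`), the score levels, the `bump` size, and all function names are assumptions made for the example; only the heart-rate rise from 70 to 120 beats / min within 30 seconds comes from the description.

```python
def rate_of_change(samples, dt_seconds):
    """Average change per minute between the first and last sample."""
    span_minutes = dt_seconds * (len(samples) - 1) / 60.0
    return (samples[-1] - samples[0]) / span_minutes


def trend_score(rate_per_min, large, extreme):
    """Map an absolute rate of change to a trend score T_i in [0, 1].

    The three score levels are illustrative placeholders.
    """
    magnitude = abs(rate_per_min)
    if magnitude > extreme:
        return 1.0
    if magnitude > large:
        return 0.6
    return 0.2


def boosted_priority(base, rate_per_min, extreme, bump=0.3, cap=2.0):
    """Raise the priority score when the change exceeds the extreme threshold."""
    if abs(rate_per_min) > extreme:
        return min(cap, base + bump)
    return base


# Heart rate sampled every 15 s, rising from 70 to 120 beats/min in 30 s.
hr_rate = rate_of_change([70, 95, 120], dt_seconds=15)
hr_trend = trend_score(hr_rate, large=20.0, extreme=60.0)
hr_priority = boosted_priority(1.5, hr_rate, extreme=60.0)
print(hr_rate, hr_trend, hr_priority)
```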
[0045] In a favorable implementation, the weight of each parameter can be assigned using the following formula: $w_i = F_i \cdot (\alpha_i \cdot P_i + \beta_i \cdot T_i + \gamma_i \cdot D_i)$;
[0046] where:
[0047] $w_i$: the final weight of input parameter $i$;
[0048] $i$: the index of the input parameter, where $i = 1, 2, 3, 4$ represent real-time image, in-vehicle temperature, heart rate, and steering wheel grip strength, respectively;
[0049] $F_i$: the adaptive adjustment factor, whose values are explained in paragraphs [0055] to [0058];
[0050] $\alpha_i$: the scenario priority factor, which ensures that even if data scores low in the preset criteria (Table 2) due to its data type, its impact on the system can still be increased, thereby enhancing the system's responsiveness. For example, in actual driving, special situations or more subtle scenario changes may occur, requiring real-time adjustments to the data's importance. For instance, in a typical nighttime driving scenario, if a sudden downpour occurs, the value of $\alpha_i$ can be increased based on this special circumstance, increasing the importance of the real-time image data;
[0051] $\beta_i$: the real-time trend factor for parameter data, which adjusts the degree of influence of data change trends. If a data point changes rapidly (e.g., a sharp drop in temperature), the system will increase the value of this factor, indicating that the dynamic change in the current data is noteworthy. For example, if the heart rate rises from 70 beats / min to 120 beats / min within 30 seconds, $\beta_3$ can be set to 0.9, indicating significant emotional fluctuations. Some short-term fluctuations are meaningless, such as a rapid increase in heart rate followed by a quick return to normal (e.g., a driver sneezing or being suddenly startled); these short-term fluctuations do not have a long-term impact on the system's decision-making. In this case, the value of $\beta_i$ should be reduced;
[0052] $\gamma_i$: the importance factor of the data, reflecting the impact of data quality. Data quality can be affected by various factors. For example, the steering wheel grip force sensor may experience temporary malfunctions or interference, leading to a decrease in data quality. Based on these real-time changes in data quality, the importance factor of the steering wheel grip force data can be reduced to avoid unreliable data misleading the system. Here, $\alpha_i$, $\beta_i$, and $\gamma_i$ are all decimals in the range [0, 1].
[0053] $P_i$: the scene priority score of the input parameter data (Table 1); $T_i$: the change trend score of the input parameter data (Table 2); $D_i$: the comprehensive score of the input parameter data, corresponding to the comprehensive score, temperature range, heart rate range, and grip strength entries of the scoring criteria in Table 2 and reflecting the actual state of the data. Among these, $P_i$ is a decimal in the range [0, 2], and $T_i$ and $D_i$ are decimals in the range [0, 1]. $P_i$ indicates the fundamental importance of each data point in a specific scenario and is used to distinguish which data is more suitable for the current situation. For example, in nighttime driving, real-time images and heart rate are more important than temperature. Using nighttime driving, highway driving, severe weather, traffic congestion, city driving, and mountain driving as examples, randomly assigned weight values are used for illustration; the actual scores are based on in-depth research and experiments.
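To make the weight formula concrete, the following minimal sketch evaluates it for one parameter and checks the stated value ranges. The class and function names (`ParamScores`, `final_weight`) and all sample numbers are illustrative assumptions, except the trend factor 0.9 for heart rate, which is the example value given in the description.

```python
from dataclasses import dataclass


@dataclass
class ParamScores:
    F: float      # adaptive adjustment factor F_i
    alpha: float  # scenario priority factor, in [0, 1]
    beta: float   # real-time trend factor, in [0, 1]
    gamma: float  # data importance (quality) factor, in [0, 1]
    P: float      # scene priority score (Table 1), in [0, 2]
    T: float      # change trend score (Table 2), in [0, 1]
    D: float      # comprehensive data score (Table 2), in [0, 1]


def final_weight(s):
    """Evaluate w_i = F_i * (alpha_i * P_i + beta_i * T_i + gamma_i * D_i)."""
    for name in ("alpha", "beta", "gamma", "T", "D"):
        if not 0.0 <= getattr(s, name) <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    if not 0.0 <= s.P <= 2.0:
        raise ValueError("P must lie in [0, 2]")
    return s.F * (s.alpha * s.P + s.beta * s.T + s.gamma * s.D)


# Hypothetical heart-rate scores (i = 3) during a rapid rise in heart rate.
heart_rate = ParamScores(F=2.0, alpha=0.5, beta=0.9, gamma=1.0, P=1.5, T=1.0, D=0.8)
w3 = final_weight(heart_rate)
print(w3)  # 2.0 * (0.75 + 0.9 + 0.8), approximately 4.9
```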
[0054] The above formula is explained in detail below:
[0055] Real-time image brightness change: If the rate of change in image brightness is less than $a_{11}$ or higher than $b_{11}$, the visibility is considered poor / too strong, and factor $F_1 = n_1$; if the rate of change in brightness is within $a_{11}$ to $b_{11}$, factor $F_1 = n_2$; if the brightness value is within a moderate range and shows no significant change, factor $F_1 = n_3$. For example, if the image brightness change rate is below -30% or above 30%, the visibility is considered poor / excessively strong, and factor $F_1 = 2$. The actual value needs to be based on long-term driving data analysis. Here, $a_{11}$ and $b_{11}$ refer to rates of change; the brightness change rate ranges from -100% to 100%, with the brightness baseline set based on a minimum brightness of 0% and a maximum brightness of 100%; "constant" here means no significant fluctuation. The maximum brightness change rate during normal image display is taken as 100%, and the moderate range for the brightness change rate is set at -30% to 30%, where a brightness change rate between -5% and 5% can be considered as having no significant change. The range $a_{11}$ to $b_{11}$ excludes the band with no significant change. Brightness values range from 0% to 100%, with a moderate brightness range of 30% to 80%.
[0056] Temperature fluctuation: When the temperature change is greater than $a_{22}$ ℃ / min, it is determined to exceed the threshold, and factor $F_2 = n_1$; when the temperature change is greater than $b_{22}$ ℃ / min, it is considered a large fluctuation, and factor $F_2 = n_2$. For example, when the temperature change is greater than 10 ℃ / min, it is considered to exceed the threshold; when the temperature change is greater than 5 ℃ / min, it is considered a large fluctuation.
[0057] Heart rate fluctuations: If the heart rate change is greater than $b_{31}$, the fluctuation is considered significant, and factor $F_3 = n_1$; if the fluctuation amplitude is between $a_{31}$ and $b_{31}$, factor $F_3 = n_2$.
[0058] Steering wheel grip force change: If the change in steering wheel grip force is greater than $b_{42}$ units (quantified as 0-100), it is judged as having large fluctuations, and factor $F_4 = n_1$; if the change is less than $b_{42}$ units and greater than $a_{42}$ units, factor $F_4 = n_2$.
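The piecewise rules for the adaptive adjustment factor can be sketched as simple threshold functions. The numeric thresholds for brightness (±30%, ±5%, 30% to 80%) and temperature (10 and 5 ℃ / min) are the example values from the description; the factor levels `N2` and `N3`, the neutral fallback `DEFAULT_FACTOR`, the heart-rate thresholds, and the handling of cases the text leaves undefined are assumptions of this sketch.

```python
N1, N2, N3 = 2.0, 1.5, 1.0   # N1 = 2 follows the F1 example; N2 and N3 are placeholders
DEFAULT_FACTOR = 1.0          # assumed neutral value when no rule applies


def brightness_factor(rate_pct, brightness_pct, a11=-30.0, b11=30.0,
                      calm_band=5.0, moderate=(30.0, 80.0)):
    """F1 from the brightness change rate (%) and the brightness value (%)."""
    if rate_pct < a11 or rate_pct > b11:
        return N1  # visibility considered poor / too strong
    no_change = abs(rate_pct) <= calm_band
    if no_change and moderate[0] <= brightness_pct <= moderate[1]:
        return N3  # moderate brightness with no significant change
    return N2      # change within a11..b11 (undefined cases also land here)


def temperature_factor(rate_c_per_min, exceed=10.0, large=5.0):
    """F2 from the temperature change rate in deg C per minute."""
    if rate_c_per_min > exceed:
        return N1
    if rate_c_per_min > large:
        return N2
    return DEFAULT_FACTOR


def band_factor(change, a, b):
    """Shared shape of the F3 (heart rate) and F4 (grip force) rules."""
    if change > b:
        return N1
    if a < change <= b:
        return N2
    return DEFAULT_FACTOR


print(brightness_factor(-40.0, 50.0), temperature_factor(7.0), band_factor(25.0, 10.0, 20.0))
```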
[0059] In a preferred embodiment, the feature vector is defined as follows: Feature = [W1, W2, W3, W4]
[0060] $W_i$ represent the feature values for real-time image, temperature, heart rate, and steering wheel grip force, respectively. Each feature value is calculated from the final weight of that data. Specifically, each feature value $W_i$ is obtained by concatenating the final weight $w_i$ of input parameter $i$ with the corresponding data. Clearly, the feature vector can also contain feature values for additional parameters.
[0061] In step S4, the core keywords are determined based on the generated feature vectors. For example, a recurrent neural network (RNN) can be used to calculate the probability of each category (core keyword) based on data from each time step (30 seconds) in the sequence, ultimately outputting the probability value of each category (core keyword) at the output layer, and selecting the category with the highest probability as the output result. A recurrent neural network (RNN) is a neural network model specifically designed for processing sequential data. Unlike traditional feedforward neural networks, RNNs have connections between nodes in each layer, allowing them to utilize previous information in the current output. This design makes RNNs excellent at processing serialized information such as natural language text or time series data. The core concept of an RNN is the hidden state, which captures the temporal dynamics of the sequence. At each time step, the RNN updates its hidden state, an update that depends not only on the current input but also on the hidden state from the previous time step. This mechanism allows the RNN to maintain an internal "memory," through which it can consider historical information of the sequence when making predictions or classifications.
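The hidden-state update and final classification described above can be illustrated with a minimal forward pass of a simple (Elman-style) recurrent network. This is a sketch only: the weights are arbitrary untrained numbers, the three keyword labels are hypothetical (Table 3 is not reproduced in this text), and the source does not state which RNN variant is used.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_classify(sequence, w_in, w_rec, w_out, keywords):
    """Run the sequence through the RNN and return the most probable keyword."""
    hidden = [0.0] * len(w_rec)
    for features in sequence:  # one feature vector per time step (e.g. 30 s)
        pre = [a + b for a, b in zip(matvec(w_in, features), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]  # new state depends on input and old state
    probs = softmax(matvec(w_out, hidden))
    best = max(range(len(keywords)), key=probs.__getitem__)
    return keywords[best], probs


KEYWORDS = ["calm", "alert", "relaxed"]                    # hypothetical labels
W_IN = [[0.5, -0.2, 0.8, 0.3], [-0.4, 0.1, 0.6, 0.9]]      # 2 hidden units x 4 inputs
W_REC = [[0.1, 0.0], [0.0, 0.1]]                           # hidden-to-hidden weights
W_OUT = [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]]             # 3 keywords x 2 hidden units

steps = [[0.9, 0.2, 0.4, 0.3], [0.8, 0.2, 0.6, 0.5], [0.7, 0.3, 0.9, 0.8]]
keyword, probabilities = rnn_classify(steps, W_IN, W_REC, W_OUT, KEYWORDS)
print(keyword, probabilities)
```

In a real system the three weight matrices would be learned from labeled driving sequences rather than written by hand.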
[0062] For example, a core keyword mapping table can be used to match corresponding music tags based on core keywords. Music tags, for instance, describe the style, rhythm, tonality, and intensity of music. Style determines the emotional tone of music, rhythm determines the tempo, tonality determines the timbre, and intensity determines the volume and dynamic range. The core keyword mapping table is shown in Table 3, and more representative core keywords can be adjusted based on this table.
[0063] Table 3 Core Keyword Mapping Table
[0064] In step S5, a generative AI algorithm is used to generate ambient music based on the core keywords, more specifically, based on the music tags corresponding to the core keywords. Generative AI models work by using neural networks to analyze and identify patterns and structures in the training data. Using this understanding, they generate new content that both mimics human-like creation and expands upon the patterns in the training data. The functionality of these neural networks varies depending on the specific technology or architecture used. This includes, but is not limited to, Transformers, Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and diffusion models. The potential of generative AI algorithms in music composition is enormous, and they have already been able to create diverse, personalized, and emotionally rich music.
[0065] Finally, in step S6, the generated ambient music list is matched with the ambient lighting configuration. After the core keywords are input, the generative algorithm automatically generates ambient music that matches the environment under different conditions and continuously adjusts the music type. The ambient control module, combined with the digital sound effects processing unit and ambient lighting, uses an inference model to match the generated music (text or the music itself), adaptively recognizing the music type, volume, frequency changes, brightness, and color. Based on known rules and data, Bayesian inference is used to infer the optimal lighting configuration from the input music data according to the probability distribution. This configuration is then linked to the working parameters (light color, brightness, frequency, and light effect) of the ambient lighting control system. Specifically, the music type is matched to the ambient lighting hue, the music rhythm is linked to the ambient lighting frequency changes, and the sound effects are linked to the light effects. Bayesian inference is a mathematical method for probabilistic judgment and decision-making under uncertain conditions. Its core principle is to derive a "prior probability" from known information and then, with each new piece of evidence, to use Bayes' formula to correct the prior probability, obtaining an updated "posterior probability". This correction process repeats continuously, introducing new information and gradually bringing the probability estimate closer to the true probability. Through repeated iteration, Bayesian inference allows the judgment of an event's likelihood to be adjusted dynamically as new information is absorbed. Ultimately, it not only provides the most probable conclusion but also quantifies the degree of confidence in that choice.
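The prior-to-posterior correction described above can be shown with a small discrete example that infers a lighting configuration from two observed music properties. The lighting configuration names, the uniform prior, and all likelihood values are invented for illustration; the source does not disclose the actual model or its probabilities.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(config | evidence) from a prior and P(evidence | config)."""
    unnormalized = {cfg: prior[cfg] * likelihood[cfg] for cfg in prior}
    total = sum(unnormalized.values())
    return {cfg: value / total for cfg, value in unnormalized.items()}


# Hypothetical lighting configurations with a uniform prior.
prior = {"warm_dim": 1 / 3, "cool_bright": 1 / 3, "pulsing_blue": 1 / 3}

# Assumed likelihoods of observing each music property under each configuration.
p_soothing_style = {"warm_dim": 0.7, "cool_bright": 0.2, "pulsing_blue": 0.1}
p_slow_tempo = {"warm_dim": 0.8, "cool_bright": 0.3, "pulsing_blue": 0.1}

posterior = bayes_update(prior, p_soothing_style)   # first piece of evidence
posterior = bayes_update(posterior, p_slow_tempo)   # posterior becomes the next prior

best_config = max(posterior, key=posterior.get)
print(best_config, round(posterior[best_config], 3))
```

The posterior value of the chosen configuration doubles as the confidence measure mentioned in the text.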
[0066] Figure 2 shows a flowchart of an advantageous embodiment of the method according to the invention from a data flow perspective. As can be seen from Figure 2, the multimodal data input includes real-time images of the vehicle exterior, in-vehicle temperature, driver's heart rate, and steering wheel grip force. Driving scene recognition can be performed based on the real-time images of the vehicle exterior, thereby determining the driving scene in which the vehicle is located. In parallel, the real-time images of the vehicle exterior, in-vehicle temperature, driver's heart rate, and steering wheel grip force data are first preprocessed. Preprocessing includes dimensionality reduction of the real-time images, as well as filtering, denoising, and consistency checks for all data, ensuring that the data are synchronized in acquisition time and that fluctuations stay within a reasonable range, so that the data effectively reflect real-time mood and environmental changes. To improve data quality, wavelet analysis is applied to historical temperature data to extract short-term fluctuations (minute-level) and long-term trends (hour-level) of temperature changes. Savitzky-Golay filtering applies local polynomial fitting to the temperature data, preserving signal features (peak value and width) while removing noise. This method preserves data trends better than moving average filtering during rapid temperature changes (such as transitioning from a low-temperature to a high-temperature environment). Outliers in the heart rate data (sudden spikes or drops in heart rate readings) are identified and filtered out to remove potential noise interference and ensure a stable heart rate trend. For grip strength data, only the maximum and minimum values of significant changes are retained, while intermediate values are removed to reduce data redundancy and improve the accuracy of determining tension or relaxation states. Scene-specific priority scoring rules are determined based on the vehicle's driving scenario (as shown in Table 1). Furthermore, scores are assigned based on the changing trends of each parameter and the validity of the data, with weights dynamically adjusted in real time according to changes in the data within specific contexts.
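Part of the preprocessing described above can be sketched with the standard library alone. The smoothing uses the well-known 5-point quadratic Savitzky-Golay coefficients (-3, 12, 17, 12, -3) / 35; the window size, leaving the two edge samples on each side unsmoothed, the median-based outlier rule with a 25 beats / min tolerance, and all function names are assumptions of this sketch rather than details from the source. The wavelet analysis step is not shown.

```python
import statistics

SG_COEFFS = (-3, 12, 17, 12, -3)  # 5-point quadratic Savitzky-Golay weights
SG_NORM = 35


def savgol5(values):
    """Smooth interior samples with a 5-point quadratic Savitzky-Golay filter."""
    out = list(values)
    for k in range(2, len(values) - 2):
        window = values[k - 2:k + 3]
        out[k] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return out  # edge samples are left unchanged (assumption)


def drop_heart_rate_outliers(readings, max_deviation=25.0):
    """Discard readings that deviate from the window median by more than the tolerance."""
    center = statistics.median(readings)
    return [r for r in readings if abs(r - center) <= max_deviation]


def grip_extremes(values):
    """Keep only the minimum and maximum grip values of a segment."""
    return min(values), max(values)


temps = [21.0, 21.4, 20.9, 21.6, 23.0, 24.8, 26.1, 26.0]
print(savgol5(temps))
print(drop_heart_rate_outliers([72, 74, 73, 140, 75, 71]))
print(grip_extremes([40, 42, 65, 58, 44]))
```

Because the filter fits a local polynomial, straight-line and quadratic trends pass through unchanged, which is the property that makes it track rapid temperature ramps better than a moving average. A global median rule would also discard a genuine sustained rise in heart rate, so a production system would likely apply it over short sliding windows.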
[0067] When the driver state indicated by the driving scenario is inconsistent with the acquired driver state, a conflict mechanism can be used to adjust the weights through a dynamic incremental adjustment factor $\Delta w_i$ according to the following formula: $w_i = w_i + \Delta w_i$;
[0068] where the value of $\Delta w_i$ is determined based on the driving scenario and real-time data changes. That is, when the detected driver state conflicts with the driver state indicated by the driving scenario (e.g., sunny weather and high heart rate occurring simultaneously), the weights are adjusted through the dynamic incremental adjustment factor $\Delta w_i$.
[0069] From a safety perspective, a model is constructed to adjust $\Delta w_i$ under different conflict scenarios. The following is an example of the adjustment rules for the dynamic incremental adjustment factor $\Delta w_i$.
[0070] Table 4 $\Delta w_i$ adjustment rules
[0071] After determining or correcting the weight values of each input parameter, a feature vector is built based on these weight values. Core keywords are then determined based on these feature vectors. After the core keywords match music tags, a generative AI algorithm is used to generate ambient music based on the music tags matched by the core keywords.
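The conflict-based weight correction that precedes the feature vector construction can be sketched as a rule lookup followed by an additive update. Table 4 is not reproduced in this text, so the rule keys, the increment values, and the names `CONFLICT_RULES` and `apply_conflict_adjustment` are purely hypothetical; only the "sunny weather with high heart rate" conflict is taken from the description.

```python
# Hypothetical stand-in for Table 4: (scenario condition, detected driver state)
# mapped to per-parameter increments delta_w_i.
CONFLICT_RULES = {
    ("sunny", "high_heart_rate"): {"heart_rate": 0.3, "image": -0.1},
    ("congestion", "relaxed_grip"): {"grip": -0.2},
}


def apply_conflict_adjustment(weights, scenario_condition, driver_state):
    """Return new weights with w_i + delta_w_i applied for a matching conflict rule."""
    deltas = CONFLICT_RULES.get((scenario_condition, driver_state), {})
    return {name: w + deltas.get(name, 0.0) for name, w in weights.items()}


weights = {"image": 1.0, "temperature": 0.4, "heart_rate": 1.2, "grip": 0.8}
adjusted = apply_conflict_adjustment(weights, "sunny", "high_heart_rate")
unchanged = apply_conflict_adjustment(weights, "sunny", "normal_heart_rate")
print(adjusted)
```

Returning a new dictionary rather than mutating the input keeps the uncorrected weights available for logging or for the later feedback analysis.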
[0072] Figure 3 is a schematic diagram of the feedback optimization process in the method for generating in-vehicle ambient music according to the present invention. As shown in Figure 3, after the core keywords are input into the generative AI module, ambient music and ambient lighting configurations are generated. The display interaction module provides system control functions to the user through voice commands or a touch screen. The user feedback module obtains user feedback by detecting the user's manual operations, multimodal recognition, and / or asking the user questions, and optimizes the generation of ambient music based on the user feedback.
[0073] The system records user behavior data and emotional responses for a specific period after the generated in-car ambient music begins playing. Based on user feedback, it updates historical data to generate a new training dataset and optimizes the in-car ambient music generation strategy using a self-learning algorithm. For example, users can input specific music preference commands via voice or touch. The system responds instantly to these commands, adjusting the style and type of music to ensure users always have control over the system and meet their personalized needs. This interaction mechanism allows users to communicate with the system more intuitively and achieve more efficient operation. Simultaneously, the user feedback module generates an emotional score by recording user behavior data and emotional responses. The system collects physiological data (heart rate) and interaction data (user adjustments to volume and music type), quantifying them as emotional feedback. After each system adjustment, an emotional score is generated based on user responses and stored in the core database of the central processing module. This score is then analyzed in conjunction with historical data, and the generation strategy is optimized through a self-learning algorithm, adjusting the music selection strategy in real time to adapt to users' personalized needs. The specific feedback mechanism involves the system recording user behavior data and emotional responses in each scenario in real time. Emotional responses can be quantified through physiological data and interactive feedback. Each time the system generates music and ambient lighting adjustments, it produces an emotional score based on the user's reaction. When the music's rhythm adjustment helps lower the user's heart rate, the system records a high emotional score, indicating that the current adjustment meets the user's needs. 
All emotional scores derived from user feedback are stored and analyzed in conjunction with historical data. After each round of learning from the feedback mechanism, the generation algorithm adjusts itself in the next adjustment based on the previous feedback results. When the user's reaction is unsatisfactory, the system automatically corrects the model and adjusts the music selection strategy to adapt to the personalized needs of different users.
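One way to quantify the emotional score from heart rate and interaction data is sketched below. The scoring formula, the 20 bpm reference drop, the 0.25 penalty per manual adjustment, and the running-average update are all assumptions for illustration; the running average is a simple stand-in and not the self-learning algorithm of the application.

```python
def emotional_score(hr_before, hr_after, manual_adjustments, max_drop=20.0):
    """Return a score in [0, 1]; higher means the adjustment suited the user.

    Assumptions: a heart rate drop of `max_drop` bpm or more earns full
    credit, and each manual volume or music-type change costs 0.25.
    """
    drop = max(0.0, hr_before - hr_after)
    physiological = min(drop / max_drop, 1.0)
    penalty = 0.25 * manual_adjustments
    return max(0.0, min(1.0, physiological - penalty))


def record_feedback(history, scenario, score, rate=0.5):
    """Blend a new score into the stored per-scenario score (running average)."""
    previous = history.get(scenario, score)
    history[scenario] = (1 - rate) * previous + rate * score
    return history[scenario]


history = {}
good = emotional_score(hr_before=95, hr_after=80, manual_adjustments=0)
poor = emotional_score(hr_before=95, hr_after=95, manual_adjustments=2)
first = record_feedback(history, "traffic congestion", good)
second = record_feedback(history, "traffic congestion", 0.25)
```

A falling stored score for a scenario would be the signal to change the music selection strategy for that scenario.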
[0074] The electronic device 400 according to this embodiment of the present application will now be described with reference to Figure 4. The electronic device 400 shown in Figure 4 is merely an example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
[0075] As shown in Figure 4, the electronic device 400 takes the form of a computing device. Components of the electronic device 400 may include, but are not limited to: at least one processing unit 410, at least one storage unit 420, a bus 430 connecting different system components (including the storage unit 420 and the processing unit 410), an interactive display unit 440, etc. The storage unit 420 stores program code that can be executed by the processing unit 410, causing the processing unit 410 to perform the steps described in the method for generating in-vehicle ambient music according to various exemplary embodiments of this application. For example, the processing unit 410 may perform the steps shown in Figure 1.
[0076] The storage unit 420 may include a readable storage medium in the form of a volatile storage unit, such as a random access memory unit (RAM) 4201 and / or a cache storage unit 4202, and may further include a read-only memory unit (ROM) 4203.
[0077] The storage unit 420 may also include a program/utility 4204 having a set of (at least one) program modules 4205, such program modules 4205 including but not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
[0078] Bus 430 can represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processing unit bus or local bus using any of a variety of bus structures.
[0079] Electronic device 400 can also communicate with one or more sensors 500 (e.g., cameras, temperature sensors, force sensors, heart rate sensors, etc.), with one or more devices that enable user interaction with electronic device 400, and/or with any device that enables electronic device 400 to communicate with one or more other computing devices (e.g., routers, modems, etc.). This communication can be performed via input/output (I/O) interface 450. Furthermore, electronic device 400 can also communicate with the vehicle's ambient lighting via ambient lighting controller 460 to control the ambient lighting in coordination with ambient music. It should be understood that, although not shown in the figures, other hardware and/or software modules can be used in conjunction with electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
[0080] In the electronic device of this application, the different units use a bus network for protocol conversion, data exchange, fault diagnosis, and other tasks. The sensors inside and outside the vehicle transmit data to the storage unit, which also stores user preference data and cached information. The data are passed from the storage unit to the processing unit, which performs data processing and runs the generative algorithm. The ambient light controller controls the adjustment of the audio system and the linked setting of the ambient light. Finally, the system interacts with the user through the interactive device to perform system control and to collect the feedback used by the optimization algorithm.
[0081] Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein can be implemented by software or by combining software with necessary hardware. Therefore, the technical solutions according to the embodiments of this application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, external hard drive, etc.) or on a network, including several instructions to cause a computing device (such as a personal computer, server, or network device, etc.) to execute the method for generating in-vehicle ambient music according to this application.
[0082] Another aspect of this application relates to an in-vehicle ambient music generation system, which includes:
[0083] The detection module is used to acquire real-time images of the outside of the vehicle, parameters of the interior environment, and the driver's status.
[0084] The feature vector generation module is used to generate feature vectors based at least on the driving scenario and the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state.
[0085] The core keyword determination module is used to determine core keywords based on the generated feature vectors.
[0086] An ambient music generation module is used to generate ambient music based on the core keywords using a generative AI algorithm.
[0087] The ambient lighting control module is used to match the ambient music playlist with the ambient lighting configuration and drive the ambient lighting accordingly.
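The matching of ambient music to a lighting configuration can be sketched with Bayes' rule, since the claims name a Bayesian inference algorithm for this step. The configuration names, music tags, and all probabilities below are invented for illustration and are not taken from the application.

```python
def lighting_posterior(prior, likelihood, music_tag):
    """Bayes' rule over a discrete set of lighting configurations.

    posterior(config | tag) is proportional to
    prior(config) * likelihood(tag | config).
    """
    unnormalized = {
        config: prior[config] * likelihood[config].get(music_tag, 0.0)
        for config in prior
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError(f"no lighting configuration explains tag {music_tag!r}")
    return {config: p / total for config, p in unnormalized.items()}


# Hypothetical probabilities, invented for illustration only.
PRIOR = {"warm_dim": 0.5, "cool_bright": 0.5}
LIKELIHOOD = {
    "warm_dim": {"calm": 0.8, "energetic": 0.2},
    "cool_bright": {"calm": 0.25, "energetic": 0.75},
}

posterior = lighting_posterior(PRIOR, LIKELIHOOD, "calm")
best_config = max(posterior, key=posterior.get)
```

For calm music the sketch favors the warm, dim configuration. In practice the prior could encode user preferences and the likelihood could be learned from feedback, but that is an assumption rather than something the text specifies.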
[0088] In addition, this application also proposes a vehicle having the aforementioned in-vehicle ambient music generation system and/or the aforementioned electronic device.
[0089] Overall, the solution described in this application can, on the one hand, collect real-time scene data inside and outside the vehicle, as well as the driver's physiological or psychological state data (such as real-time images of the outside environment, temperature, heart rate, steering wheel grip, etc.), combine this data with driving scenarios and driver preferences to determine core keywords based on predetermined rules, and then use generative algorithms to generate personalized ambient music. On the other hand, the generated ambient music is linked with the in-vehicle ambient lighting, automatically adjusting the lighting effects according to the rhythm, mood, and scene of the music. Simultaneously, a user feedback optimization mechanism provides users with the best listening experience, significantly improving the personalization and comfort of the in-vehicle listening experience.
[0090] Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of the invention are indicated by the appended claims.
Claims
1. A method for generating ambient music in a vehicle, comprising the following steps: acquiring real-time images of the outside of the vehicle, parameters of the in-vehicle environment, and the driver's status; determining the driving scenario and external environment parameters based on the acquired real-time images of the vehicle exterior; generating feature vectors based at least on the driving scenario, the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state; determining core keywords based on the generated feature vectors; generating ambient music based on the core keywords using a generative AI algorithm; and matching the generated list of ambient music with the ambient lighting configuration.
2. The method for generating in-vehicle ambient music according to claim 1, characterized in that, The in-vehicle environmental parameters include at least the in-vehicle temperature, and the driver's status includes at least heart rate and steering wheel grip strength.
3. The method for generating in-vehicle ambient music according to claim 1 or 2, characterized in that, The system uses a convolutional neural network (CNN) to extract external environmental parameters from the real-time image, including at least weather features, road structure features, and traffic sign features, and identifies the current driving scenario information.
4. The method for generating in-vehicle ambient music according to any one of claims 1 to 3, characterized in that, generating feature vectors based at least on the driving scenario, the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state includes: determining scenario-specific weighting rules based on the driving scenario; and generating, based on the scenario-specific weighting rules, a feature vector from the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state.
5. The method for generating in-vehicle ambient music according to claim 4, characterized in that, The weights are adjusted based on the real-time images outside the vehicle, the parameters inside the vehicle, the changing trends of the driver's state, and / or the validity of the data.
6. The method for generating in-vehicle ambient music according to claim 4, characterized in that, the weight of each input parameter is assigned using the following formula:
$w_i = F_i \cdot (\alpha_i \cdot P_i + \beta_i \cdot T_i + \gamma_i \cdot D_i)$;
wherein:
$w_i$: the final weight of input parameter $i$;
$i$: the number of the input parameter, where $i = 1, 2, 3, 4$ represent the real-time image, in-vehicle temperature, heart rate, and steering wheel grip strength, respectively;
$F_i$: the adaptive adjustment factor, where $F_i$ takes the following values:
$\alpha_i$: the scene priority factor; $\beta_i$: the real-time trend factor of the parameter data; $\gamma_i$: the parameter data validity factor; $\alpha_i$, $\beta_i$, and $\gamma_i$ are decimals whose values are all in the range [0, 1];
$P_i$: the scenario priority score of the input parameter data; $T_i$: the change trend score of the input parameter data; $D_i$: a comprehensive score of the input parameter data; where $P_i$ is a decimal in the range [0, 2], and $T_i$ and $D_i$ are decimals in the range [0, 1].
7. The method for generating in-vehicle ambient music according to claim 5, characterized in that, Before determining the weights, the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's status are preprocessed. The preprocessing includes data cleaning, filtering, and / or consistency checks.
8. The method for generating in-vehicle ambient music according to claim 5, characterized in that, For temperature data, wavelet analysis is used to extract short-term fluctuations at the minute level and long-term trends at the hour level, while the Savitzky-Golay filtering method is used to remove noise.
9. The method for generating in-vehicle ambient music according to claim 6, characterized in that, Based on the results of driving scenario recognition, driving scenarios are divided into the following preset scenarios: night driving, highway driving, driving in severe weather, traffic congestion, urban driving, and mountain driving. For each driving scenario, a scenario priority score is assigned to each input parameter.
10. The method for generating in-vehicle ambient music according to claim 9, characterized in that, Based on the results of driving scene recognition, the current driving scene is matched with the closest preset scene, and the scene priority score of each parameter is determined based on the closest preset scene.
11. The method for generating in-vehicle ambient music according to claim 5, characterized in that, when the driver state indicated by the driving scenario is inconsistent with the acquired driver state, the dynamic incremental adjustment factor $\Delta w_i$ is used to adjust the weights, wherein the value of $\Delta w_i$ is determined based on the driving scenario and real-time data changes.
12. The method for generating in-vehicle ambient music according to any one of claims 1 to 11, characterized in that, Determining core keywords based on the generated feature vectors also includes determining the style, rhythm, tonality, and intensity of the music corresponding to the core keywords based on the keyword mapping table.
13. The method for generating in-vehicle ambient music according to any one of claims 1 to 12, characterized in that, Ambient lighting configuration is generated and matched with ambient music using a Bayesian inference algorithm.
14. The method for generating in-vehicle ambient music according to claim 13, characterized in that, The ambient light configuration includes determining the operating parameters of the ambient light corresponding to the type, volume, and frequency changes of the generated ambient music. The operating parameters include at least one of light color, brightness, frequency, and light effect.
15. The method for generating in-vehicle ambient music according to any one of claims 1 to 14, characterized in that, Obtain the user's input music preference command, and adjust the style and type of the generated ambient music in response to the music preference command.
16. The method for generating in-vehicle ambient music according to claim 15, characterized in that, User behavior data and emotional responses are recorded for a specific period of time after the generated in-car ambient music starts playing. Based on user feedback, historical data is updated to generate a new training dataset, and the strategy for generating in-car ambient music is optimized through a self-learning algorithm.
17. An in-vehicle ambient music generation system, the in-vehicle ambient music generation system comprising: The detection module is used to acquire real-time images of the outside of the vehicle, parameters of the interior environment, and the driver's status. The feature vector generation module is used to generate feature vectors based at least on the driving scenario, the real-time images outside the vehicle, the parameters of the in-vehicle environment, and the driver's state. The core keyword determination module is used to determine core keywords based on the generated feature vectors. An ambient music generation module is used to generate ambient music based on the core keywords using a generative AI algorithm. The ambient lighting control module matches the ambient music list with the ambient lighting configuration and drives the ambient lighting accordingly.
18. The in-vehicle ambient music generation system according to claim 17, characterized in that, The in-vehicle ambient music generation system also includes a user feedback module, which obtains user feedback by detecting user manual operations, multimodal recognition, and / or asking the user questions, and optimizes the generation of ambient music based on the user feedback.
19. An electronic device comprising a plurality of sensors, a storage unit, a processing unit, and an ambient light controller, wherein, The sensor, storage unit, and ambient light controller are respectively connected to the processing unit via signals, characterized in that the processing unit is configured to execute the method for generating in-vehicle ambient music according to any one of claims 1 to 16.
20. The electronic device according to claim 19, characterized in that, The electronic device also includes a user interaction device, which is signal-connected to the processing unit.
21. A vehicle having an in-vehicle ambient music generation system according to claim 17 or 18 and/or an electronic device according to claim 19 or 20.
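The weight formula recited in claim 6 can be evaluated as in the following sketch. The function name and the example numbers are hypothetical. Because the values of the adaptive adjustment factor F_i are not reproduced in the text, F is simply passed in by the caller, and only the stated value ranges of the other terms are checked.

```python
def parameter_weight(F, alpha, beta, gamma, P, T, D):
    """Compute w_i = F_i * (alpha_i * P_i + beta_i * T_i + gamma_i * D_i).

    alpha, beta, gamma, T and D must lie in [0, 1]; P must lie in [0, 2].
    F is supplied by the caller (its values are not given in the text).
    """
    unit_range = {"alpha": alpha, "beta": beta, "gamma": gamma, "T": T, "D": D}
    for name, value in unit_range.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if not 0.0 <= P <= 2.0:
        raise ValueError(f"P must be in [0, 2], got {P}")
    return F * (alpha * P + beta * T + gamma * D)


# Illustrative numbers only: a heart rate input with the top scenario priority.
w_heart_rate = parameter_weight(
    F=1.0, alpha=0.5, beta=0.25, gamma=0.25, P=2.0, T=1.0, D=1.0
)
```

With these inputs the scenario priority term contributes 1.0 and the trend and data score terms contribute 0.25 each, giving a weight of 1.5.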