Methods and systems for determining a position and orientation of a device using acoustic beacons

By using a microphone to receive sound signals on wearable audio devices and combining an inertial measurement unit (IMU) with a gradient descent algorithm, IMU drift is corrected, solving the problem of IMU position and orientation estimation errors in wearable devices and achieving more accurate device positioning and virtual audio source positioning.

CN116601514B · Active · Publication Date: 2026-06-26 · BOSE CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BOSE CORP
Filing Date
2021-09-15
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Inertial measurement units (IMUs) in wearable audio devices accumulate position and orientation estimation errors due to noise and offset, affecting the accuracy of virtual audio sources.

Method used

By using the microphone on the wearable audio device to receive sound signals from the environment, the distance between the device and the audio source is determined by the time-of-flight information, and the position and orientation of the device are corrected by combining the inertial measurement unit and the gradient descent algorithm.

Benefits of technology

Effective IMU drift correction improves the accuracy of the position and orientation of wearable audio devices in known environments, ensuring accurate positioning of virtual audio sources.



Abstract

The present disclosure provides systems and methods for determining a position and orientation of a wearable audio device, e.g., methods and systems for determining a position, orientation, and / or height of a wearable audio device using acoustic beacons. In some examples, the determined position, orientation, and / or height can be utilized to correct for drift experienced by an inertial measurement unit (IMU). In other examples, the drift can cause an externalized or virtualized audio source produced within a known environment to move or drift relative to a known position of a physical audio source within the environment. Thus, the systems and methods described herein can be utilized to correct for a position drift of a virtualized audio source relative to the wearable audio device by first determining an absolute position and orientation of the virtualized audio source within the environment itself.

Description

[0001] Cross-references to related applications

[0002] This application claims priority to U.S. Patent Application Serial No. 17/022,784, filed on September 16, 2020, entitled “Methods and Systems for Determining Position and Orientation of a Device Using Acoustic Beacons”, the entire disclosure of which is incorporated herein by reference.

Background Technology

[0003] This disclosure relates to aspects and embodiments of audio systems, and more specifically, to audio systems comprising one or more wearable devices and one or more audio sources. Some wearable devices (such as headphones or smart glasses) utilize a set of sensors called an inertial measurement unit (IMU) to derive the position and / or orientation of the device relative to a fixed point in space. Small measurement errors within the IMU (e.g., due to noise and / or offset) accumulate and compound over time, resulting in increasingly large errors in the device's perceived orientation and perceived position relative to its actual orientation and position in space.

Summary of the Invention

[0004] This disclosure relates to systems and methods for determining the position and orientation of a wearable audio device, such as methods and systems for determining the position and orientation of a wearable audio device using acoustic beacons. In some examples, the determined position and orientation can be used to correct for drift experienced by an inertial measurement unit (IMU). In other examples, this drift can cause an externalized or virtualized audio source generated within a known environment to move or drift relative to a known position of a physical audio source within that environment. Therefore, the systems and methods described herein can be used to correct for positional drift of a virtual audio source relative to the wearable audio device by first determining the virtual audio source's own absolute position and orientation within the environment.

[0005] Acoustic augmented reality experiences typically require some form of localization data about the surrounding environment, such as the location of acoustically reflective surfaces within the environment. Many wearable audio devices use triaxial accelerometers and triaxial gyroscopes to sense the angular velocity and / or linear acceleration of the user wearing the device. In theory, these sensors can be used to track the device's position and orientation in space. However, due to sensor noise and other biases, the integrated measurements drift, and estimates of position and orientation quickly become inaccurate. Therefore, this disclosure relates to using one or more microphones on the device, together with knowledge of the signals transmitted to physical speakers within the environment, to determine how far the device is from each source speaker. In other words, the system can include multiple audio source devices at known locations. Once these source devices generate sound, any drift experienced by the IMU can be corrected. Because sound travels through the air at a known speed, the system utilizes time-of-flight information, obtained by comparing a reference signal with the actual signal obtained by the microphones on the wearable audio device, to determine the distance between each microphone and each source device. With sufficient sources, the wearable audio device can triangulate its own position relative to the audio source devices and obtain a complete description of the device's position and orientation, including, for example, its position along the Cartesian x, y, and z axes, as well as yaw, pitch, and roll.
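The time-of-flight comparison described above can be illustrated with a minimal sketch. It is not taken from the patent: it assumes the reference signal and the microphone recording share a clock, that playback latency is zero, and that sound travels at 343 m/s. The function names and the toy noise burst are hypothetical.

```python
import random

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def estimate_delay_samples(reference, recorded):
    """Return the lag, in samples, where `recorded` best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recorded) - len(reference) + 1):
        # Cross-correlation of the reference with the recording at this lag.
        score = sum(r * recorded[lag + i] for i, r in enumerate(reference))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def distance_from_delay(delay_samples, sample_rate_hz):
    """Convert a time-of-flight delay in samples to a distance in metres."""
    return SPEED_OF_SOUND_M_S * delay_samples / sample_rate_hz


# Toy data (assumption): a noise burst that reaches the microphone
# 140 samples late and at half amplitude.
rng = random.Random(0)
sample_rate_hz = 48_000
reference = [rng.uniform(-1.0, 1.0) for _ in range(128)]
recorded = [0.0] * 140 + [0.5 * x for x in reference] + [0.0] * 60

delay = estimate_delay_samples(reference, recorded)
distance_m = distance_from_delay(delay, sample_rate_hz)
print(delay, distance_m)
```

In a real room the recording would also contain noise and reflections, so a practical system would likely need a more robust correlation method than this brute-force search.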

[0006] In one example, a wearable audio device is provided, comprising: a first microphone and a second microphone configured to receive a first signal representing a first sound reproduced in an environment by a first audio source; and a processor configured to derive, at least in part, the orientation of the wearable audio device relative to the first audio source in the environment based on the first signal received at the first and second microphones.

[0007] In one aspect, the wearable device further includes an inertial measurement unit, wherein the processor is further configured to determine the perceived orientation of the wearable audio device based at least in part on the inertial measurement unit.

[0008] In one aspect, the processor is configured to generate a first virtual audio source within an environment, and wherein the processor is configured to prevent or correct a drift of the virtual position of the first virtual audio source relative to the first audio source, the drift being formed by the perceived orientation of the wearable audio device relative to the first audio source.

[0009] In one aspect, the processor is configured to: determine a first distance between a first audio source and a first microphone and a second distance between the first audio source and a second microphone based on time-of-flight information, and derive the orientation of the wearable audio device relative to the first audio source based at least in part on the time-of-flight information.

[0010] In one aspect, the first microphone and the second microphone are configured to acquire a second signal representing a second sound reproduced in the environment by a second audio source, and wherein the processor is further configured to derive the position of the wearable audio device relative to the first audio source and the second audio source.

[0011] In one aspect, the first microphone and the second microphone are configured to acquire a third signal representing a third sound reproduced in the environment by a third audio source, and wherein the processor is further configured to derive the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source, at least in part based on the first signal, the second signal, and / or the third signal.

[0012] In one aspect, the processor is configured to use a gradient descent algorithm or a gradient ascent algorithm to determine the orientation and position of the wearable audio device, the gradient descent algorithm or gradient ascent algorithm utilizing time-of-flight information from a first signal received at a first microphone and a second microphone, a second signal generated by a second audio source and / or a third signal generated by a third audio source.
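As an illustration of how a gradient descent algorithm could use time-of-flight distances, the following sketch estimates a 2-D position from ranges to three sources at known locations. It is an assumption-laden toy, not the patented method: the cost function, step size, iteration count, source coordinates, and starting guess (standing in for a drifted IMU estimate) are all hypothetical, and orientation is not solved for.

```python
import math


def locate_by_gradient_descent(sources, distances, start, step=0.1, iterations=500):
    """Minimise sum((|p - s_i| - d_i)^2) over the 2-D position p."""
    x, y = start
    for _ in range(iterations):
        grad_x = grad_y = 0.0
        for (sx, sy), d in zip(sources, distances):
            r = max(math.hypot(x - sx, y - sy), 1e-12)  # avoid division by zero
            residual = r - d  # predicted range minus measured range
            grad_x += 2.0 * residual * (x - sx) / r
            grad_y += 2.0 * residual * (y - sy) / r
        # Step against the gradient to reduce the range mismatch.
        x -= step * grad_x
        y -= step * grad_y
    return x, y


# Hypothetical layout: three audio sources at known positions (metres).
sources = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.5, 1.0)
distances = [math.dist(true_position, s) for s in sources]

imu_guess = (2.0, 2.0)  # stand-in for a drifted IMU position estimate
estimate = locate_by_gradient_descent(sources, distances, imu_guess)
print(estimate)
```

Starting the search from the IMU's perceived position is a design choice made here for illustration; it keeps the initial guess close to the true position so that a fixed step size converges.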

[0013] In another example, a wearable audio device is provided, comprising: a first microphone configured to receive a first signal representing a first sound reproduced in an environment by a first audio source and a second signal representing a second sound reproduced in an environment by a second audio source; and a processor configured to derive the position of the wearable audio device relative to the first and second audio sources, at least in part based on the first and second signals.

[0014] In one aspect, the wearable audio device further includes an inertial measurement unit, wherein the processor is further configured to determine the perceived orientation of the wearable audio device based at least in part on the inertial measurement unit.

[0015] In one aspect, the processor is configured to generate a first virtual audio source within an environment, and wherein the processor is configured to prevent or correct a drift of the virtual position of the first virtual audio source relative to a first audio source and a second audio source, the drift being formed by the perceptual orientation of the wearable audio device relative to the first audio source and the second audio source.

[0016] In one aspect, the processor is configured to: determine a first distance between a first audio source and a first microphone and a second distance between a second audio source and the first microphone based on time-of-flight information, and derive the position of a wearable audio device relative to the first audio source and the second audio source based at least in part on the time-of-flight information.

[0017] In one aspect, the wearable audio device includes a second microphone configured to acquire a first signal and a second signal within an environment, and a processor further configured to derive the orientation of the wearable audio device relative to a first audio source and a second audio source, at least in part based on the first signal and the second signal.

[0018] In one aspect, the first microphone and the second microphone are configured to acquire a third signal representing a third sound reproduced in the environment by a third audio source, and wherein the processor is further configured to derive the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source, at least in part based on the first signal, the second signal, and / or the third signal.

[0019] In one aspect, the processor is configured to use a gradient descent algorithm or a gradient ascent algorithm to determine the orientation and location of a wearable audio device based at least in part on a first signal, a second signal, and / or a third signal, the gradient descent algorithm or gradient ascent algorithm utilizing time-of-flight information from the first signal received at a first microphone and a second microphone, the second signal generated by a second audio source, and / or the third signal generated by a third audio source.

[0020] In another example, a method for determining the orientation of a wearable audio device is provided, the method comprising: obtaining a first signal representing a first sound reproduced in an environment by a first audio source via a first microphone and a second microphone of the wearable audio device; and deriving the orientation of the wearable audio device relative to the first audio source in the environment via a processor based on the first signal received at the first microphone and the second microphone.

[0021] In one aspect, the wearable audio device includes an inertial measurement unit, and the method further includes: determining the perceived orientation of the wearable audio device via a processor, at least in part, based on the inertial measurement unit.

[0022] In one aspect, the method further includes: generating a first virtual audio source within an environment via a processor; and preventing or correcting a drift of the virtual position of the first virtual audio source relative to the first audio source via the processor, the drift being formed by the perceived orientation of the wearable audio device relative to the first audio source.

[0023] In one aspect, the method further includes: determining, via a processor, a first distance between a first audio source and a first microphone and a second distance between the first audio source and a second microphone based on time-of-flight information; and deriving, at least in part, the orientation of a wearable audio device relative to the first audio source based on the time-of-flight information.

[0024] In one aspect, the method further includes: obtaining a second signal representing a second sound reproduced in the environment by a second audio source via a first microphone and a second microphone; and deriving the position of a wearable audio device relative to the first audio source and the second audio source via a processor, based at least in part on the first signal and / or the second signal.

[0025] In one aspect, the method further includes: obtaining a third signal representing a third sound reproduced in the environment by a third audio source via a first microphone and a second microphone; and deriving the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source via a processor, based at least in part on the first signal, the second signal, and / or the third signal.

[0026] These and other aspects of the various embodiments will become apparent from, and will be elucidated with reference to, the embodiments described below.

Description of the Drawings

[0027] In the accompanying drawings, similar reference numerals generally refer to the same parts in all different views. Furthermore, the drawings are not necessarily drawn to scale, and the focus is usually on illustrating the principles of various embodiments.

[0028] Figure 1 is a schematic diagram of the system according to this disclosure.

[0029] Figure 2A is a schematic diagram of the components of a wearable audio device according to this disclosure.

[0030] Figure 2B is a schematic diagram of the components of an audio source device according to this disclosure.

[0031] Figure 3 is a top plan view of the system according to this disclosure.

[0032] Figure 4 is a top plan view of the system according to this disclosure.

[0033] Figure 5 is a top plan view of the system according to this disclosure.

[0034] Figure 6 is a top plan view of the system according to this disclosure.

[0035] Figure 7 is a top plan view of the system according to this disclosure.

[0036] Figure 8 shows the steps of a method according to this disclosure.

[0037] Figure 9 shows the steps of a method according to this disclosure.

Detailed Description

[0038] This disclosure relates to systems and methods for determining the position and orientation of a wearable audio device, such as methods and systems for determining the position and orientation of a wearable audio device using acoustic beacons. In some examples, the determined position and orientation can be used to correct for drift experienced by an inertial measurement unit (IMU). In other examples, this drift can cause an externalized or virtualized audio source generated within a known environment to move or drift relative to a known position of a physical audio source within that environment. Therefore, the systems and methods described herein can be used to correct for positional drift of a virtual audio source relative to the wearable audio device by first determining the virtual audio source's own absolute position and orientation within the environment.

[0039] As used herein, the term "wearable audio device" is intended to refer, in addition to its common meaning or meaning known to those skilled in the art, to a device that fits around, on, in, or near the ear (including open-ear audio devices worn on a user's head or shoulders) and that radiates sound energy into or toward the ear. Wearable audio devices are sometimes referred to as headphones, earphones, earpieces, over-ear headphones, earbuds, or sports headphones, and may be wired or wireless. Wearable audio devices include acoustic drivers for converting audio signals into sound energy. The acoustic drivers may be housed within earcups. While some of the following figures and descriptions may illustrate a single wearable audio device with a pair of acoustic drivers, it should be understood that a wearable audio device may be a single, independent unit with only one acoustic driver. Each acoustic driver of a wearable audio device may be mechanically connected to the other acoustic driver, for example, via a headband and / or via leads that conduct audio signals to the pair of acoustic drivers. Wearable audio devices may include components for wirelessly receiving audio signals. Wearable audio devices may include components of an active noise reduction (ANR) system. Wearable audio devices can also include other features, such as microphones, allowing them to be used as headsets. Although Figure 1 shows an example of an audio eyeglasses form factor, in other examples the wearable audio device can be an in-ear, on-ear, over-ear, or near-ear headset. In some examples, the wearable audio device can be an open-ear device, which includes acoustic drivers that radiate sound energy toward the ear while leaving the ear open to its surrounding environment.

[0040] As used herein, the term "head-related transfer function" or the acronym "HRTF" is intended, in addition to its common meaning known to those skilled in the art, to broadly reflect any method of calculating, determining, or estimating binaural sound perceived by the human ear, enabling a listener to estimate the spatial origin of the sound. For example, an HRTF may be a mathematical formula or set of mathematical formulas that can be applied or convolved with an audio signal, allowing a user listening to a modified audio signal to perceive sound as originating from a specific point in space. As mentioned herein, these HRTFs can be generated specifically for each user, for example, taking into account the user's unique physiology (e.g., the size and shape of the head, ears, nasal cavity, mouth, etc.). Alternatively, it should be understood that a generalized HRTF can be generated for all users, or multiple generalized HRTFs can be generated for a subset of users (e.g., based on certain physiological characteristics, such as age, sex, head size, ear size, or other parameters, that at least broadly indicate a unique head-related transfer function for that user). In one example, some aspects of the HRTF can be determined precisely, while others are estimated coarsely (e.g., interaural delay is determined precisely, but the magnitude response is estimated coarsely).
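To make the idea of convolving an HRTF with an audio signal concrete, the following toy sketch applies a pair of time-domain impulse responses (one per ear) to a mono signal. The impulse responses are invented for illustration (a source on the listener's left, so the right ear hears a delayed and quieter copy) and are not measured HRTFs; the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Apply one impulse response per ear to produce a left/right pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


mono = [1.0, 0.5, 0.25]
hrir_left = [1.0]             # assumed: near ear, no delay, full level
hrir_right = [0.0, 0.0, 0.6]  # assumed: far ear, 2 samples later, quieter

left, right = render_binaural(mono, hrir_left, hrir_right)
print(left)
print(right)
```

The two-sample offset between the ears plays the role of the interaural delay mentioned above, and the 0.6 scaling stands in for a coarse magnitude response.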

[0041] The following description should be read in view of Figures 1 to 7. Figure 1 is a schematic diagram of a system 100 used in environment E according to this disclosure. System 100 includes at least one wearable audio device (e.g., wearable audio device 102) and multiple audio sources 104A to 104C (collectively referred to as "audio source 104", "multiple audio sources 104", or "audio source device 104"). Wearable audio device 102 is intended to be a device capable of acquiring (via a microphone discussed below) sound within environment E, such as sounds 108A to 108C (collectively referred to as "sound 108" or "multiple sounds 108") (shown in Figures 3 to 6), and converting those sounds into multiple signals, for example, signals 150A to 150F (shown in Figures 2A and 2B). Additionally, the wearable audio device 102 may include one or more speakers, such as a first speaker 122 and a second speaker 124 (shown in Figures 4 to 7 and discussed below), to provide audio playback to the user or wearer of the wearable audio device 102. In one example, as shown in Figure 1, wearable audio device 102 is an eyeglass-shaped, open-ear audio device capable of reproducing sound energy outside and close to the user's ear. It should be understood that in other examples, wearable audio device 102 may be selected from ear-hook or in-ear headphones, earphones, handsets, headphones, earbuds, or sports headphones. In some examples, system 100 includes at least one peripheral device PD, which may be selected from any electronic device capable of generating and / or transmitting audio signals (e.g., reference signal 106 discussed below) to a separate device (e.g., wearable audio device 102 and / or audio source 104). In one example, as shown in Figure 1, the peripheral device PD is intended to be a smartphone or tablet computer. However, it should be understood that the peripheral device PD may be selected from a smartphone, tablet computer, laptop computer or personal computer, a case configured to engage with and / or charge the wearable audio device 102, or any other portable and / or mobile computing device.

[0042] Each of the multiple audio source devices 104 is intended to be a device capable of receiving an audio signal (e.g., reference signal 106) associated with audio, video, or other stored media or media streams to be reproduced as audible sound by the audio source device 104. As will be discussed below, each audio source 104 may include one or more acoustic drivers, transducers, or loudspeakers, such as source speaker 144 (discussed below), capable of receiving the reference signal 106 and generating at least one of multiple sounds within the environment E, such as sounds 108A to 108C (shown in Figures 3 to 7). In at least some examples, audio source 104 is intended to be a speaker, such as a wired or wireless speaker; however, it should be understood that each audio source 104 may be selected from: portable speakers, smartphones, tablet computers, personal computers, smart TVs, far-field audio devices, vehicle speakers, or any other device capable of generating detectable sound within environment E in response to reference signal 106. In another example, at least one audio source 104 may take the form of a public address (PA) system or other speaker system in a public place such as an arena, stadium, or concert venue. It should be understood that each audio source 104 may receive reference signal 106 and utilize reference signal 106 to generate a corresponding sound, such as sounds 108A to 108C, at each audio source 104 within environment E. Although only three audio sources 104A to 104C are shown and described herein, it should be understood that more than three audio sources 104 may be used, such as four, five, six, eight, ten, etc.

[0043] As shown in Figure 2A, the wearable audio device 102 may further include a first circuit 110, which includes a first processor 112 and a first memory 114. The first processor and the first memory are capable of executing and storing, respectively, a first set of non-transitory computer-readable instructions 116 to perform the functions of the wearable audio device 102 as described herein. The first circuit 110 may also include a first communication module 118, which is configured to transmit and / or receive wireless data, such as data related to a reference signal 106 from a peripheral device PD (shown in Figure 1). It should also be understood that the wearable audio device 102 can also be configured to transmit wireless data (e.g., reference signal 106) to each audio source device 104. For this purpose, the first communication module 118 may include at least one radio or antenna, such as a first radio 120 capable of transmitting and receiving wireless data. In some examples, in addition to at least one radio (e.g., the first radio 120), the first communication module 118 may also include some form of automatic gain control (AGC), a modulator and / or demodulator, and potentially a discrete processor for bit processing, electrically connected to the first processor 112 and the first memory 114 to assist in transmitting and / or receiving wireless data. As will be discussed below, the first circuit 110 of the wearable audio device 102 may also include a first speaker 122 and a second speaker 124 (e.g., a loudspeaker or acoustic driver or transducer), which are electrically connected to the first processor 112 and the first memory 114 and configured to electromechanically convert electrical signals (e.g., reference signal 106) into audible sound energy (also referred to herein as audio playback) within the environment E. In some examples, the reference signal 106 and the audible sound energy are associated with data transmitted and received between the wearable audio device 102, multiple audio source devices 104, and / or peripheral devices PD. In one example, as shown in Figures 4 to 7, the first speaker 122 is intended to be positioned close to the user's right ear, while the second speaker 124 is intended to be positioned close to the user's left ear.

[0044] Additionally, as discussed below, the first circuit 110 of the wearable audio device 102 may further include at least one microphone. In some examples, the wearable audio device 102 has only one microphone, namely, the first microphone 126. In other examples, the wearable audio device 102 includes multiple microphones, namely, at least the first microphone 126 and the second microphone 128. It should be understood that although the following examples describe a wearable audio device 102 having one or two microphones (i.e., the first microphone 126 and / or the second microphone 128), in some examples, more than two microphones (e.g., three, four, six, eight, etc.) may be used. As discussed below, each microphone is capable of receiving sound within the environment E and converting, generating, or acquiring a signal associated with the corresponding sound (e.g., sounds 108A to 108C) generated by the speakers (discussed below) of each audio source device 104. Furthermore, as discussed below with respect to the examples shown in Figures 4 to 7, two microphones may be used, such as a first microphone 126 and a second microphone 128. In these examples, the first microphone 126 may be engaged with, fixed to, or mounted on or inside the right side of the wearable audio device 102 near the user's right ear, and the second microphone 128 may be engaged with, fixed to, or mounted on or inside the left side of the wearable audio device 102 near the user's left ear.
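With one microphone near each ear, the difference between the two time-of-flight distances to a single source constrains the device's orientation relative to that source. The sketch below shows one simple way this could be computed, under assumptions that are not stated in the source: the source is far away compared with the microphone spacing (a far-field approximation), the geometry is 2-D, and the 0.15 m spacing is illustrative. The function name is hypothetical, and the result cannot distinguish a source in front of the wearer from one behind.

```python
import math


def bearing_from_mic_distances(d_right_m, d_left_m, mic_spacing_m):
    """Far-field bearing of the source in degrees; positive means toward the right ear."""
    ratio = (d_left_m - d_right_m) / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.degrees(math.asin(ratio))


# Hypothetical frame: the wearer faces +y, with the right ear on the +x side.
mic_spacing_m = 0.15
right_mic = (mic_spacing_m / 2.0, 0.0)
left_mic = (-mic_spacing_m / 2.0, 0.0)

# A source 10 m away, 45 degrees to the wearer's right.
source = (10.0 * math.sin(math.radians(45.0)), 10.0 * math.cos(math.radians(45.0)))
d_right = math.dist(source, right_mic)
d_left = math.dist(source, left_mic)

bearing_deg = bearing_from_mic_distances(d_right, d_left, mic_spacing_m)
print(bearing_deg)
```

Because the path difference is at most the microphone spacing, small errors in the measured distances translate into comparatively large bearing errors, which is one reason additional sources or microphones may be useful.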

[0045] Additionally, the first circuit 110 of the wearable audio device 102 may further include an inertial measurement unit 130 (shown schematically in Figure 2A). The inertial measurement unit (IMU) 130 is intended to include one or more sensors configured to obtain the perceived orientation PO of the wearable audio device 102 relative to one or more audio sources 104 and / or the perceived position PP of the wearable audio device 102 relative to one or more audio sources 104 (all shown in Figure 3 and discussed below). In some examples, the sensors of IMU 130 may be selected from one or more of the following: a gyroscope (e.g., a three-axis gyroscope), an accelerometer (e.g., a three-axis accelerometer), a magnetometer, a camera, a proximity sensor, a light detection and ranging sensor (LIDAR), an ultrasonic distance sensor, or any other sensor capable of obtaining relative distance, orientation, position, or height information of the wearable audio device 102 relative to one or more audio source devices 104 and / or other objects with known positions in or near the environment E. In some examples, IMU 130 may also include one or more sensors for deriving or obtaining information related to absolute position, such as a Global Positioning System (GPS) sensor capable of obtaining at least the position information of the wearable audio device 102. As discussed below, IMU 130 may employ one or more algorithms to process the data obtained from the aforementioned sensors to determine the perceived position and orientation of the wearable audio device 102 relative to one or more audio source devices 104. It should be understood that determining the perceived position of the wearable audio device 102 relative to one or more audio source devices 104 may include determining the perceived height of the wearable audio device 102 relative to the actual height of the one or more audio source devices 104.

[0046] As shown in Figure 2B, each audio source device may also include its own circuitry, namely, source circuitry 132. Each source circuitry 132 of each audio source device 104 includes a source processor 134 and a source memory 136, which are capable of executing and storing, respectively, a set of non-transitory computer-readable instructions (i.e., source instructions 138) to perform the functions of each corresponding audio source device 104 as described herein. Each source circuitry 132 may also include a source communication module 140 configured to send and / or receive wireless data, such as data associated with reference signal 106, to and / or from, for example, a peripheral device PD or a wearable audio device 102. For this purpose, each source communication module 140 may include at least one radio or antenna, such as a source radio 142 capable of transmitting and receiving wireless data. In some examples, in addition to at least one radio (e.g., source radio 142), each source communication module 140 may also include some form of automatic gain control (AGC), a modulator and / or demodulator, and potentially a discrete processor for bit processing, electrically connected to each respective source processor 134 and source memory 136 to assist in transmitting and / or receiving wireless data. As will be discussed below, each audio source device 104, and therefore each source circuitry 132, may also include at least one source speaker 144 (e.g., a loudspeaker or acoustic driver or transducer), electrically connected to its respective source processor 134 and source memory 136 and configured to electromechanically convert electrical signals (e.g., reference signal 106) into audible acoustic energy within the environment E (also referred to herein as audio playback). In some examples, the reference signal 106 and the audible acoustic energy are associated with data transmitted and received between the wearable audio device 102, the plurality of audio source devices 104, and / or peripheral devices PD.

[0047] In some examples, system 100 can be configured to generate, produce, or otherwise reproduce one or more virtual audio sources within environment E. For example, wearable audio device 102 and / or peripheral device PD can be configured to modify reference signal 106 into one or more modified audio signals that have been filtered or modified using at least one head correlation transfer function (HRTF). In one example of system 100, the system can utilize this virtualization or externalization of augmented reality audio systems and programs by: modeling environment E (e.g., using a locator or other environmental data source), generating virtual sound sources (e.g., virtual sound source 146) at various locations within environment E, and processing the virtual sound source 146 (in...) Figures 3 to 6The sound waves and their corresponding paths to the location of the wearable audio device 102 and / or the location of the ear of the user wearing the wearable audio device 102 (as shown in the diagram) are modeled or simulated to simulate the user's perception of sound as if the virtual sound source 146 were a real or tangible sound source (e.g., a physical speaker located at the location of the virtual sound source 146 within the environment E). For each modeled or simulated sound path (e.g., a direct sound path or a reflected sound path), computational processing is used to apply or convolve at least one pair of HRTFs (one associated with the left ear and one associated with the right ear) to the reference signal 106 to generate a modified audio signal. 
Once the HRTFs have been applied and the modified audio signal has been generated, the modified audio signal can be provided to the speakers of the wearable audio device 102 (i.e., the first speaker 122 and the second speaker 124) to generate audio playback that deceives the user's brain into believing that they are perceiving sound from an actual externalized sound source located at the location of the virtual sound source 146 within the environment E. In some examples, the quality of the simulated realism of these modified audio signals can be enhanced by simulating at least first-order and / or second-order sound reflections from a virtual audio source 146 within the environment E, and by attenuating or delaying the simulated signal to estimate the time of flight of the sound signal through the air as if the sound signal originated at the location of the virtual sound source 146 within the environment E. It should be understood that the wearable audio device 102 and / or the peripheral device PD can process, apply, or convolve the HRTF to simulate one or more virtual sound sources. However, since the form factor and therefore the space for additional processing components are typically limited in wearable audio devices (e.g., wearable audio device 102), it should also be understood that the application or convolution of the HRTF with the reference signal 106 in question may be implemented via the circuitry of the peripheral device PD, and the modified audio signal can then be sent or streamed to the wearable audio device 102 for playback as an audio playback APB. Additional information relating to the generation and simulation of virtual sound sources can be found in U.S. Patent Application Publication No. 2020 / 0037097, the entire contents of which are incorporated herein by reference.
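The per-path HRTF step described above can be illustrated with a minimal sketch. It assumes the HRTF pair is available in its time-domain form (a head-related impulse response for each ear) and uses toy two-tap and one-tap filters rather than measured data; the function names `convolve` and `render_binaural` are hypothetical and do not come from this disclosure.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, s in enumerate(signal):
        for m, h in enumerate(impulse_response):
            out[n + m] += s * h
    return out


def render_binaural(reference, hrir_left, hrir_right):
    """Apply one left/right impulse-response pair to a reference signal."""
    return convolve(reference, hrir_left), convolve(reference, hrir_right)


# Toy data (assumed, not measured): the left ear hears the source one
# sample later at full level, the right ear hears it at once at half level.
reference = [1.0, 0.5, 0.25]
hrir_left = [0.0, 1.0]
hrir_right = [0.5]

left, right = render_binaural(reference, hrir_left, hrir_right)
print(left)   # [0.0, 1.0, 0.5, 0.25]
print(right)  # [0.5, 0.25, 0.125]
```

A practical renderer would likely repeat this for every simulated direct and reflected path and sum the results per ear; that extension is left out of the sketch.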

[0048] During operation, as shown in Figure 3, a system that uses an inertial measurement unit (e.g., IMU 130) to obtain position and orientation information of one or more devices (e.g., wearable audio device 102) may experience drift. The term "drift," in addition to its common meaning known to those skilled in the art, is intended to refer to a quantifiable difference between the perceived orientation PO and / or perceived position PP of the wearable audio device 102 relative to one or more audio source devices 104, as perceived by IMU 130, and the actual orientation O1, position P1, and height H of the wearable audio device 102 within the environment E. Drift can occur for a variety of reasons; for example, drift can be caused by the accumulation of small measurement errors (e.g., due to noise and / or offset) that compound over time, resulting in increasingly larger errors in the perceived orientation PO and / or perceived position PP of the wearable audio device 102 relative to the actual orientation O1 and position P1 of the wearable audio device 102 within the environment E. An exemplary illustration of this drift is shown in Figure 3, which shows the user's actual orientation O1 and position P1. Additionally, due to, for example, the accumulation of errors, the perceived orientation PO and perceived position PP of the wearable audio device 102 are shown as dashed outlines rotated 15 degrees and positioned behind the user's actual location. This results in a drift in the position of the virtual audio source 146 (also shown in dashed lines in Figure 3). It should be understood that the 15-degree rotational and positional shifts shown and disclosed herein are extreme examples, and other smaller drift shifts, such as rotational drifts of 1, 2, or 5 degrees, or larger drift shifts, such as 20 or 30 degrees, and any values in between, are contemplated herein. 
As discussed herein, this disclosure uses signals obtained from one or more microphones of the wearable audio device 102 to correct or adjust for drift, these signals representing sound generated by one or more audio source devices 104 within the environment E. In other words, this system and method allow the use of acoustic energy within the environment E as a beacon so that the wearable audio device 102 can derive its actual orientation O1 and position P1 relative to one or more audio source devices 104.
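As a rough illustration of the drift defined above, the sketch below treats drift in a single rotational axis (yaw) as the wrapped difference between the IMU's perceived value and an acoustically derived value, then removes it. The function names and the `gain` parameter are hypothetical, and reducing orientation to one angle is a simplifying assumption rather than the method of this disclosure.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def correct_yaw(perceived_yaw_deg, acoustic_yaw_deg, gain=1.0):
    """Return (corrected yaw, measured drift).

    gain=1.0 removes the whole drift at once; a smaller gain (assumed
    option) would blend the correction in gradually.
    """
    drift = wrap_degrees(perceived_yaw_deg - acoustic_yaw_deg)
    corrected = wrap_degrees(perceived_yaw_deg - gain * drift)
    return corrected, drift


# The 15-degree example: the IMU believes the head is rotated 15 degrees,
# while the acoustic beacons indicate 0 degrees.
corrected, drift = correct_yaw(15.0, 0.0)
print(corrected, drift)  # 0.0 15.0
```

Wrapping matters near the 0/360 boundary: a perceived yaw of 350 degrees against an acoustic yaw of 5 degrees is a drift of -15 degrees, not 345.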

[0049] One method of utilizing sounds generated by multiple audio sources 104 (i.e., sounds 108A to 108C) is to utilize the time-of-flight information 148 of each sound (shown in Figure 2A) and calculate or derive the orientation O1, position P1, and / or height H of the wearable audio device 102 relative to one or more audio source devices 104. Regarding obtaining location data, this technique can also be referred to as multilateration, for example, utilizing the arrival time of energy waves having a known speed through a particular medium (e.g., sound or light through air). In these examples, the wearable audio device 102 and the audio source devices 104 can know, store, or utilize data obtained from the reference signal 106. For example, the audio source devices 104A to 104C can utilize the reference signal 106 to respectively reproduce audible sound energy within the environment E (i.e., sounds 108A to 108C). Since each audio source device 104 may be located at a different position within the environment E relative to the wearable audio device 102, one or more microphones of the wearable audio device 102 (e.g., first microphone 126 and / or second microphone 128) will receive the sound waves associated with sounds 108A to 108C at different times and generate signals 150A to 150F associated with each sound, respectively. By calculating the difference between the time of acquisition or reception of each sound signal and baseline time information from reference signal 106, wearable audio device 102 can determine, calculate, or derive the time-of-flight information 148 of each signal from its corresponding source to wearable audio device 102. Once the time-of-flight information 148 of each signal 150A to 150F is known, it can be multiplied by the known speed of sound through the air to obtain the distance between each audio source 104 and wearable audio device 102. 
Once the distance between each audio source device 104 and wearable audio device 102 is known, the angle between wearable audio device 102 and one or more audio sources 104 can be derived, and the position, orientation, and / or height of wearable audio device 102 relative to the audio sources 104 can be derived. Once these actual values and positions are known, drift relative to those audio sources can be corrected and / or prevented.
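The time-of-flight step can be sketched as follows. The example assumes that the microphone buffer starts at the instant the source begins playing the reference signal (that is, the baseline time is already known and clocks are aligned), that the speed of sound is 343 m/s, and that a brute-force sliding correlation is acceptable; all function names and the test waveform are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees Celsius


def estimate_delay_samples(mic, reference):
    """Return the lag (in samples) at which mic best matches reference."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(mic) - len(reference) + 1):
        score = sum(mic[lag + n] * r for n, r in enumerate(reference))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def distance_from_delay(delay_samples, sample_rate_hz):
    """Convert a delay in samples to metres: time of flight times speed."""
    time_of_flight_s = delay_samples / sample_rate_hz
    return time_of_flight_s * SPEED_OF_SOUND_M_S


sample_rate_hz = 48000
reference = [1.0, -1.0, 1.0, 1.0, -1.0]
# Synthetic microphone capture: the reference arrives 140 samples late
# and attenuated by half.
mic = [0.0] * 140 + [0.5 * r for r in reference] + [0.0] * 20

delay = estimate_delay_samples(mic, reference)
distance_m = distance_from_delay(delay, sample_rate_hz)
print(delay)                 # 140
print(round(distance_m, 3))  # 1.0
```

At 48 kHz one sample corresponds to about 7 mm of travel, so the sample rate bounds the distance resolution of this simple integer-lag search.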

[0050] The IMU 130 can use algorithms to determine the perceived orientation PO and / or perceived position PP. In one example, the algorithm used by the IMU 130 may be combined with or utilize gradient descent or gradient ascent algorithms to calculate the distance between each audio source 104 and the wearable audio device 102, and finally use the distance information to calculate the actual orientation O1, position P1, and / or height H of the wearable audio device 102 relative to one or more audio source devices 104.

[0051] The following is an exemplary implementation based on the principles of this disclosure. Assume that N speakers exist in environment E, for example, source speakers 144, each speaker generating, producing, or reproducing sound 108. The signal reaching or obtained by the k-th microphone is:

[0052]

[0053] In the above formula, x_i(t) represents the signal played from the i-th source speaker 144, g_ki(t) represents the transfer function from the i-th source speaker 144 to the k-th microphone, and τ_ki is the corresponding time delay, i.e., the time-of-flight information 148. In one example, the system utilizes a gradient ascent algorithm, which adjusts the perceived orientation PO and / or perceived position PP estimates so as to maximize the correlation between the signals obtained from one or more microphones and an estimated signal, where

[0054]

[0055] In other words, the algorithm finds the time-of-flight information 148, or time shifts, that time-align the components of the estimated orientation or position with the components of the signals obtained from one or more microphones of the wearable audio device 102.
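The expressions belonging to paragraphs [0052] and [0054] are not reproduced in this text. As an assumed sketch only, one form consistent with the symbol definitions in [0053] is given below. The convolution operator, the estimated signal \hat{y}_k, the candidate delays \hat{\tau}_{ki}, the microphone count K, and the objective J are illustrative notation and should not be read as the formulas of this disclosure.

```latex
% Assumed signal model at the k-th microphone for N source speakers
y_k(t) = \sum_{i=1}^{N} g_{ki}(t) * x_i\bigl(t - \tau_{ki}\bigr)

% Assumed estimated signal built from candidate delays \hat{\tau}_{ki},
% which follow from the current position / orientation estimate
\hat{y}_k(t) = \sum_{i=1}^{N} x_i\bigl(t - \hat{\tau}_{ki}\bigr)

% Assumed correlation objective that gradient ascent would increase
J = \sum_{k=1}^{K} \int y_k(t)\, \hat{y}_k(t)\, dt
```

Under this reading, J is largest when each candidate delay matches the true time of flight, which is the time alignment described in [0055].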

[0056] In the following examples, the use of gradient ascent or gradient descent algorithms allows wearable audio device 102 to determine the distance between the corresponding audio source device 104 and wearable audio device 102, as well as the angle between the audio source device 104 and wearable audio device 102. Therefore, the position of the audio source 104 relative to wearable audio device 102 is known or can be derived and / or calculated. Using the known positions of enough audio sources 104, wearable audio device 102 can triangulate its actual position P1. Using additional audio sources, wearable audio device 102 can also derive its height or position relative to audio source 104. Furthermore, in exemplary embodiments utilizing more than one microphone (e.g., first microphone 126 and second microphone 128), wearable audio device 102 can also derive, calculate, or otherwise obtain its orientation relative to one or more audio source devices 104.

[0057] In one exemplary operation, as shown in Figure 4, system 100 includes a wearable audio device 102 having two microphones (i.e., a first microphone 126 and a second microphone 128) and a single audio source device 104A. A peripheral device PD can transmit wireless data (e.g., data associated with a reference signal 106) to the source device 104A, enabling the audio source 104A to generate, produce, or otherwise reproduce audible sound energy (e.g., sound 108A) within the environment E. The wearable audio device 102 can receive the sound energy of sound 108A at the first microphone 126 and the second microphone 128, and obtain associated signals 150A to 150B representing sound 108A from the audio source 104A (shown in Figure 2A). Since the wearable audio device 102 can also receive a reference signal 106 from, for example, a peripheral device PD, the wearable audio device 102 can compare the first signal 150A with the reference signal 106 and the second signal 150B with the reference signal 106, and obtain time-of-flight information 148 for the first signal 150A and the second signal 150B. Using the time-of-flight information 148 for each signal 150A to 150B, and using the known constant speed of sound propagation through the air, the wearable audio device 102 can determine, calculate, or otherwise derive (e.g., using a gradient ascent or descent algorithm) the distance between the first microphone 126 and the audio source 104A (i.e., the first distance D1) and the distance between the second microphone 128 and the audio source 104A (i.e., the second distance D2). Once these distances are known, the angles generated by the distance lines D1 and D2 can be calculated, and the actual orientation O1 of the wearable audio device 102 relative to the audio source 104A can be determined. 
Additionally, as shown, if system 100 generates or otherwise reproduces one or more virtual audio sources 146, the algorithm employed by IMU 130 can utilize the known orientation O1 of wearable audio device 102 relative to audio source 104A to correct any drift in the perceived orientation PO of wearable audio device 102 that occurs during operation of system 100, in order to maintain the accurate orientation and / or position of virtual audio source 146. Furthermore, it should be understood that one or more additional parameters can be assumed to obtain or derive more detailed information related to the position or orientation of the wearable audio device, i.e., one or more assumed parameters 152 (shown in Figure 2A). For example, it can be assumed that the user is on the surface of the earth and that they are 1.8 meters (approximately 5 feet 11 inches) tall. Additionally, it can be assumed that the physical speakers of, for example, audio sources 104A and 104B are positioned 1 meter (approximately 3 feet 3 inches) above the ground. This allows the IMU 130 to determine the distance or angle in question more accurately, as changes in elevation between the source and the wearable audio device (which contribute to the distance between the devices) can be taken into account.

[0058] Furthermore, in the example discussed above, where sound is received from a single audio source 104 at two microphones (e.g., the first microphone 126 and the second microphone 128) and only orientation information is needed, there is no need to compare the time-of-flight information 148 with the reference signal 106. Instead, the IMU algorithm can simply use the time-of-flight information 148 to determine the actual orientation O1 of the wearable audio device 102.
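One way to see why two microphones and one source suffice for orientation is the far-field approximation sketched below, where only the difference D2 - D1 matters. The microphone spacing of 0.18 m, the sign convention, and the function name are assumptions for illustration; the disclosure does not specify this formula.

```python
import math


def bearing_from_distances(d1_m, d2_m, mic_spacing_m):
    """Far-field bearing of a source from two microphone distances.

    Returns degrees; 0 means the source is straight ahead (equidistant),
    and positive values mean it is closer to the first microphone.
    Front/back ambiguity is not resolved by this sketch.
    """
    ratio = (d2_m - d1_m) / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.degrees(math.asin(ratio))


# Assumed geometry: microphones 0.18 m apart, source 9 cm closer to the
# first microphone than to the second.
angle_deg = bearing_from_distances(2.00, 2.09, 0.18)
print(round(angle_deg, 3))  # 30.0
```

Because the result depends only on the path-length difference, a shared but unknown emission time cancels out, which is consistent with the remark that no comparison against the reference signal is needed in this case.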

[0059] In another exemplary operation, as shown in Figure 5, system 100 includes a wearable audio device 102 having a single microphone (i.e., first microphone 126) and two audio source devices 104A and 104B. Although shown on the right side of the wearable audio device near the user's right ear, it should be understood that in this example, the single microphone (i.e., the first microphone) can be positioned anywhere on the wearable audio device. Peripheral device PD can transmit wireless data (e.g., data associated with reference signal 106) to source devices 104A and 104B, such that audio sources 104A and 104B can respectively generate, produce, or otherwise reproduce audible sound energy (e.g., sounds 108A and 108B) within environment E. Wearable audio device 102 can receive the sound energy of sounds 108A and 108B at the first microphone 126 and obtain associated signals 150A to 150B representing sounds 108A and 108B from audio sources 104A and 104B (shown in Figure 2A). Since the wearable audio device 102 can also receive a reference signal 106 from, for example, a peripheral device PD, the wearable audio device 102 can compare a first signal 150A with the reference signal 106 and a second signal 150B with the reference signal 106, and obtain time-of-flight information 148 for the first signal 150A and the second signal 150B. Using the time-of-flight information 148 for each signal 150A to 150B, and using the known constant speed of sound propagation through the air, the wearable audio device 102 can determine, calculate, or otherwise derive (e.g., using a gradient ascent or descent algorithm) the distance between the first microphone 126 and the audio source 104A (i.e., the first distance D1) and the distance between the first microphone 126 and the audio source 104B (i.e., the second distance D2). 
Once these distances are known, the angles generated by the distance lines D1 and D2 can be calculated with a first degree of accuracy (discussed below), and the actual position P1 of the wearable audio device 102 relative to the audio sources 104A and 104B can be estimated and / or determined. In one example, the first level of accuracy is low accuracy. For instance, equal time-of-flight information 148 from both audio sources 104A and 104B can indicate that the wearable audio device 102 is located at any point along the plane formed exactly between the two sources, but it will not indicate where the wearable audio device 102 is located within that plane.
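The limited accuracy available from two sources can be made concrete with a small sketch. It assumes a flat, two-dimensional floor plan and exact distances, and intersects the two range circles; the function name and the coordinates are hypothetical.

```python
import math


def circle_intersections(p_a, d_a, p_b, d_b):
    """Points at distance d_a from p_a and d_b from p_b (2D, assumed)."""
    (ax, ay), (bx, by) = p_a, p_b
    base = math.hypot(bx - ax, by - ay)
    if base == 0 or base > d_a + d_b or base < abs(d_a - d_b):
        return []  # no consistent position for these ranges
    # Distance from p_a to the foot of the chord, along the line a -> b.
    along = (d_a ** 2 - d_b ** 2 + base ** 2) / (2 * base)
    off = math.sqrt(max(0.0, d_a ** 2 - along ** 2))
    ux, uy = (bx - ax) / base, (by - ay) / base
    mx, my = ax + along * ux, ay + along * uy
    return [(mx - off * uy, my + off * ux), (mx + off * uy, my - off * ux)]


# Two sources 4 m apart, equal ranges of 2.5 m from each.
candidates = circle_intersections((0.0, 0.0), 2.5, (4.0, 0.0), 2.5)
print(candidates)  # [(2.0, 1.5), (2.0, -1.5)]
```

Both candidates lie on the midline between the sources and are mirror images of each other, so the ranges alone cannot say which one is the wearer. An IMU estimate, an assumed parameter, or a third source would be needed to choose between them.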

[0060] Additionally, as shown, if system 100 generates or otherwise reproduces one or more virtual audio sources 146, the algorithm employed by IMU 130 can utilize the known position P1 of wearable audio device 102 relative to audio sources 104A and 104B to correct any drift in the perceived position PP of wearable audio device 102 that occurs during operation of system 100, in order to maintain the accurate position of the virtual audio sources 146. As will be discussed below, it should be understood that one or more additional parameters can be assumed to obtain or derive more detailed information related to the position or orientation of the wearable audio device, namely, one or more assumed parameters 152. For example, it can be assumed that the user is on the surface of the earth and that they are 1.8 meters (approximately 5 feet 11 inches) tall. Additionally, it can be assumed, for example, that the physical speakers of audio sources 104A and 104B are positioned 1 meter (approximately 3 feet 3 inches) above the ground. This allows IMU 130 to determine the distance or angle in question more accurately, as changes in elevation between the source and the wearable audio device (which contribute to the distance between the devices) can be taken into account.
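The effect of the assumed heights can be sketched with the Pythagorean relation between the measured (slant) distance and the distance along the floor. The sketch assumes the microphone sits at the user's full assumed height, which overstates ear level slightly; the constant and function names are hypothetical.

```python
import math

USER_HEIGHT_M = 1.8     # assumed parameter taken from the text
SPEAKER_HEIGHT_M = 1.0  # assumed parameter taken from the text


def horizontal_distance(slant_distance_m,
                        receiver_height_m=USER_HEIGHT_M,
                        source_height_m=SPEAKER_HEIGHT_M):
    """Remove the elevation component from a measured acoustic distance."""
    rise = receiver_height_m - source_height_m
    if slant_distance_m < abs(rise):
        raise ValueError("measured distance is shorter than the height gap")
    return math.sqrt(slant_distance_m ** 2 - rise ** 2)


# A measured 1.0 m path with a 0.8 m height difference is only about
# 0.6 m along the floor.
ground_range_m = horizontal_distance(1.0)
print(round(ground_range_m, 3))  # 0.6
```

The correction matters most at short range: at 1 m the floor distance is 40 percent shorter than the measured path, while at 5 m the difference is a little over 1 percent.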

[0061] In another exemplary operation, as shown in Figure 6, system 100 includes a wearable audio device 102 having two microphones (i.e., a first microphone 126 and a second microphone 128) and two audio source devices 104A and 104B. A peripheral device PD can transmit wireless data (e.g., data associated with a reference signal 106) to the source devices 104A and 104B, enabling the audio sources 104A and 104B to generate, produce, or otherwise reproduce audible sound energy (e.g., sounds 108A and 108B) within the environment E. The wearable audio device 102 can receive the sound energy of sounds 108A and 108B at the first microphone 126 and the second microphone 128, and obtain at each microphone associated signals 150A to 150D representing the sound 108A from audio source 104A and the sound 108B from audio source 104B (shown in Figure 2A). Since the wearable audio device 102 can also receive a reference signal 106 from, for example, a peripheral device PD, the wearable audio device 102 can compare the first signal 150A, the second signal 150B, the third signal 150C, and the fourth signal 150D with the reference signal 106 and obtain time-of-flight information 148 for signals 150A to 150D. Using the time-of-flight information 148 for each signal 150A to 150D, and using the known constant speed of sound traveling through the air, the wearable audio device 102 can determine, calculate, or otherwise derive (e.g., using gradient ascent or descent algorithms) the distance between the first microphone 126 and the audio source 104A (i.e., the first distance D1), the distance between the second microphone 128 and the audio source 104A (i.e., the second distance D2), the distance between the first microphone 126 and the audio source 104B (i.e., the third distance D3), and the distance between the second microphone 128 and the audio source 104B (i.e., the fourth distance D4). 
Once these distances are known, the angles generated by distance lines D1 and D2, and D3 and D4, can be calculated, and the actual position P1 can be determined with a first degree of accuracy. In one example, the first degree of accuracy is low; for example, equal time-of-flight information 148 from both audio sources 104A and 104B can indicate that the wearable audio device 102 is located at any point along the plane formed exactly between the two sources, but will not indicate where the wearable audio device 102 is located within that plane. Additionally, since the wearable audio device 102 includes two microphones, for example, where the first microphone 126 and the second microphone 128 are positioned on opposite sides of the wearable audio device 102, the actual orientation O1 of the wearable audio device 102 relative to audio sources 104A and 104B can also be determined.

[0062] As illustrated, if system 100 generates or otherwise reproduces one or more virtual audio sources 146, the algorithm employed by IMU 130 can utilize the known orientation O1 and actual position P1 of wearable audio device 102 relative to audio sources 104A and 104B to correct any drift in the perceived orientation PO or perceived position PP of wearable audio device 102 that occurs during operation of system 100, in order to maintain the accurate orientation and / or position of virtual audio sources 146. As will be discussed below, it should be understood that one or more additional parameters may be assumed to obtain or derive more detailed information relating to the position, orientation, or height of the wearable audio device, i.e., one or more assumed parameters 152. For example, it may be assumed that the user is on the surface of the earth and that they are 1.8 meters (approximately 5 feet 11 inches) tall. Additionally, it may be assumed that, for example, the physical speakers of audio sources 104A and 104B are positioned 1 meter (approximately 3 feet 3 inches) above the ground. This allows the IMU 130 to determine the distance or angle in question more accurately, as changes in elevation between the source and the wearable audio device (which contribute to the distance between the devices) can be taken into account.

[0063] In another exemplary operation, as shown in Figure 7, system 100 includes a wearable audio device 102 having two microphones (i.e., a first microphone 126 and a second microphone 128) and three audio source devices 104A to 104C. A peripheral device PD can transmit wireless data (e.g., data associated with a reference signal 106) to the source devices 104A to 104C, enabling the audio sources 104A to 104C to generate, produce, or otherwise reproduce audible sound energy (e.g., sounds 108A to 108C) within the environment E. The wearable audio device 102 can receive the sound energy of sounds 108A to 108C at the first microphone 126 and the second microphone 128, and obtain associated signals 150A to 150F representing sounds 108A to 108C at each microphone (shown in Figure 2A). Since the wearable audio device 102 can also receive a reference signal 106 from, for example, a peripheral device PD, the wearable audio device 102 can compare each signal 150A to 150F with the reference signal 106 and obtain time-of-flight information 148 for the signals 150A to 150F. Using the time-of-flight information 148 for each signal 150A to 150F, and using the known constant speed of sound through the air, the wearable audio device 102 can determine, calculate, or otherwise derive the distances between each audio source device 104A to 104C and each microphone, for example, distances D1 to D6. Once these distances are known, the angles generated by distance lines D1 and D2, D3 and D4, and D5 and D6 can be calculated, and the actual position P1 can be determined with a second degree of accuracy higher than the first degree of accuracy. In one example, the second degree of accuracy is high accuracy. For example, triangulation can be performed using time-of-flight information 148 from all three audio sources 104A to 104C, using multilateration techniques to obtain or derive the actual position P1 of the wearable audio device 102. 
Additionally, since the wearable audio device 102 includes two microphones, for example, where the first microphone 126 and the second microphone 128 are positioned on opposite sides of the wearable audio device 102, the actual orientation O1 of the wearable audio device 102 relative to audio sources 104A and 104B can also be determined. Furthermore, with three or more audio source devices 104, the system 100 can potentially derive the actual height H of the wearable audio device 102. With additional sources (e.g., four, five, or six sources), the system 100 can utilize the algorithms and techniques discussed herein to determine its own position and orientation relative to the audio source devices and obtain information related to the device's six degrees of freedom, such as the device's x, y, and z positions in a Cartesian coordinate system, as well as yaw, pitch, and roll.
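A minimal sketch of multilateration by gradient descent with three sources follows. It assumes a two-dimensional floor plan, noise-free distances, known source coordinates, and a fixed step size; the squared-range-error cost, the function name, and all numeric values are illustrative choices rather than the algorithm of this disclosure.

```python
import math


def locate_by_gradient_descent(sources, distances, start,
                               step=0.1, iterations=2000):
    """Minimise sum((|p - s_i| - d_i)^2) over the 2D position p."""
    x, y = start
    for _ in range(iterations):
        grad_x = grad_y = 0.0
        for (sx, sy), measured in zip(sources, distances):
            dx, dy = x - sx, y - sy
            predicted = math.hypot(dx, dy) or 1e-12  # avoid divide-by-zero
            residual = predicted - measured
            grad_x += 2.0 * residual * dx / predicted
            grad_y += 2.0 * residual * dy / predicted
        x -= step * grad_x
        y -= step * grad_y
    return x, y


# Assumed layout: three sources at known floor coordinates (metres).
sources = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.0, 1.0)
distances = [math.hypot(true_position[0] - sx, true_position[1] - sy)
             for sx, sy in sources]

# Start from the centroid of the sources as a neutral initial guess.
start = (sum(s[0] for s in sources) / 3, sum(s[1] for s in sources) / 3)
estimate = locate_by_gradient_descent(sources, distances, start)
print(round(estimate[0], 6), round(estimate[1], 6))  # 1.0 1.0
```

With real measurements the ranges would be noisy and the cost would not reach zero, so in practice the starting point could come from the IMU's perceived position, which also helps the descent avoid a wrong local minimum.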

[0064] As shown, if system 100 generates or otherwise reproduces one or more virtual audio sources 146 (shown in Figures 3 to 6), the algorithm employed by IMU 130 can utilize the known orientation O1 and actual position P1 of wearable audio device 102 relative to audio sources 104A and 104B to correct any drift in the perceived orientation PO or perceived position PP of wearable audio device 102 that occurs during operation of system 100, in order to maintain the accurate orientation and / or position of virtual audio source 146. As will be discussed below, it should be understood that one or more additional parameters can be assumed to obtain or derive more detailed information related to the position, orientation, or height of the wearable audio device, i.e., one or more assumed parameters 152. For example, it can be assumed that the user is on the surface of the earth and that they are approximately 1.8 meters (approximately 5 feet 11 inches) tall. Additionally, it can be assumed, for example, that the physical speakers of audio sources 104A and 104B are positioned approximately 1 meter (approximately 3 feet 3 inches) above the ground. This allows IMU 130 to determine the distance or angle in question more accurately, as changes in elevation between the source and the wearable audio device (which contribute to the distance between the devices) can be taken into account.

[0065] Figures 8 to 9 show an exemplary flowchart illustrating the steps of method 200 according to the present disclosure. Method 200 includes, for example: obtaining a first signal 150A representing a first sound 108A reproduced by a first audio source 104A in an environment E via a first microphone 126 and a second microphone 128 of a wearable audio device 102 (step 202); and deriving the orientation O1 of the wearable audio device 102 relative to the first audio source 104A in the environment E via a processor 112 based on the first signal 150A received at the first microphone 126 and the second microphone 128 (step 204). The determination of orientation can be aided by determining a first distance D1 between the first audio source 104A and the first microphone 126 and a second distance D2 between the first audio source 104A and the second microphone 128 via a processor 112 based on time-of-flight information 148 (step 206); and deriving the orientation O1 of the wearable audio device 102 relative to the first audio source 104A based at least in part on the time-of-flight information 148 (step 208). Optionally, method 200 may include: obtaining a second signal 150B representing a second sound 108B reproduced by a second audio source 104B in the environment E via the first microphone 126 and the second microphone 128 (step 210); obtaining a third signal 150C representing a third sound 108C reproduced by a third audio source 104C in the environment E via the first microphone 126 and the second microphone 128 (step 212); and deriving, via processor 112, at least in part based on the first signal 150A, the second signal 150B, and / or the third signal 150C, the orientation O1, position P1, or height H of the wearable audio device 102 relative to the first audio source 104A, the second audio source 104B, and / or the third audio source 104C (step 214).

[0066] The method may further include generating a first virtual audio source 146 within the environment E via processor 112 (step 216); and preventing or correcting, via processor 112, a drift of the virtual position of the first virtual audio source 146 relative to the first audio source 104A, the drift being caused by the perceived orientation PO of the wearable audio device 102 relative to the first audio source 104A (step 218).

[0067] All definitions defined and used herein should be understood to encompass dictionary definitions, definitions incorporated by reference in documents, and / or the general meaning of the terms defined.

[0068] Unless explicitly stated to the contrary, the indefinite articles “a” and “an” as used herein in the specification and claims shall be understood to mean “at least one / an”.

[0069] As used herein in the specification and claims, the phrase “and / or” should be understood to mean “any one or both elements” of the elements so combined, that is, elements that exist together in some cases and separately in others. Multiple elements listed with “and / or” should be understood in the same way, that is, “one or more elements” of the elements so combined. Other elements may optionally be present, whether related to or unrelated to those explicitly identified by the “and / or” clause.

[0070] As used herein in the specification and claims, “or” should be understood to have the same meaning as “and / or” as defined above. For example, when separating items in a list, “or” or “and / or” should be understood to be inclusive, i.e., including at least one, but also more than one, of a number or list of elements, and optionally additional unlisted items. Only terms clearly indicated to the contrary, such as “only one of” or “exactly one of,” or, when used in the claims, “consisting of,” will refer to the inclusion of exactly one element of a number or list of elements. In general, the term “or” as used herein should be understood to indicate exclusive alternatives (i.e., “one or the other but not both”) only when preceded by terms of exclusivity, such as “either,” “one of,” “only one of,” or “exactly one of.”

[0071] As used herein in the specification and claims, the phrase “at least one” (with regard to a list of one or more elements) should be understood to mean at least one element selected from any one or more elements in the list of elements, but not necessarily including at least one element from every element specifically listed in the list of elements, and does not exclude any combination of elements in the list of elements. This definition also allows for the optional presence of elements other than those explicitly identified in the list of elements referred to by the phrase “at least one,” whether related to or unrelated to those explicitly identified elements.

[0072] It should also be understood that, unless expressly stated to the contrary, in any method claimed herein that includes more than one step or action, the order of the steps or actions of the method is not necessarily limited to the order in which the steps or actions of the method are described.

[0073] In the claims and the foregoing description, all transitional phrases (such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “accommodating,” “constituting,” “made up of,” etc.) shall be understood as open-ended, meaning including but not limited to. Only the transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively.

[0074] The above examples of the described subject matter can be implemented in any of a variety of ways. For example, some aspects can be implemented using hardware, software, or a combination thereof. When any aspect is implemented at least partially in software, the software code can be executed on any suitable processor or set of processors, whether it is provided in a single device or a single computer or distributed among multiple devices / computers.

[0075] This disclosure can be implemented as a system, method, and / or computer program product at any possible level of technical detail integration. A computer program product may include one or more computer-readable storage media having computer-readable program instructions thereon for causing a processor to perform aspects of this disclosure.

[0076] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example, but not limited to, electronic storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or Flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanically encoded devices (such as punched cards or raised structures in a groove having instructions recorded thereon), and any suitable combination of the foregoing. As used herein, computer-readable storage media should not be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses passing through fiber optic cables), or electrical signals transmitted through wires.

[0077] The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to a respective computing / processing device, or to an external computer or external storage device, via a network (e.g., the Internet, a local area network, a wide area network, and / or a wireless network). The network may include copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing / processing device.

[0078] Computer-readable program instructions used to perform the operations of this disclosure may be any of the following: assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuits, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, etc., and procedural programming languages such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer via any type of network (including a local area network (LAN) or a wide area network (WAN)), or the connection may be made to an external computer (e.g., via the Internet using an Internet service provider). In some examples, electronic circuits, including, for example, programmable logic circuits, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), can execute the computer-readable program instructions to perform aspects of this disclosure by using state information of the computer-readable program instructions to personalize the electronic circuits.

[0079] This document describes aspects of the disclosure with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to examples of the disclosure. It should be understood that each block in the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.

[0080] Computer-readable program instructions may be provided to a processor of a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions / actions specified in one or more blocks of a flowchart and / or block diagram. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and / or other device to function in a particular manner, such that the computer-readable storage medium having the instructions stored therein comprises an article of manufacture including instructions for implementing aspects of the functions / actions specified in the flowchart and / or block diagram block or blocks.

[0081] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other equipment to cause a series of operational steps to be performed on the computer, other programmable apparatus, or other equipment to produce a computer-implemented process, such that the instructions, which execute on the computer, other programmable apparatus, or other equipment, implement the functions / actions specified in one or more blocks of a flowchart and / or block diagram.

[0082] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various examples of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of instructions, comprising one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions described in the blocks may occur in a different order than that shown in the drawings. For example, depending on the function involved, two blocks shown consecutively may actually be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order. Each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or that carry out combinations of special-purpose hardware and computer instructions.

[0083] Other implementations are within the scope of the following claims and other claims to which the applicant may be entitled.

[0084] While various examples have been described and illustrated herein, those skilled in the art will readily conceive of a variety of other devices and / or structures for performing the functions described herein and / or obtaining one or more of the results and / or advantages described herein, and each of such variations and / or modifications is considered to be within the scope of the examples described herein. More generally, those skilled in the art will readily understand that all parameters, dimensions, materials, and configurations described herein are intended to be exemplary, and that actual parameters, dimensions, materials, and / or configurations will depend on the specific application or applications in which the teachings are used. Those skilled in the art will recognize, or be able to determine using no more than routine experimentation, many equivalents of the specific examples described herein. Therefore, it should be understood that the foregoing examples are presented by way of example only, and that, within the scope of the appended claims and their equivalents, the examples may be practiced in ways other than those specifically described and claimed. The examples of this disclosure relate to each individual feature, system, article of manufacture, material, kit, and / or method described herein. Furthermore, any combination of two or more such features, systems, articles of manufacture, materials, kits, and / or methods is included within the scope of this disclosure if such features, systems, articles of manufacture, materials, kits, and / or methods do not contradict each other.

Claims

1. A wearable audio device, comprising: a first microphone and a second microphone, the first microphone and the second microphone being configured to obtain a first signal representing a first sound reproduced in the environment by a first audio source; an inertial measurement unit; and a processor configured to derive the orientation of the wearable audio device relative to the first audio source in the environment, based at least in part on the first signal received at the first microphone and the second microphone; wherein the processor is also configured to determine the perceived orientation of the wearable audio device based at least in part on the inertial measurement unit; and wherein the processor is further configured to generate a first virtual audio source within the environment, and to prevent or correct a drift of the virtual position of the first virtual audio source relative to the first audio source, the drift being caused by the perceived orientation of the wearable audio device relative to the first audio source.

2. The wearable audio device of claim 1, wherein the processor is configured to: determine a first distance between the first audio source and the first microphone and a second distance between the first audio source and the second microphone based on time-of-flight information, and derive the orientation of the wearable audio device relative to the first audio source based at least in part on the time-of-flight information.
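
As an illustrative aside on claim 2, the following minimal sketch shows one way time-of-flight values could be turned into two microphone distances and then into a bearing toward the first audio source. It is not taken from the disclosure: the function names, the 343 m/s speed of sound, the far-field (plane-wave) approximation, and the sign convention (positive angle when the source is nearer the first microphone) are all assumptions made for illustration.

```python
import math

# Assumed speed of sound in air at roughly 20 degrees C (not stated in the source).
SPEED_OF_SOUND_M_S = 343.0


def tof_to_distance(tof_seconds, speed=SPEED_OF_SOUND_M_S):
    """Convert a one-way time of flight into a source-to-microphone distance."""
    return tof_seconds * speed


def bearing_from_mic_distances(first_distance, second_distance, mic_spacing):
    """Estimate the bearing (radians) of the source relative to the device.

    Hypothetical far-field model: the path difference between the two
    microphones equals mic_spacing * sin(bearing). The ratio is clamped so
    that measurement noise cannot push it outside the domain of asin.
    """
    ratio = (second_distance - first_distance) / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))
    return math.asin(ratio)


# Example: two time-of-flight readings taken 0.15 m apart (assumed spacing).
first = tof_to_distance(0.005831)   # about 2.000 m
second = tof_to_distance(0.006050)  # about 2.075 m
angle = bearing_from_mic_distances(first, second, mic_spacing=0.15)
```

A bearing obtained this way is independent of the inertial measurement unit, which is what makes it usable as a reference when checking the IMU-perceived orientation for drift.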

3. The wearable audio device of claim 1, wherein the first microphone and the second microphone are configured to obtain a second signal representing a second sound reproduced by a second audio source within the environment, and wherein the processor is further configured to derive the position of the wearable audio device relative to the first audio source and the second audio source.

4. The wearable audio device of claim 3, wherein the first microphone and the second microphone are configured to obtain a third signal representing a third sound reproduced by a third audio source in the environment, and wherein the processor is further configured to derive the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source, at least in part based on the first signal, the second signal, and / or the third signal.

5. The wearable audio device of claim 1, wherein the processor is configured to determine the orientation and position of the wearable audio device using a gradient descent algorithm or a gradient ascent algorithm, the gradient descent algorithm or gradient ascent algorithm utilizing time-of-flight information from the first signal received at the first microphone and the second microphone, the second signal generated by the second audio source and / or the third signal generated by the third audio source.
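
As an illustrative aside on claim 5, the sketch below shows how a gradient descent algorithm could refine a planar position and orientation from time-of-flight distances. It is a simplified model under stated assumptions, not the claimed implementation: the pose is two-dimensional (x, y, heading), three audio sources sit at known positions, the two microphones sit 0.075 m either side of the device center, and the cost is the sum of squared differences between predicted and measured distances. All names, the step size, and the iteration count are hypothetical. The IMU-perceived pose is used as the starting guess.

```python
import math


def predicted_ranges(pose, mic_offsets, beacons):
    """Distances from each microphone to each audio source for a given pose."""
    x, y, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    ranges = []
    for ox, oy in mic_offsets:
        mx, my = x + ox * c - oy * s, y + ox * s + oy * c
        ranges.append([math.hypot(mx - bx, my - by) for bx, by in beacons])
    return ranges


def refine_pose(initial_pose, mic_offsets, beacons, measured,
                step=0.1, iterations=20000):
    """Plain gradient descent on 0.5 * sum((predicted - measured) ** 2)."""
    x, y, theta = initial_pose
    for _ in range(iterations):
        c, s = math.cos(theta), math.sin(theta)
        gx = gy = gt = 0.0
        for (ox, oy), row in zip(mic_offsets, measured):
            # Microphone position in the room and its derivative w.r.t. heading.
            mx, my = x + ox * c - oy * s, y + ox * s + oy * c
            dmx, dmy = -ox * s - oy * c, ox * c - oy * s
            for (bx, by), d in zip(beacons, row):
                ex, ey = mx - bx, my - by
                r = math.hypot(ex, ey)
                if r == 0.0:
                    continue  # gradient undefined exactly at a source
                k = (r - d) / r
                gx += k * ex
                gy += k * ey
                gt += k * (ex * dmx + ey * dmy)
        x -= step * gx
        y -= step * gy
        theta -= step * gt
    return x, y, theta


# Hypothetical room layout and device geometry (assumptions for illustration).
BEACONS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
MIC_OFFSETS = [(0.0, 0.075), (0.0, -0.075)]
TRUE_POSE = (1.5, 1.0, 0.4)

measured = predicted_ranges(TRUE_POSE, MIC_OFFSETS, BEACONS)  # noise-free
drifted_imu_pose = (1.7, 0.8, 0.6)
estimate = refine_pose(drifted_imu_pose, MIC_OFFSETS, BEACONS, measured)
```

Because the microphones are close together, the cost changes far more slowly with heading than with position, so a single fixed step size needs many iterations for the heading to settle. A practical system might scale the heading step separately or use a second-order method; that choice is outside what the source specifies.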

6. A wearable audio device, comprising: a first microphone configured to receive a first signal representing a first sound reproduced by a first audio source in an environment and a second signal representing a second sound reproduced by a second audio source in the environment; an inertial measurement unit; and a processor configured to derive the position of the wearable audio device relative to the first audio source and the second audio source, at least in part, based on the first signal and the second signal; wherein the processor is further configured to determine the perceived orientation of the wearable audio device based at least in part on the inertial measurement unit; and wherein the processor is configured to generate a first virtual audio source within the environment, and to prevent or correct a drift of the virtual position of the first virtual audio source relative to the first audio source and the second audio source, the drift being caused by the perceived orientation of the wearable audio device relative to the first audio source and the second audio source.

7. The wearable audio device of claim 6, wherein the processor is configured to: determine a first distance between the first audio source and the first microphone and a second distance between the second audio source and the first microphone based on time-of-flight information, and derive the position of the wearable audio device relative to the first audio source and the second audio source based at least in part on the time-of-flight information.
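
As an illustrative aside on claim 7, the sketch below shows how two time-of-flight distances measured at a single microphone could yield a planar position relative to two audio sources at known locations. It is an assumption-laden example, not the claimed method: the geometry is two-dimensional, the function names are hypothetical, and the use of an IMU-predicted position to choose between the two mirror-image candidates is an illustrative choice that the source does not prescribe.

```python
import math


def intersect_range_circles(source_a, dist_a, source_b, dist_b):
    """Return the 0 or 2 points lying at dist_a from source_a and dist_b from source_b."""
    ax, ay = source_a
    bx, by = source_b
    base = math.hypot(bx - ax, by - ay)
    # No solution if the sources coincide or the circles cannot meet.
    if base == 0.0 or base > dist_a + dist_b or base < abs(dist_a - dist_b):
        return []
    # Distance from source_a to the foot of the perpendicular on the baseline.
    along = (dist_a ** 2 - dist_b ** 2 + base ** 2) / (2.0 * base)
    height = math.sqrt(max(0.0, dist_a ** 2 - along ** 2))
    ux, uy = (bx - ax) / base, (by - ay) / base
    fx, fy = ax + along * ux, ay + along * uy
    return [(fx - height * uy, fy + height * ux),
            (fx + height * uy, fy - height * ux)]


def pick_nearest(candidates, reference):
    """Choose the candidate closest to a reference, e.g. an IMU-predicted position."""
    return min(candidates,
               key=lambda p: math.hypot(p[0] - reference[0], p[1] - reference[1]))


# Two sources 4 m apart, each measured at 2.5 m from the microphone.
candidates = intersect_range_circles((0.0, 0.0), 2.5, (4.0, 0.0), 2.5)
position = pick_nearest(candidates, reference=(1.9, 1.2))
```

Two distances alone leave a mirror ambiguity across the line joining the sources, which is one reason a further audio source or prior information is helpful in practice.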

8. The wearable audio device of claim 6, wherein the wearable audio device includes a second microphone configured to acquire the first signal and the second signal within the environment, and the processor is further configured to derive the orientation of the wearable audio device relative to the first audio source and the second audio source, at least in part, based on the first signal and the second signal.

9. The wearable audio device of claim 8, wherein the first microphone and the second microphone are configured to obtain a third signal representing a third sound reproduced by a third audio source in the environment, and wherein the processor is further configured to derive the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source, at least in part based on the first signal, the second signal, and / or the third signal.

10. The wearable audio device of claim 9, wherein the processor is configured to determine the orientation and position of the wearable audio device at least in part based on the first signal, the second signal and / or the third signal using a gradient descent algorithm or a gradient ascent algorithm, the gradient descent algorithm or gradient ascent algorithm utilizing time-of-flight information from the first signal received at the first microphone and the second microphone, the second signal generated by the second audio source and / or the third signal generated by the third audio source.

11. A method for determining the orientation of a wearable audio device, the method comprising: obtaining, via a first microphone and a second microphone of a wearable audio device, a first signal representing a first sound reproduced in the environment by a first audio source; and deriving, by a processor, the orientation of the wearable audio device relative to the first audio source in the environment based on the first signal received at the first microphone and the second microphone; wherein the wearable audio device includes an inertial measurement unit, and the method further comprises: determining, by the processor, the perceived orientation of the wearable audio device, at least in part, based on the inertial measurement unit; generating, by the processor, a first virtual audio source within the environment; and preventing or correcting, by the processor, the drift of the virtual position of the first virtual audio source relative to the first audio source, the drift being caused by the perceived orientation of the wearable audio device relative to the first audio source.

12. The method of claim 11, further comprising: determining, by the processor, a first distance between the first audio source and the first microphone, and a second distance between the first audio source and the second microphone, based on time-of-flight information; and deriving the orientation of the wearable audio device relative to the first audio source at least in part based on the time-of-flight information.

13. The method of claim 11, further comprising: obtaining, via the first microphone and the second microphone, a second signal representing a second sound reproduced by a second audio source within the environment; and deriving, by the processor, the position of the wearable audio device relative to the first audio source and the second audio source, at least in part, based on the first signal and / or the second signal.

14. The method of claim 13, further comprising: obtaining, via the first microphone and the second microphone, a third signal representing a third sound reproduced by a third audio source within the environment; and deriving, by the processor, the height of the wearable audio device relative to the first audio source, the second audio source, and / or the third audio source, at least in part, based on the first signal, the second signal, and / or the third signal.