Voice translation system and method based on Bluetooth earphones

JP7880164B2 · Active · Publication Date: 2026-06-25 · SHENZHEN TIMEKETTLE TECH CO LTD

Patent Information

Authority / Receiving Office: JP
Patent Type: Patents
Current Assignee / Owner: SHENZHEN TIMEKETTLE TECH CO LTD
Filing Date: 2023-08-30
Publication Date: 2026-06-25

AI Technical Summary

Benefits of technology

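[0011] The present invention provides a voice translation system and a voice translation method using Bluetooth earphones. The system identifies whether the first audio signal and the second audio signal originate from the same sound source and, for signals that do, determines which specific sound source they come from. It then sets a corresponding gain factor for each signal: when the first and second audio signals both come from the user wearing the first translation Bluetooth earphone, the first gain factor G1 = 1 and the second gain factor G2 = 0; when both come from the user wearing the second translation Bluetooth earphone, G1 = 0 and G2 = 1. The gain-adjusted first and second audio signals are sent to a translator or a cloud translation engine for translation. This resolves the problem of confused recognition and translation, makes communication more efficient, and greatly reduces unnecessary speech recognition and translation.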
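
The gain rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `apply_gains` and the integer `speaker` argument are assumptions made for the example and do not appear in the patent.

```python
import numpy as np

def apply_gains(x1, x2, speaker):
    """Scale the two earphone signals by (G1, G2) for the detected speaker.

    speaker is 1 when the wearer of the first earphone is talking and 2 when
    the wearer of the second earphone is talking (hypothetical encoding).
    """
    gains = {1: (1.0, 0.0), 2: (0.0, 1.0)}  # (G1, G2) pairs from the text
    if speaker not in gains:
        raise ValueError("speaker must be 1 or 2")
    g1, g2 = gains[speaker]
    return g1 * np.asarray(x1, dtype=float), g2 * np.asarray(x2, dtype=float)

# Usage: the first wearer is speaking, so only the first signal is passed on.
y1, y2 = apply_gains([0.5, -0.25], [0.1, 0.2], speaker=1)
```

Because one gain is always 0, only one of the two microphone signals reaches the translator, which is how the duplicate recognition described above is avoided.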



Abstract

The present invention provides a voice translation system based on Bluetooth earphones, comprising a first translation Bluetooth earphone, a second translation Bluetooth earphone, an audio signal processing center, and a translation module. The audio signal processing center includes a Fourier transform module, a signal cross-correlation processing module, a determination module, and a gain module. The first and second audio signals, collected by the first and second translation Bluetooth earphones respectively, undergo time-frequency signal processing and signal cross-correlation processing to determine the position of the sound source, and gain is then applied to the first and second audio signals before the translation module recognizes and translates them. The present invention also provides a voice translation method using Bluetooth earphones. The voice translation system resolves the problem of confused recognition and translation, improves communication efficiency, and significantly reduces unnecessary speech recognition and translation.

Claims

1. A voice translation system based on Bluetooth earphones, comprising:
a first translation Bluetooth earphone and a second translation Bluetooth earphone, worn respectively by users who are communicating with each other;
an audio signal processing center comprising a Fourier transform module, a signal cross-correlation processing module, a determination module, and a gain module, wherein the first audio signal and the second audio signal, acquired respectively by the first translation Bluetooth earphone and the second translation Bluetooth earphone, are transmitted to the Fourier transform module for time-frequency signal processing; the signal cross-correlation processing module performs signal cross-correlation processing on the first audio signal and the second audio signal; the determination module determines, based on the magnitude of the signal cross-correlation value of the first audio signal and the second audio signal, whether the two signals originate from the same sound source, the two signals being determined to be same-source signals when the signal cross-correlation value lies in the range (0.7, 1); when the first audio signal and the second audio signal originate from the same sound source, the position of the sound source is determined from the time delay relationship between the two signals; and the gain module sets the gain factor G1 of the first audio signal and the gain factor G2 of the second audio signal based on the acquired position of the sound source, such that G1 = 1 and G2 = 0 when both audio signals are audio signals from the user wearing the first translation Bluetooth earphone, and G1 = 0 and G2 = 1 when both audio signals are audio signals from the user wearing the second translation Bluetooth earphone; and
a translation module that, after the first audio signal and the second audio signal have been processed by the gain module, performs recognition and translation, and transmits the translated audio signal in association with the first translation Bluetooth earphone or the second translation Bluetooth earphone.
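
As a rough illustration of how the Fourier transform and cross-correlation modules of claim 1 could fit together, the sketch below computes the cross-correlation in the frequency domain and applies the 0.7 lower bound. The function names, the normalization by signal energy, and the use of NumPy are assumptions made for the example; the claim does not specify how the cross-correlation value is normalized.

```python
import numpy as np

SAME_SOURCE_THRESHOLD = 0.7  # lower bound of the (0.7, 1) range in the claim

def xcorr_fft(x1, x2):
    """Full cross-correlation c[k] = sum_n x1[n + k] * x2[n], computed via FFT.

    Returns (lags, c) with lags running from -(len(x2) - 1) to len(x1) - 1.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = len(x1) + len(x2) - 1        # zero-pad so circular equals linear correlation
    spectrum = np.fft.rfft(x1, n) * np.conj(np.fft.rfft(x2, n))
    c = np.fft.irfft(spectrum, n)
    c = np.roll(c, len(x2) - 1)      # move the negative lags to the front
    lags = np.arange(-(len(x2) - 1), len(x1))
    return lags, c

def same_source(x1, x2, threshold=SAME_SOURCE_THRESHOLD):
    """Return (decision, peak); peak is the energy-normalized correlation maximum."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    _, c = xcorr_fft(x1, x2)
    norm = np.sqrt(np.sum(x1 ** 2) * np.sum(x2 ** 2))
    if norm == 0.0:                  # silence on either channel: nothing to compare
        return False, 0.0
    peak = float(np.max(c) / norm)   # cannot exceed 1, up to rounding
    return peak > threshold, peak

# Usage: the second channel is an attenuated, 5-sample-delayed copy of the first.
rng = np.random.default_rng(0)
s = rng.standard_normal(512)
decision, peak = same_source(s, 0.5 * np.concatenate((np.zeros(5), s[:-5])))
```

Normalizing by the two signal energies keeps the peak at or below 1 regardless of attenuation, which is one way to make a fixed range such as (0.7, 1) meaningful for signals of different loudness.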

2. The voice translation system according to claim 1, wherein the audio signal processing center further comprises a signal amplitude detection module that detects the signal amplitudes of the first audio signal and the second audio signal, and wherein, when the first audio signal and the second audio signal are determined to originate from the same sound source: if the signal amplitude of the first audio signal is greater than the signal amplitude of the second audio signal, both the first audio signal and the second audio signal are audio signals from the user wearing the first translation Bluetooth earphone; and if the signal amplitude of the second audio signal is greater than the signal amplitude of the first audio signal, both the first audio signal and the second audio signal are audio signals from the user wearing the second translation Bluetooth earphone.
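
The amplitude comparison of claim 2 might look like the following sketch. The claim speaks only of "signal amplitude"; measuring it as the RMS level of a frame, the function name `speaker_by_amplitude`, and returning `None` on a tie are all assumptions made for the example.

```python
import numpy as np

def speaker_by_amplitude(x1, x2):
    """Return 1 or 2 for the earphone with the larger RMS level, None on a tie.

    RMS is an assumed amplitude measure; the claim does not name one.
    """
    rms1 = float(np.sqrt(np.mean(np.square(np.asarray(x1, dtype=float)))))
    rms2 = float(np.sqrt(np.mean(np.square(np.asarray(x2, dtype=float)))))
    if rms1 > rms2:
        return 1   # louder at the first earphone: first wearer is speaking
    if rms2 > rms1:
        return 2   # louder at the second earphone: second wearer is speaking
    return None    # equal levels: the claim does not cover this case

# Usage: the same utterance is much stronger at the first earphone.
who = speaker_by_amplitude([0.9, -0.8, 0.7], [0.1, -0.1, 0.1])
```

The intuition is that a wearer's own microphone sits closest to their mouth, so the same utterance arrives with less attenuation there than at the other wearer's earphone.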

3. A voice translation method based on Bluetooth earphones, for use with the voice translation system based on Bluetooth earphones according to claim 1 or 2, the method comprising the steps of:
performing a Fourier transform on the first audio signal and the second audio signal, collected respectively by the first translation Bluetooth earphone and the second translation Bluetooth earphone, for time-frequency signal processing;
performing cross-correlation processing on the first audio signal and the second audio signal after the time-frequency signal processing to obtain the cross-correlation value of the two signals, and determining, based on the magnitude of the cross-correlation value, whether the first audio signal and the second audio signal originate from the same sound source, the two signals being determined to be from the same sound source when the cross-correlation value lies in the range (0.7, 1);
when the first audio signal and the second audio signal are determined to originate from the same sound source, determining the position of the sound source based on the time delay relationship between the first audio signal and the second audio signal;
setting the gain factor G1 of the first audio signal and the gain factor G2 of the second audio signal based on the position information of the sound source, such that G1 = 1 and G2 = 0 when both audio signals are audio signals from the user wearing the first translation Bluetooth earphone, and G1 = 0 and G2 = 1 when both audio signals are audio signals from the user wearing the second translation Bluetooth earphone; and
applying the gains to the first audio signal and the second audio signal, outputting them to a translator or a cloud translation engine for recognition and translation, and transmitting the translated audio signal in association with the first translation Bluetooth earphone or the second translation Bluetooth earphone.

4. The voice translation method based on Bluetooth earphones according to claim 3, wherein, in the step of determining whether the first audio signal and the second audio signal originate from the same sound source:
the cross-correlation function between the first audio signal and the second audio signal satisfies [Math 20], where x1(t) is the signal propagation model of the first audio signal and satisfies [Math 21], and x2(t) is the signal propagation model of the second audio signal and satisfies [Math 22];
here, t represents time; s(·) represents the sound source model; n1(·) and n2(·) represent the noise models; x1(·) and x2(·) represent the signal models received by the first translation Bluetooth earphone and the second translation Bluetooth earphone, respectively; τ1 and τ2 represent the times taken for the sound to propagate from the source to the first and second translation Bluetooth earphones, respectively; α and β represent the energy attenuation factors when the sound propagates from the source to the first and second translation Bluetooth earphones, respectively; and τ represents the signal propagation delay;
assuming that the noise signals are uncorrelated with the audio signal and with each other, the cross-correlation function between the first audio signal and the second audio signal satisfies [Math 23]; and
in equation (4), the value of [Math 24] is calculated, and when the value of [Math 25] lies in the range (0.7, 1), the first audio signal and the second audio signal are audio signals from the same sound source.
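
The equation images referenced in this claim are not reproduced in this text. Using only the symbol definitions given in the claim, the standard two-microphone delay model would read roughly as follows. This is an assumed reconstruction for orientation, not the patent's own typeset equations; the expectation operator E, the source autocorrelation R_ss, and the normalized value ρ are notation introduced here.

```latex
% Assumed reconstruction from the symbol definitions in the claim.
\begin{align*}
x_1(t) &= \alpha\, s(t-\tau_1) + n_1(t), \\
x_2(t) &= \beta\, s(t-\tau_2) + n_2(t), \\
R_{x_1 x_2}(\tau) &= E\!\left[x_1(t)\, x_2(t-\tau)\right] \\
  &= \alpha\beta\, E\!\left[s(t-\tau_1)\, s(t-\tau-\tau_2)\right]
     \quad \text{(noise cross terms vanish when uncorrelated)} \\
  &= \alpha\beta\, R_{ss}\!\left(\tau-(\tau_1-\tau_2)\right),
     \qquad R_{ss}(u) = E\!\left[s(t)\, s(t-u)\right], \\
\rho &= \frac{\max_{\tau} R_{x_1 x_2}(\tau)}
            {\sqrt{R_{x_1 x_1}(0)\, R_{x_2 x_2}(0)}}
     \quad \text{(one possible normalized value to compare with } (0.7,\, 1)\text{)}.
\end{align*}
```

Under this model the attenuation factors α and β cancel in ρ when noise is negligible, so the same-source test depends on waveform similarity rather than on loudness.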

5. The voice translation method based on Bluetooth earphones according to claim 4, wherein, in the step of determining the position of the sound source based on the time delay relationship between the first audio signal and the second audio signal: equation (4) is converted to [Math 26]; from the properties of the correlation function, [Math 27] is satisfied, and the time delay τ = τ1 − τ2 between the first audio signal and the second audio signal is calculated; if the time delay τ is positive, both the first audio signal and the second audio signal are audio signals from the user wearing the second translation Bluetooth earphone; and if the time delay τ is negative, both the first audio signal and the second audio signal are audio signals from the user wearing the first translation Bluetooth earphone.
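
The sign test of claim 5 can be illustrated with a small sketch that takes the lag of the cross-correlation peak as the estimate of τ = τ1 − τ2. The function names, the sample-based delay, and the `delayed` helper used to build test signals are hypothetical and serve only the example; treating a zero delay as undecided is also an assumption, since the claim covers only the positive and negative cases.

```python
import numpy as np

def estimate_delay(x1, x2):
    """Lag (in samples) at which the cross-correlation of x1 and x2 peaks.

    With c[k] = sum_n x1[n + k] * x2[n], the peak lag estimates tau1 - tau2.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    c = np.correlate(x1, x2, mode="full")
    lags = np.arange(-(len(x2) - 1), len(x1))
    return int(lags[np.argmax(c)])

def speaker_from_delay(delay):
    """Positive delay: second wearer; negative: first wearer; zero: undecided."""
    if delay > 0:
        return 2   # sound reached the second earphone first
    if delay < 0:
        return 1   # sound reached the first earphone first
    return None

def delayed(s, d):
    """Hypothetical helper: delay s by d samples, zero-filling the start."""
    return np.concatenate((np.zeros(d), s[: len(s) - d]))

# Usage: the source is 7 samples from earphone 1 and 2 samples from earphone 2.
rng = np.random.default_rng(1)
s = rng.standard_normal(400)
x1 = 0.4 * delayed(s, 7)   # alpha = 0.4, tau1 = 7 samples
x2 = 1.0 * delayed(s, 2)   # beta = 1.0, tau2 = 2 samples
who = speaker_from_delay(estimate_delay(x1, x2))
```

A positive τ means the sound took longer to reach the first earphone than the second, so the source is nearer the second wearer, which matches the assignment in the claim.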