An integrated microphone device for speech translation

By designing an integrated microphone device that supports switching between singing/speaking and translation modes, efficient cross-language communication with multiple people listening and translating simultaneously is achieved. This solves the problem that existing devices cannot meet the needs of multiple people listening and translating at the same time, and improves the adaptability of the device and the user experience.

CN224385633U · Active · Publication Date: 2026-06-19 · SHENZHEN TEANA TECH CO LTD


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Utility model (China)
Current Assignee / Owner: SHENZHEN TEANA TECH CO LTD
Filing Date: 2025-06-14
Publication Date: 2026-06-19


IPC: H04R1/08

AI Tagging

Application Domain

Mouthpiece/microphone attachments


Abstract

This application relates to the field of audio devices, and more particularly to an integrated microphone device for speech translation, including a pickup module and a sound output module. The sound output module includes a Bluetooth communication unit, an audio processing unit, an audio playback unit, and a mode switching switch. The audio processing unit receives and processes the raw audio data collected by the microphone unit of the pickup module. The microphone device has a singing/speaking mode and a translation mode, and the mode switching switch switches the device between them. When the microphone device is in translation mode, the raw audio data is transmitted via the Bluetooth communication unit to an external mobile terminal for translation to obtain translated audio data. The translated audio data is transmitted back via the Bluetooth communication unit to the audio processing unit for processing. The audio playback unit is connected to the audio processing unit and is used to receive and play the translated audio data. This application improves communication efficiency in scenarios where multiple people listen to a translation at the same time.


Description

Technical Field

[0001] This application relates to the field of audio devices, and more particularly to an integrated microphone device for speech translation.

Background Technology

[0002] Currently, voice interaction technology plays an increasingly important role in people's lives and work. In numerous scenarios such as multilingual communication, remote conferencing, education, and live streaming, the demand for efficient and convenient voice interaction is growing daily. In the past, various conventional methods were used to solve voice interaction-related problems. Traditional microphone equipment, such as karaoke microphones and conference microphones, primarily functioned for voice capture and real-time playback. These devices focused on clear sound pickup and instant transmission, but lacked translation capabilities and could only be used within the same language environment.

[0003] However, existing translation devices, such as the iFlytek Translator and Google Translate headphones, while possessing translation functions, mostly rely on headphones to output the translation results. This method can only meet the listening needs of a single user and cannot let multiple people hear the translation at the same time. It is therefore unsuitable for teamwork or multi-person communication, which reduces communication efficiency in scenarios where multiple people need to listen to a translation simultaneously.

Utility Model Content

[0004] To improve communication efficiency in scenarios where multiple people are simultaneously listening and translating, this application provides an integrated microphone device for speech translation.

[0005] This application provides an integrated microphone device for voice translation, which adopts the following technical solution:

[0006] An integrated microphone device for speech translation includes a pickup module and a sound output module. The output end of the pickup module is connected to the input end of the sound output module. The sound output module includes a Bluetooth communication unit, an audio processing unit, an audio playback unit, and a mode switching switch. The audio processing unit is connected to the microphone unit in the pickup module and to the Bluetooth communication unit. The audio processing unit receives and processes the raw audio data collected by the microphone unit. The microphone device has a singing/speaking mode and a translation mode. The mode switching switch is used to switch the microphone device to the singing/speaking mode or the translation mode. When the microphone device is in the translation mode, the raw audio data is transmitted via the Bluetooth communication unit to an external mobile terminal for translation to obtain translated audio data. The translated audio data is transmitted back via the Bluetooth communication unit to the audio processing unit for processing. The audio playback unit is connected to the audio processing unit and receives and plays the translated audio data.

[0007] By adopting the above technical solution, the microphone device combines the functions of a microphone and a speaker. The pickup module can collect raw audio data, and the sound output module can play translated audio data. Furthermore, the mode switching switch can flexibly switch between the singing/speaking mode and the translation mode. In translation mode, the raw audio data can be transmitted to an external mobile terminal for translation and then played back through the device, so that multiple people can listen to the translation at the same time, which improves communication efficiency in such scenarios.

[0008] Optionally, when the microphone device is in translation mode, the Bluetooth communication unit adopts a bidirectional transmission protocol; when the microphone device is in singing/speaking mode, the Bluetooth communication unit adopts a unidirectional transmission protocol.

[0009] By adopting the above technical solution, in translation mode the Bluetooth communication unit uses a bidirectional transmission protocol to carry voice data in both directions and enable playback of the translated audio signal; in singing/speaking mode, the unidirectional transmission protocol meets the requirements of high-quality audio playback. The protocol is thus adapted to the current mode, improving the flexibility and applicability of the device.

[0010] Optionally, the pickup module further includes a mesh head and a sensitivity switch. The microphone unit is disposed inside the mesh head, and the sensitivity switch is used to control the sensitivity of the microphone unit. When the microphone device is in the singing/speaking mode, the sensitivity of the microphone unit is restored to the default value.

[0011] By adopting the above technical solution, the sensitivity of the microphone unit can be controlled by a sensitivity switch to adapt to different ambient noise levels. In the singing/speaking mode, the sensitivity of the microphone unit is restored to the default value, which avoids feedback caused by excessive sensitivity and ensures stable audio acquisition in that mode.

[0012] Optionally, the microphone unit includes any one of a first microphone, a second microphone, and a third microphone; the first microphone has a cardioid polar pattern and is fixedly oriented towards the top of the mesh head; the second microphone has a bidirectional polar pattern and is arranged at a predetermined angle to the first microphone; the third microphone includes two cardioid microphones arranged back-to-back, and the pickup angle of the two cardioid microphones is greater than 90°;

[0013] When the microphone device is in the singing/speaking mode, the first microphone is working, while the second and third microphones are not working; when the microphone device is in the translation mode, the second and third microphones are working, while the first microphone is not working.

[0014] By adopting the above technical solution, the device uses a cardioid first microphone facing upwards together with a bidirectional second microphone or third microphone arranged at a predetermined angle to it, and selects between them according to the current mode. In singing/speaking mode, the first microphone picks up the user's singing voice and close-range speech; in translation mode, the second or third microphone supports non-handheld communication in a relatively quiet environment. This improves the device's adaptability to different usage scenarios.

[0015] Optionally, the pickup module further includes a level comparison unit and an input switch unit connected to the audio processing unit; the level comparison unit is used to compare the audio signal levels output by the two cardioid microphones in the third microphone; the input switch unit is used to close the audio input channel of the cardioid microphone with the lower audio signal level according to the comparison result output by the level comparison unit.

[0016] By adopting the above technical solution, the level comparison unit compares the audio signal levels of the two cardioid microphones in the third microphone in real time, and can detect the difference in signal strength. According to the level comparison result, the input switch unit will automatically turn off the cardioid microphone channel with the lower audio signal level. This adjustment can ensure the quality of the final output signal and reduce unnecessary interference. By dynamically adjusting the audio input channel, the system can ensure that the pickup module always works in the best audio input state.

[0017] Optionally, the microphone device further includes a handle, which is disposed on the side of the sound output module away from the sound pickup module. The handle has a power supply unit inside, which is used to supply power to the sound output module and the sound pickup module.

[0018] By adopting the above technical solution, the microphone device has a handle for easy handheld operation; the power unit inside the handle supplies power to the sound output module and the pickup module, ensuring normal operation of the device and improving its independence and portability.

[0019] Optionally, the microphone device further includes a motion posture detection unit, which is used to acquire the posture change value of the handle in real time. If the posture change value is greater than a preset threshold, the motion posture detection unit sends a sentence segmentation signal. After receiving the sentence segmentation signal, the audio processing unit sends a sentence segmentation command to the external mobile terminal, and the external mobile terminal performs the sentence segmentation operation according to the command.

[0020] By adopting the above technical solution, the motion posture detection unit acquires posture change values in real time. When the posture change value is greater than a preset threshold, a sentence segmentation signal is issued. The audio processing unit then sends a sentence segmentation command to the external mobile terminal, enabling the external mobile terminal to perform sentence segmentation operations. This enhances the interactive flexibility and environmental adaptability of the device, and can achieve sentence segmentation based on the user's handle posture changes, making it convenient for use in different communication scenarios.

[0021] Optionally, the handle is a telescopic handle.

[0022] By adopting the above technical solution and using a telescopic handle, users can flexibly adjust the handle length according to actual conditions to suit different conversational distances and ambient noise levels.

[0023] Optionally, the microphone device may further include either an analog signal interface (17) or a digital signal interface (18); the analog signal interface (17) is used to connect an external microphone or audio equipment; the digital signal interface (18) is used to connect professional audio equipment or an extended translation input source.

[0024] By adopting the above technical solutions, the addition of an analog signal interface enables the microphone device to connect to external microphones or audio equipment, enhancing the device's audio input and output capabilities; the addition of a digital signal interface enables the microphone device to connect to professional audio equipment or expand translation input sources, expanding the device's applicability and functionality.

[0025] Optionally, the microphone device further includes a translation module, the translation module comprising:

[0026] A speech recognition unit, whose input end is connected to the output end of the microphone unit and which is used to convert the raw audio data into original language text;

[0027] A text translation unit, whose input end is connected to the output end of the speech recognition unit and which is used to translate the original language text into target language text;

[0028] A text-to-speech unit, whose input end is connected to the output end of the text translation unit and whose output end is connected to the audio processing unit. It is used to convert the target language text into target audio data and transmit the target audio data to the audio processing unit. The audio processing unit then transmits the target audio data to the audio playback unit for playback.

[0029] By adopting the above technical solution, the raw audio data collected by the pickup module is converted into original language text by the speech recognition unit, then translated into target language text by the text translation unit, and finally converted into target audio data by the text-to-speech unit. The target audio data is then transmitted via the audio processing unit to the audio playback unit for playback. This enables the device to translate audio data locally without relying entirely on an external mobile terminal, improving the autonomy and flexibility of the device's translation capabilities.
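The three-stage chain in paragraphs [0026] to [0029] can be sketched as a simple function composition. This is only an illustrative sketch: the class name `TranslationModule`, the toy engines, and the tiny dictionary are hypothetical stand-ins, and the application does not specify any particular speech recognition, translation, or synthesis engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationModule:
    """Hypothetical stand-in for translation module 19."""
    recognize: Callable[[bytes], str]    # speech recognition unit 20
    translate: Callable[[str], str]      # text translation unit 21
    synthesize: Callable[[str], bytes]   # text-to-speech unit 22

    def process(self, raw_audio: bytes) -> bytes:
        """Raw audio -> original language text -> target language text -> target audio."""
        original_text = self.recognize(raw_audio)
        target_text = self.translate(original_text)
        return self.synthesize(target_text)


# Toy engines (assumptions) so the sketch runs without any real speech library.
def toy_recognize(raw_audio: bytes) -> str:
    return raw_audio.decode("utf-8")


TOY_DICTIONARY = {"ni hao": "hello"}


def toy_translate(text: str) -> str:
    # Unknown phrases pass through unchanged.
    return TOY_DICTIONARY.get(text, text)


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


module = TranslationModule(toy_recognize, toy_translate, toy_synthesize)
target_audio = module.process(b"ni hao")  # would be handed to the audio processing unit
```

Keeping each stage behind a plain callable mirrors the unit-to-unit connections in the text and would let a local engine be swapped for the external mobile terminal without changing the chain.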

[0030] In summary, this application includes at least one of the following beneficial technical effects:

[0031] 1. The microphone device combines the functions of a microphone and a speaker. The pickup module can collect raw audio data, and the sound output module can play translated audio data. The mode switching switch can flexibly switch between the singing/speaking mode and the translation mode. In translation mode, the raw audio data can be transmitted to an external mobile terminal for translation and then played back through the device, so that multiple people can listen to the translation at the same time, which improves communication efficiency in such scenarios.

[0032] 2. The sensitivity of the microphone unit can be controlled by the sensitivity switch to adapt to different ambient noise levels. In the singing/speaking mode, the sensitivity of the microphone unit is restored to the default value, which avoids feedback caused by excessive sensitivity and ensures stable audio acquisition in that mode.

[0033] 3. The motion posture detection unit acquires posture change values in real time. When the posture change value is greater than a preset threshold, a sentence segmentation signal is issued. The audio processing unit sends a sentence segmentation command to the external mobile terminal, enabling the external mobile terminal to perform sentence segmentation operations. This enhances the device's interactive flexibility and environmental adaptability. It can achieve sentence segmentation based on changes in the posture of the handle, making it convenient for use in different communication scenarios.

Attached Figure Description

[0034] Figure 1 is a schematic diagram of the overall structure of the microphone device in the embodiments of this application;

[0035] Figure 2 is a schematic diagram of the virtual structure of the pickup module and the sound output module in the embodiments of this application;

[0036] Figure 3 is a schematic diagram of the virtual structure of the pickup module, the sound output module, and the handle in the embodiments of this application;

[0037] Figure 4 is an exploded view of the microphone device in an embodiment of this application;

[0038] Figure 5 is a schematic diagram of the telescopic handle in an embodiment of this application;

[0039] Figure 6 is a cross-sectional view of the handle in an embodiment of this application;

[0040] Figure 7 is a schematic diagram of the virtual structure of the translation module in the embodiments of this application.

[0041] Explanation of reference numerals in the attached figures:

[0042] 1. Sound pickup module; 2. Sound output module; 3. Bluetooth communication unit; 4. Audio processing unit; 5. Audio playback unit; 6. Mode switching switch; 7. Microphone unit; 8. Mesh head; 10. Sensitivity switch; 11. First microphone; 12. Second microphone; 13. Handle; 14. Power supply unit; 15. Motion posture detection unit; 16. Lock hole; 17. Analog signal interface; 18. Digital signal interface; 19. Translation module; 20. Speech recognition unit; 21. Text translation unit; 22. Text-to-speech unit; 23. Sound cavity; 24. Speaker; 25. Stand; 26. Middle frame; 27. Cover; 28. Third microphone; 100. External mobile terminal; 201. Outer tube; 202. Inner tube; 203. U-shaped spring; 205. Circular protrusion.

Detailed Implementation

[0043] The following is a further detailed description of this application.

[0044] This application discloses an integrated microphone device for speech translation.

[0045] Example 1

[0046] Referring to Figure 1, an integrated microphone device for speech translation includes a pickup module 1, a sound output module 2, and a handle 13. The pickup module 1 is mounted above the sound output module 2 to collect the user's voice input; the sound output module 2 is positioned below the pickup module 1 and is responsible for playing back the translated voice output. The handle 13 is mounted on the side of the sound output module 2 away from the pickup module 1, and its internal cavity houses a power supply unit 14 that powers the entire device. The output terminal of the pickup module 1 is connected to the input terminal of the sound output module 2, and both the pickup module 1 and the sound output module 2 are electrically connected to the power supply unit 14 to ensure continuous operation of the device. The user operates the device by holding the handle 13 and pointing the pickup module 1 towards themselves to capture their voice, while the sound output module 2 faces the other party to play back the translation result. This design is ergonomic, portable, and easy to use, and is suitable for cross-language communication scenarios.

[0047] Referring to Figure 1 and Figure 2, the sound output module 2 includes a Bluetooth communication unit 3, an audio processing unit 4, an audio playback unit 5, and a mode switching switch 6. The pickup module 1 includes a microphone unit 7, a mesh head 8, and a sensitivity switch 10. The microphone device supports a singing/speaking mode and a translation mode, which the user can switch between via the mode switching switch 6.

[0048] Singing/Speaking Mode: The device functions as a regular microphone. The raw audio data picked up by the microphone unit 7 is optimized by the audio processing unit 4 and then output directly by the audio playback unit 5, which suits karaoke, speeches, or sound amplification. Specifically, the accompaniment music from the external mobile terminal 100 is transmitted to the audio processing unit 4 via the Bluetooth communication unit 3. Simultaneously, the raw audio data collected by the microphone unit 7 enters the audio processing unit 4 for processing. After processing the accompaniment music and vocals, the audio processing unit 4 plays them together through the audio playback unit 5.

[0049] Translation Mode: The raw audio data is transmitted to an external mobile terminal 100 (such as a smartphone or tablet) via the Bluetooth communication unit 3. A translation application on the mobile terminal performs real-time translation, generating translated audio data, which is then transmitted back to the audio processing unit 4 via Bluetooth. The processed translated audio data is played by the audio playback unit 5, enabling real-time cross-language dialogue. (The device supports an end-to-end latency of ≤ 1 second, meeting near-real-time translation needs.)
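The difference between the two signal paths can be illustrated with a minimal routing sketch. Everything here is an assumption made for illustration: audio is represented as a plain string, `terminal_translate` stands in for the Bluetooth round trip to the external mobile terminal 100, and `play` stands in for the audio playback unit 5.

```python
from enum import Enum
from typing import Callable, List


class Mode(Enum):
    SINGING_SPEAKING = 1
    TRANSLATION = 2


def handle_audio(mode: Mode,
                 raw_audio: str,
                 terminal_translate: Callable[[str], str],
                 play: Callable[[str], None]) -> None:
    """Route one chunk of picked-up audio according to the selected mode."""
    if mode is Mode.TRANSLATION:
        # Raw audio goes out to the terminal; translated audio comes back and is played.
        play(terminal_translate(raw_audio))
    else:
        # Singing/speaking: the picked-up audio is played directly.
        play(raw_audio)


played: List[str] = []


def fake_terminal(audio: str) -> str:
    # Hypothetical placeholder for the translation app on the terminal.
    return f"translated({audio})"


handle_audio(Mode.SINGING_SPEAKING, "voice", fake_terminal, played.append)
handle_audio(Mode.TRANSLATION, "voice", fake_terminal, played.append)
```

The sketch leaves out the accompaniment mixing described for singing/speaking mode; it only shows where the two paths diverge.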

[0050] Specifically, the mesh head 8 serves as an external protective structure to protect internal components and optimize acoustic performance; the microphone unit 7 (located inside the mesh head 8) is responsible for collecting the user's voice signal and converting it into an electrical signal; the sensitivity switch 10 is used to manually adjust the sensitivity of the microphone unit 7 to adapt to different environments (such as noisy or quiet scenes).

[0051] The mesh head 8 is typically made of metal or plastic, providing some protection against dust and debris. It is generally circular or oval in shape, with numerous tiny pores on its surface. These pores allow sound to enter smoothly while also providing some wind protection and filtering. The microphone unit 7 is the core component for sound pickup, converting sound signals into electrical signals.

[0052] Referring to Figure 3 and Figure 4, the microphone unit 7 includes a first microphone 11, a second microphone 12, and a third microphone 28. The first microphone 11 is a cardioid dynamic microphone, which is most sensitive to sound arriving from directly in front of it and can therefore accurately capture a user singing or speaking at close range. Dynamic microphones have a relatively simple structure, good stability, and are relatively affordable. A condenser microphone, which offers higher sensitivity and more detailed sound quality, can be used instead. The second microphone 12 is a bidirectional cylindrical condenser microphone with high sensitivity, arranged at a 90-degree angle to the first microphone 11. In a relatively quiet environment, the device can be placed on a table between two people, without being held, and still achieve good sound pickup. The bidirectional characteristic allows it to capture sound from both the front and the rear simultaneously. An electret condenser microphone can also be used as the second microphone 12, achieving similarly high sensitivity and good pickup performance.

[0053] In one embodiment, the second microphone 12 can be configured as an omnidirectional microphone for receiving ambient sound in a 360° omnidirectional manner, suitable for scenarios such as meetings and multi-person conversations, ensuring that surrounding speech can also be effectively captured. The second microphone 12 (omnidirectional) supplements the capture of the voices of other speakers in the surrounding area, ensuring the continuity of multi-person conversations.

[0054] In one embodiment, the third microphone 28 employs two cardioid microphones positioned opposite each other, with a pickup angle greater than 90° to ensure audio signal reception over a wider area. To further optimize audio acquisition, the pickup module 1 also integrates a level comparison unit and an input switch unit. The level comparison unit performs real-time comparison of the audio signal levels output by the two cardioid microphones, while the input switch unit automatically shuts down the audio input channel of the cardioid microphone with the lower level based on the comparison result, thereby avoiding redundancy and noise and ensuring the clarity and stability of the picked-up signal.

[0055] For example, consider two cardioid microphones positioned opposite each other, designated microphone A and microphone B. When user A speaks while facing microphone A, the audio signal level of microphone A will be higher than that of microphone B. The system therefore identifies user A as the speaker and closes microphone B's input channel. This allows the external mobile terminal 100 to recognize user A's voice and translate it into user B's language. Similarly, when user B speaks and microphone B's level is higher than microphone A's, the system identifies user B as the speaker and closes microphone A's input channel. This level comparison and channel closure mechanism avoids confusion when both microphones pick up sound at the same time. Because the input channel is switched automatically according to who is speaking, the external terminal receives a clear voice input, which supports a smooth and accurate translation process.
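A minimal sketch of the level comparison and channel closure follows. The application does not say how the signal level is measured or how ties are handled, so the RMS measure, the tie rule (channel A stays open), and the function names are all assumptions.

```python
import math
from typing import Sequence, Tuple


def rms_level(samples: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples (assumed level measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_channel(mic_a: Sequence[float],
                   mic_b: Sequence[float]) -> Tuple[str, Sequence[float]]:
    """Keep the louder cardioid channel open and close the quieter one.

    Returns the label of the open channel and its samples.
    On a tie, channel A stays open (an arbitrary assumption).
    """
    if rms_level(mic_a) >= rms_level(mic_b):
        return "A", mic_a
    return "B", mic_b


# User A speaks towards microphone A, so A is louder and B is closed.
open_channel, audio = select_channel([0.5, -0.5, 0.5], [0.1, -0.1, 0.1])
```

A practical comparator would probably add hysteresis or a hold time so the open channel does not flip on every short pause, but the source describes only the basic comparison.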

[0056] Specifically, the sensitivity switch 10 is used to control the sensitivity of the microphone unit 7. It can be a button or a knob. By operating the sensitivity switch 10, the sensitivity of the microphone can be adjusted according to different ambient noise levels. When the ambient noise is low and the signal-to-noise ratio is high, the sensitivity can be increased to pick up voices from a greater distance. When the ambient noise is high and the signal-to-noise ratio is low, the sensitivity should be decreased to pick up only nearby voices, ensuring the purity of the voice. Details are as follows:

[0057] In the singing/speaking mode: the first microphone 11 is working, while the second microphone 12 and the third microphone 28 are not working. The sensitivity of the microphone unit 7 is automatically restored to the default value (such as medium sensitivity, suitable for ordinary speaking or singing scenarios) to ensure stable sound quality, to prevent accidental operation from affecting the output, and to avoid feedback caused by excessive sensitivity.

[0058] In translation mode: the second microphone 12 and / or the third microphone 28 start working, the sensitivity switch 10 becomes available again, and the user can adjust the pickup sensitivity according to the ambient noise to improve translation accuracy.
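The per-mode behaviour of the microphones and the sensitivity switch can be summarised in a small state sketch. The 0 to 10 sensitivity scale, the default value of 5, and the class and method names are hypothetical; the text also allows the second and/or third microphone in translation mode, whereas this sketch simply reports both.

```python
from dataclasses import dataclass
from typing import Tuple

DEFAULT_SENSITIVITY = 5  # hypothetical "medium" value on an assumed 0-10 scale
MODES = ("singing/speaking", "translation")


@dataclass
class PickupState:
    mode: str = "singing/speaking"
    sensitivity: int = DEFAULT_SENSITIVITY

    def active_microphones(self) -> Tuple[str, ...]:
        """Which microphones work in the current mode."""
        if self.mode == "singing/speaking":
            return ("first",)
        return ("second", "third")

    def set_mode(self, mode: str) -> None:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        if mode == "singing/speaking":
            # Sensitivity is restored to the default to avoid feedback.
            self.sensitivity = DEFAULT_SENSITIVITY

    def adjust_sensitivity(self, value: int) -> bool:
        """Apply the sensitivity switch; only effective in translation mode."""
        if self.mode != "translation":
            return False
        self.sensitivity = max(0, min(10, value))
        return True


state = PickupState()
ignored = state.adjust_sensitivity(9)   # ignored in singing/speaking mode
state.set_mode("translation")
applied = state.adjust_sensitivity(9)   # accepted in translation mode
```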

[0059] The Bluetooth communication unit 3 is a key component enabling communication between the device and the external mobile terminal 100, and it supports multiple Bluetooth protocols. When the microphone device is in translation mode, the Bluetooth communication unit 3 uses a bidirectional transmission protocol; when the microphone device is in singing/speaking mode, the Bluetooth communication unit 3 uses a unidirectional transmission protocol. For example, when the device is in translation mode, the Bluetooth communication unit 3 can use either a call mode (HFP) based on the classic Bluetooth protocol or a bidirectional audio streaming mode based on the Bluetooth Low Energy Audio protocol (LE Audio).

[0060] HFP mode is suitable for real-time calls and translation scenarios. It provides bidirectional transmission with an end-to-end latency of ≤250 ms and a bidirectional latency of ≤500 ms.

[0061] LE Audio mode integrates the LC3 codec and is suited to low-power translation scenarios. It provides bidirectional transmission with an end-to-end latency of ≤200 ms and a bidirectional latency of ≤400 ms.

[0062] When the device is in the singing/speaking mode, the Bluetooth communication unit 3 adopts the classic Bluetooth audio mode (A2DP). This mode is mainly used for one-way music playback, with an end-to-end latency of ≤300 ms, and its one-way transmission suits the singing/speaking mode.
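The mode-to-profile mapping and the latency figures in paragraphs [0059] to [0062] can be tabulated as follows. This is a sketch, not a Bluetooth stack: the function names are hypothetical, and preferring LE Audio over HFP whenever it is available is an assumption, since the text only says either may be used. The bidirectional bounds quoted above are twice the one-way figures, which the helper reproduces.

```python
from typing import Optional

# One-way latency bounds in milliseconds, taken from the paragraphs above.
PROFILES = {
    "HFP":      {"bidirectional": True,  "one_way_latency_ms": 250},
    "LE Audio": {"bidirectional": True,  "one_way_latency_ms": 200},
    "A2DP":     {"bidirectional": False, "one_way_latency_ms": 300},
}


def select_profile(mode: str, le_audio_supported: bool) -> str:
    """Pick a transport for the current mode (preference order is an assumption)."""
    if mode == "translation":
        return "LE Audio" if le_audio_supported else "HFP"
    if mode == "singing/speaking":
        return "A2DP"
    raise ValueError(f"unknown mode: {mode}")


def round_trip_bound_ms(profile: str) -> Optional[int]:
    """Upper bound for out-and-back transmission; None for one-way profiles."""
    info = PROFILES[profile]
    if not info["bidirectional"]:
        return None
    return 2 * info["one_way_latency_ms"]


profile = select_profile("translation", le_audio_supported=False)
bound = round_trip_bound_ms(profile)
```

These bounds cover transmission only; the time the terminal spends translating is not included in them.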

[0063] Referring to Figure 3 and Figure 4, a motion posture detection unit 15 is installed on the handle 13. This unit acquires the posture data of the handle 13 in real time and can consist of a three-axis accelerometer and a gyroscope. The three-axis accelerometer measures acceleration in three directions, and the gyroscope measures rotational angular velocity. By jointly analyzing the acceleration and angular velocity data, the posture change of the handle 13 can be determined. The unit can be installed inside the handle 13 near its center of gravity for more accurate detection of posture changes. For example, swinging the handle 13 forward may trigger a Chinese-to-English sentence segmentation, and swinging the handle 13 backward may trigger an English-to-Chinese sentence segmentation. This gives users a more convenient way to interact with the device during operation. The motion posture detection unit 15 is connected to the audio processing unit 4 via a signal line. When the detected posture change value of the handle 13 exceeds the preset threshold, the motion posture detection unit 15 sends a sentence segmentation signal to the audio processing unit 4. After receiving the signal, the audio processing unit 4 sends a sentence segmentation command to the external mobile terminal 100, and the external mobile terminal 100 performs the sentence segmentation operation according to the command.
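The threshold check can be sketched as below. The application does not define how the posture change value is computed from the accelerometer and gyroscope data, so this sketch assumes it has already been reduced to one signed number (positive for a forward swing, negative for a backward swing). The threshold value, units, and command strings are hypothetical.

```python
from typing import Optional

PRESET_THRESHOLD = 1.5  # hypothetical value in arbitrary units


def segmentation_signal(posture_change: float,
                        threshold: float = PRESET_THRESHOLD) -> Optional[str]:
    """Return a sentence segmentation command, or None below the threshold.

    The signal is issued only when the magnitude of the change is strictly
    greater than the preset threshold, matching "greater than" in the text.
    """
    if abs(posture_change) <= threshold:
        return None
    # Forward swing: Chinese-to-English; backward swing: English-to-Chinese.
    return "segment:zh->en" if posture_change > 0 else "segment:en->zh"


# Small hand tremor is ignored; a deliberate forward swing produces a command.
tremor = segmentation_signal(0.4)
forward = segmentation_signal(2.0)
```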

[0064] Referring to Figure 5, the microphone device also includes an analog signal interface 17 and a digital signal interface 18. The analog signal interface 17 is used to connect an external microphone or audio equipment, and it can be a common 3.5 mm headphone jack. Through this interface, external audio devices can be connected to the device, expanding its audio input and output capabilities. For example, a professional external microphone can be connected to improve the quality of audio input, or an external audio device can be connected to achieve a wider sound propagation range. The digital signal interface 18 is used to connect professional audio equipment or additional translation input sources, and it can be a USB-C interface or an SPDIF interface. The USB-C interface has a fast transmission speed and strong compatibility, and can connect to various audio devices that support USB-C. The SPDIF interface is often used for high-quality digital audio transmission and is suitable for occasions with high sound quality requirements. Both the analog signal interface 17 and the digital signal interface 18 are located on the device's casing for easy plugging and unplugging. They are connected to the audio processing unit 4 through internal wiring, so that signals input from external devices can be transmitted to the audio processing unit 4 for processing, and the processed signals can also be output to external devices through these interfaces.

[0065] Referring to Figure 3, the audio playback unit 5 includes a sound cavity 23, two identical speakers 24 symmetrically arranged facing each other on the side walls of the sound cavity 23, a bracket 25 positioned between the speakers 24 and the side walls of the sound cavity 23 to support the speakers 24, a middle frame 26 fitted over the outside of the sound cavity 23, and two identical covers 27 symmetrically arranged facing each other on the middle frame 26. The sound cavity 23 is a closed cavity; its shape and size affect sound resonance and sound quality. The sound cavity 23 can be made of high-strength plastic or metal to ensure good acoustic performance. The speakers 24 are the core component of audio playback, converting electrical signals into sound signals. The diaphragm of the speakers 24 can be made of paper, plastic, or metal; different materials affect the timbre and sound quality. The bracket 25 supports the speakers 24 and is usually made of elastic materials such as rubber or silicone, providing cushioning and shock absorption to ensure stable operation of the speakers 24. The middle frame 26 fits over the outside of the sound cavity 23, protecting the sound cavity 23 and fixing other components. The middle frame 26 can be made of materials such as aluminum alloy, providing a certain level of strength and heat dissipation. The covers 27 are installed on both sides of the middle frame 26, serving both sealing and decorative purposes and making the audio playback unit 5 more compact and aesthetically pleasing. The two speakers 24 are symmetrically mounted on the side walls of the sound cavity 23 via the brackets 25, enabling stereo playback and enhancing the spatial effect and layering of the sound. The sound cavity 23, middle frame 26, and covers 27 are assembled together using screws or clips to form a complete audio playback unit 5. The audio signal processed by the audio processing unit 4 is transmitted to the speakers 24 via circuitry. The speakers 24 convert the electrical signal into a sound signal, which, through resonance and diffusion within the sound cavity 23, ultimately produces a clear and loud sound.

[0066] In one embodiment, as shown in Figure 4 and Figure 5, the handle 13 of the microphone device is a telescopic handle 13. The telescopic handle 13 includes an outer tube 201 and an inner tube 202. The outer tube 201 is provided with a locking hole 16. The inner cavity of the inner tube 202 is provided with a locking structure. The locking structure includes a U-shaped spring piece 203 fixed at one end to the inner wall of the inner tube 202, a circular protrusion 205 fixed at the other end of the U-shaped spring piece 203, and a through hole that is opened in the inner tube 202 and communicates with the locking hole 16.

[0067] When the telescopic handle 13 is needed, the bottom side of the outer tube 201 first abuts against the circular protrusion 205, pressing the circular protrusion 205 inward. This causes the U-shaped spring piece 203 to undergo elastic deformation under force, and the circular protrusion 205 retracts inward along the through hole of the inner tube 202 until it is completely disengaged from the locking hole 16 of the outer tube 201. At this time, the inner tube 202 loses the limiting restraint of the locking structure and can slide and extend smoothly within the outer tube 201.

[0068] After the inner tube 202 is extended to the required length, the U-shaped spring piece 203, relying on its own elastic restoring force, drives the circular protrusion 205 to reset. The circular protrusion 205 passes through the through hole of the inner tube 202 and re-engages with the locking hole 16 of the outer tube 201, forming a stable locked state. This firmly fixes the relative positions of the inner tube 202 and the outer tube 201, ensuring that the telescopic handle 13 will not retract unexpectedly during use and providing the user with a stable and reliable grip.

[0069] The implementation principle of an integrated microphone device for speech translation according to an embodiment of this application is as follows: The microphone device collects sound signals through the sound pickup module 1. Users can select between a singing / speaking mode and a translation mode via the mode switching switch 6, depending on the usage scenario. In singing / speaking mode, the Bluetooth communication unit 3 adopts a one-way transmission protocol. The audio processing unit 4 processes the collected voice signal in real time and outputs it directly through the audio playback unit 5 to meet the needs of singing or speaking. In translation mode, the Bluetooth communication unit 3 switches to a two-way transmission protocol and transmits the original audio data to an external mobile terminal 100 for translation processing. The translation result is then sent back to the audio processing unit 4 for sound quality optimization, and the translated audio is finally output through the audio playback unit 5. The device is also equipped with a sensitivity switch 10 to adapt to ambient noise, improving practicality and user experience. This design integrates the dual functions of a microphone and a speaker: the sound pickup module 1 collects the original audio, and the sound output module 2 outputs the translation result. Flexible function switching is achieved through the mode switching switch 6. In translation mode, the device supports simultaneous listening and translation by multiple people, addressing the problem of real-time translation in cross-language communication and improving communication efficiency in scenarios such as multi-person meetings and international exchanges.
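
The mode-dependent signal routing described in paragraph [0069] can be illustrated with a minimal Python sketch. It is not part of the application: the names `Mode`, `BluetoothLink`, and `MicrophoneDevice`, the `process` placeholder, and the `translate_on_terminal` callback standing in for the external mobile terminal 100 are all hypothetical, and audio is represented as plain bytes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Mode(Enum):
    SINGING_SPEAKING = "singing/speaking"
    TRANSLATION = "translation"


class BluetoothLink(Enum):
    ONE_WAY = "one-way"
    TWO_WAY = "two-way"


@dataclass
class MicrophoneDevice:
    """Hypothetical model of the routing in paragraph [0069]."""

    mode: Mode = Mode.SINGING_SPEAKING

    @property
    def bluetooth_link(self) -> BluetoothLink:
        # One-way in singing/speaking mode, two-way in translation mode.
        if self.mode is Mode.TRANSLATION:
            return BluetoothLink.TWO_WAY
        return BluetoothLink.ONE_WAY

    def toggle_mode(self) -> None:
        # Stand-in for operating the mode switching switch 6.
        if self.mode is Mode.SINGING_SPEAKING:
            self.mode = Mode.TRANSLATION
        else:
            self.mode = Mode.SINGING_SPEAKING

    def process(self, audio: bytes) -> bytes:
        # Placeholder for the audio processing unit 4 (identity here).
        return audio

    def handle_audio(
        self, raw_audio: bytes, translate_on_terminal: Callable[[bytes], bytes]
    ) -> bytes:
        """Return the audio that would be handed to the audio playback unit 5."""
        if self.mode is Mode.SINGING_SPEAKING:
            # Processed locally and played back directly.
            return self.process(raw_audio)
        # Translation mode: send the original audio out, process what comes back.
        translated = translate_on_terminal(raw_audio)
        return self.process(translated)
```

In this sketch the Bluetooth link type is derived from the current mode rather than stored separately, so the two cannot drift out of step when the switch is toggled.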

[0070] Embodiment 2

[0071] The difference between Embodiment 2 and Embodiment 1 is that the microphone device adds a translation module 19, which provides a local translation function. Referring to Figure 6, the translation module 19 includes a speech recognition unit 20, a text translation unit 21, and a text-to-speech unit 22. The input of the speech recognition unit 20 is connected to the output of the microphone unit 7, and the speech recognition unit 20 converts the original audio data collected by the microphone unit 7 into original language text. The input of the text translation unit 21 is connected to the output of the speech recognition unit 20, and the text translation unit 21 can use online translation services or a local translation dictionary to translate the original language text into target language text. The input of the text-to-speech unit 22 is connected to the output of the text translation unit 21, and its output is connected to the audio processing unit 4. The text-to-speech unit 22 converts the target language text into target audio data. The speech recognition unit 20, the text translation unit 21, and the text-to-speech unit 22 are connected in sequence to form a complete translation chain. After the microphone unit 7 collects the original audio data, the data is first converted into text by the speech recognition unit 20, then translated by the text translation unit 21, and finally converted into audio by the text-to-speech unit 22. The whole process realizes the speech translation and conversion function.
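
The three-stage translation chain of paragraph [0071] can be sketched as a simple function composition. The following Python example is only an illustration under stated assumptions: the function names, the tiny `LOCAL_DICTIONARY`, and the use of UTF-8 bytes in place of real audio are all hypothetical stand-ins, and no actual speech recognition or speech synthesis is performed.

```python
from typing import Callable, Dict

# Hypothetical local translation dictionary (illustrative entries only).
LOCAL_DICTIONARY: Dict[str, str] = {"hello": "bonjour", "world": "monde"}


def recognize_speech(raw_audio: bytes) -> str:
    """Stand-in for the speech recognition unit 20: audio to original language text."""
    return raw_audio.decode("utf-8")


def translate_text(text: str, dictionary: Dict[str, str] = LOCAL_DICTIONARY) -> str:
    """Stand-in for the text translation unit 21: word-by-word dictionary lookup.

    Words missing from the dictionary are passed through unchanged.
    """
    return " ".join(dictionary.get(word, word) for word in text.split())


def synthesize_speech(text: str) -> bytes:
    """Stand-in for the text-to-speech unit 22: target language text to audio."""
    return text.encode("utf-8")


def build_translation_chain(
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Connect the three units in sequence, as in the translation module 19."""

    def chain(raw_audio: bytes) -> bytes:
        original_text = recognize(raw_audio)
        target_text = translate(original_text)
        return synthesize(target_text)

    return chain


translation_chain = build_translation_chain(
    recognize_speech, translate_text, synthesize_speech
)
```

Because each stage is passed in as a callable, the dictionary-based translator could be swapped for an online translation service without changing the chain itself, which mirrors the two options the paragraph mentions for the text translation unit 21.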

[0072] The above are all preferred embodiments of this application and are not intended to limit the scope of protection of this application. Therefore, all equivalent changes made in accordance with the structure, shape and principle of this application should be covered within the scope of protection of this application.

Claims

1. An integrated microphone device for speech translation, characterized in that the microphone device includes a pickup module (1) and a sound output module (2). The output end of the pickup module (1) is connected to the input end of the sound output module (2). The sound output module (2) includes a Bluetooth communication unit (3), an audio processing unit (4), an audio playback unit (5), and a mode switching switch (6). The audio processing unit (4) is connected to the microphone unit (7) in the pickup module (1) and to the Bluetooth communication unit (3), respectively. The audio processing unit (4) is used to receive and process the original audio data collected by the microphone unit (7). The microphone device includes a singing / speaking mode and a translation mode; the mode switching switch (6) is used to switch the microphone device to the singing / speaking mode or the translation mode. When the microphone device is in the translation mode, the original audio data is transmitted to the external mobile terminal (100) for translation through the Bluetooth communication unit (3) to obtain translated audio data; the translated audio data is transmitted to the audio processing unit (4) for processing through the Bluetooth communication unit (3), and the audio playback unit (5) is connected to the audio processing unit (4) to receive the translated audio data and play it.

2. The integrated microphone device for speech translation according to claim 1, wherein, when the microphone device is in the translation mode, the Bluetooth communication unit (3) adopts a bidirectional transmission protocol; when the microphone device is in the singing / speaking mode, the Bluetooth communication unit (3) adopts a unidirectional transmission protocol.

3. The integrated microphone device for speech translation according to claim 1, wherein the pickup module (1) also includes a mesh head (8) and a sensitivity switch (10). The microphone unit (7) is located inside the mesh head (8). The sensitivity switch (10) is used to control the sensitivity of the microphone unit (7). When the microphone device is in the singing / speaking mode, the sensitivity of the microphone unit (7) is restored to the default value.

4. The integrated microphone device for speech translation according to claim 3, wherein the microphone unit (7) includes any one of a first pickup (11), a second pickup (12), and a third pickup (28); the first pickup (11) has cardioid directivity and is fixedly oriented towards the top of the mesh head (8); the second pickup (12) has bidirectional directivity and is arranged at a predetermined angle to the first pickup (11); the third pickup (28) includes two cardioid microphones arranged back to back, and the pickup angle of the two cardioid microphones is greater than 90°; when the microphone device is in the singing / speaking mode, the first pickup (11) is working, while the second pickup (12) and the third pickup (28) are not working; when the microphone device is in the translation mode, the second pickup (12) and the third pickup (28) are working, while the first pickup (11) is not working.

5. The integrated microphone device for speech translation according to claim 4, wherein the pickup module (1) further includes a level comparison unit and an input switch unit connected to the audio processing unit (4); the level comparison unit is used to compare the audio signal levels output by the two cardioid microphones in the third pickup (28); the input switch unit is used to close the audio input channel of the cardioid microphone with the lower audio signal level according to the comparison result output by the level comparison unit.

6. The integrated microphone device for speech translation according to claim 1, wherein the microphone device also includes a handle (13), which is located on the side of the sound output module (2) away from the pickup module (1). A power supply unit (14) is provided inside the handle (13) and is used to supply power to the sound output module (2) and the pickup module (1).

7. The integrated microphone device for speech translation according to claim 6, wherein the microphone device also includes a motion posture detection unit (15), which is used to acquire the posture change value of the handle (13) in real time. If the posture change value is greater than a preset threshold, the motion posture detection unit (15) sends a sentence break signal. After receiving the sentence break signal, the audio processing unit (4) sends a sentence break command to the external mobile terminal (100). The external mobile terminal (100) performs a sentence break operation according to the sentence break command.

8. The integrated microphone device for speech translation according to claim 6, wherein the handle (13) is a telescopic handle.

9. The integrated microphone device for speech translation according to claim 1, wherein the microphone device also includes either an analog signal interface (17) or a digital signal interface (18); the analog signal interface (17) is used to connect an external microphone or audio equipment; the digital signal interface (18) is used to connect professional audio equipment or an extended translation input source.

10. The integrated microphone device for speech translation according to claim 1, wherein the microphone device further includes a translation module (19), the translation module (19) comprising: a speech recognition unit (20), the input end of which is connected to the output end of the microphone unit (7), for converting the original audio data into original language text; a text translation unit (21), the input end of which is connected to the output end of the speech recognition unit (20), for translating the original language text into target language text; and a text-to-speech unit (22), the input end of which is connected to the output end of the text translation unit (21) and the output end of which is connected to the audio processing unit (4), for converting the target language text into target audio data and transmitting the target audio data to the audio processing unit (4). The audio processing unit (4) transmits the target audio data to the audio playback unit (5) for playback.
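
The decision rules recited in claims 5 and 7 can be illustrated with a short Python sketch. It is an illustration only and rests on assumptions the claims do not state: the signal level is measured here as the root-mean-square (RMS) of a block of samples, a tie keeps the first channel open, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rms_level(samples: Sequence[float]) -> float:
    """Assumed level measure: root-mean-square of a block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def select_open_channel(first: Sequence[float], second: Sequence[float]) -> int:
    """Sketch of the level comparison unit and input switch unit of claim 5.

    Returns 0 if the first cardioid microphone's channel stays open and 1 if
    the second one does; the channel with the lower level is the one closed.
    A tie keeps channel 0 open (an assumption, not stated in the claim).
    """
    return 0 if rms_level(first) >= rms_level(second) else 1


def is_sentence_break(posture_change: float, threshold: float) -> bool:
    """Sketch of claim 7: signal a sentence break only when the posture
    change value is strictly greater than the preset threshold."""
    return posture_change > threshold
```

For example, if the speaker facing the first cardioid microphone is talking while the other side is quiet, `select_open_channel` returns 0, so the quieter second channel would be the one closed.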