Electronic device, speech filtering method using same, and storage medium

By generating a speaker embedding for each user and filtering out ambient noise, the electronic device extracts and translates each speaker's voice signal separately, which addresses the difficulty of translating speech in noisy, multi-speaker environments.

WO2026127677A1 · Publication Date: 2026-06-18 · SAMSUNG ELECTRONICS CO LTD

Patent Information

Authority / Receiving Office: WO · WO
Patent Type: Applications
Current Assignee / Owner: SAMSUNG ELECTRONICS CO LTD
Filing Date: 2025-12-11
Publication Date: 2026-06-18

AI Technical Summary

Technical Problem

Existing electronic devices face challenges in accurately translating speech between users speaking different languages due to interference from ambient noise and the inability to distinguish between multiple speakers effectively.

Method used

The electronic device generates a speaker embedding for each user in real time, filters out ambient noise, and uses each embedding to extract that user's voice signal from the microphone input. A translation-related program then translates each user's extracted voice signal separately.
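The matching step described above can be illustrated with a minimal sketch. It assumes that audio frames have already been reduced to small feature vectors, and it uses cosine similarity against enrolled embeddings with a fixed threshold to decide whether a frame belongs to a known speaker or is ambient noise. The function names, the averaging-based enrollment, and the threshold value are assumptions for illustration, not details taken from the patent.

```python
import math

def embed(frame):
    """Hypothetical stand-in for a speaker encoder: L2-normalise a feature vector."""
    norm = math.sqrt(sum(x * x for x in frame))
    return [x / norm for x in frame] if norm else list(frame)

def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))

def enroll(frames):
    """Average the normalised enrollment frames into one speaker embedding."""
    normalised = [embed(f) for f in frames]
    dim = len(normalised[0])
    mean = [sum(f[i] for f in normalised) / len(normalised) for i in range(dim)]
    return embed(mean)

def assign_speaker(frame, enrolled, threshold=0.8):
    """Return the best-matching enrolled speaker, or None if no speaker
    reaches the threshold (the frame is then treated as ambient noise)."""
    candidate = embed(frame)
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine(candidate, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy enrollment data: two users with clearly different feature directions.
enrolled = {
    "first_user": enroll([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]),
    "second_user": enroll([[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]]),
}
```

A real system would use a trained speaker encoder and a learned extraction model rather than a hard threshold. The sketch only shows how per-user embeddings let the device attribute audio to one speaker and discard the rest.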

Benefits of technology

This approach enhances translation accuracy by isolating and translating the voice signals of individual speakers, providing a more reliable and intuitive translation experience.

Generated by Eureka AI based on patent content.

Abstract

According to various embodiments, an electronic device may comprise: a microphone; a display; a processor including a processing circuit; and a memory for storing instructions. When the instructions are individually or collectively executed by the processor, the electronic device may: execute a translation-related program in response to a translation event; generate a first speaker embedding corresponding to a first user on the basis of a first audio signal received through the microphone in response to a first event; generate a second speaker embedding corresponding to a second user on the basis of a second audio signal received through the microphone in response to a second event; extract a first speech signal of the first user on the basis of the first speaker embedding in a situation in which an audio signal is received through the microphone; convert the extracted first speech signal into a first text; translate the first text in a first language into a second text in a second language on the basis of the translation-related program; and display the first text and the second text through the display. Various other embodiments may be possible.
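The sequence of operations in the abstract (enroll a speaker embedding on an event, extract that user's speech, convert it to text, translate it, and return both texts for display) can be sketched as a small pipeline. Every name here is hypothetical: `TranslationSession`, its injected callables, and the `toy_*` stand-ins are illustrative assumptions, and the toy functions only mimic the data flow rather than performing real speech processing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Audio = List[float]
Embedding = List[float]

@dataclass
class TranslationSession:
    """Hypothetical orchestration of the steps listed in the abstract."""
    encode: Callable[[Audio], Embedding]            # audio -> speaker embedding
    extract: Callable[[Audio, Embedding], Audio]    # keep only the target speaker
    transcribe: Callable[[Audio], str]              # speech -> first text
    translate: Callable[[str, str, str], str]       # (text, src, dst) -> second text
    embeddings: Dict[str, Embedding] = field(default_factory=dict)

    def enroll(self, user: str, audio: Audio) -> None:
        """Handle an enrollment event: store the user's speaker embedding."""
        self.embeddings[user] = self.encode(audio)

    def process(self, user: str, audio: Audio, src: str, dst: str) -> Tuple[str, str]:
        """Extract the user's speech, transcribe it, and translate it.
        Returns (first_text, second_text) for the display step."""
        if user not in self.embeddings:
            raise KeyError(f"no speaker embedding enrolled for {user!r}")
        speech = self.extract(audio, self.embeddings[user])
        first_text = self.transcribe(speech)
        second_text = self.translate(first_text, src, dst)
        return first_text, second_text

# Toy stand-ins (assumptions, not real models) that make the flow runnable.
def toy_encode(audio: Audio) -> Embedding:
    return [sum(audio) / len(audio)]

def toy_extract(audio: Audio, emb: Embedding) -> Audio:
    return [s for s in audio if abs(s - emb[0]) < 0.5]

def toy_transcribe(speech: Audio) -> str:
    return f"{len(speech)} samples"

def toy_translate(text: str, src: str, dst: str) -> str:
    return f"[{src}->{dst}] {text}"

session = TranslationSession(toy_encode, toy_extract, toy_transcribe, toy_translate)
session.enroll("first_user", [1.0, 1.0])
first_text, second_text = session.process("first_user", [1.0, 5.0, 1.2], "ko", "en")
```

Passing the encoder, extractor, recogniser, and translator in as callables keeps the control flow separate from any particular model, which is a design choice of this sketch rather than something the abstract specifies.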