Voice processing method and device thereof

A speech processing and speech segmentation technology in the field of communication. It addresses the problem that a target voice cannot be separated from mixed speech quickly and effectively, and achieves fast separation of a specific target voice.

Active Publication Date: 2018-12-18
湖南华威金安企业管理有限公司

AI Technical Summary

Problems solved by technology

[0004] The embodiments of the present invention provide a voice processing method and device that at least solve the problem in the related art that the voice of a specific target cannot be quickly and effectively separated from mixed speech in which that target is the main speaker.


Examples

Embodiment 1

[0065] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Figure 1 is a hardware structural block diagram of a mobile terminal for a voice processing method according to an embodiment of the present invention. As shown in Figure 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in Figure 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108. Those of ordinary skill in the art will understand that the structure shown in Figure 1 is only illustrative and does not limit the structure of the above mo...

Embodiment 2

[0088] This embodiment also provides a voice processing device, which implements the above embodiments and preferred implementations; what has already been explained is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.

[0089] Figure 3 is a block diagram of a speech processing device according to an embodiment of the present invention. As shown in Figure 3, the device includes:

[0090] A segmentation module 32, configured to divide mixed speech into N speech segments by endpoint detection, where N is a natural number greater than or equal to 2;

[0091] The detection module 34 is configured to perform Bayesian information criterion BIC detection on any...

Embodiment 3

[0111] An embodiment of the present invention also provides a storage medium, in which a computer program is stored, wherein the computer program is set to execute the steps in any one of the above method embodiments when running.

[0112] Optionally, in this embodiment, the above-mentioned storage medium may be configured to store a computer program for performing the following steps:

[0113] S11. Segment the mixed speech into N speech segments by endpoint detection, where N is a natural number greater than or equal to 2;

[0114] S12. Perform Bayesian information criterion BIC detection on any two adjacent speech segments among the N speech segments, and discard the abnormal speech segment in the BIC detection to obtain a valid speech segment of the target object.
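
The BIC check on adjacent segments in step S12 can be illustrated with a minimal sketch. It assumes each speech segment has already been converted to a list of per-frame feature vectors, and it models each segment as a Gaussian with diagonal covariance; the excerpt does not specify the features, the covariance form, or the penalty weight, so the names `delta_bic`, `flag_adjacent_changes`, and `penalty_weight` are hypothetical. A positive delta-BIC suggests that the two segments come from different sources.

```python
import math


def _sum_log_var(frames):
    """Sum over feature dimensions of log(variance) for one segment."""
    n = len(frames)
    total = 0.0
    for column in zip(*frames):
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        total += math.log(var + 1e-9)  # small floor avoids log(0)
    return total


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Delta-BIC between two segments given as lists of feature vectors.

    Assumption: diagonal-covariance Gaussians. A positive result favours
    two separate models, i.e. a change of speaker or source.
    """
    merged = list(seg_a) + list(seg_b)
    n_a, n_b, n = len(seg_a), len(seg_b), len(merged)
    dims = len(merged[0])
    # Log-likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (n * _sum_log_var(merged)
                  - n_a * _sum_log_var(seg_a)
                  - n_b * _sum_log_var(seg_b))
    # The second model adds `dims` means and `dims` variances.
    penalty = 0.5 * penalty_weight * (2 * dims) * math.log(n)
    return gain - penalty


def flag_adjacent_changes(segments, penalty_weight=1.0):
    """Return one boolean per adjacent pair: True where delta-BIC > 0."""
    return [delta_bic(a, b, penalty_weight) > 0
            for a, b in zip(segments, segments[1:])]
```

The sketch only flags adjacent pairs whose delta-BIC is positive. How a flagged pair is turned into a discarded "abnormal" segment is not stated in this excerpt, so that rule is deliberately left out.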

[0115] Optionally, in this embodiment, the above-mentioned storage medium may include, but is not limited to: a USB flash drive, read-only memory (Read-Only Memory, ROM for short), random access memory (Random Access ...


Abstract

The invention provides a voice processing method and device. The method comprises: segmenting mixed speech into N voice segments through endpoint detection, where N is a natural number greater than or equal to 2; and performing Bayesian information criterion (BIC) detection on any two adjacent voice segments among the N voice segments, discarding the abnormal voice segments found in BIC detection, and obtaining the valid voice segments of a target object. The method and device solve the problem that the voice of a specific target cannot be quickly and effectively separated from mixed speech in which that target is the main speaker, thereby achieving fast separation of the specific target voice from the mixed speech.
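
The endpoint-detection step named in the abstract can be sketched as follows. The excerpt does not say which endpoint detector is used, so this example assumes a simple short-time energy threshold; the function name `split_by_energy` and the parameters `frame_len`, `threshold`, and `min_frames` are hypothetical choices for illustration, not values from the patent.

```python
def split_by_energy(samples, frame_len=160, threshold=0.01, min_frames=2):
    """Split a mono signal into speech segments using short-time energy.

    Assumption: a frame counts as speech when its mean squared amplitude
    is at least `threshold`. Returns (start_sample, end_sample) pairs,
    with the end index exclusive.
    """
    segments = []
    start = None  # index of the first frame of the current speech run
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # A speech run just ended; keep it only if it is long enough.
            if i - start >= min_frames:
                segments.append((start * frame_len, i * frame_len))
            start = None
    # Close a run that lasts until the end of the signal.
    if start is not None and n_frames - start >= min_frames:
        segments.append((start * frame_len, n_frames * frame_len))
    return segments
```

With 16 kHz audio, `frame_len=160` would correspond to 10 ms frames. Each returned pair delimits one candidate speech segment for the later BIC detection.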

Description

Technical field

[0001] The present invention relates to the communication field, in particular to a voice processing method and device.

Background

[0002] The original speaker turning point detection scheme based on the Bayesian Information Criterion (BIC) aims at separability and generally ends up separating the mixed voices of multiple speakers. Technically, no assumptions are made about the location of the turning point, and speech data from the different speakers is generally retained as much as possible. In addition, this method is generally not used alone but together with other steps, such as calculating the distance between different data distributions and clustering. A separation scheme is proposed for situations where the speech of a specific speaker dominates in duration, the speech of other people or noise takes up relatively little time, and the characteristics of the speaker matter more than the speech content. For this type of problem, the current so...


Application Information

IPC(8): G10L15/04, G10L15/05, G10L25/51
CPC: G10L15/04, G10L15/05, G10L25/51
Inventor: 邹新生
Owner: 湖南华威金安企业管理有限公司