Voice processing method and device thereof

A speech processing and speech segmentation technology, applied in the field of communications, that addresses the problem that mixed speech cannot be separated quickly and effectively, and achieves quick separation of a specific target speech.

Active Publication Date: 2018-12-18

湖南华威金安企业管理有限公司



AI Technical Summary

Problems solved by technology

[0004] The embodiments of the present invention provide a voice processing method and device to at least solve the problem in the related art that a specific target voice cannot be quickly and effectively separated from mixed speech in which a specific target is the main speaker.


Examples


Embodiment 1

[0065] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, Figure 1 is a hardware structural block diagram of a mobile terminal for a voice processing method according to an embodiment of the present invention. As shown in Figure 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in Figure 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. Optionally, the mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108. Those of ordinary skill in the art will understand that the structure shown in Figure 1 is only illustrative and does not limit the structure of the above mo...

Embodiment 2

[0088] This embodiment also provides a voice processing device, which is used to implement the above embodiments and preferred implementations; what has already been explained is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.

[0089] Figure 3 is a block diagram of a speech processing device according to an embodiment of the present invention. As shown in Figure 3, the device includes:

[0090] A segmentation module 32, configured to divide the mixed speech into N speech segments by endpoint detection, where N is a natural number greater than or equal to 2;

[0091] The detection module 34 is configured to perform Bayesian information criterion BIC detection on any...
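The endpoint-detection step performed by the segmentation module can be illustrated with a minimal sketch. The excerpt does not say which endpoint detector is used, so the short-time energy threshold below, the function name `split_by_endpoints`, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def split_by_endpoints(signal, frame_len=400, hop=160,
                       threshold_ratio=0.1, min_frames=3):
    """Return (start, end) sample indices of speech-like segments.

    Hypothetical helper: a frame counts as active when its mean energy
    exceeds threshold_ratio times the largest frame energy in the signal.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energy = np.array([
        np.mean(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])
    active = energy > threshold_ratio * energy.max()

    segments = []
    start = None  # index of the first frame of the current active run
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            # Close the run; keep it only if it is long enough.
            if i - start >= min_frames:
                segments.append((start * hop, (i - 1) * hop + frame_len))
            start = None
    # Handle a run that extends to the end of the signal.
    if start is not None and n_frames - start >= min_frames:
        segments.append((start * hop, (n_frames - 1) * hop + frame_len))
    return segments
```

Each returned pair marks one candidate speech segment. The method requires N to be at least 2, so a caller would check the length of the returned list before moving on to BIC detection.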

Embodiment 3

[0111] An embodiment of the present invention also provides a storage medium in which a computer program is stored, wherein the computer program is configured to execute, when run, the steps in any one of the above method embodiments.

[0112] Optionally, in this embodiment, the above-mentioned storage medium may be configured to store a computer program for performing the following steps:

[0113] S11, segmenting the mixed speech into N speech segments by endpoint detection, where N is a natural number greater than or equal to 2;

[0114] S12, performing Bayesian information criterion (BIC) detection on any two adjacent speech segments among the N speech segments, and discarding the speech segments found abnormal in the BIC detection to obtain the valid speech segments of the target object.
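Step S12 can be sketched as follows. This is a hedged illustration, not the patented procedure: the excerpt does not state how a segment is judged abnormal, so the sketch assumes that the first segment belongs to the target object and drops any later segment whose ΔBIC against the most recently kept segment is positive. The names `delta_bic` and `keep_target_segments`, the full-covariance Gaussian model, and the `penalty_weight` parameter are all assumptions introduced here.

```python
import numpy as np

def delta_bic(x, y, penalty_weight=1.0):
    """Delta-BIC between feature matrices x (n1, d) and y (n2, d).

    Each segment, and the two segments pooled, are modeled by one
    full-covariance Gaussian. A positive value favors two separate
    models, which suggests the segments come from different sources.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.vstack([x, y])
    n1, n2, n = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet_cov(a):
        # Maximum-likelihood covariance; atleast_2d covers the d == 1 case.
        cov = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        return np.linalg.slogdet(cov)[1]

    gain = 0.5 * (n * logdet_cov(z) - n1 * logdet_cov(x) - n2 * logdet_cov(y))
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return gain - penalty_weight * penalty

def keep_target_segments(segment_features, penalty_weight=1.0):
    """Return indices of segments kept as target speech (assumed rule).

    Assumption: segment 0 is target speech. A segment is kept when it is
    BIC-consistent with the most recently kept segment.
    """
    kept = [0]
    for i in range(1, len(segment_features)):
        reference = segment_features[kept[-1]]
        if delta_bic(reference, segment_features[i], penalty_weight) <= 0:
            kept.append(i)
    return kept
```

In practice the feature matrices would hold per-frame acoustic features such as MFCCs for each segment produced by step S11; that choice of features is also an assumption, since the excerpt does not name one.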

[0115] Optionally, in this embodiment, the above-mentioned storage medium may include, but is not limited to: a USB flash drive, read-only memory (ROM), random access memory (Random Access ...


Abstract

The invention provides a voice processing method and device. The method comprises: segmenting mixed speech into N speech segments through endpoint detection, where N is a natural number greater than or equal to 2; and performing Bayesian information criterion (BIC) detection on any two adjacent speech segments among the N speech segments and discarding the speech segments found abnormal in the BIC detection, to obtain the valid speech segments of a target object. The method and device address the problem that a specific target voice cannot be quickly and effectively separated from mixed speech in which a specific target is the main speaker, and thereby achieve quick separation of the specific target voice from the mixed speech.

Description

Technical field

[0001] The present invention relates to the field of communications, and in particular to a voice processing method and device.

Background

[0002] The original speaker turning-point detection scheme based on the Bayesian Information Criterion (BIC) is aimed at separability, and generally ends up separating the mixed voices of multiple speakers. Technically, no assumptions are made about the location of the turning point, and speech data from different speakers is generally retained as far as possible. In addition, this method is generally not used alone; it is typically combined with other steps such as calculating the distance between different data distributions, clustering, and so on. For situations where the speech of a specific speaker dominates in duration, the speech of other people or noise takes up relatively little time, and the characteristics of the speaker matter more than the speech content, a separable scheme is proposed. For this type of problem, the current so...
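For reference, the BIC turning-point test described in [0002] is commonly written in the following form. This is the textbook expression for two adjacent windows, each modeled by a full-covariance Gaussian, and is not a formula taken from this patent. The symbols are notation introduced here: N_1 and N_2 are the frame counts of the two windows, Σ_1 and Σ_2 their covariance matrices, Σ the covariance of the pooled data, d the feature dimension, and λ a tunable penalty weight.

```latex
\Delta\mathrm{BIC}
  = \frac{N}{2}\log\lvert\Sigma\rvert
  - \frac{N_1}{2}\log\lvert\Sigma_1\rvert
  - \frac{N_2}{2}\log\lvert\Sigma_2\rvert
  - \frac{\lambda}{2}\left(d + \frac{d(d+1)}{2}\right)\log N,
\qquad N = N_1 + N_2
```

A positive value favors modeling the two windows separately, which is usually read as evidence of a speaker turning point between them.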


Application Information


IPC(8): G10L15/04, G10L15/05, G10L25/51

CPC: G10L15/04, G10L15/05, G10L25/51

Inventor: 邹新生

Owner: 湖南华威金安企业管理有限公司
