Sound detection method

A sound detection technology, applied in speech analysis, speech recognition, instruments, etc., that addresses the low accuracy of aliased speech detection and improves detection accuracy.

Active Publication Date: 2021-10-01

ALIBABA DAMO (HANGZHOU) TECH CO LTD


Problems solved by technology

[0004] An embodiment of the present invention provides a sound detection method to at least solve the technical problem of low accuracy in aliased speech detection.


Examples


Embodiment 1

[0035] According to an embodiment of the present invention, an embodiment of a sound detection method is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.

[0036] The method embodiment provided in Embodiment 1 of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, Figure 1 is a block diagram of the hardware structure of a computer terminal for a sound detection method according to an embodiment of the present invention. As shown in Figure 1, the computer terminal 10 may include one or more (only one is shown in the figure) processors 102 (the proce...

Embodiment 2

[0122] According to an embodiment of the present application, an embodiment of a sound detection method is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.

[0123] Figure 5a is a flowchart of a sound detection method according to an embodiment of the present invention. As shown in Figure 5a, the method may include the following steps:

[0124] Step S502, displaying an audio-video interactive interface in the conference interface.

[0125] The above conference interface may be a display interface of a computer terminal or a mobile terminal.

[0126] The above-mentioned audio-video interaction interface may be an interface displayed by a meeting scene i...

Embodiment 3

[0144] According to an embodiment of the present application, an embodiment of a sound detection method is also provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.

[0145] Figure 6a is a flowchart of a sound detection method according to an embodiment of the present invention. As shown in Figure 6a, the method may include the following steps:

[0146] Step S602, triggering the teaching interaction function in the teaching interface to obtain the initial sound signal and the spatial distribution spectrum of the initial sound signal generated during the teaching process.

[0147] The teaching interface mentioned above may be a teaching video interface of a mobil...


Abstract

The invention discloses a sound detection method. The method comprises the following steps: acquiring an initial sound signal and a spatial distribution spectrum of the initial sound signal; segmenting the initial sound signal to obtain a target sound fragment, and acquiring a timestamp corresponding to the target sound fragment, wherein the target sound fragment includes the voice of at least one object, and the timestamp is used for indicating the start time of the target sound fragment and the end time of the target sound fragment; segmenting the spatial distribution spectrum by using the timestamp to obtain a spatial distribution spectrum fragment corresponding to the target sound fragment; and inputting the target sound fragment and the spatial distribution spectrum fragment into a sound detection model to obtain a first sound detection result, wherein the first sound detection result is used for describing whether sounds of a plurality of objects exist in the initial sound signal. The invention solves the technical problem that the accuracy of aliasing voice detection is low.
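The steps in the abstract can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the energy-threshold segmentation, the peak-counting "model", and every function and parameter name (`segment_signal`, `slice_spectrum`, `placeholder_detection_model`, `detect_multiple_speakers`, `spectrum_rate`, and so on) are hypothetical stand-ins. The source does not specify how segmentation is done or what the sound detection model is; the spatial distribution spectrum is assumed here to be a 2-D array of shape (time frames, directions).

```python
import numpy as np


def segment_signal(signal, sample_rate, frame_len=0.02, energy_threshold=1e-3):
    """Hypothetical segmentation: return (start_s, end_s) timestamps of
    fragments whose per-frame mean energy exceeds a threshold."""
    hop = int(round(frame_len * sample_rate))
    n_frames = len(signal) // hop
    frames = signal[: n_frames * hop].reshape(n_frames, hop)
    active = (frames ** 2).mean(axis=1) > energy_threshold

    timestamps = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i  # a fragment begins at this frame
        elif not is_active and start is not None:
            timestamps.append((start * hop / sample_rate, i * hop / sample_rate))
            start = None
    if start is not None:  # fragment runs to the end of the signal
        timestamps.append((start * hop / sample_rate, n_frames * hop / sample_rate))
    return timestamps


def slice_spectrum(spectrum, spectrum_rate, start_s, end_s):
    """Cut the spatial distribution spectrum with a fragment's timestamp.
    spectrum_rate is the assumed number of spectrum frames per second."""
    lo = int(round(start_s * spectrum_rate))
    hi = int(round(end_s * spectrum_rate))
    return spectrum[lo:hi]


def placeholder_detection_model(sound_fragment, spectrum_fragment):
    """Stand-in for the sound detection model (an assumption, not the
    patent's model): report multiple objects when the time-averaged
    spatial spectrum has more than one strong directional peak.
    The audio fragment is accepted but unused by this toy heuristic."""
    profile = spectrum_fragment.mean(axis=0)
    n = len(profile)
    peak_floor = 0.5 * profile.max()
    n_peaks = sum(
        1
        for i in range(n)
        if profile[i] > peak_floor
        and profile[i] > profile[i - 1]          # circular left neighbour
        and profile[i] >= profile[(i + 1) % n]   # circular right neighbour
    )
    return bool(n_peaks > 1)


def detect_multiple_speakers(signal, sample_rate, spectrum, spectrum_rate):
    """Run the abstract's steps: segment, slice the spectrum, detect."""
    results = []
    for start_s, end_s in segment_signal(signal, sample_rate):
        fragment = signal[int(round(start_s * sample_rate)): int(round(end_s * sample_rate))]
        spectrum_fragment = slice_spectrum(spectrum, spectrum_rate, start_s, end_s)
        overlapped = placeholder_detection_model(fragment, spectrum_fragment)
        results.append((start_s, end_s, overlapped))
    return results
```

Each returned tuple pairs a fragment's start and end time with a flag indicating whether the placeholder model found more than one sound source in it. Passing both the sound fragment and the matching spectrum fragment to the model mirrors the abstract; a real model would presumably use both inputs.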

Description

Technical field

[0001] The invention relates to the field of sound detection, and in particular to a sound detection method.

Background

[0002] At present, a speech recognition system can recognize speech that occurs in a scene and transcribe it into text. However, a typical scene often has multiple people speaking at once, that is, the scene contains aliased speech, which poses great challenges to subsequent speaker segmentation and speech recognition. Current speech recognition systems have difficulty accurately recognizing speech when multiple people speak at once. Usually, the aliased speech must first be located by an aliased speech detection technique, and a speech separation technique is then used to separate each speaker, so that a general-purpose speech recognition system can be used for recognition. However, a current problem is that the accuracy rate of the aliasing spe...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/04, G10L15/02

CPC: G10L15/04, G10L15/02, G10L25/51, G10L25/30, G10L2021/02166, G10L15/063, G10L25/87

Inventor: 张仕良, 郑斯奇, 黄伟隆

Owner: ALIBABA DAMO (HANGZHOU) TECH CO LTD
