Method, device, equipment and computer storage medium for voice segmentation

A speech segmentation technology, applied in speech analysis, speech recognition, instruments, etc., that addresses the problems of unstable recording-device clock frequency, inaccurate segmentation, and far-field speech data that does not meet training requirements, with the effect of improving segmentation accuracy.

Active Publication Date: 2019-11-26

BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

19 Cites, 0 Cited by


Problems solved by technology

[0004] However, because the clock frequency of the recording device is unstable, segmenting long speech based on time tags is inaccurate. For example, the speech segments obtained after segmentation are truncated, so the resulting far-field speech data does not meet training requirements.
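To make the scale of the problem concrete, the following is a minimal sketch of how a small clock-rate error turns into a growing misalignment between nominal time tags and the recorded audio. It assumes a constant rate error expressed in parts per million, which is a simplification: the source only says the clock frequency is unstable. The function name `tag_offset_seconds` and the 50 ppm figure are hypothetical and chosen purely for illustration.

```python
def tag_offset_seconds(elapsed_seconds: float, clock_error_ppm: float) -> float:
    """Offset between a nominal time tag and the true position in the recording.

    Assumes (hypothetically) a constant clock-rate error given in parts per
    million; a real device may drift in a less regular way.
    """
    return elapsed_seconds * clock_error_ppm * 1e-6


if __name__ == "__main__":
    # Hypothetical 50 ppm rate error, evaluated at several points in a long recording.
    for minutes in (1, 10, 60):
        offset = tag_offset_seconds(minutes * 60, 50.0)
        print(f"after {minutes:>2} min: tag is off by {offset * 1000:.1f} ms")
```

Under this assumption the offset grows linearly with elapsed time, so segments late in a long recording are the most likely to be cut in the wrong place.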


Embodiment Construction

[0059] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0060] Figure 1 is a flowchart of the main method provided by an embodiment of the present invention. As shown in Figure 1, the method may include the following steps:

[0061] In 101, determine the cross-correlation between the first speech and the second speech, where the second speech is a speech obtained by recording the first speech, and the first speech is formed by splicing two or more first speech segments.

[0062] In 102, the time tag is calibrated based on the determined cross-correlation, where the time tag includes the start time and end time of each first speech segment in the first speech.

[0063] In 103, the second speech is segmented using the calibrated time tag to obtain two or more second speech segments.
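Steps 101 to 103 can be sketched as follows. This is a minimal illustration using NumPy, not the patented implementation: the excerpt does not say how the cross-correlation is turned into a calibration, so the sketch assumes the simplest case, a single global lag applied to every tag. Time tags are expressed in samples (time multiplied by sample rate), the signals are synthetic noise, and the names `estimate_lag`, `calibrate_tags`, and `segment` are hypothetical.

```python
import numpy as np


def estimate_lag(first: np.ndarray, second: np.ndarray) -> int:
    """Step 101: lag (in samples) at which `second` best matches `first`."""
    corr = np.correlate(second, first, mode="full")
    # In 'full' mode, output index i corresponds to lag i - (len(first) - 1).
    return int(np.argmax(corr)) - (len(first) - 1)


def calibrate_tags(tags, lag):
    """Step 102: shift every (start, end) tag by the estimated lag.

    Assumption: one global shift; the source does not specify the scheme.
    """
    return [(int(start) + lag, int(end) + lag) for start, end in tags]


def segment(speech: np.ndarray, tags):
    """Step 103: cut the recording at the calibrated tags (clamped to bounds)."""
    return [speech[max(s, 0):min(e, len(speech))] for s, e in tags]


# Synthetic "first speech": three spliced segments with known boundaries.
rng = np.random.default_rng(0)
first_segments = [rng.standard_normal(n) for n in (400, 300, 500)]
first_speech = np.concatenate(first_segments)
bounds = np.cumsum([0] + [len(s) for s in first_segments])
time_tags = list(zip(bounds[:-1], bounds[1:]))

# Synthetic "second speech": the first speech re-recorded with a delay and noise.
true_delay = 37  # hypothetical recording offset, in samples
second_speech = np.concatenate([np.zeros(true_delay), first_speech, np.zeros(20)])
second_speech = second_speech + 0.05 * rng.standard_normal(len(second_speech))

lag = estimate_lag(first_speech, second_speech)
calibrated_tags = calibrate_tags(time_tags, lag)
second_segments = segment(second_speech, calibrated_tags)
```

A single global lag only corrects a fixed offset. If the misalignment grows over the recording, as an unstable clock would suggest, one option is to estimate the lag separately around each tag; that variant is likewise an assumption and is not taken from the source.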

[0064]In ...


Abstract

The invention provides a speech segmentation method, device, equipment and computer storage medium. The method comprises the following steps: determining the cross-correlation between a first speech and a second speech, wherein the second speech is obtained by recording the first speech, and the first speech is formed by splicing two or more first speech segments; calibrating a time tag based on the cross-correlation, the time tag comprising a start time and an end time of each first speech segment in the first speech; and segmenting the second speech using the calibrated time tag to obtain two or more second speech segments. The invention better aligns the calibrated time tag with the second speech, thereby improving the segmentation accuracy of the second speech.

Description

【Technical field】

[0001] The present invention relates to the field of computer application technology, in particular to a method, device, equipment and computer storage medium for voice segmentation.

【Background technique】

[0002] With the rapid development of artificial intelligence technology, voice technology has become the main way of artificial intelligence interaction because of its convenient and barrier-free interaction. On the premise that near-field speech recognition technology is gradually mature, far-field speech recognition has gradually become a topic of concern. Through far-field voice recognition, users can conduct voice interaction with smart devices at a relatively long distance, such as voice interaction with smart TVs and smart speakers.

[0003] Far-field speech recognition is realized through a far-field acoustic model, and a large amount of far-field speech data is required to train the far-field acoustic model. At this stage, the real data of far...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G10L15/04, G10L15/05, G10L25/51, G10L25/87

CPC: G10L15/04, G10L15/05, G10L25/51, G10L25/87

Inventor: 孙建伟

Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
