Speech segmentation method, apparatus, device and computer storage medium

A voice segmentation technology, applied in voice analysis, voice recognition, instruments, etc. It addresses problems such as the unstable clock frequency of recording equipment, far-field voice data that does not meet training requirements, and inaccurate segmentation, and achieves the effect of improved segmentation accuracy.

Active Publication Date: 2019-01-08
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0004] However, due to the unstable clock frequency of the recording device, the long voice segmentation method based on time tags will cause inaccurate seg



Examples


Detailed Description of the Embodiments

[0059] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

[0060] Figure 1 is a flowchart of the main method provided by an embodiment of the present invention. As shown in Figure 1, the method may include the following steps:

[0061] In 101, the cross-correlation between the first speech and the second speech is determined, where the second speech is obtained by recording the first speech, and the first speech is formed by splicing two or more first speech segments.

[0062] In 102, the time tag is calibrated based on the determined cross-correlation, where the time tag includes the start time and end time of each first speech segment in the first speech.

[0063] In 103, the second speech is segmented using the calibrated time tag to obtain two or more second speech segments.
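Steps 101 to 103 can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the method as the patent specifies it, since the paragraphs that give the details are truncated here. The sketch assumes that the cross-correlation is computed per segment within a small search window, that the raw (unnormalized) correlation is used, and that time tags are expressed as sample indices. The names `best_offset`, `calibrate_tags`, `segment_speech` and the `search` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: a time tag is a (start, end) pair of sample indices, end exclusive.
Tag = Tuple[int, int]


def best_offset(segment: Sequence[float], window: Sequence[float]) -> int:
    """Return the shift of `segment` inside `window` with the highest cross-correlation."""
    if len(window) < len(segment):
        raise ValueError("search window is shorter than the segment")
    best_shift, best_score = 0, float("-inf")
    for shift in range(len(window) - len(segment) + 1):
        # Raw cross-correlation at this shift (a normalized variant may be more robust).
        score = sum(s * window[shift + i] for i, s in enumerate(segment))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


def calibrate_tags(
    first: Sequence[float],
    second: Sequence[float],
    tags: Sequence[Tag],
    search: int = 50,
) -> List[Tag]:
    """Steps 101-102: move each tag to where its segment correlates best in `second`."""
    calibrated = []
    for start, end in tags:
        # Look for the segment near its nominal position in the recording.
        lo = max(0, start - search)
        hi = min(len(second), end + search)
        shift = best_offset(first[start:end], second[lo:hi])
        new_start = lo + shift
        calibrated.append((new_start, new_start + (end - start)))
    return calibrated


def segment_speech(second: Sequence[float], tags: Sequence[Tag]) -> List[List[float]]:
    """Step 103: cut the recorded speech at the calibrated tags."""
    return [list(second[start:end]) for start, end in tags]
```

Searching per segment rather than applying one global shift is a design choice of this sketch: it lets the correction grow over the length of the recording, which is the behavior an unstable recording clock would call for.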

[0064]In ...


Abstract

The invention provides a speech segmentation method, an apparatus, a device and a computer storage medium. The method comprises the following steps: determining the cross-correlation between a first speech and a second speech, wherein the second speech is obtained by recording the first speech, and the first speech is formed by splicing two or more first speech segments; calibrating a time tag based on the cross-correlation, the time tag comprising the start time and end time of each first speech segment in the first speech; and segmenting the second speech using the calibrated time tag to obtain two or more second speech segments. The invention better aligns the calibrated time tag with the second speech, thereby improving the segmentation accuracy of the second speech.

Description

【Technical field】

[0001] The present invention relates to the field of computer application technology, in particular to a method, device, equipment and computer storage medium for voice segmentation.

【Background technique】

[0002] With the rapid development of artificial intelligence technology, voice technology has become the main way of artificial intelligence interaction because of its convenient and barrier-free interaction. On the premise that near-field speech recognition technology is gradually mature, far-field speech recognition has gradually become a topic of concern. Through far-field voice recognition, users can conduct voice interaction with smart devices at a relatively long distance, such as voice interaction with smart TVs and smart speakers.

[0003] Far-field speech recognition is realized through a far-field acoustic model, and a large amount of far-field speech data is required to train the far-field acoustic model. At this stage, the real data of far...


Application Information

IPC(8): G10L15/04, G10L15/05, G10L25/51, G10L25/87
CPC: G10L15/04, G10L15/05, G10L25/51, G10L25/87
Inventor: 孙建伟
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD