Method, device, equipment and computer storage medium for voice segmentation

A speech segmentation technology, applied in speech analysis, speech recognition, instruments, etc. It addresses problems such as the unstable clock frequency of recording equipment, far-field speech data that does not meet training requirements, and inaccurate segmentation, with the effect of improving segmentation accuracy.
CN109166570B · Active · Publication Date: 2019-11-26 · BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD
Publication Date
2019-11-26

Abstract

The invention provides a speech segmentation method, device, equipment and computer storage medium. The method comprises the following steps: determining the cross-correlation between a first speech and a second speech, wherein the second speech is obtained by recording the first speech, and the first speech is formed by splicing two or more first speech segments; calibrating a time tag based on the cross-correlation, the time tag comprising the start time and end time of each first speech segment in the first speech; and segmenting the second speech using the calibrated time tag to obtain two or more second speech segments. The invention aligns the calibrated time tag more closely with the second speech, thereby improving the segmentation accuracy of the second speech.
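The three steps in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes the re-recorded speech differs from the original by a single global delay, expresses time tags as sample indices rather than seconds, and uses a brute-force cross-correlation over a small lag range. The function names (best_lag, calibrate_tags, segment) and the toy signals are hypothetical.

```python
def best_lag(first, second, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means the content of `first` appears `lag` samples
    later in `second`. Brute force, so only suitable for short signals.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(first):
            j = i + lag
            if 0 <= j < len(second):
                score += x * second[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def calibrate_tags(tags, lag, length):
    """Shift every (start, end) tag by `lag`, clamped to [0, length]."""
    def clamp(v):
        return max(0, min(length, v))
    return [(clamp(start + lag), clamp(end + lag)) for start, end in tags]


def segment(speech, tags):
    """Cut `speech` into one slice per (start, end) tag."""
    return [speech[start:end] for start, end in tags]


# Toy first speech: two segments spliced together, with their time tags.
seg_a = [1.0, -2.0, 3.0, -1.0]
seg_b = [0.5, 4.0, -3.0, 2.0, -0.5]
first = seg_a + seg_b
tags = [(0, len(seg_a)), (len(seg_a), len(first))]

# Toy second speech: the first speech re-recorded with a 3-sample delay,
# some attenuation, and trailing silence (all assumed for illustration).
delay = 3
second = [0.0] * delay + [0.8 * x for x in first] + [0.0, 0.0]

lag = best_lag(first, second, max_lag=5)
new_tags = calibrate_tags(tags, lag, len(second))
second_segments = segment(second, new_tags)
print(lag, new_tags)
```

A single global offset cannot compensate for a recording clock that drifts over time, which the summary above names as one of the problems being solved; handling drift would require estimating the lag more locally, for example per segment.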

Description

【Technical field】

[0001] The present invention relates to the field of computer application technology, in particular to a method, device, equipment and computer storage medium for voice segmentation.

【Background technique】

[0002] With the rapid development of artificial intelligence technology, voice has become the main mode of interaction with artificial intelligence because it is convenient and barrier-free. As near-field speech recognition technology has gradually matured, far-field speech recognition has become a topic of growing interest. With far-field speech recognition, users can interact by voice with smart devices, such as smart TVs and smart speakers, from a relatively long distance.

[0003] Far-field speech recognition is realized through a far-field acoustic model, and a large amount of far-field speech data is required to train the far-field acoustic model. At this stage, the real data of far...
