
Telephone recording annotation method and system, storage medium and electronic equipment

This telephone recording annotation technology, applied in the field of audio signal processing, addresses the low efficiency of manual audio annotation and the resulting inconvenience for speech recognition and speech synthesis training. It reduces the time spent on manual audio annotation and improves performance.

Pending Publication Date: 2020-06-19
上海携程国际旅行社有限公司

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to provide a telephone recording annotation method, system, storage medium and electronic equipment that overcome two defects of the prior art: the low efficiency of manual audio annotation, and the resulting inconvenience for later speech recognition and speech synthesis training. The cut recordings and semi-automatically annotated text can then be used for speech recognition and speech synthesis training of an intelligent customer service system, making it possible to customize and expand its training and test sample sets.



Examples


Embodiment 1

[0049] This embodiment relates to a semi-automatic annotation method for customer service recordings. It belongs to the field of audio signal processing, specifically to the audio preprocessing and annotation stages. Endpoint detection, a technique from speech signal processing, is used to find the effective speech segments in a long recording; the recording is then cut and each segment is recognized; finally, a person listens to the segments and corrects any misrecognized text.
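The endpoint detection step in [0049] can be illustrated with a minimal sketch. It assumes a simple short-time energy criterion, which the source does not specify; the function name `detect_speech_segments`, the threshold, the frame length of 160 samples and the minimum segment length are all hypothetical values chosen for illustration, not parameters from the patent.

```python
from typing import List, Sequence, Tuple


def detect_speech_segments(
    samples: Sequence[int],
    frame_len: int = 160,            # assumed: 20 ms frames at 8 kHz
    energy_threshold: float = 1e5,   # assumed: mean squared amplitude per frame
    min_speech_frames: int = 3,      # assumed: ignore bursts shorter than this
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of segments judged to be speech."""
    segments: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the current speech run
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= energy_threshold:
            if start is None:
                start = i  # a speech run begins
        elif start is not None:
            # a speech run ends; keep it only if it is long enough
            if i - start >= min_speech_frames:
                segments.append((start * frame_len, i * frame_len))
            start = None
    # close a run that lasts until the end of the signal
    if start is not None and n_frames - start >= min_speech_frames:
        segments.append((start * frame_len, n_frames * frame_len))
    return segments
```

Each returned pair marks one candidate clip to cut and pass to recognition. A production system would more likely use a trained or statistical VAD than a fixed energy threshold.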

[0050] The cut and annotated audio can be used both for speech recognition, to obtain the content of customer service recordings, and as a training corpus for speech synthesis. Synthesized speech lets the intelligent customer service speak naturally, like a human. Combined, the two can be used in enterprise customer service centers, especially the intelligent customer service of a travel service center, which can reduce a lot of labor costs and greatly impro...

Embodiment 2

[0075] This embodiment provides a telephone recording annotation system that executes the method described in Embodiment 1. As shown in Figure 5, it includes:

[0076] An audio processing module 1, used to obtain the audio file of a telephone recording and to perform channel separation and format conversion on the audio file;

[0077] It includes: a channel separation module 11, used to separate the audio file into a left channel and a right channel and to save the separated left-channel and right-channel audio data;

[0078] And a format conversion module 12, configured to convert the sampling frequency, bit width and encoding format of the left-channel and right-channel audio data.
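As a rough illustration of the channel separation module 11, the sketch below splits a stereo WAV file into left and right channels and saves each as a mono file, using only the Python standard library. It assumes 16-bit PCM input, which the source does not state; the function names are hypothetical. The sampling-frequency and encoding conversion performed by module 12 is not shown.

```python
import array
import sys
import wave
from typing import Tuple


def split_stereo_wav(path: str) -> Tuple[array.array, array.array, int]:
    """Read a 16-bit stereo PCM WAV file; return (left, right, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo PCM")  # assumption of this sketch
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    data = array.array("h")
    data.frombytes(raw)
    if sys.byteorder == "big":
        data.byteswap()  # WAV samples are little-endian
    # Stereo frames are interleaved: L0, R0, L1, R1, ...
    return data[0::2], data[1::2], rate


def write_mono_wav(path: str, samples: array.array, rate: int) -> None:
    """Save one channel as a 16-bit mono PCM WAV file."""
    out = array.array("h", samples)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(out.tobytes())
```

Calling `split_stereo_wav` and then `write_mono_wav` once per channel yields the two saved files that [0077] describes.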

[0079] A cutting module 2, used to cut the processed audio file by the VAD (voice activity detection) method;

[0080] It includes: an initialization module 21, used to initialize the VAD parameters, which include the frame length;

[0081] ...

Embodiment 3

[0087] This embodiment provides a computer-readable storage medium, on which a computer program is stored. When the program is executed by a processor, the steps of the method for annotating a telephone recording provided in Embodiment 1 are realized.

[0088] The readable storage medium may more specifically include, but is not limited to: a portable disk, hard disk, random access memory, read-only memory, erasable programmable read-only memory, optical storage device, magnetic storage device, or any suitable combination of the above.

[0089] In a possible implementation, the present invention can also be implemented as a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the telephone recording annotation method of Embodiment 1.

[0090] Wherein, the program code for executing the present invention can be written in an...


Abstract

The invention discloses a telephone recording annotation method and system, a storage medium and electronic equipment. The method comprises the following steps: acquiring an audio file of a telephone recording, and performing channel separation and format conversion on the audio file; cutting the processed audio file into a plurality of audio clips by a VAD method; invoking a speech recognition interface to recognize each audio clip as text; and performing error correction on the text to generate an annotation file. The invention realizes automatic annotation of recording data, reduces the time spent on manually annotating audio, and makes the annotated audio and text better suited to speech recognition and speech synthesis scenarios.
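The last two steps of the abstract, recognition and annotation file generation, might be wired together as in the sketch below. The source does not describe the speech recognition interface or the annotation file layout, so both are assumptions here: `recognize` is a hypothetical callable that wraps whatever recognition service is used, and the tab-separated line format (clip name, start time, end time, text) is illustrative only.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Segment = Tuple[int, int]  # (start, end) sample indices of one clip


def build_annotations(
    samples: Sequence[int],
    segments: Iterable[Segment],
    recognize: Callable[[Sequence[int]], str],  # hypothetical ASR wrapper
    sample_rate: int = 8000,                    # assumed sampling rate
) -> List[str]:
    """Return one annotation line per clip: name, start (s), end (s), text."""
    lines: List[str] = []
    for idx, (start, end) in enumerate(segments):
        text = recognize(samples[start:end])  # draft text, to be corrected later
        lines.append(
            f"clip_{idx:04d}\t{start / sample_rate:.2f}\t"
            f"{end / sample_rate:.2f}\t{text}"
        )
    return lines
```

In this layout, the error correction step would amount to a reviewer listening to each clip and editing the last column before the file is used for training.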

Description

Technical field

[0001] The invention relates to the field of audio signal processing, in particular to a telephone recording annotation method, system, storage medium and electronic equipment.

Background

[0002] Language is the most important carrier of human thought and the most effective, convenient and natural way for people to communicate. Spoken human-computer communication mainly involves speech recognition and speech synthesis. Speech recognition technology allows machines to receive, recognize and understand voice signals and convert them into corresponding digital signals; speech synthesis technology gives machines an "artificial mouth", addressing the problem of how to make machines speak like humans. Speech recognition (Automatic Speech Recognition, ASR) and speech synthesis (Text to Speech, TTS) require a large amount of corpus for training in the e...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/04, G10L15/26, G10L25/78, G06F40/232, G06Q30/00
CPC: G10L15/04, G10L15/26, G10L25/78, G06Q30/01
Inventor: 袁鹏江文斌李健
Owner: 上海携程国际旅行社有限公司