Corpus expansion and speech synthesis system construction method and device based on artificial intelligence

A technology in the fields of artificial intelligence and speech synthesis, applied to speech synthesis, speech analysis, special data processing applications, etc. It addresses the high cost of hiring voice substitutes, unsatisfactory results, and poor speech synthesis quality, thereby saving manpower, material resources, and time while improving speech synthesis quality.

Active Publication Date: 2018-09-25
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0007] However, both of the above methods have problems in practical applications. With the former method, although it has some effect, the acoustic characteristics of the base speaker and the target speaker, such as spectrum and fundamental frequency, still differ considerably. In addition, for some speakers with heavy accents it is difficult to find a closely matching speaker in the existing large-sample sound library, so the results in actual use are not ideal and the speech synthesis quality is poor. With the latter method, hiring a voice substitute can make up for the lack of corpus, but voice substitutes often cost much more than ordinary speakers, it is difficult to find substitutes for celebrities with highly recognizable timbres, and the search itself is time-consuming.


Examples

Embodiment Construction

[0060] Since the WaveNet model, which can model waveforms directly, was proposed in 2016, it has received extensive attention from industry and academia. In particular, the Chinese WaveNet model independently improved and built by Baidu can quickly build a well-performing speech synthesis system from a small corpus of only tens of minutes. It restores the speaker's timbre well, its sound quality is close to that obtained from a large-sample sound library, and it does not suffer from problems such as incoherence or unnaturalness.

[0061] However, because the WaveNet model predicts sample points one by one, its real-time performance is poor; the running time cannot meet real-time requirements, so the model cannot be applied directly in an online system.
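The cost of one-by-one prediction can be illustrated with a minimal sketch. The predictor below is a hypothetical stand-in (a simple weighted average), not WaveNet itself, and the 16 kHz sampling rate is an assumption chosen for illustration; the point is only that each output sample depends on the previous ones, so the generation loop cannot be parallelized across time.

```python
def toy_next_sample(history, receptive_field=4):
    """Hypothetical stand-in for one model forward pass: predicts the next
    sample from the most recent `receptive_field` samples."""
    context = history[-receptive_field:]
    return 0.9 * sum(context) / len(context)


def generate(seed, num_samples):
    """Autoregressive generation: step t needs the output of step t-1,
    so the model is evaluated once per output sample, strictly in order."""
    samples = list(seed)
    steps = 0
    for _ in range(num_samples):
        samples.append(toy_next_sample(samples))
        steps += 1
    return samples[len(seed):], steps


SAMPLE_RATE_HZ = 16000  # assumed sampling rate, not stated in the source

audio, steps = generate(seed=[1.0, 0.5, 0.25, 0.125], num_samples=SAMPLE_RATE_HZ)
print(f"{steps} sequential model evaluations for 1 second of audio")
```

With a real network in place of `toy_next_sample`, every one of those sequential evaluations is a full forward pass, which is why generation time tends to exceed the duration of the audio produced.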

[0062] However, given that the WaveNet model has advantages such as a high degree of restoration,...

Abstract

The invention discloses a method and device, based on artificial intelligence, for corpus expansion and for constructing a speech synthesis system. The method comprises the following steps: training a WaveNet model on the corpus in a small-sample sound library; using the WaveNet model to generate the speech waveform corresponding to a given text; adding the corpus corresponding to the generated speech waveforms to the small-sample sound library to obtain a large-sample sound library; and using the corpus in the large-sample sound library to construct a statistical parametric speech synthesis system. This scheme improves speech synthesis quality and saves manpower, material resources, and time.
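The four steps of the abstract can be sketched as a small data-flow example. This is only an illustration under stated assumptions: `Utterance`, `train_wavenet`, `expand_corpus`, and `build_parametric_system` are hypothetical names, and both the WaveNet training and the statistical parametric system are replaced by placeholders so that the flow of corpus data is runnable on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Utterance:
    text: str
    waveform: List[float]
    synthetic: bool  # True if generated by the model rather than recorded


def train_wavenet(corpus: List[Utterance]) -> Callable[[str], List[float]]:
    """Placeholder for step 1 (training on the small-sample library).
    Returns a function mapping text to a waveform."""
    def synthesize(text: str) -> List[float]:
        # Dummy waveform: one zero-valued sample per character.
        return [0.0] * len(text)
    return synthesize


def expand_corpus(small_library: List[Utterance],
                  new_texts: List[str]) -> List[Utterance]:
    """Steps 2 and 3: generate waveforms for the given texts and add them
    to the small-sample library to obtain a large-sample library."""
    synthesize = train_wavenet(small_library)
    generated = [Utterance(t, synthesize(t), True) for t in new_texts]
    return list(small_library) + generated


def build_parametric_system(large_library: List[Utterance]) -> dict:
    """Placeholder for step 4; a real system would train acoustic models.
    Here it only summarizes the corpus it was given."""
    return {
        "num_utterances": len(large_library),
        "num_synthetic": sum(u.synthetic for u in large_library),
    }


small = [Utterance("hello", [0.1, 0.2], False)]
large = expand_corpus(small, ["good morning", "thank you"])
system = build_parametric_system(large)
print(system)
```

Keeping a `synthetic` flag on each utterance is a design choice of this sketch, not something the source specifies; it simply makes it possible to tell recorded and generated corpus apart after the libraries are merged.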

Description

【Technical field】

[0001] The invention relates to computer application technology, in particular to an artificial intelligence-based method and device for corpus expansion and speech synthesis system construction.

【Background art】

[0002] Artificial Intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes robotics, language recognition, image recognition, natural language processing and expert systems, etc.

[0003] In the speech synthesis technology, it is necessary to record the corpus for the speaker. The recording process needs to be carried out...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L13/08, G10L25/30, G06F17/30, G06F17/18
CPC: G06F17/18, G10L13/08, G10L25/30
Inventor: 顾宇, 王振宇, 李昊, 康永国
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD