Speech synthesis method and speech synthesis device

A speech synthesis technology, applied in speech synthesis, speech analysis, instruments, etc., that addresses the incongruity and unpleasantness users feel, with the stated effects of reducing sound quality and suppressing the sense of noise.

Inactive Publication Date: 2005-11-02

PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA

1 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

Responding in a flat, so-called synthetic voice regardless of the situation makes the user feel a sense of incongruity and unpleasantness.

Examples

First embodiment

[0046] FIG. 1 shows the configuration of a voice interactive interface according to the first embodiment. The interface sits between digital information equipment (such as a digital TV or a car navigation system) and the user, and supports the user's operation of the equipment by exchanging information (dialogue) with the user through voice. The interface includes a voice recognition unit 10, a dialogue processing unit 20 and a voice synthesis unit 30.

[0047] The voice recognition unit 10 recognizes a user's voice.

[0048] The dialogue processing unit 20 sends a control signal corresponding to the recognition result from the voice recognition unit 10 to the digital information equipment. It also sends to the voice synthesis unit 30 a response message (text), determined according to the recognition result from the voice recognition unit 10 and/or a control signal from the digital information equipment, together with a signal that controls the emotion given to that response message.
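
The data flow among units 10, 20 and 30 described above can be sketched as follows. This is only an illustrative outline: the names `DialogueOutput`, `dialogue_processing` and `run_interface`, the example commands and the emotion labels are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DialogueOutput:
    device_control: Optional[str]  # control signal for the digital information equipment
    response_text: str             # response message (text) for the synthesis unit
    emotion: str                   # emotion control signal for the synthesis unit


def dialogue_processing(recognized: str, device_status: Optional[str] = None) -> DialogueOutput:
    """Toy stand-in for unit 20; the rules below are invented for illustration."""
    if device_status == "error":
        return DialogueOutput(None, "Sorry, that did not work.", "apologetic")
    if recognized == "volume up":
        return DialogueOutput("VOLUME_UP", "Turning the volume up.", "neutral")
    return DialogueOutput(None, "Could you repeat that?", "neutral")


def run_interface(audio: bytes,
                  recognize: Callable[[bytes], str],
                  synthesize: Callable[[str, str], object],
                  send_to_device: Callable[[str], None]) -> object:
    """Recognition (unit 10) -> dialogue processing (unit 20) -> synthesis (unit 30)."""
    recognized = recognize(audio)
    out = dialogue_processing(recognized)
    if out.device_control is not None:
        send_to_device(out.device_control)
    # The synthesis unit receives both the text and the emotion control signal.
    return synthesize(out.response_text, out.emotion)
```

The recognizer, synthesizer and device link are passed in as callables so that the sketch stays independent of any particular speech engine.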

[0049] Th...

Second embodiment

[0100] In the first embodiment, phase shaping and high-frequency-band phase diffusion are performed in separate steps. This separation makes it possible to apply some other operation to the pitch waveform once it has been shaped by phase shaping. The second embodiment is characterized in that the data storage capacity is reduced by grouping the shaped pitch waveforms into clusters.
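
The storage saving from clustering can be illustrated with a minimal sketch. The excerpt does not say which clustering algorithm is used, so a simple leader (threshold) clustering is swapped in here; the function names, the Euclidean distance and the `threshold` parameter are assumptions. It also assumes all pitch waveforms have the same length.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length waveforms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_pitch_waveforms(waveforms, threshold):
    """Leader clustering: return (representatives, assignments).

    Each waveform joins the first representative within `threshold`;
    otherwise it becomes a new representative. Only the representatives
    and one integer index per waveform then need to be stored.
    """
    representatives = []
    assignments = []
    for w in waveforms:
        for idx, rep in enumerate(representatives):
            if distance(w, rep) <= threshold:
                assignments.append(idx)
                break
        else:
            representatives.append(list(w))
            assignments.append(len(representatives) - 1)
    return representatives, assignments
```

With phase fluctuation removed beforehand, similar pitch waveforms lie closer together, so fewer representatives are needed for the same threshold.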

[0101] The interface according to the second embodiment includes a speech synthesis unit 40 shown in FIG. 16 instead of the speech synthesis unit 30 shown in FIG. 1 . Other constituent elements are the same as those shown in FIG. 1 . The speech synthesis unit 40 shown in FIG.

[0102] The representative pitch waveform obtained by the device shown in FIG. 17( a ) (a device separate from the voice interactive interface) is stored in advance in the representative pitch waveform DB 42 . In the apparatus shown in FIG. 17( a ), a waveform DB 34 is provided, the output of which is connected to the ...

Third embodiment

[0107] The storage capacity reduction brought about by clustering, that is, the improvement in clustering efficiency, is obtained not only by shaping the pitch waveforms through removal of phase fluctuation but also by normalizing their amplitude and time length. In the third embodiment, a step of normalizing the amplitude and the time length is provided when the pitch waveforms are stored. In addition, when a pitch waveform is read out, its amplitude and time length are converted as appropriate for the speech being synthesized.
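
A minimal sketch of normalizing on storage and converting on readout is given below. The excerpt does not specify how amplitude or time length are normalized, so peak normalization and linear-interpolation resampling are assumptions, as are the names `resample`, `normalize` and `deform`.

```python
def resample(waveform, new_len):
    """Resample to new_len samples by linear interpolation."""
    n = len(waveform)
    if n == 1 or new_len == 1:
        return [waveform[0]] * new_len
    out = []
    for i in range(new_len):
        pos = i * (n - 1) / (new_len - 1)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(waveform[lo] * (1 - frac) + waveform[hi] * frac)
    return out


def normalize(waveform, norm_len):
    """Scale to unit peak and a fixed length; return (normalized, peak, length)."""
    peak = max(abs(v) for v in waveform)
    scaled = [v / peak for v in waveform] if peak > 0 else list(waveform)
    return resample(scaled, norm_len), peak, len(waveform)


def deform(normalized, target_peak, target_len):
    """On readout, convert amplitude and time length to suit the synthesized speech."""
    return [v * target_peak for v in resample(normalized, target_len)]
```

Because every stored waveform has the same length and peak, waveforms that differ only in loudness or duration fall into the same cluster.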

[0108] The interface according to the third embodiment includes a speech synthesis unit 50 shown in FIG. 18( a ) instead of the speech synthesis unit 30 shown in FIG. 1 . Other constituent elements are the same as those shown in FIG. 1 . The speech synthesis unit 50 shown in FIG. 18( a ) further adds a deformation unit 51 to the constituent elements of the speech synthesis unit 40 shown in FIG. 16 . The deformation unit 51 is provided between the pitch w...

Abstract

A language processing portion ( 31 ) analyzes a text from a dialogue processing section ( 20 ) and transforms the text to information on pronunciation and accent. A prosody generation portion ( 32 ) generates an intonation pattern according to a control signal from the dialogue processing section ( 20 ). A waveform DB ( 34 ) stores prerecorded waveform data together with pitch mark data imparted thereto. A waveform cutting portion ( 33 ) cuts desired pitch waveforms from the waveform DB ( 34 ). A phase operation portion ( 35 ) removes phase fluctuation by standardizing phase spectra of the pitch waveforms cut by the waveform cutting portion ( 33 ), and afterwards imparts phase fluctuation by diffusing only high phase components randomly according to the control signal from the dialogue processing section ( 20 ). The thus-produced pitch waveforms are placed at desired intervals and superimposed.
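
The phase operation and the final superposition described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: phase standardization is taken to mean setting every phase to zero, the high band is taken to start at a DFT bin `cutoff_bin`, and the random phase is drawn uniformly from `[-spread, spread]`, where `spread` stands in for whatever the control signal from the dialogue processing section selects. All function and parameter names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (adequate for short pitch waveforms)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def phase_operation(pitch_waveform, cutoff_bin, spread, rng):
    """Remove phase fluctuation, then diffuse only high-band phases randomly."""
    n = len(pitch_waveform)
    magnitudes = [abs(c) for c in dft(pitch_waveform)]
    phases = [0.0] * n  # standardized phase (assumed here to be zero phase)
    # Randomize bins at or above cutoff_bin, mirroring each phase onto the
    # negative-frequency bin so the output stays real. DC and Nyquist stay at 0.
    for k in range(max(1, cutoff_bin), (n - 1) // 2 + 1):
        phases[k] = rng.uniform(-spread, spread)
        phases[n - k] = -phases[k]
    shaped = [m * cmath.exp(1j * p) for m, p in zip(magnitudes, phases)]
    return idft(shaped)


def overlap_add(waveforms, period, total_len):
    """Place pitch waveforms `period` samples apart and superimpose them."""
    out = [0.0] * total_len
    for i, w in enumerate(waveforms):
        start = i * period
        for j, v in enumerate(w):
            if start + j < total_len:
                out[start + j] += v
    return out
```

The magnitude spectrum is left untouched, so only the phase of the high band changes; setting `spread` to 0 reproduces the fully standardized waveform.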

Description

Technical field

[0001] The invention relates to a method and device for artificially synthesizing speech.

Background art

[0002] In recent years, information equipment using digital technology has rapidly become more powerful and more complex. A voice interactive interface is one type of user interface intended to let users operate such digital information equipment easily. The voice interactive interface realizes the desired operation of the equipment by exchanging information (dialogue) with the user by voice, and is beginning to be installed in car navigation systems and digital televisions.

[0003] The dialogue realized through a voice interactive interface is a dialogue between a user (a person), who has emotions, and a system (a machine), which does not. Responding in a flat, so-called synthetic voice regardless of the situation makes the user feel a sense of incongruity and unpleasantness. For a voice-conversational interface to be comfortable f...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L13/033, G10L13/00, G10L13/02, G10L13/06, G10L13/10

CPC: G10L13/10, G10L13/07

Inventor: 釜井孝浩, 加藤弓子

Owner: PANASONIC INTELLECTUAL PROPERTY CORP OF AMERICA
