Device and method for generating voice animation from audio signal

A technology for generating voice animation from an audio signal, applied in the field of voice animation devices. It addresses problems such as decision-making and achieves high accuracy with a simple execution method.

Pending Publication Date: 2021-12-03
TCL CORPORATION

AI Technical Summary

Problems solved by technology

However, when audio is the only input available, for example because of resource constraints or privacy concerns, the quality of the results depends largely on real-time phoneme recognition, in particular its accuracy and latency.

Embodiment Construction

[0046] The technical solutions of the embodiments of the present invention are described below with reference to the accompanying drawings. The described embodiments are only some, but not all, embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0047] Several recent works demonstrate that believable voice animation is achievable. In some works, the audio input is divided into small segments called frames, and basic frequency features are extracted from each frame. Phonemes are the perceptually distinct units of sound in a given language that distinguish one word from another. Phonemes are predicted by identifying vowels and basic fricative consonants from the features, and are then mapped to corresponding static animations called visemes, i.e. the counterparts of phonemes in the visual domain....
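The phoneme-to-viseme mapping described in [0047] can be illustrated with a minimal Python sketch. The mapping table, the viseme names, the 10 ms frame length, and the function name `phonemes_to_viseme_track` are all assumptions made for illustration; none of them come from the patent text. The sketch takes one phoneme label per frame, looks up a viseme for each, and merges consecutive frames that share a viseme into timed segments.

```python
from itertools import groupby

# Hypothetical phoneme-to-viseme table (illustrative only, not from the patent).
PHONEME_TO_VISEME = {
    "AA": "open", "IY": "wide", "UW": "round",
    "F": "teeth_on_lip", "V": "teeth_on_lip",
    "M": "closed", "B": "closed", "P": "closed",
    "sil": "rest",
}


def phonemes_to_viseme_track(frame_phonemes, frame_ms=10):
    """Collapse per-frame phoneme labels into (viseme, start_ms, end_ms) segments.

    frame_ms is an assumed frame duration; unknown phonemes fall back to "rest".
    """
    visemes = [PHONEME_TO_VISEME.get(p, "rest") for p in frame_phonemes]
    track = []
    start = 0  # index of the first frame in the current run
    for viseme, run in groupby(visemes):
        n = len(list(run))
        track.append((viseme, start * frame_ms, (start + n) * frame_ms))
        start += n
    return track


if __name__ == "__main__":
    for segment in phonemes_to_viseme_track(["sil", "M", "M", "AA", "AA", "AA", "P"]):
        print(segment)
```

Because several phonemes can share one viseme (here "M", "B" and "P"), adjacent frames with different phonemes may still merge into a single segment, which keeps the animation track short.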

Abstract

The invention discloses a device and a method for generating a voice animation from an audio signal. The method comprises the following steps: receiving the audio signal; converting the received audio signal into frequency-domain audio features; inputting the frequency-domain audio features into a neural network that performs neural network processing to identify phonemes, wherein the neural network is trained using a phoneme data set comprising audio signals with corresponding accurate phoneme labels; and generating a voice animation from the recognized phonemes.
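As a rough illustration of the first two steps of the abstract (receiving audio and converting it into frequency-domain features), the following Python sketch splits a signal into overlapping frames and computes a magnitude spectrum per frame. The abstract does not say which frequency-domain feature is used, so the Hann-windowed DFT magnitudes here are a stand-in; the frame length, hop size, and all function names are assumptions. The neural network and animation steps are not sketched.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 of one Hann-windowed frame."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def audio_to_features(samples, frame_len=64, hop=32):
    """Return one frequency-domain feature vector per frame (assumed sizes)."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, frame_len, hop)]


if __name__ == "__main__":
    # Synthetic test tone: exactly 8 cycles per 64-sample frame.
    tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(128)]
    features = audio_to_features(tone)
    peak_bin = max(range(len(features[0])), key=features[0].__getitem__)
    print(len(features), "frames, peak at bin", peak_bin)
```

The naive DFT is quadratic in the frame length and is used only to keep the sketch dependency-free; a real-time system would normally use an FFT routine instead.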

Description

Technical field

[0001] The invention relates to the field of animation generation, in particular to a device and method for generating voice animation.

Background art

[0002] Avatars are very popular these days in streaming video, gaming, and virtual reality applications. These characters interact with each other in the virtual world, and sometimes with viewers or players in the real world. Humans are very sensitive to facial artifacts and to incongruous or out-of-sync performance by virtual characters, which makes facial animation, and voice animation in particular, very challenging, since the animation involves both voice and mouth movements.

[0003] In realistic voice animation, artists seek the most immersive experience with highly anthropomorphic virtual characters. However, high production costs and large data requirements, including audio, video and possibly 3D models, greatly limit its potential to scale. In the case of an o...

Application Information

IPC(8): G10L21/10, G10L25/24, G10L15/14, G10L15/16
CPC: G10L21/10, G10L25/24, G10L15/16, G10L15/142, G10L2015/025, G10L15/02, G10L13/047, G10L25/30
Inventor: 郁子潇, 汪灏泓
Owner: TCL CORPORATION