
Methods and systems for image and voice processing

A voice and image processing technology, applied in the field of systems and techniques for digital image and voice processing. It addresses the problems that conventional techniques for processing computer-generated videos require large amounts of computer resources and an inordinate amount of time, and achieves the effects of reducing the number of images required and minimizing or reducing errors.

Active Publication Date: 2021-02-25
NEON EVOLUTION INC
Cites: 7 | Cited by: 24

AI Technical Summary

Benefits of technology

This patent describes a way to use neural networks to classify and match faces, as well as a method to change the voice in a recording without affecting the recording's overall structure and content. The invention uses a one-shot or few-shot architecture, meaning a small number of example images is used to train the neural networks for later classification, in contrast to other methods that may require thousands of facial images for training. The patent also describes how a system can use an autoencoder to automatically generate a latent voice representation from a training voice sample, which can then be used to perform the voice change. The invention is useful for applications such as voice swapping and audio re-recording.
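The latent voice representation mentioned above can be illustrated with a small autoencoder that compresses frames of a training voice sample and learns to reconstruct them. The sketch below is a minimal, hypothetical example assuming mel-spectrogram frames and arbitrary layer sizes; the patent does not specify this particular network.

```python
# Hypothetical sketch: deriving a latent voice representation with an autoencoder.
# Input shapes and layer sizes are illustrative assumptions, not taken from the patent.
import torch
import torch.nn as nn

class VoiceAutoencoder(nn.Module):
    def __init__(self, n_mels=80, latent_dim=128):
        super().__init__()
        # Encoder compresses a mel-spectrogram frame into a latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(n_mels, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder attempts to reconstruct the input frame from the latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, n_mels),
        )

    def forward(self, mel_frames):
        latent = self.encoder(mel_frames)   # latent voice representation
        recon = self.decoder(latent)        # reconstruction used for the training loss
        return latent, recon

# Training sketch: minimize reconstruction error on frames of the training voice sample.
model = VoiceAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
mel_frames = torch.randn(32, 80)            # stand-in for real mel-spectrogram frames
for _ in range(10):
    latent, recon = model(mel_frames)
    loss = nn.functional.mse_loss(recon, mel_frames)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```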

Problems solved by technology

Conventional techniques for processing computer-generated videos may require large amounts of computer resources and take an inordinate amount of time.



Examples


Embodiment Construction

[0035]As discussed above, conventional techniques for processing computer generated videos require large amounts of computer resources and take an inordinate amount of time. Further, certain relatively new applications for digital image processing, such as face-swapping, are becoming ever more popular, creating further demand for computer resources.

[0036]Conventionally, face-swapping is performed by capturing an image or a video of a person (sometimes referred to as the source) whose face is to be used to replace a face of another person in a destination video. For example, a face region in the source image and the destination image may be recognized, the face region from the source may be used to replace the face region in the destination, and an output image/video is generated. The source face in the output preserves the expressions of the face in the original destination image/video (e.g., has lip motions, eye motions, eyelid motions, eyebrow motions, nostril flaring, etc.). If insuff...
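As a rough illustration of the conventional region-replacement step described above, the sketch below locates a face region in the source and destination images (using a hypothetical detector stub named detect_face_box) and copies the resized source region over the destination region. A naive pixel copy like this does not by itself preserve the destination's expressions; the expression-preserving swap discussed in this document relies on a trained model rather than direct copying.

```python
# Minimal, hypothetical sketch of the region detection and replacement step.
import numpy as np

def detect_face_box(image: np.ndarray):
    """Hypothetical stand-in for a real face detector; returns (top, left, height, width)."""
    h, w = image.shape[:2]
    return h // 4, w // 4, h // 2, w // 2   # placeholder: assume the face occupies the center

def swap_face_region(source: np.ndarray, destination: np.ndarray) -> np.ndarray:
    """Copy the detected source face region over the detected destination face region."""
    st, sl, sh, sw = detect_face_box(source)
    dt, dl, dh, dw = detect_face_box(destination)
    face = source[st:st + sh, sl:sl + sw]
    # Resize the source face to fit the destination region (nearest-neighbor, for brevity).
    rows = np.linspace(0, sh - 1, dh).astype(int)
    cols = np.linspace(0, sw - 1, dw).astype(int)
    resized = face[rows][:, cols]
    output = destination.copy()
    output[dt:dt + dh, dl:dl + dw] = resized
    return output

# Usage sketch with random stand-in images.
source_img = np.random.rand(256, 256, 3)
destination_img = np.random.rand(192, 192, 3)
output_img = swap_face_region(source_img, destination_img)
```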



Abstract

Systems and methods are disclosed that are configured to train an autoencoder using images that include faces, wherein the autoencoder comprises an input layer, an encoder configured to output a latent image from a corresponding input image, and a decoder configured to attempt to reconstruct the input image from the latent image. An image sequence of a face exhibiting a plurality of facial expressions and transitions between facial expressions is generated and accessed. Images of the plurality of facial expressions and transitions between facial expressions are captured from a plurality of different angles and using different lighting. An autoencoder is trained using source images that include the face with different facial expressions captured at different angles with different lighting, and using destination images that include a destination face. The trained autoencoder is used to generate an output in which the likeness of the face in the destination images is swapped with the likeness of the source face, while preserving the expressions of the destination face.
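One common way to realize the swap described in the abstract is a shared encoder with a separate decoder per identity; the abstract itself only specifies an encoder and a decoder, so the layout below is an assumption for illustration. Training minimizes reconstruction error on both the source and destination image sets; at inference, a destination face is encoded and decoded with the source decoder, producing the source likeness while the destination expression is carried through the latent image.

```python
# Sketch of a shared-encoder face-swap autoencoder (an assumed arrangement, not the
# patent's specified architecture). Sizes are arbitrary for illustration.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())

def deconv_block(c_in, c_out):
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1), nn.ReLU())

class FaceSwapAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder maps a 64x64 RGB face crop to a latent image.
        self.encoder = nn.Sequential(conv_block(3, 32), conv_block(32, 64), conv_block(64, 128))
        # One decoder per identity; both learn from the same latent space.
        self.decoder_src = nn.Sequential(
            deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())
        self.decoder_dst = nn.Sequential(
            deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x, identity):
        latent = self.encoder(x)
        decoder = self.decoder_src if identity == "src" else self.decoder_dst
        return decoder(latent)

model = FaceSwapAutoencoder()
src_batch = torch.rand(4, 3, 64, 64)   # source faces: varied expressions, angles, lighting
dst_batch = torch.rand(4, 3, 64, 64)   # destination faces
# Joint reconstruction loss, backpropagated during training.
loss = (nn.functional.mse_loss(model(src_batch, "src"), src_batch)
        + nn.functional.mse_loss(model(dst_batch, "dst"), dst_batch))
# Inference: encode destination faces, decode with the source decoder to swap likeness.
swapped = model(dst_batch, "src")
```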

Description

INCORPORATION BY REFERENCE TO ANY PRIORITY APPLICATIONS

[0001] Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57.

BACKGROUND OF THE INVENTION

Field of the Invention

[0002] This document relates to systems and techniques for digital image and voice processing.

Description of the Related Art

[0003] Conventional techniques for processing computer-generated videos may require large amounts of computer resources and take an inordinate amount of time. Hence, more computer resource-efficient and time-efficient techniques are needed to perform advanced forms of digital image processing, such as face-swapping.

SUMMARY

[0004] The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither ident...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06K9/62, G06K9/00, G06T1/20, G06T7/32, G06V10/764, G06V10/774
CPC: G06K9/6256, G06K9/00302, G06T2207/20081, G06T1/20, G06T7/32, G06K9/00268, G06T11/001, G06T11/60, G06T13/80, G10L2021/0135, G10L2021/105, G10L21/0356, G06T17/00, G06T13/40, G06N3/084, G06N3/088, G06V40/168, G06V40/174, G06V10/454, G06V10/82, G06V10/764, G06V10/774, G06N3/045, G06F18/214
Inventors: BERLIN, CODY GUSTAVE; BOGAN, III, CARL DAVIS; LANDE, KENNETH MICHAEL; LASER, JACOB MYLES; LEE, BRIAN SUNG; ØLAND, ANDERS
Owner: NEON EVOLUTION INC