Hybrid audio representations for editing audio content

A technique for representing audio content, applied to editing audio. It addresses the unintuitive user experience, the difficulty of interacting with a waveform to perform edits, and the inconvenience of conventional tools, and makes viewing and editing audio simple and intuitive.

Active Publication Date: 2017-06-15
ADOBE SYST INC

AI Technical Summary

Benefits of technology

[0008] Embodiments of the present disclosure provide benefits and/or solve one or more of the foregoing or other problems in the art with systems and methods for displaying and editing multimedia, particularly audio. For example, the disclosed systems and methods provide a hybrid waveform display that includes waveforms inline with text converted from recognizable speech, which makes viewing and editing audio simple and intuitive to a user. Specifically, the hybrid waveform includes text corresponding to recognizable speech and waveforms of non-recognizable audio. Further, the systems and methods can display the waveforms inline with the converted text, such that audio information from the waveforms is displayed in connection with the recognizable speech.
[0010] The disclosed systems and methods provide a number of benefits over conventional audio editing systems and methods. For example, the systems and methods provide a user with a display that enables the user to quickly ascertain the context of each portion of audio, which improves the user's ability to edit the audio and/or corresponding text. In particular, the disclosed graphical user interfaces provide speech-recognizable portions of audio as text along with audio information, in the form of waveforms, for portions of the audio that are not speech-recognizable. For instance, a displayed waveform can indicate a long pause, applause, a loud noise, music, or other sounds in the audio content that are not recognized as speech. Further, because waveforms are displayed inline with converted text, the waveforms provide additional context to the surrounding text.
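As a rough illustration of how a non-speech portion could be reduced to a small inline waveform, the sketch below computes a peak envelope and draws it as a short ASCII strip. The function names `peak_envelope` and `ascii_strip`, the bin count, and the character ramp are assumptions made for this example; the source does not specify how the inline waveforms are rendered.

```python
from typing import List


def peak_envelope(samples: List[float], bins: int) -> List[float]:
    """Reduce samples (assumed in [-1, 1]) to `bins` peak magnitudes."""
    if bins <= 0:
        raise ValueError("bins must be positive")
    if not samples:
        return [0.0] * bins
    n = len(samples)
    envelope = []
    for i in range(bins):
        start = i * n // bins
        # Guarantee at least one sample per bin when bins > n.
        end = max(start + 1, (i + 1) * n // bins)
        envelope.append(max(abs(s) for s in samples[start:end]))
    return envelope


def ascii_strip(envelope: List[float], levels: str = " .:-=+*#%@") -> str:
    """Map each peak to one character; louder peaks use denser glyphs."""
    top = len(levels) - 1
    chars = []
    for value in envelope:
        clamped = min(1.0, max(0.0, value))
        chars.append(levels[int(clamped * top + 0.5)])
    return "".join(chars)


if __name__ == "__main__":
    # Hypothetical samples: near silence followed by a loud burst.
    audio = [0.0, 0.01, -0.02, 0.0, 0.9, -1.0, 0.8, -0.7]
    print(ascii_strip(peak_envelope(audio, 4)))
```

With a strip like this, a long pause would show as mostly blank characters while applause or a loud noise would show as a dense run, which is the kind of at-a-glance context the paragraph describes.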
[0011] In addition to providing context to audio, the disclosed systems and methods also simplify the editing process for a user. For example, the disclosed systems and methods provide text-based editing that is much easier for users than waveform-based editing. Using the disclosed systems, a user can easily identify and edit portions of audio content using the corresponding portions of text as a reference. In some embodiments, a user can make edits to the audio content through direct interactions with the text itself, as will be discussed below.
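One way to picture text-based editing is to treat each transcribed word as carrying the time span it occupies in the audio, so that deleting a word in the text removes the matching samples. The sketch below assumes per-word start and end times are available (for example from a speech-to-text engine); the `Word` type and both function names are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the audio (assumed known)
    end: float


def ranges_for_deleted_words(
    words: List[Word], deleted: Iterable[int]
) -> List[Tuple[float, float]]:
    """Return merged, sorted time ranges covered by the deleted word indices."""
    spans = sorted((words[i].start, words[i].end) for i in set(deleted))
    merged: List[Tuple[float, float]] = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Touching or overlapping spans collapse into one cut.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def cut_samples(
    samples: List[float], sample_rate: int, ranges: List[Tuple[float, float]]
) -> List[float]:
    """Drop the samples that fall inside each (start, end) range in seconds."""
    kept: List[float] = []
    cursor = 0
    for start, end in ranges:
        lo = int(round(start * sample_rate))
        hi = int(round(end * sample_rate))
        kept.extend(samples[cursor:lo])
        cursor = max(cursor, hi)
    kept.extend(samples[cursor:])
    return kept


if __name__ == "__main__":
    words = [Word("hello", 0.0, 0.5), Word("um", 0.5, 0.75), Word("world", 0.75, 1.0)]
    audio = [float(i) for i in range(8)]  # 1 second at a toy rate of 8 Hz
    cuts = ranges_for_deleted_words(words, [1])  # user deletes "um" in the text
    print(cut_samples(audio, 8, cuts))
```

Merging adjacent spans before cutting keeps a multi-word deletion as a single splice point rather than several back-to-back ones.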

Problems solved by technology

However, conventional audio processing systems suffer from a number of drawbacks and shortcomings.
For example, using conventional systems, audio editing can be difficult and confusing, especially for novice users.
Interacting with the waveform to perform edits can be confusing and unintuitive for users.
Oftentimes, even expert users cannot readily decipher the audio to which a waveform corresponds.
As a result, even with the proper training and experience, editing audio waveforms can be a complex and cumbersome process.
However, providing the text derived from an audio sample does not give any indications of time, or any other context beyond the words themselves.
Further, if there is audio content that is not recognizable as speech—such as applause, music, sound effects, or other noise—this information is not properly represented in the text transcription.
Accordingly, even in systems that provide text transcriptions of the audio content, it is still difficult for the users to accurately correlate the text to the audio content or to use such information to aid in editing the audio content.
These and other problems exist with regard to displaying multimedia, and in particular, displaying audio in a manner that is convenient and understandable to all users.


Examples


Embodiment Construction

[0024] One or more embodiments of the present disclosure include a hybrid waveform system and corresponding methods for providing a user interface for interacting with audio recordings that include spoken words. In particular, the hybrid waveform system provides a graphical user interface that includes a hybridized transcription of text converted from recognizable speech along with non-textual representations of non-speech-recognizable audio (e.g., pauses, ambience, background noise, etc.). The hybrid waveform system displays the non-textual representations as small waveforms inline with the transcribed text. In one or more embodiments, the non-textual representations provide audio information to a user that is otherwise missing from a conventional transcription. In many cases, the audio information is recognizable by a user from the visualization of the non-textual representations themselves.
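The hybridized transcription described above can be thought of as a single ordered sequence in which recognized words and non-speech spans alternate. The following is a minimal sketch under stated assumptions: recognized words come with start and end times, any gap of at least `min_gap` seconds between them is treated as non-speech audio, and that gap is shown as a bracketed placeholder where a small waveform would be drawn. The types, the threshold, and the placeholder format are all hypothetical and are not the system's actual data model.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextToken:
    text: str
    start: float  # seconds
    end: float


@dataclass
class WaveformToken:
    start: float  # seconds
    end: float


Token = Union[TextToken, WaveformToken]


def build_hybrid(words: List[TextToken], duration: float, min_gap: float = 0.3) -> List[Token]:
    """Interleave recognized words with waveform tokens for unrecognized gaps."""
    tokens: List[Token] = []
    cursor = 0.0
    for word in sorted(words, key=lambda w: w.start):
        if word.start - cursor >= min_gap:
            tokens.append(WaveformToken(cursor, word.start))
        tokens.append(word)
        cursor = max(cursor, word.end)
    if duration - cursor >= min_gap:
        tokens.append(WaveformToken(cursor, duration))
    return tokens


def render(tokens: List[Token]) -> str:
    """Plain-text stand-in for the display: words as text, gaps as placeholders."""
    parts = []
    for token in tokens:
        if isinstance(token, TextToken):
            parts.append(token.text)
        else:
            parts.append(f"[~{token.end - token.start:.1f}s~]")
    return " ".join(parts)


if __name__ == "__main__":
    words = [
        TextToken("thank", 0.0, 0.4),
        TextToken("you", 0.4, 0.7),
        TextToken("everyone", 3.0, 3.6),
    ]
    print(render(build_hybrid(words, duration=3.7)))
```

In a real interface each `WaveformToken` would be drawn as a small waveform of the audio in that span rather than as a bracketed duration.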

[0025] As an illustration, in one implementation, the hybrid waveform system identifies an audio seg...

Abstract

The present disclosure includes a hybrid waveform system that displays a hybrid waveform to a user. In general, the hybrid waveform uses converted readable text and waveforms to represent an audio segment. By providing a user with a hybrid waveform, the hybrid waveform system offers users a number of benefits, such as an audio display that enables a user to quickly ascertain context information and audio information typically missing from audio transcriptions.

Description

BACKGROUND

[0001] 1. Technical Field

[0002] One or more embodiments of the present disclosure relate generally to editing audio content. More specifically, one or more embodiments of the present disclosure relate to systems and methods for displaying audio waveforms inline with text within an editing user interface.

[0003] 2. Background and Relevant Art

[0004] Computing devices are useful in interacting with multimedia content, such as audio content, in many ways. For example, using a computing device, a user can capture, store, play back, and/or share audio content. In addition, computing devices allow users to edit audio by, for example, trimming unwanted noise, changing the audio characteristics for an audio file, and mixing audio together. Further, computing devices are often used to convert audio data to other types of data. For example, using a computing device, a user can transcribe audio data into text using speech-to-text (“STT”) technologies and/or convert audio data to a graphica...


Application Information

Patent Type & Authority: Applications (United States)
IPC (8): G10L21/12, G10L25/87, G06T11/60, G06F3/16, G06F3/0481, G10L15/26, G10L21/10
CPC: G10L21/10, G06F3/167, G06F3/0481, G10L25/87, G06T2200/24, G10L21/12, G10L15/26, G06T11/60, G10L15/04, G06F40/10, G06F40/103
Inventor: RUBIN, MICHAEL; MOORER, JAMES A.
Owner: ADOBE SYST INC