1064 results about "Speech coding" patented technology

Speech coding is an application of data compression of digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting modeled parameters in a compact bitstream.

Enhancing speech intelligibility using variable-rate time-scale modification

The method and preprocessor enhance the intelligibility of narrowband speech without essentially lengthening the overall time duration of the signal. Both spectral enhancements and variable-rate time-scaling procedures are implemented to improve the salience of initial consonants, particularly the perceptually important formant transitions. Emphasis is transferred from the dominating vowel to the preceding consonant through adaptation of the phoneme timing structure. In a further embodiment, the technique is applied as a preprocessor to a speech coder.
Owner:NUANCE COMM INC

Noise-dependent postfiltering

A method of filtering a speech signal is presented. The method involves providing a filter (404) suited for reduction of distortion caused by speech coding, estimating acoustic noise in the speech signal, adapting the filter in response to the estimated acoustic noise to obtain an adapted filter, and applying the adapted filter to the speech signal so as to reduce acoustic noise and distortion caused by speech coding in the speech signal.
Owner:NOKIA CORP

Method for facilitating text to speech synthesis using a differential vocoder

Inactive | US20070106513A1 | High memory requirement | Effectively prime | Speech synthesis | Ademetionine | Text to speech synthesis
A text to speech system (100) uses differential voice coding (230, 416) to compress a database of digitized speech waveform segments (210). A seed waveform (535) is used to precondition each speech waveform prior to encoding which, upon encoding, provides a seeded preconditioned encoded speech token (550). The seed portion (541) may be removed and the preconditioned encoded speech token portion (542) may be stored in a database for text to speech synthesis. When speech is to be synthesized, upon requesting the appropriate speech waveform for the present sound to be produced, the seed portion is preappended to the preconditioned encoded speech token for differential decoding.
Owner:MOTOROLA INC

Method and apparatus for coding an information signal using pitch delay contour adjustment

In a speech encoder/decoder, a pitch delay contour endpoint modifier is employed to shift the endpoints of a pitch delay interpolation curve up or down. Particularly, the endpoints of the pitch delay interpolation curve are shifted based on a variation and/or a standard deviation in pitch delay.
Owner:GOOGLE TECH HLDG LLC

Signal modification based on continuous time warping for low bit rate CELP coding

A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or other method, to reduce the complexity of coding to allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.
Owner:MICROSOFT TECH LICENSING LLC

Noise suppression in the frequency domain by adjusting gain according to voicing parameters

An input signal enters a noise suppression system in a time domain and is converted to a frequency domain. The noise suppression system then estimates a signal to noise ratio of the frequency domain signal. Next, a signal gain is calculated based on the estimated signal to noise ratio and a voicing parameter. The voicing parameter may be determined based on the frequency domain signal or may be determined based on a signal ahead of the frequency domain signal with respect to time. In that event, the voicing parameter is fed back to the noise suppression system, for example, by a speech coder, to calculate the signal gain. After calculating the gain, the noise suppression system modifies the signal using the calculated gain to enhance the signal quality. The modified signal may further be converted from the frequency domain back to the time domain for speech coding.
Owner:MACOM TECH SOLUTIONS HLDG INC +1
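
The gain step in the abstract above can be illustrated with a minimal sketch. The abstract does not give the gain formula, so everything specific here is an assumption: the Wiener-style rule, the way the voicing parameter (taken to lie in [0, 1]) relaxes the attenuation, and the names `suppression_gain`, `apply_gains`, and `min_gain` are all hypothetical.

```python
def suppression_gain(snr_db, voicing, min_gain=0.1):
    """Return a gain in [min_gain, 1] for one frequency bin.

    Assumed rule (not from the patent): start from a Wiener-style gain
    derived from the estimated SNR, then attenuate less when the frame
    is strongly voiced.
    """
    snr = 10.0 ** (snr_db / 10.0)        # dB to linear power ratio
    wiener = snr / (1.0 + snr)           # Wiener-style gain in (0, 1)
    v = min(max(voicing, 0.0), 1.0)      # clamp voicing to [0, 1]
    gain = wiener + 0.5 * v * (1.0 - wiener)
    return max(min_gain, min(1.0, gain))


def apply_gains(spectrum, snr_db_per_bin, voicing):
    """Scale each frequency-domain bin by its computed gain."""
    return [x * suppression_gain(s, voicing)
            for x, s in zip(spectrum, snr_db_per_bin)]
```

In this sketch the same voicing value is shared by all bins of a frame, which matches the case where the parameter is fed back from a speech coder rather than computed per bin.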

Internet telephone service using cellular digital vocoder

A system and method for providing telephone type services over the internetwork commonly known as the Internet. Wireless digital telephones in communication with wireless digital communications systems have speech coders that generate digital voice samples, and encoders that encode the digital voice samples to minimize bit errors during RF transmission. The wireless digital communications system demodulates the received modulated wireless signal transmitted by the digital telephone to recover the encoded, digital voice samples. The error correction codes within the encoded, digital voice samples are removed to recover the original digital voice samples generated by the vocoder in the digital telephone. The decoded digital voice samples are then supplied to a gateway interface that packetizes the decoded digital voice samples into digital voice sample segments, assigns a packet address corresponding to the destination telephone number, and outputs the digital voice sample segments as data packets onto a packet switched network, such as the Internet, for reception by a network node corresponding to the destination address.
Owner:VERIZON PATENT & LICENSING INC

Synchronizer for use with improved in-band signaling for data communications over digital wireless telecommunications networks

An inband signaling modem communicates digital data over a voice channel of a wireless telecommunications network. An input receives digital data. An encoder converts the digital data into audio tones that synthesize frequency characteristics of human speech. The digital data is also encoded to prevent voice encoding circuitry in the telecommunications network from corrupting the synthesized audio tones representing the digital data. An output then outputs the synthesized audio tones to a voice channel of a digital wireless telecommunications network.
Owner:AIRBIQUITY INC
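
As a rough illustration of carrying digital data as audio tones in a voice channel, the sketch below uses plain binary frequency-shift keying. This is a swapped-in textbook technique, not the encoding claimed in the patent; the sample rate, bit duration, and the two tone frequencies are hypothetical values chosen so that each bit spans a whole number of cycles.

```python
import math

SAMPLE_RATE = 8000               # assumed narrowband voice channel rate
SAMPLES_PER_BIT = 80             # assumed 10 ms per bit
TONE_HZ = {0: 500.0, 1: 900.0}   # hypothetical in-band tone frequencies


def encode_bits(bits):
    """Map each bit to a short sine burst at the tone assigned to it."""
    samples = []
    for bit in bits:
        freq = TONE_HZ[bit]
        samples.extend(
            math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(SAMPLES_PER_BIT)
        )
    return samples


def tone_power(chunk, freq):
    """Power of one frequency in a chunk (single-bin DFT)."""
    re = sum(s * math.cos(2.0 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(chunk))
    im = sum(s * math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(chunk))
    return re * re + im * im


def decode_bits(samples):
    """Recover bits by comparing the power at the two tone frequencies."""
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        one = tone_power(chunk, TONE_HZ[1])
        zero = tone_power(chunk, TONE_HZ[0])
        bits.append(1 if one > zero else 0)
    return bits
```

A round trip such as `decode_bits(encode_bits([1, 0, 1]))` only shows the clean-channel case; surviving a real vocoder is the hard part the patent addresses and is not modeled here.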

Speech encoder adaptively applying pitch preprocessing with warping of target signal

A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters is generated for higher quality decoding and reproduction. The speech encoder employs various encoding schemes based upon parameters including an available transmission bit rate. In addition, the speech encoder is operable to identify and apply an optimal encoding scheme for a given speech signal. The speech encoder may apply code-excited linear prediction when the available bit rate is above a predetermined upper threshold. Pitch preprocessing, including continuous warping, may be applied when it is below a predetermined lower threshold. The encoder considers varying characteristics of the speech signal including the long term prediction mode of a previous frame, a spectral difference between the line spectral frequencies of a current and a previous frame, a predicted pitch lag, an open loop pitch lag, a closed loop pitch lag, a pitch gain, and a pitch correlation.
Owner:SAMSUNG ELECTRONICS CO LTD

Apparatus and method for transmitting call holding message in mobile communication terminal

A call holding voice message transmitting apparatus and method in a mobile communication terminal are disclosed that transmit voice data, preferably stored in the terminal, to a caller in a situation in which the user of the terminal cannot receive an incoming call from the caller. The apparatus can include a key input unit to input a command to the terminal, a controller that generates a control signal adapted to execute the command, a microphone that converts a voice into an analog voice signal and a voice encoder that converts the analog voice signal into a digital voice signal. A memory can retrievably store the digital voice signal outputted from the voice encoder and a voice decoder converts the digital voice signal into an outgoing analog voice signal. A selection device outputs a selected one of the analog voice signal from the microphone and the outgoing analog voice signal from the voice decoder in response to the control signal from the control unit. Finally, a backward speech channel unit can convert the analog voice signal outputted from the selection device into a signal meeting a transmission system used in association with the terminal.
Owner:LG ELECTRONICS INC

In-band signaling for data communications over digital wireless telecommunications networks

An inband signaling modem communicates digital data over a voice channel of a wireless telecommunications network. An input receives digital data. An encoder converts the digital data into audio tones that synthesize frequency characteristics of human speech. The digital data is also encoded to prevent voice encoding circuitry in the telecommunications network from corrupting the synthesized audio tones representing the digital data. An output then outputs the synthesized audio tones to a voice channel of a digital wireless telecommunications network.
Owner:AIRBIQUITY INC

Using signal to noise ratio of a speech signal to adjust thresholds for extracting speech parameters for coding the speech signal

Inactive | US6898566B1 | Improved speech | Improved threshold setting | Speech analysis | Signal-to-noise ratio (imaging) | Speech code
There are provided speech coding methods and systems for estimating a plurality of speech parameters of a speech signal for coding the speech signal using one of a plurality of speech coding algorithms, where the plurality of speech parameters includes pitch information and is calculated using a plurality of thresholds. An example method includes estimating a background noise level in the speech signal to determine a signal to noise ratio (SNR) for the speech signal, adjusting one or more of the plurality of thresholds based on the SNR to generate one or more SNR adjusted thresholds, analyzing the speech signal to extract the pitch information using the one or more SNR adjusted thresholds, and repeating the estimating, the adjusting and the analyzing to code the speech signal using one of the plurality of speech coding algorithms.
Owner:WIAV SOLUTIONS LLC +1
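
The estimate-then-adjust loop described above might look like the following sketch. The abstract does not say how the noise level is estimated or how thresholds move with SNR, so the minimum-energy noise estimate, the linear relaxation rule, and the parameters `clean_snr_db` and `max_relax` are assumptions made purely for illustration.

```python
import math


def estimate_snr_db(frame_energies):
    """Estimate SNR in dB from per-frame energies.

    Assumption: the quietest frame approximates the background noise
    level and the mean energy approximates the signal level.
    """
    noise = min(frame_energies)
    signal = sum(frame_energies) / len(frame_energies)
    if noise <= 0.0:
        return float("inf")
    return 10.0 * math.log10(signal / noise)


def adjust_threshold(base_threshold, snr_db, clean_snr_db=30.0, max_relax=0.2):
    """Lower a voicing or pitch threshold as the SNR drops.

    At or above clean_snr_db the base threshold is returned unchanged;
    at 0 dB or below it is lowered by max_relax.
    """
    shortfall = min(max(clean_snr_db - snr_db, 0.0), clean_snr_db)
    return base_threshold - max_relax * shortfall / clean_snr_db
```

A pitch extractor would then compare, for example, a normalized correlation value against the adjusted threshold instead of a fixed one.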

Echo cancellation device for cancelling echos in a transceiver unit

An echo cancellation device (ECD) comprises an echo canceller (EC) including a transfer function estimator (EST, H) and a subtractor (ADD), and a residual echo suppression device (G, ADD2). The residual echo suppression device (G) comprises a residual echo filter (G) having an adjustable filter function (g). This filter function (g) can be adapted either to remove from the subtractor output (TNE') the spectral characteristics relating to the reception signal (RFE) and/or to emphasize in the subtractor output signal (TNE') a background signal spectral content relating to the transmission signal (TNE). A noise generation means (NGM') can be provided at the output of the adaptable filter (G) for injecting a noise process into the filter output signal (TNE') prior to speech coding in a speech coder (COD). The noise process masks in the filter output signal a spectral content relating to the reception signal (RFE).
Owner:TELEFON AB LM ERICSSON (PUBL)

System and method of minimizing the number of voice transcodings during a conference call in a packet-switched network

A system, method and access gateway for minimizing the number of transcodings of a speech signal during a Voice-over-IP (VoIP) conference call in a packet-switched network in which Tandem Free Operation (TFO) is utilized. The system includes a first gateway connecting the first mobile subscriber to the network, a second gateway connecting the second subscriber, and a third gateway connecting the third subscriber. The second gateway sends a message to the first gateway indicating a speech coding mode being utilized between the second gateway and the second subscriber. The third gateway sends a message to the first gateway indicating a speech coding mode being utilized between the third gateway and the third subscriber. When a three-way conference call is initiated, the first gateway encodes the call path to the second subscriber with the speech coding mode being utilized between the second gateway and the second subscriber. The first gateway also encodes the speech signal for the call leg to the third subscriber with the speech coding mode being utilized between the third gateway and the third subscriber.
Owner:TELEFON AB LM ERICSSON (PUBL)

Software code for improved in-band signaling for data communications over digital wireless telecommunications networks

An inband signaling modem communicates digital data over a voice channel of a wireless telecommunications network. An input receives digital data. An encoder converts the digital data into audio tones that synthesize frequency characteristics of human speech. The digital data is also encoded to prevent voice encoding circuitry in the telecommunications network from corrupting the synthesized audio tones representing the digital data. An output then outputs the synthesized audio tones to a voice channel of a digital wireless telecommunications network.
Owner:AIRBIQUITY INC

Method for executing automatic evaluation of transmission quality of audio signals using source/received-signal spectral covariance

A source signal (e.g. a speech sample) is processed or transmitted by a speech coder (1) and converted into a reception signal (coded speech signal). The source and reception signals are separately subjected to preprocessing (2) and psychoacoustic modelling (3). This is followed by a distance calculation (4), which assesses the similarity of the signals. Lastly, an MOS calculation is carried out in order to obtain a result comparable with human evaluation. According to the invention, in order to assess the transmission quality a spectral similarity value is determined which is based on calculation of the covariance of the spectra of the source signal and reception signal and division of the covariance by the standard deviations of the two said spectra. The method makes it possible to obtain an objective assessment (speech quality prediction) while taking the human auditory process into account.
Owner:ASCOM
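
The spectral similarity value described above, covariance divided by the two standard deviations, is a correlation coefficient. A minimal sketch of just that calculation follows; the preprocessing, psychoacoustic modelling, and MOS mapping are not modeled, and the function name and the zero-variance fallback are assumptions.

```python
import math


def spectral_similarity(source_spectrum, received_spectrum):
    """Covariance of two spectra divided by their standard deviations."""
    n = len(source_spectrum)
    if n == 0 or n != len(received_spectrum):
        raise ValueError("spectra must be non-empty and of equal length")
    mean_s = sum(source_spectrum) / n
    mean_r = sum(received_spectrum) / n
    cov = sum((s - mean_s) * (r - mean_r)
              for s, r in zip(source_spectrum, received_spectrum)) / n
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in source_spectrum) / n)
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in received_spectrum) / n)
    if std_s == 0.0 or std_r == 0.0:
        return 0.0  # assumed fallback: a flat spectrum carries no shape
    return cov / (std_s * std_r)
```

The value lies in [-1, 1]; a received spectrum that is a positively scaled copy of the source scores 1, so the measure is insensitive to overall level.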

Method and apparatus for performing packet loss or frame erasure concealment

The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
Owner:AT&T INTPROP II L P
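
The receiver-side control flow in this abstract can be sketched as below. The concealment step here simply replays the last output with attenuation, which is a stand-in assumption and not the patent's method; the `decode` callback, the use of `None` to mark an erased frame, and the `attenuation` parameter are hypothetical, and the predetermined output delay is omitted.

```python
def receive_stream(frames, decode, attenuation=0.5):
    """Decode good frames and conceal erased ones.

    frames: iterable of encoded frames, with None marking an erased frame.
    decode: hypothetical callback turning an encoded frame into samples.
    """
    history = []   # temporary memory holding the most recent output
    output = []
    for frame in frames:
        if frame is not None:
            pcm = decode(frame)                         # normal decoding
        else:
            pcm = [attenuation * s for s in history]    # assumed concealment
        history = pcm                                   # update the memory
        output.append(pcm)
    return output
```

Repeated erasures fade toward silence in this sketch because each concealed frame is derived from the previous, already attenuated, output.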

Speech encoder adaptively applying pitch preprocessing with warping of target signal

Inactive | US20010023395A1 | Efficient and effective of signal | Reduce bitrate | Speech analysis | Target signal | Closed loop
A multi-rate speech codec supports a plurality of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech through CELP (code excited linear prediction) and other associated modeling parameters is generated for higher quality decoding and reproduction. The speech encoder employs various encoding schemes based upon parameters including an available transmission bit rate. In addition, the speech encoder is operable to identify and apply an optimal encoding scheme for a given speech signal. The speech encoder may apply code-excited linear prediction when the available bit rate is above a predetermined upper threshold. Pitch preprocessing, including continuous warping, may be applied when it is below a predetermined lower threshold. The encoder considers varying characteristics of the speech signal including the long term prediction mode of a previous frame, a spectral difference between the line spectral frequencies of a current and a previous frame, a predicted pitch lag, an open loop pitch lag, a closed loop pitch lag, a pitch gain, and a pitch correlation.
Owner:SAMSUNG ELECTRONICS CO LTD

Frame erasure concealment for predictive speech coding based on extrapolation of speech waveform

A method and system are provided for synthesizing a number of corrupted frames output from a decoder including one or more predictive filters. The corrupted frames are representative of one segment of a decoded signal (sq(n)) output from the decoder. The method comprises determining a first preliminary time lag (ppfe1) based upon examining a predetermined number (K) of samples of another segment of the decoded signal and determining a scaling factor (ptfe) associated with the examined number (K) of samples when the first preliminary time lag (ppfe1) is determined. The method also comprises extrapolating one or more replacement frames based upon the first preliminary time lag (ppfe1) and the scaling factor (ptfe).
Owner:AVAGO TECH WIRELESS IP SINGAPORE PTE
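
One way to picture the time lag and scaling factor in this abstract is a correlation search over recent decoded samples followed by periodic extrapolation. The sketch below is an assumption about how such a search could work, not the patented procedure; the function names, the normalized-correlation criterion, and the lag range parameters are hypothetical.

```python
def estimate_lag_and_scale(history, k, min_lag, max_lag):
    """Find the lag at which the last k samples best match earlier samples.

    Returns (lag, scale), where scale is the least-squares gain that maps
    the earlier segment onto the last k samples.
    """
    n = len(history)
    target = history[n - k:]
    best_lag, best_score, best_scale = min_lag, float("-inf"), 0.0
    for lag in range(min_lag, max_lag + 1):
        if n - k - lag < 0:
            break                        # not enough history for this lag
        past = history[n - k - lag:n - lag]
        energy = sum(p * p for p in past)
        if energy == 0.0:
            continue
        corr = sum(t * p for t, p in zip(target, past))
        score = corr * corr / energy if corr > 0 else 0.0
        if score > best_score:
            best_lag, best_score, best_scale = lag, score, corr / energy
    return best_lag, best_scale


def extrapolate(history, num_samples, lag, scale):
    """Extend the waveform by repeating it one lag back, scaled."""
    out = list(history)
    for _ in range(num_samples):
        out.append(scale * out[-lag])
    return out[len(history):]
```

For a strictly periodic input the search returns the period as the lag and a scale of 1, so the replacement samples continue the waveform exactly.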

Sound activity detecting method and detector thereof

The invention discloses a sound activity detection method and a sound activity detector. The core of the method is to extract the feature parameters of the current signal frame when sound activity detection is needed, and to determine the sound type of the current signal frame according to the feature parameters and preset parameter thresholds. Because the feature parameters used for classification are extracted without relying on a specific coding algorithm, the invention is convenient to maintain and update and can classify input signals into more sound types. In the sound coding field, the invention can serve as a basis for rate selection in newly developed variable-rate audio coding algorithms and standards, and can also provide a basis for rate selection in existing variable-rate voice or audio coding standards that lack a VAD algorithm. The invention is also broadly applicable to speech enhancement, speech recognition, speaker recognition and other speech signal processing fields.
Owner:HUAWEI TECH CO LTD
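
A codec-independent classifier of the kind described can be sketched with two classic features. The abstract does not list its feature parameters, so the choice of frame energy and zero-crossing rate, the threshold values, and the class labels below are all assumptions.

```python
import math


def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy, zcr


def classify_frame(frame, energy_threshold=0.01, zcr_threshold=0.3):
    """Assign a sound type by comparing features with preset thresholds."""
    energy, zcr = frame_features(frame)
    if energy < energy_threshold:
        return "silence"
    if zcr < zcr_threshold:
        return "voiced"
    return "unvoiced_or_noise"


# Example frames: 20 ms at an assumed 8 kHz sampling rate.
silent = [0.0] * 160
tone = [math.sin(2.0 * math.pi * 100.0 * n / 8000.0) for n in range(160)]
hiss = [0.5, -0.5] * 80
```

Because the features come straight from the waveform, the same classifier could feed rate selection in any coder, which is the independence property the abstract emphasizes.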

Voice coder with two microphone system and strategic microphone placement to deter obstruction for a digital communication device

The present invention provides a voice coder for voice communication that employs a multi-microphone system as part of an improved approach to enhancing signal quality and improving the signal to noise ratio for such voice communications, where there is a special relationship between the position of a first microphone and a second microphone to provide the communication device with certain advantageous physical and acoustic properties. In addition, the communication device can have certain physical characteristics, and design features. In a two microphone arrangement, the first microphone is located in a location directed toward the speech source, while the second microphone is located in a location that provides a voice signal with significantly lower signal-to-noise ratio (SNR).
Owner:XENOGENIC DEV LLC

Dual network integration scheme

The present invention provides geographically fixed receiver functionality for subscriber information which specifies the subscriber's identity in a mobile telecommunications system. This is achieved in accordance with the present invention by associating the Subscriber Identification Information (SII) with a fixed geographical reference point rather than with a particular device. In accordance with one embodiment of the present invention the access point lies outside the radio telecommunications system. In this embodiment, the control signals of the mobile system must be transmitted right up to the subscriber's premises. In a further preferred embodiment the access point lies in the radio telecommunications system. In the latter embodiment, the control signals of the mobile telecommunications system may be terminated in an access node which may be conveniently located in a wired base station. In this embodiment, the control signals of the mobile telecommunications system are not transmitted to the customer premises. Also, the voice coding typically used in mobile telecommunications systems may be terminated at the access node and not at the customer premises. In another embodiment of the invention, the functions of the access node of the previous embodiment are shared between an access base station and a responder on the customer's premises. The responder is capable of responding to certain inquiries initiated in the wireless network, e.g. to provide subscriber identity information or to provide the result of an encryption algorithm carried out on a random number supplied from the wireless network.
Owner:NORTEL NETWORKS GERMANY

Method and apparatus for performing packet loss or frame erasure concealment

Inactive | US7047190B1 | Reducing unnatural artifact | Speech recognition | Transmission | Delayed periods | Packet loss
The invention concerns a method and apparatus for performing packet loss or Frame Erasure Concealment (FEC) for a speech coder that does not have a built-in or standard FEC process. A receiver with a decoder receives encoded frames of compressed speech information transmitted from an encoder. A lost frame detector at the receiver determines if an encoded frame has been lost or corrupted in transmission, or erased. If the encoded frame is not erased, the encoded frame is decoded by a decoder and a temporary memory is updated with the decoder's output. A predetermined delay period is applied and the audio frame is then output. If the lost frame detector determines that the encoded frame is erased, a FEC module applies a frame concealment process to the signal. The FEC processing produces natural sounding synthetic speech for the erased frames.
Owner:AT&T INTPROP II L P

High-band speech coding apparatus and high-band speech decoding apparatus in wide-band speech coding/decoding system and high-band speech coding and decoding method performed by the apparatuses

A high-band speech encoding apparatus and a high-band speech decoding apparatus that can reproduce high quality sound even at a low bitrate when performing wideband speech encoding and decoding using a bandwidth extension function, and a high-band speech encoding and decoding method performed by the apparatuses. The high-band speech encoding apparatus includes: a first encoding unit encoding a high-band speech signal based on a structure in which a harmonic structure and a stochastic structure are combined, if the high-band speech signal has a harmonic component; and a second encoding unit encoding a high-band speech signal based on a stochastic structure if the high-band speech signal has no harmonic components. The high-band speech decoding apparatus includes: a first decoding unit decoding a high-band speech signal based on a combination of a harmonic structure and a stochastic structure using received first decoding information; a second decoding unit decoding the high-band speech signal based on a stochastic structure using received second decoding information; and a switch outputting one of the decoded high-band speech signals received from the first and second decoding units according to received mode selection information.
Owner:SAMSUNG ELECTRONICS CO LTD

LPC speech synthesis using harmonic excitation generator with phase modulator for voiced speech

A speech coding system (10) and associated method relies on a speech encoder (15) and a speech decoder (20). The speech decoder (20) includes a harmonic generator (70) which modulates the phase of each generated harmonic with a low frequency, low bandwidth signal to remove the buzzy quality of the speech and to produce natural sounding speech. The amplitude of the phase modulating signal is adjusted in accordance with the harmonic magnitude. For harmonics residing in a spectral valley the amplitude of the modulating signal is relatively large and for harmonics residing near spectral peaks, the amplitude of the modulation signal is relatively small.
Owner:LOCKHEED MARTIN CORP
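
The idea of modulating each harmonic's phase, with more modulation for weak harmonics in spectral valleys and less near spectral peaks, can be sketched as follows. The abstract only calls the modulator a low frequency, low bandwidth signal, so the sinusoidal modulator, the linear depth rule, the per-harmonic phase offset, and every parameter name here are assumptions.

```python
import math


def modulation_depth(magnitude, peak, max_depth=0.5):
    """Phase deviation in radians: zero at the spectral peak, larger in valleys."""
    if peak <= 0.0:
        return 0.0
    return max_depth * (1.0 - magnitude / peak)


def synthesize_voiced(f0, magnitudes, num_samples,
                      sample_rate=8000.0, mod_freq=3.0):
    """Sum harmonics of f0, each with a slowly varying phase offset."""
    peak = max(magnitudes)
    out = [0.0] * num_samples
    for k, mag in enumerate(magnitudes, start=1):
        depth = modulation_depth(mag, peak)
        for n in range(num_samples):
            t = n / sample_rate
            # Assumed modulator: a slow sinusoid offset per harmonic by k
            # so that the harmonics do not wobble in lockstep.
            wobble = depth * math.sin(2.0 * math.pi * mod_freq * t + k)
            out[n] += mag * math.cos(2.0 * math.pi * k * f0 * t + wobble)
    return out
```

With all depths set to zero this reduces to a plain harmonic sum, which is the perfectly periodic excitation usually associated with a buzzy sound.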

Speech encoding/decoding device

A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is transformed. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a band extension technique in the frequency domain represented by SBR.
Owner:NTT DOCOMO INC