42 results about "Pitch contour" patented technology

In linguistics, speech synthesis, and music, the pitch contour of a sound is a function or curve that tracks the perceived pitch of the sound over time. Pitch contour may include multiple sounds utilizing many pitches, and can relate the frequency function at one point in time to the frequency function at a later point.
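As a rough illustration of the definition above, a pitch contour can be approximated by estimating a fundamental frequency for each short frame of a signal. The sketch below uses plain autocorrelation, which is only one of many estimators and is not taken from any patent listed here. The function names, the 80 to 400 Hz search range, and the frame sizes are assumptions, and no voicing decision is made.

```python
import math


def frame_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate one pitch value (Hz) for a frame via autocorrelation."""
    lo = int(sample_rate / fmax)   # shortest lag searched
    hi = int(sample_rate / fmin)   # longest lag searched (must be < len(frame))

    def autocorr(lag):
        return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

    best_lag = max(range(lo, hi + 1), key=autocorr)
    return sample_rate / best_lag


def pitch_contour(signal, sample_rate, frame_len=400, hop=400):
    """Return one pitch estimate per frame: a sampled pitch contour."""
    return [frame_pitch(signal[s:s + frame_len], sample_rate)
            for s in range(0, len(signal) - frame_len + 1, hop)]
```

Estimated frequency is used here as a stand-in for perceived pitch; real systems usually add voicing detection and smoothing.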

Prototype waveform phase modeling for a frequency domain interpolative speech codec system

Active | US6931373B1 | Speech analysis | Phase correlation | Frequency spectrum

A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech which comprises a linear prediction (LP) front end adapted to process an input signal that provides LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator and provide a pitch contour within the predetermined intervals is also provided. Also provided is a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following: provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals; extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and separate stationary and nonstationary components of the PW using a low complexity alignment process and a filtering process that introduce no delay. The ratio of the energy of the nonstationary component of the PW to that of the stationary component of the PW is averaged across 5 subbands to compute the nonstationarity measure as a frequency dependent vector entity. A measure of the degree of voicing of the residual is also computed using open-loop pitch gain, pitch variance, relative signal power, PW correlation and PW nonstationarity in low frequency subbands. The nonstationarity measure and voicing measure are encoded using a 6-bit spectrally weighted vector quantization scheme using a codebook partitioned based on a voiced/unvoiced decision. At the decoder, a stationary component of PW is reconstructed as a weighted combination of the previous PW phase vector, a random phase perturbation and a fixed phase vector obtained from a voiced pitch pulse.

Owner:HUGHES NETWORK SYST

Prototype waveform magnitude quantization for a frequency domain interpolative speech codec system

Active | US6996523B1 | Accurately spectral feature | Accurate features | Speech analysis | Pitch contour | Linear prediction

A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech which comprises a linear prediction (LP) front end adapted to process an input signal that provides LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator and provide a pitch contour within the predetermined intervals is also provided. Also provided is a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following: provide a voicing measure, where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals; extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and directly quantize the PW in a magnitude domain without further decomposition of the PW into complex components, where the direct quantization is performed by a hierarchical quantization method based on a voicing classification using fixed dimension vector quantizers (VQ's).

Owner:HUGHES NETWORK SYST

Voicing measure for a speech CODEC system

Active | US7013269B1 | Improve regenerative ability | Speed up the process | Speech analysis | Frequency spectrum | Voice activity

A system and method is provided that employs a frequency domain interpolative CODEC system for low bit rate coding of speech which comprises a linear prediction (LP) front end adapted to process an input signal providing LP parameters which are quantized and encoded over predetermined intervals and used to compute a LP residual signal. An open loop pitch estimator adapted to process the LP residual signal, a pitch quantizer, and a pitch interpolator also provides a pitch contour within the predetermined intervals. A voice activity detector adapted to process the LP parameters and the open loop pitch contour over the predetermined intervals is also provided as well as a signal processor responsive to the LP residual signal and the pitch contour and adapted to perform the following functions: extract a prototype waveform (PW) from the LP residual and the open loop pitch contour for a number of equal sub-intervals within the predetermined intervals; normalize the PW by a gain value of the PW; encode a magnitude of the PW; and provide a voicing measure where the voicing measure characterizes a degree of voicing of the input speech signal and is derived from several input parameters that are correlated to degrees of periodicity of the signal over the predetermined intervals. The voicing measure is provided for the purpose of regenerating a PW phase at a decoder and providing improved quantization of the PW magnitude at an encoder. The voicing measure is encoded jointly with a PW nonstationarity measure vector using a spectrally weighted vector quantizer having a codebook partitioned based on a voiced and unvoiced mode.

Owner:HUGHES NETWORK SYST

Audio transform coding using pitch correction

Active | US20100198586A1 | Reducing transition length | Efficient coding | Speech analysis | Pitch contour | Transform coding

A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.

Owner:FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV

Emotional-speech synthesizing device, method of operating the same and mobile terminal including the same

Inactive | US20160329043A1 | Sound input/output | Speech synthesis | Pitch contour | Speech sound

Provided is an emotional-speech synthesizing device including: a sentence recognition unit that recognizes a sentence that is input; a word emotion determination unit that calculates a probability vector of an emotion that is pre-defined for each word that makes up the recognized sentence and estimates the emotion and a rhythm based on the probability vector; and an emotional-speech synthesizing unit. The emotional-speech synthesizing unit calculates in stages degrees of similarity in the emotion and the rhythm between the adjacent words based on context information on the recognized sentence, applies a weight to a phoneme candidate corresponding to each word based on the degrees of similarity and the probability vector, selects the phoneme candidate that has a minimum target pitch, a minimum duration time, and a minimum distance value of a target pitch contour, and thus synthesizes an emotional speech that corresponds to the recognized sentence in optimal units.

Owner:LG ELECTRONICS INC

Method and system for pitch contour quantization in audio coding

Inactive | US20050091044A1 | Improve coding efficiency | Digital variable display | Speech analysis | Pitch contour | Variable length

A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, is provided to a decoder for reconstructing the audio signal. The contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited by a maximum value.

Owner:NOKIA CORP
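
The linear-segment idea in the abstract above can be sketched as follows. This is a minimal illustration, not the patented procedure: the greedy "extend until the deviation limit is exceeded" strategy, the function names, and the `max_dev` and `max_len` parameters are all assumptions made for the example. Only the segment end points are kept, and the decoder side rebuilds the contour by linear interpolation.

```python
def segment_end_points(pitch, max_dev, max_len=None):
    """Return indices of segment end points for a greedy linear fit."""
    def deviation(start, end):
        # Largest gap between the straight line start->end and the contour.
        worst = 0.0
        for i in range(start + 1, end):
            line = pitch[start] + (pitch[end] - pitch[start]) * (i - start) / (end - start)
            worst = max(worst, abs(pitch[i] - line))
        return worst

    points = [0]
    last = len(pitch) - 1
    start = 0
    while start < last:
        end = start + 1
        # Grow the segment while the line fits and the length cap allows.
        while (end < last
               and (max_len is None or end + 1 - start <= max_len)
               and deviation(start, end + 1) <= max_dev):
            end += 1
        points.append(end)
        start = end
    return points


def reconstruct(pitch_points, length):
    """Decoder side: rebuild the contour from (index, value) end points."""
    out = [0.0] * length
    for (i0, v0), (i1, v1) in zip(pitch_points, pitch_points[1:]):
        for i in range(i0, i1 + 1):
            out[i] = v0 + (v1 - v0) * (i - i0) / (i1 - i0)
    return out
```

For a seven-frame rise and fall such as `[100, 102, 104, 106, 104, 102, 100]` with `max_dev=0.5`, three end points are enough to rebuild the contour, instead of seven pitch values.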

System and method for scoring a singing voice

Active | US20120067196A1 | Gearworks | Musical toys | Pitch contour | Audio frequency

A system for scoring a singing voice comprises a receiving means (1) for receiving a singing reference audio signal and/or a user audio signal and/or a pitch contour representation (PCR) of the reference and/or user singing audio signals; a processor means (2) connected to the receiving means (1) and comprising a pitch contour representation (PCR) module (10) for determining a PCR of the singing reference and/or user audio signal, a time synchronization module (11) for time synchronizing the PCRs of the reference and user audio signals respectively, a selection module (12) for selecting a segment of the PCRs of the reference and user audio signals based on pre-defined criteria, a cross-correlation module (13) for performing time-warped cross-correlation on the selected segments of the PCRs of the reference and user audio signals and outputting a cross-correlation score, a key matching module (14) and rhythm matching module (15) for key matching and rhythm matching the remaining unselected segments of the PCRs of the reference and user audio signals respectively and outputting a respective key matching score and rhythm matching score, a scoring module (16) for determining a singing score based on a combination of a pre-determined weightage of the cross-correlation, key matching and rhythm matching scores; a user interface means connected to the processor means for changing at least one module parameter within at least one module; a storing means (4) connected to the processor means (2) and a display means (5) connected to the processor means (2) for displaying the PCR and singing score.

Owner:SENSIBOL AUDIO TECH PVT LTD

OSA/CSA diagnosis using recorded breath sound amplitude profile and pitch contour

Active | CN103687540A | Auscultation instruments | Respiratory organ evaluation | Medicine | Central sleep apnea

Disclosed herein are breathing disorder identification, characterization and diagnosis methods, devices and systems. In general, the disclosed methods, devices and systems may rely on the characterization of breath sound amplitudes, periodic breath sounds and/or aperiodic breath sounds to characterize a breathing disorder as obstructive (e.g. obstructive sleep apnea, OSA) or non-obstructive (e.g. central sleep apnea, CSA).

Owner:UNIV HEALTH NETWORK

Audio transform coding using pitch correction

Active | US8700388B2 | Improve coding efficiency | Efficient coding | Speech recognition | Speech synthesis | Pitch contour | Transform coding

A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.

Owner:FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV

Advance TTS for facial animation

Inactive | US7076426B1 | Simple system | Speech synthesis | Animation | Pitch contour

An enhanced system is achieved by allowing bookmarks which can specify that the stream of bits that follow corresponds to phonemes and a plurality of prosody information, including duration information, that is specified for times within the duration of the phonemes. Illustratively, such a stream comprises a flag to enable a duration flag, a flag to enable a pitch contour flag, a flag to enable an energy contour flag, a specification of the number of phonemes that follow, and, for each phoneme, one or more sets of specific prosody information that relates to the phoneme, such as a set of pitch values and their durations.

Owner:NUANCE COMM INC

Method and system for speech coding

Inactive | US20050091041A1 | Improve coding efficiency | Speech analysis | Telephonic communication | Frequency spectrum | Pitch contour

A method and device for use in conjunction with an encoder for encoding an audio signal into a plurality of parameters. Based on the behavior of the parameters, such as pitch, voicing, energy and spectral amplitude information of the audio signal, the audio signal can be segmented, so that the parameter update rate can be optimized. The parameters of the segmented audio signal are recorded in a storage medium or transmitted to a decoder so as to allow the decoder to reconstruct the audio signal based on the parameters indicative of the segment audio signals. For example, based on the pitch characteristic, the pitch contour can be approximated by a plurality of contour segments. An adaptive downsampling method is used to update the parameters based on the contour segments so as to reduce the update rate. At the decoder, the parameters are updated at the original rate.

Owner:NOKIA CORP

Prosody Generation Using Syllable-Centered Polynomial Representation of Pitch Contours

Active | US20140195242A1 | Smooth connection | Speech recognition | Speech synthesis | Syllable | Stress level

The present invention discloses a parametric representation of prosody based on polynomial expansion coefficients of the pitch contour near the center of each syllable. These syllable pitch expansion coefficients are generated from a recorded speech database, read from a number of sentences by a reference speaker. By correlating the stress level and context information of each syllable in the text with the polynomial expansion coefficients of the corresponding spoken syllable, a correlation database is formed. To generate prosody for an input text, the stress level and context information of each syllable in the text are identified. The prosody is generated by using the correlation database to find the best set of pitch parameters for each syllable. By adding to global pitch contours and using interpolation formulas, a complete pitch contour for the input text is generated. Duration and intensity profiles are generated using a similar procedure.

Owner:THE TRUSTEES OF COLUMBIA UNIV IN THE CITY OF NEW YORK

Method and apparatus for producing natural sounding pitch contours in a speech synthesizer

Inactive | US7280969B2 | High energy | Improve naturalness | Speech synthesis | Carrier signal | Engineering

A speech synthesis system is disclosed that utilizes a pitch contour resulting in more natural-sounding speech. The present invention modifies the predicted pitch, b(t), for synthesized speech using a low frequency energy booster. The low frequency energy booster interpolates the discrete pitch values, if necessary, and increases the amount of energy of the pitch contour associated with low frequency values, such as all frequency values below 10 Hertz. The amount of energy of the pitch contour associated with low frequency values can be increased, for example, by adding band-limited noise (a carrier signal) to the pitch contour, b(t), or by filtering the pitch values with an impulse response filter having a pole at the desired low frequency value. The present invention serves to add vibrato to the original pitch contour, b(t), and thereby improves the naturalness of the synthetic waveform.

Owner:CERENCE OPERATING CO
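
One of the options named in the abstract above, adding band-limited noise to the pitch contour b(t), might look roughly like the sketch below. It is an illustration under stated assumptions rather than the patented booster: the noise is approximated as a sum of a few random-phase sinusoids below 10 Hz, and the function name, `depth_hz`, `n_components`, and the fixed seed are all hypothetical.

```python
import math
import random


def add_low_frequency_energy(contour_hz, frame_rate_hz, depth_hz=2.0,
                             max_freq_hz=10.0, n_components=4, seed=0):
    """Add slow, band-limited modulation to a frame-wise pitch contour."""
    rng = random.Random(seed)
    # Each component is (modulation frequency up to max_freq_hz, random phase).
    components = [(rng.uniform(0.5, max_freq_hz), rng.uniform(0.0, 2 * math.pi))
                  for _ in range(n_components)]
    amp = depth_hz / n_components  # keeps the total excursion within depth_hz
    boosted = []
    for n, f0 in enumerate(contour_hz):
        t = n / frame_rate_hz
        noise = sum(amp * math.sin(2 * math.pi * f * t + ph)
                    for f, ph in components)
        boosted.append(f0 + noise)
    return boosted
```

The frame rate should be well above twice `max_freq_hz` (for example 100 frames per second) so the modulation is represented without aliasing.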

Method and system for pitch contour quantization in audio coding

Active | US20080275695A1 | Improve coding efficiency | Digital variable display | Speech analysis | Variable length | Pitch contour

A method and device for improving coding efficiency in audio coding. From the pitch values of a pitch contour of an audio signal, a plurality of simplified pitch contour segments are generated to approximate the pitch contour, based on one or more pre-selected criteria. The contour segments can be linear or non-linear with each contour segment represented by a first end point and a second end point. If the contour segments are linear, then only the information regarding the end points, instead of the pitch values, is provided to a decoder for reconstructing the audio signal. The contour segment can have a fixed maximum length or a variable length, but the deviation between a contour segment and the pitch values in that segment is limited by a maximum value.

Owner:RPX CORP

Polyphony melody extraction method based on significance

Active | CN105957538A | Accurate output | Speech analysis | Harmonic | Fundamental frequency

The invention discloses a polyphony melody extraction method based on significance. A corresponding significance function is defined as the product of the amplitude of two spectral peaks, candidate pitches of which the frequency spacing is less than 50 cents in the same frame are merged, and the pitch can be estimated according to the combination of a variety of co-prime harmonic frequencies. Candidate pitches of which the frequency spacing is less than 50 cents in two adjacent frames are connected into a pitch contour line, pitch contours less than 50ms are preliminarily screened out, and a melody is selected according to a set screening criterion and output. The pitch of the melody component can be estimated accurately even if the fundamental frequency of the melody component is absent or buried by accompaniment. The melody contour is tracked according to the set screening criterion, and thus, correct melody output is obtained.

Owner:大连赛听科技有限公司
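
The contour-building step in the abstract above (link candidates in adjacent frames that are less than 50 cents apart, then discard contours shorter than 50 ms) can be sketched as below. This covers only that step and is not the patented method: the greedy nearest-candidate matching, the function names, and the `hop_ms` parameter are assumptions, and the salience function and melody selection are left out.

```python
import math


def cents(f1, f2):
    """Absolute pitch distance between two frequencies, in cents."""
    return abs(1200.0 * math.log2(f2 / f1))


def link_contours(frames, hop_ms, max_jump_cents=50.0, min_len_ms=50.0):
    """frames: one list of candidate pitches (Hz) per analysis frame."""
    active = []    # contours that reached the previous frame
    finished = []
    for candidates in frames:
        unused = list(candidates)
        next_active = []
        for contour in active:
            # Closest still-unused candidate to the contour's last pitch.
            best = min(unused, key=lambda f: cents(contour[-1], f), default=None)
            if best is not None and cents(contour[-1], best) < max_jump_cents:
                contour.append(best)
                unused.remove(best)
                next_active.append(contour)
            else:
                finished.append(contour)
        # Every unmatched candidate starts a new contour.
        next_active.extend([f] for f in unused)
        active = next_active
    finished.extend(active)
    # Screen out contours shorter than the minimum duration.
    return [c for c in finished if len(c) * hop_ms >= min_len_ms]
```

Contour duration is approximated here as the number of frames times the hop size.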

Method and apparatus to improve speaker intelligibility in competitive talking conditions

Inactive | US20060106603A1 | Improving speaker intelligibility | Speech recognition | Engineering | Pitch contour

A system, wireless device (102) and method improve speaker intelligibility in a multi-party call by receiving a plurality of individual voice signals, determining a pitch contour for each individual voice signal, determining that the pitch contours for at least two of the individual voice signals are within a predetermined range relative to each other, and shifting the pitch of at least one voice signal a predetermined amount for the duration of the call. The pitch of the individual voice is shifted one to approximately five semitones. The method is performed at a central control station (110) prior to summation of the signals, or at an individual receiving unit (204) when three or more wireless devices (102) are communicating without the use of a central control station (110).

Owner:MOTOROLA INC
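
The decision described in the abstract above (find voices whose pitch contours lie close together, then shift one of them by a few semitones) can be sketched as follows. The comparison by median pitch, the 2-semitone closeness threshold, the 3-semitone shift, and the function names are assumptions made for illustration; the abstract only specifies a predetermined range and a shift of one to approximately five semitones.

```python
import math
from statistics import median


def semitone_gap(contour_a, contour_b):
    """Distance in semitones between the median pitches of two contours."""
    return abs(12.0 * math.log2(median(contour_b) / median(contour_a)))


def shift_ratio(semitones):
    """Frequency ratio for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def separate_voices(contours, min_gap_semitones=2.0, shift_semitones=3.0):
    """Return one frequency ratio per voice (1.0 means 'leave unchanged')."""
    ratios = [1.0] * len(contours)
    for i in range(len(contours)):
        for j in range(i + 1, len(contours)):
            too_close = semitone_gap(contours[i], contours[j]) < min_gap_semitones
            if too_close and ratios[j] == 1.0:
                # Shift the later voice; the earlier one stays as the reference.
                ratios[j] = shift_ratio(shift_semitones)
    return ratios
```

The returned ratios would then be handed to whatever pitch shifter the system uses; the shifting itself is outside this sketch.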

Gate valve rod screw processing single purpose machine

Inactive | CN101758306A | Realize semi-automatic production | Increase productivity | Thread cutting machines | Pitch contour | Blade plate

The invention relates to a gate valve rod screw processing single purpose machine, which is characterized in that the machine mainly comprises a rack, a transmission mechanism, a main shaft case body fixedly connected with the rack, the main shaft provided with a blade plate, a spring clip and a pitch contour hollow shaft, wherein the main shaft provided with the blade plate supports the bearing of the main shaft case body, and the main shaft is connected with a motor via the belt transmission; the taper of the front end of the spring clip is matched with the taper of the front end mouth of the pitch contour hollow shaft, and the spring clip passes through the pitch contour hollow shaft the back end of which is connected with a locking nut. The invention has the advantages that the machine meets the processing requirement of small-size civil gate valve rod screw, the semi-automatic production is realized, and the production efficiency is very high. The operation workers only put the valve rod into the spring clip to tightly clamp the valve rod, and then the acme thread of the valve rod can be molded at one time.

Owner:天津市大站阀门总厂

Emotional-speech synthesizing device, method of operating the same and mobile terminal including the same

Inactive | US9881603B2 | Speech recognition | Sound input/output | Degree of similarity | Pitch contour

Provided is an emotional-speech synthesizing device including: a sentence recognition unit that recognizes a sentence that is input; a word emotion determination unit that calculates a probability vector of an emotion that is pre-defined for each word that makes up the recognized sentence and estimates the emotion and a rhythm based on the probability vector; and an emotional-speech synthesizing unit. The emotional-speech synthesizing unit calculates in stages degrees of similarity in the emotion and the rhythm between the adjacent words based on context information on the recognized sentence, applies a weight to a phoneme candidate corresponding to each word based on the degrees of similarity and the probability vector, selects the phoneme candidate that has a minimum target pitch, a minimum duration time, and a minimum distance value of a target pitch contour, and thus synthesizes an emotional speech that corresponds to the recognized sentence in optimal units.

Owner:LG ELECTRONICS INC

Speech processing apparatus, method, and computer program product

Inactive | US20090248417A1 | Special data processing applications | Speech synthesis | Spoken language | Pitch contour

A method to generate a pitch contour for speech synthesis is proposed. The method is based on finding the pitch contour that maximizes a total likelihood function created by the combination of all the statistical models of the pitch contour segments of an utterance, at one or multiple linguistic levels. These statistical models are trained from a database of spoken speech, by means of a decision tree that for each linguistic level clusters the parametric representation of the pitch segments extracted from the spoken speech data with some features obtained from the text associated with that speech data. The parameterization of the pitch segments is performed in such a way that the likelihood function of any linguistic level can be expressed in terms of the parameters of one of the levels, thus allowing the maximization to be calculated with respect to the parameters of that level. Moreover, the parameterization of that main level has to be invertible so that the final pitch contour is obtained from the parameters of that level by means of an inverse transformation.

Owner:KK TOSHIBA

Coding device, decoding device, coding method, and decoding method

Active | US20130144611A1 | Improve sound quality | Speech analysis | Multiplexing | Multiplexer

A coding device includes: a pitch contour detection unit which detects a pitch contour of an input audio signal; a dynamic time warping unit which determines the number of pitch nodes based on the pitch contour and generates a first time warping parameter including information indicating the determined number of pitch nodes, a pitch change position, and a pitch change ratio; a first encoder which codes the first time warping parameter; a time warping unit which corrects pitch, using the information obtained from the first time warping parameter, to approximate the pitches of the number of pitch nodes to a predetermined reference value; a second encoder which codes the input audio signal at the corrected pitch; and a multiplexer which multiplexes the coded time warping parameter and the coded audio signal to generate a bitstream.

Owner:PANASONIC CORP

Method for displaying words and processing device and computer program product thereof

Inactive | US20130325464A1 | Substation equipment | Speech recognition | Display device | Time alignment

The disclosure provides a method for displaying words. In the method, a speech signal is received. A pitch contour and an energy contour of the speech signal are extracted. Speech recognition is performed on the speech signal to recognize a plurality of words corresponding to the speech signal and determine time alignment information of each of the plurality of words. At least one display parameter of each of the plurality of words is determined according to the pitch contour, the energy contour and the time alignment information of each of the plurality of words. Thus, the plurality of words is integrated into a sentence according to the at least one display parameter of each of the plurality of words. Then, the sentence is displayed on at least one display device.

Owner:QUANTA COMPUTER INC

Dynamic programming based humming melody extracting and matching search method

Inactive | CN105022744A | Special data processing applications | Dynamic planning | Type conversion

The present invention discloses a dynamic programming based humming melody extracting and matching search method. The method comprises the following steps: acquiring, through a microphone, a section of a song hummed by a user as an audio signal; computing the logarithmic energy curve of the input signal and smoothing the curve; dynamically estimating an energy threshold for the audio region from the maximum and minimum values of the curve; cutting out the sections with continuous sound so that each section corresponds to one hummed note; estimating the fundamental frequency of each frame of the hummed signal with a time-domain autocorrelation method and converting the fundamental frequency to semitone units; calculating the pitch of each audio frame with a rule-based method and smoothing the melody curve to remove noise sections, so as to obtain the effective hummed melody; for a three-level pitch contour representation of the melody, matching the melody against indexed network audio files using a recursive calculation of the minimum edit distance between pitch contours; and returning the network audio files with the highest overall similarity scores to the user as the search result.

Owner:上海京知信息科技有限公司

Audio transform coding using pitch correction

Active | CN101743585A | Efficient determination | Speech analysis | Pitch contour | Audio frequency

A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within a first and a second frame of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and the second frame to derive a first sampled representation. The audio signal is also sampled within the second frame and a third frame, the third frame following the second frame in the sequence of frames. This sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representation or the second sampled representation.

Owner:FRAUNHOFER GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG EV

Method for extracting melody of counterpoint based on GPU

Active | CN103247286A | Easy programming | Getting Started Quickly | Electrophonic musical instruments | Frequency spectrum | Pitch contour

The invention provides a GPU (graphics processing unit) based parallelization method for extracting the melody of counterpoint. The method comprises the following three steps: first, performing spectral transformation and pitch salience computation on the music data on the GPU; second, constructing pitch contours from the pitch salience on the GPU, obtaining the relevant characteristics of the pitch contours, and performing voice detection using these characteristics; third, removing frequency-doubling errors and outliers from the remaining pitch contours to finally obtain the melody track of the counterpoint. The method is based on counterpoint and can be applied to music without background music and speech sounds. Because the melody is extracted on the GPU, the extraction time decreases from the order of seconds to the order of milliseconds, which meets the requirements of real-time applications. Furthermore, the hardware resources required for extraction are greatly reduced and the development speed of the algorithm is greatly improved, giving the method wide practical value and application prospects in commercial applications and scientific research.
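
The abstract does not say how pitch salience is computed. One common approach is harmonic summation, sketched below on the CPU for a single frame. The function names, the number of harmonics, the decay weight and the 50-cent tolerance are assumptions made for illustration, and no GPU parallelization is shown.

```python
import math

def salience(candidate_hz, peaks, harmonics=5, decay=0.8, tolerance_cents=50.0):
    """Sum the weighted magnitudes of spectral peaks lying near harmonics of a candidate pitch.

    peaks is a list of (frequency_hz, magnitude) pairs from one spectral frame.
    """
    total = 0.0
    for h in range(1, harmonics + 1):
        target = h * candidate_hz
        for freq, mag in peaks:
            # Distance between the peak and the h-th harmonic, in cents.
            if abs(1200.0 * math.log2(freq / target)) <= tolerance_cents:
                total += decay ** (h - 1) * mag
    return total

def most_salient_pitch(candidates, peaks):
    """Return the candidate pitch with the highest salience in this frame."""
    return max(candidates, key=lambda f: salience(f, peaks))
```

Because each candidate pitch and each frame is scored independently, this kind of computation is a natural fit for the per-thread parallelism the abstract attributes to the GPU.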

Owner:上海芷锐电子科技有限公司

System and method for scoring a singing voice

Active | US8575465B2 | Gearworks | Musical toys | Computer module | Pitch contour

A system for scoring a singing voice comprises a receiving means for receiving a singing reference audio signal and/or a user audio signal and/or a pitch contour representation (PCR) of the reference and/or user singing audio signals, and a processor means connected to the receiving means. The processor means comprises a PCR module (10) for determining a PCR of the singing reference and/or user audio signal, and a time synchronization module for time synchronizing the PCRs of the reference and user audio signals. A selection module is provided for selecting a segment of the PCRs based on pre-defined criteria. A cross-correlation module is provided for performing time-warped cross-correlation on the selected segments of the PCRs and outputting a cross-correlation score. The system further comprises a key matching module and a rhythm matching module for key matching and rhythm matching the remaining unselected segments of the PCRs and outputting a key matching score and a rhythm matching score, respectively, and a scoring module (16) for determining a singing score based on a combination of the cross-correlation, key matching and rhythm matching scores with pre-determined weightage. A user interface means connected to the processor allows at least one module parameter within at least one module to be changed, and stores and displays the PCR and singing score.
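
The scoring stage can be illustrated with a small sketch. It assumes the two PCRs are already time-synchronized and of equal length, uses plain normalized correlation in place of the time-warped cross-correlation named in the abstract, and combines three scores with hypothetical weights; none of these names or values come from the patent.

```python
import math

def normalized_correlation(ref, user):
    """Correlation in [-1, 1] between two equal-length pitch contours."""
    n = len(ref)
    if n == 0 or n != len(user):
        raise ValueError("contours must be non-empty and of equal length")
    mean_ref = sum(ref) / n
    mean_user = sum(user) / n
    num = sum((r - mean_ref) * (u - mean_user) for r, u in zip(ref, user))
    den = math.sqrt(sum((r - mean_ref) ** 2 for r in ref)
                    * sum((u - mean_user) ** 2 for u in user))
    return num / den if den else 0.0

def singing_score(corr, key, rhythm, weights=(0.5, 0.25, 0.25)):
    """Weighted combination of the three partial scores (weights are assumed)."""
    w_corr, w_key, w_rhythm = weights
    return w_corr * corr + w_key * key + w_rhythm * rhythm
```

Mean-removed correlation ignores a constant transposition, so a melody sung an octave low still correlates perfectly with the reference; a separate key matching score, as in the abstract, is one way to account for that.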

Owner:SENSIBOL AUDIO TECH PVT LTD

Prosody generation using syllable-centered polynomial representation of pitch contours

Active | US8886539B2 | Speech synthesis | Syllable | Spoken language

The present invention discloses a parametric representation of prosody based on polynomial expansion coefficients of the pitch contour near the center of each syllable. These syllable pitch expansion coefficients are generated from a recorded speech database in which a reference speaker reads a number of sentences. By correlating the stress level and context information of each syllable in the text with the polynomial expansion coefficients of the corresponding spoken syllable, a correlation database is formed. To generate prosody for an input text, the stress level and context information of each syllable in the text are identified. The prosody is generated by using the correlation database to find the best set of pitch parameters for each syllable. By adding these to global pitch contours and using interpolation formulas, a complete pitch contour for the input text is generated. Duration and intensity profiles are generated using a similar procedure.
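
A syllable-centered polynomial expansion can be sketched as a least-squares fit of the pitch samples around the syllable center. The function names, the default degree of 2 and the use of normal equations are assumptions for illustration; the abstract does not specify the polynomial order or the fitting procedure.

```python
def fit_polynomial(times, pitches, center, degree=2):
    """Least-squares coefficients c[k] of sum_k c[k] * (t - center)**k.

    Needs at least degree + 1 distinct time points, otherwise the system is singular.
    """
    xs = [t - center for t in times]
    m = degree + 1
    # Normal equations A c = b with A[i][j] = sum x^(i+j), b[i] = sum y * x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, pitches)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def evaluate(coeffs, t, center):
    """Evaluate the fitted polynomial at time t."""
    return sum(c * (t - center) ** k for k, c in enumerate(coeffs))
```

With this parameterization, `coeffs[0]` is the pitch at the syllable center and `coeffs[1]` is the slope there, which gives each syllable a compact, comparable description.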

Owner:THE TRUSTEES OF COLUMBIA UNIV IN THE CITY OF NEW YORK

Music melody generation method based on pitch contour curve

Pending | CN111739492A | Electrophonic musical instruments | Speech analysis | Nerve network | Network generation

The invention relates to the technical field of music generation, and specifically to a music melody generation method based on the pitch contour curve. The method comprises the following steps: (1) long-term structure information of the pitch contour curve is extracted in the frequency domain; this information comprises the low-frequency part of the frequency-domain sequence of the pitch contour curve and reflects the long-term trend of the melody; (2) a neural network with label control is fitted to the long-term structure information, so that long-term structure information corresponding to a given label can be generated; and (3) another neural network is trained on the long-term structure information and the melody length information of the music data, so that it can infer the melody length information from the long-term structure information. The invention thus uses the frequency-domain characteristics of the pitch contour curve to generate melodies with a controllable long-term structure, so that the generated music is closer to the distribution of real music than music generated by a long short-term memory (LSTM) network.
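
Step 1 can be illustrated by keeping only the lowest bins of the discrete Fourier transform of a pitch contour and transforming back. This is a sketch with assumed names and an assumed cutoff parameter `keep`; it uses a naive O(n²) transform for clarity, and the neural networks of steps 2 and 3 are not shown.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def long_term_structure(contour, keep):
    """Low-pass the contour: keep DFT bins 0..keep and their mirror bins, zero the rest."""
    n = len(contour)
    spectrum = dft(contour)
    kept = [c if (k <= keep or k >= n - keep) else 0j
            for k, c in enumerate(spectrum)]
    # Inverse DFT; the result is real because mirrored bins are kept in pairs.
    return [(sum(kept[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]
```

With `keep=0` the result is the mean pitch of the melody; raising `keep` gradually restores slower, then faster, movements of the contour.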

Owner:NANJING UNIV OF POSTS & TELECOMM

Speech signal processing method and apparatus and hearing aid using the same

Inactive | CN104205213A | Increase the slope | Improve speech intelligibility | Hearing aids signal processing | Speech synthesis | Time domain | Frequency spectrum

A speech signal processing method is disclosed. The method includes the following steps: step S1: converting a speech waveform into a digital signal; step S2: computing a short-time spectrum from the digital signal obtained in step S1; step S3: shifting the pitch of the short-time spectrum to obtain a spectrum with modified pitch, using the pitch shift algorithm F0new(n) = C × F0origin(n), where F0new(n) denotes the tone-enhanced pitch contour samples, F0origin(n) denotes the pitch contour samples of the original speech signal, and C is a pitch shifting factor larger than 1; step S4: converting the spectrum with modified pitch back to a time-domain signal; step S5: re-sampling the time-domain signal obtained in step S4 to obtain a re-sampled speech signal; and step S6: converting the re-sampled speech signal back to a waveform.
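
The pitch shift formula of step S3 amounts to scaling every pitch contour sample by the same factor. The sketch below applies it to a list of F0 values and also expresses the factor in semitones; the function names and the convention that unvoiced frames are stored as 0.0 are assumptions, and the spectral processing of the other steps is not shown.

```python
import math

def shift_pitch_contour(f0_origin, c):
    """Apply F0new(n) = C * F0origin(n) to every pitch contour sample."""
    if c <= 1.0:
        raise ValueError("the abstract requires a pitch shifting factor C > 1")
    # Unvoiced frames stored as 0.0 (an assumed convention) stay at 0.0.
    return [c * f0 for f0 in f0_origin]

def shift_in_semitones(c):
    """Size of the shift in semitones for a multiplicative factor C."""
    return 12.0 * math.log2(c)
```

Because the factor is multiplicative, the whole contour moves up by a constant musical interval: for example, C = 2 raises every sample by 12 semitones (one octave).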

Owner:SIEMENS AG

Method for displaying words and processing device and computer program product thereof

Inactive | US8935165B2 | Substation equipment | Speech recognition | Display device | Time alignment

The disclosure provides a method for displaying words. In the method, a speech signal is received. A pitch contour and an energy contour of the speech signal are extracted. Speech recognition is performed on the speech signal to recognize a plurality of words corresponding to the speech signal and determine time alignment information of each of the plurality of words. At least one display parameter of each of the plurality of words is determined according to the pitch contour, the energy contour and the time alignment information of each of the plurality of words. Thus, the plurality of words is integrated into a sentence according to the at least one display parameter of each of the plurality of words. Then, the sentence is displayed on at least one display device.
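
How per-word display parameters might be derived can be sketched as follows. The abstract does not say which display parameters are used or how they are computed, so everything here is hypothetical: time alignment is given as frame index ranges, the font size grows with the word's mean energy, and the mean pitch is reported for a renderer to use (for example as a vertical offset).

```python
def display_parameters(words, pitch, energy, base_size=16):
    """Derive per-word display parameters from frame-level contours.

    words is a list of (text, start_frame, end_frame) with end_frame exclusive;
    pitch and energy are frame-level contours of equal length.
    """
    max_energy = max(energy) or 1.0  # avoid dividing by zero on silence
    params = []
    for text, start, end in words:
        n = end - start
        mean_energy = sum(energy[start:end]) / n
        mean_pitch = sum(pitch[start:end]) / n
        params.append({
            "word": text,
            # Assumed mapping: louder words are drawn larger, up to 2x base size.
            "font_size": round(base_size * (1 + mean_energy / max_energy)),
            "mean_pitch": mean_pitch,
        })
    return params
```

The returned list keeps the recognized word order, so it can be joined into a sentence and passed to whatever rendering layer draws the text.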

Owner:QUANTA COMPUTER INC

Novel color changing device for computerized flat knitting machine

Pending | CN108221157A | Easy to install and maintain | Rich weaving state design | Weft knitting | Motor drive | Engineering

The invention discloses a novel color changing device for a computerized flat knitting machine, which comprises a seat body and a mandril device component. The mandril device component is provided with several groups of seat bodies, and comprises a mandril, a sway rod, a cam and a motor driving device. A seat plate is arranged on the seat body, and the motor driving device is fixed on the seat plate. The cam is connected with the motor driving device, and the mandril is movably embedded in the seat body. A supporting spring is arranged between the upper end of the mandril and the seat plate, and the sway rod is hinged to one side of the lower end of the seat body. The front end of the sway rod is sleeved onto the lower end of the mandril, and its back end is bonded to the lower part of the periphery of the outline of the cam. An inductive sensor corresponding to the upper part of the cam is arranged on the seat plate. The whole structure realizes switching of the vertically moving mandril among three working positions; the mandril is moved to the required height by adopting equal pitch contour sections of different zone sections of the cam. The structure has low working noise and stable action, and can meet the usage demands of different production states.

Owner:QUANZHOU YONGQI PLASTIC ELECTRON
