
47 results about "Vocal quality" patented technology

Factors that determine quality: production of the original tone by the vocal folds (phonation), and the selection, reinforcement and damping of this tone by the resonators (throat, mouth, nose). Types of vocal quality: breathy, strident, harsh, hypernasality, glottal fry, throaty, glottal attack, hoarse, hyponasality.

Training method of voice bandwidth expansion model and voice bandwidth expansion method

The invention discloses a training method for a voice bandwidth expansion model and a voice bandwidth expansion method. The bandwidth expansion method comprises the following steps: acquiring the narrowband voice to be expanded; calculating its amplitude spectrum and phase spectrum and extracting its auxiliary features; processing the amplitude spectrum and the auxiliary features with the trained voice bandwidth expansion model to obtain the amplitude spectrum of the reconstructed high-frequency band of the wideband voice; mirror-reversing the phase spectrum of the narrowband voice in the frequency domain to determine the phase spectrum of the high-frequency band; and determining the wideband voice signal from the amplitude and phase spectra of the narrowband voice combined with the amplitude and phase spectra of the reconstructed high-frequency band. The method improves the tone quality and naturalness of narrowband voice.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
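
The phase step of the bandwidth expansion abstract above can be illustrated with a minimal sketch. It assumes that "mirror image reversing" means flipping the narrowband phase bins about the cutoff, and that the high band has as many bins as the narrow band; the abstract confirms neither detail. The function names are hypothetical, and `high_mag` stands in for the output of the trained model.

```python
import cmath

def mirror_phase(narrow_phase):
    """Flip the narrowband phase bins about the cutoff (assumed meaning of 'mirror')."""
    return list(reversed(narrow_phase))

def assemble_wideband(narrow_mag, narrow_phase, high_mag):
    """Join narrowband bins with a reconstructed high band into one complex spectrum.

    high_mag is a placeholder for the model-predicted high-band amplitude spectrum.
    """
    if not (len(narrow_mag) == len(narrow_phase) == len(high_mag)):
        raise ValueError("this sketch assumes equal-length narrow and high bands")
    high_phase = mirror_phase(narrow_phase)
    mags = list(narrow_mag) + list(high_mag)
    phases = list(narrow_phase) + high_phase
    # Polar form (magnitude, phase) -> complex bin.
    return [cmath.rect(m, p) for m, p in zip(mags, phases)]

spectrum = assemble_wideband([1.0, 2.0], [0.0, 0.5], [0.3, 0.1])
```

An inverse transform of the assembled spectrum would then give the time-domain wideband signal; that step is omitted here.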

System for improving speech quality and intelligibility with bandwidth compression/expansion

A system and method are provided for improving the quality and intelligibility of speech signals. The system and method apply frequency compression to the higher frequency components of speech signals while leaving lower frequency components substantially unchanged. This preserves higher frequency information related to consonants which is typically lost to filtering and bandpass constraints. This information is preserved without significantly altering the fundamental pitch of the speech signal so that when the speech signal is reproduced its overall tone qualities are preserved. The system and method further apply frequency expansion to speech signals. Like the compression, only the upper frequencies of a received speech signal are expanded. When the frequency expansion is applied to a speech signal that has been compressed according to the invention, the speech signal is substantially returned to its pre-compressed state. However, frequency compression according to the invention provides improved intelligibility even when the speech signal is not subsequently re-expanded. Likewise, speech signals may be expanded even though the original signal was not compressed, without significant degradation of the speech signal quality. Thus, a transmitter may include the system for applying high frequency compression without regard to whether a receiver will be capable of re-expanding the signal. Likewise, a receiver may expand a received speech signal without regard to whether the signal was previously compressed.
Owner:MALIKIE INNOVATIONS LTD
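
The idea of compressing only the upper frequencies, and of expansion undoing that compression, can be sketched as a frequency mapping. The linear mapping, the 3000 Hz knee and the ratio of 2 are illustrative assumptions, not values from the abstract.

```python
def compress_frequency(f_hz, knee_hz=3000.0, ratio=2.0):
    """Leave frequencies up to the knee unchanged; squeeze those above it."""
    if f_hz <= knee_hz:
        return f_hz
    return knee_hz + (f_hz - knee_hz) / ratio

def expand_frequency(f_hz, knee_hz=3000.0, ratio=2.0):
    """Inverse mapping: stretch frequencies above the knee back out."""
    if f_hz <= knee_hz:
        return f_hz
    return knee_hz + (f_hz - knee_hz) * ratio

low = compress_frequency(1000.0)      # below the knee, unchanged
high = compress_frequency(5000.0)     # above the knee, moved down
restored = expand_frequency(compress_frequency(7000.0))
```

Because the two mappings share the same knee, content below it passes through either stage untouched, which is consistent with the abstract's point that a transmitter and receiver can each apply their stage independently.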

Method and device for singing imitation

Active | CN104464725A | Improve network smoothing problem; Calculation speed; Speech recognition; Speech synthesis; Vocal quality; Imitation
The invention provides a method for singing imitation. The method comprises the following steps: preparing corresponding audio materials of a source singer and a target singer; analyzing the voice features of both singers with the STRAIGHT model; training a joint GMM (Gaussian mixture model) of the source singer and the target singer; applying Gaussian transform functions based on inter-frame correlation during tone conversion; mixing the tone of the source singer and the tone of the target singer in proportion; and reconstructing voice with the tone of the target singer using the STRAIGHT model. The invention further provides a device for implementing the method. The method and device convert the voice of the source singer into voice with the tone of the target singer; the converted tone quality is good, the tone is close to that of the target singer, and the target singer's tone features can be added in proportion. In particular, when the target singer is a music star, the user's confidence and enjoyment when singing at a digital audio and video venue can be greatly improved, as can the user's ability to imitate the star's tone.
Owner:Fujian Kaimi Network Technology Co., Ltd. (福建凯米网络科技有限公司)
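
The "mixed in proportion" step of the singing imitation abstract can be sketched as a per-frame blend of spectral feature vectors. Linear interpolation is an assumption here, since the abstract does not state the mixing rule, and `mix_timbre` is a hypothetical name. The converted frame stands in for the output of the GMM-based conversion.

```python
def mix_timbre(source_frame, converted_frame, target_ratio):
    """Blend one frame of source features with the converted (target-like) features.

    target_ratio = 0.0 keeps the source tone; 1.0 uses the converted tone only.
    """
    if not 0.0 <= target_ratio <= 1.0:
        raise ValueError("target_ratio must lie in [0, 1]")
    if len(source_frame) != len(converted_frame):
        raise ValueError("frames must have the same number of features")
    return [(1.0 - target_ratio) * s + target_ratio * t
            for s, t in zip(source_frame, converted_frame)]

blended = mix_timbre([1.0, 2.0], [3.0, 6.0], 0.25)
```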

Hi-Fi tone quality identifying method based on pattern recognition

The invention discloses a Hi-Fi tone quality identifying method based on pattern recognition. The method includes the following steps: selecting learning samples; converting each sample's time-domain signal into a frequency-domain signal; and taking the mean frequency attenuation in each 500 Hz band from 11050 Hz to 22050 Hz as a feature, giving twenty-two features numbered one to twenty-two. Statistical learning over a large number of labeled lossy and lossless audio files yields the conditional probability of each feature given lossless audio and given lossy audio. A Fourier transform is then applied to the audio under test and its frequency features are extracted. From the conditional probabilities among the features and between the features and the tone quality, a joint probability is calculated, from which the probability that the given audio has a specific tone quality is estimated. The algorithm achieves a high detection success rate for genuinely lossless audio while keeping a relatively low misjudgment rate for fake lossless audio. Compared with auCDtect, the method reaches an advanced level in the industry.
Owner:ZHEJIANG TIANGE INFORMATION TECH CO LTD
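
The feature layout in the Hi-Fi abstract above (one mean per 500 Hz band from 11050 Hz to 22050 Hz, twenty-two in all) can be sketched as follows. The posterior function assumes the features are conditionally independent (a naive Bayes simplification); the abstract also mentions probabilities between features, so this is not the patented model. All names are hypothetical.

```python
import math

def band_edges(start_hz=11050, stop_hz=22050, step_hz=500):
    """Return the (low, high) edges of each 500 Hz band."""
    return [(lo, lo + step_hz) for lo in range(start_hz, stop_hz, step_hz)]

def band_means(freqs_hz, levels_db):
    """Mean level per band; NaN marks a band with no spectrum bins."""
    feats = []
    for lo, hi in band_edges():
        vals = [lv for f, lv in zip(freqs_hz, levels_db) if lo <= f < hi]
        feats.append(sum(vals) / len(vals) if vals else float("nan"))
    return feats

def lossless_posterior(lik_lossless, lik_lossy, prior_lossless=0.5):
    """Combine per-feature likelihoods (all > 0) under the independence assumption."""
    log_a = math.log(prior_lossless) + sum(math.log(p) for p in lik_lossless)
    log_b = math.log(1.0 - prior_lossless) + sum(math.log(p) for p in lik_lossy)
    m = max(log_a, log_b)  # subtract the max before exponentiating for stability
    a, b = math.exp(log_a - m), math.exp(log_b - m)
    return a / (a + b)

features = band_means([11100, 11200, 21600], [-10.0, -20.0, -60.0])
```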

Voice signal-based gradient boosting decision tree depression identification method

Active | CN112006697A | Improve speech processing quality; Improve objectivity; Speech analysis; Sensors; Depression screening; Learning methods
The invention relates to a voice signal-based gradient boosting decision tree depression identification method. The method comprises the following steps: obtaining voice data of an interviewer together with the corresponding PHQ-8 depression screening scale score, so that each voice signal corresponds to a PHQ-8 value, and selecting a training sample set and a test sample set; extracting from the voice signal prosodic features that represent emotion and depression, spectrum-based features and tone quality features; and learning on the training set with a gradient boosting decision tree, taking the PHQ-8 score as the output and as the basis for judging the degree of depression. Using the gradient boosting decision tree as the learning method improves the accuracy of the predicted PHQ-8 value and the timeliness of training. The PHQ-8 value lies between 0 and 24; depression is determined when the score is higher than 10 and lower than 20, and severe depression is determined when the score is higher than 20. The method has higher accuracy and objectivity.
Owner:SOUTHEAST UNIV
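
The decision rule stated in the depression identification abstract can be written out directly. The abstract does not say how a score of exactly 20 is handled, nor what label applies at 10 or below, so the two corresponding return values in this sketch are placeholders rather than part of the method.

```python
def classify_phq8(score):
    """Map a predicted PHQ-8 score (0 to 24) to the categories named in the abstract."""
    if not 0 <= score <= 24:
        raise ValueError("PHQ-8 scores lie between 0 and 24")
    if score > 20:
        return "severe depression"
    if 10 < score < 20:
        return "depression"
    if score == 20:
        # Assumption: the abstract leaves this boundary undefined.
        return "boundary (unspecified in the abstract)"
    # Assumption: scores of 10 or below are simply labeled as below the threshold.
    return "below threshold"

label = classify_phq8(15)
```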

Non-periodic component syllable model building and speech synthesizing method and device

Inactive | CN104282300A | Preserve sound quality; Small scale; Speech synthesis; Syllable; Data information
The invention discloses a non-periodic component syllable model building and speech synthesizing method and device. For each syllable in an original speech waveform file, the method takes the non-periodic component representative value of every frame on each of the divided frequency bands and uses a discrete cosine transform to obtain a non-periodic component spectrum fitting curve for the syllable on each selected frequency band. It then generates a non-periodic component syllable model containing these fitting curves for all syllables of the original speech waveform file on the different frequency bands. The data in the syllable model are thereby reduced from (number of frequency bands × number of syllable frames) values to one fitting curve per frequency band, which downsizes the speech model and saves system resources. Because a fitting curve is built for each syllable, the continuity among the frames of a syllable is fully considered, the original tone quality of the syllable is kept, and the quality of the synthetic speech is improved in the synthesis process.
Owner:CHINA MOBILE COMM GRP CO LTD
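
One way to read the fitting step in the abstract above is a truncated discrete cosine transform of each band's per-frame trajectory: only the first few coefficients are stored, and the smooth curve is rebuilt from them at synthesis time. The sketch below assumes an unnormalized DCT-II and a fixed coefficient count, neither of which is specified in the abstract; the function names are hypothetical.

```python
import math

def fit_curve(values, n_coeffs):
    """Keep the first n_coeffs DCT-II coefficients of one band's frame trajectory."""
    n = len(values)
    if not 1 <= n_coeffs <= n:
        raise ValueError("n_coeffs must be between 1 and the number of frames")
    return [sum(values[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n_coeffs)]

def evaluate_curve(coeffs, n_frames):
    """Rebuild the trajectory from the kept coefficients (dropped ones count as zero)."""
    return [coeffs[0] / n_frames
            + (2.0 / n_frames) * sum(
                coeffs[k] * math.cos(math.pi * (t + 0.5) * k / n_frames)
                for k in range(1, len(coeffs)))
            for t in range(n_frames)]

trajectory = [0.2, 0.5, 0.1, 0.9, 0.4, 0.3]   # one band, six frames (made-up values)
curve = fit_curve(trajectory, 3)              # three numbers stored instead of six
smoothed = evaluate_curve(curve, len(trajectory))
```

Keeping every coefficient reproduces the trajectory exactly; keeping fewer trades detail for a smaller model.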

Double-microphone earphone microphone

Pending | CN107241666A | Solve the problems of loud noise, mic spray and poor sound quality; Solve the problem of microphone spraying and poor sound quality; Microphones; Loudspeakers; Sound sources; Noise
The invention provides a double-microphone earphone microphone. A circuit board is installed in a cavity, and two voice tubes are installed on the circuit board: an omnidirectional voice tube and a unidirectional voice tube. A control switch controls the circuits of the two voice tubes. When the switch slide sheet slides upwards, the circuit of the unidirectional voice tube is connected, the unidirectional voice tube starts to work, and the circuit of the omnidirectional voice tube is disconnected; when the switch slide sheet slides downwards, the circuit of the omnidirectional voice tube is connected, the omnidirectional voice tube starts to work, and the circuit of the unidirectional voice tube is disconnected. The unidirectional voice tube is mainly used for singing: voice is transmitted accurately into it through a sound inlet port in the upper lid of the cavity, so tone quality is transmitted efficiently, noise caused by air blowing onto the voice tube is reduced, the poor tone quality that this causes when a consumer sings is avoided, and the tone becomes more pleasant. The omnidirectional voice tube is mainly used for everyday telephone conversations: it removes the constraint that the orientation of the sound source is not fixed, so the conversation stays clear even if body motion makes the voice tube swing.
Owner:Lu Wensheng (陆文胜)

Speech emotion recognition method based on spectral features and ELM

Active | CN110827857A | Solve the problem of single feature extraction and poor robustness; Reduce processing time; Speech recognition; Learning machine; Biology
The invention provides a speech emotion recognition method based on spectral features and ELM. The method comprises the following steps: extracting basic features of an original speech signal, including prosodic features and tone quality features; extracting Mel frequency cepstrum coefficients (MFCC) and cochlear filter cepstrum coefficients (CFCC) using the Teager energy operator (TEO) algorithm, weighting the MFCC and the CFCC to obtain a teCMFCC feature, and fusing the teCMFCC feature with the basic feature values to construct a feature matrix; performing selective dimension reduction on the features using the Fisher criterion and correlation analysis while retaining the individual characteristics of the speech signals; and establishing an extreme learning machine (ELM) decision tree model to complete speech emotion recognition and classification. The method emphasizes the nonlinear characteristics of speech signals and provides good robustness. Tests on the CASIA Chinese emotion corpus recorded by the Institute of Automation, Chinese Academy of Sciences verify that the proposed algorithm achieves good classification and recognition precision on Chinese speech signals.
Owner:HARBIN ENG UNIV
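
The Teager energy operator named in the speech emotion abstract has a standard discrete form, psi[x(n)] = x(n)^2 - x(n-1) * x(n+1). The sketch below shows only that operator; how the patent combines it with the MFCC and CFCC extraction and weighting is not described in the abstract and is not attempted here.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator; the output has two fewer samples than x."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# For a pure tone A*cos(w*n) the operator is constant and equals A^2 * sin(w)^2,
# which is why it is used to track amplitude and frequency changes together.
tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
psi = teager_energy(tone)
```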

Method for implementing orchestral resonance instrument with bass extension effect

Inactive | CN102663997A | Reduce noise; Balanced and melodious sound; Musical instruments; Breathy voice; Acoustic theory
A method for implementing an orchestral resonance instrument with a bass extension effect is designed to solve two problems: existing national bass string instruments are slow in audio vibration, low in volume and dull in tone, and the Western violin is extremely inharmonious with the style of national music. The stem is a hollow tube that serves as an auxiliary resonance cavity for the barrel, which is the main resonance cavity. The membrane is a mocking leather membrane, and at least six metal strings are uniformly distributed on the tubular stem and the barrel by means of drawing a bow, so that the main and auxiliary resonance cavities produce a stereoscopic resonance effect. The audio is further filtered by a loudspeaker opening on the surface of the main resonance cavity and a loudspeaker opening on the surface of the auxiliary resonance cavity. The method applies modern instrumental acoustic theories to the resonance characteristics of the traditional string instrument and fundamentally changes the single resonance cavity structure of the traditional bass Chinese fiddle and the Ke-hu. It highlights the stereoscopic resonance effect generated by the barrel as the main circular resonance cavity and the tubular stem as the auxiliary resonance cavity, and filtering through the two loudspeaker openings effectively eliminates noise and murmur, so that the tone quality, the volume and the tone are more balanced and melodious.
Owner:SHENYANG NORMAL UNIV

Storable English listening training device

The invention discloses a storable English listening training device and belongs to the field of English teaching. The device comprises a box body whose front end has an opening fitted with a box door. The box body and the box door are movably connected through a rotating shaft: the bottom of the box door is fixed to the rotating shaft through a first rotating seat, and the lower side of the opening at the front end of the box body is rotationally mounted on the rotating shaft through a second rotating seat. A player is arranged in the box body, with sliding blocks installed on its bottom and top faces. First sliding grooves matched with the sliding blocks are formed in the inner bottom wall and inner top wall of the box body, and second sliding grooves matched with the sliding blocks are formed in the inner wall of the box door. The player can be stored after use, which prevents dust from degrading its tone quality and avoids damage to the player and related circuits. In addition, heat can be dissipated efficiently from the player, so high temperature does not affect its normal use.
Owner:SHAANXI UNIV OF CHINESE MEDICINE