96 results about "Rate of speech" patented technology

Dialect speech recognition method and electronic device

The embodiments of the invention disclose a dialect speech recognition method and an electronic device. The method includes the following steps: the current location information of a user is acquired, and the current regional information of the user is determined according to the current location information; a preset target speech model bound to the current regional information is looked up, wherein the target speech model includes a set of mapping relationships between dialect speech corresponding to the current regional information and standard Mandarin speech; a target mapping relationship is determined from the set of mapping relationships, namely the one in which the content of the dialect speech is identical to the content of the instant dialect speech input by the user; the standard Mandarin speech mapped to the instant dialect speech input by the user is obtained from the target mapping relationship; and that standard Mandarin speech is converted into text information, and the text information is output. With the dialect speech recognition method and the electronic device provided by the embodiments of the invention, dialect speech can be correctly recognized, and the accuracy rate of speech recognition is improved.
Owner:GUANGDONG XIAOTIANCAI TECH CO LTD
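
The lookup chain in the dialect recognition abstract above (location, then region, then region-bound model, then Mandarin speech, then text) can be sketched as a series of table lookups. This is only an illustrative sketch, not the patented implementation: speech is represented by romanized strings instead of audio, and every name and table entry here (REGION_BY_CITY, SPEECH_MODELS, MANDARIN_TEXT, recognize) is a hypothetical stand-in.

```python
# Hypothetical tables: none of these names or entries come from the patent.
REGION_BY_CITY = {"Guangzhou": "cantonese"}

# One "target speech model" per region: dialect speech -> standard Mandarin speech.
SPEECH_MODELS = {
    "cantonese": {
        "nei5 hou2": "ni3 hao3",
        "zou2 san4": "zao3 shang4 hao3",
    },
}

# Standard Mandarin speech -> text (stands in for the speech-to-text step).
MANDARIN_TEXT = {
    "ni3 hao3": "你好",
    "zao3 shang4 hao3": "早上好",
}


def recognize(city, dialect_utterance):
    """Return the text for a dialect utterance, or None if any lookup fails."""
    region = REGION_BY_CITY.get(city)            # location -> regional information
    model = SPEECH_MODELS.get(region)            # region -> bound speech model
    if model is None:
        return None
    mandarin = model.get(dialect_utterance)      # target mapping relationship
    if mandarin is None:
        return None
    return MANDARIN_TEXT.get(mandarin)           # Mandarin speech -> text


print(recognize("Guangzhou", "nei5 hou2"))
```

A real system would match audio features rather than exact strings; exact-key lookup is used here only to keep the control flow visible.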

Speech Synthesizing Apparatus

A corpus-based speech synthesizing apparatus is provided which has: a text analysis unit for analyzing a given sentence in text data and generating phonetic symbol data corresponding to the sentence; a prosody estimation unit for generating a prosodic parameter representing an accent and an intonation corresponding to each phonetic symbol data according to a preset prosodic knowledge base for accents and intonations; a speech-unit extraction unit for extracting all the speech segment waveform data of a predetermined speech unit part from each speech data having the predetermined speech unit part closest to the prosodic parameter, based on a speech database which stores only plural kinds of predetermined, selectively prerecorded speech data, such that the speech database has predetermined speech units suitable for a specific application of the speech synthesizing apparatus; and a waveform connection unit for generating synthesized speech data by sequentially connecting the speech segment waveform data groups such that the speech waveform continues across them, wherein the respective functional units, a data input unit, a speech conversion processing unit, and a speech speed conversion unit are added or removed as desired depending on the specific application and the scale of the apparatus.
Owner:EE I KK

Spectrum-entropy improvement based speech endpoint detection method in low signal-to-noise ratio environment

The invention provides a spectrum-entropy improvement based speech endpoint detection method for a low signal-to-noise ratio environment. To solve the problem that a speech endpoint detection system has a low accuracy rate when recognizing the current speaker in a low signal-to-noise ratio environment, an endpoint detection method that improves this accuracy rate is provided. The detection method comprises the following steps: (1) pre-processing a speech signal in accordance with the characteristics of the signal; (2) dividing each frame of the speech signal into frequency sub-bands and calculating the spectrum-entropy and energy of the sub-bands, so that an energy-entropy ratio SEH of the sub-bands is finally obtained; and (3) setting a proper threshold value and, in combination with median filtering, obtaining the starting and ending positions of speech. The median filtering removes the influence of environmental noise, so that the speech signal is more stable and the accuracy rate of endpoint detection in the low signal-to-noise ratio environment is improved.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
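
Steps (2) and (3) of the endpoint detection abstract above can be illustrated with a small standard-library sketch. Several things here are assumptions rather than details from the patent: the ratio formula sqrt(1 + E/H) computed once per frame from sub-band energies, the frame length, the number of sub-bands, the threshold, and all function names. The pre-processing of step (1) is omitted, and a naive DFT is used because the frames are short.

```python
import cmath
import math
import random
from statistics import median


def band_energies(frame, n_bands=4):
    """Sub-band energies of one frame, from a naive DFT (DC bin skipped)."""
    n = len(frame)
    power = [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(1, n // 2 + 1)
    ]
    width = len(power) // n_bands
    return [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]


def energy_entropy_ratio(frame, n_bands=4):
    """Assumed form of the ratio: sqrt(1 + energy / spectral entropy)."""
    energies = band_energies(frame, n_bands)
    total = sum(energies) + 1e-12
    probs = [e / total for e in energies]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return math.sqrt(1.0 + total / (entropy + 1e-6))


def median_filter(values, radius=1):
    """Median over a sliding window, clipped at the sequence edges."""
    return [median(values[max(0, i - radius):i + radius + 1])
            for i in range(len(values))]


def detect_endpoints(signal, frame_len=32, threshold=5.0):
    """Return (first, last) frame index judged to be speech, or None."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    ratios = median_filter([energy_entropy_ratio(f) for f in frames])
    active = [i for i, r in enumerate(ratios) if r > threshold]
    return (active[0], active[-1]) if active else None


def make_demo_signal(frame_len=32, seed=0):
    """30 frames of low-level noise with a tone in frames 10 to 19."""
    rng = random.Random(seed)
    signal = []
    for frame_index in range(30):
        for t in range(frame_len):
            tone = (math.sin(2 * math.pi * 4 * t / frame_len)
                    if 10 <= frame_index < 20 else 0.0)
            signal.append(tone + rng.uniform(-0.05, 0.05))
    return signal


print(detect_endpoints(make_demo_signal()))
```

The ratio is large when a frame has high energy concentrated in few sub-bands (low entropy) and small for flat, weak noise, which is why a single threshold on the median-filtered sequence can separate the two in this toy signal.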

Method for realizing sound speed-variation without tone variation and system for realizing speed variation and tone variation

The invention discloses a system for realizing sound speed variation and tone variation, which comprises an input cache module, a tone variation processing module, a speed-variation no-tone-variation processing module and a data output module, wherein the input cache module is used for reading the sound signal data to be processed into the cache; the tone variation processing module is used for carrying out tone variation processing on the sound signal to change the sound tone; the speed-variation no-tone-variation processing module is used for carrying out speed-variation no-tone-variation processing on the sound signal, thereby changing the sound speed without changing the tone; and the data output module is used for outputting the speed-variation tone-variation signal. The speed-variation no-tone-variation processing module comprises a segmentation data module and a connection data module, wherein the segmentation data module extracts a string of signal subfamilies (namely small sections of sound) from the original speech signal according to the speed variation coefficient by using a window function; and the connection data module connects the signal subfamilies in time order, thereby obtaining the speed-variation no-tone-variation signal. The invention realizes the speed-variation no-tone-variation function and the speed-variation tone-variation function for audio with very low algorithmic complexity and does not introduce noise, thereby enhancing the quality of the processed sound.
Owner:刘盛举 +1
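
The segment-and-connect idea in the abstract above (extract short windowed sections at a spacing set by the speed coefficient, then join them in time order) resembles plain overlap-add time-scale modification. The sketch below uses that swapped-in textbook technique as an illustration; it is not the patent's algorithm, and the function names, the Hann window, the frame length, and the 50% output overlap are all assumptions.

```python
import math


def hann(n):
    """Periodic Hann window; at 50% overlap the shifted copies sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def change_speed(samples, speed, frame_len=256):
    """Overlap-add time scaling: speed > 1 shortens, speed < 1 lengthens."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    hop_out = frame_len // 2                      # spacing in the output
    hop_in = max(1, round(hop_out * speed))       # spacing in the input
    window = hann(frame_len)
    n_frames = max(1, (len(samples) - frame_len) // hop_in + 1)
    out = [0.0] * ((n_frames - 1) * hop_out + frame_len)
    for f in range(n_frames):
        start_in, start_out = f * hop_in, f * hop_out
        for i in range(frame_len):
            if start_in + i < len(samples):
                # Each windowed section keeps its own pitch; only its
                # position on the time axis changes.
                out[start_out + i] += samples[start_in + i] * window[i]
    return out


# One second of a 220 Hz tone at an assumed 8 kHz sample rate.
tone = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
fast = change_speed(tone, 2.0)
slow = change_speed(tone, 0.5)
print(len(tone), len(fast), len(slow))
```

Because the sections are not phase-aligned before being summed, plain overlap-add can produce audible artifacts on real speech; methods such as WSOLA add a small alignment search per section to reduce them.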

Voice recognition method and device and computer storage medium

The invention discloses a voice recognition method and device and a computer storage medium, which are used for solving the technical problems of a low speech recognition rate, inconvenience and slowness in the prior art. The method comprises: a step of collecting a user facial image through an image collecting device while collecting a user voice through a voice collecting device; a step of predicting a prediction voice corresponding to the user voice by using a prediction model based on the user voice and the user facial image, wherein the prediction model is obtained by training on the voices of different people corresponding to each control instruction and the corresponding facial images; a step of matching, based on the prediction voice, voice audio standard data corresponding to a control instruction in a voice database, wherein the voice database comprises a mapping relationship between each control instruction and the corresponding voice audio standard data; and a step of calculating the matching degree between the user voice and the voice audio standard data by a matching model, and controlling a smart home device according to the control instruction corresponding to the voice audio standard data when the matching degree reaches a set threshold.
Owner:GREE ELECTRIC APPLIANCES INC
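
The last two steps of the abstract above (select standard data from the voice database using the predicted voice, then act only if the user voice matches it closely enough) can be sketched as follows. This is a minimal illustration under stated assumptions: voices are represented by short feature vectors, cosine similarity stands in for the unspecified matching model, and VOICE_DB, select_instruction, the instruction names, and the threshold are all hypothetical. The prediction model itself is not sketched.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical voice database: control instruction -> standard feature vector.
VOICE_DB = {
    "turn_on_ac": [0.9, 0.1, 0.3],
    "turn_off_ac": [0.1, 0.8, 0.5],
}


def select_instruction(user_features, predicted_features, threshold=0.9):
    """Return the control instruction to execute, or None if below threshold."""
    # Use the predicted voice to pick the closest standard entry.
    instruction = max(
        VOICE_DB,
        key=lambda name: cosine_similarity(predicted_features, VOICE_DB[name]),
    )
    # Compare the actual user voice with that entry's standard data.
    degree = cosine_similarity(user_features, VOICE_DB[instruction])
    return instruction if degree >= threshold else None


print(select_instruction([0.85, 0.15, 0.3], [0.9, 0.1, 0.3]))
```

Returning None when the matching degree is below the threshold keeps the device from acting on a doubtful command.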

Chinese speech recognition system based on heterogeneous model differentiated fusion

The invention relates to a Chinese speech recognition system based on heterogeneous model differential fusion, which belongs to the field of speech recognition technology. The system comprises: a model-probability weight distribution module, a differential model-probability weight training module, a model-probability weight smoothing module and a differential-fusion speech recognition module. The model-probability weight distribution module is responsible for generating and initializing the relevant model-probability weight sets for the linguistic context of every arc of a lattice; the differential model-probability weight training module uses the minimum tone error rule to differentially train the output of the heterogeneous models and obtain a minimum tone error cumulant, from which a differential model-probability weight set is obtained; the model-probability weight smoothing module smooths the relevant model-probability weight sets that are input for the context; and the differential-fusion speech recognition module produces the speech recognition output using the smoothed weight sets. The system can reduce the relative error recognition rate of speech recognition.
Owner:SHANGHAI JIAO TONG UNIV

Method of processing emotion in voice, and mobile terminal

The invention provides a method of processing emotion in voice and a mobile terminal. The method includes the steps of: establishing an emotion database of a user according to the voice data entered by the user; identifying, based on the emotion data in the emotion database, the voice clips to be processed in the user's original voice data before sending, wherein the emotion data includes at least one of the following: a negative induced emotion word bank of the user, the average speed of the user and the average volume of the user, and the voice clips to be processed include negative induced emotion data; processing the negative induced emotion data of the voice clips to be processed in the original voice data, and generating the voice data to be transmitted; and replacing the original voice data with the voice data to be transmitted, and sending the voice data to be transmitted to the receiving terminal of the voice communication. By performing emotion processing on the voice data before sending it, the method and the mobile terminal prevent the receiving party from receiving voice data that is not conducive to communication, and thereby improve communication efficiency.
Owner:VIVO MOBILE COMM CO LTD
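
The identify-then-process flow in the abstract above can be sketched with clips reduced to a transcript, a speaking rate and a volume. This is an illustrative sketch only: the 1.5x ratio rule, the choice to drop flagged words and cap speed and volume at the user's averages, and all names (Clip, EmotionProfile, needs_processing, soften, prepare_for_sending) are assumptions that the patent abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    text: str      # transcript of the clip
    speed: float   # speaking rate, e.g. words per second
    volume: float  # loudness in arbitrary units


@dataclass(frozen=True)
class EmotionProfile:
    negative_words: frozenset  # user's negative-emotion word bank
    avg_speed: float           # user's average speaking rate
    avg_volume: float          # user's average volume


def needs_processing(clip, profile, ratio=1.5):
    """Flag a clip that contains a listed word or is unusually fast or loud."""
    words = set(clip.text.lower().split())
    return (bool(words & profile.negative_words)
            or clip.speed > ratio * profile.avg_speed
            or clip.volume > ratio * profile.avg_volume)


def soften(clip, profile):
    """Drop listed words and cap speed and volume at the user's averages."""
    kept = [w for w in clip.text.split()
            if w.lower() not in profile.negative_words]
    return Clip(" ".join(kept),
                min(clip.speed, profile.avg_speed),
                min(clip.volume, profile.avg_volume))


def prepare_for_sending(clips, profile):
    """Replace flagged clips with processed ones; pass the rest unchanged."""
    return [soften(c, profile) if needs_processing(c, profile) else c
            for c in clips]


profile = EmotionProfile(frozenset({"stupid"}), avg_speed=3.0, avg_volume=60.0)
clips = [Clip("see you tomorrow", 3.0, 58.0), Clip("that is stupid", 5.0, 80.0)]
print(prepare_for_sending(clips, profile))
```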