65 results about "Speech perception" patented technology

Speech perception is the process by which the sounds of language are heard, interpreted and understood. The study of speech perception is closely linked to the fields of phonology and phonetics in linguistics and cognitive psychology and perception in psychology. Research in speech perception seeks to understand how human listeners recognize speech sounds and use this information to understand spoken language. Speech perception research has applications in building computer systems that can recognize speech, in improving speech recognition for hearing- and language-impaired listeners, and in foreign-language teaching.

Methods and apparatus for maximizing speech intelligibility in quiet or noisy backgrounds

Active | US20050114127A1 | Improve speech clarity | Maximize intelligibility metric | Speech recognition | Public address system | Hearing test
Methods and apparatus for maximizing speech intelligibility use psycho-acoustic variables of a model of speech perception to control the determination of optimal frequency-band specific gain adjustments. Speech signals (or other audio input) whose intelligibility is to be improved are characterized by parameters which are applied to the model. These include measurements or estimates of speech intensity level, average noise spectrum of the incoming audio signal, and/or the current frequency-gain characteristic of the hearing compensation device. Characterizations of listeners based on hearing test results, for example, may also be applied to the model. Frequency-band specific gain adjustments generated by use of the model can be used for hearing aids, assistive listening devices, telephones, cellular telephones, or other speech delivery systems, personal music delivery systems, public-address systems, sound systems, speech generating systems, or other devices or mediums which project, transfer or assist in the detection or recognition of speech.
Owner:ARTICULATION

Sound-processing strategy for cochlear implants

Inactive | US20070043403A1 | Reduce impact | Normalising overall loudness | Electrotherapy | Ear treatment | Cochlear implantation | Prosthesis
A sound processing method for auditory prostheses, such as cochlear implants, which is adapted to improve the perception of loudness by users, and to improve speech perception. The overall contribution of stimuli to simulated loudness is compared with an estimate of acoustic loudness for a normally hearing listener based on the input sound signal. A weighting is applied to the filter channels to emphasize those frequencies which are most important to speech perception for normal hearing listeners when selecting channels as a basis for stimulation.
Owner:UNIVERSITY OF MELBOURNE

System for treating disabilities such as dyslexia by enhancing holistic speech perception

The present invention relates to systems and methods for enhancing the holistic and temporal speech perception processes of a learning-impaired subject. A subject listens to a sound stimulus which induces the perception of verbal transformations. The subject records the verbal transformations, which are then used to create further sound stimuli in the form of semantic-like phrases and an imaginary story. Exposure to the sound stimuli enhances holistic speech perception of the subject with cross-modal benefits to speech production, reading and writing. The present invention has application to a wide range of impairments including Specific Language Impairment, language learning disabilities, dyslexia, autism, dementia and Alzheimer's.
Owner:EPOCH INNOVATIONS

Method and system for speech quality perception evaluation based on speech semantic recognition technology

Active | CN108877839A | Good repeatability | Solve problems that cannot restore the thinking paradigm of the human brain | Natural language data processing | Speech recognition | User perception | Users perceptions
The invention discloses a method and system for speech quality perception evaluation based on speech semantic recognition technology. Text similarity is evaluated with a text similarity fitting algorithm, comparing the text converted from the user's speech at the sender with the text converted from the user's speech at the receiver; network parameters and event information of the communication unit connection networks of the sender and the receiver are displayed in real time and stored; a user speech perception evaluation model is established from speech information according to a telecom psychology algorithm, and speech perception evaluation is carried out for the user; the user perception evaluation is then formed from the text similarity evaluation, the network information and the voice perception evaluation. The method addresses the poor repeatability of subjective evaluation methods and the inability of objective methods to reproduce the human brain's thinking paradigm; it is close to the human brain's thinking mode and to the user's perception of the speech quality of network conversations; and, on the basis of time-position mapping, a network issue can be located precisely by combining the network parameter information and events.
Owner:NANJING HOWSO TECH

Cochlea stimulator

The invention provides a cochlea stimulator for implantation comprising optical fibres which are coupled to an irradiation source that is controlled by a modulator to generate irradiation specific for a pre-determined range of sound frequencies. The cochlea stimulator effects a frequency-specific activation of the organ of Corti, as needed for the perception of speech, especially in noisy environments, and of more complex sounds. For imparting excitation signals, which are generated by modulated pulsed laser irradiation conducted within an optical fibre, in order to elicit nervous signals in residual functional sections of the organ of Corti, the auditory prosthesis preferably contains optical fibres which are dimensioned to terminate in end sections within the cochlea at different sites or sections of the organ of Corti, e.g. having different lengths for locating their end sections at different internal parts of the cochlea.
Owner:MEDIZINISCHE HOCHSCHULE HANNOVER +1

Sensing Hash value extracting method and sensing Hash value authenticating method for voice sensing Hash authentication

Disclosed are a sensing Hash value extracting method and a sensing Hash value authenticating method for voice sensing Hash authentication. Based on the characteristics of LPC (linear prediction coefficients), the robustness of the LPC method is improved in three steps: optimizing the LPC, blocking the LPC, and decomposing the parameter matrix obtained after blocking. The optimized LPC are high in robustness and computational efficiency, and the robustness of the resulting sensing Hash sequence is improved as well. The sensing Hash value calculated from the optimized LPC keeps the good real-time performance of a simple LPC method while being robust to attacks on the voice during transmission.
Owner:LANZHOU UNIVERSITY OF TECHNOLOGY

Speaker understandability detection method of artificial cochlea signal under noise environment

The invention relates to a speaker understandability detection method of an artificial cochlea signal under the noise environment, and belongs to the field of voice signal processing. Firstly the artificial cochlea processing algorithm is performed on a pure reference voice signal and waveform reconstruction is performed so that the pure voice after artificial cochlea processing can be obtained. Then the voice model of the specific speaker is established after feature extraction; and as for the identification phase, the voice and the noise are purely identified and then the identified voice with noise is formed, and matching with the speaking understandability model is performed after feature extraction so that the final detection result can be obtained. The advantages are that the important theoretical basis can be provided for enhancing the voice perception capacity of the artificial cochlea user, the influence of the noise in the matching process can be reduced, the detection accuracy can be enhanced, and the noise robustness of the detection method can be further enhanced by using the combined feature parameters based on the dynamic Gammachirp filter bank.
Owner:JILIN UNIV

Binaural speech reverberation eliminating method and device based on speech presence probability and consistency

Active | CN108986832A | Reverb removal | Improve the perceived quality of speech | Speech analysis | Low frequency band | Speech perception
The invention discloses a binaural speech reverberation eliminating method and device based on the speech presence probability and consistency. The method comprises the steps of 1) performing time delay compensation on speech signals received by two microphones to obtain speech signals aligned in time; 2) performing windowing and framing processing, and transforming the speech signals from the time domain to the frequency domain through Fourier transform; 3) estimating a reverberation power spectrum of a low frequency part based on the speech presence probability; 4) calculating the consistency of different signal components of the speech signals; 5) estimating a reverberation power spectrum of a high frequency part based on the consistency; 6) estimating a reverberation power spectrum combining high and low frequencies according to a division threshold of high and low frequency bands; 7) calculating a final reverberation power spectrum by using a recursive smoothing algorithm; 8) obtaining frequency domain signals with the reverberation being eliminated through a gain function; and 9) obtaining time domain signals with the reverberation being eliminated by using short-time inverse Fourier transform. According to the invention, the reverberation on the whole frequency band can be effectively eliminated, and the quality of speech perception is improved.
Owner:PEKING UNIV SHENZHEN GRADUATE SCHOOL
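
Steps 7) and 8) of the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the smoothing constant or the form of the gain function, so the first-order recursion, the spectral-subtraction style gain, the gain floor, and the names `smooth_reverb_psd` and `suppression_gain` are all assumptions made here for illustration.

```python
from typing import List, Sequence


def smooth_reverb_psd(previous: Sequence[float],
                      estimate: Sequence[float],
                      alpha: float = 0.9) -> List[float]:
    """First-order recursive smoothing of a per-bin reverberation power spectrum.

    alpha (assumed value) weights the previous frame; 1 - alpha weights the new estimate.
    """
    return [alpha * p + (1.0 - alpha) * e for p, e in zip(previous, estimate)]


def suppression_gain(signal_psd: Sequence[float],
                     reverb_psd: Sequence[float],
                     floor: float = 0.1) -> List[float]:
    """Spectral-subtraction style gain per bin, limited from below by `floor`.

    This gain rule is an assumption; the patent abstract only says "a gain function".
    """
    gains = []
    for s, r in zip(signal_psd, reverb_psd):
        g = 1.0 - r / s if s > 0.0 else floor
        gains.append(max(g, floor))
    return gains


# One frame with three frequency bins (made-up numbers).
reverb = smooth_reverb_psd([1.0, 2.0, 1.0], [3.0, 2.0, 1.0], alpha=0.5)
gains = suppression_gain([4.0, 4.0, 0.5], reverb)
```

Multiplying each frequency bin of the frame by its gain would give the dereverberated spectrum that step 9) transforms back to the time domain.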

Voice aware audio system and method

A voice aware audio system and a method for a user wearing a headset to be aware of an outer sound environment while listening to music or any other audio source. An adjustable sound awareness zone gives the user the flexibility to avoid hearing far distant voices. The outer sound can be analyzed in a frequency domain to select an oscillating frequency candidate and in a time domain to determine if the oscillating frequency candidate is the signal of interest. If the signal directed to the outer sound is determined to be a signal of interest, the outer sound is mixed with audio from the audio source.
Owner:HED TECH SARL

Phonetic empathy Hash content authentication method capable of implementing tamper localization

The invention relates to a phonetic empathy Hash content authentication method capable of implementing tamper localization. The method comprises the following steps: pre-processing a voice signal, carrying out 10-order linear prediction analysis on each frame of the voice signal, and acquiring an LSP (line spectral pair) coefficient through the discrete Fourier transformation as the empathy characteristics; grouping the voice data in sequence, combining LSP coefficient weighted expectations of each group of the voice data as the final authentication data, and compressing the authentication data volume through a Hash structure; and finally, quickly authenticating the voice contents through the Hash match. The method can keep robustness for the operations such as changing the sound volume, resounding and resampling, is sensitive to malicious operations such as replacement and deletion, can accurately locate the tamper area, has the characteristics of low authentication data volume and high operation efficiency, and is suitable for resource-limited voice communication terminals.
Owner:LANZHOU UNIVERSITY OF TECHNOLOGY
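
The idea of authenticating by hash matching while locating the tampered area can be sketched in Python as a group-wise comparison of two binary hash sequences. The abstract does not specify the matching rule, so the per-group bit error rate, the threshold `max_ber`, and the function name `locate_tampering` are assumptions for illustration only.

```python
from typing import List, Sequence


def locate_tampering(reference_bits: Sequence[int],
                     received_bits: Sequence[int],
                     bits_per_group: int,
                     max_ber: float = 0.25) -> List[int]:
    """Return the indices of hash groups whose bit error rate exceeds max_ber.

    Each group of hash bits is assumed to describe one group of voice data,
    so a flagged index points at the region that was likely modified.
    """
    if len(reference_bits) != len(received_bits):
        raise ValueError("hash sequences must have the same length")
    tampered = []
    for start in range(0, len(reference_bits), bits_per_group):
        ref = reference_bits[start:start + bits_per_group]
        rec = received_bits[start:start + bits_per_group]
        errors = sum(a != b for a, b in zip(ref, rec))
        if errors / len(ref) > max_ber:
            tampered.append(start // bits_per_group)
    return tampered


reference = [0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1]
received = [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0]
print(locate_tampering(reference, received, bits_per_group=4))  # [1]
```

A small number of flipped bits (as in the last group) stays below the threshold, which mirrors the robustness to content-preserving operations described in the abstract.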

Cipher text speech perception hashing and retrieving scheme based on time-frequency domain trend change

The invention discloses a cipher text speech perception hashing and retrieving scheme based on time-frequency domain trend change. A piece of speech is divided into a time domain part and a frequency domain part for extracting perception hash, the speech is encrypted by a high-efficiency chaotic XOR encryption algorithm adapting to large-scale data, and a perception hash sequence is embedded into the least significant bit of the cipher text speech by the digital watermarking technology to generate a cipher text speech library and a system perception hash table. The cipher text speech library and the perception hash table are uploaded to the cloud. During retrieval, a perception hash sequence is extracted from an index speech provided by a user, the abstract sequence is submitted to a cloud server as an index, and matching retrieval is carried out in the system hash table of the cloud. When the perception hash sequence is matched with a perception hash value in the system hash table, a cipher text speech corresponding to the hash abstract in the hash table is returned to the user, and retrieval succeeds. Rapid and accurate retrieval of an encrypted speech in the cloud is realized. According to the method, weight distinguishing is carried out, matching is carried out successively, and therefore, the matching efficiency in large-scale application is improved.
Owner:SOUTHWEST JIAOTONG UNIV
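
A chaotic XOR cipher of the kind mentioned above can be sketched in Python as follows. The abstract does not name the chaotic system, so the logistic map, the parameter `r = 3.99`, the byte quantization, and the function names are assumptions; the sketch is illustrative and not a vetted encryption scheme.

```python
def logistic_keystream(x0: float, length: int, r: float = 3.99) -> bytes:
    """Generate `length` key bytes by iterating the logistic map x -> r*x*(1-x).

    x0 (assumed to lie strictly between 0 and 1) acts as the secret key.
    """
    x = x0
    out = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)  # quantize the state to one byte
    return bytes(out)


def chaotic_xor(data: bytes, x0: float) -> bytes:
    """XOR the data with the chaotic keystream; applying it twice restores the input."""
    keystream = logistic_keystream(x0, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


speech_bytes = b"speech frame"
cipher = chaotic_xor(speech_bytes, x0=0.37)
restored = chaotic_xor(cipher, x0=0.37)
```

Because XOR is its own inverse, the same function serves for encryption and decryption as long as both sides use the same initial value.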

A remote control device and method with speech control

A remote control device and method with speech control for controlling an electric appliance which stores identifiable speech vocabulary data. The remote control device includes a sensor and a speech identification module. The sensor receives speech to produce a speech sense signal. The speech identification module establishes contact with the electric appliance through a communication protocol and obtains the identifiable speech vocabulary data, with which a speech vocabulary of the speech sense signal is compared to produce a remote control signal. The remote control signal controls the electric appliance through the communication protocol.
Owner:DELTA ELECTRONICS INC

Method, base station and system for improvement of voice perceptibility in handover process in global system for mobile communications (GSM)

The invention discloses a method for improvement of voice perceptibility in a handover process in the global system for mobile communications (GSM). The method comprises the steps that a base station feeds back PhyInfo to a mobile communication terminal after receiving HandoverAccess sent from the mobile communication terminal and carries out detection on downlink voice; the PhyInfo is used for replacing replaceable voice frames according to a replaceable voice frame detection result of the downlink voice; and transmission of the PhyInfo is stopped when an SABM frame sent by the mobile communication terminal is received or the number of transmissions reaches the maximum value Ny1. The invention further discloses a base station and a system for the improvement of the voice perceptibility in the handover process in the GSM. According to the method, the base station and the system, the voice perceptibility in the handover process in the GSM can be improved.
Owner:ZTE CORP

VoLTE service switching method and system

Active | CN109413701A | Avoiding Perceived Deterioration Problems | Speech analysis | Wireless communication | Current cell | User perception
The invention provides a VoLTE service switching method, which comprises the following steps: S1, calculating the voice quality of a user in a target cell based on a preset voice quality detection algorithm, wherein the target cell comprises a current cell and a plurality of adjacent cells; S2, if the voice quality of the user in the current cell is lower than a first preset threshold, selecting the adjacent cell with the highest voice quality for handover. The VoLTE service switching method and system provided by the invention can detect the voice quality of the user at a network side, thereby controlling the user to switch to a cell with better voice perception and avoiding the problem of deterioration of the user perception to the maximum extent.
Owner:CHINA MOBILE COMM GRP CO LTD +1
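
The decision in step S2 can be written as a short Python sketch. It follows the abstract literally: hand over only when the current cell is below the threshold, and pick the neighbour with the highest score. The per-cell quality scores, the cell names, and the function name `choose_handover_target` are hypothetical, and the voice quality detection algorithm of step S1 is not modelled.

```python
from typing import Dict, Optional


def choose_handover_target(quality_by_cell: Dict[str, float],
                           current_cell: str,
                           threshold: float) -> Optional[str]:
    """Return the neighbour cell to hand over to, or None to stay on the current cell."""
    # S2 precondition: act only when the current cell falls below the threshold.
    if quality_by_cell[current_cell] >= threshold:
        return None
    neighbours = {cell: q for cell, q in quality_by_cell.items() if cell != current_cell}
    if not neighbours:
        return None
    # Select the adjacent cell with the highest estimated voice quality.
    return max(neighbours, key=neighbours.get)


scores = {"cell_A": 2.4, "cell_B": 3.8, "cell_C": 3.1}  # made-up quality scores
print(choose_handover_target(scores, "cell_A", threshold=3.0))  # cell_B
```

A practical implementation would probably also require the chosen neighbour to score better than the current cell; the abstract does not state such a check, so it is left out here.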

Symmetrical ternary string represented voice perception Hash sequence constructing and authenticating method

Inactive | CN104134443A | Overcome weakness | Perceptual hash digest is strong | Speech analysis | Algorithm | Voice communication
The invention discloses a symmetrical ternary string represented voice perception Hash sequence constructing and authenticating method. The method comprises the steps that firstly, overall discrete wavelet transforming (DWT) is carried out on voice signals produced after preprocessing and intensity-loudness transformation (ILT); secondly, non-overlapping partitioning is carried out on the low-frequency part of the voice signals produced after DWT, and the short-time logarithm energy of the blocks is calculated to obtain the signal frequency-domain features; lastly, a final ternary perception Hash sequence is generated based on the time domain spectrum flux features (SFF) of the voice signals, and the voice content is quickly authenticated through Hash matching. The symmetrical ternary string representation of the perception Hash digest is superior to the binary form, robustness and discrimination are well balanced under common voice content operations, the time complexity of the algorithm is low, efficiency and abstraction are high, precise tamper detection and localization can be achieved, and the method can be used for real-time authentication on mobile voice communication terminals with limited bandwidth resources.
Owner:LANZHOU UNIVERSITY OF TECHNOLOGY

Method and system for consonant-vowel ratio modification for improving speech perception

Inactive | US20160365099A1 | Speech perception can be enhanced | Improve perception | Speech analysis | Frequency spectrum | Consonant vowel
Increasing the level of the consonant segments relative to the nearby vowel segments, known as consonant-vowel ratio (CVR) modification, is reported to be effective in improving speech intelligibility by listeners in noisy backgrounds and by hearing-impaired listeners. A method along with a system for real-time CVR modification using the rate of change of spectral centroid for detection of spectral transitions is disclosed. A preferred embodiment of the invention using a 16-bit fixed point processor with on-chip FFT hardware is also presented for real-time signal processing. It can be integrated with other FFT-based signal processing in communication devices, hearing aids, and other systems for improving speech perception under adverse listening conditions.
Owner:INDIAN INSTITUTE OF TECHNOLOGY BOMBAY
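
The quantity used for transition detection, the rate of change of the spectral centroid, can be sketched in Python as below. The sketch assumes that per-frame magnitude spectra are already available (for example from an FFT) and that they cover 0 Hz to the Nyquist frequency inclusive; the function names and the frame shift are hypothetical, and the patent's fixed-point implementation is not reproduced.

```python
from typing import List, Sequence


def spectral_centroid(magnitudes: Sequence[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency (Hz) of one frame's half spectrum."""
    n_bins = len(magnitudes)
    total = sum(magnitudes)
    if n_bins < 2 or total == 0.0:
        return 0.0
    # Bin k of an n_bins-point half spectrum maps to k * (fs / 2) / (n_bins - 1) Hz.
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    return sum(k * bin_hz * m for k, m in enumerate(magnitudes)) / total


def centroid_rate(centroids: Sequence[float], frame_shift_s: float) -> List[float]:
    """Frame-to-frame rate of change of the centroid in Hz per second."""
    return [(b - a) / frame_shift_s for a, b in zip(centroids, centroids[1:])]


frames = [[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]]  # made-up spectra
centroids = [spectral_centroid(f, sample_rate=8000.0) for f in frames]
rates = centroid_rate(centroids, frame_shift_s=0.01)
```

A large magnitude of the rate suggests a spectral transition, which is where a CVR scheme would raise the gain of the consonant segment.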

Psychological acoustic model-based voice post-perception filter

The invention relates to a psychological acoustic model-based voice post-perception filter. The perception filter does not need to be fused into every algorithm, so the complexity of the algorithms is not affected, yet the same auditory perception enhancement effect can be obtained. Because the filter focuses on the re-processing stage of voice enhancement, the auditory perception of the enhanced voice is further improved; even when noise exists and the signal-to-noise ratio is not improved, auditory perception can still be improved by using the post-perception filter. The filter is established under the conditions that the voice signal distortion is minimal and that the residual noise is not heard by human ears. The gain of the filter is obtained by constructing a cost function containing a masking threshold under these conditions, and further optimization is carried out with a perception normalization factor constructed from the masking threshold. Therefore, excessive signal weakening can be avoided and minimum voice perception distortion after enhancement can be ensured.
Owner:TAIYUAN UNIV OF TECH

Digital speech perception hash method based on formant frequency

The invention discloses a digital speech perception hash method based on formant frequency. The method is used for speech retrieval in a big data background. The formant frequency, which reflects the timbre characteristics of speakers, and the time domain energy difference, which is strongly robust, are extracted as the rough characteristics and the detail characteristics of the speech segments, respectively. During matching, the rough characteristics are matched first to screen out the speech segments whose timbre is similar to that of the target speech; the detail characteristics of these segments are then matched, and finally the accurate matching result is acquired. When the method is used for mass speech signal processing, a large amount of unnecessary calculation can be saved and the matching efficiency can be improved markedly.
Owner:SOUTHWEST JIAOTONG UNIV

Perceptual Hash feature extraction method and system of encrypted voice signal

The invention discloses a perceptual Hash feature extraction method and system of an encrypted voice signal. The method includes the steps of: performing signal framing on the encrypted voice signal, calculating a short-time cross correlation coefficient between each encrypted voice frame and an adjacent encrypted voice frame, and obtaining a cross correlation coefficient matrix; determining the previous short-time cross correlation coefficients with large values in each row of the cross correlation coefficient matrix as elements of a feature coefficient matrix, and obtaining the feature coefficient matrix; decomposing the feature coefficient matrix by employing a non-negative matrix decomposition method, and obtaining a feature parameter matrix; and performing binary Hash construction on the feature parameter matrix by employing a Hash function, and obtaining a perceptual Hash value of the encrypted voice signal. By employing the method or system, the short-time cross correlation coefficients extracted from the encrypted voice signal can be regarded as perceptual features of the encrypted voice signal, and the perceptual Hash value of the encrypted voice signal is generated through Hash construction, so that the robustness, the distinction and the abstractness of direct extraction of the voice perceptual features from the encrypted voice signal are improved.
Owner:LANZHOU UNIVERSITY OF TECHNOLOGY
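
The first step of the abstract, framing followed by correlation between adjacent frames, can be sketched in Python. This is a simplification: it computes a single zero-lag normalized correlation coefficient per frame pair, whereas the abstract describes a full cross correlation coefficient matrix. Non-overlapping frames and all function names are assumptions.

```python
import math
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Split the signal into non-overlapping frames, dropping any trailing remainder."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def correlation_coefficient(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized (Pearson) correlation of two equal-length frames."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0


def adjacent_frame_correlations(samples: Sequence[float], frame_len: int) -> List[float]:
    """Correlation between each frame and the frame that follows it."""
    frames = frame_signal(samples, frame_len)
    return [correlation_coefficient(a, b) for a, b in zip(frames, frames[1:])]


signal = [1, 2, 3, 4, 2, 4, 6, 8, 4, 3, 2, 1]  # made-up samples
coeffs = adjacent_frame_correlations(signal, frame_len=4)
```

In the method described above, coefficients like these would be collected into a matrix, reduced by non-negative matrix decomposition, and then binarized into the hash.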

System and method for evaluating speech perception in complex listening environments

Active | US20190261095A1 | Enhanced signal | Performance benefits | Health-index calculation | Speech analysis | Auditory testing | Speech perception
The present application describes a plurality of simulated test environments that mimic the complex listening environments of everyday life, each comprising a speech component and a noise component. The application also describes an auditory testing system and method for evaluating a listener's speech perception, and a method to test the effect of a hearing prosthesis or hearing protection device on a person's speech perception in a complex listening environment.
Owner:THE US REPRESENTED BY THE SEC OF THE DEFENSE

Voice call method and device

The embodiment of the invention provides a voice call method and device, and relates to the technical field of terminals, and the method comprises the steps that first terminal equipment sends a first request to second terminal equipment; the second terminal equipment sends a first message to the first terminal equipment; the first terminal equipment obtains azimuth information between the first terminal equipment and the second terminal equipment according to the position information and the orientation information of the first terminal equipment and the position information of the second terminal equipment; when the azimuth information indicates that the second terminal equipment is located on the first side of the first terminal equipment, the first terminal equipment processes a voice signal from the second terminal equipment based on the azimuth information to obtain a first channel voice signal and a second channel voice signal; and a first earphone can play the first channel voice signal in a first channel and play the second channel voice signal in a second channel. Therefore, in the voice listening process, the user can sense the orientation of the opposite end through the voice fed back by the earphone.
Owner:BEIJING HONOR DEVICE CO LTD

Non-contact type voice perception synthesis device

The invention discloses a non-contact type voice perception synthesis device. The non-contact type voice perception synthesis device comprises a vocal cord vibration sensing module and a voice perception synthesis module; the vocal cord vibration sensing module is connected with the voice perception synthesis module and extracts the vocal cord vibration frequency of a tested person; and the voice perception synthesis module conducts voice synthesis on the vocal cord vibration frequency of the tested person with the harmonic synthesis method. The non-contact type voice perception synthesis device is small and exquisite in size, convenient to install, high in measurement accuracy and tiny in transmitting power; any harm or side effect on the tested person can be avoided; medical workers can conveniently take corresponding medical treatment measures in time according to the actual conditions of vocal cord vibration; and a patient with voice disorders can be assisted.
Owner:NANJING UNIV OF SCI & TECH
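
Harmonic synthesis from a measured vocal cord vibration frequency can be sketched in Python as a sum of sinusoids at integer multiples of the fundamental. The abstract does not give the number of harmonics or their weights, so the amplitudes, the sample rate, and the function name `synthesize_harmonics` are assumptions for illustration.

```python
import math
from typing import List, Sequence


def synthesize_harmonics(f0_hz: float,
                         duration_s: float,
                         sample_rate: int = 8000,
                         amplitudes: Sequence[float] = (1.0, 0.5, 0.25)) -> List[float]:
    """Additive synthesis: amplitudes[k] scales the harmonic at (k + 1) * f0_hz."""
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        samples.append(sum(a * math.sin(2.0 * math.pi * (k + 1) * f0_hz * t)
                           for k, a in enumerate(amplitudes)))
    return samples


# 10 ms of a tone with a 100 Hz fundamental (made-up measurement).
wave = synthesize_harmonics(100.0, 0.01)
```

In a device like the one described, `f0_hz` would be updated continuously from the vocal cord vibration sensing module rather than held constant.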

Method for calculating telescopic resistance interval of music clip

The invention relates to a method for calculating the telescopic resistance interval of a music clip, belonging to the technical field of audio processing. The method comprises the following steps of: firstly, establishing a music telescopic resistance dataset to obtain a telescopic resistance distribution histogram; performing equal-area division to form a telescopic resistance type; extracting multiple audio content characteristics to form a characteristic vector of the music clip; performing generalization processing and solving to obtain a diagonal matrix; distinguishing the differing degree of the music clip by use of the music style; and with K neighbor judgment, calculating the telescopic resistance interval of the clip to be processed. The method provided by the invention puts forward a quantitative expression method for the music telescopic resistance for the first time, and calculates the music telescopic resistance interval by mainly focusing on the characteristics of the audio content and taking the music style as subsidiary as well as combining the machine learning strategy; and the method has relatively high accuracy, is easy to operate, and can be directly applied to the parameter estimation in a music reconstruction algorithm and the study on the characteristics of human perception of the music clip in music psychology and speech perception.
Owner:TSINGHUA UNIV

Cell switching method and apparatus, and computing device

The embodiment of the invention relates to the technical field of communication, and discloses a cell switching method and device and computing equipment. The method comprises the following steps: when a user terminal performs a VoLTE voice call, acquiring an MOS value in real time; if the MOS value is smaller than a preset threshold value, controlling a mobility management entity (MME) to send a pilot frequency measurement control instruction to a base station connected with the user terminal, so that the user terminal reports a pilot frequency measurement report; receiving the pilot frequency test report, and judging whether a cell level value in the pilot frequency test report within a preset time is greater than or equal to a preset threshold value; and if yes, performing pilot frequency cell switching on the user terminal. Through the above mode, the embodiment of the invention can evaluate the actual voice perception of the user, and switches the user at the voice perception edge to the pilot frequency cell, thereby improving the voice quality.
Owner:CHINA MOBILE GROUP ZHEJIANG +1

Methods and systems for optimizing speech and music perception by a bilateral cochlear implant patient

An exemplary method of optimizing speech and music perception by a bilateral cochlear implant patient includes identifying a first ear of a bilateral cochlear implant patient as being relatively more suited for processing speech than a second ear of the patient, directing a first cochlear implant subsystem associated with the first ear to operate in accordance with a first sound processing program configured to enhance speech perception by the patient, and directing a second cochlear implant subsystem associated with the second ear to operate in accordance with a second sound processing program configured to enhance music perception by the patient. Corresponding methods and systems are also disclosed.
Owner:ADVANCED BIONICS AG

Speech enhancement post-processing method and device based on harmonic structure prediction

Pending | CN114360560A | Harmonic Component Restoration | Improve speech perception quality | Speech analysis | Information processing | Time domain
The invention discloses a voice enhancement post-processing method and device based on harmonic structure prediction, and belongs to the field of information processing. The method comprises the following steps: S1, carrying out the short-time Fourier transform of a voice signal of a microphone, and obtaining a time-frequency domain expression; S2, carrying out harmonic loss estimation and correction on the time-frequency domain signal to obtain an estimated power spectrum density; S3, estimating a time-frequency masking value according to the power spectrum density; and S4, according to the estimated time-frequency masking value, obtaining the frequency domain estimation of the target voice, and further obtaining the time domain estimation of the target voice. According to the method, the lost harmonic structure can be predicted to a certain extent, the recovered voice better conforms to the characteristics of the near-speaking voice, and the intelligibility and the voice perception quality are higher.
Owner:SUIRUI TECH CO LTD