26 results about "Voice analysis" patented technology

Voice analysis is the study of speech sounds for purposes other than extracting linguistic content, which is the concern of speech recognition. Such studies are mostly medical analyses of the voice (phoniatrics), but they also include speaker identification. More controversially, some believe that the truthfulness or emotional state of speakers can be determined using voice stress analysis or layered voice analysis.

system

Pending | JP2026105311A | Dialog system | Voice analysis

A dialogue system is provided. [Solution] The dialogue system includes: a receiving means that accepts a language selection; a generation means that generates learning content based on a generative AI model; an audio conversion means that presents the generated learning content as audio output; a voice analysis means that converts the user's voice information into text data; and a management means that continues the dialogue based on the user's responses and provides feedback.

Owner:SOFTBANK GROUP CORP

A breath-speech pause signal analysis system and method

Pending | CN122250949A | Speech analysis | Respiratory organ evaluation | Voice analysis | Cross lingual

The present application relates to the interdisciplinary field of speech signal processing, respiratory physiological monitoring and artificial intelligence, and specifically discloses a breath-speech pause signal analysis system and method. Speech and respiratory signals are collected synchronously, language-adapted pause parameters and physiological indexes are extracted, and the two are aligned in time and correlated. A prediction model and a true-false recognition model are then fused for intelligent analysis. This addresses two problems: traditional heart-lung function detection relies on professional equipment and cannot be performed remotely or without contact, and existing speech analysis lacks a physiological coupling mechanism, so it cannot identify synthetic speech and adapts poorly across languages. The result is an improvement in both non-contact evaluation of respiratory physiological state and speech fraud recognition.

Owner:ZHONGDE NUOHAO (BEIJING) EDUCATION TECH CO LTD

A personalized sound medicine formula generation and evaluation method based on voice emotion recognition

Pending | CN122290642A | Personalization | Sound therapy

This invention relates to a method for generating and evaluating personalized sound therapy formulas based on voice emotion recognition, belonging to the field of voice analysis, and more specifically to the field of digital music development and production. The method includes: using an AI sound therapy effect judgment model to intelligently determine the increase in the target user's happiness level after using each five-tone sound therapy formula based on the target user's current five-element attribute data and current five-organ attribute data, and the digital content of each five-tone sound therapy formula; selecting the five-tone sound therapy formula with the largest increase value as the target user's personalized sound therapy formula. This invention addresses the technical problems of difficulty in providing the most suitable personalized music therapy plan for different users and the insufficient precision of music therapy plans. It employs an AI-based traversal method to intelligently judge each sound therapy formula for the target user, and uses five-tone five-element sound therapy formulas to improve the precision of the sound therapy formulas, thereby solving the aforementioned technical problems.

Owner:SHANDONG SHANGYI HEALTHCARE TRADITIONAL CHINESE MEDICINE TECHNOLOGY DEVELOPMENT CO LTD +1

Prediction of neurodegenerative diseases based on speech analysis

Active | DE602023018560T2 | Neuro-degenerative disease | Voice analysis

Owner:GENENTECH INC

Portable ai psychiatric assistant recording device

Pending | CN122157925A | Semantic analysis | Biological models | Medical record | Engineering

The present application relates to the technical field of medical informatics, and in particular to a portable AI psychiatric auxiliary recording device. A multi-modal sensing module collects audio signals of doctor-patient conversations, separates the voices of doctor and patient, and extracts labeled acoustic feature vectors. The term enhancement module converts the acoustic features into text and outputs standard term annotations in combination with a psychiatric knowledge graph. The context reconstruction module analyzes long-range conversation dependencies to generate a structured semantic abstract. The template generation module fills in the medical record template based on the abstract and outputs a standardized draft. The meta-learning adaptation module monitors the output of each module and dynamically updates the parameters through distributed learning. The device uses beamforming, graph attention networks, the Transformer-XL architecture, federated learning and related techniques to address the problem of unstructured voice analysis in psychiatric conversations, achieving accurate term mapping and automatic generation of medical records, improving accuracy and efficiency, and reducing the burden of manual review.

Owner:BEIJING HAOXINQING MOBILE MEDICAL TECH CO LTD

A dialect generation method and system based on voiceprint features

Pending | CN122116872A | Speech synthesis | Pattern recognition | Voice analysis

The present application relates to the field of artificial intelligence, and discloses a dialect generation method and system based on voiceprint features, comprising: extracting individual voiceprint identification and recording content information of a target area; using a pre-trained deep learning model to extract a voiceprint acoustic feature vector of the recording content information to construct a dialect feature vector corresponding to individuals in the target area; receiving target dialect data to be processed, analyzing voiceprint reference features corresponding to the target dialect data, and using a dialect classifier to analyze the dialect type corresponding to the target dialect data; performing voiceprint feature fusion processing on the target dialect data to obtain an adapted initial speech, and analyzing regional prosody features of the target area; and performing prosody optimization processing on the adapted initial speech to generate a dialect generation result corresponding to the target area. The present application can improve the accuracy of dialect generation.

Owner:SIMAI INTELLIGENT TECHNOLOGY (SHENZHEN) CO LTD

Hearing device comprising an own voice estimator

Pending | US20260156420A1 | In the ear hearing aids | Bone conduction transducer hearing devices | Transducer | Voice analysis

Disclosed herein are embodiments of a hearing device including at least one first, outward-facing input transducer configured to pick up first sounds from the environment of a user and a second, inward-facing input transducer configured to pick up second sounds at the eardrum of the user. The hearing device can further include a directional system including a) an own voice beamformer configured to provide an estimate of the user's own voice in dependence on the at least one first and the second electric input signals and on configurable own voice beamformer weights; and b) an own voice analyzer configured to analyze at least one of the at least one first and the second electric input signals, and to provide an own voice beamformer weight control signal.

Owner:OTICON

Electronic photo frame system based on electronic paper display screen and electronic paper photo frame control method

Pending | CN122369440A | Engineering | Voice analysis

Disclosed are an electronic photo frame system based on an electronic paper display and a control method for the electronic paper photo frame. Module A and the electronic paper display are located within the photo frame assembly. Module A includes a network module A, a voice acquisition module, a voice detection module, and a control module A. Module B includes a network module B, a voice analysis module, a storage module, and a control module B. The voice acquisition module acquires sound signals. When the voice detection module detects that the sound signals include voice signals, it sends the sound signals to module B through network module A. Network module B receives the sound signals, and the voice analysis module analyzes them to obtain commands. According to the commands, control module B selects image C1 from the storage module and sends it to module A through network module B. Module A displays image C1 on the electronic paper display. The data format of image C1 is adapted to the data format of the electronic paper display.

Owner:JIANGXI XINGTAI TECH INC

system

Pending | JP2026104499A | Input/output for user-computer interaction | Graph reading | Automatic control | Imaging analysis

A system is provided. [Solution] The system includes: an image analysis means that analyzes environmental information in real time via a user interface; a voice instruction generation means that generates voice instructions based on the analyzed visual information; a voice analysis means that receives voice instructions from the user, analyzes them, and converts them into action commands; and an automatic control means that automatically controls actions based on the voice commands.

Owner:SOFTBANK GROUP CORP

A psychological counseling level bidirectional psychological buffer method based on multi-scene AI voice analysis and intelligent earphone

Pending | CN122245351A | Mental therapies | Psychotechnic devices | Data set | Mood

This invention relates to the field of artificial intelligence psychological intervention technology, and in particular to a psychological counseling-level two-way psychological buffering method and smart earphone based on multi-scenario AI voice analysis. The method includes the following steps: collecting the speaker's voice and converting it into text and emotional information; training a neural network model using a dataset composed of emotional information to obtain an AI emotion model; analyzing and processing the two-way psychological buffering based on the wearer's personal personality traits, the level of harm in the dialogue, and text information, and outputting personalized psychological buffering scripts; recognizing text and emotional information from the speaker's voice, and extracting personal personality traits by combining the wearer's historical voice data; accurately calculating the dialogue harm score using a dialogue harm scoring formula, and determining whether to activate the two-way psychological buffering mechanism based on the harm level, thereby achieving timely emotional intervention and protection, fundamentally eliminating self-denial, strengthening psychological boundaries, and preventing the intergenerational transmission and cross-scenario spread of emotions.

Owner:HUBEI YIXIN TIANAN MEDICAL BIOTECHNOLOGY CO LTD

system

Pending | JP2026105485A | Voice analysis | Human–computer interaction

A system is provided. [Solution] The system includes: an acquisition means that collects audio; a processing means that pre-processes the acquired audio and extracts features; an analysis means that classifies the speech based on the extracted features and determines appropriate suggestions; a display means that presents the determined suggestion to the user; and a communication means that conveys the voice analysis results physically or visually using human assistance devices.

Owner:SOFTBANK GROUP CORP

Emotional speech synthesis method and system based on dialogue context and personal experience

Pending | CN122369424A | Semantic representation | Voice analysis

The application discloses an emotional speech synthesis method and system based on dialogue context and personal experience. The method includes obtaining a text to be synthesized and a corresponding multi-dimensional emotional enhancement condition set. The condition set includes a personal experience description and a dialogue context description, and can also include at least one of a paralanguage description and an open-vocabulary emotional label. The text to be synthesized and the condition set are input into a semantic generation module to obtain a semantic representation sequence, and the target emotional speech is generated by acoustic reconstruction and voice code processing. The condition set can be obtained by automatic annotation, a role knowledge base, historical scripts, long dialogue summaries, reference speech analysis, manual configuration and the like. The system includes condition acquisition and construction, semantic generation, acoustic reconstruction and voice code modules. The application enables emotional speech synthesis driven jointly by dialogue context and personal experience, improves the emotional naturalness, emotional accuracy and character consistency of synthesized speech, and is suitable for digital humans, voice content production, intelligent interaction and similar uses.

Owner:SHANGHAI JIAOTONG UNIV

Voice analysis method, electronic device, readable storage medium and chip system

Active | CN116206602B | Alarms | Speech recognition | Terminal equipment | Voice analysis

The application is suitable for the terminal technical field, and provides a voice analysis method, an electronic device, a readable storage medium and a chip system. The method comprises the following steps: acquiring a voice instruction and information sent by a running application program, the voice instruction being used for instructing a terminal device to perform an operation, and the information sent by the application program comprising reminding information used for reminding a user; and determining a user intention corresponding to the voice instruction according to the voice instruction and the reminding information. Through acquiring the information sent by the application program, the information sent by the application program is used as a factor for determining the user intention, the accuracy of determining the user intention corresponding to the voice instruction can be improved, and thus the efficiency of voice interaction between the terminal device and the user can be improved.

Owner:HUAWEI DEVICE CO LTD

Systems and computer-implemented methods for voice analysis and authentication of a user based on confidence metrics

Pending | US20260188326A1 | Voice analysis | Data mining

Systems and computer-implemented methods for voice analysis and authentication of a user based on confidence metrics are disclosed. According to an aspect, a system includes a voice identification module configured to receive voice data associated with a user. The voice identification module is also configured to receive user input that indicates an identifier of the user, and to analyze the voice data of the user to generate at least one confidence metric indicative of a consistency of the voice data of the user with stored voice data of an identified user. The voice identification module determines whether the at least one confidence metric meets one or more criteria for authenticating a user's voice, and implements an action associated with authenticating the user associated with the received voice data and the identifier of the user in response to a determination that the at least one confidence metric meets the one or more criteria.

Owner:CLONEOPS AI LLC
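
The abstract for this entry does not say how the confidence metric is computed. As a minimal sketch only, the check it describes might look like the following, assuming the voice data has already been reduced to fixed-length feature vectors and that cosine similarity against the enrolled vector serves as the confidence metric. The names `cosine_similarity`, `authenticate` and `enrolled`, the sample vectors, and the 0.8 threshold are all hypothetical and do not come from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no usable information
    return dot / (norm_a * norm_b)


def authenticate(claimed_id, voice_features, enrolled, threshold=0.8):
    """Return (accepted, confidence) for a claimed identity.

    claimed_id     -- identifier supplied by the user
    voice_features -- feature vector derived from the received voice data
    enrolled       -- dict mapping identifiers to stored feature vectors
    threshold      -- assumed criterion: accept when confidence >= threshold
    """
    reference = enrolled.get(claimed_id)
    if reference is None:
        return False, 0.0  # unknown identifier: nothing to compare against
    confidence = cosine_similarity(voice_features, reference)
    return confidence >= threshold, confidence


# Hypothetical enrolled data and a new sample from the same speaker.
enrolled = {"alice": [0.9, 0.1, 0.4]}
accepted, confidence = authenticate("alice", [0.88, 0.12, 0.41], enrolled)
```

A real system would likely combine several metrics and criteria, as the abstract's "at least one confidence metric" and "one or more criteria" suggest; a single threshold is used here only to keep the sketch short.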

Method and apparatus for extracting feature representation, device, medium, and program product

Active | US12651606B2 | Speech analysis | Algorithm | Voice analysis

A method and an apparatus for extracting a feature representation, a device, a medium, and a program product are provided and relate to the field of voice analysis technologies. The method includes: obtaining sample audio; extracting a sample time-frequency feature representation corresponding to the sample audio; performing frequency band segmentation on the sample time-frequency feature representation from a frequency domain dimension, to obtain time-frequency sub-feature representations respectively corresponding to at least two frequency bands; and performing inter-frequency band relationship analysis on the time-frequency sub-feature representations respectively corresponding to the at least two frequency bands from the frequency domain dimension, and obtaining an application time-frequency feature representation based on an inter-frequency band relationship analysis result.

Owner:TENCENT TECHNOLOGY (SHENZHEN) CO LTD
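
The two steps named in this entry's abstract, frequency band segmentation and inter-frequency band relationship analysis, can be illustrated with a small sketch. It assumes the time-frequency feature is a plain list of frames, each holding one magnitude per frequency bin, and it substitutes a simple Pearson correlation of per-band energy for the relationship analysis, which the abstract does not specify. The function names and the toy spectrogram are hypothetical.

```python
import math


def split_bands(spectrogram, num_bands):
    """Split every frame's frequency bins into contiguous bands.

    spectrogram -- list of frames; each frame is a list of bin magnitudes
    Returns a list of bands; each band is a list of per-frame sub-lists.
    """
    num_bins = len(spectrogram[0])
    if not 1 <= num_bands <= num_bins:
        raise ValueError("num_bands must be between 1 and the number of bins")
    edges = [(i * num_bins) // num_bands for i in range(num_bands + 1)]
    return [
        [frame[edges[i]:edges[i + 1]] for frame in spectrogram]
        for i in range(num_bands)
    ]


def band_energy(band):
    """Mean squared magnitude of the band in each frame."""
    return [sum(v * v for v in frame) / len(frame) for frame in band]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def inter_band_relations(spectrogram, num_bands):
    """Matrix of correlations between band energies over time (assumed metric)."""
    energies = [band_energy(b) for b in split_bands(spectrogram, num_bands)]
    return [[pearson(ei, ej) for ej in energies] for ei in energies]


# Toy input: 3 frames x 4 frequency bins.
spec = [[1, 1, 0, 0], [2, 2, 0, 1], [3, 3, 0, 2]]
bands = split_bands(spec, 2)
relations = inter_band_relations(spec, 2)
```

In the patent the relationship analysis is presumably learned rather than a fixed statistic; the correlation matrix here only shows where such an analysis would sit after the band split.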

Multi-scene adaptive AI intelligent control teaching audio comprehensive interaction system

Pending | CN122116704A | Speech recognition | Frequency/directions obtaining arrangements | Large screen | Human–computer interaction

The application discloses a multi-scene adaptive AI intelligent control teaching audio comprehensive interaction system, which comprises a switchable AI intelligent control audio terminal, a modular wearable AI instruction microphone and a course sound treasure AI intelligent control audio recording and broadcasting software. The switchable AI intelligent control audio terminal is selected and deployed according to the size of the classroom; it provides high-fidelity sound reinforcement, integrates multiple device interfaces, performs intelligent control of teaching software and hardware, and covers the sound field uniformly. The modular wearable AI instruction microphone serves as the teacher's interaction entry point; through adaptation to multiple wearing modes and rapid pairing technology, it collects voice instructions, transmits them wirelessly, and lets the teacher move freely while teaching. The course sound treasure AI intelligent control audio recording and broadcasting software is installed on a large teaching screen; it integrates AI voice analysis, multi-device linkage logic and resource management functions, and performs network-independent voice control, sound effect parameter modulation locking and automatic reservation of teaching resources. The application meets the requirements for teaching audio and for intelligent control of teaching software and hardware in spaces of different sizes, and improves the convenience, stability and resource reusability of the teaching process.

Owner:SHANGHAI BIJIE INTELLIGENT TECHNOLOGY CO LTD

Method for early recognition of parkinsonian dysarthria based on voiceprint features

Active | CN121545557B | Dysarthria | Medical diagnosis

The application discloses a method for early identification of Parkinson's disease dysarthria based on voiceprint features and belongs to the technical field of medical diagnosis. The original speech data of a subject to be identified is subjected to speech analysis and speech recognition, and an initial feature set related to pronunciation deviations and to voiceprints within segments is extracted. Target features with significant discriminative power for identifying Parkinson's disease dysarthria are screened out, and a model produces the dysarthria identification result for the subject. Vowel and consonant segments are extracted accurately and pronunciation deviations are identified; phoneme segments are cut precisely by associating the text sequence with the speech data, and the pronunciation deviation is quantified. Feature extraction retains the key dimensions for dysarthria identification, so that early signals of Parkinson's disease dysarthria are captured and high-risk groups are marked. An early warning result is output in combination with a joint risk value calculation, which addresses the problem that prodromal symptoms are not obvious. The review interval is adjusted dynamically, giving whole-process management from identification to monitoring.

Owner:SECOND MEDICAL CENT OF CHINESE PLA GENERAL HOSPITAL

Vehicle voice interaction test method and related device

Pending | CN122177084A | Semantic analysis | Speech recognition | Driving test | In vehicle

The application discloses a vehicle-mounted voice interaction test method and related equipment, and relates to the technical field of vehicle testing. The method comprises the following steps: a preset large language model is used to analyze and process a knowledge graph in the field of vehicle-mounted voice, so as to obtain a test case; a test text corpus is subjected to voice synthesis processing, so as to obtain instruction voice; the instruction voice is sent to a vehicle-mounted voice system to be tested, and response voice returned by the vehicle-mounted voice system to be tested in response to the instruction voice is acquired; the response voice is subjected to voice analysis processing, so as to obtain response text; a preset large language model is used to perform semantic verification on an expected response result and the response text, so as to obtain a test result for the vehicle-mounted voice system to be tested. The application drives test case generation through a large language model, and combines voice synthesis, voice analysis and a semantic level verification mechanism, so as to construct an automatic closed-loop test process based on real voice interaction, thereby improving the efficiency of vehicle-mounted voice system testing and the accuracy of semantic evaluation.

Owner:VOYAH AUTOMOBILE TECH CO LTD

Method and system for intelligent parameterization extraction and representation of characteristics of opera singing tunes

Pending | CN122369504A | Feature extraction | Algorithm

This invention provides an intelligent parameterized extraction and representation method and system for opera singing features, relating to the field of speech analysis and recognition technology. By constructing a configuration mapping table and rule engine containing genre labels, school labels, and downstream task labels, a configuration vector containing parameter set switches, weight values, and output granularity information is automatically generated before analysis begins. Based on downstream labels such as classification, comparison, teaching, and AI generation, the weight values and output granularity of the set parameters are automatically adjusted. Using the configuration vector as a constraint, corresponding feature extraction is performed, decomposing opera singing features into three levels for extraction and fusion. Parameter vectors for the vocal structure layer and embellishment layer are extracted, and dynamic layer parameters are extracted by calculating emotional tension curves. A structured fusion is used to form a unified parameter model. The system can select to output a complete model or a subset based on the downstream task, achieving a targeted parameter generation strategy for different downstream tasks, improving accuracy and adaptability.

Owner:LANZHOU YINQIAO CULTURAL COMMUNICATION CO LTD

Speech practice with media content synchronization

Active | US12651605B2 | Speech analysis | Voice analysis | Acoustics

An embodiment includes detecting by a Speech Detection Component of a system a speech metric of a speaker in response to a reference speech. The embodiment includes responsive to the detected speech metric, computing by a Speech Analysis Component of the system a deviation metric between the speech metric and the reference speech. The embodiment includes training a machine learning model by a Speech Prediction Component of the system based on the deviation metric to generate a predicted speech pattern of the speaker. The embodiment also includes transforming by a Controller Component of the system the reference speech based on the predicted speech pattern.

Owner:INTERNATIONAL BUSINESS MACHINE CORPORATION

A user call emotion recognition method and system for AI telephone customer service

Pending | CN122337260A | Mood | Voice analysis

This invention relates to the field of speech analysis, specifically to a method and system for recognizing user emotions in AI-powered telephone customer service. Based on a complete dialogue between an AI-powered telephone customer service representative and a user in a smart home installation after-sales service scenario, this invention collects multi-round dialogue text data, analyzes the user's description of product issues, extracts scene feature words and their negative tags, and specifically analyzes user emotions. Based on the distribution characteristics of negative tags, feature values of scene feature words are obtained, quantifying the intensity of negative user emotions. The emotion decay characteristics are analyzed by combining feature value differences to obtain the emotional feature values of each round of dialogue. Utilizing the differences in emotional features between adjacent dialogues, the reference value index of each round of dialogue is quantified, and based on this, an emotion analysis model is trained and used to determine the user's incoming call emotion recognition result. This invention considers multiple dimensions in emotion recognition, such as individual user expression habits, tone words, and voice characteristics, enabling the after-sales AI telephone customer service to more accurately capture the user's true emotional changes.

Owner:匠达(苏州)科技有限公司 +1

Intelligent language switching system and method for display screen of coal mining machine

Pending | CN122347946A | Voice analysis | Speech sound

The present application belongs to the technical field of coal mining equipment, and discloses a coal mining machine display screen intelligent language switching system and method, which comprises the following steps: collecting the voice instructions of the operator through a voice collection module; recognizing the voice instructions collected by the voice collection module through a voice recognition module and converting the voice instructions into text information; recognizing the language type of the text information through a voice analysis module; obtaining the recognition result of the voice analysis module by a main control module, determining the display language switching requirement of the operator, and switching the display language of the coal mining machine display screen based on the display language switching requirement. The technical scheme disclosed by the present application can automatically switch the language of the coal mining machine display screen to the corresponding language, improve the convenience and efficiency of operation, and reduce the risk of misoperation caused by language problems.

Owner:SHANGHAI TIANDI MINING EQUIP TECH CO LTD +1
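
The last two steps in this entry's abstract, recognizing the language of the transcribed instruction and switching the display language, can be sketched as follows. The abstract does not describe how the language is identified, so this sketch assumes a crude heuristic based on Unicode script ranges and supports only three language codes. `detect_language`, `Display` and `handle_instruction` are hypothetical names, and speech-to-text is assumed to have happened already.

```python
def detect_language(text):
    """Guess the language of transcribed text from its script (crude heuristic).

    Returns "zh", "ru" or "en", or None when no letters are recognised.
    """
    counts = {"zh": 0, "ru": 0, "en": 0}
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":      # CJK Unified Ideographs
            counts["zh"] += 1
        elif "\u0400" <= ch <= "\u04ff":    # Cyrillic
            counts["ru"] += 1
        elif ch.isascii() and ch.isalpha():  # basic Latin letters
            counts["en"] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


class Display:
    """Stand-in for the machine's display screen."""

    def __init__(self, language="zh"):
        self.language = language

    def switch_language(self, language):
        self.language = language


def handle_instruction(text, display):
    """Main-control step: switch the display if the operator's language differs."""
    language = detect_language(text)
    if language is not None and language != display.language:
        display.switch_language(language)
    return display.language


screen = Display("zh")
after_english = handle_instruction("start the cutting drum", screen)
```

Script ranges cannot tell apart languages that share an alphabet, so a production system would need a proper language identification model; the sketch only shows the control flow from recognized text to a display switch.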

Dynamic graphical user interface for a whisper mode switch of an electronic device

Active | CN310113098S | Graphical user interface | Computer graphics (images)

1. The name of the design product: dynamic graphical user interface of the electronic device's voice-over mode switch.
2. The use of the design product: an electronic device.
3. The design points of the design product: the graphical user interface in the electronic device.
4. The picture or photo that best indicates the design points: front view.
5. The electronic device is a conventional design, and other views are omitted.
6. The use of the graphical user interface: the interface for displaying and interacting with the voice-over mode switch; for the user to turn on or off the voice-over mode according to the cultural background intelligent voice-over; the front view interface is the interactive interface when the voice-over mode is not turned on during video playback; in the front view interface, after clicking the voice-over mode button (the button in the lower right corner of the interface in the shape of a pen) to turn on the voice-over mode, the dynamic change effect of interface change state figure 1-2 is displayed; in the interface change state figure 2 interface, according to the plot and voice AI analysis, the voice-over card appears at the appropriate time during video playback, displaying interface change state figure 3; in the interface change state figure 3 interface, after clicking the voice-over card, the dynamic change effect of interface change state figure 4-6 is displayed; the gray area in the interface is the content picture.

Owner:HUNAN HAPPLY SUNSHINE INTERACTIVE ENTERTAINMENT MEDIA CO LTD

Mechanical arm control system and method, electronic device, computer readable storage medium

Pending | CN122100153A | Programme-controlled manipulator | PathPing | Software engineering

The application relates to the technical field of intelligent automation control, further relates to a mechanical arm control system and method, an electronic device and a computer readable storage medium. The system comprises a voice analysis module, which is used for analyzing voice input of a user and determining action requirements of the user on a mechanical arm; an instruction generation module connected with the voice analysis module, which is used for generating standardized action instructions according to the action requirements, the types of the standardized action instructions including preset action instructions, autonomous path planning instructions, man-machine interaction instructions, linkage instructions and cross-station path planning instructions; and a mechanical arm provided with a moving chassis and used for executing corresponding actions according to the standardized action instructions; wherein the linkage instructions are used for instructing the mechanical arm and other auxiliary equipment to complete linkage actions. The system effectively improves the scene adaptability of mechanical arm control and can meet diversified operation requirements.

Owner:SHANGHAI XINGLUOQIBU TECHNOLOGY CO LTD

Detecting impaired physiological function by speech analysis

Active | EP4000529C0 | Voice analysis | Acoustics

Owner:CORDIO MEDICAL LTD

system

Pending | JP2026104619A | Resources | Psychological status | Algorithm

A system is provided. [Solution] The system includes: a means for conducting automated interviews to acquire data; a means for converting audio data to text in real time; a means for analyzing audio and image data to infer emotional states; a means for calculating the degree of fit between the applicant and the organization based on the analysis results; a means for performing facial expression analysis using video equipment for data visualization; and a means for evaluating the psychological state of applicants by performing voice analysis.

Owner:SOFTBANK GROUP CORP
