9349 results about "Voice data" patented technology

System and method for the creation and automatic deployment of personalized, dynamic and interactive inbound and outbound voice services, with real-time interactive voice database queries

Inactive | US6885734B1 | Convenient transaction | Provide information | Data processing applications | Automatic call-answering/message-recording/conversation-recording | Database query | Service bureau

The interactive delivery of voice service messages communicating financial, personal or other news telecasts can be accessed at a subscriber's convenience on an inbound telephone, network enabled or other call. A voice service bureau may generate voice messages for individual subscribers according to their natural language queries. For instance, a subscriber may inquire “What was the closing Dow today?” and the system converts the request to digital form, discriminates search terms, interrogates a database and annunciates to the caller “The closing Dow was 11,000 today”. Responses to those or other inquiries may be delivered in real time or substantially real time.

Owner:MICROSTRATEGY

Method and system for matching speech data

Inactive | US7707032B2 | Easy to implement | Low cost | Speech recognition | Speech sound | Voice data

A method and system used to determine the similarity between an input speech data and a sample speech data is provided. First, the input speech data is segmented into a plurality of input speech frames and the sample speech data is segmented into a plurality of sample speech frames. Then, the input speech frames and the sample speech frames are used to build a matching matrix, wherein the matching matrix comprises the distance values between each of the input speech frames and each of the sample speech frames. Next, the distance values are used to calculate a matching score. Finally, the similarity between the input speech data and the sample speech data is determined according to this matching score.

Owner:NAT CHENG KUNG UNIV
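
The frame-matching procedure described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: the fixed frame length, the Euclidean frame distance, the scoring rule (mean of each input frame's nearest sample-frame distance), the threshold, and all function names are assumptions, since the abstract does not specify them.

```python
import math

def split_frames(samples, frame_len):
    """Cut a sample sequence into consecutive fixed-length frames (incomplete tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def frame_distance(a, b):
    """Euclidean distance between two equal-length frames (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def matching_matrix(input_frames, sample_frames):
    """Row i, column j holds the distance between input frame i and sample frame j."""
    return [[frame_distance(f, g) for g in sample_frames] for f in input_frames]

def matching_score(matrix):
    """Assumed scoring rule: average, over input frames, of the smallest distance in each row."""
    return sum(min(row) for row in matrix) / len(matrix)

def is_similar(input_samples, sample_samples, frame_len=4, threshold=0.5):
    """Decide similarity from the matching score (hypothetical threshold; lower score = closer)."""
    matrix = matching_matrix(split_frames(input_samples, frame_len),
                             split_frames(sample_samples, frame_len))
    return matching_score(matrix) <= threshold
```

Both inputs need at least one full frame, otherwise the matrix is empty and the score is undefined. A real system would more likely compare spectral feature vectors than raw samples.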

Method and system for text-to-speech synthesis with personalized voice

Active | US20080235024A1 | Speech synthesis | Personalization | Data set

A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) to synthesized speech including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453) and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker with expressions added from the visual input (455).

Owner:CERENCE OPERATING CO

Method and system for adapting a wireless mobile communication device for wireless transactions

Active | US20080051059A1 | Unauthorised/fraudulent call prevention | Eavesdropping prevention circuits | Computer hardware | Transceiver

A method is provided for configuring a mobile communication device to perform transactions using a second communication channel that is different from a first communication channel through which the mobile communication device sends voice data. The method includes attaching a secure element to the mobile communication device. The secure element includes a memory storing an application, a processor configured to execute the application stored in the memory, and a wireless transceiver configured to send transaction data associated with the executed application through the second communication channel to a terminal that is remote from the mobile communication device.

Owner:BLAZE MOBILE

Communication-status notification apparatus for communication system, communication-status display apparatus, communication-status notification method, medium in which communication-status notification program is recorded and communication apparatus

Inactive | US6967958B2 | Easily via a subscriber terminal | Easy to observe | Interconnection arrangements | Multiple keys/algorithms usage | Control communications | Speech sound

A communication-status notification apparatus enables a subscriber to observe various kinds of communication status in a network easily via the subscriber's own terminal in a communication system. The apparatus includes a request analysis section for discriminating whether or not voice data received by gateway equipment from a subscriber terminal contains a request for monitoring/controlling or notifying of a communication status in the network, and for analyzing the content of the request when it does; a communication-status monitor/control section for monitoring/controlling the communication status responsive to the content of the request analyzed by the request analysis section, based on a processing status of the voice data in the gateway equipment; and a communication-status notification section for notifying the subscriber terminal of the communication status monitored/controlled by the communication-status monitor/control section via the gateway equipment, responsive to the content of the request analyzed by the request analysis section. The apparatus is useful when applied to VoIP gateway equipment or the like used for a VoIP communication system.

Owner:FUJITSU LTD

Virtual conference room for voice conferencing

Inactive | US6850496B1 | Increase voice comprehension | Easy to identify | Special service provision for substation | Multiplex system selection arrangements | Confusion | Virtual conference

A system and method are disclosed for packet voice conferencing. The system and method divide a conferencing presentation sound field into sectors, and allocate one or more sectors to each conferencing endpoint. At some point between capture and playout, the voice data from each endpoint is mapped into its designated sector or sectors. Thereafter, when the voice data from a plurality of participants from multiple endpoints is combined, a listener can identify a unique apparent location within the presentation sound field for each participant. The system allows a conference participant to increase their comprehension when multiple participants speak simultaneously, as well as alleviate confusion as to who is speaking at any given time.

Owner:CISCO TECH INC
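
The sector idea in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the sound field is taken to be a 180-degree stereo arc divided evenly among endpoints, each endpoint is placed at the centre of its sector, and constant-power panning is used to map it there. The abstract prescribes none of these choices, and all names are hypothetical.

```python
import math

def allocate_sectors(endpoints, field_degrees=180.0):
    """Split the sound field evenly; return {endpoint: (start_deg, end_deg)}."""
    width = field_degrees / len(endpoints)
    start = -field_degrees / 2
    return {ep: (start + i * width, start + (i + 1) * width)
            for i, ep in enumerate(endpoints)}

def pan_gains(angle_deg, field_degrees=180.0):
    """Constant-power stereo pan (assumed technique): angle -> (left_gain, right_gain)."""
    position = (angle_deg + field_degrees / 2) / field_degrees  # 0.0 = far left, 1.0 = far right
    theta = position * math.pi / 2
    return math.cos(theta), math.sin(theta)

def mix(frames_by_endpoint, sectors, field_degrees=180.0):
    """Mix equal-length mono frames into stereo, each endpoint at the centre of its sector."""
    length = len(next(iter(frames_by_endpoint.values())))
    left, right = [0.0] * length, [0.0] * length
    for endpoint, frame in frames_by_endpoint.items():
        lo, hi = sectors[endpoint]
        gain_l, gain_r = pan_gains((lo + hi) / 2, field_degrees)
        for i, sample in enumerate(frame):
            left[i] += gain_l * sample
            right[i] += gain_r * sample
    return left, right
```

With two endpoints, the first lands left of centre and the second right of centre, so a listener hears each participant from a distinct apparent direction even when both speak at once.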

Pronunciation variation rule extraction apparatus, pronunciation variation rule extraction method, and pronunciation variation rule extraction program

Inactive | US8595004B2 | High property | Speech recognition | Speech identification | Speech sound

A problem to be solved is to robustly detect a pronunciation variation example and acquire a pronunciation variation rule having a high generalization property, with less effort. The problem can be solved by a pronunciation variation rule extraction apparatus including a speech data storage unit, a base form pronunciation storage unit, a sub word language model generation unit, a speech recognition unit, and a difference extraction unit. The speech data storage unit stores speech data. The base form pronunciation storage unit stores base form pronunciation data representing base form pronunciation of the speech data. The sub word language model generation unit generates a sub word language model from the base form pronunciation data. The speech recognition unit recognizes the speech data by using the sub word language model. The difference extraction unit extracts a difference between a recognition result outputted from the speech recognition unit and the base form pronunciation data by comparing the recognition result and the base form pronunciation data.

Owner:NEC CORP

System, method, program product, and networking use for recognizing words and their parts of speech in one or more natural languages

Active | US7680649B2 | Natural language data processing | Speech recognition | Part of speech | Speech sound

A system, method, and computer program are disclosed for recognizing one or more words not listed in a dictionary database. One or more sequences of characters in the word are checked to determine a probability that the word is valid. A prefix removal process removes any prefixes from a word, and obtains information about the removed prefix. A suffix removal process removes any suffixes from the word, and obtains information about the removed suffix. A root process obtains information about a root word from the dictionary database. A combination process then determines if the prefix, the root, and the suffix can be combined into a valid word as defined by one or more combination rules, obtains one or more of the possible parts of speech of the valid word, and stores the parts of speech with the valid word in the dictionary database.

Owner:IBM CORP
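
The prefix, suffix, root and combination steps in the abstract above might look roughly like this. It is a toy sketch: the affix tables, the rule format (`attaches_to`, `yields`) and the function name are invented for illustration, and the character-sequence probability check mentioned in the abstract is omitted.

```python
# Hypothetical affix tables; a real system would carry far richer information.
PREFIXES = {
    "un": {"attaches_to": {"adjective"}},
    "re": {"attaches_to": {"verb"}},
}
SUFFIXES = {
    "ness": {"attaches_to": {"adjective"}, "yields": "noun"},
    "er": {"attaches_to": {"verb"}, "yields": "noun"},
}

def recognize(word, dictionary):
    """Return the parts of speech for `word`, deriving and storing them if it is unlisted."""
    if word in dictionary:
        return dictionary[word]
    for prefix, p_rule in [("", None)] + list(PREFIXES.items()):
        if not word.startswith(prefix):
            continue
        stem = word[len(prefix):]                      # prefix removal
        for suffix, s_rule in [("", None)] + list(SUFFIXES.items()):
            if not stem.endswith(suffix):
                continue
            root = stem[:len(stem) - len(suffix)]      # suffix removal
            if (not prefix and not suffix) or root not in dictionary:
                continue
            pos = set(dictionary[root])                # root lookup
            if p_rule is not None:                     # combination rule for the prefix
                pos &= p_rule["attaches_to"]
            if s_rule is not None:                     # combination rule for the suffix
                pos = {s_rule["yields"]} if pos & s_rule["attaches_to"] else set()
            if pos:
                dictionary[word] = pos                 # store the newly validated word
                return pos
    return None
```

For example, with a dictionary holding only `kind` (adjective) and `read` (verb), `recognize("unkindness", d)` derives a noun and adds it to the dictionary, while `recognize("unread", d)` fails because the toy rule for `un` only attaches to adjectives.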

Generating objectively evaluated sufficiently natural synthetic speech from text by using selective paraphrases

Active | US8015011B2 | Speech synthesis | Data segment | Paraphrase

A synthetic speech system includes a phoneme segment storage section for storing multiple phoneme segment data pieces; a synthesis section for generating voice data from text by reading phoneme segment data pieces representing the pronunciation of an inputted text from the phoneme segment storage section and connecting the phoneme segment data pieces to each other; a computing section for computing a score indicating the unnaturalness of the voice data representing the synthetic speech of the text; a paraphrase storage section for storing multiple paraphrases of the multiple first phrases; a replacement section for searching the text and replacing with appropriate paraphrases; and a judgment section for outputting generated voice data on condition that the computed score is smaller than a reference value and for inputting the text after the replacement to the synthesis section to cause the synthesis section to further generate voice data for the text.

Owner:CERENCE OPERATING CO

Method and apparatus for monitoring and processing voice over internet protocol packets

Active | US7209473B1 | Application of process | Modify characteristic | Metering/charging/billing arrangements | Interconnection arrangements | Denial-of-service attack | Packet processing

A processor architecture for processing data packets representing voice over Internet Protocol (VoIP) calls in a packet-switched network is disclosed. According to an embodiment, a VoIP processor executes a voice packet processing operating system that is configured to monitor or manipulate the packets at an IP layer, media layer and signaling layer of the call. The VoIP processor includes a plurality of independently callable primitive software functions that carry out low-level VoIP packet processing functions. The VoIP processor executes one or more application programs that selectively call one or more of the primitive software functions and are independent of any underlying protocols of the existing network, thereby isolating the application programs from low-level processing details. Further, techniques are described for modifying characteristics of VoIP traffic for the purpose of monitoring and directing the VoIP traffic through a network. The techniques include extracting information associated with the VoIP traffic and using the information for the purpose of controlling access, for fraud detection, for billing, for enforcing policy decisions, for protection against denial of service attacks, for lawful interception, for service selection, and other applications.

Owner:JUNIPER NETWORKS INC

Retrieval and Presentation of Network Service Results for Mobile Device Using a Multimodal Browser

Active | US20070061146A1 | Interconnection arrangements | Web data indexing | Display device | Mobile device

A method of obtaining information using a mobile device can include receiving a request including speech data from the mobile device, and querying a network service using query information extracted from the speech data, whereby search results are received from the network service. The search results can be formatted for presentation on a display of the mobile device. The search results further can be sent, along with a voice grammar generated from the search results, to the mobile device. The mobile device then can render the search results.

Owner:NUANCE COMM INC

Transaction apparatus and method that identifies an authorized user by appearance and voice

Inactive | US6023688A | Complete banking machines | Finance | Output device | Data memory

A financial transaction apparatus (30) includes a financial transaction machine (32). The machine includes devices (34) including transaction function devices (42, 44, 46, 48) for carrying out operations associated with financial transactions. The terminal also includes an imaging device (50) and an audio input device (52), as well as a visual output device (36) and an audio output device (54). Terminal (32) is connected to a computer (68) which has an associated data store (70). The data store includes user data including image data and voice data corresponding to authorized users. The identity of a customer operating the machine is determined by resolving first identity data based on image signals from the imaging device which correspond to a user's appearance. Second identity data is resolved by the processor from voice signals from the audio input device corresponding to the user's voice. The computer enables operation of the transaction function devices if the level of correlation between the first and second identity data is sufficient to establish that the image and voice signals originate from a single authorized user.

Owner:DIEBOLD NIXDORF

Synthesis unit selection apparatus and method, and storage medium

Inactive | US6980955B2 | Inhibit deterioration | Sound input/output | Speech synthesis | Frequency of occurrence | A* search algorithm

Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and concatenation distortions upon connecting that synthesis unit to those in the preceding phoneme are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An Nbest determination unit obtains N best paths that can minimize the distortion using the A* search algorithm, and a registration unit determination unit selects a synthesis unit to be registered in a synthesis unit inventory on the basis of the N best paths in the order of frequencies of occurrence, and registers it in the synthesis unit inventory.

Owner:CANON KK

Internet-based telephone call manager

Inactive | US6212261B1 | Improve handling | Improve abilities | Interconnection arrangements | Special service for subscribers | Data connection | Data information

A method is provided that allows data access service provider subscribers to manage their telephone service through a data connection. The subscriber is enabled to obtain call data information and is provided real time control. During a data call, a visual incoming call indicator informs the subscriber connected to the data access service provider, through a popup window, that there is a call attempt. A visual message waiting indicator allows a subscriber connected to the data access service provider to be notified of a pending message on the voice message system. A visual call disposition allows the subscriber, through the data connection, to dispose of calls. The call disposition options include forwarding a call to voice mail, playing an announcement to the calling party, forwarding the call to another line, sending a text message which could be converted to speech using text to speech technology, answering the call using voice over data call, or terminating the data connection in order to accept the call.

Owner:RPX CLEARINGHOUSE

Adaptive context for automatic speech recognition systems

Inactive | US20080091426A1 | Speech recognition | Post processor | Speech identification

A system that improves speech recognition includes an interface linked to a speech recognition engine. A post-recognition processor coupled to the interface compares recognized speech data generated by the speech recognition engine to contextual information retained in a memory, generates modified recognized speech data, and transmits the modified recognized speech data to a parsing component.

Owner:NUANCE COMM INC +1

Voice authentication method and system utilizing same

Inactive | US6510415B1 | Individual entry/exit registers | Speech recognition | Speech verification | Speech sound

A system for authorizing user access to a secure site includes a memory unit, first and second input devices, and first and second processing devices. The memory unit stores voice prints and identities of the set of individuals that have access to the secure site. The first input device is for inputting information that identifies the user as a member of the set. The second input device is for inputting temporary user voice data. The first processing device is for generating a temporary voice print from the temporary data. The second processing device is for comparing the temporary voice print to the stored voice prints. Access is granted only if the temporary voice print is most similar to the voice print of the individual that the user claims to be.

Owner:SENTRYCOM
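
The access rule stated in the abstract above (grant access only if the temporary voice print is most similar to the print of the claimed identity) can be sketched as below. Representing a voice print as a plain feature vector and comparing prints with cosine similarity are assumptions made for illustration; the abstract does not say how prints are built or compared, and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def authorize(claimed_id, temp_print, stored_prints):
    """Grant access only if the claimed identity's stored print is the closest match."""
    if claimed_id not in stored_prints:
        return False  # the user is not a member of the authorized set
    best_match = max(stored_prints,
                     key=lambda uid: cosine_similarity(temp_print, stored_prints[uid]))
    return best_match == claimed_id
```

Given stored prints `{"alice": [1.0, 0.0], "bob": [0.0, 1.0]}`, a temporary print of `[0.9, 0.1]` is accepted for a user claiming to be `alice` and rejected for a user claiming to be `bob`.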

Technique of Generating High Quality Synthetic Speech

Active | US20080183473A1 | Speech synthesis | Paraphrase | Speech sound

Owner:CERENCE OPERATING CO

Apparatus and method for reproducing voice in synchronism with music piece

Inactive | US7365260B2 | Avoid wasting | Edited and revised with ease | Gearworks | Musical toys | Data design | Event data

Music piece sequence data are composed of a plurality of event data which include performance event data and user event data designed for linking a voice to progression of a music piece. A plurality of voice data files are stored in a memory separately from the music piece sequence data. In music piece reproduction, the individual event data of the music piece sequence data are sequentially read out, and a tone signal is generated in response to each readout of the performance event data. In the meantime, a voice reproduction instruction is output in response to each readout of the user event data. In accordance with the voice reproduction instruction, a voice data file is selected from among the voice data files stored in the memory, and a voice signal is generated on the basis of each read-out voice data.

Owner:YAMAHA CORP

Hierarchical real-time speaker recognition for biometric VoIP verification and targeting

Active | US8160877B1 | Reduce complexity | Speech recognition | Crowds | Cluster based

A method for real-time speaker recognition including obtaining speech data of a speaker, extracting, using a processor of a computer, a coarse feature of the speaker from the speech data, identifying the speaker as belonging to a pre-determined speaker cluster based on the coarse feature of the speaker, extracting, using the processor of the computer, a plurality of Mel-Frequency Cepstral Coefficients (MFCC) and a plurality of Gaussian Mixture Model (GMM) components from the speech data, determining a biometric signature of the speaker based on the plurality of MFCC and the plurality of GMM components, and determining in real time, using the processor of the computer, an identity of the speaker by comparing the biometric signature of the speaker to one of a plurality of biometric signature libraries associated with the pre-determined speaker cluster.

Owner:THE BOEING CO

Partial speech processing device & method for use in distributed systems

Inactive | US20050086059A1 | Flexibly and optimally distributed | Improve accuracy | Natural language translation | Data processing applications | Time response | Client-side

A client device incorporates partial speech recognition for recognizing a spoken query by a user. The full recognition process is distributed over a client/server architecture, so that the amount of partial recognition signal processing tasks can be allocated on a dynamic basis based on processing resources, channel conditions, etc. Partially processed speech data from the client device can be streamed to a server for a real-time response. Additional natural language processing operations can also be performed to implement sentence recognition functionality.

Owner:NUANCE COMM INC

Portable digital audio recorder with adaptive control configurations

Inactive | US6038199A | Easy to share | Easy to manage | Driving/moving recording heads | Special service for subscribers | Computer hardware | Self adaptive

A portable digital audio recorder provides a first set of control or editing options with respect to a first category of voice data files stored in the recorder, and provides a different set of control or editing options with respect to a second category of voice data files stored in the recorder. The recorder may also be conveniently shared among a number of users and adapts various operating parameters to the preferences of the current user.

Owner:NUANCE COMM INC

Using network time protocol in voice over packet transmission

Active | US7602815B2 | Guaranteed normal transmission | Effective maintenance | Time-division multiplex | Data switching by path configuration | TTEthernet | Voice over packet

One or more methods and systems of effectively transmitting voice and voice band data from one node to another are presented. In one embodiment, the system comprises an NTP time server generating absolute times to computing devices such as residential voice over internet protocol (VoIP) gateways. The NTP time server generates absolute times in response to NTP time requests made by one or more computing devices such as residential VoIP gateways. In one embodiment, the method comprises determining an adequate rate for requesting absolute times from an NTP server, making periodic requests to the NTP server, obtaining the absolute times from the NTP server, and generating an adjustment parameter for use by a computing device such as a residential VoIP gateway.

Owner:AVAGO TECH INT SALES PTE LTD

Phonetic decoding and concatentive speech synthesis

Active | US20080133241A1 | Speech recognition | Speech synthesis | Multiplexer | Speech input

A speech processing system includes a multiplexer that receives speech data input as part of a conversation turn in a conversation session between two or more users where one user is a speaker and each of the other users is a listener in each conversation turn. A speech recognizing engine converts the speech data to an input string of acoustic data while a speech modifier forms an output string based on the input string by changing an item of acoustic data according to a rule. The system also includes a phoneme speech engine for converting the first output string of acoustic data including modified and unmodified data to speech data for output via the multiplexer to listeners during the conversation turn.

Owner:CERENCE OPERATING CO

Power line communication voice over IP system and method

Inactive | US7856007B2 | Electric signal transmission systems | Error prevention | Voice over IP | Speech sound

A power line communication system communicating over a medium voltage power line including a VoIP endpoint that transmits voice data and requests to establish a voice connection to a medium voltage (MV) access device is provided. The medium voltage access device may determine a response to the request, allocate voice data packets a higher priority than general data packets, and transmit the data packets over the MV power line according to their priority. The MV access device may provide one or more voice over internet protocol (VoIP) switch functions and, in response to the requests, grant or deny the requests based, for example, on the number of established voice connections.

Owner:CURRENT TECH
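
The two behaviours described in the abstract above, admitting voice connections based on how many are already established and sending voice packets ahead of general data, can be sketched with a priority queue. The class name, the two priority levels and the fixed connection limit are assumptions for illustration only.

```python
import heapq
import itertools

VOICE, DATA = 0, 1  # lower value = higher priority (assumed two-level scheme)

class AccessDevice:
    """Toy model of a medium voltage access device acting as a VoIP switch."""

    def __init__(self, max_voice_connections):
        self.max_voice_connections = max_voice_connections
        self.active_calls = set()
        self._queue = []
        self._sequence = itertools.count()  # keeps FIFO order within one priority level

    def request_connection(self, call_id):
        """Grant or deny a voice connection based on the number already established."""
        if len(self.active_calls) >= self.max_voice_connections:
            return False
        self.active_calls.add(call_id)
        return True

    def enqueue(self, payload, kind):
        """Queue a packet for transmission with its priority class."""
        heapq.heappush(self._queue, (kind, next(self._sequence), payload))

    def transmit_next(self):
        """Return the next packet to send over the power line, or None if idle."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]
```

If `d1` (data), `v1` (voice) and `d2` (data) are queued in that order, successive calls to `transmit_next()` return `v1`, `d1`, `d2`: voice jumps the queue while data keeps its arrival order.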

Apparatus for speech recognition using multiple acoustic model and method thereof

Active | US20140180689A1 | Improve performance | Speech recognition | Acoustic model | Feature data

Disclosed are an apparatus for recognizing voice using multiple acoustic models and a method thereof. The apparatus includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to recognize the voice data in parallel based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.

Owner:ELECTRONICS & TELECOMM RES INST

Conference support system, record generation method and a computer program product

Inactive | US20050209848A1 | Special service provision for substation | Television conference systems | Supporting system | Speech sound

A conference support system includes a data reception portion for receiving image data of attendants in a conference and voice data, an emotion distinguishing portion for distinguishing emotions of attendants in accordance with the image data, a text data generation portion for generating comment text data that indicate contents of speeches of the attendants in accordance with the voice data, and a record generation portion for generating record data that include contents of a speech of an attendant and emotions of attendants when the speech was made, in accordance with emotion data that indicate a result of distinguishing made by the emotion distinguishing portion and the comment text data.

Owner:FUJITSU LTD

Multiplexing several individual application sessions over a pre-allocated reservation protocol session

Inactive | US7013338B1 | Multiple digital computer combinations | Data switching networks | Multiplexing | Community based

Apparatus and methods are provided for multiplexing application flows over a pre-allocated bandwidth reservation protocol session. According to one embodiment, a pre-allocated reservation protocol session, such as an RSVP session, is shared by one or more application sessions. The reservation protocol session is pre-allocated over a path between a first network device associated with a first user community and a second network device associated with a second user community based upon an estimated usage of the path for application sessions between users of the first and second user communities. Subsequently, the one or more application sessions are dynamically aggregated by multiplexing application flows associated with the one or more individual application sessions onto the pre-allocated reservation protocol session at the first network device and demultiplexing at the second network device. According to another embodiment, a network device enables multiple applications, such as VoIP applications, that require real-time performance to share an aggregated reservation protocol session, such as an RSVP session. The network device includes a storage device having stored therein one or more routines for establishing and managing the aggregated reservation protocol session. A processor coupled to the storage device executes the one or more routines to pre-allocate the aggregated reservation protocol session and thereafter share the aggregated reservation protocol session among multiple application sessions of individual application sessions. The aggregated reservation protocol session is pre-allocated based upon an estimate of the bandwidth requirements to accommodate the multiple application sessions. The aggregated reservation protocol session is shared by multiplexing, onto the aggregated reservation protocol session, outbound media packets (e.g., packetized voice data) originated by local application/endpoints associated with the application sessions, and demultiplexing, from the aggregated reservation protocol session, inbound media packets (e.g., packetized voice data) originated by remote application/endpoints.

Owner:WINTERSPRING DIGITAL LLC

System and method of providing conversational visual prosody for talking heads

Active | US7136818B1 | High-quality impression | Natural appearing interaction | Speech recognition | Visual prosody | Speech sound

A system and method of controlling the movement of a virtual agent while the agent is speaking to a human user during a conversation is disclosed. The method comprises receiving speech data to be spoken by the virtual agent, performing a prosodic analysis of the speech data, selecting matching prosody patterns from a speaking database and controlling the virtual agent movement according to the selected prosody patterns.

Owner:INTERACTIONS LLC (US)

Speech data mining for call center management

Inactive | US20050010411A1 | Improving automatic recognition of speech | Easy to identify | Speech recognition | Quality of service | Frustration

A speech data mining system for use in generating a rich transcription having utility in call center management includes a speech differentiation module differentiating between speech of interacting speakers, and a speech recognition module improving automatic recognition of speech of one speaker based on interaction with another speaker employed as a reference speaker. A transcript generation module generates a rich transcript based on recognized speech of the speakers. Focused, interactive language models improve recognition of a customer on a low quality channel using context extracted from speech of a call center operator on a high quality channel with a speech model adapted to the operator. Mined speech data includes number of interaction turns, customer frustration phrases, operator politeness, interruptions, and/or contexts extracted from speech recognition results, such as topics, complaints, solutions, and resolutions. Mined speech data is useful in call center and/or product or service quality management.

Owner:PANASONIC CORP

Voice-controlled data system

Active | US20070198273A1 | Speech recognition | Control data | Speech identification

A voice-controlled data system may include a data storage unit including media files having associated file identification data, and a vocabulary generating unit generating phonetic data corresponding to the file identification data, the phonetic data being supplied to a speech recognition unit as a recognition vocabulary, where one of the media files may be selected according to a recognized speech control command on the basis of the generated phonetic data, where the file identification data include a language identification part for identifying the language of the file identification data, and where the vocabulary generating unit generates the phonetic data for the file identification data of a media file based on its language identification part.

Owner:HARMAN BECKER AUTOMOTIVE SYST
