521 results about "Transformation of text" patented technology

Transformations of text are strategies to perform geometric transformations on text (reversal, rotations, etc.), particularly in systems that do not natively support transformation, such as HTML, seven-segment displays and plain text.
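
As a rough illustration of the definition above, two such transformations can be done on plain text with nothing but string operations. This is a minimal sketch; the function names are hypothetical, and only character positions are moved (the glyphs themselves are not rotated).

```python
from typing import List


def reverse_text(text: str) -> str:
    """Mirror a line by reversing its character order."""
    return text[::-1]


def rotate_90_clockwise(lines: List[str]) -> List[str]:
    """Rotate a block of plain text 90 degrees clockwise.

    Lines are padded with spaces to equal width so the block is rectangular.
    """
    width = max((len(line) for line in lines), default=0)
    padded = [line.ljust(width) for line in lines]
    # Column c of the input, read bottom to top, becomes row c of the output.
    return ["".join(row[c] for row in reversed(padded)) for c in range(width)]
```

For example, `rotate_90_clockwise(["ab", "cd"])` yields the two rows `ca` and `db`.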

Method and device for converting speech

Electronic device and method for a speech-to-text conversion procedure, wherein the overall conversion result may include smaller portions with multiple conversion options that are audibly and optionally visually or tactilely reproduced for user confirmation, thereby resulting in enhanced conversion accuracy with minimal additional effort by the user.
Owner:MOBITER DICTA

Precision speech to text conversion

A speech-to-text conversion module uses a central database of user speech profiles to convert speech to text. Incoming audio information is fragmented into numerous audio fragments based upon detecting silence. The audio information is also converted to numerous text files by any number of speech engines. Each text file is then fragmented into numerous text fragments based upon the boundaries established during the audio fragmentation. Each set of text fragments from the different speech engines corresponding to a single audio fragment is then compared. The best approximation of the audio fragment is produced from the set of text fragments; a hybrid may be produced. If no agreement is reached, the audio fragment and the set of text fragments are sent to human agents who verify and edit to produce a final edited text fragment that best corresponds to the audio fragment. Fragmentation that produces overlapping audio fragments requires splicing of the final text fragments to produce the output text file.
Owner:FONEWEB
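
The comparison step in the abstract above could be sketched as a simple vote across engines. This is an illustration only, not the patented method: the function name, the normalisation, and the `min_votes` threshold are assumptions, and the hybrid and splicing steps are left out.

```python
from collections import Counter
from typing import List, Optional


def pick_fragment(candidates: List[str], min_votes: int = 2) -> Optional[str]:
    """Return the text most engines agree on, or None to escalate to a human agent."""
    if not candidates:
        return None
    # Normalise case and whitespace so trivial differences do not split the vote.
    normalised = [" ".join(c.lower().split()) for c in candidates]
    text, votes = Counter(normalised).most_common(1)[0]
    return text if votes >= min_votes else None
```

A return value of `None` stands in for the "no agreement" case, where the audio fragment and its text fragments would go to human review.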

Method for form completion using speech recognition and text comparison

A system and method for creating a final text from an audio file. This has particular utility in completing forms with speech-to-text conversion. The system and method include transcribing the audio file into a transcribed text file using a speech recognition program. They further include comparing the transcribed text file with a previously created text file to determine differences between the transcribed text file and the previously created text file. Finally, the system and method include correcting one of the transcribed text file or the previously created text file based upon the differences to create the final text.
Owner:CUSTOM SPEECH USA
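
One way to picture the comparison step is a word-level diff between the two texts. The sketch below uses Python's standard `difflib`; the function name and the word-level granularity are assumptions, not details from the patent.

```python
import difflib
from typing import List, Tuple


def word_differences(transcribed: str, reference: str) -> List[Tuple[str, str, str]]:
    """List (operation, transcribed_words, reference_words) for each mismatch."""
    a, b = transcribed.split(), reference.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replace / insert / delete spans
    ]
```

Each returned tuple marks a span that a correction step could then resolve in favour of one text or the other.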

Method and system for name-face/voice-role association

A method for providing name-face / voice-role association includes determining whether a closed captioned text accompanies a video sequence, providing one of text recognition and speech to text conversion to the video sequence to generate a role-name versus actor-name list from the video sequence, extracting face boxes from the video sequence and generating face models, searching a predetermined portion of text for an entry on the role-name versus actor-name list, searching video frames for face models / voice models that correspond to the text searched by using a time code so that the video frames correspond to portions of the text where role-names are detected, assigning an equal level of certainty for each of the face models found, using lip reading to eliminate face models found that pronounce a role-name corresponding to said entry on the role-name versus actor-name list, scanning a remaining portion of text provided and updating a level of certainty for said each of the face models previously found. Once a particular face model / voice model and role-name association has reached a threshold the role-name, actor name, and particular face model / voice model is stored in a database and can be displayed by a user when the threshold for the particular face model has been reached. Thus the user can query information by entry of role-name, actor name, face model, or even words spoken by the role-name as a basis for the association. A system provides hardware and software to perform these functions.
Owner:UNILOC 2017 LLC

Voice enabled knowledge system

This invention discloses a voice enabled knowledge system, comprising a speech recognition engine and text to speech engine. The speech recognition engine further comprises a representation unit to represent the spoken words, a model classification unit to classify the spoken words, a training database to match the spoken words with preset words and a search unit to search for the spoken word in said training database, based on the results of said model classification. The text to speech engine for conversion of an input text to speech, comprises a text pre-processing unit for analyzing the input text in a sentence form, a prosody unit for word recognition using said acoustic model, a concatenation unit for converting the diphone equivalents into words and thereafter into a sentence and an audio output device for speech output.
Owner:MUKHERJEE SANTOSH KUMAR

Technique for improved audio compression

A method, system, computer program product, and method of doing business by providing improved audio compression wherein an audio stream is securely transformed to an encoded text stream (such as an ASCII, EBCDIC, or Unicode text stream). One or more components which are involved in the transformation process are authenticated. A unique identifier of each such component is included within cryptographically-protected information that is provided for the encoded text stream. A digital signature is preferably used for the cryptographic protection, thereby digitally notarizing the encoded text stream. The authenticity and integrity of the encoded text stream can therefore be verified. In preferred embodiments, the authenticated identities of components performing the transformation can also be determined from the cryptographically-protected information. The encoded text stream will typically require much less storage space than the audio stream, and providing the digital notarization along with the encoded text stream serves to reliably establish evidence of the contents of the audio stream (even though a perfect speech-to-text transformation might not be achieved).
Owner:NUANCE COMM INC

Vowel recognition system and method in speech to text applications

Inactive; US20100217591A1; Accurate speech-to-text conversion; Accurate conversion; Speech recognition; Vowel recognition; Recognition algorithm
The present invention provides systems, software and methods for accurate vowel detection in speech to text conversion, the method including the steps of applying a voice recognition algorithm to a first user speech input so as to detect known words and residual undetected words; and detecting at least one undetected vowel from the residual undetected words by applying a user-fitted vowel recognition algorithm to vowels from the known words so as to accurately detect the vowels in the undetected words in the speech input, to enhance conversion of voice to text.
Owner:SHPIGEL AVRAHAM

Dialog driven personal information manager

A personal information manager including a data input device that receives an audio data stream and decodes the data stream into text. A dialog manager is provided having a record mode and a dialog mode; the dialog manager examines the decoded text received to determine whether it contains one of an explicit and an implicit data processing request. The dialog manager immediately passes explicit data processing requests and queues implicit data processing requests. An information storage / retrieval module is provided for storing and retrieving data from a database. The information storage / retrieval module executes data processing requests specified by the dialog manager. An output module is provided for converting text received from the dialog module into speech and outputting the speech in response to a data processing request. The dialog manager passes implicit processing requests to the information storage / retrieval module during periods of inactivity.
Owner:EXB ASSET MANAGEMENT
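
The explicit / implicit request handling described above amounts to "run now" versus "queue until idle". A minimal sketch follows; the class and method names are hypothetical, and how a request is classified as explicit or implicit is assumed to be decided elsewhere.

```python
from collections import deque
from typing import Callable, Deque


class DialogManager:
    """Passes explicit requests immediately and defers implicit ones."""

    def __init__(self, execute: Callable[[str], None]) -> None:
        self._execute = execute              # stands in for the storage / retrieval module
        self._pending: Deque[str] = deque()  # implicit requests waiting for idle time

    def submit(self, request: str, explicit: bool) -> None:
        if explicit:
            self._execute(request)
        else:
            self._pending.append(request)

    def on_idle(self) -> None:
        """Drain queued implicit requests in arrival order during inactivity."""
        while self._pending:
            self._execute(self._pending.popleft())
```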

Television/radio speech-to-text translating processor

A device used with a conventional television set that permits both hearing impaired and non hearing impaired individuals to view television audio voice signals in a text format on the television screen without the requirement of a closed / open caption broadcast signal or a closed / open caption enabled television. The device also can utilize an FM audio input, an audio line input, and a microphone or impedance input for conversion into a text format. The system may use on-board or remote displays, wireless or wired, for also providing the text format representative of human speech. The system includes audio filters for filtering a television audio signal and filtering out human speech audio signals for processing and conversion by a speech-to-text converter.
Owner:CHELDAN TECH

Apparatus, system and method for securing digital documents in a digital appliance

Various embodiments include an apparatus and a method to secure protected digital document content from tampering by its user, such as unauthenticated use or use violating a policy of the digital document. The digital document file can be transferred from a network node such as a web site server to a digital appliance, such as a computer, in encrypted form. The digital document file can be resident already on a device, and / or be transferred into a device that is connected to the digital appliance. The device (hereafter a DRM device) can internally store the digital document or part of the document. The DRM device may decrypt the digital document when requested to do so. The device may further format the content for usage, for example, convert text into its graphic bitmap representation. Device formatting can include sending plain text data to the digital appliance. The device may further apply degradation to the resulting file, for example, reduce the resolution of the graphic representation. The digital appliance uploads the result of the processing or sections of the result of the processing for user access via the digital appliance.
Owner:SANDISK TECH LLC

Speech to text conversion system

Disclosed is a system for automatically accessing an account and routing speech to text converted communication to a predetermined text capable device based on user identification. A user placing a call into a communications server is automatically identified based on Caller ID. The user has an account with information unique to that user that includes text capable destinations. The user transmits a voice message to the communications server which is converted to text and automatically routed to a predetermined text capable destination, such as fax machines or email addresses, for example, based on user identification.
Owner:DISCERNIX

Text-to-speech conversion with associated mood tag

A method (and associated apparatus) comprises associating a mood tag with text. The mood tag specifies a mood to be applied when the text is subsequently converted to an audio signal. In accordance with another embodiment, a method (and associated apparatus) comprises receiving text having an associated mood tag and converting the text to speech in accordance with the associated mood tag.
Owner:HEWLETT PACKARD DEV CO LP

Multimodal natural language query system and architecture for processing voice and proximity-based queries

Inactive; US20060116987A1; Inaccurate and imprecise and unreliable and training; Digital data information retrieval; Data processing applications; Database query; Transformation of text
The present invention provides a wireless natural language query system, architecture, and method for processing multimodally-originated queries, including voice and proximity-based queries. The natural language query system includes a Web-enabled device including a speech input module for receiving a voice-based query in natural language form from a user and a location / proximity module for receiving location / proximity information from a location / proximity device. The natural language query system also includes a speech conversion module for converting the voice-based query in natural language form to text in natural language form and a natural language processing module for converting the text in natural language form to text in searchable form. The natural language query system further includes a semantic engine module for converting the text in searchable form to a formal database query and a database-look-up module for using the formal database query to obtain a result related to the voice-based query in natural language form from a database.
Owner:PORTAL COMM LLC

Method and system for extracting and utilizing metadata to improve accuracy in speech to text conversions

A computer-based system and method for speech to text conversion preprocessing of a presentation with a speech audio, useable in real time. The method captures a presentation speech audio input to be converted into text, temporally associates the speech audio input with at least one supporting text source from the same presentation containing common keywords and creates an optimized and prioritized keyword positional index metadata set for inputting into a speech to text conversion processor.
Owner:IBM CORP

Human-computer interaction intelligent question answering method based on cloud platform

The invention discloses a human-computer interaction intelligent question answering method based on a cloud platform, which comprises the following steps: acquiring voice information input by a user; performing voice-to-text conversion on the voice information to obtain text information; performing word segmentation processing on the text information to obtain keyword extraction results; using a machine learning algorithm to classify the obtained keyword extraction results so as to obtain classification results; using a natural language processing algorithm to perform keyword expansion on verbs and nouns in the keyword extraction results and taking out a result with the largest similarity from the results of the keyword expansion in each of the verbs and nouns, wherein all the results form a keyword expansion sequence; and performing fuzzy matching in a local database according to the classification results and the keyword expansion sequence. The human-computer interaction intelligent question answering method based on the cloud platform in the invention can solve the technical problem of low interaction accuracy caused by inaccurate word segmentation, inaccurate keyword expansion, and inaccurate extraction of answers in the existing human-computer interaction question answering system.
Owner:CHANGSHA UNIVERSITY

Automated creation of filenames for digital image files using speech-to-text conversion

A system and method for automatically generating annotated filenames for digital image files allows users to create meaningful filenames for digital image files captured by a digital camera. After an image is captured by the digital camera, an audio annotation containing audio information is associated with the digital image file. The audio information in the audio annotation is converted to a text string using speech-to-text conversion. The text string is then associated with the digital image file as the annotated filename of the digital image file.
Owner:SIEMENS ENTERPRISE COMM GMBH & CO KG
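
Once the audio annotation has been converted to a text string, the last step is turning that string into a usable filename. The sketch below shows one plausible way to do it; the function name, the `.jpg` default, and the 40-character limit are assumptions rather than details from the patent.

```python
import re


def annotated_filename(transcript: str, extension: str = ".jpg", max_len: int = 40) -> str:
    """Turn a spoken annotation's transcript into a filesystem-safe filename."""
    # Lowercase, then collapse every run of non-alphanumeric characters to "_".
    stem = re.sub(r"[^a-z0-9]+", "_", transcript.lower()).strip("_")
    # Truncate long annotations and fall back to a placeholder if nothing is left.
    stem = stem[:max_len].rstrip("_") or "untitled"
    return stem + extension
```

For instance, the transcript `Sarah's birthday, 2004!` becomes `sarah_s_birthday_2004.jpg`.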

Text information display apparatus equipped with speech synthesis function, speech synthesis method of same, and speech synthesis program

A text information display apparatus equipped with a speech synthesis function able to clearly display a linked portion by speech and enabling easy recognition of a change from a link, provided with a controller for referring to the display rules of text to be converted to speech when converting text included in text information being displayed on a display unit to speech, controlling a speech synthesizing processing unit so as to convert the text to speech with a first voice in a case of predetermined display rules (presence of link destination, cursor position display, etc.) and convert the text to speech with a second voice having a speech quality different from that of the first voice in the case of not the predetermined display rules, and controlling the speech synthesizing processing unit so as to convert the text included in a display object to speech with a third voice when the display object linked with the link destination is selected or determined by a key operation unit.
Owner:KYOCERA CORP

Method and system for operational improvements in dispatch console systems in a multi-source environment

A method and system for operational improvements in a dispatch console in a multi-source environment includes receiving (310) a plurality of audio streams simultaneously from a plurality of mobile devices, transcribing received audio streams by the means of speech-to-text conversion, presenting real-time transcriptions to the user and determining (320) if a first keyword is present in at least one of the plurality of audio and / or text streams. Upon determining the presence of the first keyword, the dispatch console automatically performs (330) at least one predefined dispatch console operation from a list of predefined dispatch console operations. The dispatch console further receives (340) a second keyword based on determining the presence of the first keyword and checks (350) for the presence of the second keyword within the audio and / or text streams thereby enabling additional automated dispatch console operations.
Owner:MOTOROLA SOLUTIONS INC
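
The keyword check on a transcribed stream can be pictured as a lookup from keywords to console operations. This is only a sketch: the function name, the operation labels, and the restriction to single-word keywords are assumptions made for illustration.

```python
import re
from typing import Dict, List


def triggered_operations(transcript: str, keyword_ops: Dict[str, str]) -> List[str]:
    """Return the operations whose keyword occurs as a whole word in the transcript."""
    words = set(re.findall(r"[a-z0-9']+", transcript.lower()))
    return [op for keyword, op in keyword_ops.items() if keyword.lower() in words]
```

A second-stage keyword, as described in the abstract, could be handled by calling the same function again with a different keyword table once a first keyword has fired.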

Method and apparatus for converting words into animation

A method and device for converting text into corresponding animation. The method comprises the steps of: text inputting; speech synthesizing, in which the input text is synthesized to obtain a corresponding audio file; video synthesizing, which includes spelling analysis and mouth shape synthesis, namely performing phonetic analysis on the text to obtain the corresponding spellings, extracting images corresponding to the spellings from a preset database of mouth shapes, and synthesizing the images into a video file; and animation synthesizing, in which the audio file and video file are combined into an animation. The device comprises a text inputting module, a speech synthesizing module, a video synthesizing module, and an animation synthesizing module. The text input by users is processed by the modules, thereby obtaining animation in accordance with the text and achieving the conversion of text to animation.
Owner:李嘉辉

Conversion of text relating to media content and media extension apps

A first messaging app, in one embodiment, receives text and detects that the text includes a URL that refers to audio or audiovisual media content in a catalog of media content, and the messaging app obtains metadata about the media content and transmits the metadata to a second messaging app. Both the first and the second messaging app can operate, through interprocess communication, with its own music extension app that displays a user interface within a view hosted by the respective messaging app.
Owner:MICHELIN & CO CIE GEN DES ESTAB MICHELIN +1

Grapheme to phoneme module for synthesizing speech alternately using pairs of four related data bases

PCT No. PCT/GB94/00430; Sec. 371 Date Dec. 2, 1996; Sec. 102(e) Date Dec. 2, 1996; PCT Filed Mar. 7, 1994; PCT Pub. No. WO94/23423; PCT Pub. Date Oct. 13, 1994.
Synthetic speech is generated from conventional texts and in particular by converting text in graphemes into a text in phonemes. The grapheme text is analyzed into rimes and onsets, and each word is analyzed from the end so that earlier-occurring segments are at least partially defined by the identification of later-occurring segments. It is a particular feature that an internal string of consonants, i.e., a string of consonants preceded and followed by a vowel, is split into two portions, namely, a second portion which is contained in a database of onsets, and an earlier portion which, together with the preceding vowel or vowels, is contained in a database of rimes.
Owner:BRITISH TELECOMM PLC
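
The consonant-string split described above can be illustrated with a toy onset database. In this sketch the onset list is invented, choosing the longest matching onset is an assumption, and the check that the earlier portion plus the preceding vowel appears in a rime database is omitted.

```python
from typing import Set, Tuple

# Toy onset database for illustration only; a real one would be far larger.
ONSETS: Set[str] = {"", "b", "d", "k", "l", "n", "r", "s", "t", "st", "tr", "str"}


def split_cluster(cluster: str, onsets: Set[str] = ONSETS) -> Tuple[str, str]:
    """Split an internal consonant string into (rime tail, onset).

    The onset is the longest suffix of the cluster found in the onset
    database; whatever precedes it stays with the previous vowel's rime.
    """
    for start in range(len(cluster) + 1):
        if cluster[start:] in onsets:
            return cluster[:start], cluster[start:]
    return cluster, ""
```

With this toy database, the internal cluster `nstr` is split into `n` (kept with the preceding rime) and the onset `str`.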

Systems and methods to control work progress for content transformation based on natural language processing and/or machine learning

Systems and methods are provided to compute indicators of completeness of the work output of a transformation of text-based content, worker capacity in performing the transformation, and / or the degree of matching between a unit of work and a worker, based on information collected about complexity of works, times and throughput of workers, rating of work outputs and using natural language processing techniques and machine learning techniques, such as language detection, longest common substring, length ratio, document similarity, etc. The indicators are utilized to optimize job pickup and output submission for online crowdsourcing tasks related to transformation of text-based content, such as transcription, translation, proofreading, etc.
Owner:GENGO
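
Two of the techniques named in the abstract, longest common substring and length ratio, are easy to show concretely. The sketch below is a generic textbook implementation with hypothetical function names; how such values feed into the patent's completeness indicators is not specified here.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Classic dynamic programming, O(len(a) * len(b)) time."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def length_ratio(source: str, output: str) -> float:
    """Length of the work output relative to the source text."""
    return len(output) / len(source) if source else 0.0
```

A very low length ratio, or a long substring copied verbatim from the source into a translation, are the kind of signals such indicators might use.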

Multiple typeface, size and model displaying system and method in embedded system

The invention relates to a system and method for displaying webpage character information with multiple fonts, sizes, language types and styles in an embedded system based on digital transmission network, comprising webpage information transmitting equipment, transmission network and receiving equipment. It especially relates to the display of webpage information with multiple fonts and sizes, language types and styles in a data broadcasting system based on digital TV, and the data broadcasting system comprises webpage information broadcast transmitting equipment, digital TV transmission network and digital TV set-top box (STB). The method of the invention converts webpage characters into image data in advance to make superposition with other graphics, thus able to thoroughly solve the related problems.
Owner:广州网上新生活软件技术服务有限公司

Method and system for intuitive text-to-speech synthesis customization

A system for tuning the text-to-speech conversion process has a text-to-speech engine that converts the input text into a processed text form which includes speech features. A visual editing interface displays the processed text form using graphical indicators on an output device, allowing a user to edit the text and graphical indicators to modify the speech features of the text input.
Owner:PANASONIC CORP

Method for converting text into rap music and device thereof

The invention provides a method for converting text into rap music and a device thereof, and belongs to the technical field of electronic digital data processing. The method comprises: performing character rhythm analysis on the obtained to-be-converted text to obtain the words and characters in the text; assigning a sound attribute to each word and each character; converting each word and each character into character audio conforming to MIDI music rules by means of a preset character voice database and the sound attributes; obtaining the MIDI audio to be played; and synthesizing the MIDI audio and the character audio to generate rap music. The text can thus be output in the form of rap music, which makes the text more entertaining and improves the user experience.
Owner:VIMICRO CORP

Method of using semantic recognition for automatic coding conversion

The invention provides a method of using semantic recognition for automatic coding conversion. The method includes: collecting original coding and a sample data set of a corresponding relation described by diagnosis; preprocessing data in a diagnosis text historical library according to medical rules, and performing word segmentation; building a synonym library for data in a diagnosis text training set, and performing preprocessing and word segmentation according to medical rules; calculating a document IDF weight value corresponding to each word in the diagnosis text historical library; performing word segmentation on each diagnosis text record in the diagnosis text training set to generate a training set TF-IDF matrix formed by text conversion; performing word segmentation on a to-be-converted diagnosis text record to acquire a word vector, comparing the word vector with the training set TF-IDF matrix, and finding disease coding corresponding to a most similar result in the training set through a cosine similarity formula. By the method, coding represented by text description can be converted automatically according to diagnosis text description written for a patient by a doctor.
Owner:天津艾登科技有限公司
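
The matching step in the abstract above (TF-IDF vectors compared by cosine similarity) can be sketched in plain Python. This is a simplified illustration: the IDF values are computed from the training set itself rather than from a separate historical library, the smoothed IDF formula is one common variant chosen here, the input is assumed to be already segmented into words, the training set is assumed non-empty, and the code labels are placeholders.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tfidf_vectors(docs: List[List[str]]) -> Tuple[List[Dict[str, float]], Dict[str, float]]:
    """Build one sparse TF-IDF vector per tokenised document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    vectors = [{t: c / len(doc) * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def most_similar_code(query: List[str], training: List[Tuple[List[str], str]]) -> str:
    """Return the code attached to the training record most similar to the query."""
    vectors, idf = tfidf_vectors([tokens for tokens, _ in training])
    counts = Counter(t for t in query if t in idf)  # ignore words unseen in training
    q_vec = {t: c / len(query) * idf[t] for t, c in counts.items()}
    best = max(range(len(training)), key=lambda i: cosine(q_vec, vectors[i]))
    return training[best][1]
```

Given a training record `["type", "2", "diabetes", "mellitus"]` labelled `CODE_B`, a query such as `["diabetes", "type", "2"]` would be mapped to `CODE_B`.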

Keyword extraction method and apparatus, storage medium, and electronic apparatus

A keyword extraction method is provided. In the method, a candidate keyword set in a target text is obtained by processing circuitry of a server. An extraction degree of the candidate keyword is determined by the processing circuitry based on subject similarity and a text conversion frequency of a candidate keyword in the candidate keyword set. The subject similarity is between the candidate keyword and the target text. The extraction degree indicates a probability at which the candidate keyword used as a keyword matching the target text is extracted. The keyword is extracted by the processing circuitry from the candidate keyword set according to the extraction degree.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Method for generating guided text abstract based on Transformer

The invention relates to a method for generating a guided text abstract based on Transformer, and belongs to the technical field of information processing. According to the invention, a deep learning algorithm and a machine learning algorithm are combined to solve the problem of automatically obtaining the text abstract under the big data condition. Firstly, a text key semantic feature extraction method is constructed, and key semantic features of a text are obtained through the method; the method also includes, secondly, converting the long text into a key short text in combination with an extraction type abstract method to serve as input of an abstract model; and finally, constructing a text abstract generation model based on Transformer by utilizing the extracted text key semantic features. In the abstract generation model, the attention mechanism is corrected by using the key semantic features, so that the generation model can generate more abstract contents rich in key information, and a pointer and a coverage mechanism are added, so that the abstract generation model can better solve the OOV problem and the repeated fragment generation problem in the abstract generation process.
Owner:BEIJING UNIV OF TECH