31 results about "Phonetic form" patented technology

In the field of linguistics, specifically in syntax, phonetic form (PF), also known as phonological form or the articulatory-perceptual (A-P) system, is a certain level of mental representation of a linguistic expression, derived from surface structure, and related to Logical Form. Phonetic form is the level of representation wherein expressions, or sentences, are assigned a phonetic representation, which is then pronounced by the speaker. Phonetic form takes surface structure as its input, and outputs an audible (or visual, in the case of sign languages), pronounced sentence.

Distributed real time speech recognition system

Inactive | US20050080625A1 | Facilitates query recognition | Accurate best response | Natural language translation | Data processing applications | Full text search | Time system
A real-time system incorporating speech recognition and linguistic processing for recognizing a spoken query by a user and distributed between client and server, is disclosed. The system accepts user's queries in the form of speech at the client where minimal processing extracts a sufficient number of acoustic speech vectors representing the utterance. These vectors are sent via a communications channel to the server where additional acoustic vectors are derived. Using Hidden Markov Models (HMMs), and appropriate grammars and dictionaries conditioned by the selections made by the user, the speech representing the user's query is fully decoded into text (or some other suitable form) at the server. This text corresponding to the user's query is then simultaneously sent to a natural language engine and a database processor where optimized SQL statements are constructed for a full-text search from a database for a recordset of several stored questions that best matches the user's query. Further processing in the natural language engine narrows the search to a single stored question. The answer corresponding to this single stored question is next retrieved from the file path and sent to the client in compressed form. At the client, the answer to the user's query is articulated to the user using a text-to-speech engine in his or her native natural language. The system requires no training and can operate in several natural languages.
Owner:NUANCE COMM INC

Speech-to-speech translation system with user-modifiable paraphrasing grammars

The present invention discloses a speech-to-speech translation device which allows one or more users to input a spoken utterance in one language, translates the utterance into one or more second languages, and outputs the translation in speech form. Additionally, the device allows for translation in both directions, recognizing inputs in the one or more second languages and translating them back into the first language. The device recognizes and translates utterances in a limited domain as in a phrase book translation system, so the translation accuracy is essentially 100%. By limiting the domain, the system increases the accuracy of the speech recognition component and thus the accuracy of the overall system. However, unlike other phrase book systems, the device also allows wide variations and paraphrasing in the input, so that the user is much more likely to find the desired phrase from the stored list of phrases. The device paraphrases the input to a basic canonical form and performs the translation on that canonical form, ignoring the non-essential variations in the surface form of the input. The device can provide visual and/or auditory feedback to confirm the recognized input, which makes the system usable for non-bilingual users with absolute confidence.
Owner:EHSANI FARZAD +2

Prosthetic hearing device that transforms a detected speech into a speech of a speech form assistive in understanding the semantic meaning in the detected speech

A speech transformation apparatus comprises a microphone (21) for detecting speech and generating a speech signal; a signal processor (22) for performing a speech recognition process using the speech signal; a speech information generator for transforming the recognition result responsive to the physical state of the user, the operating conditions, and/or the purpose for using the apparatus; and a display unit (26) and loudspeaker (25) for generating a control signal for outputting a raw recognition result and/or a transformed recognition result. In a speech transformation apparatus thus constituted, speech enunciated by a spoken-language-impaired individual can be transformed and presented to the user, and sounds from outside sources can also be transformed and presented to the user.
Owner:YUGEN KAISHA GM & M

Systems and methods for inputting graphical data into a graphical input field

A system (20) for inputting graphical data into a graphical input field includes a graphical input device (22) for inputting the graphical data into the graphical input field, and a processor-executable voice-form module (28) responsive to an initial presentation of graphical data to the graphical input device. The voice-form module (28) causes a determination of whether the inputting of the graphical data into the graphical input field is complete. A method for inputting graphical data into a graphical input field includes initiating an input of graphical data via a graphical input device into the graphical input field, and actuating a voice-form module in response to initiating the input of graphical data into the graphical input field.
Owner:IBM CORP

Synchronizing visual and speech events in a multimodal application

Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.
Owner:NUANCE COMM INC

Systems and methods for inputting graphical data into a graphical input field

A method for inputting graphical data into a graphical input field includes initiating an input of graphical data via a graphical input device into the graphical input field, and actuating a voice-form module in response to initiating the input of graphical data. Actuating the voice-form module includes actuating a first voice-form function for capturing an initial value corresponding to the graphical input field and actuating a second voice-form function based upon a final value corresponding to the graphical input field. The first voice-form function initiates a timing function for polling the graphical input field at a predefined interval to determine subsequent values corresponding to the graphical input field in order to determine whether the input of graphical data into the graphical input field is complete. The second voice-form function determines whether the final value corresponding to the graphical input field is contained within a predefined set of valid values.
Owner:IBM CORP
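
The two voice-form functions described in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `capture_until_complete`, `is_valid` and `read_field` are hypothetical, and the rule "input is complete once the value is unchanged for a few consecutive polls" is an assumption, since the abstract does not say how completion is decided.

```python
import time
from typing import Callable, Iterable, Optional


def capture_until_complete(
    read_field: Callable[[], str],
    interval_s: float = 0.5,
    stable_polls: int = 2,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """First function (sketch): capture the initial value, then poll the field
    at a fixed interval. Assumed completion rule: the value stayed unchanged
    for `stable_polls` consecutive polls. Returns None if that never happens."""
    last = read_field()  # initial value of the graphical input field
    unchanged = 0
    for _ in range(max_polls):
        sleep(interval_s)  # predefined polling interval
        current = read_field()
        if current == last:
            unchanged += 1
            if unchanged >= stable_polls:
                return current  # treated as the final value
        else:
            last, unchanged = current, 0
    return None


def is_valid(final_value: str, valid_values: Iterable[str]) -> bool:
    """Second function (sketch): check the final value against a predefined set."""
    return final_value in set(valid_values)
```

Passing `sleep` as a parameter keeps the polling loop testable without real delays; a caller would normally leave the default `time.sleep`.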

On-line touch-and-talk pen system and touch reading method thereof

Inactive | CN103236195A | Realize random reading | Practical | Electrical appliances | Data information | Line search
The invention discloses an on-line touch-and-talk pen system, which comprises a scanning module, a storage module, an on-line searching module and an output module, wherein the scanning module is used for scanning feature information of books to be read; the storage module is used for receiving and storing the feature information transmitted by the scanning module; the on-line searching module is used for realizing connection of the storage module and a network on-line high-volume database; data information matched with the feature information stored in the storage module is acquired through on-line searching; and the output module is used for outputting data information acquired by the on-line searching module, and playing the data information in a phonetic form. The on-line touch-and-talk pen system is combined with an on-line searching function, the books are scanned, and the features of the books are extracted, and through on-line searching matching, the corresponding information data are acquired, are fed back to the storage module, are output and are subjected to voice playing, so that random touch reading on any book is realized, the application range is enlarged, and the practicability is strong. The invention also provides a touch reading method of the on-line touch-and-talk pen system.
Owner:SUN YAT SEN UNIV

Advanced image semantic parsing method based on human-computer interaction

The embodiment of the invention discloses an advanced image semantic parsing method based on human-computer interaction. The advanced image semantic parsing method comprises the following steps: scanning a source image by a portable scanning device; identifying a target in the source image; filtering and parsing the content in the source image, and extracting effective knowledge; and transmitting the image content to a user in a phonetic form through a semantic organization tool. With the method disclosed by the embodiment of the invention, visually impaired people and people with limited self-study ability only need to perform a simple scanning operation, and there is no need to scan the image with a computer with the assistance of the visual system. The method can help these vulnerable groups experience a different world and can also serve as a form of entertainment in daily life; it is simple to execute and highly portable.
Owner:SUN YAT SEN UNIV

Universal special word recognition method and system based on mode expansion

The invention provides a universal special word recognition method and system based on mode expansion. New words are extracted by constructing a prefix tree from the phonetic form codes, common Chinese character syllables, common Chinese character structures and special character mapping nodes of basic words, and by performing fuzzy matching that compares character code similarities. The method can be applied to scenarios such as discovering and extracting specific words in large volumes of text, extracting and generating data sets for particular tasks, and preprocessing a given text data set, for example screening and correcting data sets of short messages and microblogs. It provides data sources and basic annotations for subsequent text classification tasks and helps with the discovery and correction of new words in text data.
Owner:NAT COMP NETWORK & INFORMATION SECURITY MANAGEMENT CENT +1
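
As a rough illustration of the prefix-tree idea in the abstract above, the following Python sketch stores words in a trie keyed by per-character codes and retrieves entries that differ from a query in at most a given number of positions. The class names, the use of pinyin syllables as codes, and the mismatch-count rule are assumptions made for the example; the abstract does not specify the actual coding scheme or similarity measure.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.word: Optional[str] = None  # set when a stored word ends here


class CodeTrie:
    """Prefix tree keyed by per-character codes (hypothetical coding scheme)."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, codes: List[str], word: str) -> None:
        node = self.root
        for code in codes:
            node = node.children.setdefault(code, TrieNode())
        node.word = word

    def fuzzy_search(self, codes: List[str], max_mismatches: int = 1) -> List[Tuple[str, int]]:
        """Return (word, mismatches) for stored words of the same length whose
        codes differ from `codes` in at most `max_mismatches` positions."""
        results: List[Tuple[str, int]] = []

        def walk(node: TrieNode, i: int, mismatches: int) -> None:
            if i == len(codes):
                if node.word is not None:
                    results.append((node.word, mismatches))
                return
            for code, child in node.children.items():
                cost = 0 if code == codes[i] else 1
                if mismatches + cost <= max_mismatches:
                    walk(child, i + 1, mismatches + cost)

        walk(self.root, 0, 0)
        return sorted(results, key=lambda r: r[1])


trie = CodeTrie()
trie.insert(["wei", "xin"], "微信")
trie.insert(["wei", "bo"], "微博")
matches = trie.fuzzy_search(["vei", "xin"], max_mismatches=1)
```

This sketch only tolerates substitutions; handling inserted or dropped characters, or graded similarity between codes, would need a richer cost function than the 0/1 comparison used here.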

Terminal device and hands-free device for hands-free automatic interpretation service, and hands-free automatic interpretation service method

A user terminal, hands-free device and method for hands-free automatic interpretation service. The user terminal includes an interpretation environment initialization unit, an interpretation intermediation unit, and an interpretation processing unit. The interpretation environment initialization unit performs pairing with a hands-free device in response to a request from the hands-free device, and initializes an interpretation environment. The interpretation intermediation unit sends interpretation results obtained by interpreting a user's voice information received from the hands-free device to a counterpart terminal, and receives interpretation results obtained by interpreting a counterpart's voice information from the counterpart terminal. The interpretation processing unit synthesizes the interpretation results of the counterpart into a voice form based on the initialized interpretation environment when the interpretation results are received from the counterpart terminal, and sends the synthesized voice information to the hands-free device.
Owner:ELECTRONICS & TELECOMM RES INST

Voice record for experiment and used system and method

The invention relates to a method for recording experiments in a voice mode, which comprises the following steps: recording the experiment conditions and the experimental discussion in the voice mode; transmitting the voice data to a processing device for recording; and simultaneously converting the voice data into text data and storing the text data in parallel to form a database.
Owner:吴凯凯

Method and system for achieving language interpretation in browser and mobile terminal

Described are a method, system and mobile terminal for realizing language interpretation through a browser, which may improve the efficiency for learning a foreign language or enable a user to interact in real time through a voice format, with another person who speaks the foreign language. The method includes the steps of receiving through a user's terminal browser language interpretation interface, expression information of an input language; translating the expression information of the input language into expression information of a target language, according to an input predetermined conversion relation between the input language and the target language; outputting the expression information of the target language through a text format; and outputting the expression information of the target language through a voice format.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Chinese word similarity detection algorithm based on pronunciation, shape and meaning

The invention provides a Chinese word similarity detection algorithm based on pronunciation, shape and meaning, which detects the overall similarity of Chinese character strings by jointly considering three characteristics of Chinese characters: pronunciation, shape and meaning. The algorithm comprises the following steps: first, converting the pinyin of each Chinese character of the Chinese character strings s1 and s2 into a corresponding phonetic code, and converting each Chinese character of s1 and s2 into a shape code; then calculating the phonetic code similarity and the shape code similarity between s1 and s2, and independently calculating the similarity of the meanings of the two strings; and finally setting contribution parameters for pronunciation, shape and meaning according to the application scene to calculate the final overall similarity of s1 and s2. The algorithm can meet complex application scenarios, can be applied to detecting the repetition degree of structured data items, especially in the case of manual input errors, and can also be applied to detecting sensitive words hidden in wrongly written characters and the like. Compared with Chinese character similarity detection algorithms of the same type, the detection effect on Chinese character string similarity is greatly enhanced.
Owner:HAINAN UNIVERSITY
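
The final combination step in the abstract above can be sketched in Python as a weighted sum of three similarity scores. Everything concrete here is an assumption for illustration: the code strings are hypothetical inputs, normalized edit distance stands in for whatever code similarity the patent uses, and the default weights are arbitrary placeholders for the "contribution parameters".

```python
from typing import Tuple


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance between two code strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def code_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] derived from normalized edit distance (assumed measure)."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def overall_similarity(
    phonetic: float,
    shape: float,
    meaning: float,
    weights: Tuple[float, float, float] = (0.4, 0.4, 0.2),  # placeholder contribution parameters
) -> float:
    """Weighted combination of the three similarities, normalized by the weight sum."""
    w_p, w_s, w_m = weights
    return (w_p * phonetic + w_s * shape + w_m * meaning) / (w_p + w_s + w_m)
```

For a task such as spotting sensitive words disguised by look-alike characters, a caller would presumably raise the shape weight; for typed-pinyin errors, the phonetic weight.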

Chatting equipment, information output method of chatting equipment, chatting system and information interactive method of chatting system

The invention discloses chatting equipment, an information output method of the chatting equipment, a chatting system and an information interactive method of the chatting system. The chatting equipment comprises a controller, a display panel and a numerical matrix library; the display panel is electrically connected with the controller; the numerical matrix library is in communication connection with the controller; the numerical matrix library comprises a plurality of numerical matrixes in one-to-one correspondence with a plurality of emotion icons; the display panel comprises a plurality of lifting parts; the chatting equipment is used for receiving the emotion icons; and the controller is used for controlling the lifting parts at the corresponding positions on the display panel to lift according to positions and values of various elements in the matrixes to be displayed, so that the received emotion icons can be displayed on the display panel. According to the invention, the emotion icons can be vividly displayed on a hardware structure; the display manner of chatting information is enriched; practical chatting scenes are presented more really; furthermore, received texts or voice can be output in a voice form; and thus, chatting information is relatively rapid and convenient to obtain.
Owner:SHANGHAI CHUANGGONG COMM TECH

Display device, text error correction method and server

The embodiment of the invention provides a display device, a text error correction method and a server, the display device comprises a display and a controller, the controller is configured to: in response to receiving a voice command input by a user, performing voice conversion on the voice command to obtain a text to be corrected; controlling a display to display the to-be-corrected text; performing error correction on the to-be-corrected text based on the confusion set with similar phonetic forms and a graph attention mechanism to obtain an initial error-corrected text, performing candidate recall on the to-be-corrected text and the initial error-corrected text, and obtaining a final error-corrected text according to a sorting result of the recalled texts; and controlling the display to refresh the to-be-corrected text into the final corrected text. According to the embodiment of the invention, the pronunciation similar knowledge graph and the shape similar knowledge graph are generated according to the confusion set corresponding to the to-be-corrected text, the pinyin and font related knowledge of Chinese characters is fused into the graph neural network, and the deep semantic information among the similar characters is extracted, so that the knowledge of similar pronunciation and shape can be effectively utilized, and the correct rate and recall rate of error detection and correction are improved.
Owner:HISENSE VISUAL TECH CO LTD

Methods and devices for request services and service resource distribution

The invention provides methods and devices for requesting services and distributing service resources, and relates to the technical field of communication. A specific embodiment includes: in response to receiving a preset instruction, performing information interaction with a user in phonetic form, the preset instruction being used to indicate that a target service is requested in a speech mode; obtaining, through the information interaction, target information used for requesting the target service; and requesting the target service based on the target information. Because users do not need to depend on visual interfaces, they can still use electronic equipment to call services normally, and the efficiency of human-computer interaction can also be enhanced.
Owner:BEIJING DIDI INFINITY TECH & DEV

Method for translation service using the cellular phone

Disclosed is a method for providing translation service using a mobile communication terminal. The method includes a button input step of pressing a voice recognition key to use a voice recognition function, a menu screen provision step of selecting a translator menu item, a translation recognition method determination step of selecting a sentence input method or a word input method, a Korean input step of inputting Korean, a confirmation step of confirming whether a completed Korean sentence matches an intended sentence, and a translated sentence output step of providing a relevant translated sentence in a text form and reproducing the relevant translated sentence in a voice form.
Owner:INFINITYTELECOM

WeChat public platform-based Chinese-Mongolian corpus crowdsourcing construction method

Active | CN110472948A | Solving the problem of collecting open-domain natural spoken corpus | Improve experience | Office automation | Resources | Spoken language | The Internet
The invention discloses a WeChat public platform-based Chinese-Mongolian corpus crowdsourcing construction method, and belongs to the field of corpus resource construction. The method comprises the following specific operation steps: 1) obtaining a multi-body cut open domain original corpus; 2) screening and filtering the users participating in the translation task through a Mongolian level test questionnaire; 3) sending a crowdsourcing translation task to a user following the WeChat official account in a subscription account pushing mode; 4) enabling each WeChat client to translate one or more source sentences into Mongolian and feed back the Mongolian to the background in a voice form; 5) evaluating the corpus quality in a manner of combining background administrator auditing and crowdsourcing quality evaluation to realize corpus quality control. The WeChat public platform-based Chinese-Mongolian corpus crowdsourcing construction method completes corpus collection online, is simple in interaction, good in user experience and high in user participation degree, effectively solves the problem of collecting open domain natural spoken language corpora in a real Mongolian language environment, and shows an extremely high practical prospect under an Internet mobile platform.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Method and system for realizing language interpretation in browser of mobile terminal

The application of the invention is a divisional application of the application 201310174193.8 in China. The invention relates to the technical field of the internet, and discloses a method and a system for realizing language interpretation in a browser, and a mobile terminal. The method comprises the following steps: receiving information input by a user through a browser language interpretation interface and expressed in an input language; according to a preset conversion relationship between the input language and a target language, translating the information expressed in the input language into information expressed in the target language; outputting the information expressed in the target language in a character form; and outputting the information expressed in the target language in a voice form. By implementing the embodiment of the invention, the foreign language learning efficiency or the foreign language interaction efficiency of the user can be improved.
Owner:深圳市雅阅科技有限公司

Leadless group discussion system

Pending | CN114881024A | Improve interview skills | Improve interview efficiency | Natural language data processing | Speech recognition | Part of speech | Speech sound
The invention discloses a leadless group discussion system, which performs the following steps: converting a captured speech in a voice form into a speech in a text form; performing word segmentation on the speech in the text form, filtering out stop words and retaining only words with specified parts of speech during segmentation, to obtain a filtered word segmentation set; constructing a vertex set from the words in the filtered word segmentation set, and constructing an edge between any two vertices of the vertex set according to their co-occurrence relationship, to obtain a candidate keyword graph of the filtered word segmentation set; calculating the weight of each vertex in the candidate keyword graph and sorting the vertex weights from largest to smallest; and selecting the N vertices with the highest weights as the keywords of the speech. By extracting the keywords, the system helps an interviewer or examiner quickly judge whether a speaker's remarks deviate from the theme, and helps improve interview skills or interview efficiency.
Owner:CENT SOUTH UNIV
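
The graph-based keyword step in the abstract above resembles TextRank-style ranking, and can be sketched in Python as follows. This is an illustrative assumption, not the patented method: the abstract does not name the weighting formula, the function names are hypothetical, and the input is taken to be a word list that has already been segmented and filtered.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_cooccurrence_graph(words: List[str], window: int = 2) -> Dict[str, Set[str]]:
    """Connect two distinct words when they occur within `window` positions of each other."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                graph[w].add(words[j])
                graph[words[j]].add(w)
    return graph


def rank_vertices(graph: Dict[str, Set[str]], damping: float = 0.85,
                  iterations: int = 50) -> Dict[str, float]:
    """PageRank-style iteration over the undirected co-occurrence graph (assumed weighting)."""
    scores = {v: 1.0 for v in graph}
    for _ in range(iterations):
        scores = {
            v: (1 - damping) + damping * sum(scores[u] / len(graph[u]) for u in graph[v])
            for v in graph
        }
    return scores


def top_keywords(words: List[str], n: int = 3, window: int = 2) -> List[str]:
    """Return the n highest-weighted words; ties are broken alphabetically."""
    scores = rank_vertices(build_cooccurrence_graph(words, window))
    return sorted(scores, key=lambda v: (-scores[v], v))[:n]


filtered = ["budget", "team", "budget", "plan", "budget", "risk"]
keywords = top_keywords(filtered, n=1, window=1)
```

In this toy input every other word co-occurs with "budget", so it receives the highest weight; comparing such keywords against the discussion topic is one way an examiner could check whether a speaker has drifted off theme.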

Elderly user use disorder reporting and solving method of mobile phone terminal

The invention relates to software engineering and data mining technologies, in particular to a method for reporting and solving usage obstacles encountered by elderly users of a mobile phone terminal, which comprises the following steps: 1, identifying an action and converting the action into a character string, with the character string processed further only when the identification result shows that a problem has been encountered; 2, searching for the character string and converting the search result into a vector; 3, performing data training by converting all known search results into vectors according to the constructed Chinese word bank and constructing a prediction model; and 4, carrying out data prediction by predicting the search result of a new character string through the prediction model, giving a solution to the problem, and returning the solution in a voice form. With this method, the elderly user never needs to perform an input operation, and all operations are based on automatic recognition of the elderly user's actions. When elderly people encounter difficulties in using the mobile phone, a solution can be obtained immediately and is returned in a voice form, which improves their experience of using a smartphone.
Owner:WUHAN UNIV

Method and device for voice recognition matching

The invention discloses a method and device for speech recognition and matching. After the character information in pinyin form converted from the voice information is determined, character information that fuzzily matches the converted character information by pinyin is selected, according to a fuzzy matching strategy, from the character information stored in a local database in the form of pinyin and Chinese characters. The single exact-match strategy adopted in the prior art is thereby extended to fuzzy matching of the converted pinyin-form character information, which effectively increases the speech recognition rate of the converted character information and thereby improves the efficiency of the speech recognition technology.
Owner:CHINA MOBILE COMM GRP CO LTD
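
A minimal Python sketch of pinyin-based fuzzy matching against a local store is shown below. The specific confusion rules (zh/z, ch/c, sh/s and ang/an, eng/en, ing/in) are commonly used fuzzy-pinyin pairs chosen here as an assumption; the abstract does not list the patent's actual matching strategy, and all function names and sample entries are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed confusion pairs: each left-hand form is folded into the right-hand form.
FUZZY_INITIALS = (("zh", "z"), ("ch", "c"), ("sh", "s"))
FUZZY_FINALS = (("ang", "an"), ("eng", "en"), ("ing", "in"))


def normalize_syllable(syllable: str) -> str:
    """Fold commonly confused initials and finals into one canonical form."""
    s = syllable.lower()
    for full, short in FUZZY_INITIALS:
        if s.startswith(full):
            s = short + s[len(full):]
            break
    for full, short in FUZZY_FINALS:
        if s.endswith(full):
            s = s[:-len(full)] + short
            break
    return s


def fuzzy_key(pinyin: str) -> Tuple[str, ...]:
    """Canonical key for a space-separated pinyin string."""
    return tuple(normalize_syllable(s) for s in pinyin.split())


def build_index(entries: Dict[str, str]) -> Dict[Tuple[str, ...], List[str]]:
    """Index stored (pinyin -> Chinese characters) entries by their fuzzy key."""
    index: Dict[Tuple[str, ...], List[str]] = {}
    for pinyin, text in entries.items():
        index.setdefault(fuzzy_key(pinyin), []).append(text)
    return index


def fuzzy_lookup(index: Dict[Tuple[str, ...], List[str]], recognized_pinyin: str) -> List[str]:
    """Return every stored entry whose fuzzy key equals that of the recognized pinyin."""
    return index.get(fuzzy_key(recognized_pinyin), [])


local_db = {"zhang san": "张三", "li si": "李四"}
index = build_index(local_db)
hits = fuzzy_lookup(index, "zan san")
```

An exact-match lookup would miss "zan san" entirely; folding both the stored and the recognized pinyin into the same canonical key is one simple way to tolerate such recognition errors.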