14716 results about how to "Improve recognition accuracy" patented technology

User interaction with voice information services

An iterative process is provided for interacting with a voice information service. Such a service may permit, for example, a user to search one or more databases and may provide one or more search results to the user. Such a service may be suitable, for example, for searching for a desired entity or object within the database(s) using speech as an input and navigational tool. Applications of such a service may include, for instance, speech-enabled searching services such as a directory assistance service or any other service or application involving a search of information. In one example implementation, an automatic speech recognition (ASR) system is provided that performs a speech recognition and database search in an iterative fashion. With each iteration, feedback may be provided to the user presenting potentially relevant results. In one specific ASR system, a user desiring to locate information relating to a particular entity or object provides an utterance to the ASR. Upon receiving the utterance, the ASR determines a recognition set of potentially relevant search results related to the utterance and presents to the user recognition set information in an interface of the ASR. The recognition set information includes, for instance, reference information stored internally at the ASR for a plurality of potentially relevant recognition results. The recognition set information may be used as input to the ASR providing a feedback mechanism. In one example implementation, the recognition set information may be used to determine a restricted grammar for performing a further recognition.
Owner:MICROSOFT TECH LICENSING LLC

Global speech user interface

A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.
Owner:PROMPTU SYST CORP

System and methods for improving accuracy of speech recognition

The invention provides a system and method for improving speech recognition. A computer software system is provided for implementing the system and method. A user of the computer software system may speak to the system directly and the system may respond, in spoken language, with an appropriate response. Grammar rules may be generated automatically from sample utterances when implementing the system for a particular application. Dynamic grammar rules may also be generated during interaction between the user and the system. In addition to arranging searching order of grammar files based on a predetermined hierarchy, a dynamically generated searching order based on history of contexts of a single conversation may be provided for further improved speech recognition. Dialogue between the system and the user of the system may be recorded and extracted for use by a speech recognition engine to refine or create language models so that accuracy of speech recognition relevant to a particular knowledge area may be improved.
Owner:INAGO CORP

Stroke segmentation for template-based cursive handwriting recognition

Inactive | US20050100214A1 | Improve match rate | Improves East Asian cursive handwriting recognition accuracy | Character and pattern recognition | Character recognition | Handwriting recognition
Ink strokes of cursive writing are segmented to make the cursive writing more like print writing, particularly with respect to the number of strokes of a character. A stroke-segmentation module first finds the local extrema points on a stroke of input ink. Then the local extrema points are stepped through, two (or three) at a time. The stroke-segmentation module may compare the three (or four) ink segments that are adjacent to the two (or three) local extrema points to a set of predefined stroke-segmentation patterns to find a closest matching pattern. Strokes are then segmented based on a stroke-segmentation rule that corresponds to the closest matching pattern. Additional stroke segmentation may be performed based on the change of curvature of the segmented ink strokes. Then, a character-recognition module performs character recognition processing by comparing the segmented ink strokes to prototype samples at least some of which have been similarly segmented.
Owner:MICROSOFT TECH LICENSING LLC
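
A minimal sketch of the first step in the abstract above: finding local extrema on an ink stroke and cutting the stroke there. The function names, the use of the y coordinate only, and the choice to cut at every extremum are assumptions for illustration. The abstract selects cuts by matching neighbouring ink segments against predefined stroke-segmentation patterns, which is not modeled here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def local_y_extrema(stroke: List[Point]) -> List[int]:
    """Return indices of interior points where y is a strict local min or max."""
    indices = []
    for i in range(1, len(stroke) - 1):
        prev_y, y, next_y = stroke[i - 1][1], stroke[i][1], stroke[i + 1][1]
        if (y > prev_y and y > next_y) or (y < prev_y and y < next_y):
            indices.append(i)
    return indices


def split_at(stroke: List[Point], cuts: List[int]) -> List[List[Point]]:
    """Split a stroke into segments; adjacent segments share their cut point."""
    segments, start = [], 0
    for cut in cuts:
        segments.append(stroke[start:cut + 1])
        start = cut
    segments.append(stroke[start:])
    return segments


# Hypothetical zig-zag stroke with three interior extrema.
stroke = [(0, 0), (1, 2), (2, 0), (3, 3), (4, 1)]
cuts = local_y_extrema(stroke)
segments = split_at(stroke, cuts)
```

Sharing the cut point between neighbouring segments keeps each segment a connected piece of ink, which is what a later template comparison would need.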

Method and apparatus for performing speech recognition utilizing a supplementary lexicon of frequently used orthographies

The invention relates to a method and an apparatus for recognising speech, more particularly to a speech recognition system and method utilising a speech recognition dictionary supplemented by a lexicon containing frequently occurring word sequences (orthographies). In typical speech recognition systems, the process of speech recognition consists of scanning the vocabulary database or dictionary by using a fast match algorithm to find the top N candidates that potentially match the input speech. In a second pass the N candidates are re-scored using more precise likelihood computations. The novel method comprises the introduction of a step in the search stage that consists of forcing the insertion in the list of N candidates entries selected from a lexicon containing frequently used orthographies to increase the probability of occurrence of certain text combinations.
Owner:RPX CLEARINGHOUSE
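
The two-pass search with forced insertion described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `n_best_with_lexicon`, the score dictionaries, and the rule that every lexicon entry is inserted are all hypothetical. The abstract only says that entries "selected from" the lexicon are inserted.

```python
from typing import Callable, Dict, List


def n_best_with_lexicon(
    fast_scores: Dict[str, float],
    frequent_lexicon: List[str],
    rescore: Callable[[str], float],
    n: int,
) -> List[str]:
    """Fast match, forced insertion of frequent orthographies, precise rescoring."""
    # Pass 1: keep the top-N dictionary entries by the cheap fast-match score.
    candidates = sorted(fast_scores, key=fast_scores.get, reverse=True)[:n]
    # Forced insertion: add frequent orthographies even if the fast match dropped them.
    for entry in frequent_lexicon:
        if entry not in candidates:
            candidates.append(entry)
    # Pass 2: re-rank every candidate with the more precise likelihood.
    return sorted(candidates, key=rescore, reverse=True)


# Hypothetical scores: the fast match ranks "main st" last, the precise pass ranks it first.
fast = {"main street": 0.9, "maine street": 0.8, "men street": 0.2, "main st": 0.1}
precise = {"main street": 0.6, "maine street": 0.3, "men street": 0.1, "main st": 0.7}
ranked = n_best_with_lexicon(fast, ["main st"], precise.get, n=2)
```

Without the forced insertion, "main st" would never reach the second pass, because the fast match places it outside the top two.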

System and method for improving text input in a shorthand-on-keyboard interface

A word pattern recognition system improves text input entered via a shorthand-on-keyboard interface. A core lexicon comprises commonly used words in a language; an extended lexicon comprises words not included in the core lexicon. The system only directly outputs words from the core lexicon. Candidate words from the extended lexicon can be outputted and simultaneously admitted to the core lexicon upon user selection. A concatenation module enables a user to input parts of a long word separately. A compound word module combines two common shorter words whose concatenation forms a long word.
Owner:NUANCE COMM INC
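
The core/extended lexicon behaviour described above can be sketched with two sets. The class and method names are hypothetical, and the recognizer that produces the ranked candidate list is assumed to exist elsewhere. Only the promotion rule (an extended word joins the core lexicon when the user selects it) comes from the abstract.

```python
from typing import Iterable, List, Optional, Tuple


class TwoTierLexicon:
    """Core words are output directly; extended words need a user selection."""

    def __init__(self, core: Iterable[str], extended: Iterable[str]) -> None:
        self.core = set(core)
        self.extended = set(extended) - self.core

    def recognize(self, ranked: List[str]) -> Tuple[Optional[str], List[str]]:
        """Return (best core word or None, extended words offered as suggestions)."""
        direct = next((w for w in ranked if w in self.core), None)
        suggestions = [w for w in ranked if w in self.extended]
        return direct, suggestions

    def select(self, word: str) -> str:
        """The user picked a suggestion: output it and admit it to the core lexicon."""
        if word not in self.extended:
            raise KeyError(word)
        self.extended.remove(word)
        self.core.add(word)
        return word


lexicon = TwoTierLexicon(core={"the", "then"}, extended={"thane"})
first = lexicon.recognize(["thane", "then", "the"])
lexicon.select("thane")
second = lexicon.recognize(["thane", "then", "the"])
```

After the selection, the same ranked candidates yield "thane" as the direct output, which mirrors the simultaneous admission to the core lexicon.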

Speech recognition and control system, program product, and related methods

A speech activated control system for controlling aerial vehicle components, program product, and associated methods are provided. The system can include a host processor adapted to develop speech recognition models and to provide speech command recognition. The host processor can be positioned in communication with a database for storing and retrieving speech recognition models. The system can include an avionic computer in communication with the host processor and adapted to provide command function management, a display and control processor in communication with the avionic computer adapted to provide a user interface between a user and the avionic computer, and a data interface positioned in communication with the avionic computer and the host processor provided to divorce speech command recognition functionality from vehicle or aircraft-related speech-command functionality. The system can also include speech actuated command program product at least partially stored in the memory of the host processor and adapted to provide the speech recognition model training and speech recognition model recognition functionality.
Owner:LOCKHEED MARTIN CORP

Gesture-based information and command entry for motor vehicle

A method of receiving input from a user includes providing a surface within reach of a hand of the user. A plurality of locations on the surface that are touched by the user are sensed. An alphanumeric character having a shape most similar to the plurality of touched locations on the surface is determined. The user is audibly or visually informed of the alphanumeric character and / or a word in which the alphanumeric character is included. Feedback is received from the user regarding whether the alphanumeric character and / or word is an alphanumeric character and / or word that the user intended to be determined in the determining step.
Owner:ROBERT BOSCH GMBH

Method of setting personal wake-up word by text for voice control

The present invention provides a method of setting a personal wake-up word by text for voice control, which enables an electronic device to execute the steps of: activating a wake-up-word set program; receiving a set message transmitted from an input unit; parsing first text information that is contained in the set message and includes at least one character; storing the at least one character as a personal wake-up word; and setting the personal wake-up word as a voice command for activating the voice control program when the at least one character is determined to exist in the content of a voice database. Thus, a user can customize the personal wake-up word simply and quickly by inputting the text through the input unit to generate the set message, and can later speak the personal wake-up word to activate the voice control program for voice control of the electronic device.
Owner:OPAH INTELLIGENCE LTD

Intelligent safety monitoring system and method based on multilevel filtering face recognition

The invention discloses a method based on multilevel filtering face recognition. The method comprises the following steps: collecting a face image of the person being checked through an image collection system on a user terminal; automatically detecting and segmenting the exact position of the face in the collected face image with a face detection and positioning system, and providing intelligent indication and real-time image quality monitoring during face image collection through an automatic, real-time face image quality detection system; extracting characteristic points from the face image of the user terminal according to an image quality detection threshold value, and generating corresponding target face templates; and, on a background server, comparing in real time the face to be recognized that is detected by a client against a known face database using a multilevel filter searching algorithm, finding the face template with the highest matching score, judging it against a preset system threshold value, and determining the identity of the photographed person in real time. The invention also provides an intelligent identity recognition and safety monitoring system based on a multilevel face filtering and searching technology with high reliability and flexibility.
Owner:CHANGZHOU RUICHI ELECTRONICS TECH

Movement human abnormal behavior identification method based on template matching

The invention relates to a method for identifying abnormal behavior of moving humans based on template matching, which mainly comprises the steps of video image acquisition and behavior characteristic extraction. The method is a pattern recognition technology based on statistical learning from samples. Human movement is analyzed and interpreted using computer vision; behavior identification is carried out directly from geometric calculation on the movement region, followed by recording and alarming. Gaussian filtering and neighborhood denoising are combined to remove noise, which improves the independent analysis and intelligent monitoring capacity of an intelligent monitoring system, achieves higher identification accuracy for abnormal behaviors, effectively removes complex background and noise from the acquired image, and improves the efficiency and robustness of the detection algorithm. The invention has simple modeling and a simple algorithm, detects accurately, can be widely applied in places such as banks and museums, and helps improve the safety monitoring level of public places.
Owner:XIDIAN UNIV

Method of implementing digital payments

Active | US20030126078A1 | Eliminate, or alleviate, the drawbacks and deficiencies | Finance | Computer security arrangements | Payment | Terminal equipment
A method for transferring a digital payment order from a first terminal device to a second terminal device, and for saving the payment order on a payment order server. A payment order request is sent from the first terminal device to the payment order server and, in response, the requested payment order is sent from the payment order server to the first terminal device. The payment order is thereafter transmitted from the first terminal device to the second terminal device, and the payment order is then transferred from the second terminal device to the payment order server to be honored. A message confirming the honoring of the payment order is sent to the second terminal device.
Owner:MIND FUSION LLC

Word boundary probability estimating, probabilistic language model building, kana-kanji converting, and unknown word model building

A word n-gram probability is calculated with high accuracy in a situation where a first corpus, which is a relatively small corpus containing manually segmented word information, and a second corpus, which is a relatively large corpus, are given as a training corpus, i.e. storage containing vast quantities of sample sentences. Vocabulary including contextual information is expanded from words occurring in the relatively small first corpus to words occurring in the relatively large second corpus by using a word n-gram probability estimated from an unknown word model and the raw corpus. The first corpus (word-segmented) is used for calculating n-grams and the probability that the boundary between two adjacent characters is a boundary between two words (segmentation probability). The second corpus (word-unsegmented), in which probabilistic word boundaries are assigned based on information in the first corpus (word-segmented), is used for calculating word n-grams.
Owner:IBM CORP
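
One way to count words in a corpus whose word boundaries are only probabilistic is to weight each candidate word by the probability that it is delimited exactly as a word. The sketch below illustrates that idea for unigrams. The formula (boundary before, no boundary inside, boundary after), the function name, and the `max_len` cap are assumptions for illustration. The abstract does not give the exact computation.

```python
from collections import defaultdict
from typing import Dict, List


def expected_word_counts(
    text: str, boundary_prob: List[float], max_len: int = 4
) -> Dict[str, float]:
    """Expected unigram counts, where boundary_prob[i] is the probability of a
    word boundary between text[i] and text[i + 1]."""
    if len(boundary_prob) != len(text) - 1:
        raise ValueError("need one boundary probability per adjacent character pair")
    # p[k] is the boundary probability just before text[k]; both ends are certain.
    p = [1.0] + list(boundary_prob) + [1.0]
    counts: Dict[str, float] = defaultdict(float)
    for start in range(len(text)):
        inside = 1.0  # probability that no boundary falls inside the candidate
        for end in range(start + 1, min(start + max_len, len(text)) + 1):
            # Candidate text[start:end] needs a boundary at both of its edges.
            counts[text[start:end]] += p[start] * inside * p[end]
            inside *= 1.0 - p[end]
    return dict(counts)


# Hypothetical input: "a|b" is a boundary with probability 0.5, "b|c" is certain.
counts = expected_word_counts("abc", [0.5, 1.0])
```

A quick sanity check on this scheme: the expected counts sum to one plus the sum of the interior boundary probabilities, which is the expected number of words in the sentence.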

Complex character recognition method based on deep learning

The invention relates to the field of image recognition, and especially to a complex character recognition method based on deep learning. Based on an analysis of character complexity, training samples generated by a random sample generator, which contains a noise model and a distortion characteristic model of the to-be-recognized image, are used to train a deep neural network. The training samples include complex noise and distortion, and can meet the demands of recognizing various types of complex characters. A small number of manually annotated first training sample sets and a large number of randomly generated second training sample sets are mixed and then input to the deep neural network, thereby solving the problem that character recognition with a deep neural network needs a large number of manually annotated training samples. Moreover, a state-of-the-art deep neural network learns automatically while the noise and distortion of the to-be-recognized image are retained, thereby avoiding the information loss caused by noise reduction in a conventional OCR method, and improving the recognition accuracy.
Owner:成都数联铭品科技有限公司

Rolling bearing fault diagnosis method based on variation mode decomposition and permutation entropy

The invention relates to a rolling bearing fault diagnosis method based on variational mode decomposition (VMD) and permutation entropy. Vibration signals are decomposed with the variational mode decomposition method, so that reactive components and mode aliasing are effectively reduced, each mode component includes characteristic information of the original signal at a different time scale, and effective multi-scale components are provided for subsequent signal characteristic extraction. Because permutation entropy is simple to calculate and highly resistant to noise, it is used to extract bearing fault characteristics from every mode component, giving a multi-scale view of the signal. Compared with a single permutation entropy analysis of rolling bearing vibration, multi-scale permutation entropy feature extraction represents the characteristic information of the signals more comprehensively, improves the recognition accuracy of a support vector machine, and achieves better fault diagnosis of rolling bearings.
Owner:SHANGHAI UNIVERSITY OF ELECTRIC POWER
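
For readers unfamiliar with the feature used above, here is a minimal sketch of normalized permutation entropy for a single signal. The function name and the default `order` and `delay` are assumptions. The mode decomposition step and the support vector machine are not shown; in the described method this feature would be computed once per mode component.

```python
import math
from collections import Counter
from typing import Sequence


def permutation_entropy(x: Sequence[float], order: int = 3, delay: int = 1) -> float:
    """Normalized permutation entropy in [0, 1], based on ordinal patterns."""
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for i in range(n):
        window = [x[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    # Shannon entropy of the pattern distribution.
    h = sum(-(c / n) * math.log(c / n) for c in patterns.values())
    # Divide by log(order!) so a uniform pattern distribution gives 1.
    return h / math.log(math.factorial(order))


# Hypothetical samples: a monotonic ramp has one pattern, so its entropy is zero.
pe_ramp = permutation_entropy([1, 2, 3, 4, 5, 6])
pe_mixed = permutation_entropy([4, 7, 9, 10, 6, 11, 3])
```

Low values indicate a regular signal and values near 1 indicate an irregular one, which is why the per-mode entropies can serve as a compact feature vector.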

Monosyllabic language lip-reading recognition system based on vision character

This system reads the lip movements of a person in a video to recognize what is being said. Its aim is to recognize the lip language of single-syllable words (SSW), e.g. in Chinese, using video information only. The invention includes a video decoding module, a lip locating module, a lip movement segmentation module, a feature extraction module, a language material warehouse (LMW), a model building module and a lip language recognition module. The LMW has rich content and is easy to expand. The invention processes only video images and needs no audio data. It can process video files such as avi, wmv, rmvb and mpg, so it can recognize spoken content under soundless conditions. The lip movement segmentation in this invention segments SSWs automatically. Compared with fixed-length time segmentation or manual segmentation, this method is more practical and greatly raises the recognition accuracy.
Owner:HUAZHONG UNIV OF SCI & TECH

Method for detecting and representing one or more objects, for example teeth

A method for detecting and representing one or more objects, such as teeth, their preparations and their immediate environment, using a camera. A first recording is made wherein a still image is produced. The still image is blended into a current, mobile search image in at least one sub-area in the second step, so that both images are recognizable. In the third step, the camera is positioned in such a way that the search image overlaps the blended-in still image in at least one sub-area. The second recording process is initiated in a fourth step.
Owner:SIRONA DENTAL SYSTEMS

Extended videolens media engine for audio recognition

A system, method, and computer program product for automatically analyzing multimedia data audio content are disclosed. Embodiments receive multimedia data, detect portions having specified audio features, and output a corresponding subset of the multimedia data and generated metadata. Audio content features including voices, non-voice sounds, and closed captioning, from downloaded or streaming movies or video clips are identified as a human probably would do, but in essentially real time. Particular speakers and the most meaningful content sounds and words and corresponding time-stamps are recognized via database comparison, and may be presented in order of match probability. Embodiments responsively pre-fetch related data, recognize locations, and provide related advertisements. The content features may be also sent to search engines so that further related content may be identified. User feedback and verification may improve the embodiments over time.
Owner:SONY CORP

User online authentication method and system based on living body detection and face recognition

The invention discloses a user online authentication method and system based on living body detection and face recognition. The method includes the user online registration step and the user online authentication step. The user online authentication step comprises the living body detection step, the image processing step, the feature value extraction step, the face comparison step and the result processing step. In the living body detection step, whether an authenticated user is a living body or not is determined and a face picture is acquired. In the image processing step, the collected face picture is processed. In the feature value extraction step, face part features of the processed face picture are extracted. In the face comparison step, extracted feature data of the collected face image are compared with corresponding face data in a user face feature value database, a threshold value is set, and when similarity exceeds the threshold value, the acquired result through matching is output. The user online authentication method and system can avoid authentication cheating through videos including faces, safety of the system is improved, recognition time can be shortened, and recognition accuracy is improved.
Owner:SHANGHAI JUNYU DIGITAL TECH
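
The face comparison step above (compare extracted features with the stored features, accept when similarity exceeds a threshold) can be sketched as below. Cosine similarity, the 0.8 threshold, the dictionary used as the feature database, and the `is_live` flag standing in for the living body detection result are all assumptions for illustration. The abstract does not name a similarity measure or a threshold value.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate(
    user_id: str,
    probe: Sequence[float],
    database: Dict[str, Sequence[float]],
    is_live: bool,
    threshold: float = 0.8,
) -> bool:
    """Accept only a live face whose features match the enrolled user's features."""
    if not is_live:  # living body detection failed, e.g. a replayed video
        return False
    enrolled = database.get(user_id)
    if enrolled is None:  # the user never registered
        return False
    return cosine_similarity(probe, enrolled) > threshold


# Hypothetical two-dimensional features; real face features are much longer vectors.
db = {"alice": [1.0, 0.0]}
accepted = authenticate("alice", [0.9, 0.1], db, is_live=True)
```

Checking liveness before the comparison means a spoofed input is rejected without ever touching the feature database.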

Barrier identification method and system of laser radar

The invention provides a barrier identification method and system for a laser radar. The method comprises: S1) fusing original point cloud data, position data and attitude data to obtain fused point cloud data; S2) dividing the fused point cloud data into multiple fused point cloud data segments in time order, and carrying out ICP registration on the point clouds in each fused point cloud data segment to obtain superposed point cloud data; S3) clustering the point clouds of each superposed point cloud data segment to obtain candidate barriers, and extracting static information of the candidate barriers; and S4) identifying static and dynamic barriers among the candidate barriers according to their static information, and extracting dynamic information of the dynamic barriers. The method does not need a map built from offline laser radar data, is suitable for detecting barriers in different complex application environments, and offers high identification precision and high speed.
Owner:SHANGHAI ALLYNAV TECH CO LTD

Traffic flow running rate recognizing method based on bus GPS data

Inactive | CN101710449A | Overcome the problem of unsatisfactory application effect | Low cost | Detection of traffic movement | Average speed measurement | Traffic flow | State recognition
The invention discloses a traffic flow running rate recognizing method based on bus GPS data and relates to traffic information collecting and processing technology in the field of intelligent traffic. The method comprises the following steps: grading urban road sections with a GIS; determining a speed threshold value K1 and a speed threshold value K2 for each grade of road; dividing the urban roads into sub-road sections with the GIS; obtaining the average speed at which all buses pass through a sub-road section in a certain time interval, as collected by the bus vehicle-mounted GPS system; and comparing the average speed with the threshold values K1 and K2 of the sub-road section to determine the traffic flow running rate of the sub-road section. The method can markedly improve the recognition precision of the traffic flow running rate, reduce the time delay and identify the type of traffic jam at the same time, thereby giving travelers a basis for selecting a more convenient route and giving the traffic management department stronger decision support for drawing up a congestion relief scheme.
Owner:JILIN UNIV
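
The threshold comparison described above can be sketched as follows. The function name, the three state labels, the km/h unit, and the assumption that K1 is the lower threshold are illustrative. The abstract only says that the average bus speed on a sub-road section is compared with K1 and K2.

```python
from statistics import mean
from typing import Iterable


def classify_sub_section(bus_speeds_kmh: Iterable[float], k1: float, k2: float) -> str:
    """Classify one sub-road section from the mean bus speed in a time interval.

    Assumes k1 < k2; the returned labels are illustrative, not from the source.
    """
    if k1 >= k2:
        raise ValueError("expected k1 < k2")
    average = mean(bus_speeds_kmh)
    if average < k1:
        return "congested"
    if average < k2:
        return "slow"
    return "free-flowing"


# Hypothetical thresholds for one road grade; each grade would have its own pair.
state = classify_sub_section([10.0, 12.0, 14.0], k1=15.0, k2=30.0)
```

Because K1 and K2 are set per road grade, the same average speed can map to different states on an arterial road and on a side street.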

Gesture recognition method based on 3D-CNN and convolutional LSTM

The invention discloses a gesture recognition method based on 3D-CNN and convolutional LSTM. The method comprises the following steps: the length of a video input into the 3D-CNN is normalized through a time jitter policy; the normalized video is fed to the 3D-CNN to learn the short-term spatio-temporal features of a gesture; based on the short-term spatio-temporal features extracted by the 3D-CNN, the long-term spatio-temporal features of the gesture are learned through a two-layer convolutional LSTM network to eliminate the influence of complex backgrounds on gesture recognition; the dimension of the extracted long-term spatio-temporal features is reduced through a spatial pyramid pooling layer (SPP layer), and at the same time the extracted multi-scale features are fed into the fully connected layer of the network; and finally, with a late multi-modal fusion method, the prediction results of the networks are averaged and fused to obtain a final prediction score. According to the invention, the spatio-temporal features of the gesture are learned simultaneously, with short-term and long-term features combined through different networks; the network is trained with batch normalization; and the efficiency and accuracy of gesture recognition are improved.
Owner:BEIJING UNION UNIVERSITY
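
The final fusion step in the abstract above, where per-network results are averaged into one score, can be sketched in a few lines. The function name and the idea of one score list per modality network are assumptions. The networks themselves (3D-CNN, convolutional LSTM, SPP layer) are not shown.

```python
from typing import List, Tuple


def late_fusion(scores_per_network: List[List[float]]) -> Tuple[List[float], int]:
    """Average class scores across networks and return (fused scores, best class)."""
    if not scores_per_network:
        raise ValueError("need at least one network output")
    num_networks = len(scores_per_network)
    num_classes = len(scores_per_network[0])
    fused = [
        sum(scores[c] for scores in scores_per_network) / num_networks
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=fused.__getitem__)
    return fused, best


# Hypothetical outputs of two modality networks over two gesture classes.
fused, label = late_fusion([[0.6, 0.4], [0.2, 0.8]])
```

Averaging at the score level lets each network be trained separately, at the cost of ignoring any interaction between modalities.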

Named-entity recognition model training method and named-entity recognition method and device

An embodiment of the invention provides a named-entity recognition model training method and a named-entity recognition method and device. The method for training a recurrent neural network (RNN) named-entity recognition model includes: acquiring multiple labeled sample data, wherein each sample datum includes a text string and multiple term segment labeled data thereof, and each term segment labeled datum includes a segmented term separated from the text string and its named-entity attribute tag in the text string; and mapping the segmented terms in the labeled sample data to term vectors, taking the sample data as training samples, training the RNN named-entity recognition model, and learning the parameters of the RNN named-entity recognition model. With the named-entity recognition model training method and the named-entity recognition method and device, the trained model has better generalization ability, named entities in natural language texts can be recognized rapidly, and the recognition accuracy for named entities is improved.
Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Vehicle-mounted child omission reminder device and detection method thereof

The invention discloses a vehicle-mounted reminding device and a detection method for a child left in a vehicle. The device comprises a voice sensor, a touch sensor, a main control module, a voice alarm, a communication module and a data storage module, wherein the voice sensor and the touch sensor are connected with the signal input end of the main control module; the voice alarm and the communication module are connected with the signal output end of the main control module; and the data storage module is connected with the main control module. After the automobile is detected to be powered off with its door closed, the voice sensor and the touch sensor are started and perform detection by voice and by touch respectively; when at least one detection result indicates that a child is left in the automobile, the automobile sounds a voice alarm and sends information to a specified mobile phone; and the device returns to its initial state when the automobile door is detected to be opened. With this device and method, people around the automobile and specific related personnel can be effectively reminded when a child is left in the automobile.
Owner:ZHEJIANG GEELY AUTOMOBILE RES INST CO LTD +1