800 results about "Punctuation" patented technology

Punctuation (formerly sometimes called pointing) is the use of spacing, conventional signs and certain typographical devices as aids to the understanding and correct reading of written text, whether read silently or aloud. Another description is, "It is the practice, action, or system of inserting points or other small marks into texts in order to aid interpretation; division of text into sentences, clauses, etc., by means of such marks."

Keyboard System with Automatic Correction

There is disclosed an enhanced text entry system which uses word-level analysis to automatically correct inaccuracies in user keystroke entries on reduced keyboards such as those implemented on a touch-sensitive panel or display screen, or on mechanical keyboard systems. A method and system are defined which determine one or more alternate textual interpretations of each sequence of inputs detected within a designated auto-correcting keyboard region. The actual contact locations for the keystrokes may occur outside the boundaries of the specific keyboard key regions associated with the actual characters of the word interpretations proposed or offered for selection, where the distance from each contact location to each corresponding intended character may in general increase with the expected frequency of the intended word in the language or in a particular context. Likewise, in a mechanical keyboard system, the keys actuated may differ from the keys actually associated with the letters of the word interpretations. Each such sequence corresponds to a complete word, and the user can easily select the intended word from among the generated interpretations. Additionally, when the system cannot identify a sufficient number of likely word interpretation candidates of the same length as the input sequence, candidates are identified whose initial letters correspond to a likely interpretation of the input sequence. The approach utilizes the information contained in the entire sequence of keystrokes corresponding to a word in order to determine the user's likely intention for each character of the sequence. The system also accommodates punctuation characters such as hyphens or apostrophes that are commonly embedded in words such as hyphenated compounds and contractions in English, and characters with special diacritics such as are commonly found in various European languages, e.g., diacritic accent. 
Special functions may be applied depending on where the user touches the intended string in a displayed word selection list. Once the user selects the desired string, it is automatically “accepted” for output and the next input detected starts a new input sequence corresponding to the entry of a next word.
Owner:TEGIC COMM +1
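
The word-level matching described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the key coordinates (`KEY_CENTERS`), the tiny lexicon with its frequency counts (`LEXICON`), and the scoring formula in `score_word` are all assumptions made for the example.

```python
import math

# Hypothetical key centers on a unit grid (assumption for illustration).
KEY_CENTERS = {
    "q": (0, 0), "w": (1, 0), "e": (2, 0), "r": (3, 0), "t": (4, 0),
    "a": (0.5, 1), "s": (1.5, 1), "d": (2.5, 1), "f": (3.5, 1),
}

# Hypothetical lexicon mapping words to relative frequencies.
LEXICON = {"tea": 50, "rea": 1, "sea": 30, "wed": 10}


def score_word(word, taps):
    """Lower is better: summed tap-to-key distance, discounted by word frequency."""
    total = sum(math.dist(KEY_CENTERS[ch], tap) for ch, tap in zip(word, taps))
    return total / math.log(1 + LEXICON[word])


def candidates(taps):
    """Rank lexicon words of the same length as the tap sequence, best first."""
    words = [w for w in LEXICON
             if len(w) == len(taps) and all(ch in KEY_CENTERS for ch in w)]
    return sorted(words, key=lambda w: score_word(w, taps))
```

Dividing the distance by a function of frequency lets a common word tolerate taps that land farther from its keys, which is the trade-off the abstract describes; any other monotone discount would serve equally well in a sketch like this.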

Command insertion system and method for voice recognition applications

A speech recognition system and method operates a speech recognition application on a computer in a continuous dictation mode. A separate key pad is coupled to the computer through the USB port, and includes a plurality of keys for providing command signals representative of a desired computer operation command which override the dictation mode of the speech recognition application whenever such a command is operated. This allows the user to insert commands, such as punctuation, numerals, “next line”, “next paragraph”, and the like, directly from the key pad while maintaining operation of the voice recognition application in its continuous dictation mode.
Owner:LENGEN NICHOLAS D

Automated spell analysis

An apparatus, program product and method utilize automated analysis techniques to assist in the determination of acceptable usages of linguistic terms (e.g., words, phrases, acronyms, etc.). In particular, an acceptable usage of a linguistic term may be determined by scanning a plurality of documents for variants (e.g., based on differing spellings, punctuation, capitalization, meaning or definition, etc.) of the term, and tracking relative occurrences of a plurality of such variants found in the plurality of documents during scanning. By tracking occurrences of linguistic term variants, users may be able to use such statistical information to select which of the available variants represents an acceptable usage, or even a most acceptable usage, of a term. Scanned documents may be retrieved from the Internet, and scanning may occur while a user is browsing the Internet.
Owner:IBM CORP
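
As a rough illustration of tracking relative occurrences of term variants, the sketch below counts surface forms that differ only in capitalization or punctuation. The normalization rule and the function names (`normalize`, `count_variants`) are assumptions for the example; variants that differ in spelling or meaning are outside its scope.

```python
import re
from collections import Counter


def normalize(term):
    """Collapse case and punctuation so that variants share one key."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def count_variants(documents, term):
    """Count each surface form whose normalized form matches that of `term`."""
    key = normalize(term)
    counts = Counter()
    for doc in documents:
        for raw in doc.split():
            # Strip sentence punctuation around the word, keep internal marks.
            form = raw.strip(".,;:!?\"'()")
            if form and normalize(form) == key:
                counts[form] += 1
    return counts
```

A user could then inspect `count_variants(docs, "email").most_common()` to see which variant dominates in the scanned documents.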

Speech recognition using selectable recognition modes

Inactive | US20050049880A1 | Speech recognition | Sound input/output | Vocabulary speech recognition | Speech sound
The present invention relates to speech recognition using selectable recognition modes. This includes innovations such as: large vocabulary speech recognition programming that supplies recognized words to external program as they are recognized, and allows a user to select between large vocabulary recognition of an utterance with and without language context from the prior utterance independently of state of the external program; allowing a user to select between continuous and discrete speech recognition that use substantially the same vocabulary; allowing a user to select between continuous and discrete large-vocabulary speech recognition modes; allowing a user to select between at least two different alphabetic entry speech recognition modes; and allowing a user to select from among four or more of the following recognitions modes when creating text: a large-vocabulary mode, an alphabetic entry mode, a number entry mode, and a punctuation entry mode.
Owner:CERENCE OPERATING CO

System and method for automatic speech to text conversion

Speech recognition is performed in near-real-time and improved by exploiting events and event sequences, employing machine learning techniques including boosted classifiers, ensembles, detectors and cascades and using perceptual clusters. Speech recognition is also improved using tandem processing. An automatic punctuator injects punctuation into recognized text streams.
Owner:SCTI HLDG INC

Method for automatically punctuating a speech utterance in a continuous speech recognition system

Inactive | US6067514A | Natural language data processing | Speech recognition | Spoken language | Continuous speech recognition system
In a speech recognition system which recognizes a spoken utterance consisting of a sequence of spoken words and, in response, outputs a sequence of decoded words, a method for automatically punctuating the sequence of decoded words is provided. The method includes the step of, in a vocabulary of items including words, silences, and punctuation marks, assigning at least one baseform to each punctuation mark corresponding to one of silence and a non-word noise. Additionally, the method includes the step of automatically inserting a subject punctuation mark at a given point in the sequence of decoded words when an acoustic score and a language model score associated with the subject punctuation mark produce a higher combined likelihood than the acoustic score and the language model score associated with any other item in the vocabulary for the given point in the sequence of decoded words.
Owner:IBM CORP
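
The insertion rule in this abstract amounts to an argmax over combined scores. The sketch below assumes the scores are log-likelihoods that can simply be added; the score values in the usage note, the `<sil>` symbol, and the function names are hypothetical.

```python
PUNCTUATION = {",", ".", "?"}


def best_item(acoustic_logp, lm_logp):
    """Return the vocabulary item with the highest combined log-likelihood."""
    return max(acoustic_logp, key=lambda item: acoustic_logp[item] + lm_logp[item])


def punctuation_at(acoustic_logp, lm_logp):
    """Return the punctuation mark to insert at this point, or None.

    A mark is inserted only when it beats every other vocabulary item
    (words and silence included) on the combined score.
    """
    item = best_item(acoustic_logp, lm_logp)
    return item if item in PUNCTUATION else None
```

Because punctuation baseforms correspond to silence or non-word noise, a pause gives the marks and the silence item similar acoustic scores, so the language model score usually decides between them. For example, with acoustic scores `{"<sil>": -1.0, ",": -1.2, ".": -1.1, "and": -6.0}` and language model scores `{"<sil>": -3.0, ",": -2.5, ".": -1.0, "and": -1.5}`, the period wins.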

Method and system for compression indexing and efficient proximity search of text data

A system and method of compression indexing and efficient proximity search of text data permits high speed search featuring ranking the relevance of search results according to closeness of desired terms within each portion of text found. The system includes (a) preparing target text, (b) creating a “compression index ebook”, (c) browsing in a compression index ebook, and (d) searching in a compression index ebook. To create the compression index, the method includes the steps of selecting target text, identifying tokens, such as words and punctuation strings, wherein each of the tokens has a frequency. The frequencies of each token are counted. Tokens are ranked from highest frequency to lowest frequency. The frequencies are compressed. The next step is assigning positions to each token frequency and compressing the positions to form a compression index ebook, which is stored in random access memory to eliminate disk seeks during browsing and searching.
Owner:MARPEX
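
The index-building steps (identify tokens, count frequencies, rank, record positions) can be sketched as below. This is only an approximation under stated assumptions: the regular expression used for tokenizing, the tie-breaking order, and the `min_distance` proximity measure are choices made for the example, and the compression steps are omitted.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into word tokens and punctuation-string tokens."""
    return re.findall(r"\w+|[^\w\s]+", text)


def build_index(text):
    """Return (tokens ranked by descending frequency, positions per token)."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    # Highest frequency first; ties broken alphabetically for determinism.
    ranked = [tok for tok, _ in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))]
    positions = {}
    for pos, tok in enumerate(tokens):
        positions.setdefault(tok, []).append(pos)
    return ranked, positions


def min_distance(positions, a, b):
    """Smallest token distance between any occurrence of a and any of b."""
    return min(abs(p - q) for p in positions[a] for q in positions[b])
```

Keeping the position lists in memory is what allows a proximity query such as `min_distance(positions, "cat", "dog")` to run without touching the original text.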

Optical reader comprising keyboard

The present invention is directed toward a space saving design for an optical reader with keyboard and display wherein the reader is configured so that a reader trigger provides function in addition to the actuation of scanning. When an alphanumeric key is depressed the trigger is pulled and released to cycle through and select the available characters assigned to that particular key. The current character selection is shown at the cursor location on the display. When the key is released the current character selection is accepted and the cursor advances. This arrangement provides efficient, low cost access to alpha and punctuation-characters by reducing the number of keys and keystrokes required to enter data. The trigger may also be configured to operate in other user defined modes such as macro initiation or display scrolling.
Owner:HAND HELD PRODS

Interface for processing of an alternate symbol in a computer device

Described herein is a computer-implemented system and method for processing one or more alternate symbols associated or linked to a base symbol. A base symbol is a symbol to which at least one alternate symbol is linked. A base symbol commonly appears on a key of a keyboard or a display of a keyboard, or is a handwritten symbol recognized by handwriting entry software. Examples of alternate symbols are accented characters and punctuation marks that do not appear on a keyboard or are not commonly recognized by a handwriting entry program, and short symbol sequences. An example of a common short symbol sequence is an emoticon used in e-mail messages to convey tone or feelings. An example of a computer device that may embody the system or method is a hand-held computing device.
Owner:QUALCOMM INC

Intelligent error correcting system and method in network searching process

Inactive | CN101206673A | Meets preferences | Solve the problem of pinyin error correction | Special data processing applications | Linguistic model | Algorithm
The invention relates to an intelligent keyword error correction system and method for network searching. On an Internet platform, a linguistic model, a corresponding dictionary, and a data index database are first established by training on related data. Second, when text is input, a Pinyin error correction component computes Pinyin and character errors, with character error correction performed by fuzzy matching. Finally, all results are filtered by degree of association and sorted to obtain the closest results. Polyphone errors, character-type errors, and word-type errors in user input are corrected by means of sound-to-character conversion and fuzzy error correction, which handle character substitutions, superfluous or missing characters, character position errors, and the like. The basic functions are further extended with mixed English-Chinese and punctuation error correction, fuzzy matching, related-query prompting, and enhanced intelligent error correction.
Owner:北京当当网信息技术有限公司

Method and system for realizing automatic addition of punctuation marks in speech recognition

Active | CN102231278A | Implement automatic addition | Automatically add simple and efficient | Speech recognition | Speech sound | Computer science
The invention relates to the technical field of speech recognition and discloses a method and system for realizing automatic addition of punctuation marks in the speech recognition. The method comprises the steps of: collecting user speech signals; carrying out the speech recognition on the user speech signals so as to generate a character sequence containing a plurality of sentences; sequentially calculating duration of pause positions between the sentences in the character sequence; if the duration is less than a preset threshold value, adding commas at the pause positions; and if the duration is greater than or equal to the preset threshold value, confirming the mood types of the sentences in front of the pause positions by utilizing a pre-generated classifier and adding punctuation marks at the pause positions according to the types. By utilizing the method and system which are provided by the invention, the automatic addition of the punctuation marks can be simply and conveniently realized and the accuracy and the flexibility of adding the punctuation marks are increased.
Owner:IFLYTEK CO LTD
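
The threshold rule in this abstract is simple enough to sketch directly. The 0.6 second threshold, the mood labels, and the keyword-based `toy_mood_classifier` (a stand-in for the pre-generated classifier) are all assumptions; English text is used only for readability.

```python
MOOD_TO_MARK = {"declarative": ".", "interrogative": "?", "exclamatory": "!"}

QUESTION_WORDS = {"what", "why", "how", "who", "where", "when", "is", "are", "do", "does"}


def toy_mood_classifier(segment):
    """Stand-in for the pre-trained mood classifier (assumption): keyword rules."""
    words = segment.split()
    first = words[0].lower() if words else ""
    return "interrogative" if first in QUESTION_WORDS else "declarative"


def add_punctuation(segments, pauses, threshold=0.6, classifier=toy_mood_classifier):
    """Punctuate segments, where segments[i] is followed by pauses[i] seconds.

    Short pauses get a comma; long pauses get a mark chosen from the
    classified mood of the preceding segment.
    """
    out = []
    for segment, pause in zip(segments, pauses):
        if pause < threshold:
            mark = ","
        else:
            mark = MOOD_TO_MARK[classifier(segment)]
        out.append(segment + mark)
    return " ".join(out)
```

In a real system the pause durations would come from the recognizer's time alignment, and the classifier would be trained on labeled sentences rather than written by hand.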

Three-folded webpage text content recognition and filtering method based on the Chinese punctuation

A three-fold method for recognizing and filtering the text content of Chinese web pages based on punctuation. To address the low filtering rates of existing URL-based and keyword-based filtering methods, the invention proposes a composite method that combines URL-based filtering, keyword-based filtering, and text content filtering based on a vector space knowledge representation. URL filtering based on black and white lists and the statistical characteristics of Chinese punctuation are applied to effectively remove navigation information, related links, advertising links, copyright information, and other web page noise in order to extract the body text. The text is then represented with a vector space model, and the cosine of the angle between the text vector and the feature vector of an unhealthy-information template is calculated and compared with a set threshold to classify the text. The invention can be widely used for filtering undesirable network information and for personalized website information services.
Owner:DALIAN UNIV OF TECH
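
Two of the steps in this abstract, punctuation-based noise removal and cosine comparison against a template, can be sketched as follows. The density cutoff, the similarity threshold, the list of punctuation marks, and the template contents are all hypothetical values chosen for the example.

```python
import math
from collections import Counter

CHINESE_MARKS = "，。！？；：、"


def punctuation_density(line, marks=CHINESE_MARKS):
    """Fraction of characters in the line that are Chinese punctuation marks."""
    return sum(ch in marks for ch in line) / max(len(line), 1)


def extract_body(lines, min_density=0.02):
    """Keep lines that look like running text; menus and footers have few marks."""
    return [ln for ln in lines if punctuation_density(ln) >= min_density]


def cosine(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def is_undesirable(terms, template, threshold=0.5):
    """Flag a text whose term vector is close enough to the template vector."""
    return cosine(Counter(terms), template) >= threshold
```

Navigation bars and copyright lines rarely contain sentence punctuation, which is why a density cutoff separates them from body text reasonably well in this kind of heuristic.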

Network criticism oriented viewpoint subject identifying method and system

The invention discloses a method and system for identifying viewpoint subjects in network criticism. The method comprises the following steps: a. text inputting: inputting a criticism source and all criticism texts; b. text preprocessing: carrying out word segmentation and part-of-speech tagging on the input texts, removing stop words, punctuation, and specific function words, and calculating the frequency of each word; c. subject word judgment: calculating a weight value for each word and, if the weight value is larger than a set threshold value, judging the word to be a viewpoint subject word; d. subject constructing: combining scattered viewpoint subject words into an integrated viewpoint subject; e. subject screening: confirming the effective viewpoint subjects by filtering the viewpoint subjects. The invention overcomes the domain limitation of existing viewpoint analysis methods and systems, identifies viewpoint subjects as a whole without constructing an ontology library, effectively overcomes the difficulties of single-sentence viewpoint analysis, and automatically identifies phrase-form viewpoint subjects in wide-domain network criticism data that changes dynamically over time.
Owner:THE PLA INFORMATION ENG UNIV

Speech recognition using selectable recognition modes

Inactive | US7313526B2 | Sound input/output | Speech recognition | Vocabulary speech recognition | Speech identification
The present invention relates to speech recognition using selectable recognition modes. This includes innovations such as: large vocabulary speech recognition programming that supplies recognized words to external program as they are recognized, and allows a user to select between large vocabulary recognition of an utterance with and without language context from the prior utterance independently of state of the external program; allowing a user to select between continuous and discrete speech recognition that use substantially the same vocabulary; allowing a user to select between continuous and discrete large-vocabulary speech recognition modes; allowing a user to select between at least two different alphabetic entry speech recognition modes; and allowing a user to select from among four or more of the following recognitions modes when creating text: a large-vocabulary mode, an alphabetic entry mode, a number entry mode, and a punctuation entry mode.
Owner:CERENCE OPERATING CO

Method and apparatus for processing text and character data

An apparatus and method for processing text or character data are disclosed. A text processing system receives a character input string and determines whether to apply character processing. A non-English language such as Italian can be entered into a processing system such as a computer using a standard English based keyboard such that additional keys for providing accents or other grammatical and punctuation symbols or characters not existing in English are not required. In one mode, text is automatically accented or punctuated without requiring user intervention. In another mode, a user is provided with a list of accent or punctuation choices so that the user may select the optimum accent or punctuation. Text processing of an input may be activated by a text sequence including a possible vowel accent or apostrophe error, and may continue as an input method editor loop in response to repeated actuations of the key associated with the first activation event. When an activator event input is detected, a rules based system is utilized to select a correctly accented and punctuated character. A list of alternative accents and punctuations is optionally displayed, and a user may toggle through the list using the activator event to select a desired character. The display provides information for a level of certainty of a selected character or word.
Owner:CLOANTO CORP

Multi-phoneme streamer and knowledge representation speech recognition system and method

A system and method related to a new approach to speech recognition that reacts to concepts conveyed through speech. In its fullest implementation, the system and method shifts the balance of power in speech recognition from straight sound recognition and statistical models to a more powerful and complete approach determining and addressing conveyed concepts. This is done by using a probabilistically unbiased multi-phoneme recognition process, followed by a phoneme stream analysis process that builds the list of candidate words derived from recognized phonemes, followed by a permutation analysis process that produces sequences of candidate words with high potential of being syntactically valid, and finally, by processing targeted syntactic sequences in a conceptual analysis process to generate the utterance's conceptual representation that can be used to produce an adequate response. The invention can be employed for a myriad of applications, such as improving accuracy or automatically generating punctuation for transcription and dictation, word or concept spotting in audio streams, concept spotting in electronic text, customer support, call routing and other command / response scenarios.
Owner:CHEMTRON RES

Dynamically changing a character associated with a key of a keyboard

A software keyboard is provided with a dedicated key (dynamic character key) for inputting a character, where the character associated with the key is determined based upon a context and may dynamically change according to the context. For example, a first character may be dynamically determined and associated with the dedicated key for a first context and a second character, possibly different from the first character, may be selected and associated with the dedicated key for a different context. The character that is associated with the dynamic character key may also be displayed on the dynamic character key. In some embodiments, the character associated with the dynamic character key may be a non-alphanumeric character such as a diacritical mark, a punctuation mark, and the like.
Owner:APPLE INC

System, plug-in, and method for improving text composition by modifying character prominence according to assigned character information measures

A computer implemented system, plug-in application and method for composing a formatted text input to improve legibility, readability and / or print economy while preserving the format of the text input and satisfying any user selected aesthetic constraints. This is accomplished by reading in blocks of text input having defined characters including letters and punctuation in a given input format. A language unit such as a lexical or sub-lexical unit, a subset of punctuation or another defined unit for a particular language is examined and an information measure (IM) is assigned to each character in the language unit indicating the predictability of that character to differentiate the language unit from other language units. Typically, multiple different IMs are assigned to each character and combined to form a combined IM (CIM). The process is repeated for at least a plurality of language units and typically until all the text input in the block has been analyzed and information measures assigned to all of the characters. An adjustment to a physical feature is determined for each character in the plurality of units to modify the visual prominence of that character according to the values of the assigned information measures and a permitted range of physical variation for the block. The adjustments are applied to each character to compose the text input consistent with the input format.
Owner:LANGUAGE TECH

Method and apparatus for processing text and character data

Methods and systems for processing text or character data are disclosed. A text processing system receives a character input string and determines whether to apply character processing. A non-English language such as Italian can be entered into a processing system such as a computer using a standard English based keyboard such that additional keys for providing accents or other grammatical and punctuation symbols or characters not existing in English are unnecessary. In one mode, text is automatically accented or punctuated without requiring user intervention. In another mode, a user is provided with a list of accent or punctuation choices so that the user may select the optimum accent or punctuation. Text processing of an input may be activated by a predefined activator key pressed in a predetermined sequence, or may be activated in the event a predetermined sequence of characters is received.
Owner:CLOANTO CORP

Reading product fabrication methodology

A text enhancement method and apparatus for the presentation of text for improved human reading. The method includes extracting text specific attributes from machine readable text and varying the text presentation in accordance with the attributes. The preferred embodiment of the method: extracts parts of speech and punctuation from a sentence, applies folding rules which use the parts of speech to determine folding points, and presents text segments each on a new line and having a determined horizontal displacement based on the text specific attributes. One method displays text over bent curves having a shape based on text content. Another method includes displaying relative text position within a hierarchy using alternating vertically and horizontally tiled planes. Another method supports reading text segments across opposed pages without waiting for paging. Yet another method displays text to allow reading from bottom to top as though from front to back. Still another method displays words in colors reflecting the relationships between the words and the larger text segments of which they are a part.
Owner:WALKER READING TECH

System and method for generating analytic summaries

A technique for compressing texts such that referential integrity, sentence coherency, punctuation and readability are preserved and which provides for compression of sentence constituents based on the type of content, the informativity of the sentence constituent and the grammatical readability of the resultant sentence or phrase. Information content portions are parsed to generate parts of speech tags. The informativity of the constituents in a phrase or sentence is determined and the parts of speech having lower information content and having a low effect on grammatical readability of the phrase or sentence are selectively compressed. Parts of speech having successively higher informativity and low effect on grammatical readability are selected for compression until the desired level of compression is reached. Compressed portions are indicated in the summary with a selectable placeholder which expands to display the compressed text.
Owner:FUJIFILM BUSINESS INNOVATION CORP

Eye typing system using a three-layer user interface

Inactive | US20120086645A1 | Effective and efficient control | Minimizing user fatigue | Input/output for user-computer interaction | Cathode-ray tube indicators | Computer graphics (images) | Typing
A specially-configured interactive user interface for use in eye typing takes the form of a three-layer arrangement that allows for controlling computer input with eye gazes. The three-layer arrangement includes an outer, rectangular ring of letters, displayed clockwise in alphabetical order (forming the first layer). A group of “frequently-used words” associated with the letters being typed forms an inner ring (and is defined as the second layer). This second layer of words is constantly updated as the user continues to enter text. The third layer is a central “open” portion of the interface and forms the typing space—the “text box” that will be filled as the user continues to type. A separate row of control / function keys (including mode-switching for upper case vs. lower case, numbers and punctuation) is positioned adjacent to the three-layer on-screen keyboard display.
Owner:SIEMENS CORP

Chinese domain term recognition method based on mutual information and conditional random field model

The invention discloses a Chinese domain term recognition method based on mutual information and a conditional random field model. The Chinese domain term recognition method includes the following steps: (1) gathering a domain text corpus and marking all the punctuation, spaces, numbers, ASCII (American Standard Code for Information Interchange) characters, and other non-Chinese characters in the corpus; (2) setting character strings and computing the mutual information value of each character string; (3) computing the left information entropy and the right information entropy of every character string; (4) defining a character string evaluation function, setting an evaluation function threshold, computing the evaluation function value of every character string, determining whether each character string is a word, comparing in sequence the evaluation function value of the former character with that of the latter character in the character string, and segmenting the character strings one by one; (5) training a conditional random field model and recognizing domain terms with it. When the Chinese domain term recognition method is used to recognize terms, the data sparsity of legitimate terms is overcome, the amount of calculation of conditional random fields is reduced, and the accuracy of Chinese domain term recognition is improved.
Owner:SHANGHAI UNIV
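
Steps (2) and (3) rely on two standard statistics, which the sketch below computes for a character string in a small corpus. It uses pointwise mutual information for an adjacent character pair and Shannon entropy over neighboring characters; whether the patent uses exactly these formulas is an assumption, and Latin letters stand in for Chinese characters.

```python
import math
from collections import Counter


def pmi(corpus, a, b):
    """Pointwise mutual information (base 2) of the adjacent pair a+b."""
    n = len(corpus)
    chars = Counter(corpus)
    pairs = Counter(corpus[i:i + 2] for i in range(n - 1))
    p_ab = pairs[a + b] / (n - 1)
    if p_ab == 0:
        return float("-inf")
    return math.log2(p_ab / ((chars[a] / n) * (chars[b] / n)))


def neighbor_entropy(corpus, s, side):
    """Entropy of the characters directly left or right of each occurrence of s."""
    neighbors = Counter()
    start = corpus.find(s)
    while start != -1:
        idx = start - 1 if side == "left" else start + len(s)
        if 0 <= idx < len(corpus):
            neighbors[corpus[idx]] += 1
        start = corpus.find(s, start + 1)
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in neighbors.values())
```

High mutual information suggests the characters belong together, while high left and right entropy suggests the string appears in many different contexts; both are typical signs of a free-standing word or term.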

Decoding-Time Prediction of Non-Verbalized Tokens

Non-verbalized tokens, such as punctuation, are automatically predicted and inserted into a transcription of speech in which the tokens were not explicitly verbalized. Token prediction may be integrated with speech decoding, rather than performed as a post-process to speech decoding.
Owner:MULTIMODAL TECH INC

Method for guiding text-to-speech output timing using speech recognition markers

Inactive | US7010489B1 | More natural and realistic playback | Speech synthesis | Speech sound | Speech output
A method for guiding text-to-speech output timing with speech recognition markers can include the following steps. First, tokens can be retrieved in a TTS system. The tokens can include words, phrase markers, punctuation marks and meta-tags. Second, phrase markers can be identified among the retrieved tokens. Third, words can be identified among the retrieved tokens. Fourth, the TTS system can TTS play back the identified words. Finally, during the TTS playback of the words, the TTS system can pause in response to the identification of the phrase markers.
Owner:IBM CORP

Content Recommendation System using a Neural Network Language Model

The present disclosure relates to applying techniques similar to those used in neural network language modeling systems to a content recommendation system. For example, by associating consumed media content to words of a language model, the system may provide content predictions based on an ordering. Thus, the systems and techniques described herein may produce enhanced prediction results for recommending content (e.g. word) in a given sequence of consumed content. In addition, the system may account for additional user actions by representing particular actions as punctuation in the language model.
Owner:GOOGLE LLC

Interface for processing of an alternate symbol in a computer device

Described herein is a computer-implemented system and method for processing one or more alternate symbols associated or linked to a base symbol. A base symbol is a symbol to which at least one alternate symbol is linked. A base symbol commonly appears on a key of a keyboard or a display of a keyboard, or is a handwritten symbol recognized by handwriting entry software. Examples of alternate symbols are accented characters and punctuation marks that do not appear on a keyboard or are not commonly recognized by a handwriting entry program, and short symbol sequences. An example of a common short symbol sequence is an emoticon used in e-mail messages to convey tone or feelings. An example of a computer device that may embody the system or method is a hand-held computing device.
Owner:QUALCOMM INC

Tokenizer for a natural language processing system

The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
Owner:MICROSOFT TECH LICENSING LLC
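
One way to picture a precedence hierarchy for punctuation is the recursive split below. The precedence order, the small `LEXICON` set standing in for the linguistic knowledge component, and the function name `split_token` are all assumptions for illustration, not details taken from the patent.

```python
# Hypothetical precedence: marks earlier in the list split first.
PRECEDENCE = ["/", "-", "'"]

# Stand-in for the linguistic knowledge component: tokens it validates whole.
LEXICON = {"and/or", "don't", "state-of-the-art"}


def split_token(token, lexicon=LEXICON, precedence=PRECEDENCE):
    """Break a token into subtokens at punctuation, highest precedence first.

    Tokens validated by the lexicon are kept intact.
    """
    if token in lexicon:
        return [token]
    for mark in precedence:
        if mark in token:
            parts = []
            pieces = token.split(mark)
            for i, piece in enumerate(pieces):
                if piece:
                    parts.extend(split_token(piece, lexicon, precedence))
                if i < len(pieces) - 1:
                    parts.append(mark)
            return parts
    return [token]
```

Checking the lexicon before splitting keeps contractions and lexicalized compounds whole, while unknown strings such as `input/output-ready` are broken at the slash first and at the hyphen second.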

System and method for tokenization of text

The present invention pertains to a system and method for the tokenization of text. The featurizer may be configured to receive input text and convert the input text into tokens. According to one aspect of the invention, the tokens may include only one type of character, the characters selected from the group consisting of letters, numbers, and punctuation. The tokenizer may also include a classifier. The classifier may be configured to receive the tokens from the featurizer. Furthermore, the classifier may be configured to analyze the tokens received from the featurizer to determine if the tokens may be input into a predetermined classification model using a preclassifier. If one of the tokens passes the preclassifier, then the token is classified using the predetermined classification model. Additionally, according to a first aspect of the invention, the tokenizer may also include a finalizer. The finalizer may be configured to receive the tokens and may be configured to produce a final output.
Owner:NUANCE COMM INC
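
The idea that each token contains only one type of character (letters, numbers, or punctuation) can be sketched with a single regular expression. The pattern, the ASCII-only character classes, and the names `featurize` and `kind` are assumptions for the example; the classifier and finalizer stages are not modeled.

```python
import re

# One alternative per character type: letters, digits, then everything else
# that is not whitespace (treated here as punctuation).
TOKEN_RE = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9\s]+")


def featurize(text):
    """Split text into tokens that each contain a single character type."""
    return TOKEN_RE.findall(text)


def kind(token):
    """Label a token produced by featurize() with its character type."""
    if token.isalpha():
        return "letters"
    if token.isdigit():
        return "numbers"
    return "punctuation"
```

For an input such as `"Take 20mg, b.i.d."`, this yields `Take`, `20`, `mg`, `,`, `b`, `.`, `i`, `.`, `d`, `.`, leaving it to a later stage to decide which of those pieces should be rejoined.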