35 results about "Morphological parsing" patented technology

Morphological parsing, in natural language processing, is the process of determining the morphemes from which a given word is constructed. It must be able to distinguish between orthographic rules and morphological rules. For example, the word 'foxes' can be decomposed into 'fox' (the stem), and 'es' (a suffix indicating plurality).
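
The distinction between the two kinds of rule can be shown with a toy parser for regular English plurals. This is only a sketch: the `STEMS` lexicon and the `parse_plural` function are hypothetical names introduced here, and real morphological parsers typically use finite-state transducers rather than string checks.

```python
# Toy lexicon of known noun stems (an assumption for illustration only).
STEMS = {"fox", "cat", "bus", "dish", "dog"}

# Stem endings after which English spelling inserts an "e" before plural "s".
E_INSERTION_ENDINGS = ("s", "x", "z", "ch", "sh")


def parse_plural(word):
    """Return (stem, suffix) for a regular plural noun, or None if no parse."""
    # Orthographic rule: the plural is spelled "es" after s, x, z, ch, sh.
    if word.endswith("es"):
        stem = word[:-2]
        if stem in STEMS and stem.endswith(E_INSERTION_ENDINGS):
            return stem, "es"
    # Morphological rule: plural noun = stem + plural suffix "s".
    if word.endswith("s"):
        stem = word[:-1]
        if stem in STEMS:
            return stem, "s"
    return None


print(parse_plural("foxes"))  # ('fox', 'es')
print(parse_plural("cats"))   # ('cat', 's')
print(parse_plural("bus"))    # None: "bu" is not a known stem
```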

Method for synthesizing a self-learning system for extraction of knowledge from textual documents for use in search

The invention relates to computer science, information-search and intelligent systems, and can be used in developing information-search and other information and intelligent systems that operate on the basis of the Internet. The invention makes it possible to create knowledge automatically by extracting it from textual documents in electronic form in different languages, and to intelligently process textual information and users' requests in order to extract knowledge in any foreign language. The claimed method provides a mechanism of self-learning in the form of a stochastically indexed system of artificial intelligence, which automatically instructs the system in the rules of grammatical and semantic analysis. The method includes creating databases of stochastically indexed dictionaries, tables of indices of linguistic texts, and knowledge bases of morphological analysis; performing morphological and syntactical analysis, as well as stochastic indexing, of textual documents on a given theme obtained from the search system in a given language; and creating a knowledge base of syntactical analysis. Stochastically indexed textual documents pertaining to the given theme are subjected to semantic analysis, and knowledge bases of semantic analysis are created. A user's request is compiled and transformed, in the stochastically indexed form, into a plurality of new requests that are equivalent to the original request, and stochastically indexed fragments of textual documents that comprise all word combinations of the transformed request are selected. A stochastically indexed structure is generated from the selected documents and, based on said structure, a brief reply of the system is generated by means of logical inference. The relevance of the obtained brief reply is checked by generating an interrogative sentence based on said reply and comparing said sentence with the request. When the user's request is identical to the obtained interrogative sentence, the decision is made that the brief reply of the system is relevant to the request, and the reply is submitted to the user.
Owner: VLADIMIR VLADIMIROVICH NASYPNY

Morphological analyzer and analysis method

A morphological analyzer divides a received text into known words and unknown words, divides the unknown words into their constituent characters, analyzes known words on a word-by-word basis, and analyzes unknown words on a character-by-character basis to select a hypothesis as to the morphological structure of the received text. Although unknown words are divided into their constituent characters for analytic purposes, they are reassembled into words in the final result, in which any unknown words are preferably tagged as being unknown. This method of analysis can process arbitrary unknown words without requiring extensive computation, and with no loss of accuracy in the processing of known words.
Owner: OKI ELECTRIC IND CO LTD
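
The flow described in this abstract (word-level analysis for known words, character-level analysis for unknown ones, then reassembly with an "unknown" tag) can be sketched as follows. This is not the patented method: greedy longest-match lookup stands in for the hypothesis selection the abstract mentions, and `LEXICON`, the tag names, and `analyze` are hypothetical.

```python
# Hypothetical lexicon mapping known words to part-of-speech tags.
LEXICON = {"the": "DET", "cat": "NOUN", "sat": "VERB"}


def analyze(text, lexicon=LEXICON):
    """Segment unspaced text into known words and reassembled unknown words."""
    max_len = max(map(len, lexicon))
    units = []
    i = 0
    while i < len(text):
        # Try the longest dictionary word starting at position i.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                units.append((text[i:j], lexicon[text[i:j]]))
                i = j
                break
        else:
            # No known word starts here: fall back to a single character.
            units.append((text[i], "UNK_CHAR"))
            i += 1

    # Reassemble runs of unknown characters into one tagged unknown word.
    result = []
    for surface, tag in units:
        if tag != "UNK_CHAR":
            result.append((surface, tag))
        elif result and result[-1][1] == "UNKNOWN":
            result[-1] = (result[-1][0] + surface, "UNKNOWN")
        else:
            result.append((surface, "UNKNOWN"))
    return result


print(analyze("thezzcatsat"))
# [('the', 'DET'), ('zz', 'UNKNOWN'), ('cat', 'NOUN'), ('sat', 'VERB')]
```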

Abstract generation method and program product

The present invention relates to an abstract generation method for generating an abstract from document information, such as an electronic patient chart, and to a program product that implements the method. Its object is to display only the main parts of sentences concisely and effectively. When document information (an electronic patient chart, for instance) is input into the system, morphological analysis is performed on it and it is judged whether a part of a sentence matches the whole of another sentence. When a match is found, the partially matching character string is set as a simplified sentence candidate. When no match is found, the sentence is set as a simplification candidate as it is. Note that even when a partial match is found, if the matching character string has fewer than M characters or fewer than N morphemes, it is not set as the simplified sentence candidate; instead, the sentence is set as the simplification candidate as it is. Next, each simplification candidate containing a keyword is extracted from among the generated simplification candidates and is set as a summary candidate. Then, an abstract is generated by marking each part of the input document corresponding to a summary candidate.
Owner: SANYO ELECTRIC CO LTD
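
The matching and threshold logic in this abstract can be sketched as below. It assumes morphological analysis has already turned each sentence into a list of morphemes; the function names and the sample data are hypothetical, and `min_chars` and `min_morphemes` play the roles of M and N.

```python
def contains(seq, sub):
    """True if the list `sub` occurs as a contiguous run inside the list `seq`."""
    m = len(sub)
    return any(seq[i:i + m] == sub for i in range(len(seq) - m + 1))


def simplification_candidates(sentences, min_chars, min_morphemes):
    """For each sentence, pick a shorter whole sentence it contains, if any."""
    candidates = []
    for idx, sent in enumerate(sentences):
        chosen = sent  # default: keep the sentence as it is
        for jdx, other in enumerate(sentences):
            if idx == jdx or len(other) >= len(sent):
                continue
            # Matches shorter than M characters or N morphemes are ignored.
            long_enough = (len("".join(other)) >= min_chars
                           and len(other) >= min_morphemes)
            if long_enough and contains(sent, other):
                chosen = other
                break
        candidates.append(chosen)
    return candidates


def summary_candidates(candidates, keywords):
    """Keep only the candidates that contain at least one keyword."""
    return [c for c in candidates if any(tok in keywords for tok in c)]


chart = [
    ["patient", "reports", "mild", "fever", "since", "monday"],
    ["mild", "fever"],
    ["no", "cough"],
]
cands = simplification_candidates(chart, min_chars=5, min_morphemes=2)
print(summary_candidates(cands, {"fever"}))
# [['mild', 'fever'], ['mild', 'fever']]
```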

System and method for differential document analysis and storage

Systems and methods for differential document analysis and storage are provided. Specifically, the system can be configured to perform one or more differential analyses on a set of documents to detect and measure changes in language across entire sets of documents of a similar type, as well as changes in language in the specific objects (e.g., document sections, paragraphs, clauses) of the documents. The system comprises three primary components: document parsing, textual near-duplicate detection, and morphological analysis. The document parsing component breaks documents down into objects and creates indexes for each full document and for the components of the document. These indexes enable documents and objects to be compared for similarity using the near-duplicate detection component, which implements various similarity analysis algorithms. The morphological analysis component is configured to search the documents for particular language or sections and compare documents in which the searched language is present.
Owner: PLANET DATA SOLUTIONS INC
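
The abstract does not name the similarity algorithms used for near-duplicate detection. As one common possibility, the sketch below compares document objects by Jaccard similarity over word shingles; the choice of technique, the function names, the threshold, and the sample clauses are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(objects, threshold=0.5, k=3):
    """Return (id1, id2, score) for every pair scoring at or above threshold."""
    ids = sorted(objects)
    sets = {i: shingles(objects[i], k) for i in ids}
    pairs = []
    for x in range(len(ids)):
        for y in range(x + 1, len(ids)):
            score = jaccard(sets[ids[x]], sets[ids[y]])
            if score >= threshold:
                pairs.append((ids[x], ids[y], score))
    return pairs


clauses = {
    "a": "the supplier shall deliver the goods within thirty days",
    "b": "the supplier shall deliver the goods within sixty days",
    "c": "this agreement is governed by the laws of the state",
}
for first, second, score in near_duplicates(clauses):
    print(first, second, round(score, 3))
```

Clauses "a" and "b" differ by one word and share five of nine distinct shingles, so only that pair clears the 0.5 threshold.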

Information Processing Apparatus, Information Processing Method, Program, and Recording Medium

Disclosed herein is an information processing apparatus for analyzing text data, including: acquisition means for acquiring the text data; morpheme information registration means for registering morpheme information for use in analyzing the text data morphologically; morphological analysis means for analyzing the text data acquired by the acquisition means; compound word processing rule registration means for registering compound word processing rules for creating a compound word not registered in the morpheme information registration means; and compound word processing means, by use of the compound word processing rules registered in the compound word processing rule registration means, for combining the morphemes included in the morphological analysis information created by the morphological analysis means, into the compound word not registered in the morpheme information registration means and detecting the created compound word.
Owner: SONY CORP
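
A compound word processing rule of the kind described here might be represented as a sequence of part-of-speech tags that, when matched by adjacent morphemes, yields a compound absent from the dictionary. The abstract does not specify the rule format, so the representation, the tag names, and `combine_compounds` below are all hypothetical.

```python
def combine_compounds(morphemes, rules, lexicon, sep=" "):
    """Merge adjacent morphemes whose tags match a rule into one compound.

    morphemes: list of (surface, tag) pairs from morphological analysis.
    rules:     list of tag tuples, e.g. [("NOUN", "NOUN")].
    lexicon:   set of surfaces already registered as morpheme information.
    """
    ordered = sorted(rules, key=len, reverse=True)  # prefer longer rules
    result = []
    i = 0
    while i < len(morphemes):
        for rule in ordered:
            window = morphemes[i:i + len(rule)]
            tags = tuple(tag for _, tag in window)
            if len(window) == len(rule) and tags == rule:
                surface = sep.join(s for s, _ in window)
                # Only create compounds that are not already registered.
                if surface not in lexicon:
                    result.append((surface, "COMPOUND"))
                    i += len(rule)
                    break
        else:
            result.append(morphemes[i])
            i += 1
    return result


analysis = [("speech", "NOUN"), ("recognition", "NOUN"), ("works", "VERB")]
known = {"speech", "recognition", "works"}
print(combine_compounds(analysis, [("NOUN", "NOUN")], known))
# [('speech recognition', 'COMPOUND'), ('works', 'VERB')]
```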

Program recommending apparatus and method

The invention relates to a program recommendation apparatus and method. The apparatus includes: a module configured to extract category information and program abstracts of programs contained in an electronic program guide, extract program-specific terms from the program abstracts by morphological analysis and combine the category information and the program-specific terms to generate category-added terms; a module configured to analyze a history of programs viewed by a user based on the generated category-added terms to generate a preference vector indicating user's preferences for programs; a module analyzing the program abstracts based on the category-added terms to generate broadcast program vectors; a module generating a relevant term model for the category-added terms; a module calculating similarities between the preference vector and each of the broadcast program vectors based on the generated relevant term model; and a module outputting programs having the calculated similarities satisfying a predetermined condition as recommended programs matching with the user's preferences.
Owner: KK TOSHIBA
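
The vector comparison at the core of this abstract can be sketched with term-count vectors and cosine similarity. The sketch assumes terms have already been extracted by morphological analysis and leaves out the relevant term model entirely; the `category:term` encoding, the function names, the threshold, and the sample data are hypothetical.

```python
import math
from collections import Counter


def category_terms(category, terms):
    """Prefix each term with its category to form category-added terms."""
    return [f"{category}:{term}" for term in terms]


def vector(programs):
    """Build a term-count vector from (category, terms) pairs."""
    counts = Counter()
    for category, terms in programs:
        counts.update(category_terms(category, terms))
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def recommend(history, guide, threshold=0.3):
    """Return titles whose vectors are similar enough to the preferences."""
    preference = vector(history)
    return [title for title, category, terms in guide
            if cosine(preference, vector([(category, terms)])) >= threshold]


history = [("sports", ["soccer", "final"]), ("sports", ["soccer", "league"])]
guide = [
    ("Cup Final", "sports", ["soccer", "final"]),
    ("Cooking Hour", "food", ["pasta", "soccer"]),
]
print(recommend(history, guide))  # ['Cup Final']
```

Because terms carry their category, "soccer" in a food program does not match "soccer" in the viewer's sports history.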

Optimized model for rapid identification of transgenic soybeans based on morphological analysis

The invention provides an optimized model for rapid identification of transgenic soybeans based on morphological analysis. First, a method for establishing the optimized model is provided: for whole soybeans, spectral information in the characteristic waveband of 9403-5438 cm⁻¹ is selected, the spectrum is preprocessed with a second derivative, and a PLS-DA model is established by partial least squares discriminant analysis; for powdered soybeans, spectral information in the characteristic waveband of 7505-4597 cm⁻¹ is selected, the spectrum is preprocessed with vector normalization and a first derivative, and a PLS-DA model is established by partial least squares discriminant analysis. The transgenic soybeans are identified by combining near-infrared spectroscopy with discriminant analysis, and the identification accuracy of the discrimination model can be improved by selecting the sample form, the wavelength range, and the spectral preprocessing method, so that the optimal model is selected for application in actual production.
Owner: NAT INST FOR NUTRITION & HEALTH CHINESE CENT FOR DISEASE CONTROL & PREVENTION

Selection method and system of recognition unit for Uygur language voice recognition

Active | CN103065632A | Alleviate the problem of too many out-of-set words | Improve speech recognition rate | Speech recognition | Speech sound | Morphological parsing
The invention relates to a method and system for selecting recognition units for Uygur language voice recognition. The method includes: collecting or preparing text corpora corresponding to the voice to be recognized; picking out the distinct terms from the text corpora; inputting the distinct terms into a morphological analyzer, and obtaining the corresponding term splitting results if the analysis succeeds, or splitting the terms with a tail dropping algorithm if the analysis fails, so that a stem and the supplementary elements of each term are obtained from the splitting results; and mapping the terms in the text corpora to the stems and supplementary elements, and picking out the high-frequency stems and supplementary elements to be used as dictionary units. Uygur terms are thus split into stems and supplementary elements according to the morphological change rules of the Uygur language, and the stems and supplementary elements are used as recognition units. This alleviates the problem of too many out-of-set words in the recognition system and improves its recognition rate.
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI +1
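
The pipeline in this abstract (analyzer first, fallback splitting on failure, then frequency-based unit selection) can be sketched as follows. The morphological analyzer is stood in for by a lookup table, the fallback is a simple longest-suffix stripper rather than the patent's tail dropping algorithm, and the word and suffix lists are toy, romanized data chosen for illustration.

```python
from collections import Counter


def split_word(word, analyzer, suffixes, min_stem=2):
    """Return (stem, [suffixes]) from the analyzer, or by suffix stripping."""
    if word in analyzer:          # analysis succeeded
        return analyzer[word]
    stem, parts = word, []
    stripped = True
    while stripped:               # fallback: peel suffixes off the tail
        stripped = False
        for suffix in sorted(suffixes, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
                parts.insert(0, suffix)
                stem = stem[:-len(suffix)]
                stripped = True
                break
    return stem, parts


def select_units(corpus_words, analyzer, suffixes, top_n):
    """Map words to stems and suffixes, then keep the most frequent units."""
    counts = Counter()
    for word in corpus_words:
        stem, parts = split_word(word, analyzer, suffixes)
        counts[stem] += 1
        counts.update("+" + p for p in parts)  # "+" marks a suffix unit
    return [unit for unit, _ in counts.most_common(top_n)]


ANALYZER = {"kitablar": ("kitab", ["lar"])}   # hypothetical analyzer output
SUFFIXES = ["lar", "ler", "da"]               # toy suffix inventory

print(split_word("mektepler", ANALYZER, SUFFIXES))  # ('mektep', ['ler'])
corpus = ["kitablar", "kitab", "mektepler", "kitablar"]
print(select_units(corpus, ANALYZER, SUFFIXES, top_n=2))  # ['kitab', '+lar']
```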