239 results about "Text database" patented technology

A text database is a system that maintains a (usually large) text collection and provides fast and accurate access to it. These two goals are relatively orthogonal, and both are critical to profit from the text collection. Traditional database technologies are not well suited to handle text databases.

Method and apparatus for accessing multi-dimensional mapping and information

A method and apparatus for providing an interactive mapping and panoramic imaging application for utilization by a computer user is provided. A plurality of panoramic images are stored on a panoramic database, each panoramic image corresponding to a geographical location. A panoramic image is displayed on a screen and is navigable in response to input by the user. The panoramic image has embedded hotspots corresponding to selected panoramic images of geographically adjacent locations. Also displayed on the screen, simultaneously with the panoramic image, is a map image corresponding with the panoramic image. The map image is stored on a map database, and the map is navigable in response to input by the user. The map has embedded hotspots corresponding to the plurality of panoramic images. Also displayed on the screen, simultaneously with the panoramic image and the map image, is a text panel displaying textual information corresponding to the displayed panoramic image. The text panel is capable of receiving textual input from the user for activating a search of a text database having a plurality of text data corresponding to the plurality of panoramic images. The displayed panoramic image, the displayed map image and the displayed textual information are updated in response to the user activating a hotspot, such that the displayed panoramic image, the displayed map image and the displayed textual information correspond to one another.
Owner:TRUE VOYAGER

Method and apparatus for partitioning a database upon a timestamp, support values for phrases and generating a history of frequently occurring phrases

A method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends. The method passes over a desired database using a dynamically generated shape query. Documents within the database are selected based on specific classifications and user defined partitions. Once a partition is specified, transaction IDs are assigned to the words in the text documents depending on their placement within each document. The transaction IDs encode both the position of each word within the document as well as representing sentence, paragraph, and section breaks, and are represented in one embodiment as long integers with the sentence boundaries. A maximum and minimum gap between words in the phrases and the minimum support all phrases must meet for the selected time period may be specified. A generalized sequential pattern method is used to generate those phrases in each partition that meet the minimum support threshold. The shape query engine takes the set of phrases for the partition of interest and selects those that match a given shape query. A query may take the form of requesting a trend such as "recent upwards trend", "recent spikes in usage", "downward trends", and "resurgence of usage". Once the phrases matching the shape query are found, they are presented to the user.
Owner:GLOBALFOUNDRIES INC
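
The transaction-ID idea in the abstract above can be illustrated with a minimal sketch. It assumes word-level IDs that jump by a fixed amount at each sentence break, so a maximum-gap constraint keeps a phrase from spanning sentences. `SENTENCE_GAP`, the function names, and the whitespace tokenizer are hypothetical choices, not details from the patent, and only the maximum gap (not the minimum gap) is modelled.

```python
SENTENCE_GAP = 1000  # hypothetical jump added to the ID counter at each sentence break


def assign_transaction_ids(sentences):
    """Return (word, transaction_id) pairs; IDs jump at sentence boundaries."""
    tagged, next_id = [], 0
    for sentence in sentences:
        for word in sentence.lower().split():
            tagged.append((word, next_id))
            next_id += 1
        next_id += SENTENCE_GAP
    return tagged


def contains_phrase(tagged, phrase, max_gap=1):
    """True if the phrase words occur in order with ID gaps of at most max_gap."""
    def match(start, prev_id, k):
        if k == len(phrase):
            return True
        for j in range(start, len(tagged)):
            word, tid = tagged[j]
            if prev_id is not None and tid - prev_id > max_gap:
                break  # IDs only grow, so later words are even further away
            if word == phrase[k] and match(j + 1, tid, k + 1):
                return True
        return False
    return match(0, None, 0)


def support(documents, phrase, max_gap=1):
    """Fraction of documents in a partition that contain the phrase."""
    hits = sum(contains_phrase(assign_transaction_ids(d), phrase, max_gap)
               for d in documents)
    return hits / len(documents)
```

Computing `support` separately for each time partition would give the per-period series that a shape query such as "recent upwards trend" could then be matched against.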

Question-answering method, system, and program for answering question input by speech

Disclosed is a question-answering method that answers a question by using a text database storing text data in conjunction with a speech database storing speech data. In this method, a speech signal of the question is received. Speech recognition is performed on the speech signal to obtain a speech recognition result including a recognition accuracy evaluation value. One of the text database and the speech database is selected by comparing the recognition accuracy evaluation value with a threshold value. Then, the selected database is searched for a search result. An answer to the question is generated based on the search result.
Owner:KK TOSHIBA
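
The database-selection step in the abstract above can be sketched as follows. The abstract does not say which database is chosen on which side of the threshold, so this sketch assumes that a confident recognition result goes to the text database and a doubtful one to the speech database. The `Recognition` class, the 0.8 threshold, and the two search callables are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    transcript: str    # recognized text of the spoken question
    confidence: float  # recognition accuracy evaluation value (assumed range 0..1)


def answer(recognition, search_text, search_speech, threshold=0.8):
    """Route the question to one database based on recognition confidence.

    Assumption: high confidence -> text database, low confidence -> speech database.
    """
    if recognition.confidence >= threshold:
        return "text", search_text(recognition.transcript)
    return "speech", search_speech(recognition.transcript)
```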

Method and apparatus for discovering knowledge gaps between problems and solutions in text databases

A method (and system) of determining a knowledge gap between a first database containing a set of problems records and a second database containing solutions documents, includes developing a set of clusters of the problems records of the first database, where each cluster has a centroid, developing a dictionary having entries based on the problems records in the first database, developing a vector space correlated to the solutions documents in the second database, where the vector space is based on the dictionary entries, developing a listing of distances between the cluster centroids and the vector space, and determining a knowledge gap for each cluster.
Owner:PENDRAGON NETWORKS
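
A minimal sketch of the steps in the abstract above, under several assumptions: the problem records are already grouped into clusters, the dictionary is simply every word seen in the problem records, and the "distance" is cosine distance from each cluster centroid to the nearest solution document. None of these choices is specified by the abstract, and all function names are hypothetical.

```python
import math
from collections import Counter


def build_dictionary(records):
    """Dictionary entries: every distinct word in the problem records."""
    return sorted({w for r in records for w in r.lower().split()})


def vectorize(text, dictionary):
    counts = Counter(text.lower().split())
    return [counts[w] for w in dictionary]


def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 if na == 0 or nb == 0 else 1.0 - dot / (na * nb)


def knowledge_gaps(clusters, solutions):
    """Map each cluster name to its distance from the closest solution document."""
    dictionary = build_dictionary([r for recs in clusters.values() for r in recs])
    solution_vectors = [vectorize(s, dictionary) for s in solutions]
    gaps = {}
    for name, records in clusters.items():
        c = centroid([vectorize(r, dictionary) for r in records])
        gaps[name] = min(cosine_distance(c, s) for s in solution_vectors)
    return gaps
```

A cluster whose gap is close to 1.0 has no solution document that shares its vocabulary, which is one plausible reading of a "knowledge gap".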

Pattern search method, pattern search apparatus and computer program therefor, and storage medium thereof

A fast search is performed of a large text database, while suppressing an increase in the data size of the data structure used for the process. A pattern search method for searching a target character string for a desired pattern includes: a range search step and a character string extraction step. At the range search step, intermediate patterns are obtained by adding characters in order, one by one, from the last character of the pattern to the first, and a range is determined for a suffix array, which corresponds to the target character string, wherein the first character of each of the intermediate patterns is present. Then, at the character string extraction step, elements of the character string are designated that correspond to elements included in the range of the suffix array, and character string segments are extracted consisting of the same number of elements as the elements of the pattern and having the elements of the character string as their first characters.
Owner:IBM CORP
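
For orientation, here is a plain suffix-array search: build the array, binary-search the range of suffixes that start with the pattern, then read the matching positions out of that range. This is the textbook technique, not the patented range search that grows intermediate patterns from the last character to the first, and the naive array construction is only suitable for small inputs. All names are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: start positions sorted by the suffix they begin."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_range(text, suffix_array, pattern):
    """Return [start, end) of suffix-array entries whose suffix starts with pattern."""
    m = len(pattern)
    lo, hi = 0, len(suffix_array)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start, hi = lo, len(suffix_array)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def find_occurrences(text, pattern):
    """Positions in text where pattern occurs, in increasing order."""
    suffix_array = build_suffix_array(text)
    start, end = find_range(text, suffix_array, pattern)
    return sorted(suffix_array[start:end])
```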

Method of search content enhancement

Whenever a document is going to be included in the textual database, a semantic binder is used to associate the document with one or more semantic nodes which are defined in a semantic taxonomy. When a search is performed, a search application looks through a semantic dictionary (which contains a table mapping queries to nodes on the semantic taxonomy) to see whether any corresponding semantic node can be found for the searcher's query. If a match is found, the search application transforms the user's query into [“original query” OR “semantic node”] so that relevant documents, even if they do not contain any of the user's keywords, can also be found in the database. The system binds semantic nodes arranged in a hierarchical structure of the taxonomy using a Log Analyzer which periodically looks through the system log for new queries and through textual indices for documents added to the database to generate the semantic dictionary and to bind the semantic nodes to the queries in the textual indices of the documents.
Owner:IBM CORP
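
The query transformation described in the abstract above, ["original query" OR "semantic node"], can be sketched as below. It assumes each document already carries the set of semantic nodes it was bound to and that the semantic dictionary is a plain query-to-node mapping. The data layout and names are hypothetical, and the Log Analyzer that would populate the dictionary is not modelled.

```python
def search(query, documents, semantic_dictionary):
    """Return IDs of documents that match the query text OR its semantic node.

    documents: {doc_id: (text, set_of_bound_semantic_nodes)}  (assumed layout)
    semantic_dictionary: {query_string: semantic_node}        (assumed layout)
    """
    node = semantic_dictionary.get(query.lower())
    results = []
    for doc_id, (text, nodes) in documents.items():
        keyword_hit = query.lower() in text.lower()
        node_hit = node is not None and node in nodes
        if keyword_hit or node_hit:
            results.append(doc_id)
    return results
```

With a dictionary entry mapping "laptop" to a node such as "portable-computers", a document about ultrabooks bound to that node is returned even though it never contains the word "laptop".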

Method for generating training data for medical text abbreviation and acronym normalization

A method for electronically generating high-quality feature vectors that can be used in connection with electronic data processing systems implementing Maximum Entropy or other statistical models to accurately normalize abbreviations in text such as medical records. An abbreviation database and a training text database are provided. The abbreviation database includes abbreviation data representative of abbreviations and associated expansions to be normalized. The training text database includes a corpus of text having expansions of the abbreviations to be normalized. The corpus of text is processed as a function of the abbreviation data to identify the expansions in the corpus of text. Context information describing the context of the text in which the expansions were identified is generated. A set of feature vectors is also stored, each including the context information generated for the associated expansion identified in the corpus of text.
Owner:MAYO FOUND FOR MEDICAL EDUCATION & RES
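
A minimal sketch of the training-data generation described in the abstract above: find each known expansion in the corpus and record the surrounding words as context, labelled with that expansion. The window of two words on each side, the dictionary-shaped "feature vector", and the function name are assumptions made for illustration. The abstract does not specify the feature layout.

```python
import re


def generate_training_examples(corpus, abbreviations, window=2):
    """Build one labelled context record per expansion occurrence in the corpus.

    abbreviations: {abbreviation: [expansion, ...]}  (assumed layout)
    """
    tokens = re.findall(r"\w+", corpus.lower())
    examples = []
    for abbreviation, expansions in abbreviations.items():
        for expansion in expansions:
            target = expansion.lower().split()
            n = len(target)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] == target:
                    context = tokens[max(0, i - window):i] + tokens[i + n:i + n + window]
                    examples.append({
                        "abbreviation": abbreviation,
                        "label": expansion,
                        "context": context,
                    })
    return examples
```

Because the label comes from the expansion actually written in the text, no manual annotation of ambiguous abbreviations is needed to produce these records.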

System and Method for Automatically Classifying Text using Discourse Analysis

Inactive | US20150081277A1 | Accurate analysis | Easily and effectively organize information about particular discourses | Natural language data processing | Special data processing applications | Text database | Subject matter
The present invention is a textual discourse analysis method for analyzing and visualizing complex text. The invention operates on conceptual relations, both logical and axiological, among the grammatical components of a sentence and across the sentences of a given text. Three basic grammatical units, namely Agent(s), Topic(s) and Object(s), are used to build a tripartite structure. Discursive analysis of text based on this invention provides a novel approach for automatically classifying the positions of Agents within particular textual databases relative to Topics and Objects, and vice versa. A computer program method of the present invention starts by creating a conceptual map of a given text, classifying semantic macro-areas and the positions of Agents, Topics and Objects, and then correlates such positions with other components in the database. In the next step, the computer assigns a reference system for analyzing the denotative content of the discourse. The system is based upon a database of words and phrases and their associated denotative as well as connotative meanings, followed by generation of a database that axiologically categorizes subject matters.
Owner:BEHI KAMBIZ

Text retrieval method and device

An embodiment of the invention provides a text retrieval method and device. The text retrieval method includes the following steps: an original text input by a user is acquired; retrieval words are acquired from the original text; the retrieval words are filtered according to the user's retrieval requirement to obtain keywords; the keywords are combined, texts in a text database are retrieved according to the combined keywords, and at least one retrieval text is obtained; and the retrieval texts are displayed in descending order of relevancy with the keywords highlighted, where relevancy represents the degree of relevance between the original text and each retrieval text. Because the keywords are obtained by filtering the retrieval words according to the user's retrieval requirement, the probability that a keyword is an invalid word is reduced, and the retrieval requirement is met better than when retrieval words taken directly from the original text are used. The retrieval texts obtained with the combined keywords therefore match the retrieval requirement well, and retrieval accuracy is improved.
Owner:STATE GRID CORP OF CHINA +3

Method of finding answers to questions

A method and a system for automatically finding one or more answers to a natural language question in a computer stored natural language text database are disclosed. The natural language text database has been analyzed with respect to syntactic functions of constituents, lexical meaning of word tokens and clause boundaries, and the natural language question comprises a question clause. A computer readable representation of the question clause is analyzed with respect to syntactic functions of its constituents and the lexical meaning of its word tokens. In response to the analysis, a set of conditions for a clause in the natural language text database to constitute an answer to the question clause is defined. The conditions relate to the syntactic functions of constituents and the lexical meaning of word tokens in the clause. Furthermore, clauses that satisfy said conditions are identified in the natural language text database, and answers to the question clause are returned by means of the identified clauses that match the conditions.
Owner:ESSENCIENT LTD

Translation system, translation method, and program

A translation system comprises an image reading unit that optically reads an image of a manuscript and generates image data; an inputting unit that inputs a translation target language; a character recognizing unit that generates an original text by performing a character recognition process on the image data generated by the image reading unit; a translation text database in which are associated and stored translation texts, language identifiers which specify the languages in which the translation texts are written, and document identifiers which specify the contents of the translation texts; an extracting unit that extracts the document identifier which specifies the content of the original text from the original text; a searching unit that searches the translation text database for a translation text associated with a document identifier identical to the document identifier extracted from the original text by the extracting unit and a language identifier identical to the language identifier which specifies the translation target language input by the inputting unit; and an outputting unit that outputs the translation text searched by the searching unit.
Owner:FUJIFILM BUSINESS INNOVATION CORP

Mobile phone advertising method based on user consumption feature vector

The invention discloses a mobile phone advertising method based on a user consumption feature vector, and belongs to the field of mobile phone data mining and analyzing. The method includes the steps of establishing a user keyword to represent a vector according to browsing contents of a user mobile internet; establishing a user consumption feature vector of a corresponding user according to the browsing contents of the mobile internet; extracting a keyword from an advertisement experience text database and setting a keyword weight, and constructing a characteristic matrix for each user feature vector; computing each component value of the user consumption feature vector of each user; setting the mapping relations between a user consumption tag and each component of the consumption feature vector of the user within different threshold ranges; labeling a corresponding consumption tag for each user and then conducting advertising according to each component of the current consumption feature vector of the user and the mapping relation. The mobile phone advertising method can improve precision of advertising, reduces manual interference, and effectively lowers platform running cost.
Owner:上海师域信息科技有限公司

Similarity-based text duplicate checking method and system

The invention provides a similarity-based text duplicate checking method. The method includes the steps of: preprocessing to-be-compared text; screening out all candidate text, of which coarse-grained similarity with the to-be-compared text is greater than a similar-candidate-set threshold value, from a text database, and forming a similar candidate set; using sentences as segmentation units to segment the to-be-compared text and the candidate text; determining fine-grained similarity through calculating TFIDF similarity, LDA similarity, doc2vec similarity and word2vec similarity of the candidate text and the to-be-compared text; and screening out candidate text of which fine-grained similarity exceeds a similarity determination threshold value, and determining the same as similar text of the to-be-compared text to realize duplicate checking. The invention also provides a similarity-based text duplicate checking system for realizing the above-mentioned method.
Owner:COMP NETWORK INFORMATION CENT CHINESE ACADEMY OF SCI
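
The two-stage screening in the abstract above can be sketched as follows. This is a simplified stand-in: the coarse stage uses Jaccard overlap of word sets, and the fine stage uses sentence-level bag-of-words cosine similarity in place of the TFIDF, LDA, doc2vec and word2vec similarities the abstract names. Both thresholds and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def words(text):
    return re.findall(r"\w+", text.lower())


def jaccard(a, b):
    """Coarse-grained similarity: overlap of the two word sets."""
    sa, sb = set(words(a)), set(words(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def cosine(a, b):
    ca, cb = Counter(words(a)), Counter(words(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def sentences(text):
    return [s.strip() for s in re.split(r"[.!?]", text) if s.strip()]


def fine_similarity(query, candidate):
    """Average, over query sentences, of the best-matching candidate sentence."""
    qs, cs = sentences(query), sentences(candidate)
    if not qs or not cs:
        return 0.0
    return sum(max(cosine(q, c) for c in cs) for q in qs) / len(qs)


def find_duplicates(query, database, coarse_threshold=0.3, fine_threshold=0.8):
    candidates = [d for d in database if jaccard(query, d) > coarse_threshold]
    return [d for d in candidates if fine_similarity(query, d) > fine_threshold]
```

The cheap coarse filter keeps the more expensive sentence-by-sentence comparison from running against the whole database.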

Method and system for automatic analysis of hotspot subject propagation process in the internet

The invention relates to a method, and a system thereof, for automatically analyzing the propagation process of a hot subject on the internet, and belongs to the field of intelligent information processing technology. As the textual information on the internet gradually increases, automatically detecting and analyzing hot or sensitive subjects in large text databases has become an important subject in the text mining and information retrieval field, with great practical value. The invention utilizes natural language processing to automatically analyze the propagation process of the text documents in a given hot or sensitive subject. After the text documents in the subject are arranged in time order, the reference origin of each text document is searched for, starting from the first text document, by a pattern matching method; if the reference origin is not found, it is further determined by a text document similarity comparison method, and the corresponding source text document is obtained at the same time. Finally, the reference relations are presented intuitively to the user in graphic form. The method is widely applicable to internet intelligent information processing, public opinion analysis and monitoring, etc.
Owner:PEKING UNIV

Commodity public sentiment analysis method and system based on user evaluation information

The invention relates to a data mining and public sentiment analysis technology, and discloses a commodity public sentiment analysis method and system based on user evaluation information so as to quickly and effectively find the emotion of a user on a purchased commodity and carry out commodity public sentiment analysis on the basis of the emotion. The method comprises the following steps: a: carrying out data crawling on an electronic commerce platform to obtain the basic information of a commodity and the commodity evaluation data of a user, carrying out classification writing on the basic information of the commodity and the commodity evaluation data of the user into an evaluation text database; b: preprocessing the commodity evaluation data, and generating a feature vector which can be used for further analysis; c: extracting a typical feature in the feature vector, and analyzing the emotion of the user on the typical feature and the overall mood of the user on a product; and d: carrying out visualized presentation on the analysis result of an emotion analysis module on a Web end. The method is suitable for analyzing the comment data of the electronic commerce platform.
Owner:SICHUAN CHANGHONG ELECTRIC CO LTD

Interactive content generation method and device, computer equipment and storage medium

The invention discloses an interactive content generation method and device, computer equipment and a storage medium. The method comprises the following steps: receiving current-round information that carries a session identifier and is sent by a client, and obtaining at least one current intention parameter based on the current-round information; obtaining at least one historical intention parameter corresponding to the session identifier; analyzing the at least one current intention parameter and the at least one historical intention parameter by adopting a preset reinforcement learning model to obtain a target intention; obtaining a corresponding target intention template based on the target intention; querying a retrieval text database based on each target parameter, and obtaining a retrieval text corresponding to each target parameter; and obtaining current reply information corresponding to each retrieval text, and pushing the current reply information to the client according to a parameter priority sequence. The method can ensure that the chat robot generates accurate reply information in time and sends it to the client.
Owner:ONE CONNECT SMART TECH CO LTD SHENZHEN

Original article influence analysis system based on collection of media information

The invention discloses an original article influence analysis system based on collection of media information. The system comprises a media article data acquisition module, an update module for article page views, comments and likes, an original article clustering analysis module and an original article influence calculation module. The media article data acquisition module is used for acquiring article information issued by a media platform in the internet, extracting content text from article information and storing the content text. The update module for article page views, comments and likes is used for obtaining dissemination feedback data of the article information and storing the dissemination feedback data of the article information. The original article clustering analysis module is used for clustering all content texts stored in a text database in order to obtain original articles. The original article influence calculation module is used for calculating influence of the original articles in the media platform and used for calculating influence of the original articles in all media platforms. The invention further discloses an original article influence analysis method based on collection of media information. By quantitative analysis of original article influence, analysis efficiency is high and accuracy of analysis is great.
Owner:上海市互联网信息办公室 +1

Automatic-reply instant messaging system and method thereof

An automatic-reply instant messaging system and a method thereof are provided. First, an identity classification of a contactor is previously classified, when the contactor sends a message to a computer platform, a recognition unit and a message meaning identification unit in the computer platform respectively recognizes the identity classification of the contactor and identifies the meaning of the message, and appropriate texts are selected from a text database to constitute a reply message to reply to the contactor. Therefore, when in some occasion a user on the computer platform is unwilling to or incapable of replying to the message in person, the automatic-reply instant messaging system and the method thereof are used to reply with an appropriate message, so as to avoid misunderstanding between the user and the contactor due to busyness or bad mood on such occasion.
Owner:INVENTEC CORP

Activated voiceprint password security control method and system under public background noises

The invention discloses an activated voiceprint password security control method and system under public background noise. The steps of the method include: an activation recognition module performs real-time voice monitoring in a common environment and determines whether a voice signal needs to activate a sound source localization pickup module; the sound source localization pickup module receives the sound source data of the interactive target and estimates the time delay difference of arrival; from the time delay difference of arrival, multiple hyperboloids are constructed in combination with the position of the microphone array to determine the position of the sound source of the interactive target and to obtain the voice signal of that sound source; a preprocessing module preprocesses the voice signal of the interactive target sound source; and a voiceprint password recognition module extracts the characteristic parameters of the voice signal and matches the characteristic parameters with the instruction text of the recording library. The present invention addresses the power consumption of the system, locates and picks up the increment of the voice burst under the low signal-to-noise ratio of other human voices, echoes, and reverberations, and solves the above-mentioned current problems through three safe voice recognition and control operations.
Owner:SOUTH CHINA UNIV OF TECH

Method for identifying conference speech as text, electronic device and storage medium

The invention relates to a method for identifying conference speech as text. The method comprises the following steps: the to-be-identified conference speech is converted into text through speech identification technology as an initial speech identification text; the initial speech identification text is matched with a preset text database to obtain a matched speech identification text; a speech identification text draft in an editable state is generated according to the matched speech identification text; and after an editing operation on the speech identification text draft is detected, a speech identification text in an uneditable state is generated from the edited draft as the final speech identification text. The invention further provides an electronic device for identifying conference speech as text, and a storage medium. The method is advantageous in that, after preliminary identification of the to-be-identified speech, a first check is performed by matching against the preset text database and a second confirmation is performed manually, so that correctness of the output text is effectively ensured, the proofreading workload for conference content is reduced, and efficiency is improved.
Owner:PING AN TECH (SHENZHEN) CO LTD

System and method of determining phrasing in text

A system analyzes text, determines phrasing and, in an exemplary embodiment, reformats the text to establish optimal spacing and related features for readability, reader comprehension and publishing economies. A neural network uses a library of text data to analyze text and determine phrases. Formatting emphasizes phrases using one or more of a plurality of techniques including word spacing, text darkness and controlling line breaks.
Owner:LANGUAGE TECH

Method of item-all-weighted positive or negative association model mining between text terms and mining system applied to method

The invention discloses a method of item-all-weighted positive or negative association model mining between text terms and a mining system applied to the method. The method comprises the following steps of preprocessing by using a Chinese text preprocessing module to establish a text database and a feature word item library; mining item-all-weighted feature word candidate item sets from the text database by utilizing a feature word frequent item set and negative item set mining implementation module, calculating a weight dimension ratio, and cutting out uninteresting item sets by adopting a multi-interestingness threshold value pruning strategy to obtain an interesting item-all-weighted feature word frequent item set and negative item set model; mining an effective item-all-weighted positive or negative association rule model from frequent item sets and negative item sets by utilizing an item-all-weighted positive or negative association rule mining implementation module between terms, and outputting the mined positive or negative association rule model to a user by utilizing an item-all-weighted association model result display module between terms. By applying the method and the system, unnecessary frequent item sets, negative item sets and association rule models can be greatly reduced, Chinese feature word association rule mining efficiency is improved and a high-quality association model between Chinese terms is obtained.
Owner:GUANGXI UNIVERSITY OF FINANCE AND ECONOMICS

Text semantic analysis method, text semantic analysis terminal and storage medium

Active | CN107704453A | Correct understanding | Correct usability | Metadata text retrieval | Semantic analysis | Text database | Syntax error
The invention provides a text semantic analysis method, a text semantic analysis terminal and a storage medium. The method comprises the steps of receiving text information input by a user, dividing a character string contained in the text information into independent words, and obtaining a word sequence; performing grammar analysis on the divided word sequence, and judging whether a grammar error exists in the word sequence or not; and converting the words contained in the word sequence into corresponding metadata, calculating semantic similarity between the metadata and feature item weights, extracting a keyword feature item of the word sequence, obtaining a semantic tagged text corresponding to each word, establishing a text database, performing matching in the text database in sequence according to an arrangement sequence of the words in the word sequence to obtain the semantic tagged texts, and performing output display on text information synthesized after sorting. The information is fed back to the user in a metadata format, so that the user can obtain the information fed back by the semantic analysis terminal conveniently and can correctly understand and use the information.
Owner:深圳市前海众兴科研有限公司

Article generation method and device, computer equipment and storage medium

The invention relates to an article generation method and device, computer equipment and a storage medium, and relates to the technical field of the internet. The method comprises the following steps: obtaining an article generation template, wherein the article generation template comprises paragraph structure information of a to-be-generated article and target text labels corresponding to the paragraphs in the to-be-generated article, the paragraph structure information indicates the ordering of the paragraphs included in the to-be-generated article, and the target text label corresponding to each paragraph indicates the type of text content included in that paragraph; querying a text database to obtain the target texts corresponding to the target text labels of the paragraphs, wherein the text database stores the correspondence between texts and text labels; and finally, generating an article according to the queried target texts and the paragraph structure information. The method provided by the invention can improve the quality of the generated article.
Owner:浙江大搜车软件技术有限公司

Method and system for improving data quality in large hyperlinked text databases using pagelets and templates

A computing system and method clean a set of hypertext documents to minimize violations of a Hypertext Information Retrieval (IR) rule set. Then, the system and method performs an information retrieval operation on the resulting cleaned data. The cleaning process includes decomposing each page of the set of hypertext documents into one or more pagelets; identifying possible templates; and eliminating the templates from the data. Traditional IR search and mining algorithms can then be used to search on the remaining pagelets, as opposed to the original pages, to provide cleaner, more precise results.
Owner:IBM CORP
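
The cleaning process in the abstract above (decompose pages into pagelets, identify templates, eliminate them) can be sketched as follows. The sketch assumes that blank-line-separated blocks stand in for pagelets and that any pagelet repeated verbatim on at least `min_pages` pages is a template. Both are simplifying assumptions and the names are hypothetical. The patent's actual decomposition and template-detection rules are not given in the abstract.

```python
from collections import Counter


def split_into_pagelets(page):
    """Hypothetical decomposition: blank-line-separated blocks act as pagelets."""
    return [block.strip() for block in page.split("\n\n") if block.strip()]


def remove_templates(pages, min_pages=2):
    """Drop pagelets that appear on at least min_pages pages; keep the rest."""
    page_counts = Counter()
    for page in pages:
        page_counts.update(set(split_into_pagelets(page)))
    templates = {p for p, count in page_counts.items() if count >= min_pages}
    return [[p for p in split_into_pagelets(page) if p not in templates]
            for page in pages]
```

Indexing the remaining pagelets rather than whole pages keeps shared navigation bars and footers from matching every query.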