524 results about "Co-occurrence" patented technology

In linguistics, co-occurrence (or cooccurrence) is the above-chance frequency with which two terms from a text corpus appear alongside each other in a certain order; it is also known as coincidence or concurrence. Co-occurrence in this linguistic sense can be interpreted as an indicator of semantic proximity or of an idiomatic expression. Corpus linguistics and its statistical analyses reveal patterns of co-occurrence within a language and make it possible to work out typical collocations for its lexical items. A co-occurrence restriction is identified when linguistic elements never occur together. Analysis of these restrictions can lead to discoveries about the structure and development of a language.
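One common way to quantify "above-chance" co-occurrence is pointwise mutual information (PMI), which compares how often an ordered pair is observed with how often it would be expected if the two terms were independent. The sketch below is only an illustration under that assumption; the function name `cooccurrence_pmi` and the `window` parameter are hypothetical and do not come from any of the patents listed here.

```python
import math
from collections import Counter

def cooccurrence_pmi(tokens, window=2):
    """Return PMI for ordered pairs (a, b) where b follows a within `window` tokens."""
    unigrams = Counter(tokens)
    pairs = Counter()
    for i, a in enumerate(tokens):
        # Count every term that follows `a` inside the window.
        for b in tokens[i + 1 : i + 1 + window]:
            pairs[(a, b)] += 1
    n_tokens = len(tokens)
    n_pairs = sum(pairs.values())
    pmi = {}
    for (a, b), count in pairs.items():
        p_ab = count / n_pairs            # observed pair probability
        p_a = unigrams[a] / n_tokens      # marginal probability of a
        p_b = unigrams[b] / n_tokens      # marginal probability of b
        pmi[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return pmi
```

A positive PMI suggests that the pair occurs together more often than chance would predict; a pair that never appears in the result is a candidate co-occurrence restriction, although a small corpus cannot establish that on its own.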

Predictive modeling of consumer financial behavior using supervised segmentation and nearest-neighbor matching

Predictive modeling of consumer financial behavior, including determination of likely responses to particular marketing efforts, is provided by application of consumer transaction data to predictive models associated with merchant segments. The merchant segments are derived from the consumer transaction data based on co-occurrences of merchants in sequences of transactions. Merchant vectors represent specific merchants, and are aligned in a vector space as a function of the degree to which the merchants co-occur more or less frequently than expected. Consumer vectors are developed within the vector space, to represent interests of particular consumers by virtue of relative vector positions of consumer and merchant vectors. Various techniques, including clustering, supervised segmentation, and nearest-neighbor analysis, are applied separately or in combination to generate improved predictions of consumer behavior.
Owner:CALLAHAN CELLULAR L L C

Predictive modeling of consumer financial behavior using supervised segmentation and nearest-neighbor matching

Predictive modeling of consumer financial behavior, including determination of likely responses to particular marketing efforts, is provided by application of consumer transaction data to predictive models associated with merchant segments. The merchant segments are derived from the consumer transaction data based on co-occurrences of merchants in sequences of transactions. Merchant vectors represent specific merchants, and are aligned in a vector space as a function of the degree to which the merchants co-occur more or less frequently than expected. Supervised segmentation is applied to merchant vectors to form the merchant segments. Merchant segment predictive models provide predictions of spending in each merchant segment for any particular consumer, based on previous spending by the consumer. Consumer profiles describe summary statistics of each consumer's spending in the merchant segments, and across merchant segments. The consumer profiles include consumer vectors derived as summary vectors of selected merchants patronized by the consumer. Predictions of consumer behavior are made by applying nearest-neighbor analysis to consumer vectors, thus facilitating the targeting of promotional offers to consumers most likely to respond positively.
Owner:CALLAHAN CELLULAR L L C

Context vector generation and retrieval

A system and method for generating context vectors for use in storage and retrieval of documents and other information items. Context vectors represent conceptual relationships among information items by quantitative means. A neural network operates on a training corpus of records to develop relationship-based context vectors based on word proximity and co-importance using a technique of “windowed co-occurrence”. Relationships among context vectors are deterministic, so that a context vector set has one logical solution, although it may have a plurality of physical solutions. No human knowledge, thesaurus, synonym list, knowledge base, or conceptual hierarchy, is required. Summary vectors of records may be clustered to reduce searching time, by forming a tree of clustered nodes. Once the context vectors are determined, records may be retrieved using a query interface that allows a user to specify content terms, Boolean terms, and / or document feedback. The present invention further facilitates visualization of textual information by translating context vectors into visual and graphical representations. Thus, a user can explore visual representations of meaning, and can apply human visual pattern recognition skills to document searches.
Owner:FAIR ISAAC & CO INC
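The abstract above describes a neural network trained with "windowed co-occurrence". As a much simpler stand-in (not the patented method), the sketch below builds raw count vectors from a symmetric window and compares them with cosine similarity. The names `context_vectors` and `cosine` and the window size are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Words that appear in similar surroundings end up with similar vectors, which is the intuition behind retrieving records by conceptual relationship rather than exact term match.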

Method for data and text mining and literature-based discovery

Text searching is achieved by techniques including phrase frequency analysis and phrase-co-occurrence analysis. In many cases, factor matrix analysis is also advantageously applied to select high technical content phrases to be analyzed for possible inclusion within a new query. The described techniques may be used to retrieve data, determine levels of emphasis within a collection of data, determine the desirability of conflating search terms, detect symmetry or asymmetry between two text elements within a collection of documents, generate a taxonomy of documents within a collection, and perform literature-based problem solving. (This abstract is intended only to aid those searching patents, and is not intended to limit the disclosure of claims in any manner.)
Owner:NAVY UNITED STATES OF AMERICA AS REPRESENTED BY THE SECY OF THE

Method and apparatus for recommendation engine using pair-wise co-occurrence consistency

The invention, referred to herein as PeaCoCk, uses a unique blend of technologies from statistics, information theory, and graph theory to quantify and discover patterns in relationships between entities, such as products and customers, as evidenced by purchase behavior. In contrast to traditional purchase-frequency based market basket analysis techniques, such as association rules which mostly generate obvious and spurious associations, PeaCoCk employs information-theoretic notions of consistency and similarity, which allows robust statistical analysis of the true, statistically significant, and logical associations between products. Therefore, PeaCoCk lends itself to reliable, robust predictive analytics based on purchase-behavior.
Owner:FAIR ISAAC & CO INC

Automated creation of phonemic variations

A method of generating a phonemic transcription for a word using a computer system is described. In one embodiment, an existing pronunciation generation program is applied to generate an initial transcription. The initial transcription can then be evaluated to identify likely bad pronunciations by looking for phonotactically impossible co-occurrences. Additionally, one or more rules can be applied to generate additional phonemic transcriptions. The resulting transcriptions may be used in place of the initial transcription and / or in addition to the initial transcription. Additionally, when multiple transcriptions result, the transcriptions are ordered according to preference and / or likelihood of use.
Owner:MICROSOFT TECH LICENSING LLC

Suggesting targeting information for ads, such as Websites and/or categories of Websites for example

One or more keywords and / or information about one or more properties may be accepted, and a set of one or more taxonomy categories may be determined using at least some of the keyword(s) and / or property information. Each of the taxonomy categories may be a vertical category, and at least one of the set of one or more determined taxonomy categories may be presented to an advertising user as an ad targeting suggestion. Each of the taxonomy categories may have associated with it at least one property (e.g., Web document) that participates in an advertising network. An advertiser selection of a suggested taxonomy category may be accepted, and the serving of an ad of the advertiser may be targeted to each of the at least one property (e.g., Web document) associated with the selected suggested taxonomy category. An offer for association with the selected suggested taxonomy category may be provided by the advertiser. A set of one or more properties (e.g., Web documents) may be determined using at least some of the determined one or more taxonomy categories. Such properties (perhaps along with viewing information) may be presented to an advertising user as an ad targeting suggestion. A suggested property (e.g., Web document) may be selected by a user. If so, the serving of an ad of the advertiser may be targeted to the selected suggested property. An offer for association with the selected suggested document may be accepted from the advertiser. The set of one or more taxonomy categories may be determined by first determining a set of one or more semantic clusters (e.g., term co-occurrence clusters) using the accepted keyword(s) and / or property information, and then determining a set of one or more taxonomy categories using at least some of the one or more semantic clusters.
Owner:GOOGLE LLC

Categorizing objects, such as documents and/or clusters, with respect to a taxonomy and data structures derived from such categorization

A Website may be automatically categorized by (a) accepting Website information, (b) determining a set of scored clusters (e.g., semantic, term co-occurrence, etc.) for the Website using the Website information, and (c) determining at least one category (e.g., a vertical category) of a predefined taxonomy using at least some of the set of clusters. A semantic cluster (e.g., a term co-occurrence cluster) may be automatically associated with one or more categories (e.g., vertical categories) of a predefined taxonomy by (a) accepting a semantic cluster, (b) identifying a set of one or more scored concepts using the accepted cluster, (c) identifying a set of one or more categories using at least some of the one or more scored concepts, and (d) associating at least some of the one or more categories with the semantic cluster. A property (e.g., a Website) may be associated with one or more categories (e.g., vertical categories) of a predefined taxonomy by (a) accepting information about the property, (b) identifying a set of one or more scored semantic clusters (e.g., term co-occurrence clusters) using the accepted property information, (c) identifying a set of one or more categories (e.g., vertical categories) using at least some of the one or more scored semantic clusters, and (d) associating at least some of the one or more categories with the property.
Owner:GOOGLE LLC

System and method of context-specific searching in an electronic database

A user can search a database within a “context” that can be invoked with a context term, or name. The context is pre-defined by a human expert, or curator. The context definition is used in conjunction with a search term provided by the user to efficiently obtain search results that can otherwise be difficult to attain, such as detecting characteristics of data over multiple documents or other database items to infer trends, phenomena, characteristics, or other properties of the data. A context can be a category of items where each item has a distinct name. Search results are presented using the context based on the number of co-occurrences of the search term and terms relating to the context. In a preferred embodiment, the search results are presented as a list with documents having higher co-occurrences ordered at the top of the list. Context definition sets can be created and updated as an ongoing service to a subscriber. Several processing configurations are presented.
Owner:RGT UNIV OF CALIFORNIA
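The ranking idea in the abstract above (documents with more co-occurrences of the search term and the context terms are listed first) can be sketched as follows. This is a minimal illustration that assumes sentence-level co-occurrence and whitespace tokenization; `rank_by_context` and its data layout are hypothetical, not the patent's implementation.

```python
def rank_by_context(documents, search_term, context_terms):
    """Rank document ids by how often `search_term` shares a sentence with a context term.

    `documents` maps a document id to a list of sentence strings.
    """
    context = set(context_terms)
    scores = {}
    for doc_id, sentences in documents.items():
        count = 0
        for sentence in sentences:
            words = set(sentence.lower().split())
            if search_term in words:
                # Each context term in the same sentence counts as one co-occurrence.
                count += len(words & context)
        scores[doc_id] = count
    # Highest co-occurrence count first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Here the curated "context" is just a list of terms; in practice it would be the expert-defined context definition set mentioned in the abstract.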

Method and apparatus for retail data mining using pair-wise co-occurrence consistency

The invention, referred to herein as PeaCoCk, uses a unique blend of technologies from statistics, information theory, and graph theory to quantify and discover patterns in relationships between entities, such as products and customers, as evidenced by purchase behavior. In contrast to traditional purchase-frequency based market basket analysis techniques, such as association rules which mostly generate obvious and spurious associations, PeaCoCk employs information-theoretic notions of consistency and similarity, which allows robust statistical analysis of the true, statistically significant, and logical associations between products. Therefore, PeaCoCk lends itself to reliable, robust predictive analytics based on purchase-behavior.
Owner:FAIR ISAAC & CO INC

Keyword Extracting Device

A keyword extracting device includes high-frequency term extracting means (30) for extracting high-frequency terms which are index terms having a great weight among the index terms in a document group (E) including a plurality of documents (D), the weight including evaluation on the level of an appearance frequency of each index term, clustering means (50) for clustering the high-frequency terms on the basis of a co-occurrence degree (C), which is based on the presence / absence of the co-occurrence of each document with the index terms (w) in the document group (E) in each document, score calculating means (70) for calculating a score key(w) of each index term (w) such that a high score is given to the index term among the index terms (w) that co-occurs with the high-frequency term belonging to more clusters (g) and that co-occurs with the high-frequency term in more documents (D), and keyword extracting means (90) for extracting keywords on the basis of the scores. Accordingly, the keywords indicating a feature of a document group including a plurality of documents can be automatically extracted.
Owner:INTPROP BANK CORP (JP)

Text content auditing method and system based on sensitive word

The invention discloses a text content auditing method based on sensitive words. The method comprises the following steps: receiving a text to be audited, parsing and segmenting the text into words, and obtaining all keywords in the text; querying a preset sensitive word database with the keywords to obtain the sensitive words in the text, wherein the sensitive word database comprises sensitive words and their synonyms or near-synonyms; obtaining the keywords that co-occur with each sensitive word within a preset text length, calculating the violation weight of the sensitive word and its co-occurring keywords, and judging whether the violation weight is greater than a preset violation threshold; and, if the violation weight is greater than the threshold, determining that the text is a violating text, and otherwise determining that it is a normal text. The method effectively lowers the probability of misjudgment, improves auditing accuracy, and responds quickly to anagrams and newly coined internet words.
Owner:DATAGRAND TECH INC
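The auditing steps in the abstract above can be sketched roughly as follows. The abstract does not say how the violation weight is calculated, so this sketch assumes a simple additive score: a base weight per sensitive word plus a weight for each co-occurring keyword found within the window. The function name, the weight tables, and the default values are all hypothetical.

```python
def audit_text(tokens, sensitive_weights, cooccur_weights, window=3, threshold=1.0):
    """Return True if any sensitive word's violation weight exceeds `threshold`.

    tokens: list of already segmented words.
    sensitive_weights: base weight per sensitive word (assumed scheme).
    cooccur_weights: weight per (sensitive word, neighbor) pair (assumed scheme).
    """
    for i, token in enumerate(tokens):
        if token not in sensitive_weights:
            continue
        # Keywords within `window` positions on either side of the sensitive word.
        lo = max(0, i - window)
        neighbors = tokens[lo:i] + tokens[i + 1 : i + 1 + window]
        score = sensitive_weights[token] + sum(
            cooccur_weights.get((token, n), 0.0) for n in neighbors
        )
        if score > threshold:
            return True
    return False
```

Letting some co-occurring keywords carry negative weights is one way context could lower the misjudgment rate: the same sensitive word scores differently next to "photo" than next to "gun".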

Speech recognition system and method for generating phonotic estimates

A speech recognition system for transforming an acoustic signal into a stream of phonetic estimates includes a frequency analyzer for generating a short-time frequency representation of the acoustic signal. A novelty processor separates background components of the representation from region of interest components of the representation. The output of the novelty processor includes the region of interest components of the representation according to the novelty parameters. An attention processor produces a gating signal as a function of the novelty output according to attention parameters. A coincidence processor produces information regarding co-occurrences between samples of the novelty output over time and frequency. The coincidence processor selectively gates the coincidence output as a function of the gating signal according to one or more coincidence parameters. A vector pattern recognizer and a probability processor receive the gated coincidence output and produce a phonetic estimate stream representative of the acoustic signal.
Owner:ELIZA

Method of indexing and retrieval of electronically-stored documents

A document indexing and retrieval system and method which assigns weights to the key words and assigns a relative value to pairs of key words (i.e. defines a relative relation on KxK) based on their frequency of occurrence and co-occurrence in the document data base. In response to a query both the weights and this relative relation are used to suggest additional and / or alternative key words which are very likely to find relevant documents. Documents are then ranked by number of hits adjusted for the weights of hit words and their relative values.
Owner:KAGENECK KARL ERBO G +1

System and method for identifying critical features in an ordered scale space within a multi-dimensional feature space

A system and method for identifying critical features in an ordered scale space within a multi-dimensional feature space is described. Features are extracted from a plurality of data collections. Each data collection is characterized by a collection of features semantically related by a grammar. Each feature is normalized, and frequencies of occurrence and co-occurrence for the feature are determined for each of the data collections. The occurrence frequencies and the co-occurrence frequencies for each of the features are mapped into a set of patterns of occurrence frequencies and a set of patterns of co-occurrence frequencies. The pattern for each data collection is selected, and distance (similarity) measures between the occurrence frequencies in the selected pattern are calculated. The occurrence frequencies are projected onto a one-dimensional document signal in order of relative decreasing similarity using the similarity measures. Wavelet and scaling coefficients are derived from the one-dimensional document signal using multiresolution analysis.
Owner:NUIX NORTH AMERICA +1

Ad Relevance In Sponsored Search

Techniques for improving advertisement relevance for sponsored search advertising. The method includes steps for processing a click history data structure containing at least a plurality of query-advertisement pairs, populating a first translation table containing a co-occurrence count field, populating a second translation table containing an expected clicks field, and calculating a click propensity score for an advertisement using the click history data structure, the first translation table (for determining overall click likelihood across all historical traffic), and using the second translation table (for removing biases present in the first translation table). Other method steps calculate a second click propensity score for a second advertisement, then ranking the first advertisement relative to the second advertisement for comparing a click propensity score to a threshold for filtering low quality ad candidates from a plurality of ad candidates, and then ranking advertisements for optimizing placement of ads on a sponsored search display page.
Owner:YAHOO INC

Method and system for recommending query based on user log

The invention discloses a method and system for recommending queries based on a user log. The method comprises acquiring an effective query log set from the data set in the user log; selecting typical query strings as the training set, extracting six feature indexes of each query string in the effective query log set (support degree, popularity, recommendation degree, co-occurrence degree, similarity, and association degree), and constructing a composite prediction model based on the training set; and extracting the six feature indexes of candidate query strings input by a user, feeding the extracted feature indexes into the composite prediction model as variables, calculating the relevancy between each candidate query string and a given query string, and outputting the n highest-ranked query strings. The system comprises a data preparation module, a prediction model construction module, and a processing output module for realizing the above method. By fully utilizing the user log of a search engine, the method and system can recommend higher-quality query strings to the user.
Owner:PEKING UNIV

System and method for playlist generation based on similarity data

System, method and computer program for facilitating media playlist generation based at least in part on media library inventory information provided by a plurality of program participants. Data is transmitted from a program participant's client device indicative of media inventory in a media library of the program participant. Media item similarity ratings are received at the client device that have been compiled based on cumulative data collected from a plurality of program participants, including identification data of individual media items contained in media libraries of the program participants regardless of each individual media item's source. Similarity ratings compilation includes processing the cumulative data to determine an incidence of co-occurrence of pairs of individual media items in different program participants' media libraries and making an assignment of a similarity rating based on the determined incidence of co-occurrence.
Owner:APPLE INC
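The abstract above assigns a similarity rating from the incidence of co-occurrence of media item pairs across participants' libraries, without stating a formula. The sketch below assumes a Jaccard-style rating (libraries containing both items divided by libraries containing either); the function name and that choice of rating are illustrative assumptions only.

```python
from collections import Counter
from itertools import combinations

def similarity_ratings(libraries):
    """Rate each item pair by (libraries with both) / (libraries with either)."""
    pair_counts = Counter()
    item_counts = Counter()
    for library in libraries:
        items = sorted(set(library))  # ignore duplicates within one library
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    ratings = {}
    for (a, b), both in pair_counts.items():
        either = item_counts[a] + item_counts[b] - both
        ratings[(a, b)] = both / either
    return ratings
```

Normalizing by the number of libraries that contain either item keeps very popular items from appearing similar to everything merely because they are in most libraries.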

Methods and systems for identification of DNA patterns through spectral analysis

Spectrogram extraction from DNA sequences has been known since 2001. A DNA spectrogram is generated by applying a Fourier transform to convert a symbolic DNA sequence consisting of the letters A, T, C, G into a visual representation that highlights periodicities of co-occurrence of DNA patterns. Given a DNA sequence or whole genomes, it is easy with this method to generate a large number of spectrogram images. However, the difficult part is to elucidate where the repetitive patterns are and to associate a biological and clinical meaning with them. The present disclosure provides systems and methods that facilitate the location and / or identification of repetitive DNA patterns, such as CpG islands, Alu repeats, tandem repeats and various types of satellite repeats. These repetitive elements can be found within a chromosome, within a genome or across genomes of various species. The disclosed systems and methods apply image processing operators to find prominent features in the vertical and horizontal direction of the DNA spectrograms. Systems and methods for fast, full scale analysis of the derived images using supervised machine learning methods are also disclosed. The disclosed systems and methods for detecting and / or classifying repetitive DNA patterns include: (a) comparative histogram method, (b) feature selection and classification using support vector machines and genetic algorithms, and (c) generation of spectrovideo from a plurality of spectral images.
Owner:KONINKLIJKE PHILIPS ELECTRONICS NV

Automatic dynamic contextual data entry completion system

A method, performed in a character entry system, for interrelating character strings so that incomplete input character strings can be completed by a selection of a presented character string involves: computing contextual associations between multiple character strings based upon co-occurrence of character strings relative to each other in documents present in the character entry system; in response to inputting of a specified threshold of individual characters, identifying at least one selectable character string from among the computed contextual associations that can complete the incomplete input character string in context; and providing the identified at least one selectable character string to a user for selection.
Owner:MOUNTECH IP LLC

Industry dictionary generating method and device

The invention provides a method and a device for generating an industry dictionary. The method comprises the following steps: acquiring a document collection corresponding to initial industry terms; acquiring candidate terms from the document collection; performing industry relevance analysis on the candidate terms to acquire relevant candidate terms; performing co-occurrence analysis and association mining on the relevant candidate terms to generate industry vocabulary; and adding the industry vocabulary to the industry dictionary. With this technical scheme, industry dictionaries can be generated automatically, and the problems of high cost and low efficiency that arise in the prior art when workers search for industry vocabulary manually are solved.
Owner:QUNAR CAYMAN ISLANDS

Assistive call center interface

Unstructured voice information from an incoming caller is processed by an automatic speech recognition and semantic categorization system to convert the information into structured data that may then be used to access one or more databases to retrieve associated supplemental data. The structured data and associated supplemental data are then made available through a presentation system that provides information to the call center agent and, optionally, to the incoming caller. The system thus allows a call center information processing system to handle unstructured voice input for use by the live agent in handling the incoming call and for storage and retrieval at a later time. The semantic analysis system may be implemented by a global parser or by an information retrieval technique, such as latent semantic analysis. Co-occurrence of keywords may be used to associate prior calls with an incoming call to assist in understanding the purpose of the incoming call.
Owner:PANASONIC CORP

Citation-based information retrieval system and method

Disclosed are a method, machine-readable code, and a system for matching one or more citation tags with citation-rich documents or with professionals who are associated with a group of citation tags. The method takes a user input that can be converted to one or more primary search tags, and accesses a matrix of pair-wise tag co-occurrence values that are related to the co-occurrence of each pair of tags extracted from documents contained in a collection of citation-rich documents, to identify, for each primary tag received, those secondary tags whose pair-wise co-occurrence values with respect to the primary tag are above a selected threshold value. A tag search vector constructed from the secondary tags and, optionally, the primary vectors is used in a database search to identify those documents or professionals having the highest tag-matching score with respect to the tag search vector, and these results are then displayed to the user.
Owner:WORD DATA
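The two stages in the abstract above (expanding primary tags into secondary tags through a thresholded co-occurrence matrix, then scoring documents against the resulting tag set) can be sketched as below. The nested-dictionary matrix layout, the function names, and the overlap-count score are assumptions for illustration; the abstract does not specify how the tag-matching score is computed.

```python
def expand_tags(primary_tags, cooccurrence, threshold):
    """Return secondary tags whose co-occurrence value with any primary tag exceeds `threshold`.

    `cooccurrence` is a nested dict: cooccurrence[tag_a][tag_b] -> value.
    """
    secondary = set()
    for primary in primary_tags:
        for tag, value in cooccurrence.get(primary, {}).items():
            if value > threshold and tag not in primary_tags:
                secondary.add(tag)
    return sorted(secondary)

def score_documents(doc_tags, search_tags):
    """Score each document by the number of search tags it carries, best first."""
    search = set(search_tags)
    scores = {doc: len(search & set(tags)) for doc, tags in doc_tags.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A typical call would pass the output of `expand_tags`, optionally together with the primary tags, as `search_tags` to `score_documents`.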

Crowd density estimation method and pedestrian volume statistical method based on video analysis

Active | CN103218816A | Avoid separate detection | Crowd density estimation real-time | Image enhancement | Image analysis | Spectral density estimation | Co-occurrence
The invention discloses a crowd density estimation method based on video analysis and a pedestrian volume statistical method based on video analysis. The crowd density estimation method includes the following steps: (1) off-line training: manually counting crowd density data, extracting features, and training a model; and (2) on-line estimation: extracting the features and performing regression prediction using the trained model parameters. The pedestrian volume statistical method includes the step of establishing a robust relationship between a scene and the number of people crossing a line by combining the crowd density with the micro-region pedestrian flow speed before the line is crossed. Features such as foregrounds, edges, and gray-scale co-occurrence matrices are extracted over the whole area to estimate crowd density; mixing these features handles problems such as dense crowds and occlusion well, and real-time crowd density estimation is achieved. In addition, on the basis of the area crowd density estimate, pedestrian volume is estimated by combining it with the pedestrian flow speed obtained from optical flow, so that detection and tracking of a large number of individuals in a complex environment are avoided, and accurate, robust two-way pedestrian volume counting in dense crowds is achieved.
Owner:SUN YAT SEN UNIV
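One of the features named in the abstract above is the gray-scale co-occurrence matrix, which counts how often a pixel of gray level i has a neighbor of gray level j at a fixed offset. The sketch below is a plain-Python illustration; the function names, the default offset of one pixel to the right, and the choice of contrast as the derived texture statistic are assumptions, not details taken from the patent.

```python
def glcm(image, levels, dx=1, dy=0):
    """Build a gray-level co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast: sum of (i - j)^2 weighted by the normalized pair counts."""
    total = sum(sum(row) for row in matrix)
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
```

Statistics such as contrast summarize texture over a region, which fits the abstract's approach of extracting features over the whole area instead of detecting each person separately.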

Apparatus and method for supporting document data search

In a search support server, a related word extraction unit generates frequency information and co-occurrence information of keywords, a graph generation unit generates coordinate information of a spring graph including the keywords as nodes, on the basis of the co-occurrence information, a cluster generation unit groups the nodes into clusters and thereby generates cluster definition information, and a display information generation unit generates display information of the spring graph. In addition, an operation determination unit determines which operation is performed on the spring graph. Then, when a level change is instructed, the display information generation unit generates display information of the spring graph after the level is changed. When a node change is instructed, a cluster re-generation unit changes the cluster definition information and the frequency information. When a search query generation is instructed, a search query generation unit generates a search query with a keyword of a selected cluster.
Owner:IBM CORP