699 results about "Frequency of occurrence" patented technology

Frequency of occurrence (FO) is a simple metric measuring the proportion of samples (often expressed as a percentage) where a certain item is present.
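
As a minimal sketch of the metric described above, the following Python computes FO as the share of samples in which an item appears at least once. The function name and the sample data are hypothetical, chosen only for illustration.

```python
def frequency_of_occurrence(samples, item):
    """Return the fraction of samples that contain `item` at least once."""
    if not samples:
        raise ValueError("at least one sample is required")
    present = sum(1 for sample in samples if item in sample)
    return present / len(samples)


# Hypothetical data: each sample is the set of items observed in it.
samples = [{"a", "b"}, {"b"}, {"a", "c"}, {"c"}]
fo = frequency_of_occurrence(samples, "a")
print(f"FO of 'a': {fo:.0%}")
```

Note that FO counts presence per sample, not the total number of times the item appears.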

Method and system for optimally searching a document database using a representative semantic space

A term-by-document matrix is compiled from a corpus of documents representative of a particular subject matter that represents the frequency of occurrence of each term per document. A weighted term dictionary is created using a global weighting algorithm and then applied to the term-by-document matrix forming a weighted term-by-document matrix. A term vector matrix and a singular value concept matrix are computed by singular value decomposition of the weighted term-document index. The k largest singular concept values are kept and all others are set to zero thereby reducing to the concept dimensions in the term vector matrix and a singular value concept matrix. The reduced term vector matrix, reduced singular value concept matrix and weighted term-document dictionary can be used to project pseudo-document vectors representing documents not appearing in the original document corpus in a representative semantic space. The similarities of those documents can be ascertained from the position of their respective pseudo-document vectors in the representative semantic space.
Owner:KLDISCOVERY ONTRACK LLC
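
The pipeline in this abstract (weight a term-by-document matrix, truncate its singular value decomposition to k dimensions, then project unseen documents into the reduced space) can be sketched with NumPy. This is an illustration under assumptions, not the patented implementation: the abstract does not say which global weighting is used, so a smoothed inverse-document-frequency weight is assumed, and the function names and example matrix are hypothetical.

```python
import numpy as np


def build_semantic_space(term_doc, k):
    """Weight a term-by-document count matrix and keep the k largest singular values."""
    n_docs = term_doc.shape[1]
    doc_freq = np.count_nonzero(term_doc, axis=1)
    # Assumed global weighting: smoothed inverse document frequency.
    global_w = np.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    weighted = term_doc * global_w[:, None]
    U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
    return global_w, U[:, :k], s[:k], Vt[:k, :].T


def project(term_counts, global_w, U_k, s_k):
    """Map a raw term-count vector to a pseudo-document vector in the reduced space."""
    return (term_counts * global_w) @ U_k / s_k


def cosine(a, b):
    """Cosine similarity between two vectors in the reduced space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical corpus: rows are terms, columns are documents.
term_doc = np.array([[2, 0, 1],
                     [1, 0, 0],
                     [0, 3, 1],
                     [0, 1, 0]], dtype=float)
global_w, U_k, s_k, V_k = build_semantic_space(term_doc, k=2)

# A document outside the corpus, expressed as counts over the same four terms.
new_doc = np.array([1, 1, 0, 0], dtype=float)
pseudo = project(new_doc, global_w, U_k, s_k)
print([round(cosine(pseudo, doc_vec), 3) for doc_vec in V_k])
```

Projecting a column of the original matrix reproduces that document's row in `V_k`, which is a quick consistency check on the projection formula.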

Synthesis unit selection apparatus and method, and storage medium

Input text data undergoes language analysis to generate prosody, and a speech database is searched for a synthesis unit on the basis of the prosody. A modification distortion of the found synthesis unit, and concatenation distortions upon connecting that synthesis unit to those in the preceding phoneme are computed, and a distortion determination unit weights the modification and concatenation distortions to determine the total distortion. An Nbest determination unit obtains N best paths that can minimize the distortion using the A* search algorithm, and a registration unit determination unit selects a synthesis unit to be registered in a synthesis unit inventory on the basis of the N best paths in the order of frequencies of occurrence, and registers it in the synthesis unit inventory.
Owner:CANON KK

System and method for performing efficient document scoring and clustering

A system and method for providing efficient document scoring of concepts within a document set is described. A frequency of occurrence of at least one concept within a document retrieved from the document set is determined. A concept weight is analyzed reflecting a specificity of meaning for the at least one concept within the document. A structural weight is analyzed reflecting a degree of significance based on structural location within the document for the at least one concept. A corpus weight is analyzed inversely weighing a reference count of occurrences for the at least one concept within the document. A score associated with the at least one concept is evaluated as a function of the frequency, concept weight, structural weight, and corpus weight.
Owner:NUIX NORTH AMERICA

Efficient scaling of nonscalable MPEG-2 Video

To reduce bandwidth of non-scalable MPEG-2 coded video, certain non-zero AC DCT coefficients for the 8x8 blocks are removed from the MPEG-2 coded video. In one implementation, high-frequency AC DCT coefficients are removed at the end of the coefficient scan order. This method requires the least computation and is most desirable if the reduced-bandwidth video is to be spatially sub-sampled. In another implementation, the smallest-magnitude AC DCT coefficients are removed. This method may produce an undesirable increase in the frequency of occurrence of escape sequences in the (run, level) coding. This frequency can be reduced by retaining certain non-zero AC DCT coefficients that are not the largest magnitude coefficients, and by increasing a quantization scale to reduce the coefficient levels. The reduced-bandwidth video can be used for a variety of applications, such as browsing for search and play-list generation, bit stream scaling for splicing, and bit-rate adjustment for services with limited resources and for multiplexing of transport streams.
Owner:EMC IP HLDG CO LLC

System and method for dynamically evaluating latent concepts in unstructured documents

A system and method for dynamically evaluating latent concepts in unstructured documents is disclosed. A multiplicity of concepts are extracted from a set of unstructured documents into a lexicon. The lexicon uniquely identifies each concept and a frequency of occurrence. A frequency of occurrence representation is created for the documents set. The frequency representation provides an ordered corpus of the frequencies of occurrence of each concept. A subset of concepts is selected from the frequency of occurrence representation filtered against a pre-defined threshold. A group of weighted clusters of concepts selected from the concepts subset is generated. A matrix of best fit approximations is determined for each document weighted against each group of weighted clusters of concepts.
Owner:NUIX NORTH AMERICA

Methods and systems for selecting a language for text segmentation

Methods and systems for selecting a language for text segmentation are disclosed. In one embodiment, at least a first candidate language and a second candidate language associated with a string of characters are identified, at least a first segmented result associated with the first candidate language and a second segmented result associated with the second candidate language are determined, a first frequency of occurrence for the first segmented result and a second frequency of occurrence for the second segmented result are determined, and an operable language is identified from the first candidate language and the second candidate language based at least in part on the first frequency of occurrence and the second frequency of occurrence.
Owner:GOOGLE LLC

Information management and retrieval

A method and apparatus are provided for extracting key terms from a data set. The method includes identifying a first set of one or more word groups, each of one or more words, that occur more than once in the data set, and removing from this first set a second set of word groups that are sub-strings of longer word groups in the first set. The remaining word groups are key terms. Each word group is weighted according to its frequency of occurrence within the data set. The weighting of any word group may be increased by the frequency of any sub-string of words occurring in the second set, and each weighting is then divided by the number of words in the word group. This weighting process operates to determine the order of occurrence of the word groups. Prefixes and suffixes are also removed from each word in the data set. This produces a neutral form of each word so that the weighting values are prefix and suffix independent.
Owner:BRITISH TELECOMM PLC
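
The key-term steps in this abstract can be sketched as follows. This is one possible reading, not the patented method: word groups are taken to be contiguous runs of up to `max_len` words (an assumed limit), a group's weight is its own frequency plus the frequencies of the removed sub-strings it contains, divided by its word count, and the prefix and suffix stripping step is omitted. All names are hypothetical.

```python
from collections import Counter


def extract_key_terms(text, max_len=3):
    """Return {key term: weight} for repeated word groups that are not sub-strings of longer ones."""
    words = text.lower().split()
    counts = Counter(
        tuple(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    )
    # First set: word groups that occur more than once.
    repeated = {group: c for group, c in counts.items() if c > 1}

    def contains(longer, shorter):
        m = len(shorter)
        return len(longer) > m and any(
            longer[i:i + m] == shorter for i in range(len(longer) - m + 1)
        )

    # Second set: groups that are sub-strings of a longer repeated group.
    substrings = {g for g in repeated if any(contains(h, g) for h in repeated)}
    key_terms = {g: c for g, c in repeated.items() if g not in substrings}

    weights = {}
    for group, freq in key_terms.items():
        bonus = sum(repeated[s] for s in substrings if contains(group, s))
        weights[" ".join(group)] = (freq + bonus) / len(group)
    return weights


text = "machine learning is fun and machine learning is hard"
print(extract_key_terms(text))
```

In this example only "machine learning is" survives as a key term, because every shorter repeated group is a sub-string of it.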

System and methods for automatic clustering of ranked and categorized search objects

A search results page includes multiple search lists generated by multiple clustering operations applied to an initial match set of documents selected based on a user query. A first result list is constructed by clustering a top-n set of documents by primary domain address and sorting based on extrinsic ranking factors such that the first list includes a ranked and ordered list of primary domain linked anchor text. A second result list is constructed by clustering the top-n set of documents based on a unified ranked occurrence of keywords within the top-n set of documents. The generated second list contains a plurality of cluster class references with each of the cluster class reference including a ranked ordered sub-list of the keywords occurring within the top-n set of documents and respectively associated with the cluster class reference, each of the keywords of the ranked ordered sub-lists including linking references to a corresponding one of the top-n set of documents. A third result list is constructed by clustering the top-n set of documents based on a ranked frequency of occurrence of internally linked anchor texts. The generated third result list includes the top-n set of the internally linked anchor texts and respective ranked and ordered sub-lists of linking references to primary domain Web-pages containing the corresponding one of the internally linked anchor texts.
Owner:YEBOL CORP

Estimating confidence for query revision models

An information retrieval system includes a query revision architecture that integrates multiple different query revisers, each implementing one or more query revision strategies. A revision server receives a user's query, and interfaces with the various query revisers, each of which generates one or more potential revised queries. The revision server evaluates the potential revised queries, and selects one or more of them to provide to the user. A session-based reviser suggests one or more revised queries, given a first query, by calculating an expected utility for the revised query. The expected utility is calculated as the product of a frequency of occurrence of the query pair and an increase in quality of the revised query over the first query.
Owner:GOOGLE LLC
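
The expected-utility calculation stated in this abstract is a simple product, sketched below. How the pair frequency and the quality scores are obtained is not described in the abstract, so the numbers, function names, and data layout here are hypothetical placeholders.

```python
def expected_utility(pair_frequency, quality_first, quality_revised):
    """Frequency of the (first, revised) query pair times the quality gain."""
    return pair_frequency * (quality_revised - quality_first)


def rank_revisions(quality_first, candidates):
    """Sort candidate revisions by expected utility, highest first.

    `candidates` maps each revised query to a (pair_frequency, quality) tuple.
    """
    scored = {
        query: expected_utility(freq, quality_first, quality)
        for query, (freq, quality) in candidates.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical session data for a first query with quality 0.25.
candidates = {
    "cheap flights paris": (0.5, 0.75),
    "flights paris france": (0.25, 0.5),
}
ranking = rank_revisions(0.25, candidates)
print(ranking)
```

A revision that users reach often but that barely improves quality can score lower than a rarer revision with a large quality gain, which is the point of multiplying the two factors.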

System and method for coordinating product inspection, repair and product maintenance

A system and method for coordinating product inspection, repair and product maintenance features a proactive, data-driven process where industry-wide maintenance baselines, problems, frequency-of-occurrence data and best-in-class repair processes are identified in a central, secure, structured database environment and thereby where the safety and operational and economic impact of a problem can be evaluated and acted upon. Using industry-wide data provided by users, the present system and method will link repair and inspection finds (non-routines), reason codes and severity data related to potential safety, reliability, cycle time and cost implications with the root cause analysis and certified best inspection / repair / preventative process instructions, materials, tools and / or equipment. This information, combined with a maintenance provider's, manufacturer's and OEM's information, will greatly improve safety, reliability, cycle time and budget performance on repair, maintenance and inspection of equipment utilizing the present system and method.
Owner:THE SUMMERS GRP

Television program recommendation system

The present invention relates to a television system (50) and a method for automatically suggesting suitable programs to a viewer from a large number of available programs. The system (50) includes a DTV-agent system (21). Title information and characteristics of programs are made available as EPG (Electronic Program Guide) data, which includes at least one Electronic Program Guide Database (22). A learning module (39) records characteristics associated with each program viewed by the user, and forms sets of these characteristics. The frequency of occurrence of each set is also determined. A recommendation module (40) uses a number of tasks to compile a list of viewer recommendations (67). Various tasks are defined, with each task defining a unique combination of a manner of ordering the viewer profile (500), and particular Relevance Filters for filtering the ordered viewer profile. Upon entry of a user request for a list of program recommendations (67), a search is performed of the EPG data for programs with characteristics that best match sets selected by the task(s). The user is notified of the availability of such programs, allowing selection of a particular program.
Owner:CANON KK

Randomly enabled bonus game with controllable frequency of occurrence

Method of conducting a game of chance having a randomly enabled bonus game with controllable frequency of occurrence, including the steps of: making a wager to initiate play on a base game; initiating a random bonus enablement determination which is separate from base game outcome; and allowing bonus game play only if the bonus is enabled.
Owner:ACRES JOHN F
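
A rough sketch of the three steps in this abstract: a wager starts the base game, a separate random draw decides whether the bonus is enabled, and bonus play is allowed only when that draw succeeds. The base-game odds, the function name, and the use of a single probability parameter to control bonus frequency are assumptions for illustration.

```python
import random


def play_round(wager, bonus_probability, rng):
    """Play one base game and make an independent bonus-enablement draw."""
    if wager <= 0:
        raise ValueError("a positive wager is required to start play")
    base_win = rng.random() < 0.4  # placeholder base game with hypothetical odds
    # Separate draw: bonus enablement does not depend on the base game outcome.
    bonus_enabled = rng.random() < bonus_probability
    return base_win, bonus_enabled


rng = random.Random(0)
rounds = [play_round(1, 0.1, rng) for _ in range(10_000)]
bonus_rate = sum(enabled for _, enabled in rounds) / len(rounds)
print(f"bonus enabled in {bonus_rate:.1%} of rounds")
```

Because the enablement draw is independent, changing `bonus_probability` adjusts how often the bonus occurs without touching the base game odds.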

Method for extracting feature word of text

The embodiment of the invention discloses a method for extracting subject headings of a text. The method comprises the following steps: a text to be processed is divided into combination sequences of the existing words; for each text to be processed, candidate character strings with a frequency of occurrence greater than a preset frequency in the text to be processed are found and extracted, and new words are filtered from the candidate character strings according to the lexicalization probability of the prefixes and / or suffixes of the candidate character strings; and subject headings of the text to be processed are extracted from the existing words and the new words according to the frequency of occurrences of the existing words and the new words. The invention ensures that the comprehensiveness of extracting subject headings from the text to be processed is improved.
Owner:SHENZHEN SHI JI GUANG SU INFORMATION TECH

Language recognition using sequence frequency

A system is provided for comparing an input query with a number of stored annotations to identify information to be retrieved from a database. The comparison technique divides the input query into a number of fixed-size fragments and identifies how many times each of the fragments occurs within each annotation using a dynamic programming matching technique. The frequencies of occurrence of the fragments in both the query and the annotation are then compared to provide a measure of the similarity between the query and the annotation. The information to be retrieved is then determined from the similarity measures obtained for all the annotations.
Owner:CANON KK
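
The fragment-frequency comparison in this abstract can be illustrated with a simplified sketch. The abstract counts fragment occurrences with a dynamic programming matching technique; the code below swaps in exact matching to stay short, and uses cosine similarity over the query's fragments as an assumed comparison measure. Function names and inputs are hypothetical.

```python
import math
from collections import Counter


def fragments(seq, size):
    """Split a sequence into overlapping fixed-size fragments."""
    return [tuple(seq[i:i + size]) for i in range(len(seq) - size + 1)]


def similarity(query, annotation, size=3):
    """Compare how often each query fragment occurs in the query and in the annotation."""
    q = Counter(fragments(query, size))
    a = Counter(fragments(annotation, size))
    dot = sum(q[f] * a[f] for f in q)
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_a = math.sqrt(sum(a[f] ** 2 for f in q))
    if norm_q == 0 or norm_a == 0:
        return 0.0
    return dot / (norm_q * norm_a)


# Hypothetical annotations; the best match is the one with the highest score.
annotations = ["holiday in paris", "tax return form"]
scores = {text: similarity("paris holiday", text) for text in annotations}
print(max(scores, key=scores.get))
```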

Text representation and method

A computer method for representing a natural-language document in a vector form suitable for text manipulation operations is disclosed. The method involves determining (a) for each of a plurality of terms composed of non-generic words and, optionally, proximately arranged word groups in the document, a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term, and optionally related to the inverse document frequency of that word in one or more libraries of texts. Also disclosed are a computer-readable code for carrying out the method, a computer system that employs the code, and a vector produced by the method.
Owner:WORD DATA
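
One way to read the selectivity value in this abstract is as a ratio of relative term frequencies: how often a term occurs in a library of texts from one field compared with a library from other fields. The sketch below assumes that reading; the exact function used in the patent is not given in the abstract, and the names and sample texts are hypothetical.

```python
def relative_frequency(term, texts):
    """Occurrences of `term` per word across a library of texts."""
    words = [word for text in texts for word in text.lower().split()]
    return words.count(term) / len(words) if words else 0.0


def selectivity(term, field_texts, other_texts):
    """Ratio of the term's frequency in the target field to its frequency elsewhere."""
    target = relative_frequency(term, field_texts)
    other = relative_frequency(term, other_texts)
    if other == 0.0:
        # Term never seen outside the field: maximally selective if it occurs at all.
        return float("inf") if target > 0.0 else 0.0
    return target / other


optics = ["laser beam laser"]
general = ["laser cat dog mouse bird fish"]
print(selectivity("laser", optics, general))
```

Terms with a selectivity well above 1 are characteristic of the target field, which is why the abstract uses the value in each term's vector coefficient.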

Method of indexing and retrieval of electronically-stored documents

A document indexing and retrieval system and method which assigns weights to the key words and assigns a relative value to pairs of key words (i.e. defines a relative relation on KxK) based on their frequency of occurrence and co-occurrence in the document data base. In response to a query both the weights and this relative relation are used to suggest additional and / or alternative key words which are very likely to find relevant documents. Documents are then ranked by number of hits adjusted for the weights of hit words and their relative values.
Owner:KAGENECK KARL ERBO G +1

System and method for identifying critical features in an ordered scale space within a multi-dimensional feature space

A system and method for identifying critical features in an ordered scale space within a multi-dimensional feature space is described. Features are extracted from a plurality of data collections. Each data collection is characterized by a collection of features semantically-related by a grammar. Each feature is normalized and frequencies of occurrence and co-occurrences for the feature for each of the data collections are determined. The occurrence frequencies and the co-occurrence frequencies for each of the features are mapped into a set of patterns of occurrence frequencies and a set of patterns of co-occurrence frequencies. The pattern for each data collection is selected and distance (similarity) measures between each occurrence frequency in the selected pattern are calculated. The occurrence frequencies are projected onto a one-dimensional document signal in order of relative decreasing similarity using the similarity measures. Wavelet and scaling coefficients are derived from the one-dimensional document signal using multiresolution analysis.
Owner:NUIX NORTH AMERICA +1

Semiotic decision making system used for responding to natural language queries and other purposes and components therefor

Inactive | US6275817B1 | Digital computer details | Chaos models | Decision system | Symbolic processing
A decision making system uses semiotic processing modules to transform a training corpus of information, in the form of sequential sets of symbols, into a knowledge database. The knowledge database is thereafter used to make decisions relating to queries input in the same type of training corpus symbols. In the knowledge base, the system stores data representations of analyses of subsets of the training corpus sets of sequential elements. The knowledge base data representations comprise predicates and elemental and non-elemental acts. An inductive processor recursively processes the training corpus sets by evaluating the relationship and frequency of occurrence of individual elements and sets of elements in the training corpus. After processing of the training corpus is completed, the resultant knowledge base is used to evaluate queries in a performance mode of operation.
Owner:UNISYS CORP +1

Method, system, and program product for enhanced search query modification

Methods, program products and systems are provided for presenting retrieved search engine results text items to a user on a display device through a graphical user interface configured to associate displayed text items with a search term modification action. Selecting a displayed text item through a graphical user interface component cursor routine automatically instigates modifying of the search term through the associated modification action with the selected text item to generate a modified search term and causes a search engine component to search the modified search term and retrieve new search results similarly presented, enabling additional automatic iterations of search term modifying, searching and result presenting. Modifying a search term may occur automatically or through a selection from a generated list of revising actions, and presenting search results text items may include ordering and presenting a list of result text items relative to an occurrence frequency.
Owner:IBM CORP

Searchable encryption processing system

Active | US20130262863A1 | Efficiently be decrypted | Improve security | Digital data information retrieval | Digital data protection | Probabilistic encryption | Database server
In the searchable encryption processing system, a data base server retaining data, a registration client which deposits the data into the data base server, and a search client which causes the data base server to search the data collaborate across a network, wherein the registration client, using a probabilistic encryption method which uses a mask using a homomorphic function and a hash value, deposits the encrypted data into the server, whereupon the search client, using probabilistic encryption which uses the mask which uses the homomorphic function for encryption of the search query, outputs the search query and non-corresponding data as search results without causing the data base server to unmask the mask and without allowing the frequency of occurrences of the data corresponding to the search to leak to the data base server.
Owner:HITACHI LTD

Flexible organic light emitting display and method of manufacturing the same

Provided is a flexible organic light-emitting display device. The flexible display device includes a flexible substrate having a display area, a non-display area, and a bending area. On the flexible substrate, a first insulation layer is formed in a part of the non-display area. The first insulation layer includes a zigzag pattern. A plurality of wirings are electrically connected to the display area and are extended to traverse the non-display area and the bending area and are disposed on the first insulation layer. On the first insulation layer and the plurality of wirings, a passivation layer is formed. By virtue of a zigzag pattern of the first insulation layer, the frequency of occurrence of cracks in the passivation layer is reduced.
Owner:LG DISPLAY CO LTD

Method and apparatus for alphanumeric data entry using a keypad

A keypad for entering letters includes an array of keys with each key being assigned to at least one letter of an alphabetical system based on the frequency of occurrence of the at least one letter in a typical body of written work. The alphabetical system comprises at least one most-frequently-occurring letter that is entered by activation of the same key twice and at least one less-frequently-occurring letter that is entered by activation of two different keys.
Owner:BOZORGUI NESBAT SAIED

Code, system and method for representing a natural-language text in a form suitable for text manipulation

A computer method, system and code, for representing a natural-language document in a vector form suitable for text manipulation operations are disclosed. The method involves determining (a) for each of a plurality of terms selected from one of (i) non-generic words in the document, (ii) proximately arranged word groups in the document, and (iii) a combination of (i) and (ii), a selectivity value of the term related to the frequency of occurrence of that term in a library of texts in one field, relative to the frequency of occurrence of the same term in one or more other libraries of texts in one or more other fields, respectively. The document is represented as a vector of terms, where the coefficient assigned to each term includes a function of the selectivity value determined for that term.
Owner:WORD DATA