50 results about "Pairwise similarity" patented technology

A pairwise similarity score provides a relevant measure of similarity between protein sequences. It incorporates biological knowledge about proteins and is particularly effective when combined with a support vector machine to predict protein-protein interactions (PPI).

Cross-media similarity measures through trans-media pseudo-relevance feedback and document reranking

A multimedia information retrieval system includes a storage and an electronic processing device. The latter is configured to perform a process including: computing values of a pairwise similarity measure quantifying pairwise similarity of documents of a multimedia reference repository; storing the computed values in the storage; performing an initial information retrieval process respective to the multimedia reference repository to return a set of initial repository documents; and identifying a set of top ranked documents of the multimedia reference repository based at least on the stored computed values pertaining to the set of initial repository documents.
Owner:XEROX CORP
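
The retrieval process in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented method: documents are reduced to hypothetical keyword sets, Jaccard overlap stands in for the pairwise similarity measure, and the function names (`precompute_similarities`, `initial_retrieval`, `rerank`) are invented for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two keyword sets (assumed stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def precompute_similarities(docs):
    """Compute and store the similarity of every pair of repository documents."""
    sims = {}
    for i, j in combinations(sorted(docs), 2):
        sims[(i, j)] = sims[(j, i)] = jaccard(docs[i], docs[j])
    return sims


def initial_retrieval(docs, query, k):
    """Return the k documents sharing the most keywords with the query."""
    return sorted(docs, key=lambda d: (-len(docs[d] & query), d))[:k]


def rerank(docs, sims, initial, top_n):
    """Rank all documents by their stored similarity to the initial result set."""
    def score(d):
        return sum(1.0 if d == s else sims[(d, s)] for s in initial)
    return sorted(docs, key=lambda d: (-score(d), d))[:top_n]


docs = {
    "a": {"cat", "pet"},
    "b": {"cat", "pet", "fur"},
    "c": {"car", "road"},
    "d": {"pet", "fur"},
}
sims = precompute_similarities(docs)
initial = initial_retrieval(docs, {"cat"}, k=2)
top = rerank(docs, sims, initial, top_n=3)
```

In this toy repository, document `d` never mentions the query keyword, yet it can enter the top-ranked set because the stored similarities tie it to the initial results.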

Method for Anomaly Detection in Time Series Data Based on Spectral Partitioning

Anomalies in real time series are detected by first determining a similarity matrix of pairwise similarities between pairs of normal time series data. A spectral clustering procedure is applied to the similarity matrix to partition variables representing dimensions of the time series data into mutually exclusive groups. A model of normal behavior is estimated for each group. Then, for the real time series data, an anomaly score is determined, using the model for each group, and the anomaly score is compared to a predetermined threshold to signal the anomaly.
Owner:MITSUBISHI ELECTRIC RES LAB INC
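
The pipeline in this abstract (similarity matrix, partition of variables, per-group model, thresholded anomaly score) can be sketched as follows. Several pieces are assumptions made for brevity: absolute Pearson correlation serves as the pairwise similarity, connected components over a similarity threshold replace the spectral clustering step, and the per-group model is a plain mean and standard deviation per variable. None of these choices is taken from the patent.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def similarity_matrix(series):
    """Pairwise |correlation| between the variables of normal training data."""
    n = len(series)
    return [[abs(pearson(series[i], series[j])) for j in range(n)] for i in range(n)]


def partition(sim, min_sim):
    """Connected components over edges with sim >= min_sim.
    Simplified stand-in for spectral clustering; groups are mutually exclusive."""
    n = len(sim)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if j not in seen and sim[i][j] >= min_sim:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(members))
    return groups


def fit_models(series, groups):
    """Per group, store (mean, std) of each member variable as the normal model."""
    models = []
    for group in groups:
        stats = {}
        for v in group:
            mean = sum(series[v]) / len(series[v])
            var = sum((x - mean) ** 2 for x in series[v]) / len(series[v])
            stats[v] = (mean, math.sqrt(var) or 1.0)
        models.append(stats)
    return models


def anomaly_score(models, observation):
    """Largest per-group mean absolute z-score of one multivariate observation."""
    return max(
        sum(abs(observation[v] - m) / s for v, (m, s) in stats.items()) / len(stats)
        for stats in models
    )


normal = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10.5], [5, 3, 6, 2, 4]]
groups = partition(similarity_matrix(normal), min_sim=0.8)
models = fit_models(normal, groups)
is_anomaly = anomaly_score(models, [3, 30, 4]) > 3.0  # 3.0 is an arbitrary threshold
```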

Phylogeny generation

A method is provided for comparing malware or other types of computer programs, and for optionally using such a comparison method for (a) searching for matching programs in a collection of programs, (b) classifying programs, and (c) constructing a classification or a partitioning within a collection of programs. In general, there are three steps to the comparison portion: selecting and extracting tokens from a pair of programs for comparison, building features from these tokens, and comparing the programs based on the frequency of feature occurrences to produce a similarity measure. Pairwise similarity is then used for optionally searching, classifying, or constructing classification systems.
Owner:LAKHOTIA ARUN +2
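
The three comparison steps (extract tokens, build features, compare feature frequencies) can be illustrated with a small sketch. It assumes the tokens are already extracted (here, hypothetical opcode mnemonics), uses token n-grams as features, and uses cosine similarity over feature counts; the patent abstract does not specify these choices.

```python
import math
from collections import Counter


def ngram_features(tokens, n=2):
    """Count the n-grams (as tuples) occurring in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity of two feature-count vectors stored as Counters."""
    dot = sum(count * b[feature] for feature, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def program_similarity(tokens_a, tokens_b, n=2):
    """Pairwise similarity of two programs from their n-gram frequencies."""
    return cosine_similarity(ngram_features(tokens_a, n), ngram_features(tokens_b, n))


p = ["push", "mov", "call", "pop", "ret"]
q = ["push", "mov", "call", "xor", "ret"]
r = ["nop", "nop", "jmp"]
score_pq = program_similarity(p, q)
score_pr = program_similarity(p, r)
```

The resulting scores could then feed a search, a classifier, or a clustering step, as the abstract suggests.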

Adjustment of document relationship graphs

Provided is a process of modifying semantic similarity graphs representative of pair-wise similarity between documents in a corpus, the method comprising obtaining a semantic similarity graph that comprises more than 500 nodes and more than 1000 weighted edges, each node representing a document of a corpus, and each edge weight indicating an amount of similarity between a pair of documents corresponding to the respective nodes connected by the respective edge; obtaining an n-gram indicating that edge weights affected by the n-gram are to be increased or decreased; expanding the n-gram to produce a set of expansion n-grams; adjusting edge weights of edges between pairs of documents in which members of the expanded n-gram set co-occur.
Owner:QUID LLC

Model selection for cluster data analysis

A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.
Owner:HEALTH DISCOVERY CORP +1
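
A small sketch of the stability idea: each clustering run labels a sub-sample of the data, and two runs are compared on the points they share. The similarity used here (Jaccard overlap of co-clustered point pairs) and the names `clustering_similarity` and `stability` are assumptions for illustration; the abstract does not name a specific measure.

```python
from itertools import combinations


def co_clustered_pairs(labels, points):
    """Pairs of points that a run places in the same cluster."""
    return {(p, q) for p, q in combinations(sorted(points), 2) if labels[p] == labels[q]}


def clustering_similarity(labels_a, labels_b):
    """Jaccard overlap of co-clustered pairs, restricted to points present in both runs."""
    shared = set(labels_a) & set(labels_b)
    pairs_a = co_clustered_pairs(labels_a, shared)
    pairs_b = co_clustered_pairs(labels_b, shared)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


def stability(runs):
    """Mean pairwise similarity over all pairs of clustering runs."""
    if len(runs) < 2:
        raise ValueError("need at least two clustering runs")
    scores = [clustering_similarity(a, b) for a, b in combinations(runs, 2)]
    return sum(scores) / len(scores)


# Each run maps point id -> cluster label; label names need not match across runs.
run1 = {1: 0, 2: 0, 3: 1, 4: 1}
run2 = {2: "x", 3: "y", 4: "y", 5: "x"}
run3 = {1: 0, 2: 1, 3: 0, 4: 1}
```

To choose the number of clusters, one would compute `stability` for runs produced at each candidate value and prefer values where the score stays high.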

Generalized latent semantic analysis

Inactive | US20070067281A1 | Facilitate document | Facilitate word-level processing operation | Digital data information retrieval | Digital data processing details | Algorithm | Paper document
One embodiment of the present invention provides a system that builds an association tensor (such as a matrix) to facilitate document and word-level processing operations. During operation, the system uses terms from a collection of documents to build an association tensor, which contains values representing pair-wise similarities between terms in the collection of documents. During this process, if a given value in the association tensor is calculated based on an insufficient number of samples, the system determines a corresponding value from a reference document collection, and then substitutes the corresponding value for the given value in the association tensor. After the association tensor is obtained, a dimensionality reduction method is applied to compute a low-dimensional vector space representation for the vocabulary terms. Document vectors are computed as linear combinations of term vectors.
Owner:PALO ALTO RES CENT INC

Method and System for Clustering Data Points

Systems and methods for clustering a group of data points based on a measure of similarity between each pair of data points in the group are provided. A pairwise similarity function can be estimated for each pair of data points in the group. A clustering algorithm can be executed to create clusters and associate data points with the clusters using the pairwise similarity function. The algorithm can be iterated multiple times until a stopping condition is reached in order to reduce variance in the output of the algorithm. The pairwise similarity function for each pair of data points can be updated between iterations of the algorithm and the results of each iteration can be aggregated. The data in each data point associated with a cluster can be consolidated into a consolidated data point.
Owner:GOOGLE LLC

Chinese Web document online clustering method based on common substrings

The invention discloses a Chinese Web document online clustering method based on common substrings. As information on the internet grows sharply, search engines have become important for searching and locating information. Web document clustering can automatically group the results returned by a search engine by theme, helping users narrow the query range and quickly locate the information they need. Online clustering of Web documents must, on the one hand, handle the non-numerical and unstructured nature of Web documents and, on the other hand, run fast enough to meet users' online search requirements. To meet these two requirements, the invention provides a Chinese Web document online clustering method based on common substrings, comprising the following steps: (1) preprocessing the first n query results returned by the search engine by deleting or replacing non-Chinese characters; (2) extracting common substrings from the Web documents using GSA; (3) weighting the extracted common substrings with a formula modeled on TF*IDF and building a document feature vector model; (4) computing the pairwise similarity of the Web documents on the basis of the model to obtain a similarity matrix; (5) clustering the Web documents on the basis of the matrix with an improved hierarchical clustering algorithm; and (6) generating cluster descriptions and extracting labels. The method offers clear advantages in performance, cluster label generation, and clustering time.
Owner:BEIHANG UNIV
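
Steps (3) to (5) can be sketched roughly as below. This is an illustration under loud assumptions: character bigrams replace the GSA-based common substring extraction, a smoothed TF-IDF weight replaces the patent's own weighting formula, and plain average-link agglomerative clustering with a stopping threshold replaces the improved hierarchical algorithm.

```python
import math
from collections import Counter


def char_bigrams(text):
    """Character bigrams, used here as a stand-in for extracted common substrings."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def tfidf_vectors(texts):
    """Sparse TF-IDF vectors (dict: bigram -> weight) with a smoothed IDF."""
    counts = [Counter(char_bigrams(t)) for t in texts]
    df = Counter(g for c in counts for g in c)  # document frequency per bigram
    n = len(texts)
    return [
        {g: tf * (math.log((1 + n) / (1 + df[g])) + 1.0) for g, tf in c.items()}
        for c in counts
    ]


def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def agglomerate(sim, min_link):
    """Average-link agglomerative clustering; stop when the best link falls below min_link."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                link = total / (len(clusters[a]) * len(clusters[b]))
                if link > best:
                    best, pair = link, (a, b)
        if best < min_link:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


texts = ["机器学习算法", "机器学习方法", "股票市场行情", "股票市场分析"]
vectors = tfidf_vectors(texts)
sim = [[cosine(u, v) for v in vectors] for u in vectors]
clusters = agglomerate(sim, min_link=0.2)
```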

Extracting interpretable features for classification of multivariate time series from physical systems

A method and system are provided. The method includes extracting shapelets from each of a plurality of time series dimensions of multi-dimensional time series data. The method further includes building a plurality of decision-tree classifiers, one for each time series dimension, responsive to the shapelets extracted therefrom. The method also includes generating a pairwise similarity matrix between respective different ones of the plurality of time series dimensions using the shapelets as intermediaries for determining similarity. The method additionally includes applying a feature selection technique to the matrix to determine respective feature weights for each of shapelet features of the shapelets and respective classifier weights for each of the decision-tree classifiers that uses the shapelet features. The method further includes combining decisions issued from the decision-tree classifiers to generate a final verdict of classification for a time series dimension responsive to the respective feature weights and the respective classifier weights.
Owner:NEC CORP

Method of identifying outliers in item categories

A system and method of identifying outliers in item categories are described. A pairwise similarity measurement may be determined between each item listing in a plurality of item listings based on a comparison of at least one feature of each item listing. At least one outlier among the plurality of item listings may be determined using the pairwise similarity measurements. The feature(s) may comprise at least one feature from a group of features consisting of: a title, an image, a price, an attribute, and a description. Each item listing in the plurality of item listings may belong to the same leaf or non-leaf category in a network-based marketplace or publication system. The outlier(s) may be determined using at least one clustering algorithm. The clustering algorithm(s) may comprise an agglomerative hierarchical clustering algorithm and / or a density-based clustering algorithm.
Owner:EBAY INC
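
As a rough illustration of the outlier idea, the sketch below scores each pair of listings on two of the listed features (title and price) and flags listings that have too few similar neighbors. The neighbor-count rule is a simplified density-style criterion chosen for brevity, not the agglomerative or density-based clustering named in the abstract; the field names, the equal weighting of the two features, and the `eps` and `min_neighbors` values are all assumptions.

```python
def listing_similarity(a, b):
    """Average of title word overlap (Jaccard) and price closeness (min/max ratio)."""
    words_a = set(a["title"].lower().split())
    words_b = set(b["title"].lower().split())
    union = words_a | words_b
    title_sim = len(words_a & words_b) / len(union) if union else 0.0
    high = max(a["price"], b["price"])
    price_sim = min(a["price"], b["price"]) / high if high > 0 else 1.0
    return (title_sim + price_sim) / 2


def find_outliers(listings, eps=0.5, min_neighbors=1):
    """Indices of listings with fewer than min_neighbors listings at similarity >= eps."""
    outliers = []
    for i, a in enumerate(listings):
        neighbors = sum(
            1 for j, b in enumerate(listings)
            if j != i and listing_similarity(a, b) >= eps
        )
        if neighbors < min_neighbors:
            outliers.append(i)
    return outliers


# Hypothetical listings from one category.
listings = [
    {"title": "iPhone 12 case black", "price": 10.0},
    {"title": "iPhone 12 case red", "price": 12.0},
    {"title": "iPhone 12 case clear", "price": 11.0},
    {"title": "Garden hose 50ft", "price": 30.0},
]
outliers = find_outliers(listings)
```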

Matching co-referring entities from serialized data for schema inference

A system and method provide for identifying coreference from serialized data coming from different services. The method includes generating a tree structure from serialized data. The serialized data includes responses to queries from the different services. The responses each identify a hierarchical relationship between a respective set of objects. Nodes of the tree structure each have a name corresponding to a respective one of the objects. The tree structure is traversed in a breadth first manner and, for each node in the tree structure, a respective pairwise similarity is computed with each of the other nodes of the tree structure. The computed pairwise similarity is compared with a threshold to identify co-referring nodes that refer to a same entity. The threshold is a function of a depth of the node in the tree structure.
Owner:CONDUENT BUSINESS SERVICES LLC
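
The traversal and thresholding described here can be sketched as follows. The tree is a hypothetical nested dictionary built from two services' responses, name similarity uses the standard library's `difflib.SequenceMatcher`, and the depth-dependent threshold (`base + step * depth`, capped) is an invented example of "a function of depth"; the abstract does not give the actual function.

```python
from collections import deque
from difflib import SequenceMatcher


def bfs_nodes(tree):
    """Return (name, depth) for every node in breadth-first order.
    `tree` maps each node name to a dict of its children."""
    nodes = []
    queue = deque((name, children, 0) for name, children in tree.items())
    while queue:
        name, children, depth = queue.popleft()
        nodes.append((name, depth))
        queue.extend((child, sub, depth + 1) for child, sub in children.items())
    return nodes


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def depth_threshold(depth, base=0.8, step=0.05, cap=0.95):
    """Assumed threshold function: stricter matching for deeper nodes."""
    return min(cap, base + step * depth)


def co_referring(tree):
    """Pairs of node names whose similarity meets the threshold for the visited node's depth."""
    nodes = bfs_nodes(tree)
    matches = []
    for i, (name, depth) in enumerate(nodes):
        for other, _ in nodes[i + 1:]:
            if name_similarity(name, other) >= depth_threshold(depth):
                matches.append((name, other))
    return matches


tree = {
    "order": {"customer_id": {}, "address": {"city": {}}},
    "invoice": {"customerId": {}, "total": {}},
}
matches = co_referring(tree)
```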

System and methods for automated community discovery in networks with multiple relational types

Active | US9749406B1 | Maximizes community density criterion | Digital computer details | Star/tree networks | Algorithm | Dendrogram
Described is a system for automated community discovery in networks with multiple relational types. The system receives a network as input. The network comprises neighbors, edges connecting the neighbors, and vertices, where edges between two vertices represent a relation. A set of pair-wise similarity comparisons is computed for all pairs of relations. Two relations are considered similar if vertices connected to the two relations share similar relations to the same set of neighbors. A relation dendrogram is created based on the set of pair-wise similarity comparisons. Then, a cut in the relation dendrogram is selected to compute a community solution, resulting in a plurality of relation dendrogram partitions. Each relation dendrogram partition represents a community. A community density criterion is computed based on a density of each community calculated with respect to edge types contained within each community. Finally, a community solution is generated that maximizes the community density criterion.
Owner:HRL LAB

Identifying living skin tissue in a video sequence

According to an aspect, there is provided an apparatus for identifying living skin tissue in a video sequence, the apparatus comprising a processing unit configured to receive a video sequence, the video sequence comprising a plurality of image frames; divide each of the image frames into a plurality of frame segments, wherein each frame segment is a group of neighboring pixels in the image frame; form a plurality of video sub-sequences, each video sub-sequence comprising a frame segment from two or more of the plurality of image frames; analyze the plurality of video sub-sequences to determine a pulse signal for each video sub-sequence; determine a similarity matrix based on pairwise similarities for each determined pulse signal with each of the other determined pulse signals; and identify areas of living skin tissue in the video sequence from the similarity matrix.
Owner:KONINKLIJKE PHILIPS NV

Level set SAR (Synthetic Aperture Radar) image segmentation method based on self-adaptive finite element

The invention discloses a level set SAR (Synthetic Aperture Radar) image segmentation method based on a self-adaptive finite element, mainly intended to solve the problem that conventional variational level set models based on statistical distributions segment non-homogeneous SAR images imprecisely. The method comprises the following steps: (1) optimizing an image partitioning energy term on the basis of the minimum cutset criterion of image partitioning; (2) defining a weighted energy functional by combining it with a level set regularization term and a length constraint term; (3) applying variational minimization to the energy functional to obtain a curve evolution control equation; (4) discretizing on a finite element mesh to obtain a semi-implicit discrete scheme of the curve evolution control equation; and (5) applying a self-adaptive finite element mesh adjustment strategy based on a posteriori error estimation, carrying out the level set evolution on a triangular mesh, and obtaining the segmentation result of the SAR image. Because the energy functional is defined using pairwise similarity, the limitation of the conventional statistical model is overcome; at the same time, the numerical computation strategy based on the self-adaptive finite element achieves an effective balance between segmentation quality and computing efficiency.
Owner:ZHEJIANG GONGSHANG UNIVERSITY

Identifying and labeling fraudulent store return activities

A method and system for identifying and labeling fraudulent store return activities includes receiving, by a server, retailer events from an online transaction system of a retailer, the retailer events comprising records of transactions between customers and the retailer, including sale, exchange and return activities across multiple stores. The retailer events are processed to build a network that associates stores, transactions, payment instruments, and customer identification over related activity sequences of transactions. Return fraud labels are generated for the retailer events representing returns based on identified fraud characteristics of the related activity sequences by: representing behavior variables extracted from activity sequences by respective signature vectors; calculating pairwise similarity between the signature vectors; identifying clusters of the signature vectors having common behavior patterns based on the calculated pairwise similarity; and labeling the identified clusters of signature vectors as non-fraudulent behavior or fraudulent behavior.
Owner:WALMART APOLLO LLC

Techniques for mixed-initiative visualization of data

In various embodiments, a visualization engine generates graphs that facilitate sense making operations on data sets. A graph includes nodes that are associated with a data set and edges that represent relationships between the nodes. In operation, the visualization engine computes pairwise similarities between the nodes. Subsequently, the visualization engine computes a layout for the graph based on the pairwise similarities and user-specified constraints. Finally, the visualization engine renders a graph for display based on the layout, the nodes, and the edges. Advantageously, by interactively specifying constraints and then inspecting the topology of the automatically generated graph, the user may efficiently explore salient aspects of the data set.
Owner:AUTODESK INC

Cosine measurement supervised deep hash algorithm with balanced similarity

The invention discloses a cosine measurement supervised deep hash algorithm with balanced similarity, and belongs to the field of image retrieval. Deep supervised hashing has the advantages of low storage cost and high calculation efficiency. However, similarity preservation, quantization error, and imbalanced data remain major challenges in deep supervised hashing. The invention provides a deep hash scheme for pairwise similarity preservation that addresses these problems. A deep network is used as the base model to extract features, and a hash layer replaces the final classification layer so that the network outputs hash codes. A loss function is designed that effectively preserves semantic similarity during training and addresses the problems of class imbalance, difficulty, and quantization loss. When the hash codes obtained by the method are used for image retrieval, retrieval accuracy can be effectively improved on extremely imbalanced data sets.
Owner:BEIJING UNIV OF TECH

Parking space detection system and method based on hierarchical pairwise similarity PVAnet

The invention, which belongs to the field of computer vision technology, provides a parking space detection system and method based on hierarchical pairwise similarity PVAnet. Two networks are connected to form the hierarchical pairwise similarity PVAnet. The main framework of the first network is PVAnet. In the first network, characteristics of a vehicle are learned by using transfer learning, and these characteristics are fused into the PVAnet. The second network is a paired detection network; it is connected behind the first network and is used for learning a similarity relationship between the spatial positions of the detected vehicle center and the parking space center, so that the learned network is able to identify the parking space effectively. With the provided method, the parking space can be found accurately, so that the problem of difficult parking at public places is solved.
Owner:INSPUR GROUP CO LTD

Apparatus, systems, and methods for grouping data records

The present application relates to apparatus, systems, and methods for grouping data records based on entities referenced by the data records. The disclosed grouping mechanism can include determining a pair-wise similarity between a large number of data records, and clustering a subset of the data records based on their pair-wise similarity.
Owner:FACTUAL

Attribute graph literature clustering method based on graph convolutional neural network

The invention discloses an attribute graph literature clustering method based on a graph convolutional neural network, and belongs to the field of graph data mining. Specifically, the method comprises: learning features of the literature attribute graph using a cross-layer linked graph convolutional neural network; estimating an optimal cluster number from the node features using a deep clustering estimation model; alternately executing these two steps to complete training; using the trained model to obtain the features of all literature attribute graph nodes to be clustered and the estimated number of clusters; and taking the features and the estimated number of clusters as input to a k-means clustering method to obtain a clustering result for the literature attribute graph. When the cross-layer linked graph convolutional neural network is trained, a self-separation regularization term based on pairwise node similarity is adopted, so that the features of nodes in the same cluster are similar and the features of nodes in different clusters are far apart, which effectively improves graph clustering performance. The clustering estimation module provides data-driven estimation of the number of clusters, making the whole system better suited to real, unlabeled data environments.
Owner:BEIJING UNIV OF TECH

System and Method for Evaluating Semantic Closeness of Data Files

Pending | US20220107800A1 | Rapid assimilation | Increased and rapid access | Electrophonic musical instruments | Semantic analysis | Data file | Semantic proximity
The invention provides for the evaluation of semantic closeness of a source data file relative to candidate data files. The system includes an artificial neural network (ANN) and processing intelligence that derives a property vector from extractable measurable properties of a data file. The property vector is mapped to related semantic properties for that same data file such that, during ANN training, pairwise similarity / dissimilarity in property space is mapped towards corresponding pairwise semantic similarity / dissimilarity in semantic space to preserve semantic relationships. Based on comparisons between generated property vectors in continuous multi-dimensional property space, the system and method assess, rank, and then recommend and / or filter semantically close or semantically disparate candidate files in response to a query from a user that includes the data file. Applications of the categorization and recommendation system include search tools, including identification of illicit materials or logically progressive associations between disparate files.
Owner:EMOTIONAL PERCEPTION AI LTD

Model training method, cross-modal representation method and unsupervised image text matching method and device

The invention aims to provide a model training method, a cross-modal characterization method and an unsupervised image text matching method and device. The method comprises the following steps: calculating a pairwise similarity value of a picture and a sentence in a training document; and determining a positive sample pair set and a negative sample pair set based on the similarity value; wherein the positive sample pair set contains a preset number of positive sample pairs; the negative sample pair set contains a preset number of negative sample pairs; the positive sample pair set and the negative sample pair set are used for further training the model until the average similarity value of the preset number of positive sample pairs is greater than the average similarity value of the preset number of negative sample pairs, and the difference value of the positive sample pairs and the negative sample pairs meets a preset condition. According to the embodiment, the sampling deviation can be reduced, and the picture and the sentence are matched through a better training model.
Owner:ZHEJIANG LAB +1

System and Method for Recommending Semantically Relevant Content

Pending | US20220108175A1 | Rapid assimilation | Increased and rapid access | Electrophonic musical instruments | Semantic analysis | Social media | Data file
A property vector derived from extractable measurable properties of a data file is mapped to semantic properties for that data file. The property vector is an output from a trained artificial neural network (ANN) that, following pairwise training of the ANN using pairs of files that map pairwise similarity / dissimilarity in property space towards corresponding pairwise semantic similarity / dissimilarity in semantic space, both preserves and is representative of semantic properties of the data file. Based on comparisons between generated property vectors, the system and method assess, rank, and then recommend and / or filter semantically close or semantically disparate candidate files in a database in response to a query from a user that includes the data file. Applications of the categorization and recommendation system and method include media or search tools and social media platforms, including media in the form of music, video, image data and / or text files.
Owner:EMOTIONAL PERCEPTION AI LTD

Image text cross-modal retrieval method based on category information alignment

Active | CN113010700A | Eliminate heterogeneity differences | Invariance guarantees that the learned representations have both | Metadata multimedia retrieval | Image coding | Machine learning | Pairwise similarity
The invention discloses an image text cross-modal retrieval method based on category information alignment, and aims to preserve the distinction between instances (image texts) of different semantic categories while eliminating heterogeneity differences. To this end, category information is introduced into a public representation space, namely an image text public space, to minimize discrimination loss, and a cross-modal loss is introduced to align information from the different modalities. In addition, a category information embedding method is adopted to generate false features instead of other methods marking information based on DNN; at the same time, modal invariance loss is minimized in a category public space to learn modal invariance features. Under the guidance of this learning strategy, the pairwise similarity semantic information of image-text coupling items is fully utilized, and the learned representation is guaranteed to have both semantic-structure discrimination and cross-modal invariance.
Owner:UNIV OF ELECTRONIC SCI & TECH OF CHINA