1109 results about "Cosine similarity" patented technology

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space that measures the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any angle in the interval (0,π] radians. It is thus a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude.
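The definition above can be illustrated with a minimal sketch in plain Python. The function name `cosine_similarity` and the sample vectors are chosen here for illustration and do not come from any of the listed patents.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

# Same orientation, different magnitude: similarity close to 1.
same = cosine_similarity([1.0, 2.0], [3.0, 6.0])
# Vectors at 90 degrees to each other: similarity 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 5.0])
# Diametrically opposed vectors: similarity close to -1.
opposite = cosine_similarity([1.0, 2.0], [-2.0, -4.0])
```

Because both norms appear in the denominator, scaling either vector by a positive constant leaves the result unchanged, which is the "orientation and not magnitude" property stated above.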

Text joins for data cleansing and integration in a relational database management system

An organization's data records are often noisy because of transcription errors, incomplete information, and a lack of standard formats for textual data. A fundamental task during data cleansing and integration is matching strings, perhaps across multiple relations, that refer to the same entity (e.g., organization name or address). Furthermore, it is desirable to perform this matching within an RDBMS, which is where the data is likely to reside. In this paper, we adapt the widely used and established cosine similarity metric from the information retrieval field to the relational database context in order to identify potential string matches across relations. We then use this similarity metric to characterize this key aspect of data cleansing and integration as a join between relations on textual attributes, where the similarity of matches exceeds a specified threshold. Computing an exact answer to the text join can be expensive. For query processing efficiency, we propose an approximate, sampling-based approach to the join problem that can be easily and efficiently executed in a standard, unmodified RDBMS. Therefore, the present invention includes a system for string matching across multiple relations in a relational database management system comprising generating a set of strings from a set of characters, decomposing each string into a subset of tokens, establishing at least two relations within the strings, establishing a similarity threshold for the relations, sampling the at least two relations, correlating the relations for the similarity threshold and returning all of the tokens which meet the criteria of the similarity threshold.
Owner:AMERICAN TELEPHONE & TELEGRAPH CO +1
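As a rough illustration of the text join described in this abstract, the sketch below computes the exact (not sampled) join in plain Python rather than inside an RDBMS. Several details are assumptions made for the example and are not stated in the abstract: tokens are taken to be padded character q-grams, weights are a simple TF-IDF variant, and the names `qgrams`, `tfidf_vectors` and `text_join` are hypothetical.

```python
import math
from collections import Counter

def qgrams(s, q=3):
    """Lower-case the string, pad it with '#', and split it into overlapping q-grams."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return [padded[i:i + q] for i in range(len(padded) - q + 1)]

def tfidf_vectors(strings, q=3):
    """Map each string to a unit-length sparse TF-IDF vector over its q-grams."""
    token_lists = [qgrams(s, q) for s in strings]
    # Document frequency: in how many strings each q-gram occurs.
    df = Counter(tok for toks in token_lists for tok in set(toks))
    n = len(strings)
    vectors = []
    for toks in token_lists:
        tf = Counter(toks)
        vec = {t: c * math.log(1 + n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def text_join(left, right, threshold=0.5, q=3):
    """Return (left_string, right_string, similarity) for pairs at or above the threshold."""
    vecs = tfidf_vectors(left + right, q)  # shared IDF statistics for both relations
    left_vecs, right_vecs = vecs[:len(left)], vecs[len(left):]
    matches = []
    for i, a in enumerate(left_vecs):
        for j, b in enumerate(right_vecs):
            # Vectors are unit length, so the dot product is the cosine similarity.
            sim = sum(w * b.get(t, 0.0) for t, w in a.items())
            if sim >= threshold:
                matches.append((left[i], right[j], sim))
    return matches

pairs = text_join(["AT&T Corporation"], ["AT&T Corp.", "Microsoft"], threshold=0.3)
```

The nested loop is quadratic in the number of strings, which is the cost the abstract's sampling-based approximation is meant to avoid; that approximation is not reproduced here.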

Automatically answering method, device, storage medium and electronic equipment

The invention provides an automatically answering method, device, storage medium and electronic equipment. The automatically answering method comprises the steps: features of a to-be-answered question are extracted; candidate question-answer pairs are screened from a question-answer corpus through similarity calculation which is performed on textual features of the to-be-answered question and textual features of questions of question-answer pairs in the question-answer corpus; feature vectors of the to-be-answered question are calculated based on a neural network model which completes a question-answer training in advance, the answers of the candidate question-answer pairs are calculated based on the neural network model, and the feature vectors of the answers of the candidate question-answer pairs are acquired; an answer is screened from the candidate question-answer pairs through performing cosine similarity calculation on the feature vectors of the to-be-answered question and the feature vectors of the answers of the candidate question-answer pairs, so that generalization of semantic distance is avoided, the semantic distance of the to-be-answered question and the candidate question-answer pairs is accurately measured by utilizing the neural network model, and the accuracy of the screened answer is improved.
Owner:CLOUDMINDS SHANGHAI ROBOTICS CO LTD

Graph model-based automatic abstracting method

Active | CN105243152A | Measuring Semantic Relevance | Achieve complementary effects | Special data processing applications | Cosine similarity | Subject matter
The invention relates to the field of automatic abstracting, and discloses a graph model-based automatic abstracting method. According to the technical scheme, an LDA probability topic model is applied to measurement of semantic correlation between sentences and improvement of the measurement effect of sentence correlation; and an idea of topic correlation and position sensitivity of the sentences is provided, so that abstract generation is relatively reasonable and effective. The method comprises the following steps: firstly, obtaining topic probability distribution of a document and word probability distribution of the topic through training the LDA topic model, determining the topic probability distribution of the sentences and effectively converting a semantic similarity measurement between the sentences into a similarity measurement problem of the topic probability distribution of the sentences; with the sentences as nodes, building edges by referring to the cosine similarity and according to the semantic similarity between the sentences and generating a text graph representing the document; calculating the topic correlation between the sentences according to the topic probability distribution of the sentences and the topic probability distribution of the document; and calculating the position sensitivity and the like of the sentences according to the positions of the sentences in the document.
Owner:TONGJI UNIV
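The graph-building step in this abstract (sentences as nodes, edges from cosine similarity between sentence topic distributions) might look roughly like the sketch below. The topic distributions are made-up values standing in for the output of a trained LDA model, and the edge threshold of 0.8 and the function name `build_sentence_graph` are assumptions for illustration only.

```python
import math

def cosine(p, q):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(p, q))
    return dot / (math.sqrt(sum(x * x for x in p)) * math.sqrt(sum(y * y for y in q)))

def build_sentence_graph(topic_dists, threshold=0.8):
    """Return {(i, j): weight} for sentence pairs whose similarity reaches the threshold."""
    edges = {}
    for i in range(len(topic_dists)):
        for j in range(i + 1, len(topic_dists)):
            weight = cosine(topic_dists[i], topic_dists[j])
            if weight >= threshold:
                edges[(i, j)] = weight
    return edges

# Hypothetical per-sentence topic distributions over three topics.
sentence_topics = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
graph = build_sentence_graph(sentence_topics)
```

With these sample values only the first two sentences are linked, since they concentrate on the same topic. The topic-correlation and position-sensitivity scores mentioned in the abstract are not modelled here.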

Vehicle type recognition method based on CNN and domain adaptive learning

The invention relates to a vehicle type recognition method based on a CNN and domain adaptive learning. The method comprises steps of: establishing a CNN-based initial model by adding a rotation-invariant layer in an Alexnet network, distinguishing a discriminant layer and designing a new objective function; separately extracting the feature maps of different-domain sample convolution layers by using the established initial model, calculating the cosine similarity between the sample feature maps, determining the shared convolution kernel or the non-shared convolution kernel of the CNN, retaining the weight and the offset of the shared convolution kernel, and updating the weight and the offset of the non-shared convolution kernel; based on a target-domain training sample, calculating the cosine similarity between respective feature map layers and the average similarity of the entire target domain, and clustering each type of similar feature maps according to the average similarity; expanding source-domain samples with similar distribution characteristics in the target domain to new samples in the target domain, adjusting the entire CNN model by using the new samples in the target domain, and then using a softmax classifier to classify the vehicle types of the test samples in the target domain.
Owner:NANJING UNIV OF INFORMATION SCI & TECH

Healthy diet knowledge network construction method based on neural network and graph structure

The invention discloses a healthy diet knowledge network construction method based on a neural network and a graph structure. The method comprises the steps that word vector modeling is performed on a text corpus, so that each non-stop word in the text corpus corresponds to one word vector with a fixed length; a cosine similarity between two word vectors is used to measure the relational degree between entities corresponding to the two word vectors; food material entity nodes and symptom entity nodes are extracted, the two types of entity nodes are regarded as entity nodes in a topological structure, edge relations between the entity nodes are constructed to form the graph structure, and all the edge relations between the entity nodes are described by one group of representative words; vector expressions corresponding to each representative word are arranged to obtain a representative matrix of the edge relations between the entity nodes; and a classification framework based on a deep neural network is designed, the representative matrix is input, and polarities of the edge relations between the entity nodes are classified. Through the method, the problems that a traditional healthy diet knowledge base is not high in automation degree and obvious in domain limitation are effectively solved.
Owner:SOUTH CHINA UNIV OF TECH

LDA (latent dirichlet allocation) and VSM (vector space model) based similar Chinese herb literature recommendation method

Active | CN103823848A | Fast and efficient similar recommendation | Robust | Special data processing applications | Lexical item | Vector space model
The invention discloses an LDA (latent dirichlet allocation) and VSM (vector space model) based similar Chinese herb literature recommendation method. The method includes: adopting an IKAnalyzer to perform word segmentation on topics and summary information of literature on the basis of a terminological dictionary for Chinese herbs, constructing a vector space, performing dimensionality reduction on the vector space, constructing a semantic dictionary, numbering all lexical items in the dictionary in sequence, performing vectorization through each document on the basis of the semantic dictionary, constructing term vectors of each document, utilizing LDA and a Gibbs sampling algorithm to perform training to obtain probability distribution of each document on themes, then computing a value of similarity between every two documents by the aid of KL divergence, computing cosine similarity of the term vectors of each document on the basis of term frequency, performing joint weighting on the two kinds of similarities prior to performing similarity sorting, and then making recommendation. By the method, the literature, similar both in content and theme, in the Chinese herb literature can be recommended to users, and recommendation results are closer to user requirements.
Owner:ZHEJIANG UNIV
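The joint weighting of a topic-level similarity and a term-level cosine similarity could be sketched as follows. The abstract does not say how KL divergence is turned into a similarity or how the two scores are weighted, so the symmetrised divergence, the `1 / (1 + d)` mapping and the `alpha` weight below are all assumptions, as are the function names.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q); eps guards against log(0) for zero entries."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def topic_similarity(p, q):
    """Map a symmetrised KL divergence into (0, 1]; identical distributions give 1."""
    symmetric = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + symmetric)

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def combined_similarity(topics_a, topics_b, terms_a, terms_b, alpha=0.5):
    """Weighted sum of topic-level and term-level similarity (alpha is hypothetical)."""
    return alpha * topic_similarity(topics_a, topics_b) + (1 - alpha) * cosine(terms_a, terms_b)

# Hypothetical documents: topic distributions plus raw term-frequency vectors.
score = combined_similarity([0.8, 0.2], [0.6, 0.4], [3, 0, 1], [2, 1, 1])
```

Documents can then be sorted by `score` in descending order to produce the recommendation list.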

Video retrieval method based on multi-core canonical correlation analysis

Disclosed is a video retrieval method based on multi-core canonical correlation analysis. The method includes crawling the text descriptions corresponding to each video from the internet, and then operating on the video: firstly dividing the video at abrupt shot changes, extracting key frames, extracting vision features of the key frames and moving features of the shot to form video feature vectors, and extracting word-frequency features from the text descriptions of each video; then utilizing the method of the multi-core canonical correlation analysis to obtain mapping matrixes and low-dimensional representation of the video features and the word-frequency features, and allowing the mapping matrixes and the low-dimensional representation to have the maximum correlation in low-dimensional space; finally, when a user inputs key words to perform video retrieval, acquiring the low-dimensional representation of the word-frequency features of the key words according to the mapping matrixes of the word-frequency features, and returning video retrieval results in descending order of cosine similarity with the low-dimensional representation of the video features. The method has the advantages that the correlation of video content and the retrieval key words is enhanced, and the accuracy of retrieval by the user is improved.
Owner:ZHEJIANG UNIV

Zero-sample classifying method based on class transfer

A zero-sample classifying method based on class transfer comprises the steps of acquiring a vision characteristic of C kinds of training samples, a class semantic characteristic of the training sample and a true label matrix; calculating a semantic similarity matrix by means of cosine similarity or Gaussian similarity through the class semantic characteristic; calculating a diagonal matrix of a class semantic similarity matrix; calling a Sylvester equation in a MATLAB toolset for obtaining a mapping matrix; inputting the vision characteristic of the training sample, the corresponding class semantic characteristic and the true label matrix into a target function, continuously adjusting the value of a model regularization parameter, calculating the least value of the target function, and finishing model training; and in a testing period, inputting the vision characteristic of the testing sample and the corresponding semantic characteristic, calculating scores of the classes, and determining the class with highest score as the predicted class of the testing sample. The zero-sample classifying method based on class transfer has advantages of sufficiently digging the semantic relation between different classes, realizing knowledge transfer between a known class classifier and an unknown class classifier, and realizing high convenience in application in image classification.
Owner:TIANJIN UNIV
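The class semantic similarity matrix mentioned in this abstract, computed with either cosine or Gaussian similarity, can be sketched as below. The Gaussian form `exp(-||a - b||^2 / (2 * sigma^2))`, the `sigma` parameter, and the reading of the "diagonal matrix" as a degree matrix of row sums are assumptions; the abstract does not spell them out, and the Sylvester-equation step is not shown.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def gaussian(a, b, sigma=1.0):
    """Gaussian (RBF) similarity; sigma is a hypothetical bandwidth parameter."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2 * sigma ** 2))

def similarity_matrix(class_vectors, kind="cosine", sigma=1.0):
    """Pairwise similarity between class semantic vectors, as a list of rows."""
    if kind == "cosine":
        sim = cosine
    elif kind == "gaussian":
        sim = lambda a, b: gaussian(a, b, sigma)
    else:
        raise ValueError("kind must be 'cosine' or 'gaussian'")
    return [[sim(a, b) for b in class_vectors] for a in class_vectors]

def degree_matrix(S):
    """Diagonal matrix whose i-th diagonal entry is the sum of row i of S (assumed reading)."""
    n = len(S)
    return [[sum(S[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical semantic vectors for three classes.
classes = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.0, 1.0, 0.2]]
S_cos = similarity_matrix(classes, kind="cosine")
S_rbf = similarity_matrix(classes, kind="gaussian", sigma=0.5)
D = degree_matrix(S_cos)
```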

Action evaluation method based on human body three-dimensional articulation point detection

Active | CN111144217A | Accurate evaluation | Comprehensive action standard evaluation | Character and pattern recognition | Human body | Personalization
The invention relates to an action evaluation method based on human body three-dimensional articulation point detection, which belongs to the field of computer vision and comprises the following steps: S1, performing human body three-dimensional articulation point detection on a single-frame picture after video framing; S2, extracting key frames of a specified number of frames of the video; S3, constructing motion vector features and joint kinetic energy features, and extracting feature values; and S4, constructing a key frame action similarity comparison model through multi-feature fusion: fusing the sub-features in the step S3, and constructing a personalized model for different types of actions; constructing a motion vector feature similarity function based on cosine similarity, and constructing a joint kinetic energy similarity function based on a weighting function; and obtaining a key frame action similarity comparison model based on the two similarity functions, comparing the action to be detected with the key frame set of the standard action, and finally obtaining the action similarity of the motion video. The method is more accurate and scientific, and can be used for physical fitness action correction and teaching.
Owner:CHONGQING UNIV OF POSTS & TELECOMM
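One possible form of the cosine-based motion vector similarity in step S4 is sketched below. The abstract does not define the function, so averaging the per-joint cosine similarity over corresponding 3D motion vectors is an assumption, as are the name `motion_similarity` and the sample frames; the joint kinetic energy term and the fusion step are left out.

```python
import math

def motion_similarity(frame_a, frame_b):
    """Mean cosine similarity over corresponding 3D motion vectors of two key frames.

    Each frame is a list of (x, y, z) motion vectors, one per joint or limb.
    Pairs in which either vector has zero length are skipped.
    """
    sims = []
    for va, vb in zip(frame_a, frame_b):
        norm_a = math.sqrt(sum(x * x for x in va))
        norm_b = math.sqrt(sum(y * y for y in vb))
        if norm_a == 0.0 or norm_b == 0.0:
            continue
        dot = sum(x * y for x, y in zip(va, vb))
        sims.append(dot / (norm_a * norm_b))
    return sum(sims) / len(sims) if sims else 0.0

# Hypothetical key frames with two motion vectors each.
standard = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
attempt = [(0.0, 2.0, 0.0), (1.0, 1.0, 0.0)]
score = motion_similarity(standard, attempt)
```

The first pair points the same way and contributes 1; the second pair is 45 degrees apart, so the mean falls below 1.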

Modeling method for parallel smart case recommendation model

The invention relates to a modeling method for a parallel smart case recommendation model. The method comprises the following steps of obtaining existing patient cases from an electronic case database, carrying out denoising, clustering and word segmentation on the patient cases, and establishing a patient case corpus database; defining TFIDF_{i,j} as the importance degree of a word or an expression in a case of the patient case corpus database, establishing an LSI vector space model according to the TFIDF_{i,j}, and moreover, establishing a bag-of-words (BOW) model according to all words and expressions in the patient case corpus database; calculating history case vectors and to-be-processed case vectors in the patient case corpus database through utilization of the LSI vector space model and the BOW model; calculating cosine similarity among the history patient cases and storing the cosine similarity; and calculating the cosine similarity between the to-be-processed case vectors and the history patient case vectors, and searching similar cases of to-be-processed cases according to the cosine similarity. The model established through adoption of the method provided by the invention is high in accuracy and low in error. A recommendation result is high in quality.
Owner:QINGDAO ACADEMY OF INTELLIGENT IND
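The final lookup step of this abstract, finding the historical cases most similar to a to-be-processed case, can be sketched as a top-k search by cosine similarity. The vectors below are made-up stand-ins for LSI case vectors, and the names `most_similar_cases` and `history` are hypothetical; building the TF-IDF, BOW and LSI models is not shown.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def most_similar_cases(query_vec, history, k=2):
    """Return the k (case_id, similarity) pairs with the highest similarity, best first."""
    scored = ((case_id, cosine(query_vec, vec)) for case_id, vec in history.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Hypothetical low-dimensional case vectors.
history = {
    "case-1": [1.0, 0.0, 0.0],
    "case-2": [0.9, 0.1, 0.0],
    "case-3": [0.0, 0.0, 1.0],
}
top = most_similar_cases([1.0, 0.0, 0.0], history, k=2)
```

`heapq.nlargest` avoids sorting the whole history when only a few neighbours are needed.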

Item-based explicit and implicit feedback mixing collaborative filtering recommendation algorithm

The invention discloses an item-based explicit and implicit feedback mixing collaborative filtering recommendation algorithm. The method comprises the following steps of obtaining the information of interest of users on every item and establishing the score matrix of every user on all the items; calculating the average score of every user, the quantity of the scoring users of every item and the average score of every item; calculating a common comment user quantity matrix; calculating the Pearson similarity and the modified cosine similarity between any two items; calculating the similarity based on explicit feedback; calculating the cosine similarity based on implicit feedback; calculating a final similarity; obtaining the nearest neighbor set I of a current item; when providing a recommendation list to a target user u, according to the score matrix, obtaining the scored items and the unscored items of the target user u; calculating the prediction scores of the unscored items of the target user u and recommending the N items with the highest scores among the unscored items of the target user u to the user. The item-based explicit and implicit feedback mixing collaborative filtering recommendation algorithm can effectively improve the accuracy of prediction recommendation.
Owner:ZHEJIANG UNIV
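The "modified cosine similarity" between two items is commonly understood as the adjusted cosine similarity, in which each user's mean score is subtracted before the cosine is taken. The sketch below follows that common definition over users who scored both items; whether the patent uses exactly this form is an assumption, and the score data and function name are hypothetical.

```python
import math

def adjusted_cosine(ratings, item_i, item_j):
    """Adjusted cosine similarity between two items.

    ratings maps user -> {item: score}. Only users who scored both items
    contribute, and each score is centred on that user's mean score.
    """
    numerator = sum_sq_i = sum_sq_j = 0.0
    for user_ratings in ratings.values():
        if item_i in user_ratings and item_j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            dev_i = user_ratings[item_i] - mean
            dev_j = user_ratings[item_j] - mean
            numerator += dev_i * dev_j
            sum_sq_i += dev_i * dev_i
            sum_sq_j += dev_j * dev_j
    if sum_sq_i == 0.0 or sum_sq_j == 0.0:
        return 0.0  # no co-rating users, or no variation to compare
    return numerator / math.sqrt(sum_sq_i * sum_sq_j)

# Hypothetical score matrix: users u1 and u2 prefer A and B, user u3 prefers C.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
}
sim_ab = adjusted_cosine(ratings, "A", "B")
sim_ac = adjusted_cosine(ratings, "A", "C")
```

Centring on the user mean compensates for users who score everything high or low, which plain cosine similarity on raw scores does not do.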

Method, device and equipment for obtaining supervision recognition result in multiple modes and storage medium

The invention relates to the field of artificial intelligence, discloses a method, device and equipment for obtaining a supervision recognition result in multiple modes and a storage medium, and solves the problem of semantic similarity matching of current business supervision terms and business products. The method comprises the following steps: creating a knowledge graph; processing the knowledge graph according to a first preset rule, a second preset rule and the entity relationship file to obtain an entity and an entity relationship; updating the knowledge graph according to the entities and the entity relationship to obtain a target knowledge graph; analyzing the target knowledge graph and the training text through an encoder to obtain fused to-be-processed information; performing random masking processing on the fused to-be-processed information according to a preset strategy to obtain training data; performing word embedding vector processing and self-made force mechanism processing on the training data to obtain a target sentence vector and a target word vector; and calculating a weighted average value of the semantic cosine similarity and the character string similarity of the target sentence vector and the target word vector according to a preset weight ratio to obtain a supervision and identification result.
Owner:CHINA PING AN LIFE INSURANCE CO LTD
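The last step of this abstract, a weighted average of a semantic cosine similarity and a character string similarity, might be sketched as follows. The abstract names neither the string similarity measure nor the weight ratio, so the use of `difflib.SequenceMatcher` from the Python standard library and the weight of 0.7 are assumptions; the embedding vectors and texts are placeholders.

```python
import math
from difflib import SequenceMatcher

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def combined_score(vec_a, vec_b, text_a, text_b, semantic_weight=0.7):
    """Weighted average of semantic (cosine) and character-level string similarity."""
    semantic = cosine(vec_a, vec_b)
    literal = SequenceMatcher(None, text_a, text_b).ratio()  # value in [0, 1]
    return semantic_weight * semantic + (1 - semantic_weight) * literal

# Hypothetical sentence embeddings and the texts they were computed from.
score = combined_score(
    [0.2, 0.9, 0.1], [0.25, 0.8, 0.2],
    "insurance product disclosure", "disclosure of insurance products",
)
```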