1443 results about "Similarity matrix" patented technology

A similarity matrix is a matrix of scores that represent the similarity between a number of data points. Each element of the similarity matrix contains a measure of similarity between two of the data points. Similarity matrices are strongly related to their counterparts, distance matrices and substitution matrices.
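As a minimal sketch of the definition above, the following builds a similarity matrix for a handful of data points. Cosine similarity is an assumption chosen for illustration; any pairwise similarity measure could fill the same matrix. The function name `cosine_similarity_matrix` and the sample points are hypothetical.

```python
import math


def cosine_similarity_matrix(points):
    """Return an n x n matrix whose entry [i][j] is the cosine similarity of points i and j."""
    norms = [math.sqrt(sum(x * x for x in p)) for p in points]
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dot = sum(a * b for a, b in zip(points[i], points[j]))
            denom = norms[i] * norms[j]
            # Cosine similarity is undefined for zero vectors; 0.0 is used here as a convention.
            score = dot / denom if denom else 0.0
            # The measure is symmetric, so both halves of the matrix are filled at once.
            matrix[i][j] = matrix[j][i] = score
    return matrix


# Hypothetical data: three 2-D points.
points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
S = cosine_similarity_matrix(points)
```

Each row of `S` holds the similarity of one data point to every other point, and the diagonal holds each point's similarity to itself.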

Suggesting and refining user input based on original user input

Systems and methods to generate modified/refined user inputs based on the original user input, such as a search query, are disclosed. The method may be implemented for Roman-based and/or non-Roman-based languages such as Chinese. The method may generally include receiving an original user input and identifying core terms therein, determining potential alternative inputs by replacing core term(s) in the original input with another term according to a similarity matrix and/or substituting a word sequence in the original input with another word sequence according to an expansion/contraction table where one word sequence is a substring of the other, computing the likelihood of each potential alternative input, and selecting the most likely alternative inputs according to a predetermined criterion, e.g., the likelihood of the alternative input being at least that of the original input. A cache containing pre-computed original user inputs and corresponding alternative inputs may be provided.
Owner:GOOGLE LLC

Document similarity scoring and ranking method, device and computer program product

Inactive | US7689559B2 | Avoids large and wasted effort | Small similarity score | Data processing applications | Web data indexing | Document similarity | Collation
A device, computer program product and a method for searching, navigating or retrieving documents in a set of electronic documents, including performing a link analysis of the set of electronic documents. The link analysis includes one of analyzing at least two of the set of documents with at least a portion of a similarity graph constructed among the set of documents and analyzing the at least two of the set of documents with the at least a portion of the similarity graph and at least a portion of a hyperlink graph constructed from hyperlinks between the set of documents. Also described is a method for building a similarity matrix.
Owner:TELENOR AS

Four-dimensional (4D) image verification in respiratory gated radiation therapy

A method for four-dimensional (4D) image verification in respiratory gated radiation therapy, includes: acquiring 4D computed tomography (CT) images, each of the 4D CT images representing a breathing phase of a patient and tagged with a corresponding time point of a first surrogate signal; acquiring fluoroscopic images of the patient under free breathing, each of the fluoroscopic images tagged with a corresponding time point of a second surrogate signal; generating digitally reconstructed radiographs (DRRs) for each breathing phase represented by the 4D CT images; generating a similarity matrix to assess a degree of resemblance in a region of interest between the DRRs and the fluoroscopic images; computing a compounded similarity matrix by averaging values of the similarity matrix across different time points of the breathing phase during a breathing period of the patient; determining an optimal time point synchronization between the DRRs and the fluoroscopic images by using the compounded similarity matrix; and acquiring a third surrogate signal and turning a treatment beam on or off according to the optimal time point synchronization.
Owner:SIEMENS HEALTHCARE GMBH +1

Clustering using non-negative matrix factorization on sparse graphs

Object clustering techniques are disclosed. A nonnegative sparse similarity matrix is constructed for a set of objects. Nonnegative factorization of the nonnegative sparse similarity matrix is performed. Objects of the set of objects are allocated to clusters based on factor matrices generated by the nonnegative factorization of the nonnegative sparse similarity matrix.
Owner:XEROX CORP

Document similarity scoring and ranking method, device and computer program product

Inactive | US20070185871A1 | Avoids large | Avoids wasted computational effort | Data processing applications | Web data indexing | Hyperlink | Electronic document
A device, computer program product and a method for searching, navigating or retrieving documents in a set of electronic documents, including performing a link analysis of the set of electronic documents. The link analysis includes one of analyzing at least two of the set of documents with at least a portion of a similarity graph constructed among the set of documents and analyzing the at least two of the set of documents with the at least a portion of the similarity graph and at least a portion of a hyperlink graph constructed from hyperlinks between the set of documents. Also described is a method for building a similarity matrix.
Owner:TELENOR AS

Gesture recognition method based on acceleration sensor

The invention discloses a gesture recognition method based on an acceleration sensor. The gesture recognition method based on an acceleration sensor comprises the following steps: automatically collecting gesture acceleration data, preprocessing, calculating the similarity of all gesture sample data so as to obtain a similarity matrix, extracting a gesture template, constructing a gesture dictionary by utilizing the gesture template, and carrying out sparse reconstruction and gesture classification on the gesture sample data to be recognized by adopting an MSAMP (modified sparsity algorithm adaptive matching pursuit) algorithm. According to the invention, the compressed sensing technique and a traditional DTW (dynamic time warping) algorithm are combined, and the adaptability of the gesture recognition to different gesture habits is improved, and by adopting multiple preprocessing methods, the practicability of the gesture recognition method is improved. Additionally, the invention also discloses an automatic collecting algorithm of the gesture acceleration data; the additional operation of traditional gesture collection is eliminated; the user experience is improved; according to the invention, a special sensor is not required, the gesture recognition method based on the acceleration sensor can be used for terminals carried with the acceleration sensor; the hardware adaptability is favorable, and the practicability of the recognition method is enhanced. The coordinate system is uniform, and can be adaptive to different multiple gesture habits.
Owner:BEIJING UNIV OF POSTS & TELECOMM
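The gesture-recognition abstract above computes pairwise similarities between gesture samples and mentions a traditional DTW (dynamic time warping) algorithm. A minimal sketch of that idea follows, under stated assumptions: samples are treated as 1-D sequences (real accelerometer data would have three axes), and the mapping `1 / (1 + distance)` from DTW distance to similarity is an illustrative choice, not something the abstract specifies. Both function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_similarity_matrix(samples):
    """Pairwise similarity matrix; 1 / (1 + distance) is an assumed conversion."""
    n = len(samples)
    sim = [[1.0] * n for _ in range(n)]  # every sample is fully similar to itself
    for i in range(n):
        for j in range(i + 1, n):
            d = dtw_distance(samples[i], samples[j])
            sim[i][j] = sim[j][i] = 1.0 / (1.0 + d)
    return sim
```

Because DTW aligns sequences of different lengths, the same gesture performed at different speeds can still score as highly similar.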

Method and device for extracting keyword based on graph model

The embodiment of the invention provides a method and a device for extracting a keyword based on a graph model. The method comprises the steps of acquiring a to-be-processed text, and segmenting words of the to-be-processed text to obtain candidate keywords corresponding to the to-be-processed text; finding out word vectors corresponding to the candidate keywords from a word vector model, wherein the word vector model includes the word vectors of the candidate keyword; constructing a word similarity matrix of the candidate keywords according to the word vectors; acquiring a language database corresponding to the to-be-processed text, calculating global information of the candidate keywords in the language database to obtain a global weight of the candidate keywords, and taking the global weight as an initial weight of the candidate keywords, wherein the global information represents the importance degree of the candidate keywords in the language database, and the language database at least includes a search log and a network document; and ranking the candidate keywords according to the initial weight and the word similarity matrix of the candidate keyword, and extracting the keyword of the to-be-processed text. By use of the embodiment, the keyword extraction accuracy rate is effectively improved.
Owner:BEIJING QIYI CENTURY SCI & TECH CO LTD
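The keyword-extraction abstract above ranks candidate keywords from an initial weight and a word similarity matrix, but does not state the ranking rule. One plausible reading is a PageRank-style iteration over the similarity graph, with the initial weights used as the restart distribution. The sketch below assumes that reading; the function name `rank_keywords`, the damping factor, and the iteration count are all hypothetical.

```python
def rank_keywords(similarity, initial_weights, damping=0.85, iterations=50):
    """Score candidates by a PageRank-style walk over the word similarity graph.

    Assumption: initial_weights act as the restart (prior) distribution.
    """
    n = len(similarity)
    total = sum(initial_weights)
    prior = [w / total for w in initial_weights]
    # Outgoing edge weight of each word, ignoring self-similarity.
    out_sums = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    scores = prior[:]
    for _ in range(iterations):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * similarity[i][j] / out_sums[i]
                for i in range(n)
                if i != j and out_sums[i] > 0
            )
            new_scores.append((1 - damping) * prior[j] + damping * incoming)
        scores = new_scores
    return scores
```

Candidates would then be sorted by score and the top few kept as keywords. Words that are similar to many other highly weighted words rise in the ranking.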

Detection method for abnormal behavior of vehicle based on spectrum clustering

Inactive | CN102855638A | Realize abnormal lane change | Abnormal lane change found | Image analysis | Detection of traffic movement | Feature vector | Maximum eigenvalue
The invention discloses a method for detecting abnormal vehicle behavior based on spectral clustering. The detection method comprises the following steps: obtaining a space-time track of a moving target through video tracking; removing abnormal tracks and preprocessing, thereby obtaining normal tracks; constructing a graph for the tracks, thereby obtaining an undirected graph corresponding to the track sequence; calculating the similarity among tracks, thereby obtaining a similarity matrix; computing the Laplacian matrix of the similarity matrix; clustering the matrix formed by the eigenvectors of the k largest eigenvalues; after performing mode learning on the motion tracks, obtaining the motion modes of the target under a normal state; if a new track matches one of the motion modes, i.e. a normal motion mode, confirming that the traffic is normal; and if not, confirming that the vehicle is running abnormally, namely, that a traffic abnormality has occurred. Through cluster learning on vehicle tracks, the detection method monitors abnormal vehicle behavior, detects abnormal lane changes, and provides a basis for automating traffic management.
Owner:SUZHOU UNIV
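The vehicle-behavior abstract above derives a Laplacian from the trajectory similarity matrix before clustering. As a hedged illustration, the sketch below builds the unnormalized graph Laplacian L = D - W, where W is the similarity matrix with its diagonal zeroed and D is the diagonal degree matrix. Which Laplacian variant the patent uses is not stated, so this choice is an assumption, and the eigen-decomposition and clustering steps are omitted. The function name is hypothetical.

```python
def graph_laplacian(similarity):
    """Unnormalized graph Laplacian L = D - W built from a similarity matrix."""
    n = len(similarity)
    # Zero the diagonal so self-similarity does not count as an edge.
    w = [[0.0 if i == j else similarity[i][j] for j in range(n)] for i in range(n)]
    degrees = [sum(row) for row in w]
    return [
        [(degrees[i] if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical similarities among three trajectories.
S = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.2], [0.1, 0.2, 1.0]]
L = graph_laplacian(S)
```

Every row of `L` sums to zero, which is a quick sanity check that the degree matrix and the weights agree.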

Systems And Methods For Mining Transactional And Time Series Data

In accordance with the teachings described herein, systems and methods are provided for analyzing transactional data. A similarity analysis program may be used that receives time-series data relating to transactions of an organization and performs a similarity analysis of the time-series data to generate a similarity matrix. A data reduction program may be used that receives the time-series data and performs one or more dimension reduction operations on the time-series data to generate reduced time-series data. A distance analysis program may be used that performs a distance analysis using the similarity matrix and the reduced time-series data to generate a distance matrix. A data analysis program may be used that performs a data analysis operation, such as a data mining operation, using the distance matrix to generate a data mining analysis of the transactional data.
Owner:SAS INSTITUTE

Article information recommending method and device

The invention discloses an article information recommending method and device. The method comprises the steps of obtaining attribute information and user behavior data of an access user when an article access request is received; obtaining a corresponding candidate article set; determining articles satisfying preset conditions in the candidate article set based on a similarity matrix, the attribute information and the user behavior data, wherein the similarity matrix is used for indicating the similarities among the candidate articles, and the similarities between the candidate articles and the attribute information; and recommending the information of the articles satisfying the preset conditions to the access user. Through adoption of the similarity matrix, the attribute information of the access user and recent different click and consumption behaviors to the articles, intention prediction is carried out on user access; and therefore, the articles suitable for the user are determined and recommended to the user. Compared with the mode of carrying out recommendation through prediction of the click-through-rate scores of the user to the articles based on a linear model, the method and the device have the advantages of improving individuation of the recommendation results and improving the accuracy of the recommendation results.
Owner:TENCENT TECH (SHENZHEN) CO LTD

System and method for quantifying, representing, and identifying similarities in data streams

A method of quantifying similarities between sequential data streams typically includes providing a pair of sequential data streams, designing a Hidden Markov Model (HMM) of at least a portion of each stream, and computing a quantitative measure of similarity between the streams using the HMMs. For a plurality of sequential data streams, a matrix of quantitative measures of similarity may be created. A spectral analysis may be performed on this similarity matrix to define a multi-dimensional diffusion space, and the plurality of sequential data streams may be graphically represented and/or sorted according to the similarities therebetween. In addition, semi-supervised and active learning algorithms may be utilized to learn a user's preferences for data streams and recommend additional data streams that are similar to those preferred by the user. Multi-task learning algorithms may also be applied.
Owner:CARIN LAWRENCE +4

Systems and methods for mining transactional and time series data

In accordance with the teachings described herein, systems and methods are provided for analyzing transactional data. A similarity analysis program may be used that receives time-series data relating to transactions of an organization and performs a similarity analysis of the time-series data to generate a similarity matrix. A data reduction program may be used that receives the time-series data and performs one or more dimension reduction operations on the time-series data to generate reduced time-series data. A distance analysis program may be used that performs a distance analysis using the similarity matrix and the reduced time-series data to generate a distance matrix. A data analysis program may be used that performs a data analysis operation, such as a data mining operation, using the distance matrix to generate a data mining analysis of the transactional data.
Owner:SAS INSTITUTE

Two-stage hybrid particle swarm optimization clustering method

The invention relates to a two-stage hybrid particle swarm optimization clustering method, which is mainly used for solving the problems of greater time consumption and low accuracy of the conventional particle swarm optimization K-mean clustering method when the number of dimensions of samples is higher. The technical scheme disclosed by the invention comprises the following steps: (1) reading a data set and the number K of clusters; (2) taking statistics on information of dimensionality; (3) standardizing the dimensionality; (4) calculating a similarity matrix; (5) generating a candidate initial clustering center; (6) performing particle swarm K-mean partitional clustering; and (7) outputting a particle swarm optimal fitness value and a corresponding data set class cluster partition result. According to the two-stage hybrid particle swarm optimization clustering method disclosed by the invention, the first-stage clustering is firstly performed by adopting agglomerative hierarchical clustering, a simplified particle encoding way is provided, the second-stage clustering is performed on data by particle swarm optimization K-mean clustering, the advantages of hierarchical agglomeration, K-mean and particle swarm optimization methods are integrated, the clustering speed is accelerated, and the global convergence ability and the accuracy of the clustering result of the method are improved.
Owner:XIDIAN UNIV

Methods and systems for using map-reduce for large-scale analysis of graph-based data

Embodiments are described for a method for processing graph data by executing a Markov Clustering algorithm (MCL) to find clusters of vertices of the graph data, organizing the graph data by column by calculating a probability percentage for each column of a similarity matrix of the graph data to produce column data, generating a probability matrix of states of the column data, performing an expansion of the probability matrix by computing a power of the matrix using a Map-Reduce model executed in a processor-based computing device, and organizing the probability matrix into a set of sub-matrices to find the least amount of data needed for the Map-Reduce model, given that two lines of data in the matrix are required to compute a single value for the power of the matrix. One of at least two strategies may be used to compute the power of the matrix (matrix square, M^2), chosen for simplicity of execution or for improved memory usage.
Owner:SALESFORCE COM INC
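The Markov Clustering abstract above turns the columns of a similarity matrix into probabilities and then expands the result by squaring it. A minimal single-machine sketch of those two steps follows; it does not attempt the Map-Reduce partitioning the patent describes, and the function names and sample matrix are hypothetical.

```python
def column_normalize(matrix):
    """Scale each column to sum to 1, giving a column-stochastic probability matrix."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def expand(matrix):
    """MCL expansion step: the matrix square.

    Entry [i][j] needs only row i and column j of the input, which matches the
    abstract's remark that two lines of data yield a single value of the power.
    """
    n = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical similarity matrix of a 3-vertex path graph with self-loops.
similarity = [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
P = column_normalize(similarity)
P2 = expand(P)
```

The square of a column-stochastic matrix is again column-stochastic, so the columns of `P2` can still be read as probability distributions.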

Social network node clustering system and method

Users in a social network are represented by nodes on a network graph. A similarity processor generates a similarity matrix of nodes and neighbors. A clustering processor groups select nodes based on similarity. Nodes initially assigned to one cluster are selectively added to other clusters based on similarity. A social network processor provides features and processing based on the clusters of nodes thus produced.
Owner:GOOGLE LLC

Method for Anomaly Detection in Time Series Data Based on Spectral Partitioning

Anomalies in real time series are detected by first determining a similarity matrix of pairwise similarities between pairs of normal time series data. A spectral clustering procedure is applied to the similarity matrix to partition variables representing dimensions of the time series data into mutually exclusive groups. A model of normal behavior is estimated for each group. Then, for the real time series data, an anomaly score is determined, using the model for each group, and the anomaly score is compared to a predetermined threshold to signal the anomaly.
Owner:MITSUBISHI ELECTRIC RES LAB INC

Cloud computing based real-time mass user behavior analyzing method and system

Inactive | CN103793465A | Improve interest analysis efficiency | Effective and accurate push | Database distribution/replication | Special data processing applications | Real time analysis | Recommendation model
The invention discloses a cloud computing based real-time mass user behavior analyzing method and system. User behaviors and context data are collected by a client end in real time and pre-processed and clustered on the basis of a MapReduce model; ontology data are subjected to reasoning, and latest interests of users are analyzed in real time; a track recurrence algorithm on the basis of a user behavior context is provided for track filling; interest similarity of users is computed by a cosine factor method, an interest similarity matrix is established; a Markov transfer matrix and a collaborative filtering based Markov recommendation model are established to realize effective and precise recommendation. The user behaviors and context information are subjected to model establishment via ontology, and semantic-level sharing and reusing of large-scale behavior information are achieved via an Hbase (hadoop database) based ontology memory mode. Technologies of cloud-computing, the ontology, reasoning and knowledge discovering are combined, and problems in instantaneity, efficiency, large-scale memorizing and intelligentization in mass user behavior analysis are solved.
Owner:WUHAN UNIV OF TECH

Systems and methods for media summarization

A stream of ordered information, such as, for example, audio, video and / or text data, can be windowed and parameterized. A similarity between the parameterized and windowed stream of ordered information can be determined, and a probabilistic decomposition or probabilistic matrix factorization, such as non-negative matrix factorization, can be applied to the similarity matrix. The component matrices resulting from the decomposition indicate major components or segments of the ordered information. Excerpts can then be extracted from the stream of ordered information based on the component matrices to generate a summary of the stream of ordered information.
Owner:FUJIFILM BUSINESS INNOVATION CORP

Personalized recommendation method with socialization information fused

The present invention provides a personalized recommendation method with socialization information fused. The method comprises the following steps: S1, constructing a user-user trust matrix; S2, constructing a project-project tag similarity matrix; S3, constructing and training a model; and S4, predicting a preference of a user for an unknown project. The method provided by the present invention mainly has the following advantages: 1) a sorting learning method in the information retrieval field is applied to Top-K recommendation, so that the sorting problem in the recommendation system is effectively solved, and the defect that the conventional score-based prediction method cannot perform Top-K recommendation is overcome; and 2) socialization information, i.e. user social information and project tag information, is fused into the model based on sorting learning, so that accuracy of recommendation results is improved.
Owner:DALIAN UNIV OF TECH

Multi-source heterogeneous multi-attribute POI (point of interest) integration method

The invention discloses a multi-source heterogeneous multi-attribute POI (point of interest) integration method. The method includes first acquiring the data sets to be integrated from POI data sources A and B, and deduplicating the two heterogeneous-attribute data sets respectively; traversing each POI in the two data sets and calculating the similarity of each attribute of the POI according to the attribute similarity calculation rules to obtain an attribute similarity matrix; solving a weighted multi-attribute POI similarity vector; calculating the maximum value Max of the components of the POI similarity vector and comparing Max with a threshold value T; and merging the differing attribute items of POIs that represent the same geographic entity while consolidating attribute items that share the same value. The attributes of a POI are weighted differently according to their overall importance and degree of influence, so the method better matches how POI integration is performed in practice, and the accuracy and efficiency of automatic POI integration can be remarkably improved.
Owner:WUHAN UNIV
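The POI integration abstract above combines per-attribute similarities into a weighted similarity vector, takes its maximum, and compares that maximum with a threshold T. A minimal sketch of that decision step follows. The layout (one row per candidate POI from source B, one column per attribute), the attribute weights, and the function name `match_poi` are all assumptions for illustration.

```python
def match_poi(attribute_similarity, weights, threshold):
    """Pick the best-matching candidate for one query POI, or None if below threshold.

    attribute_similarity[k][a] is the similarity of attribute a between the
    query POI and candidate k (assumed layout).
    """
    # Weighted multi-attribute similarity vector: one score per candidate.
    weighted = [sum(w * s for w, s in zip(weights, row)) for row in attribute_similarity]
    best = max(range(len(weighted)), key=weighted.__getitem__)
    if weighted[best] >= threshold:
        return best, weighted
    return None, weighted


# Hypothetical attributes: name, address, category.
attribute_similarity = [[0.9, 0.8, 1.0], [0.2, 0.1, 0.0]]
weights = [0.5, 0.3, 0.2]
match, vector = match_poi(attribute_similarity, weights, threshold=0.7)
```

When the maximum score clears the threshold, the two POIs are treated as the same geographic entity and their attributes can be merged. Otherwise the query POI is kept as a separate entity.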

Multidimensional individualized recommendation method in heterogeneous network

The invention relates to a multidimensional individualized recommendation method in a heterogeneous network, which includes: 1, obtaining information; 2, establishing a similarity matrix between users and items; 3, establishing a semi-structured heterogeneous information network, calculating the degree of correlation between users and items and between users and users under different meta paths based on the semi-structured heterogeneous information network, and assigning different weights to the degree of correlation of each meta path to form a similarity matrix between users and other types of entities in the semi-structured heterogeneous information network; 4, assigning different weights and integrating the degree of correlation between users and items and between users and users with the users' preferences for items to form the final similarity matrix; 5, recommending to users a plurality of items with high similarity in the final similarity matrix. The method adds user subordinate information and item subordinate information, takes into account the rich semantic information between users and items, and improves recommendation accuracy and coverage.
Owner:SHANDONG UNIV

Information processing method and device

The embodiment of the invention discloses an information processing method and device. The method comprises the steps that an HTML document set which is obtained in advance is analyzed, and the text data sets contained in the HTML document set are extracted; word segmentation is conducted on the text data sets, and a text segmentation table is obtained; word frequency analysis is conducted on all words in the text segmentation table, and a text vector space matrix is constructed; discrete point text vectors in the text vector space matrix are eliminated, and a text similarity matrix of all remaining text vectors in the text vector space matrix is obtained; according to the text similarity matrix, topic clustering is conducted on the text data sets. By means of the method, a word list can be accurately constructed, topic clustering is conducted after the discrete point text vectors are eliminated, the topic clustering speed is increased, and the topic clustering accuracy is improved.
Owner:中国联合网络通信有限公司广东省分公司 +1

Visual target tracking method based on self-adaptive subject sensitivity

The invention discloses a visual target tracking method based on self-adaptive subject sensitivity, and belongs to the technical field of computer vision. The visual target tracking method comprises an overall process, an offline part and an online part. The overall process includes: designing a target tracking process, and designing a network structure; adjusting the feature map of each stage of the network to an adaptive size to complete the end-to-end tracking process of the twin network. The offline part comprises six steps: generating a training sample library; carrying out forward tracking training; calculating a back propagation gradient; calculating a gradient loss item; generating a target template image mask; and training a network model and obtaining the model. The online part comprises three steps: carrying out model updating; carrying out online tracking; and positioning a target area. The model updating comprises forward tracking, back propagation gradient calculation, gradient loss item calculation and target template image mask generation; the online tracking comprises the steps of performing forward tracking to obtain a similarity matrix, calculating the confidence coefficient of the current tracking result and returning a target area. The method tracks targets more robustly under appearance changes.
Owner:BEIJING UNIV OF TECH

Combined wrong question recommendation method based on knowledge graph

Active | CN107273490A | Understanding Semantic Associations | Improve accuracy | Special data processing applications | Text mining | Near neighbor
The invention discloses a combined wrong question recommendation method based on a knowledge graph. With this method, wrong questions relevant to a learner's weak knowledge points can be accurately recommended to the learner. The method comprises the steps that knowledge is extracted from large-scale unstructured test question data to establish the knowledge graph; text mining and word segmentation are conducted on the learner's wrong questions to extract wrong question keywords, and thus the knowledge points included in the wrong questions are determined; semantic near neighbors of the knowledge points are obtained by analyzing the semantic similarity of the test questions; the wrong question knowledge points are mapped into the knowledge graph to obtain test question entities conforming to those knowledge points. In addition, similarity weight calculation is conducted on a test question library to obtain a similarity matrix of test papers, and a collaborative filtering technology is utilized to obtain recommended test questions for the wrong questions. Finally, the two recommendation results are further combined in weighting, mixing, superposing and element-level modes, and a final recommendation result is given.
Owner:BEIJING UNIV OF TECH

Windowed Statistical Analysis For Anomaly Detection In Geophysical Datasets

Method for identifying geologic features from geophysical or attribute data using windowed principal component (22), or independent component, or diffusion mapping (61) analysis. Subtle features are made identifiable in partial or residual data volumes. The residual data volumes (24) are created by (36) eliminating data not captured by the most prominent principal components (14). The partial data volumes are created by (35) projecting the data (21) on to selected principal components (22, 61). Geologic features may also be identified from pattern analysis (77) or anomaly volumes (62, 79) generated with a variable-scale data similarity matrix (73). The method is suitable for identifying physical features indicative of hydrocarbon potential.
Owner:EXXONMOBIL UPSTREAM RES CO

Recommendation algorithm based on multi-index grading

Inactive | CN105095477A | Collinearity cannot be eliminated | Recommendation results improve | Special data processing applications | Personalization | Cluster algorithm
The invention discloses a recommendation algorithm based on multi-index grading. The recommendation algorithm comprises the following steps of firstly, recognizing index keywords, secondly, extracting suggestion grading, thirdly, constructing a user and commodity similarity matrix, fourthly, using a two-way clustering algorithm for obtaining a clustering matrix, fifthly, conducting single in-cluster recommendation and sixthly using a comprehensive function algorithm for obtaining a final recommendation result. According to the recommendation algorithm, the problem that a user may need individual recommendations for different index preferences for different commodities can be solved, the high accuracy is achieved, and the recommendation result with the higher quality can be obtained.
Owner:SOUTH CHINA UNIV OF TECH

Load curve clustering method based on improved spectral and multi-manifold clustering

The invention discloses a load curve clustering method based on improved spectral and multi-manifold clustering. The load curve clustering method comprises three steps: typical daily load curve extraction, load curve clustering and clustering effect evaluation. Firstly, load characteristic indexes of a user are extracted, and typical daily load curves of the user are calculated and extracted by combining a non-parametric kernel density estimation method; canonical warping distance is introduced into an improved spectral and multi-manifold clustering algorithm to measure curve similarity, local similarity is calculated by adopting a Gaussian kernel function, and a similarity matrix is calculated based on the local similarity; and various clustering effectiveness indexes are adopted for evaluating the clustering result and algorithm performance after clustering. Load data of a plurality of users in the Baoding area are used as example samples for clustering analysis, verifying the rationality and superiority of the typical daily load curve extraction method and the improved spectral and multi-manifold clustering algorithm disclosed in the invention.
Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)
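The load-curve abstract above computes local similarity with a Gaussian kernel function applied to a curve distance. A minimal sketch of that conversion follows, assuming the pairwise distances are already available as a matrix and using the common form exp(-d^2 / (2 * sigma^2)). The bandwidth `sigma` and the function name are assumptions, and the patent's exact kernel parameterization is not stated.

```python
import math


def gaussian_similarity(distances, sigma=1.0):
    """Map a pairwise distance matrix to similarities with a Gaussian kernel."""
    return [
        [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row]
        for row in distances
    ]


# Hypothetical distances between three load curves.
D = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
G = gaussian_similarity(D, sigma=1.0)
```

Zero distance maps to similarity 1, and similarity decays toward 0 as distance grows. A smaller `sigma` makes the decay sharper, so only very close curves count as similar.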

Unsupervised pedestrian re-identification method based on transfer learning

Inactive | CN110135295A | Implement the re-identification method | Improve learning effect | Biometric pattern recognition | Data set | Feature set
The invention discloses an unsupervised pedestrian re-identification method based on transfer learning, and the method comprises the following steps: 1) pre-training a CNN model on a source data set with labels, and employing cross entropy loss and ternary metric loss as a target optimization function; 2) extracting pedestrian characteristics of the label-free target data set; 3) calculating a feature similarity matrix by combining the candidate column distance and the absolute distance; 4) performing density clustering on the similarity matrix, setting a label for each feature set with the distance smaller than a preset threshold value, and recombining the feature sets into a target data set with labels; 5) training the CNN model on the recombined data set until convergence; 6) repeating the steps 2)-5) according to a preset number of iterations; and 7) inputting the test pictures into the model to extract features, and sorting the test pictures according to feature similarity to obtain a result. The source domain labeled data and the target domain unlabeled data are reasonably applied, the accuracy of pedestrian re-identification is improved in the target domain, and the strong dependence on the labeled data is reduced.
Owner:SOUTH CHINA UNIV OF TECH

Image segmentation method based on super pixel clustering

The invention discloses an image segmentation method based on super pixel clustering. More specifically the image segmentation method includes the steps of: 1, segmenting images by using a Simple Linear Iterative Clustering (SLIC) algorithm, and generating super pixels; 2, improving a construction mode of a similarity matrix about the super pixels, and fusing a color characteristic and a texture characteristic through non-symmetry of the similarity matrix; 3, clustering the super pixels through an Affinity Propagation (AP) clustering algorithm based on the similarity matrix; and 4, reaching the purpose of image segmentation by adding spatial information of the super pixels and dividing a disconnected region into different types of super pixel groups by means of breadth-first traversal. The image segmentation method based on the super pixel clustering has good segmentation effect and a fast convergence speed. Target objects can be effectively segmented without arrangement of target quantity.
Owner:SOUTHEAST UNIV

Information search method and system based on interactive document clustering

The invention provides an information search method and system based on interactive document clustering. The method comprises the following steps: a document set is horizontally partitioned and preprocessed; word frequency statistics are computed, and high-frequency words constitute a characteristic word set; a vector space representation of the documents is generated, the distances between the documents are calculated, and a similarity matrix is generated; a Laplacian matrix is generated, the number of clusters and a representation matrix are determined according to the gaps between eigenvalues of the Laplacian matrix, secondary clustering is conducted, and initial distance results are obtained; users interact with the initial distance results, new characteristic words are mined through chi-square statistics, the vector space is reconstructed, and the clustering process is repeated; finally, the clustering results are shown to the users, so that the users obtain different categories of search results. The information search method and system adopt a semi-supervised learning approach in which users intervene to cluster and analyze the documents.
Owner:PEKING UNIV