95 results about "Cluster labeling" patented technology

In natural language processing and information retrieval, cluster labeling is the problem of picking descriptive, human-readable labels for the clusters produced by a document clustering algorithm; standard clustering algorithms do not typically produce any such labels. Cluster labeling algorithms examine the contents of the documents in each cluster to find labels that summarize the topic of each cluster and distinguish the clusters from each other.

Tag clustering method and system

Inactive | CN102129470A | Overcome the defect of inaccurate calculation of label similarity | Improve accuracy | Special data processing applications | Feature vector | Cluster systems
The embodiment of the invention discloses a tag clustering method and a tag clustering system. The method comprises the steps of: establishing a feature vector for every tag to be clustered; calculating the cosine of the angle between each pair of feature vectors in Euclidean space to obtain the similarity between every two tags to be clustered; and clustering the tags with the K-Means algorithm according to these similarities. The tag clustering system comprises: a feature vector establishing module, which establishes the feature vector of every tag to be clustered; a similarity calculating module, which calculates the cosine of the angle between each pair of feature vectors in Euclidean space to obtain the similarity between every two tags; and a clustering module, which clusters the tags with the K-Means algorithm according to these similarities. The technical scheme overcomes the inaccurate tag similarity calculation of current collaborative tagging systems, addresses the problems of disordered tag organization and fuzzy tag semantics, and effectively improves the accuracy of tag clustering.
Owner:UNIV OF SCI & TECH OF CHINA
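
The abstract above combines cosine similarity between tag feature vectors with K-Means. A minimal sketch of that combination follows, assuming that each tag already has a numeric feature vector and that points are assigned to the centroid with the highest cosine similarity. The tag names, the vectors, the centroid initialization and the function names are illustrative assumptions; the patent does not specify any of them.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def kmeans_cosine(vectors, k, max_iter=50):
    """K-means-style loop that assigns each vector to its most similar centroid.

    Assumption: the first k vectors seed the centroids, which keeps the
    sketch deterministic. The patent does not state an initialization.
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(k), key=lambda c: cosine_similarity(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical tags with made-up co-occurrence counts as feature vectors.
tags = ["python", "java", "pasta", "pizza"]
vectors = [[5, 4, 0, 0], [4, 5, 0, 1], [0, 0, 5, 4], [0, 1, 4, 5]]
labels = kmeans_cosine(vectors, k=2)
for tag, label in zip(tags, labels):
    print(tag, label)
```

In this toy data the two programming tags end up in one cluster and the two food tags in the other, even though both initial centroids come from the programming tags.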

Chinese Web document online clustering method based on common substrings

The invention discloses a Chinese Web document online clustering method based on common substrings. With the sharp increase of information on the internet, search engines have become important tools for searching and locating information. Web document clustering can automatically group the results returned by a search engine by theme, helping users narrow the query range and quickly locate the information they need. Online clustering of Web documents has two requirements: it must handle the non-numerical, unstructured nature of Web documents, and its clustering time must meet the online search needs of users. To satisfy both, the invention provides a Chinese Web document online clustering method based on common substrings, comprising the following steps: (1) preprocessing the first n query results returned by the search engine, deleting or replacing non-Chinese characters in the results; (2) extracting common substrings from the Web documents by utilizing GSA; (3) weighting the extracted common substrings with a formula modeled on TF*IDF and building a document feature vector model; (4) computing the pairwise similarity of the Web documents on the basis of the model to obtain a similarity matrix; (5) clustering the Web documents on the basis of the matrix with an improved hierarchical clustering algorithm; and (6) generating cluster descriptions and extracting labels. The method has clear advantages in performance, cluster label generation, and clustering time.
Owner:BEIHANG UNIV
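
Steps (3) to (5) of the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the features are taken as already extracted (English words stand in for common substrings), the weighting is the textbook TF*IDF rather than the patent's own formula, and plain average-link agglomerative clustering with a similarity threshold stands in for the "improved" hierarchical algorithm. All function names and the sample data are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Weight each feature by term frequency times log(N / document frequency)."""
    n = len(docs)
    df = Counter(f for doc in docs for f in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({f: count * math.log(n / df[f]) for f, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def average_link_clusters(sim, threshold):
    """Merge the most similar pair of clusters until none reaches the threshold."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                score = total / (len(clusters[a]) * len(clusters[b]))
                if score > best:
                    best, pair = score, (a, b)
        if best < threshold:
            break
        a, b = pair
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters

# Hypothetical features per search result (stand-ins for common substrings).
docs = [
    ["apple", "phone", "price"],
    ["apple", "phone", "review"],
    ["apple", "fruit", "juice"],
    ["fruit", "juice", "recipe"],
]
vectors = tfidf_vectors(docs)
sim = [[cosine(u, v) for v in vectors] for u in vectors]
clusters = average_link_clusters(sim, threshold=0.2)
print(clusters)
```

The threshold of 0.2 is an arbitrary choice for this toy data; with it, the two phone results and the two fruit results form separate clusters.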

SAR image segmentation system and segmentation method based on immune clone and projection pursuit

Inactive | CN101667292A | Excellent projection direction | Excellent space | Image analysis | Genetic models | Feature extraction | Co-occurrence
The invention discloses an SAR image segmentation system and a segmentation method based on immune clone and projection pursuit. The system comprises an image feature extraction module, an initial label selection submodule, a projection direction selection submodule and a subspace clustering submodule. The image feature extraction module extracts the gray-level co-occurrence matrix features, wavelet features, brushlet features and contourlet features of an input image. The initial label selection submodule clusters the image features to obtain an initial label and passes it to the projection direction selection submodule, which calculates a linear discriminant analysis projection index and obtains an optimal projection direction. The subspace clustering submodule projects the image features onto the optimal projection direction to obtain an optimal subspace, clusters that subspace to obtain a clustering label, and returns the clustering label to the initial label selection submodule for iteration; the final clustering label of each image pixel gives the final image segmentation result. The invention has the advantage of high segmentation accuracy and can be applied in civil and industrial fields or as a means of military reconnaissance.
Owner:XIDIAN UNIV

Intelligent adaptive equalizer and equalization demodulation method based on machine learning

The invention discloses an intelligent adaptive equalizer and an equalization demodulation method based on machine learning. The method includes the following steps. The acquired signal data is preprocessed, and the energy of the input data is normalized. Then, without any other prior knowledge, the cluster groups in the data are identified by using a Gaussian kernel function and a distance function, and the points in the cluster groups are labeled by using the nearest neighbor algorithm, turning the modulated signal into useful information. Discrete noise points outside the cluster groups are not clustered, and the cluster halo points that have no clustering labels are labeled by using the weighted K-nearest neighbor algorithm. Finally, all the data is combined to obtain an overall label, and the error rate of the system is estimated by comparing the overall label with pre-stored labels. With this method, the real cluster centers can be identified without any other prior knowledge, regardless of cluster shape and size. The computational complexity is reduced, and the accuracy of the classification result is significantly improved. The method can adapt to most of the modulation formats used in current communication systems.
Owner:SUZHOU UNIV
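
The weighted K-nearest neighbor step mentioned in the abstract above can be illustrated with a generic distance-weighted vote. This is a sketch of the standard technique only; the inverse-distance weighting, the value of k, the function name and the sample constellation points are assumptions, since the abstract does not give the weighting actually used.

```python
import math
from collections import defaultdict

def weighted_knn_label(point, labeled, k=3):
    """Label `point` by an inverse-distance-weighted vote of its k nearest neighbors.

    `labeled` is a list of (coordinates, label) pairs.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    votes = defaultdict(float)
    for coords, label in nearest:
        # Closer neighbors get larger weights; the small constant avoids
        # division by zero when the point coincides with a neighbor.
        votes[label] += 1.0 / (math.dist(point, coords) + 1e-9)
    return max(votes, key=votes.get)

# Hypothetical received symbols that already carry cluster labels.
labeled = [
    ((1.0, 1.0), 0),
    ((1.1, 0.9), 0),
    ((-1.0, 1.0), 1),
    ((-0.9, 1.1), 1),
]
halo_point = (0.6, 0.8)  # an unlabeled point at the edge of cluster 0
halo_label = weighted_knn_label(halo_point, labeled, k=3)
print(halo_label)
```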

Image segmentation method based on iteration self-organization and multi-agent inheritance clustering algorithm

Inactive | CN104050680A | Overcoming the disadvantages of relying on | Improve stability | Image analysis | Pattern recognition | Local optimum
The invention discloses an image segmentation method based on an iteration self-organization and multi-agent inheritance clustering algorithm. The method mainly addresses two problems of the prior art: the segmentation result depends excessively on initial parameters, and the search easily falls into a local optimum. The segmentation steps are: 1) extracting the gray-level information of the image to be segmented; 2) applying the idea of the iterative self-organizing algorithm ISODATA to the image to obtain the optimal number of clusters; 3) clustering the image within a multi-agent algorithm framework, according to the optimal number of clusters, to obtain an optimal clustering label; 4) classifying the pixels of the image according to the optimal clustering label to achieve image segmentation. With this method the number of clusters does not need to be specified in advance, convergence is good, and the global optimum is easier to reach. The method improves the quality of image segmentation and the stability of the segmentation result, and can be used for extraction and identification of image targets and other follow-up processing.
Owner:XIDIAN UNIV

Human voice segmentation method and device

Active | CN107967912A | Reduce workload | Solve technical problems that are inefficient and time-consuming | Speech recognition | Feature vector | Chronological time
An embodiment of the invention provides a human voice segmentation method and a human voice segmentation device. The method comprises the steps of: extracting feature vectors from audio data; performing voice activity detection on the audio data and labeling silent segments and voice segments separately; extracting the voice segments according to the labels, splitting them into pieces of a predetermined length, clustering the feature vectors in the split voice segments with a probability distribution clustering method, and outputting the corresponding clustering labels; and arranging the voice segments of the different clustering labels in time order, then merging them and outputting the voice segments with their clustering labels. Because the method clusters with a probability distribution clustering method, it can cluster the voice feature vectors rapidly without modeling the voice segments. Because it adds voice activity detection and processes only the voice segments, it improves working efficiency and solves the technical problem of low efficiency and long processing time in traditional human voice segmentation systems.
Owner:SPEAKIN TECH CO LTD
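
The final step of the abstract above, arranging labeled voice segments in time order and merging them, might look like the sketch below. It assumes each segment is a (start, end, label) tuple with times in whole seconds, and that only segments that touch and share a label are merged; the abstract does not state the merge rule, so both the data layout and the rule are assumptions.

```python
def merge_segments(segments):
    """Sort (start, end, label) segments by time and merge touching same-label runs."""
    merged = []
    for start, end, label in sorted(segments):
        if merged and merged[-1][2] == label and merged[-1][1] == start:
            # Same speaker label and no gap: extend the previous segment.
            merged[-1] = (merged[-1][0], end, label)
        else:
            merged.append((start, end, label))
    return merged

# Hypothetical clustered pieces, given out of order.
pieces = [(2, 4, "B"), (0, 1, "A"), (1, 2, "A"), (4, 5, "A")]
timeline = merge_segments(pieces)
print(timeline)  # [(0, 2, 'A'), (2, 4, 'B'), (4, 5, 'A')]
```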

Object clustering method and device, computer readable medium and electronic equipment

The embodiment of the invention provides an object clustering method and device, a computer readable medium and electronic equipment. The object clustering method comprises the following steps: obtaining an interest label of each object; generating a label sequence for each object based on the interest label; clustering the objects according to their label sequences; obtaining the object group corresponding to each clustering label; and, when the number of objects in an object group is smaller than an object number threshold, merging the objects of that group into an object group associated with its clustering label, so as to obtain target groups whose object number is greater than or equal to the threshold. In this technical scheme, each target group has a clustering label that accurately represents the preference information of its objects, and the target groups are balanced in size. The cluster groups can therefore be processed according to their clustering labels, which improves the accuracy and balance of object clustering and, in turn, the accuracy of processing the cluster groups.
Owner:TENCENT TECH (SHENZHEN) CO LTD

A semi-supervised tourist portrait data clustering method based on density peaks and gravitation influences

The invention relates to a semi-supervised tourist portrait data clustering method based on density peaks and gravitation influences, which comprises the following steps: calculating the density value and distance value of every point of the tourist portrait data with a density peak algorithm, and finding all possible clustering center points; calculating the distances between the provided tourist portrait seed points and the possible clustering center points, voting to screen out the accurate clustering center points, and attaching clustering labels to those center points by using the seed label information; randomly selecting a subset of the seed data in a certain proportion, and calculating the gravitation influence between the seed data subset and each unlabeled data point, following the idea of the law of universal gravitation, thereby clustering all the unlabeled data and attaching corresponding cluster labels to them; and repeating the random selection of a seed data subset multiple times, each time attaching a decision cluster label to the unlabeled data, and voting to select the final cluster label of each unlabeled data point. The method clusters well and has high accuracy.
Owner:ZHEJIANG UNIV OF TECH
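
The first step of the abstract above, computing a density value and a distance value for every point and picking candidate centers, follows the general density peak idea. A minimal sketch of that idea is given here, assuming a simple cutoff-count density and a ranking by the product of density and distance. The seed voting and gravitation steps of the patent are not covered, and the function name, the cutoff and the sample points are illustrative assumptions.

```python
import math

def density_peak_candidates(points, cutoff, n_centers):
    """Return (rho, delta, centers) for a basic density peak analysis."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho: number of other points closer than the cutoff distance.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Visit points from densest to sparsest; ties keep their input order.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point gets its largest distance to any point.
            delta[i] = max(dist[i])
        else:
            # Distance to the nearest point that ranks as denser.
            delta[i] = min(dist[i][j] for j in order[:rank])
    # Candidate centers combine high density with a large distance value.
    gamma = [r * d for r, d in zip(rho, delta)]
    centers = sorted(range(n), key=lambda i: -gamma[i])[:n_centers]
    return rho, delta, centers

# Two hypothetical groups of points, each a square with a point in the middle.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
]
rho, delta, centers = density_peak_candidates(points, cutoff=1.0, n_centers=2)
print(centers)
```

With this toy data the two middle points (indices 4 and 9) are the candidates, because they are the only points with four neighbors inside the cutoff and they lie far from each other.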

Training data set generation method and device, electronic equipment and storage medium

The invention provides a training data set generation method and device, electronic equipment and a storage medium. The method comprises the steps of: obtaining a classified source data set and an unclassified target data set; extracting a first feature vector set from the source data set and a second feature vector set from the target data set with a feature extractor; determining the class center feature vectors of the source data set according to the first feature vector set, and determining the clustering labels of the target data set and the average feature vector of each cluster according to the second feature vector set; iteratively optimizing the feature extractor so as to minimize both the overall difference between the feature vectors of the samples in the source data set and their class center feature vectors, and the overall difference between the feature vectors of the elements in each cluster and the average feature vector of that cluster; and obtaining a training data set according to the clustering labels of the target data set and the elements in the clusters. The method reduces the workload and cost of manual labeling and improves labeling precision.
Owner:创新奇智(合肥)科技有限公司
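
The quantity that the abstract above minimizes can be illustrated with a small calculation. The sketch assumes that "overall difference" means the sum of squared Euclidean distances between each feature vector and the mean vector of its class or cluster; the abstract does not define the measure, so this choice, the function names and the sample numbers are all assumptions. The feature extractor and its optimization are not modeled.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def center_loss(features, labels):
    """Return per-group centers and the summed squared distance to them."""
    groups = {}
    for feature, label in zip(features, labels):
        groups.setdefault(label, []).append(feature)
    centers = {label: mean_vector(vs) for label, vs in groups.items()}
    loss = sum(
        sum((a - b) ** 2 for a, b in zip(feature, centers[label]))
        for feature, label in zip(features, labels)
    )
    return centers, loss

# Hypothetical extracted features: labeled source samples and clustered target samples.
source_features = [[0.0, 0.0], [2.0, 0.0]]
source_classes = ["a", "a"]
target_features = [[5.0, 5.0], [5.0, 7.0]]
target_clusters = [0, 0]

class_centers, source_loss = center_loss(source_features, source_classes)
cluster_means, target_loss = center_loss(target_features, target_clusters)
total_loss = source_loss + target_loss  # the value an optimizer would try to reduce
print(class_centers, cluster_means, total_loss)
```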