82 results about "Medoid" patented technology

Medoids are representative objects of a data set, or of a cluster within a data set, whose average dissimilarity to all the objects in the cluster is minimal. Medoids are similar in concept to means or centroids, but medoids are always restricted to be members of the data set. Medoids are most commonly used on data for which a mean or centroid cannot be defined, such as graphs. They are also used where the centroid is not representative of the data set, as with images, 3-D trajectories, and gene expression (where the data is sparse but the medoid need not be). They are also of interest when a representative must be found using some distance other than squared Euclidean distance (for instance, in movie ratings).
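
As a minimal sketch of the definition above, the following Python snippet picks the medoid of a small point set and contrasts it with the centroid. The function name `medoid`, the sample points, and the choice of Euclidean distance as the default dissimilarity are illustrative assumptions, not part of any listed patent.

```python
from math import dist

def medoid(points, dissimilarity=dist):
    """Return the member of `points` with the smallest total dissimilarity to all members."""
    # Minimizing the total is equivalent to minimizing the average over a fixed set.
    return min(points, key=lambda p: sum(dissimilarity(p, q) for q in points))

# Hypothetical sample data: three nearby points and one distant point.
points = [(1.0, 1.0), (2.0, 1.0), (2.0, 2.0), (9.0, 9.0)]

# The centroid is the coordinate-wise mean and need not be a data point.
centroid = tuple(sum(coords) / len(points) for coords in zip(*points))

# The medoid is always one of the original points.
m = medoid(points)
```

Any other dissimilarity function taking two objects can be passed as `dissimilarity`, which is what makes the medoid usable when a mean cannot be defined.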

Network abnormal traffic detection method based on PAM (Partitioning Around Medoids) clustering algorithm

The invention discloses a network abnormal traffic detection method based on a PAM (Partitioning Around Medoids) clustering algorithm. The method comprises a traffic collection stage of monitoring a network to obtain network data packets through a network analysis tool; a feature extraction stage of extracting attributes of the network data packets and calculating the information entropy of those attributes over a time period, thereby obtaining multiple multi-dimensional data records; a center selection stage of clustering the data points of the network data packets with the PAM clustering algorithm according to the multi-dimensional data records, in which approximate clustering centers are first obtained through approximate clustering and precise clustering centers are then selected; and an outlier judgment stage of setting a threshold value and screening data points whose distance to the precise clustering center and whose local outlier factor are greater than the threshold value, thereby obtaining the outlier abnormal data. The method applies the improved PAM clustering algorithm to abnormal traffic detection, retains the advantage that clustering requires no labeled data, reduces the running time required by the algorithm, and can process more data.
Owner:EAST CHINA NORMAL UNIV
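
The entropy feature and the threshold screening described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the sample port values, and the threshold are hypothetical, the cluster centers are taken as given rather than computed by PAM, and the outlier-factor criterion from the abstract is omitted.

```python
from collections import Counter
from math import dist, log2

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def flag_outliers(records, centers, threshold):
    """Return records whose distance to the nearest center exceeds `threshold`."""
    return [r for r in records if min(dist(r, c) for c in centers) > threshold]

# Hypothetical window of destination ports observed in one time period.
ports = [80, 80, 443, 443]
port_entropy = shannon_entropy(ports)

# Hypothetical two-dimensional entropy records and a single assumed cluster center.
records = [(0.1, 0.2), (0.2, 0.1), (3.0, 3.0)]
suspicious = flag_outliers(records, centers=[(0.0, 0.0)], threshold=1.0)
```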

Touch information classified computing and modelling method based on machine learning

The invention relates to a touch information classified computing and modelling method based on machine learning. The method comprises the following steps: acquiring a touch sequence of a training set sample, modelling it with a linear dynamic system model, extracting dynamic characteristics of each sub touch sequence, calculating the distance between the dynamic characteristics of the sub touch sequences using the Martin distance, clustering the Martin matrix with a K-medoids algorithm, constructing a code book, characterizing each touch sequence with the code book to obtain a system packet model, feeding the system packet model of the training set sample and the training set sample label into an extreme learning machine to train a classifier, and feeding the system packet model of a to-be-classified sample into the classifier to obtain a label for the type of an object. The method meets the actual demand of a robot for stable and compliant grasping of a non-cooperative target, provides a data foundation for completing precise operation tasks, and can be fused and computed with other sensing results, so that the description and recognition capability for different targets is enhanced through multi-source deep perception and a technical foundation is laid for implementing intelligent control.
Owner:SHANGHAI AEROSPACE CONTROL TECH INST

Integrated scheduling method for multi-roadway automatic stereoscopic warehouse based on mixed integer programming model

Active | CN110084545A | Reduce complexity | Successfully solve integration scheduling problems | Character and pattern recognition | Logistics | Operation scheduling | Product order
The invention discloses an integrated scheduling method for a multi-roadway automatic stereoscopic warehouse based on a mixed integer programming model, comprising the following steps. Step 1: when a tray/order arrives, perform step 2; if no tray/order arrives, perform step 5. Step 2: carry out correlation analysis on the arrived tray products using a K-Medoids clustering algorithm, find the clusters in which the tray products are located, and solve for an initial goods allocation range. Step 3: directly warehouse products ordered the same day and perform step 6; otherwise, skip to step 4. Step 4: distribute stacking machine tasks based on operation balance. Step 5: distribute a sorting mode. Step 6: perform goods allocation and operation scheduling distribution. Step 7: generate a stacking machine operation list, repeating steps 5 to 7 until the optimal solution no longer changes.
Owner:ZHEJIANG UNIV OF TECH

Short-term power load prediction method based on KM-APSO-SVM (K-medoids-Adaptive Particle Swarm Optimization-Support Vector Machine) model

The invention discloses a short-term power load prediction method based on a KM-APSO-SVM (K-medoids-Adaptive Particle Swarm Optimization-Support Vector Machine) model. The short-term power load prediction method comprises the following steps: (1) on the basis of big data, analyzing the daily load change rule of the power grid: collecting data on the prediction site environment, using grey relational analysis to analyze the relationship between each meteorological factor and the load, and laying a foundation for establishing a load prediction model; (2) applying a K-medoids clustering algorithm to carry out clustering analysis on a sample: arranging the collected data to form a clustering sample, setting a classification number, selecting relevant factors to form a sample feature vector, applying the K-medoids clustering algorithm to carry out the clustering analysis on the sample, and mapping to a specific zone through nondimensionalization processing to form a clustering result; and (3) applying an APSO-SVM prediction model to carry out load prediction: carrying out accumulation preprocessing on the collected data to obtain a training sample, inputting the data of the clustering sample into the SVM for training, using the APSO to optimize the SVM parameters, establishing a prediction model, and carrying out accumulation reduction on the obtained prediction result.
Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)

Method and system for realizing data leakage prevention

Inactive | CN107292193A | Solve the problem of finding more rigid | Solve undetected problems | Digital data protection | Natural language data processing | K-medoids | Unsupervised learning
The invention discloses a method and a system for realizing data leakage prevention. The method comprises the following steps: text data is preprocessed to form vector data; the vector data is used as input data of a K-MEDOIDS clustering algorithm, and unsupervised learning is performed according to preset rules to form a clustering model; an external release file is checked against the clustering model to judge whether it is a secret leakage file; if so, the external release file is not allowed to be released externally; otherwise it is allowed to be released externally. Because the K-MEDOIDS clustering algorithm is used for unsupervised training on the preprocessed text data and the external release file is checked against the resulting clustering model, the method addresses two problems of traditional DLP technologies: keyword search in basic detection is rigid, and in advanced detection a file can no longer be detected through EDM and IDM once its content has been modified. Meanwhile, the number of categories of the K-MEDOIDS algorithm does not influence the clustering result, the algorithm is more flexible than an SVM algorithm, and the detection result for a file is not affected even when a keyword is replaced.
Owner:BEIJING VRV SOFTWARE CO LTD

High-resolution remote sensing image scene multi-label classification method based on multi-packet fusion

Active | CN110210534A | Solve the defect of losing the correlation information between targets | Avoid defects | Scene recognition | Multi-label classification | Mahalanobis distance
The invention discloses a high-resolution remote sensing image scene multi-label classification method based on multi-packet fusion. The method comprises the following steps: firstly, extracting multiple heterogeneous features from a high-resolution remote sensing image according to grid division and encoding them; secondly, dividing sub-regions through a layering method and a segmentation method to pool the coded features, obtaining a layering example package and a segmentation example package; clustering the packets with a K-Medoids method using the Mahalanobis distance, computing the distances from the packets to all clustering centers, and forming vectors from all distance values, so as to convert a multi-instance packet into a single instance; fusing the obtained single examples in series; and finally, designing a plurality of binary classifiers through a one-vs-rest approach to solve the multi-label problem. The multi-packet fusion-based high-resolution remote sensing image scene multi-label classification method provided by the invention improves classification performance and obtains better classification results than existing classification methods.
Owner:HOHAI UNIV

Full-coverage granular computing based K-medoids text clustering method

The invention discloses a full-coverage granular computing based K-medoids text clustering method. The method comprises the steps of 1) preprocessing texts, including Chinese word segmentation and stop word removal; 2) performing characteristic extraction on the texts, setting a high frequency word threshold and a low frequency word threshold, filtering out high-frequency words with insufficient discrimination and low-frequency words with weak representativeness, and then building a word vector space model using a TF-IDF algorithm; and 3) clustering the texts, firstly performing coarse clustering on the texts using single-pass and calculating an initial clustering center candidate set using the concept of granularity importance from full-coverage granular computing theory, then calculating the initial clustering centers based on density and a maximum-minimum distance algorithm, and finally performing text clustering using a k-medoids algorithm. The method addresses the increased iteration count and relatively large fluctuation of clustering results in the traditional K-medoids clustering algorithm, in which the initial clustering centers are selected randomly, and also addresses the problem in currently improved K-medoids clustering algorithms that initial clustering centers fall within the same cluster.
Owner:TAIYUAN UNIV OF TECH

Method and device for predicting operation stage and service life of grounding grid of substation

The invention discloses a method and device for predicting the operation stage and service life of a grounding grid of a substation based on an improved random forest algorithm. The method includes the following steps: initial data is obtained and an original sample set is constructed; characteristic variables are extracted based on the characteristics of the original sample set; the K-medoids method is used to cluster the original sample set; the random forest algorithm is used to process the various samples and form a random forest model; the characteristic variables of the substation grounding grid to be predicted are loaded into the random forest model, and the operation time in the feature vector is varied to obtain the relationship between the evaluation result and the operation time, from which the predicted operation stage and operation life are deduced. The randomness of the original sample set classification in the random forest algorithm is improved; the forest model is generated on the basis of the random forest algorithm, the generalization error is controllable, and the clustering accuracy is high; the various factors affecting the state of the grounding grid of the substation are taken into account comprehensively, and different operation stages are divided so that the corrosion state of the grounding grid in each stage can be identified with the most suitable grounding grid fault detection method.
Owner:STATE GRID HENAN ELECTRIC POWER ELECTRIC POWER SCI RES INST +3

Ct determination by cluster analysis with variable cluster endpoint

Systems and methods for determining the cycle threshold (Ct) value in a kinetic PCR amplification curve. The PCR data set may be visualized in a two-dimensional plot of fluorescence intensity (y-axis) vs. cycle number (x-axis). The data set is transformed to produce a partition table of data points with one column including the fluorescence at cycle (n) and a second column including the fluorescence at cycle (n+i), where i is typically 1 or greater. A cluster analysis process is applied to the partition table data set to determine a plurality of clusters in the partition table data set. In one aspect, the clustering process used includes a k-means clustering algorithm, where the number of identified clusters, k, is greater than or equal to three. In another aspect, a Partitioning Around Medoids (PAM) algorithm is used to identify three or more clusters. Using the identified clusters, a linear slope of each of the clusters is determined based on y(n+1) vs. n, and for each cluster, a ratio of the slope of that cluster with the slope of an adjacent cluster is determined. The ratios are then compared. An end point of a cluster having the largest or smallest ratio represents a specific point of interest in the data curve. The data point representing the elbow or Ct value of the PCR curve is identified as an end point of one of the identified clusters, and the cycle number corresponding to this data point is returned or displayed.
Owner:ROCHE MOLECULAR SYST INC
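
The partition-table transformation described in this abstract, which pairs the fluorescence at cycle n with the fluorescence at cycle n+i, can be sketched as below. The function name and the sample readings are hypothetical, and the clustering and slope-ratio steps of the abstract are not shown.

```python
def partition_table(fluorescence, i=1):
    """Pair each reading at cycle n with the reading at cycle n + i (i >= 1)."""
    if i < 1:
        raise ValueError("i must be 1 or greater")
    # zip stops at the shorter sequence, so the last i readings have no partner.
    return list(zip(fluorescence, fluorescence[i:]))

# Hypothetical fluorescence readings, one per cycle.
readings = [1, 2, 4, 8]
table_lag1 = partition_table(readings)
table_lag2 = partition_table(readings, i=2)
```

Each row of the resulting table is a two-dimensional point, which is the form a k-means or PAM routine would then cluster.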

Voltage sag characteristic quantity random evaluation method based on scene construction

The invention relates to the technical field of power distribution network operation management, and discloses a voltage sag characteristic quantity random evaluation method based on scene construction. The method comprises the following steps: step 1, generating a basic scene set by applying an optimal quantile method, based on the power distribution network grid structure and according to relay protection segmentation parameters; step 2, reducing the basic scene set to a few typical scene sets by applying a K-medoids clustering method; and step 3, evaluating the voltage sag characteristic quantity at the sensitive load point according to a short-circuit calculation principle, in combination with historical monitoring data of short-circuit faults of the power distribution network. By applying a K-medoids clustering method to reduce the basic scene set to a few typical scenes, the method avoids the large number of repeated tests required by approaches such as short-circuit calculation or simulation tests. The number of groups to be calculated is greatly reduced in a preprocessing step, most similarity calculations are avoided, the calculation difficulty and time are reduced, and the applicability of the method to large-scale power distribution networks is improved.
Owner:WUHAN UNIV +2
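
Reducing a scene set to a few typical scenes with K-medoids, as step 2 of this abstract describes, might look like the sketch below. It uses a simple alternating (assign, then update) K-medoids variant with Euclidean distance rather than the PAM swap procedure, initializes with the first k points, and assumes the scenes are distinct tuples; all of these choices and the sample data are assumptions for illustration, not the patent's procedure.

```python
from math import dist

def assign(points, medoids):
    """Group every point under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for p in points:
        clusters[min(medoids, key=lambda m: dist(p, m))].append(p)
    return clusters

def k_medoids(points, k, max_iter=100):
    """Alternate between assignment and medoid update until the medoids stop changing."""
    medoids = list(points[:k])  # naive initialization (assumption)
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        # New medoid of each cluster: the member with the smallest total distance.
        updated = [
            min(members, key=lambda c: sum(dist(c, q) for q in members))
            for members in clusters.values()
        ]
        if set(updated) == set(medoids):
            break
        medoids = updated
    return assign(points, medoids)

# Hypothetical basic scene set: two groups of similar scenes.
scenes = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
typical = k_medoids(scenes, k=2)  # maps each typical scene to the scenes it represents
```

Because every typical scene is itself a member of the basic scene set, downstream calculations can run on real scenes rather than on averaged ones.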

Parallelization clustering method for power communication data resources

Inactive | CN108776814A | Improved selection mechanism | Clustering iterative operation works | Data processing applications | Character and pattern recognition | Cluster algorithm | Data set
The invention belongs to the technical field of power systems and more specifically relates to a parallelization clustering method for power communication data resources. First, the point density of the different sample points in a data set is calculated in parallel; next, k initial clustering centers are selected according to the size of the point density and the mutual distance; finally, a k-medoids algorithm is used on a Hadoop platform to perform non-center point distribution and center point update in parallel until the center point and the non-center points of each cluster no longer change and the final clustering result is obtained. Because the initial points of the clustering algorithm are selected by combining the point density calculated from the sample points with the mutual distance, the initial cluster center selection mechanism is improved, the clustering iterations become more effective, the search range is reduced, and the clustering effect is improved. Parallelized resource allocation is optimized, clustering time is shortened, and clustering precision is improved, which supports effective analysis and utilization of power data and develops the comprehensive benefits of power communication network related data.
Owner:GUANGDONG POWER GRID CO LTD +1
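
One possible way to choose initial centers from point density and mutual distance is sketched below. The abstract does not give the exact rule, so the neighbor-count density, the `radius` parameter, the density-times-distance score, and the sample points are all assumptions made for illustration; the sketch is also sequential, not parallel.

```python
from math import dist

def density_seeds(points, k, radius):
    """Pick k initial centers that are both dense and far from each other."""
    # Assumed density measure: number of other points within `radius`.
    density = {
        p: sum(1 for q in points if q != p and dist(p, q) <= radius)
        for p in points
    }
    seeds = [max(points, key=density.get)]  # start from the densest point
    while len(seeds) < k:
        candidates = [p for p in points if p not in seeds]
        # Assumed score: density weighted by distance to the nearest chosen seed.
        seeds.append(
            max(candidates, key=lambda p: density[p] * min(dist(p, s) for s in seeds))
        )
    return seeds

# Hypothetical data: two dense groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (9.0, 0.0)]
seeds = density_seeds(points, k=2, radius=0.5)
```

With this score the isolated point is never selected, because its density is zero, while the second seed is drawn from the dense group farthest from the first.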

Power grid economic operation domain generation method and system considering source-load double-side uncertainty

Active | CN114336607A | Improve the ability to deal with uncertain factors | Ability to enhance uncertainties | Power network operation systems integration | Single network parallel feeding arrangements | Power system scheduling | Control engineering
The invention discloses a power grid economic operation domain generation method and system considering source-load double-side uncertainty. Most current power system scheduling methods produce a unit output plan curve, which makes it difficult to reflect the influence of uncertainty on a power grid scheduling plan. The method comprises the following steps: acquiring typical wind power output scenes and their corresponding probabilities based on the t location-scale distribution, Latin hypercube sampling, and K-medoids clustering; establishing a demand side response model according to the price type demand side response rule; and on this basis, according to the definition of the economic operation domain, establishing a power grid economic operation domain generation model that considers the uncertainty of both the source side and the load side, and solving it directly with a commercial solver. The method combines the economic operation domain concept with power system dispatching and generates the economic operation output interval of the unit, which meets and further improves on the requirements of power system dispatching operation and allows the influence of uncertainty on a power grid dispatching plan to be described quantitatively.
Owner:ELECTRIC POWER RES INST OF STATE GRID ZHEJIANG ELECTRIC POWER COMPANY +2