245 results about How to "Improve clustering accuracy" patented technology

Intrusion detecting method based on semi-supervised neural network

The invention discloses an intrusion detecting method based on a semi-supervised neural network, belonging to the field of network information security. The intrusion detecting method comprises the following steps of: 1) using a training set A to initialize the 0th layer of neurons of a GHSOM (Growing Hierarchical Self-Organizing Map) neural network, and calculating a QE0; 2) expanding an SOM (Self-Organizing Map) from the 0th layer of the neurons, and setting a layer identification Layer of the SOM as 1; 3) initializing each SOM expanded in the Layerth layer and training each SOM by the following steps of: adjusting weights of a winning neuron and other neurons in adjacent domains, updating a winning vector set and calculating a main label, a main label rate and an information entropy etyi of the winning neuron; and 4) calculating a quantization error QEi of each neuron in the SOM and the MQE (mean quantization error) of the sub network; if MQE is more than QEf*mu1, inserting one row or column of neurons in the SOM, and if QEi is more than QE0*mu2 or etyi is more than etyf*mu3, generating a layer of a new sub network on the neuron, and adding the new sub network into a sub network array of the (Layer+1)th layer. The detection accuracy of the GHSOM algorithm is improved by using the method.
Owner:PEKING UNIV
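
The label statistics in step 3 and the growth tests in step 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and reading QEf and etyf as the quantization error and entropy of the parent neuron is an assumption, since the abstract does not define them.

```python
from collections import Counter
from math import log2


def label_stats(labels):
    """Return (main label, main label rate, information entropy) for the
    labels of the vectors won by one neuron. Expects a non-empty list."""
    counts = Counter(labels)
    total = len(labels)
    main_label, main_count = counts.most_common(1)[0]
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return main_label, main_count / total, entropy


def growth_decision(mqe, qe_f, qe_i, qe_0, ety_i, ety_f, mu1, mu2, mu3):
    """Apply the two thresholds quoted in step 4.

    qe_f and ety_f are ASSUMED to belong to the parent neuron.
    Returns (insert_row_or_column, spawn_child_network).
    """
    insert_row_or_column = mqe > qe_f * mu1
    spawn_child_network = qe_i > qe_0 * mu2 or ety_i > ety_f * mu3
    return insert_row_or_column, spawn_child_network
```

A neuron whose winning vectors all share one label has entropy 0, so only the quantization error test can trigger a new sub network for it.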

Log monitoring method based on abnormal behavior detection

Active | CN105653427A | Efficient analysis | Solve the problem of excessive data volume | Hardware monitoring | Anomaly detection | Data treatment
The invention provides a log monitoring method based on abnormal behavior detection. The log monitoring method includes the steps of log preprocessing and log anomaly detection. In the log preprocessing step, log structures are unified and the logs are clustered; in the anomaly detection step, the log flow is converted into a behavior sequence according to the log clustering result, a behavior mode is generated, and anomaly indexes of the real-time log flow are obtained; the anomaly indexes are compared with an anomaly threshold value, and whether to give an anomaly early warning is determined according to the comparison result. Starting from the characteristics of the log types, the method analyzes the generating rules of logs of different types, solves the problem of excessive data volume by clustering the log information, and effectively analyzes log content in real time, so that the data can be processed automatically; the method offers high universality and a high anomaly detection success rate.
Owner:江阴逐日信息科技有限公司

Load prediction method based on dynamic time warping and long-short time memory

The invention discloses a load prediction method based on dynamic time warping and long-short time memory, and the method comprises the following steps: S1, obtaining the basic data required for short-term load prediction of a user from a power system; S2, clustering users with similar power utilization behaviors by a dynamic time warping method according to the historical load data of the users; S3, performing pooling processing on the user data of the same category; S4, selecting training data, preprocessing the training data and using the preprocessed training data as input; and S5, constructing a short-term load prediction method based on a deep long short-term memory (LSTM) recurrent neural network, and verifying its effectiveness. Because the number of users to be predicted is large, the method clusters users with similar electricity consumption behaviors, so that the prediction efficiency is improved. Meanwhile, pooling the data within the same category increases the diversity of the training data and improves the short-term load prediction precision, which gives the method engineering application significance.
Owner:NANJING UNIV OF POSTS & TELECOMM
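
Step S2 relies on a dynamic time warping (DTW) distance between load curves. A minimal sketch of the textbook DTW recurrence is given below; the absolute difference as local cost and the function name `dtw_distance` are assumptions, as the abstract does not state which variant is used.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between
    two numeric sequences, using |x - y| as the local cost (an assumption)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two load curves that differ only by a repeated sample, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have distance 0, which is why DTW groups users whose consumption patterns are similar but shifted in time.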

Generation method of high-satisfaction video summary

Inactive | CN103150373A | Improve the performance of gradient detection | Low miss rate | Special data processing applications | Video retrieval | Computer graphics (images)
The invention relates to a generation method of a high-satisfaction video summary. Based on the characteristics of a video data structure, a shot cluster-based video summary system is designed and realized in the invention. The video summary system has the following main functions: coding, decoding and playing of various video files, shot boundary detection, key frame extraction, shot cluster, generation of both static video summary and user input time dynamic summary, and the like. The generation method is suitable for various application occasions, such as multimedia file management, video retrieval, and film and television database construction.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Web image clustering method based on image and text relevant mining

The invention discloses a web image clustering method based on image and text relevant mining, which comprises the following steps of: (1) extracting images and their associated texts from Google image searching results according to the query; (2) extracting nouns in the associated texts to form a vocabulary list; (3) calculating the visibility of words in the vocabulary list, and integrating the visibility with a TF-IDF method to calculate the relative association between the words and the images; (4) calculating the theme degree of association between any two words in the vocabulary list; (5) modeling the relative association with a complex map; (6) applying a complex map clustering algorithm to cluster the images. The method combines the visibility of the words and the TF-IDF method to define the relative association between the words and the images, and breaks through the restriction that TF-IDF, as a text-processing technique, cannot directly measure the relation between the words and the images; by modeling the relative association between the words and the images and between the words with the complex map, a web image clustering framework is provided so that the image searching results are classified by theme, which makes searching more convenient for users.
Owner:ZHEJIANG UNIV

Clustering method for network behavior habits based on K-means and LDA (Latent Dirichlet Allocation) two-way authentication

The invention discloses a clustering method for network behavior habits based on K-means and LDA (Latent Dirichlet Allocation) two-way authentication. According to the clustering method, webpage properties, keywords and frequency in internet browsing records of persons are utilized to combine with a K-means algorithm, an LDA document topic extracting model and an annealing algorithm. The clustering method comprises the following steps: firstly, performing K-means algorithm clustering and LDA document topic extracting model generation on a staff-label-frequency set and a person browsing record-person-keyword set; secondly, storing and calculating an intermediate result, and then performing K-means and LDA two-way authentication by using the annealing algorithm; calculating a global best topic-classification label sequence, and optimizing a network behavior habit clustering result by taking the global best topic-classification label sequence as a reference. By means of the K-means and LDA two-way authentication, the sensitivity to person-classification labels is improved; by using the annealing algorithm, the optimizing efficiency of the clustering result can be improved, and further the clustering accuracy is improved.
Owner:HUAIYIN INSTITUTE OF TECHNOLOGY

System and method for clustering gene expression data based on manifold learning

Inactive | CN102184349A | Accurately discover co-regulatory relationships | Discovery of co-regulatory relationships | Special data processing applications | Visual space | Cluster algorithm
The invention discloses a method for clustering gene expression data based on manifold learning, and the method provided by the invention comprises the following steps: acquiring a gene expression data matrix A through an acquisition system, and preprocessing the gene expression data matrix A by using a local linear smoothing algorithm; introducing the preprocessed data matrix A, and constructing a weighted neighborhood graph G in a three-dimensional space; taking the shortest path between two points as the approximate geodesic distance between the two points; calculating a two-dimensional embedded coordinate by using MDS (multidimensional scaling), and mapping the three-dimensional data matrix A to a two-dimensional visual space; and clustering the mapped two-dimensional visual space by using a k-means clustering algorithm so as to obtain the clustering result. The clustering method has the characteristics of low calculating cost, capability of eliminating high-order redundancies, suitability for pattern classification tasks, and the like; and by using the method disclosed by the invention, the current states of cells, the effectiveness of medicaments on malignant cells, and the like can be discriminated effectively according to the clustering result. The invention also provides a system for clustering gene expression data based on manifold learning.
Owner:HOHAI UNIV
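
The geodesic distance step (shortest paths over a weighted neighborhood graph) can be sketched as below. The symmetrised k-nearest-neighbour graph, the Floyd-Warshall search and the name `geodesic_distances` are assumptions chosen for brevity; the abstract does not say how the graph is built or searched.

```python
from math import dist, inf


def geodesic_distances(points, k):
    """Approximate geodesic distances between all pairs of points.

    Builds an undirected graph linking each point to its k nearest
    neighbours (Euclidean edge weights), then runs Floyd-Warshall.
    Entries stay at infinity for pairs the graph does not connect.
    """
    n = len(points)
    d = [[inf] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        for j in others[:k]:
            w = dist(points[i], points[j])
            d[i][j] = min(d[i][j], w)
            d[j][i] = min(d[j][i], w)
    # Floyd-Warshall: allow each node in turn as an intermediate stop.
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][mid] + d[mid][j] < d[i][j]:
                    d[i][j] = d[i][mid] + d[mid][j]
    return d
```

The resulting matrix is what a multidimensional scaling step would take as input before k-means is run on the two-dimensional coordinates.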

Intelligent power grid short-term load prediction method based on improved RBF neural network

The invention discloses an intelligent power grid short-term load prediction method based on an improved RBF neural network, relates to the technical field of intelligent power grid, and is used for determining the basis function center and improving the load prediction precision of the intelligent power grid. The prediction method includes: S1, performing network initialization; S2, calculating the basis function center ci; S3, calculating the variance [zeta]i according to the basis function center ci; S4, calculating the output Ri of a hidden layer according to the basis function center ci and the variance [zeta]i; S5, calculating the output of an output layer according to the output Ri of the hidden layer; S6, calculating a prediction error E according to a mean squared error and the function; S7, updating connecting weights of neurons of the hidden layer and neurons of the output layer in the neural network; and S8, checking the prediction error E: if E has reached the expected value, ending the iterative calculation; otherwise, returning to step S4 and repeating the iterative calculation. The method is used for predicting the load of the power grid.
Owner:BEIJING UNIV OF POSTS & TELECOMM +1
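
Steps S4 to S6 can be illustrated with a small forward pass. The sketch assumes the usual Gaussian basis function, Ri = exp(-||x - ci||^2 / (2 * [zeta]i^2)), a linear output layer and a half squared error; none of these forms is spelled out in the abstract, and all function names are hypothetical.

```python
from math import exp


def hidden_output(x, center, width):
    """Gaussian RBF response of one hidden neuron (assumed basis function)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return exp(-sq_dist / (2.0 * width ** 2))


def network_output(x, centers, widths, weights):
    """Linear output layer: weighted sum of the hidden-layer responses."""
    return sum(
        w * hidden_output(x, c, s)
        for c, s, w in zip(centers, widths, weights)
    )


def prediction_error(targets, outputs):
    """Half the sum of squared errors over the samples (assumed form of E)."""
    return 0.5 * sum((t - y) ** 2 for t, y in zip(targets, outputs))
```

An input that coincides with a center gives a hidden response of exactly 1, and the response decays towards 0 as the input moves away, at a rate set by the width.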

Hybrid filling method for incomplete data

The invention discloses a hybrid filling method for incomplete data. The hybrid filling method comprises the following steps: (1) performing special value filling pre-processing on the missing data values in a data set; (2) extracting significant characteristics of the data attributes by utilizing a stacked autoencoder; (3) performing incremental clustering on the filled data set based on the extracted characteristics; (4) in each obtained clustering result, performing attribute value weighted filling on each object with missing data by utilizing the attribute values of the top k% of objects that are most similar to it; and judging the difference between all missing data filling values of this round and those of the previous round, and iterating steps (2) to (4) until the filling value convergence conditions are met. According to the embodiment of the invention, the local similarity characteristics of data in the data set, the data clustering precision, the in-class data filling accuracy, and the unsupervised nature and timeliness required in practical application are considered to construct an algorithm that first clusters the incomplete data and then fills it; the precision of the filling result and the speed of the filling algorithm are ensured by using special value filling, adopting the stacked autoencoder, performing incremental clustering, performing weighted filling with the in-class top k% of complete data objects, and the like.
Owner:DALIAN UNIV OF TECH

Data mining improved type K mean value clustering method based on linear discriminant analysis

The invention relates to a data mining improved type K mean value clustering method based on linear discriminant analysis, namely an LKM algorithm. Firstly, LDA (linear discriminant analysis) is adopted to conduct linear dimensionality reduction on an original n-dimensional data set A to obtain a one-dimensional data set Y; then a k-means clustering algorithm is adopted to conduct clustering analysis on the data set Y after dimensionality reduction, and the final results are output. The method combines data dimensionality reduction with k-means clustering, so that the weaknesses of the k-means clustering algorithm on high-dimensional data are overcome through dimensionality reduction. Dimensionality reduction alleviates the curse of dimensionality and eliminates uncorrelated attributes in the high-dimensional space. Meanwhile, the performance of the k-means clustering algorithm in processing high-dimensional data is also improved.
Owner:NANJING UNIV OF POSTS & TELECOMM

Method and device used for power load aggregation

The invention discloses a method and device used for power load aggregation. The method and device used for power load aggregation comprise S1, acquiring sample data of n transformer substation comprehensive load static characteristics; S2, mapping the sample data to a Hilbert space through a Gaussian kernel function and acquiring samples; S3, confirming an initial aggregation center by selecting K samples from the samples; S4, performing aggregation calculation on the mapped samples in the kernel space by adopting a k-means algorithm, and assigning each sample to the upper and lower approximations of the closest class according to an upper and lower approximation method; S5, dynamically adjusting weights omega1, omegabnr according to the current iteration; S6, calculating a Jomega value according to the algorithm's convergence criterion, and judging whether |Jomega(t)- Jomega(t-1)|<= epsilon or t>=tmax; if yes, generating a final collection and finishing; if not, entering S7; and S7, setting t=t+1, reconfirming the aggregation center and returning to S4. The method and device are simple, fast and effective, produce reasonable aggregation results, and are of practical significance for load modeling research.
Owner:AEROSPACE SCI & IND SHENZHEN GROUP
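
Two pieces of this loop lend themselves to a short sketch: the distance from a sample to a cluster center in the kernel-induced space (S2 and S4) and the stopping test in S6. The code below shows plain kernel k-means quantities only; the rough-set upper and lower approximation weighting of S4 and S5 is not reproduced, and the function names and the `sigma` parameter are assumptions.

```python
from math import exp


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, cluster, sigma=1.0):
    """Squared distance in kernel space between phi(x) and the mean of the
    mapped cluster members, computed with kernel evaluations only."""
    n = len(cluster)
    cross = sum(gaussian_kernel(x, y, sigma) for y in cluster) / n
    within = sum(
        gaussian_kernel(y, z, sigma) for y in cluster for z in cluster
    ) / (n * n)
    return gaussian_kernel(x, x, sigma) - 2.0 * cross + within


def should_stop(j_now, j_prev, t, eps, t_max):
    """Stopping rule from S6: |J(t) - J(t-1)| <= eps or t >= t_max."""
    return abs(j_now - j_prev) <= eps or t >= t_max
```

Because the Gaussian kernel maps every sample to a unit-length vector, the squared distance to a single-member cluster lies between 0 (identical sample) and 2 (very distant sample).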

Domain-and-theme-oriented Web service clustering method

The invention relates to a domain-and-theme-oriented Web service clustering method, which comprises the following steps of: 1, collecting a Web service description document, and preprocessing the collected document; 2, performing domain-oriented classification on a Web service by an iterative support vector machine; and 3, performing theme-oriented clustering on the Web service in a specific domain by using a probability-based Web service clustering method. The domain-and-theme-oriented service clustering method has the beneficial effects that 1) the Web service can be clustered, which is described by a web service description language (WSDL) mode, an ontology Web language for services (OWL-S) mode, a text mode and the like, and relatively high generality is realized; 2) relatively high efficiency is achieved; 3) the Web service is clustered on the basis of domain-oriented classification, and compared with the method for directly clustering the Web service, the method is relatively high in clustering accuracy; and 4) the Web service clustering result can be used in Web service finding and Web service combination and can be used for Web service recommendation, and the method is wide in applicability.
Owner:WUHAN UNIV

Short text topic modeling method based on word semantic similarity

The invention discloses a short text topic modeling method based on word semantic similarity. The method comprises: according to word semantic similarity provided by an external source, establishing a similar word set for the words in the short text collection; determining the number of topics used in modeling; randomly assigning a topic to each short text; through a Gibbs sampling process, iteratively determining the topic of each short text and the distribution of the words in the topic; according to the final distribution of the above variables, feeding back the word distribution under each topic and the topic associated with each short text. The method addresses the problems of sparse information content and unclear semantic expression in short texts. According to the model result, short texts can be expressed as topic vectors, and the topic vectors are used as the final feature vectors of a short text. The topic vector-based expression has good semantic interpretability, and can be used as the algorithmic basis of various applications. The method can be widely applied to various short text data, and has broad practical significance and commercial value.
Owner:WUHAN UNIV

Selection method of K-means initial clustering centers for taxi trajectory data

The invention relates to a selection method of K-means initial clustering centers for taxi trajectory data. The method comprises a step of extracting the road network of city traffic from an electronic map, a step of preprocessing the collected taxi trajectory data and obtaining sample data suitable for analysis by screening, a step of matching the taxi trajectory data to the road network to obtain a distribution map of taxi data points in a preset analysis range, a step of using the spot detection method from image recognition technology to identify the main dense regions of taxi trajectory data points as the initial clustering centers of K-means, and a step of outputting the initial clustering centers of K-means. By using the spot detection method to determine the position and number of the initial clustering centers of K-means, the method overcomes the fuzziness and subjectivity of selecting a K value and the random selection of initial centers in the traditional K-means method; for massive Internet-of-Vehicles data, the clustering speed of the K-means method is increased and the clustering of the taxi trajectory data is realized, and the method has a certain reference value and actual economic benefit.
Owner:FUZHOU UNIVERSITY

Method for clustering network-based short texts

The present invention discloses a method for clustering network-based short texts. The specific implementation process comprises: firstly acquiring a network-based comment; pre-processing the acquired network-based comment, wherein the pre-processing comprises performing word segmentation on the network-based comment, then removing the word that is not used, segmenting a keyword, and performing weighted calculation on the keyword; and clustering the pre-processed texts. The method for clustering network-based short texts, as compared with the prior art, implements collection and analysis of massive data over the network, such that a user conveniently searches for valued information. With this method, the precision in clustering the network-based short texts is high, thereby accommodating practical needs of the user. Therefore, the method according to the present invention has great practicability and can be simply promoted.
Owner:QILU UNIV OF TECH

Improved Mean Shift-based extraction method and system of road marking lines of Lidar (Light Detection And Ranging) point cloud data

The invention relates to an extraction method of road marking lines, belongs to the field of GIS information technology, and particularly relates to an improved Mean Shift-based extraction method and system of road marking lines of Lidar (Light Detection And Ranging) point cloud data. According to the method, a vehicle-mounted Lidar collection system and an inertial measurement system are utilized to acquire the road point cloud data, adjacency-graph relationships are established for the point cloud data through an octree, and supervoxel clustering is carried out; a Mean Shift algorithm fused with multiple features of intensity, normal vectors and the like is utilized to carry out road marking line extraction on the point cloud data; and road marking line points can be accurately extracted, the calculation amount of the Mean Shift algorithm is also greatly decreased at the same time, and thus the purpose of improving precision and efficiency of automated extraction of the road marking lines of the Lidar data is achieved.
Owner:HUBEI UNIV

Probability clustering method of cross-categorical data based on key word

A probabilistic clustering method of trans-type data based on keyword entries belongs to the database field and comprises the following steps: (1) defining the type of the keyword entry; and dividing the trans-type data into a keyword correlation entry, a keyword half-correlation entry and a keyword non-correlation entry; (2) allocating probability for each entry; (3) expressing data keywords by the probability; (4) constructing a data keyword entry probabilistic similarity matrix M; for any two data of the trans-type data dx and dy in the step (3), computing similarity of any two descriptive forms of the dx and the dy, summing the probabilities of the similarity which is greater than a certain threshold, and storing the direct correlation probabilities of the any two data in the matrix M; (5) constructing a clustering model M<c> based on the matrix M; and (6) obtaining the clustering method based on the clustering model M<c>. The method clusters the trans-type data by utilizing the similarity of the entry related to the keywords, which improves the data clustering precision and reduces the clustering time.
Owner:NORTHEASTERN UNIV

Global social service network oriented Web service clustering method

The invention discloses a global social service network oriented Web service clustering method, comprising the following steps that 1, a global social service network oriented Web service clustering framework is established, wherein the framework comprises a service register module, a service operation information collection module, a service clustering module and a service visualization module; and 2, on the basis of the clustering framework, the Web service clustering method comprises the following steps of S2.1, integrating Web service, S2.2, calculating Web service similarities, S2.3, clustering GSSNs (Global Social Service Networks), and S2.4, visualizing the GSSNs. A clustered result is visualized and a user is helped to mine information hidden behind the service visually. According to the method, the Web service clustering precision is improved and the method has relatively high universality.
Owner:ZHEJIANG UNIV OF TECH

Taxpayer industry classification method based on noise label learning

Active | CN112765358A | Classification method improvement | Reduce labeling costs | Finance | Semantic analysis | Network structure | Near neighbor
A taxpayer industry classification method based on noise label learning comprises the steps that firstly, text information to be mined in taxpayer industry information is extracted for text embedding, and feature processing is conducted on the embedded information; secondly, non-text information in the taxpayer industry information is extracted and coded; thirdly, a BERT-CNN deep network structure conforming to the taxpayer industry classification problem is constructed, and the number of layers of the network, the number of neurons of each layer and the input and output dimensions are determined according to the processed feature information and the target category number; then, the constructed network is pre-trained through comparative learning, nearest neighbor semantic clustering and self-label learning in sequence; finally, a noise modeling layer is added on the basis of the constructed deep network, modeling is carried out on noise distribution through network self-trust and noise label information, and model training is carried out based on noise label data; and finally, the deep network in front of the noise modeling layer is taken as a classification model, and taxpayer industry classification is performed based on the model.
Owner:XI AN JIAOTONG UNIV

Professional field-oriented on-line theme detection method

Active | CN107066555A | Solve the difficulty of satisfying the user's need for more professional information | Solve needs | Character and pattern recognition | Special data processing applications | State of art | Algorithm
The invention discloses a professional field-oriented on-line theme detection method. The method comprises the following steps: obtaining a text vector matrix of a preprocessed text set, and extracting a dictionary from the text set; modeling the text vector matrix; calculating a mixed weight p (thetak|d) from a text d to a theme thetak and a frequency p (w|thetak) that a feature word appears in each theme thetak; obtaining the similarity between two texts di and dj, defining the theme model-based theme distance between the texts as a relative entropy distance of the text vectors, and calculating a similarity matrix; compressing the text set, thus obtaining a new text sample set; calculating a similarity matrix of the new text sample set, and selecting a deviation parameter p according to the similarity matrix; combining clustering results, thus generating a new clustering result; calculating distances between all texts in the original text set and compressed classified texts, and performing classification; outputting a text set theme and a final clustering result. Compared with the prior art, the professional field-oriented on-line theme detection method disclosed by the invention has the advantage that by the adoption of an optimal clustering algorithm, the accuracy and the efficiency of the clustering effect are improved.
Owner:TIANJIN UNIV
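
The relative entropy distance between two texts can be sketched on their theme-weight vectors p(thetak|d). Relative entropy (Kullback-Leibler divergence) is not symmetric, so the averaged form below is an assumption about how a distance might be obtained from it; the abstract does not say which symmetrisation, if any, is used.

```python
from math import log


def kl_divergence(p, q):
    """Relative entropy D(p || q) of two discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i must be positive
    wherever p_i is positive.
    """
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Average of D(p || q) and D(q || p): one common symmetrisation,
    used here as an ASSUMED stand-in for the theme distance."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
```

Two texts with identical theme mixtures are at distance 0, and the distance grows as their mixtures concentrate on different themes.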

Video association method, video display method and device and storage medium

The embodiment of the invention provides a video association method, a video display method and device and a storage medium and relates to the technical field of multimedia. The method comprises the following steps: extracting at least one target image frame in a first video; performing image recognition on the target image frame so as to obtain target image elements; acquiring tags corresponding to the target image elements, and tagging the target image frames in the first video; and associating the first video with a second video, wherein at least one image frame in the second video is tagged with the tag. According to the video association method disclosed by the invention, the effect of classifying the videos by taking the image elements as the granularity is achieved, and the video clustering precision is improved. Moreover, users can check the associated videos by virtue of the tags through selecting the tags only, a complicated searching process is avoided, and the step of checking related videos by the users is reduced.
Owner:BEIJING XIAOMI MOBILE SOFTWARE CO LTD

Hierarchical clustering method based on mutual shared nearest neighbors

The invention discloses a hierarchical clustering method based on mutual shared nearest neighbors. The hierarchical clustering method comprises the steps of firstly calculating a nearest neighbor matrix T1 and a nearest neighbor matrix T2 of a whole data set D; calculating a nearest ranking matrix M according to the nearest neighbor matrix T1 and the nearest neighbor matrix T2; calculating a local density through the nearest ranking matrix M to obtain a submanifold set; finally calculating the similarity of submanifolds, and merging the submanifolds to obtain a final partitioning result. The method solves the problem of low clustering precision caused by point partitioning errors during generation of the submanifold set in conventional sparsification and graph partitioning processes based on K-nearest-neighbor graph clustering.
Owner:XIAN UNIV OF TECH
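
The abstract does not define the matrices T1, T2 and M, so the sketch below shows only the generic idea of a mutual shared-nearest-neighbor similarity: two points are compared only if each appears in the other's k-nearest-neighbor list, and their similarity is the number of neighbors they have in common. The function names and this particular definition are assumptions.

```python
from math import dist


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        result.append(set(order[:k]))
    return result


def mutual_snn_similarity(neighbors, i, j):
    """Number of shared neighbors of i and j, or 0 when the two points
    are not in each other's neighbor sets (not mutual neighbors)."""
    if i not in neighbors[j] or j not in neighbors[i]:
        return 0
    return len(neighbors[i] & neighbors[j])
```

With two well-separated groups of three points and k = 2, points inside a group share one neighbor, while points from different groups score 0.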

Method and system for possibly fuzzy K-harmonic means clustering

The invention provides a method and system for possibly fuzzy K-harmonic means clustering. The method includes the following steps: determining an initial clustering center; setting the parameter values of the clustering algorithm; calculating the covariance of the sample data; calculating a fuzzy membership value of the possibly fuzzy K-harmonic means clustering; calculating a typical value of the possibly fuzzy K-harmonic means clustering; calculating a clustering center value of the possibly fuzzy K-harmonic means clustering; judging whether an iteration termination condition is met: if yes, terminating the iteration; if no, continuing the iterative calculation; and finally utilizing the fuzzy membership value and the typical value to divide the data set. The method and system effectively process data containing noise and obtain both the fuzzy membership value and the typical value. The typical value is independent of the fuzzy membership value and is not subject to probability constraints; therefore the method and system are insensitive to noise, high in clustering accuracy and rapid in clustering speed.
Owner:JIANGSU UNIV

Overall reconstruction design method of plane line position of existing railway

The invention discloses an overall reconstruction design method of a plane line position of an existing railway. The method comprises the steps that the line element types of testing points are identified based on the tangent azimuth change rate of each testing point, and initial clustering of the testing points is conducted; based on the number of the testing points in each line element point group, the initially clustered line element point groups are adjusted; based on the crossing point position of the straight lines at the two ends of each circular curved segment, the line element point groups are further adjusted; iterative computation is conducted, and easement curve line element testing points in the linear element point groups and circular curve line element point groups are gradually identified and adjusted until the number of testing points in each line element point group becomes stable, so that final clustering of the three kinds of line element testing points is achieved, and fitting of local line positions is conducted; all the local line positions are connected to form an initial overall fitting line position, the fitting line position is optimized, and the final plane reconstruction scheme of the existing railway is obtained. The overall reconstruction design method of the plane line position of the existing railway can precisely identify different types of line element testing points, and can optimize the fitting line position from an overall perspective and achieve rapid overall reconstruction of the plane line position of the existing railway.
Owner:CENT SOUTH UNIV

Local similarity preserving-based hyperspectral image extreme learning machine clustering method

The invention discloses a local similarity preserving-based hyperspectral image extreme learning machine clustering method. The method comprises the steps of organizing a hyperspectral pixel matrix; calculating a linear random response of a hidden layer neuron; calculating a nonlinear activation value of the hidden layer neuron; performing three-dimensional reconstruction of hidden layer feature data; performing spatial guided filtering; performing two-dimensional reconstruction of the filtered hidden layer feature data; building local similarity preserving regular terms and an optimization model; and calculating local similarity preserving projection features, and performing K-means clustering to obtain a final clustering tag. Based on a conventional extreme learning machine, hyperspectral image spatial information of local neighborhoods is integrated through the guided filtering, and spectral local similarity of a hyperspectrum is fully utilized; projection with local preservability is calculated through model optimization; spatial spectral joint information is extracted; the clustering precision is improved; the calculation complexity is lowered; and the method can be widely applied to the hyperspectral unsupervised classification in the fields of territorial resources, mineral survey and precision agriculture.
Owner:NANJING UNIV OF SCI & TECH
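
The two hidden layer steps in the abstract above (a linear random response followed by a nonlinear activation) can be sketched as follows. This is a minimal illustration, not the patented method: the function name, the uniform weight range and the sigmoid activation are assumptions, since the abstract does not state which distribution or activation is used. Guided filtering and the local similarity preserving projection are not shown.

```python
import math
import random

def elm_hidden_features(pixels, n_hidden, seed=0):
    """Map each pixel spectrum to random hidden layer features.

    pixels:   list of spectra, one list of band values per pixel
    n_hidden: number of hidden neurons
    seed:     seed for the random input weights and biases
    """
    rng = random.Random(seed)
    n_bands = len(pixels[0])
    # Random input weights and biases, drawn once and never trained.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_hidden)] for _ in range(n_bands)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]

    features = []
    for spectrum in pixels:
        row = []
        for j in range(n_hidden):
            # Linear random response of hidden neuron j.
            z = sum(spectrum[k] * weights[k][j] for k in range(n_bands)) + biases[j]
            # Nonlinear activation (sigmoid is assumed here).
            row.append(1.0 / (1.0 + math.exp(-z)))
        features.append(row)
    return features
```

The sketch assumes band values scaled to a small range such as [0, 1]; very large negative responses would overflow `math.exp`.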

Method and device for network traffic clustering

The invention discloses a method and equipment for clustering network flow. The method comprises the following steps of: acquiring global network flow; cutting the global network flow by single user to generate sample data; classifying the flow type of each flow according to the sample data; and selecting different characteristic combinations for clustering according to the flow types. The equipment comprises an acquiring unit, a sample data generating unit, a primary clustering unit and a secondary clustering unit. The method for clustering network flow has the advantages of high accuracy, high efficiency, a wide flow identification range and the capability of accurately mining the application quantity in the network flow, and can be realized as network flow control equipment.
Owner:BEIJING NETENTSEC

Feature selection algorithm based on dynamic programming and K-means clustering

The invention discloses a feature selection algorithm based on dynamic programming and K-means clustering. The feature selection algorithm includes the following steps: 1) data preprocessing is carried out, mainly to solve the problems of repeated data and missing attribute values in the feature data; 2) feature sub-sets are pre-selected using the core idea of dynamic programming, with the within-class and between-class distance taken as the performance function in the dynamic programming decision process; 3) the original K-means clustering algorithm is improved, and the feature sub-sets generated at the dynamic programming stage are clustered with the improved K-means algorithm to reject redundant features and optimize the selected sub-sets. With this feature selection algorithm, feature sub-sets that are low in noise, high in correlation and free of redundancy can be selected, effective dimensionality reduction can be realized, the generalization ability and learning efficiency of the machine learning algorithm are improved, the running time of the algorithm is reduced, and finally a simple, efficient and easy-to-understand learning model is generated.
Owner:SOUTH CHINA UNIV OF TECH
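
The abstract above uses the within-class and between-class distance as the performance function but does not give its exact form. The sketch below shows one common choice, the ratio of between-class scatter to within-class scatter for a candidate feature sub-set. The function name and the ratio itself are assumptions, not the patent's definition.

```python
def separability(samples, labels, subset):
    """Between-class scatter divided by within-class scatter on the
    features listed in `subset` (higher means better separated classes).

    samples: list of feature vectors
    labels:  class label of each sample
    subset:  indices of the features being evaluated
    """
    rows = [[x[f] for f in subset] for x in samples]
    dim = len(subset)

    def mean(vectors):
        return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # Group the projected samples by class.
    by_class = {}
    for row, lab in zip(rows, labels):
        by_class.setdefault(lab, []).append(row)

    overall = mean(rows)
    within = 0.0
    between = 0.0
    for members in by_class.values():
        centre = mean(members)
        within += sum(sq_dist(m, centre) for m in members)
        between += len(members) * sq_dist(centre, overall)

    # A zero within-class scatter means the classes are perfectly compact.
    return float("inf") if within == 0.0 else between / within
```

A dynamic programming search could call such a function at each decision stage to compare candidate sub-sets, keeping the one with the larger score.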

Clustering model training method and device, electronic equipment and computer storage medium

The embodiment of the invention discloses a clustering model training method and device, electronic equipment and a computer storage medium. The method includes the following steps: new photos, which carry class labels, are clustered using a clustering model and already clustered photos to obtain a clustering result for the new photos; the return function value of the clustering result is calculated based on the clustering result of the new photos and the class labels; and the clustering model is trained according to the return function value of the clustering result. According to the embodiment, when new photos and the photos in an initial-state photo album that already has a classifying result are clustered by the trained clustering model, the obtained clustering result is closer to a manual classifying result, and the clustering accuracy of the trained clustering model is higher.
Owner:BEIJING SENSETIME TECH DEV CO LTD

Self-coding neural network processing method and device, computer equipment and storage medium

The invention discloses a self-encoding neural network processing method and device, computer equipment and a storage medium. The method comprises the steps of: converting a text sample into a sample word vector, inputting the sample word vector into a convolutional neural network model, and carrying out preliminary feature extraction on the sample word vector to obtain preliminary hidden features of the sample; inputting the preliminary hidden features of the sample into a plurality of self-coding neural networks, training the self-coding neural networks to obtain a plurality of self-coding neural network models, and inputting the preliminary hidden features of the sample into the self-coding neural network models for feature extraction to obtain the sample hidden features output by the self-coding neural network models; clustering the samples according to the extracted hidden features to obtain a clustering result; determining whether to reconstruct the self-coding neural network according to the clustering result; and, if it is determined that the self-coding neural network needs to be reconstructed, constructing a target self-coding neural network according to the contour coefficient, thereby acquiring a self-coding neural network with high clustering accuracy.
Owner:PING AN TECH (SHENZHEN) CO LTD
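
The "contour coefficient" in the abstract above is assumed here to be the silhouette coefficient, the standard clustering quality score. Under that assumption, the sketch below shows how such a score can be computed for a clustering of extracted features. The function name is hypothetical, and the abstract does not say how the score is used to rebuild the network.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, in the range [-1, 1].

    points: list of feature vectors (sequences of equal length)
    labels: cluster label of each point
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # by convention a singleton cluster contributes 0
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(points)
```

A score near 1 indicates compact, well separated clusters, and a score near or below 0 indicates overlapping or misassigned ones, so comparing scores across candidate networks is one plausible way to pick the one whose features cluster best.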