219 results about "Data imbalance" patented technology

Imbalance means that the number of data points available for the different classes is different: if there are two classes, balanced data would mean 50% of the points in each class. For most machine learning techniques, a little imbalance is not a problem.

Short-impending rainfall prediction method based on ConvLSTM and 3D-CNN

Active | CN110363327A | Evenly distributed | Increased accuracy of rainstorm forecasts | Forecasting | Neural architectures | Data imbalance | Short-term memory
The invention discloses a short-impending rainfall prediction method based on ConvLSTM and 3D-CNN, and belongs to the technical field of weather forecast. The method comprises the following steps: firstly, inputting a historical radar echo map, a gridding temperature and total rainfall at a moment t, and performing data cleaning and denoising on the historical radar echo map and the gridding temperature; performing statistical analysis on the rainfall data imbalance problem, and establishing a new loss function using different weights at different rainfall rate levels; secondly, standardizing the gridding temperature and the total rainfall by using a meteorological data mapping method based on power and logarithm transformation; and finally, fusing the t-moment input data subjected to the previous step into a data block, carrying out model building and testing based on a convolutional long-term and short-term memory neural network and a three-dimensional convolutional neural network, and outputting a short-term and temporary rainfall prediction result. According to the method, the rainstorm prediction precision can be improved, the meteorological data is reasonably visualized and standardized, the image features of various meteorological data are fused, and the noise interference is reduced.
Owner:SOUTHEAST UNIV
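
The level-weighted loss mentioned in the abstract above could look roughly like the following sketch. The abstract gives neither thresholds nor weights, so `THRESHOLDS`, `WEIGHTS`, and the choice of a weighted mean squared error are illustrative assumptions, not the patented formula.

```python
from bisect import bisect_right

# Hypothetical rain-rate level boundaries (mm/h) and one weight per level.
# The abstract does not state these values; they are placeholders.
THRESHOLDS = [0.5, 2.0, 5.0, 10.0, 30.0]
WEIGHTS = [1.0, 2.0, 5.0, 10.0, 20.0, 30.0]


def level_weight(rain_rate):
    """Return the weight of the rainfall level that rain_rate falls into."""
    return WEIGHTS[bisect_right(THRESHOLDS, rain_rate)]


def weighted_mse(predicted, observed):
    """Mean squared error with each cell weighted by its observed rainfall level."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    total = sum(level_weight(o) * (p - o) ** 2 for p, o in zip(predicted, observed))
    return total / len(observed)
```

Heavier weights on the rare high-rate levels make an error on a rainstorm cell cost more than the same error on a dry cell, which is one common way to counter the dominance of light-rain samples.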

Bullet screen text classification method, device, equipment, and storage medium

Pending | CN110399490A | Improve performance | Solve the problem caused by the uneven distribution of proportional data | Character and pattern recognition | Selective content distribution | Data imbalance | Data set
The invention provides a bullet screen text classification method, a bullet screen text classification device, equipment and a storage medium. The method comprises the steps: obtaining an imbalanced training data set with pre-marked categories, and dividing the training data set into sufficient samples and insufficient samples; training a textCNN model on the sufficient samples; training an SVM classifier on the insufficient samples; inputting a text to be tested into the trained textCNN model, and outputting classification probabilities for the categories in the sufficient samples; and if the output classification probability is smaller than a first preset threshold, inputting the text to be tested into the trained SVM classifier, and outputting a predicted category. According to the method, classification models for different sample scales are trained separately according to the sizes of the training samples, and the two classification models are then combined to classify the text to be tested. This solves the problem of data imbalance in the training samples; compared with single-model training, the risk of over-fitting and the bias are reduced, and the recognition accuracy is higher.
Owner:WUHAN DOUYU NETWORK TECH CO LTD
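
The threshold-based fallback described in the abstract above can be sketched as a small cascade. The function names, the default threshold of 0.6, and the toy stand-in models are all hypothetical; the real method would plug in a trained textCNN and a trained SVM.

```python
def cascade_predict(text, primary_probs, fallback_predict, threshold=0.6):
    """Use the primary model's label unless its top probability is below threshold."""
    probs = primary_probs(text)               # dict: label -> probability
    best_label = max(probs, key=probs.get)
    if probs[best_label] >= threshold:
        return best_label
    # Low confidence: hand the text to the model trained on the scarce categories.
    return fallback_predict(text)


def toy_primary(text):
    """Stand-in for a trained textCNN over the well-populated categories."""
    if "gg" in text:
        return {"praise": 0.9, "question": 0.1}
    return {"praise": 0.5, "question": 0.5}


def toy_fallback(text):
    """Stand-in for a trained SVM over the scarce categories."""
    return "rare_category"
```

With these stand-ins, `cascade_predict("gg wp", toy_primary, toy_fallback)` stays with the primary model, while a text the primary model is unsure about is routed to the fallback.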

Biological information recognition method based on dynamic sample selection integration

The invention discloses a biological information recognition method based on dynamic sample selection integration, mainly solving the problem of low correct recognition rate of subclass samples caused by data imbalance. The realizing process for solving the problem comprises the following steps: (1) a training set is divided into a series of balanced sub data sets by adopting a training set dividing method; (2) the obtained balanced sub data sets are divided into respective matrix classifiers as initial training sets; (3) on the matrix classifiers, cyclic training is carried out by adopting a dynamic sample selecting method; (4) a testing set is tested by decision functions obtained in each training, thus obtaining decision results; (5) weight of the decision results is calculated by adopting a cost-sensitive idea; and (6) the decision results of each time are weighted and integrated, thus obtaining the final recognition result. Compared with the prior art, the method has the advantages of high accuracy and low calculation complexity, the size relation between a correct ratio and a recall ratio can be regulated as required, and the method is used for recognizing biological information, network intrusion and financial fraud and detecting anti-spam.
Owner:XIDIAN UNIV
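
Steps (1) and (6) of the abstract above, dividing the training set into balanced sub data sets and integrating weighted decisions, might be sketched as follows. The abstract does not say how the division is performed or how the weights are derived, so the random chunking of the majority class and the sign-of-weighted-sum vote are assumptions for illustration only.

```python
import random


def balanced_subsets(majority, minority, seed=0):
    """Pair each minority-sized chunk of the shuffled majority class with the full minority class."""
    if not minority:
        raise ValueError("minority class must not be empty")
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = len(minority)
    subsets = []
    for start in range(0, len(shuffled), size):
        chunk = shuffled[start:start + size]  # the last chunk may be smaller
        subsets.append((chunk, list(minority)))
    return subsets


def weighted_vote(decisions, weights):
    """Combine +1/-1 decisions from several classifiers into one label."""
    score = sum(w * d for d, w in zip(decisions, weights))
    return 1 if score >= 0 else -1
```

Each returned pair can serve as the initial training set of one base classifier, so no single classifier sees the full majority class.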

A transformer state evaluation clustering analysis method based on data imbalance measurement

Active | CN109816031A | Improve status assessment accuracy | Clustering effect is good | Character and pattern recognition | Data imbalance | Transformer
The invention discloses a transformer state evaluation clustering analysis method based on data imbalance measurement. The method comprises the following steps of screening out index parameters corresponding to different types of fault analysis of the power transformer from unbalanced monitoring data according to a common fault index system of the power transformer, and processing the index parameters by using a proportional normalization method; randomly selecting two groups of data in the index parameters as an initial clustering center, and setting clustering analysis parameters; calculating the Euclidean distance between each type of fault index parameters and the initial clustering center, dividing data in the unbalanced monitoring data into a lower approximate set or a boundary area of a class cluster according to the Euclidean distance, and calculating the degree of unbalance between the class clusters; measuring the membership degree of the monitoring data by fusing the class cluster imbalance degree; carrying out iterative computation on the class cluster center according to the class cluster data distribution condition; and finally, carrying out state evaluation on the power transformer according to a clustering result. According to the present invention, the state evaluation precision of the power transformer is effectively improved.
Owner:NANJING UNIV OF POSTS & TELECOMM

K neighbor-based Bayesian personalized recommendation method and device

The invention discloses a K neighbor-based Bayesian personalized recommendation method. The K neighbor-based Bayesian personalized recommendation method comprises the following steps: 1) through behavior data of a user, seeking K neighbors of the user; 2) according to observed positive feedback items of the user and observed positive feedback items of a user group consisting of k neighbor users of the user, dividing an item set; 3) determining an item level preference relation of the user; 4) maximizing the probabilities of all the users on the item set to obtain an objective function, wherein item prediction of the user is realized by adopting a matrix decomposition model; parameters in the objective function are solved by adopting a stochastic gradient descent method. The invention further discloses a K neighbor-based Bayesian personalized recommendation device. Through the K neighbor-based Bayesian personalized recommendation method and the K neighbor-based Bayesian personalized recommendation device, mutual impact between the users is taken into account, and through the impact, the item set is divided, so that the number of unobserved items is reduced, and an adverse impact caused by data imbalance and data sparseness in the recommendation process is effectively relieved.
Owner:PEKING UNIV +1
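
The abstract above names a matrix decomposition model whose objective is solved by stochastic gradient descent. A minimal sketch of one such update, using the standard Bayesian personalized ranking objective ln sigmoid(x_ui - x_uj) with L2 regularization, is shown below. It does not reproduce the K-neighbor division of the item set; the learning rate, regularization strength, and function names are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def bpr_sgd_step(p_u, q_i, q_j, lr=0.05, reg=0.01):
    """One gradient-ascent step for a user u who prefers item i over item j.

    p_u, q_i, q_j are latent factor vectors; new vectors are returned.
    """
    x_uij = dot(p_u, q_i) - dot(p_u, q_j)     # predicted preference margin
    g = sigmoid(-x_uij)                        # d/dx of ln(sigmoid(x))
    new_p = [p + lr * (g * (qi - qj) - reg * p) for p, qi, qj in zip(p_u, q_i, q_j)]
    new_qi = [qi + lr * (g * p - reg * qi) for p, qi in zip(p_u, q_i)]
    new_qj = [qj + lr * (-g * p - reg * qj) for p, qj in zip(p_u, q_j)]
    return new_p, new_qi, new_qj
```

Repeating the step on sampled (user, preferred item, less preferred item) triples pushes the preferred item's score above the other's; in the patented method the triples would presumably be drawn from the item-level preference relation it defines.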

Data equalization method based on deep learning multi-weight loss function

The invention relates to a data equalization method based on a deep learning multi-weight loss function, and the method comprises the steps: firstly obtaining a target image data set in a training process employing a deep learning model, determining the class number C of data samples and the size Ni of each class of samples according to the target data set, determining hyper-parameters [alpha] and [gamma] and a weighting coefficient Ci of the importance of each class of samples, and determining a multi-weight loss function MWLfocal (z, y), carrying out continuous iterative training by using the neural network model, carrying out error calculation by using the multi-weight loss function in the training process, and continuously updating weight parameters of the model by using a back propagation algorithm until network convergence reaches an expected target, thereby finally completing training. By means of the loss function, the problems of sample number imbalance and classification difficulty imbalance of different data classes can be solved at the same time, the detection accuracy of key classes can be further improved, the method can be applied to a data set with the data imbalance problem, and therefore the influence of the class imbalance problem is effectively relieved.
Owner:UNIV OF SCI & TECH OF CHINA
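
The abstract above names the ingredients of the multi-weight loss (class sizes Ni, hyper-parameters alpha and gamma, and importance coefficients Ci) but not its formula. The sketch below combines them in one plausible way: a standard focal loss term scaled by an inverse-frequency weight and by the class importance. The exact combination is an assumption, not the patented MWLfocal definition.

```python
import math


def softmax(logits):
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multi_weight_focal_loss(logits, label, class_sizes, importance,
                            alpha=0.25, gamma=2.0):
    """Focal loss for one sample, scaled by class frequency and class importance."""
    p_t = max(softmax(logits)[label], 1e-12)   # probability of the true class
    # Assumed inverse-frequency weight: equals 1.0 when all classes have the same size.
    freq_weight = sum(class_sizes) / (len(class_sizes) * class_sizes[label])
    focal = -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
    return importance[label] * freq_weight * focal
```

The `(1 - p_t) ** gamma` factor down-weights easy samples (difficulty imbalance), while `freq_weight` and `importance` address sample-count imbalance and key classes respectively.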

Heterogeneous federated learning mine electromagnetic radiation trend tracking method based on SVD algorithm

The invention discloses a heterogeneous federated learning mine electromagnetic radiation intensity trend tracking method based on an SVD algorithm, and the method comprises: firstly proposing a heterogeneous model federated learning algorithm for the problem of data imbalance in a federated learning client, and setting a heterogeneous central model in a server for the client to select, so as to improve the precision of a local model; aiming at the problem of uploading communication cost of local model parameters, providing an efficient communication algorithm that an SVD algorithm is firstly used for decomposing a parameter matrix to obtain a corresponding singular value matrix, and then the singular value matrix is uploaded to a central server for aggregation updating; and finally, using the updated local model by each client to extract local data features, and using the features and real data values by each client to train the ESN and then to execute trend tracking. According to the invention, trend tracking of electromagnetic radiation intensity acquired by multiple sensors can be realized on the premise of protecting data privacy, the trend tracking precision of each client can be improved, and the communication cost required by a framework is reduced.
Owner:CHINA UNIV OF MINING & TECH

Real-time video face key point detection method based on deep learning

The invention relates to a real-time video face key point detection method based on deep learning, and the method employs a convolutional neural network to carry out the key point detection of a single frame, employs a depth separable convolution to improve the model detection rate, employs a boundary heat map as an additional subtask of an original network to improve the constraint of a global face structure of the original network. The method improves the detection accuracy of an original network, is used for solving a data imbalance loss function of a heat map, improves the generalization capability of a model for a large attitude sample under a limited sample, and improves the inter-frame smoothness through an optical flow loss function. In the detection process, for a frame of which the confidence is lower than a key point confidence threshold due to an extremely large angle, fitting is carried out by utilizing 3DMM to obtain dense key point coordinates, 68-point sampling is carried out on the obtained dense key points according to a projection error between minimum frames, and the consistency with the previous frame is kept. The method has the advantages of real-time performance, capability of utilizing global inter-frame information, high detection accuracy of a face large posture condition and the like.
Owner:HEBEI UNIV OF TECH

Automatic identification method for shale microsections

The invention discloses an automatic identification method for shale microsections. The method comprises the steps that (1) at the first stage, rock slices are divided into igneous rock, sedimentary rock and metamorphic rock; (2) at the second stage, shale is identified from the sedimentary rock obtained according to classification at the first stage. The classification identification technologies adopted at the two stages are both the decision tree technology, and extracted characteristics all belong to statistical characteristics of RGB channels and fractal characteristics of gray channels of images of the rock slices. According to the method, the shale microsections are automatically identified through the information processing technology at the two stages, so that the problem of non-ideal classification results caused by data imbalance is solved. With respect to characteristic selection, the good fractal characteristics of the shale are fully utilized, and the method is applicable to automatic identification of the shale. The automatic identification method is simple and efficient in calculation and has expansibility, and the accuracy of the identification method can be improved along with an increase in data storage of the rock slices; the method has application value in geological prospecting and mineral research.
Owner:NANJING UNIV