33 results about "Active learning (machine learning)" patented technology

Active learning is a special case of machine learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. In statistics literature it is sometimes also called optimal experimental design.

Unbalanced data classification method based on active learning

The invention discloses an unbalanced data classification method based on active learning. The method comprises the following steps: randomly sampling samples from the original unlabeled data for labeling, and taking them as initial training data; performing cost-sensitive training on the initial training data with a general-purpose machine learning model; predicting all unlabeled samples in the original data with the trained binary supervised classification model, and selecting the N most uncertain samples; calculating, for each of the N samples, the sum of Euclidean distances to the central point of the trained data set, and selecting M of the N samples in descending order of distance; labeling the selected M samples and adding them to the training data set; performing cost-sensitive training on the training data set with the general-purpose machine learning model; and repeating the process until the average uncertainty of the selected M samples falls below a set uncertainty threshold, at which point training stops. While maintaining the performance of the unbalanced data classifier, the method effectively reduces the number of samples to be labeled, saving labeling time and labor cost.
Owner:NANJING UNIV OF SCI & TECH
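
The two-stage selection described in the abstract above (uncertainty first, then distance) can be sketched as follows. This is a minimal illustration, not the patented implementation: the uncertainty formula, the use of the labeled-set centroid as the "central point", and all function names are assumptions made for the example.

```python
import math

def uncertainty(p_pos):
    """Binary uncertainty (assumed form): 1.0 at p = 0.5, 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p_pos - 1.0)

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def select_batch(unlabeled, probs, labeled, n, m):
    """Keep the n most uncertain samples, then the m of those farthest from the labeled centroid."""
    most_uncertain = sorted(
        range(len(unlabeled)), key=lambda i: uncertainty(probs[i]), reverse=True
    )[:n]
    center = centroid(labeled)
    farthest_first = sorted(
        most_uncertain, key=lambda i: math.dist(unlabeled[i], center), reverse=True
    )
    return farthest_first[:m]

# Hypothetical toy data: two labeled points and four unlabeled candidates.
labeled = [[0.0, 0.0], [2.0, 0.0]]            # centroid is (1, 0)
unlabeled = [[1.0, 1.0], [5.0, 0.0], [1.0, 3.0], [9.0, 9.0]]
probs = [0.5, 0.45, 0.6, 0.99]                # assumed classifier outputs P(y = 1)
picked = select_batch(unlabeled, probs, labeled, n=3, m=2)
print(picked)
```

The last candidate is far from the centroid but is dropped in the first stage because the classifier is already confident about it.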

Medical image classification method and device, medium and electronic equipment

The invention relates to the field of machine learning, and discloses a medical image classification method and device, a medium and electronic equipment. The method comprises the following steps: selecting a target medical image sample from an unlabeled medical image sample set by utilizing an active learning framework, wherein a query strategy of the active learning framework is provided by a reinforcement learning model; inputting the target medical image sample labeled by the labeling expert into a medical image classification model, and training the medical image classification model; if the training does not meet the preset condition, obtaining a training result, training the reinforcement learning model based on the training result, updating the query strategy by utilizing the trained reinforcement learning model, and returning to the sample selection step until the training meets the preset condition; and inputting to-be-classified medical image data into the trained medical image classification model for classification. The method establishes a long-term working mechanism for training the medical image classification model through human-machine cooperation, reducing the labeling cost and improving the labeling efficiency.
Owner:PING AN TECH (SHENZHEN) CO LTD

Transfer learning algorithm based on active learning

The invention discloses a transfer learning algorithm based on active learning, and belongs to the field of machine learning. A large body of research already exists on general unsupervised transfer learning algorithms; building on it, the algorithm's performance in the target domain can be improved at a relatively low sample labeling cost. After an unsupervised domain adaptation process is carried out, the active transfer learning algorithm queries a batch of data selected by an active sampling method to fine-tune and update the network parameters, so that the extracted features have both good transferability and good discrimination capability. In the invention, the active sampling strategy is not only based on the traditional information entropy method, but also introduces a feature-based evaluation index for the transfer learning setting.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Model training method and device, short message auditing method and device, equipment and storage medium

The invention discloses a model training method, a short message auditing method and device, equipment and a storage medium, and relates to the field of artificial intelligence. According to the specific implementation scheme, model training is as follows: performing sample reduction on a first unlabeled sample to obtain a second unlabeled sample; inputting the second unlabeled sample into a machine learning model for prediction to obtain a probability corresponding to a prediction result of the second unlabeled sample; selecting a third unlabeled sample from the second unlabeled samples according to the probability; and training the machine learning model by using the labeled third unlabeled sample. According to the embodiment of the application, redundant samples are removed through sample reduction, so that the selected samples have certain representativeness. And an active learning technology is used, and a machine learning model is used to further select a sample with the most annotation value for the current model and a large amount of information, so that the annotation cost is reduced.
Owner:BEIJING BAIDU NETCOM SCI & TECH CO LTD

Method for optimizing gradient titanium dioxide nanotube micro-patterns under assistance of machine learning

The invention discloses a method for optimizing a gradient titanium dioxide nanotube micro-pattern under the assistance of machine learning, and relates to preparation of the gradient titanium dioxide nanotube micro-pattern. The method comprises the following steps: (1) setting related experimental conditions to prepare a TiO2 nanotube micro-pattern and characterizing it to obtain experimental data; (2) preprocessing the obtained experimental data and performing machine learning modeling; (3) using the machine learning model to make predictions and recommend an optimization experiment scheme; and (4) verifying the prediction result through an experiment, supplementing the data, and iterating steps (1)-(4). By means of the technical scheme, sample data expansion, self-learning and automatic training of a model meeting preset precision can be realized automatically, so that an active learning framework for predicting the parameter-structure-property relationship of the material is constructed automatically, and intelligent generation and reverse design of the material are realized. The TiO2 nanotube micro-pattern sample with the maximum gradient range, prepared in one step by a bipolar oxidation method in an ammonium fluoride / water / glycerol system, and its experimental conditions can be found with fewer experiments. Operation is simple and convenient, and operation time is short.
Owner:XIAMEN UNIV

Method, apparatus and computer program for operating a machine learning framework with active learning technique

Provided is a method for analyzing a user in a data analysis server, the method including: step A of establishing a question database comprising a plurality of questions, of collecting solving result data of a user for the plurality of questions, and of learning the solving result data, thereby generating a data analysis model for modeling the user; step B of generating an expert model that recommends learning data necessary for machine learning of the data analysis model; step C of extracting at least one question from the question database according to recommendation from the expert model, and of updating the data analysis model using solving result data of a user for the at least one extracted question; and step D of updating the expert model by applying, to update information of the data analysis model, a reward that is set in a direction to improve prediction accuracy of the data analysis model.
Owner:RIIID CO

Method and device for training image video quality evaluation model based on active learning

The invention provides an image video quality evaluation model training method and device based on active learning. The method is suitable for the technical field of machine learning and the like, and comprises the following steps: given k known labeled samples (k > 0), determining the uncertainty of a to-be-evaluated sample by using a quality evaluation model in an active learning mode; and, if the uncertainty of the sample is very high, preferentially performing subjective scoring on the sample, adding the sample to the k labeled samples, and retraining the quality evaluation model to achieve better performance. The method combines active learning with an image video quality evaluation model: more valuable samples are selected from a massive image video library through active learning and provided to annotators for scoring, so that, for the same model performance, the number of samples needing subjective scoring can be reduced and the labeling cost is saved.
Owner:TENCENT TECH (SHENZHEN) CO LTD

Intraocular lens diopter calculation system based on machine learning

The invention relates to an intraocular lens diopter calculation system based on machine learning. The intraocular lens diopter calculation system comprises a prediction model, an input module, a calculation module, an additional module and an output module. The calculation module is used for acquiring a diopter number of an intraocular lens to be implanted (IOL) by taking target equivalent spherical power as an ideal value and taking preoperative information of a cataract patient as input on the basis of a prediction model; the additional module is used for providing a plurality of different simulated intraocular lens diopters to the calculation module, and the calculation module generates postoperative optometry equivalent spherical lenses corresponding to the different simulated intraocular lens diopters. The artificial intelligence active learning data characteristics are fully utilized, the error calculation capacity is autonomously optimized, the diopter of the intraocular lens needing to be implanted in cataract surgery is accurately calculated, eyeball biological parameters of all dimensions are matched, and the eyeball prediction accuracy of the extreme eyeball biological parameters is higher.
Owner:WENZHOU MEDICAL UNIV

Traditional Chinese medicine tongue picture greasiness classification method containing noisy annotation data

Pending | CN113657449A | Solve the problem of strong subjectivity and noisy labels | Confidence | Character and pattern recognition | Neural architectures | Data set | Classification methods
The invention relates to a traditional Chinese medicine tongue picture greasy classification method with noisy annotation data, and belongs to the field of computer vision. According to the method, a noisy annotation tongue picture greasy data set is subjected to label confidence evaluation and updating processing through a plurality of machine learning classification models; label data set labeling quality updating and classification network model parameter updating are put in an iteration process, an active learning idea is introduced, and a novel traditional Chinese medicine tongue picture greasy sample labeling quality and classification model interactive iteration improvement method is provided. According to the method, the confidence coefficient of the labeled samples is improved, so that high-quality labeled samples are screened, the method can be applied to other tongue picture samples containing noise labels, the problems that tongue picture labeling in traditional Chinese medicine is high in subjectivity and contains noise labels can be effectively solved, and the method has higher generalization performance.
Owner:BEIJING UNIV OF TECH

Method, device and equipment for determining training sample

The invention relates to the technical field of machine learning, and discloses a method for determining a training sample, and the method comprises the steps: obtaining an unlabeled sample set and a plurality of alternative models, and distributing corresponding labeled sample sets for the alternative models, wherein the plurality of alternative models are active learning models with different sample selection strategies; training corresponding alternative models by using the labeled sample set to obtain evaluation models corresponding to the alternative models; evaluating the unlabeled sample set by using an evaluation model to obtain a first evaluation result; and determining a training sample according to the first evaluation result. According to the method, the active learning models with different sample selection strategies are trained through the labeled sample set to obtain the corresponding evaluation models, and the unlabeled sample set is evaluated by using the evaluation models to determine the training samples, so that the tendency of a single active learning algorithm is avoided, and the diversity of the training samples is improved. The invention also discloses a device and equipment for determining the training sample.
Owner:SHANGHAI MININGLAMP ARTIFICIAL INTELLIGENCE GRP CO LTD

System and method for active machine learning

An electronic device for active learning includes at least one memory and at least one processor coupled to the at least one memory. The at least one processor is configured to select one or more entries from a data set including unlabeled data based on a similarity between the one or more entries and labeled data. The at least one processor is further configured to cause the one or more entries to be labeled.
Owner:SAMSUNG ELECTRONICS CO LTD

Sample classification method based on active learning and neural network

The invention discloses a sample classification method based on active learning and a neural network, and belongs to the field of machine learning in intelligent science and technology. Taking the uncertainty of the neural network model about each sample point as a reference, the method calculates three traditional uncertainty indexes, Least Confidence, Margin and Entropy, and uses the three indexes to vote on the samples. The samples with the highest number of votes are the finally selected sample points; these are the samples the model is most uncertain about, and labeling them benefits the training of the neural network model. The number of sample points needing to be labeled can be effectively reduced, the labeling cost is reduced, and the classification precision of the model is improved.
Owner:XIANGTAN UNIV
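
A minimal sketch of voting with three standard uncertainty measures (least confidence, margin, entropy) is shown below. The abstract above does not specify how votes are cast or ties are broken, so the top-k voting rule, the index-based tie-break and the function names here are assumptions for illustration.

```python
import math
from collections import Counter

def least_confidence(p):
    """Higher when the top class probability is low."""
    return 1.0 - max(p)

def margin(p):
    """Negated gap between the two largest probabilities, so higher means more uncertain."""
    top = sorted(p, reverse=True)
    return -(top[0] - top[1])

def entropy(p):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0)

def vote_select(prob_rows, k):
    """Each measure votes for its k most uncertain rows; return the k rows with most votes."""
    votes = Counter()
    for score in (least_confidence, margin, entropy):
        ranked = sorted(range(len(prob_rows)), key=lambda i: score(prob_rows[i]), reverse=True)
        votes.update(ranked[:k])
    # Most votes first; ties are broken by the lower index (an arbitrary choice).
    return sorted(votes, key=lambda i: (-votes[i], i))[:k]

# Hypothetical softmax outputs of a three-class model for four unlabeled samples.
rows = [
    [0.34, 0.33, 0.33],
    [0.50, 0.50, 0.00],
    [0.90, 0.05, 0.05],
    [0.60, 0.30, 0.10],
]
selected = vote_select(rows, k=2)
print(selected)
```

Row 0 is chosen by all three measures and row 1 by two of them, so those two would be sent for labeling in this toy case.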

Classification method of unbalanced data

The invention discloses a method for classifying unbalanced data, and belongs to the technical field of machine learning, the method comprises an active learning method and an oversampling method, the unbalanced data comprises marked data and unmarked data, and the method specifically comprises the following steps: preprocessing the marked data, and calculating distance features to obtain an initial training set; training the initial training set to obtain an initial classifier; calculating the uncertainty of the unmarked data by using the initial classifier; sorting the unmarked data according to the uncertainty, and manually marking the unmarked data to obtain a marked data set; performing probability oversampling on the marked data set to obtain a balanced data set; and training the balanced data set to obtain a classifier which is used for classifying unbalanced data. According to the unbalanced data classification method, active learning and an oversampling method are combined, so that the number of samples participating in training is reduced; meanwhile, it is guaranteed that the classifier has high classification precision for majority class data and minority class data.
Owner:NANJING UNIV OF POSTS & TELECOMM
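
The balancing step in the abstract above can be illustrated with plain random oversampling. Note the substitution: the abstract speaks of "probability oversampling" without defining it, so this sketch uses simple random duplication of minority-class samples instead, and the function name and seed parameter are hypothetical.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))   # sample with replacement from the class pool
            out_y.append(cls)
    return out_x, out_y

# Hypothetical labeled set after the active learning step: four majority, one minority sample.
xs = ["a", "b", "c", "d", "e"]
ys = [0, 0, 0, 0, 1]
bal_x, bal_y = random_oversample(xs, ys)
print(Counter(bal_y))
```

Duplicating samples adds no new information, which is why the abstract pairs oversampling with active learning to choose informative samples first.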

Novel machine learning approach for the identification of genomic features associated with epigenetic control regions and transgenerational inheritance of epimutations

A two-step (sequential) machine learning analysis tool is provided that combines an initial active learning step with a subsequent imbalance class learner step (the ACL-ICL protocol). This technique provides a more tightly integrated approach for a more efficient and accurate machine learning analysis. ACL and ICL work synergistically to improve the accuracy and efficiency of machine learning, and the combination can be used with any type of dataset, including biological datasets.
Owner:WASHINGTON STATE UNIVERSITY

Cross-domain image example level active labeling method

The invention discloses a cross-domain image example level active labeling method. Digital image target detection is one of the basic tasks of computer vision, and generally needs a large number of samples with object frame labels to be used for training a machine learning model. However, in real tasks, a large number of training samples of target tasks cannot be obtained due to sensibility and the like, so that the model performance is low, and the model is difficult to promote. According to the method, the unsupervised source domain which is easy to obtain and rich in knowledge is utilized, and efficient example labeling is automatically selected through an active learning technology, so that finer labeling information is obtained, and the data labeling difficulty is greatly reduced; meanwhile, the obtained supervision information is fully utilized, the performance of the model on the target task is efficiently improved, and the participation cost of the user can be remarkably reduced.
Owner:NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

Method for predicting aging defects of software in project based on Active Learning

Active | CN112527670A | Improve robustness | Alleviate time-consuming and labor-intensive collection of aging datasets | Kernel methods | Character and pattern recognition | Data set | Engineering
The invention discloses an in-project software aging prediction method based on Active Learning, and the method comprises the steps: collecting the static measurement of a code in software, selecting a sample through Active Learning, carrying out the labeling of the sample, taking the sample as a training set, and predicting the remaining samples without class labels; and adopting Active Learning for sample selection and manual labeling, and forming a training set. An oversampling and undersampling combined method is adopted to relieve the class imbalance problem, and a machine learning classifier is used for prediction. The method takes into account that software aging defect data sets contain few samples and are time-consuming and laborious to collect; it relieves the extreme class imbalance problem by combining undersampling and oversampling, helps developers discover and remove software aging related defects in the development and test stage, and avoids losses caused by the software aging problem. The feasibility of the method is verified on real software, and the method can be popularized to other software to predict software aging related defects.
Owner:WUHAN UNIV OF TECH

Malicious PDF document detection method based on active learning

The invention relates to a malicious PDF document detection method based on active learning, is used for detecting malicious documents in PDF files, and belongs to the technical field of data storage security. According to the method, a machine learning method and malicious PDF document detection are combined, the structural features of the PDF document are extracted, the features are processed in a structural multi-mapping and structural path merging mode, and feature drifting is limited within a certain period of time while hidden attacks are reduced. Malicious PDF document feature distribution is learned by using a fully connected deep model, for the condition that an identification result is uncertain, an active learning method is adopted to improve the model performance, and a common protocol analysis method is adopted to select a small part of samples with rich information amount and add the samples into a training set for the next round of training. On the premise of not increasing too many samples, the model performance is remarkably improved, and the trained recognizer can reliably and effectively recognize malicious PDF documents.
Owner:BEIJING INSTITUTE OF TECHNOLOGY

Student cooperation state evaluation method and system based on brain-computer interface

The invention provides a student cooperation state evaluation method and system based on a brain-computer interface. The method comprises the following steps: determining brain wave data of a student in a cooperation learning process, the brain wave data being collected by using a brain-computer interface technology; inputting the brain wave data into a trained classifier, and classifying the cooperation learning states of the students, wherein the cooperation learning states comprise interactive learning, construction learning, active learning and passive learning. The classifier is constructed based on a long short-term memory (LSTM) network and is used for classifying the cooperation learning states of the students according to the brain wave data in the cooperation learning process of the students. The brain-computer interface and machine learning technology is adopted, the cooperation learning state of the students is automatically monitored, and compared with a traditional method for judging the states of the students through observation of teachers, the method is more scientific and efficient.
Owner:HUAZHONG NORMAL UNIV

Machine Learning Systems And Methods For Interactive Concept Searching Using Attention Scoring

Machine learning systems and methods for interactive concept searching using attention scoring are provided. The system receives textual data. The system identifies one or more word representations of the textual data. The system further receives a concept. The system determines a score indicative of a likelihood of each of the one or more word representations being representative of the concept using an attention scoring process having a temperature variable. The system generates a dataset for training and evaluation based at least in part on the score. The dataset includes the one or more word representations and concept. The system further processes the dataset to train one or more deep active learning models capable of the interactive concept search.
Owner:INSURANCE SERVICES OFFICE INC
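
One common way to realize "attention scoring with a temperature variable" is a temperature-scaled softmax over similarity scores. The sketch below assumes dot-product similarity between word vectors and a concept vector; the abstract above does not state the actual scoring function, so the vectors, names and formula are illustrative assumptions.

```python
import math

def attention_scores(word_vecs, concept_vec, temperature=1.0):
    """Softmax over dot-product similarities; a lower temperature sharpens the distribution."""
    logits = [
        sum(w * c for w, c in zip(vec, concept_vec)) / temperature
        for vec in word_vecs
    ]
    peak = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-d word representations and a concept vector.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
concept = [1.0, 0.0]
soft = attention_scores(words, concept, temperature=2.0)
sharp = attention_scores(words, concept, temperature=0.5)
print(soft, sharp)
```

With these inputs the second word has zero similarity to the concept, and lowering the temperature pushes its score further down relative to the other two.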

A method for detecting changes in land cover types based on time series PolSAR images

Active | CN110414566B | Improve the efficiency of change detection | Low cost | Character and pattern recognition | Land cover | Algorithm
The present invention provides a land cover type change detection method based on time series PolSAR images. It aims to solve the problems that current change detection methods struggle to make full use of the time-dimension information between time series images and usually require a large number of high-quality training samples for each image. The invention uses the Omnibus hypothesis test likelihood ratio algorithm to fully mine the time-dimension information of historically accumulated time series PolSAR images, combines it with rich prior knowledge such as existing category labels and classification thematic maps, and uses machine learning algorithms such as active learning and associated knowledge transfer learning to automatically label each scene image, obtaining reliable training samples and thereby achieving high-precision extraction of dynamic change information of land cover categories.
Owner:WUHAN UNIV

Shortlist selection model for active learning

Method(s) and apparatus are provided for generating a selection model based on a machine learning (ML) technique, the selection model for selecting a shortlist of compounds requiring validation with a particular property. An iterative procedure or feedback loop for generating the selection model may include: receiving a prediction result list output from a property model for predicting whether a plurality of compounds are associated with a particular property and a property model score; retraining the selection model based on the property model score and / or the prediction result list; selecting a shortlist of compounds using the retrained selection model from the plurality of compounds associated with the prediction result list; sending the selected shortlist of compounds for validation with the particular property, where another ML technique is used to update the property model based on the validation; repeating the receiving and retraining of the selection model until determining the selection model has been validly trained.
Owner:BENEVOLENTAI TECH LTD

Active learning for data matching

The inventive method comprises: a) training a machine learning model using a current set of labeled data points, each data point being a plurality of data records, where the label of a data point indicates its classification, the training resulting in a trained machine learning model configured to classify data points as representing the same entity or different entities; b) selecting a subset of unlabeled data points from the current set of unlabeled data points using the classification results for that set; and c) providing the subset of unlabeled data points to a classifier and receiving labels for the subset in response. Steps a) to c) may be repeated using the current set of labeled data points plus the newly labeled subset as the current set of labeled data points.
Owner:IBM CORP
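
The train, select, label loop of steps a) to c) above can be sketched with a toy matcher. Everything concrete here is an assumption for illustration: each record pair is reduced to a single similarity score, the "model" is a threshold halfway between the class means, and the labeling oracle is a hypothetical function standing in for the human or classifier that supplies labels.

```python
def train_threshold(labeled):
    """Toy 'model': threshold midway between mean match and mean non-match similarity.

    Assumes the labeled set contains at least one example of each class.
    """
    match = [s for s, is_match in labeled if is_match]
    non_match = [s for s, is_match in labeled if not is_match]
    return (sum(match) / len(match) + sum(non_match) / len(non_match)) / 2

def active_matching(labeled, unlabeled, oracle, batch=2, rounds=3):
    """Repeat: a) train, b) pick the pairs nearest the decision boundary, c) ask for labels."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not unlabeled:
            break
        threshold = train_threshold(labeled)                      # step a
        unlabeled.sort(key=lambda s: abs(s - threshold))          # step b
        chosen, unlabeled = unlabeled[:batch], unlabeled[batch:]
        labeled += [(s, oracle(s)) for s in chosen]               # step c
    return train_threshold(labeled), labeled

# Hypothetical data: similarity scores of record pairs; true matches have score >= 0.7.
oracle = lambda s: s >= 0.7
seed_labels = [(0.1, False), (0.95, True)]
pool = [0.2, 0.35, 0.55, 0.65, 0.72, 0.8, 0.9]
threshold, final_labeled = active_matching(seed_labels, pool, oracle)
print(round(threshold, 4), len(final_labeled))
```

Selecting the pairs closest to the current threshold is one simple uncertainty criterion; the abstract only says the selection uses the classification results, so other criteria would fit the same loop.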

A kind of machine active learning method and learning system

The invention discloses a machine active learning method and a learning system. Original corpus data is clustered to obtain classified corpus data; the classified corpus data is automatically recommended according to preset rules to obtain recommended corpus data; the recommended corpus data is manually labeled to obtain labeled corpus data; and the labeled corpus data is input into a test model for machine learning, which outputs learning results. Supervised and unsupervised learning are thus organically combined, so that, while a good learning effect is ensured, the workload of manual labeling is greatly reduced and the learning efficiency is improved.
Owner:XIAMEN KUAISHANGTONG TECH CORP LTD

Active learning model validation

Method(s), apparatus, and computer-implemented method(s) are provided for training a machine learning (ML) technique to generate a property model for predicting whether a compound has a particular property. An iterative procedure / feedback loop may be performed for generating the property model, the procedure including: generating a prediction result list for a plurality of compounds and their association with the particular property based on the property model; validating the property model based on compounds from the prediction result list having an association with the particular property; and updating the property model based on the property model validation. The procedure / loop may be repeated using the updated property model until it is determined the property model has been validly trained. The property model validation may include selecting a shortlist of compounds, performing simulation analysis and / or laboratory analysis on the shortlist of compounds in relation to the particular property and using the simulation and / or laboratory results in updating the property model.
Owner:BENEVOLENTAI TECH LTD

Method and device for identifying life cycle operation and maintenance state of optical channel

The invention discloses a method and device for identifying the life cycle operation and maintenance state of an optical channel. The method includes: collecting historical data of a current network through a network management system, and defining the life cycle operation and maintenance state of the optical channel according to the historical data, wherein the historical data comprise topological structure data, historical alarm data and historical performance data; performing sample labeling on the collected historical data through active learning to obtain a labeled sample set containing a plurality of labeled data; performing feature engineering processing on the data in the labeled sample set, and calling a machine learning algorithm to train the processed data to obtain an optical channel life cycle operation and maintenance state recognition model; calling the optical channel life cycle operation and maintenance state identification model for the to-be-detected optical channel to obtain the optical channel life cycle operation and maintenance state of the to-be-detected optical channel, and positioning the position and reason of the hidden danger according to the corresponding characteristics. According to the scheme, the life cycle operation and maintenance state of the optical channel can be quickly and accurately identified, and faults can be predicted and fault causes can be positioned in advance.
Owner:FENGHUO COMM SCI & TECH CO LTD +1

Method, device and computer program for operating machine learning framework having active learning technique applied thereto

The present invention relates to a user analysis method, carried out by a data analysis server, the method comprising: step A for configuring a question database including a plurality of questions, collecting solving result data, for the questions, of a user, and by using the solving result data, generating a data analysis model for modeling the user; step B for generating an expert model for suggesting data necessary for the machine learning of the data analysis model; step C for, according to a recommendation from the expert model, extracting at least one question from the question database and updating the data analysis model by using solving result data, for the extracted question, of the user; and step D for updating the expert model by applying, to update information of the data analysis model, a reward configured so as to improve the prediction accuracy of the data analysis model.
Owner:RIIID CO

Industrialized system for rice grain recognition and method thereof

Proposed is an industrialized system (1) and method for rice grain recognition. An optical image (123) is taken by a user (5) and transmitted to a digital platform (11), wherein the system (1) segments the optical image (123) and extracts and / or measures appropriate grain features (41) from the image (123) describing different aspects of the grain (4). The image (123) is processed by the system (1), comprising a selector (1122) selecting different machine learning structures (11211,...,1121i), applying the different machine learning structures (11211,...,1121i) to the extracted features (41) for rice grain (4) recognition, and selecting the best of the applied machine learning structures (11211,...,1121i) by a random sampling process. The selected best of the applied machine learning structures (11211,...,1121i) is further optimized by varying an appropriate threshold (11231) by a threshold trigger (1123) based on a confusion matrix (11221). An active learning structure (113), based on the confusion matrix (11221) comprising the values of True Positive (TP), False Negative (FN), False Positive (FP) and True Negative (TN) for the classified rice grains, provides a feedback loop to the user or human expert (5), wherein the system (1) is retrained based on the feedback parameters of the feedback loop (1132).
Owner:BUEHLER AG
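
Tuning a decision threshold from confusion-matrix counts, as the abstract above describes in outline, might look like the following sketch. The choice of F1 as the quantity to maximize, the candidate thresholds and the function names are assumptions; the abstract does not say which criterion the threshold trigger uses.

```python
def confusion(scores, labels, threshold):
    """Return (TP, FN, FP, TN) when scores >= threshold are predicted positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    tn = sum(1 for s, y in pairs if s < threshold and not y)
    return tp, fn, fp, tn

def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest F1 score (assumed criterion)."""
    def f1(threshold):
        tp, fn, fp, _ = confusion(scores, labels, threshold)
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0
    return max(candidates, key=f1)

# Hypothetical classifier scores and expert-confirmed labels for six grains.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
labels = [False, False, True, True, True, True]
chosen = best_threshold(scores, labels, candidates=[0.3, 0.5, 0.75])
print(chosen, confusion(scores, labels, chosen))
```

In a feedback loop of the kind described, the labels would come from the human expert's corrections, and the threshold would be re-tuned after each retraining round.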

A Method for Predicting In-Project Software Aging Defects Based on Active Learning

Active | CN112527670B | Improve robustness | Alleviate time-consuming and labor-intensive collection of aging datasets | Kernel methods | Character and pattern recognition | Data set | Engineering
The invention discloses an Active Learning-based software aging prediction method in a project. By collecting static metrics of codes in the software, Active Learning is used to select samples and label them as a training set to predict the remaining samples without class labels. Active Learning is used for sample selection and manual labeling to form a training set. A combination of oversampling and undersampling is used to alleviate the class imbalance problem, and a machine learning classifier is used for prediction. The present invention considers that there are few samples in the software aging defect data set, and that the collection is time-consuming and labor-intensive. The method of combining under-sampling and over-sampling is used to alleviate the problem of extreme class imbalance, which helps developers find and remove software aging-related defects during the development and testing stage and avoids losses caused by software aging issues. The feasibility of the invention has been verified on real software, and it can be extended to other software to predict defects related to software aging.
Owner:WUHAN UNIV OF TECH

Active learning classification method based on uncertainty and similarity measurement

The invention discloses an active learning classification method based on uncertainty and similarity measurement. The method comprises the following steps: S1, carrying out preprocessing and vectorization on unlabeled classification data; S2, clustering, selecting most representative samples in each class, carrying out manual labeling, recording the samples as a data set L, and recording the rest samples as a set U; S3, calculating a similarity metric value of each sample in the U; S4, enabling the L to be used for training a plurality of different machine learning models, and obtaining the accuracy rate and the output value of each model; S5, determining a weight value and an uncertainty degree of each model so as to determine an uncertainty decision value; S6, determining a diversified training sample with the maximum value, labeling the diversified training sample, updating the labeled diversified training sample to the data set L, and removing the labeled diversified training sample from the set U to obtain an updated set U; and S7, repeating the steps S3-S6 until the accuracy of each model does not change any more, and obtaining a final marked data set L. According to the method, the information redundant sample size can be reduced, and the data labeling cost is reduced on the basis of ensuring the training effect.
Owner:SOUTHWEST PETROLEUM UNIV