An artificial intelligence-based automatic design method for molecular structure of mineral processing flotation reagents
By using an AI-driven molecular generation and performance prediction model, the molecular structure of flotation reagents is automatically designed, solving the problem of low design efficiency in traditional methods. This achieves efficient and intelligent multi-objective collaborative optimization, improving the design efficiency and innovation capability of flotation reagents.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- WUHAN UNIV OF TECH
- Filing Date
- 2026-03-06
- Publication Date
- 2026-06-12
AI Technical Summary
Traditional methods for manually designing flotation reagents rely on the experience of researchers, resulting in long design cycles, high costs, difficulty in systematically exploring the vast molecular space, inability to achieve multi-objective synergistic optimization, and failure to provide timely feedback of experimental data to the model, leading to low design efficiency and insufficient innovation.
By employing an AI-based molecular generation model and flotation performance prediction model, the molecular structure of flotation reagents is automatically designed. Combined with a multi-objective constraint screening mechanism, efficient and intelligent design is achieved.
It significantly improves the efficiency and innovation of flotation reagent design, and can meet the performance requirements of flotation recovery rate and selectivity while taking into account environmental friendliness and synthesis feasibility, shortening the research and development cycle and reducing costs.
Figure CN122201506A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the technical field of mineral processing, specifically to an artificial intelligence-based automatic design method for the molecular structure of mineral processing flotation reagents.
Background Technology
[0002] Flotation is the most widely used separation technology in mineral processing, utilizing the selective adsorption of flotation reagents to separate target minerals from gangue minerals. The molecular structure of flotation reagents directly determines their surface activity, selective adsorption capacity, and flotation efficiency, making it a key technical aspect of the flotation process. Currently, the molecular structure design of flotation reagents mainly relies on the accumulated experience and trial-and-error experiments of researchers. Based on their understanding of mineral surface properties and reagent mechanisms, designers modify existing reagents through structural modifications and functional group substitutions, and then verify their performance through laboratory flotation tests. This experience-driven design approach has played a crucial role in the history of flotation reagent development, resulting in the development of a variety of flotation reagents, including fatty acid and amine collectors, as well as various frothers and modifiers.
[0003] However, as mineral resources become increasingly complex and degraded and the demands of green mine construction keep rising, traditional manual design, which relies heavily on the personal experience and chemical intuition of researchers, results in long design cycles and high costs. Furthermore, the limits of human cognition often confine designs to local modifications within known structural frameworks, hindering original innovation in molecular structure and severely restricting the pace at which novel, high-efficiency flotation reagents are developed. The space of synthesizable organic molecular structures is vast, while existing flotation reagents cover only a tiny fraction of it. Manual methods cannot systematically and efficiently traverse and evaluate such a space, so numerous potentially high-performance reagent molecules are missed and the exploration of molecular resources lacks both depth and breadth. Moreover, the development of modern flotation reagents requires balancing multiple objectives, including flotation performance, environmental friendliness, and synthetic feasibility. Traditional manual methods struggle to quantify and evaluate these heterogeneous indicators simultaneously and therefore fail to achieve true multi-objective synergistic optimization. In the existing technology, computer-aided molecular design and experimental verification are relatively independent and lack effective data feedback and model iteration mechanisms. Experimental data are not fed back to the design model in a timely manner, which makes it difficult to improve prediction accuracy and prevents the formation of a closed loop of design, verification, and optimization.
[0004] In view of the above shortcomings, there is a need for an intelligent and general-purpose method that can automatically generate and optimize the molecular structure of flotation reagents, integrate performance prediction and multi-objective screening, and establish a closed experimental feedback loop. Such a method is of great practical significance and application value for breaking through the traditional R&D bottlenecks of flotation reagents, accelerating the development of new green and efficient reagents, and enhancing China's capacity for independent technological innovation in the field of mineral processing.
Summary of the Invention
[0005] To address the aforementioned problems, this invention provides an automatic design method for the molecular structure of mineral processing flotation reagents based on artificial intelligence.
[0006] An artificial intelligence-based method for automatically designing the molecular structure of mineral processing flotation reagents includes the following steps: constructing a molecular generation model, wherein the input of the molecular generation model is a string representation, an atom-and-bond representation, or molecular fingerprints and descriptors of existing flotation reagent molecular structures, and the output is candidate molecular structures; setting the target performance and constraints of the flotation reagent molecules, wherein the target performance includes the flotation recovery rate and the selectivity of the flotation reagent for the target mineral, and the constraints include environmental constraints, safety constraints, and synthesis constraints; outputting multiple candidate molecular structures based on the molecular generation model; and screening flotation reagent molecules from the multiple candidate molecular structures based on the target performance and constraints of the flotation reagent molecules.
[0007] Note: The above method automatically explores the potential molecular structure space of reagents through an artificial intelligence-driven molecular generation model. Combined with flotation performance prediction and multi-objective constraint screening mechanism, it realizes efficient and intelligent design of flotation reagent molecular structures. Compared with the traditional design method that relies on human experience, this method significantly improves design efficiency and innovation. It can meet the performance requirements of flotation recovery rate and selectivity while taking into account environmental friendliness and synthetic feasibility. It effectively solves the technical problem that it is difficult for manual methods to systematically explore the huge molecular space and achieve multi-objective synergistic optimization.
[0008] Furthermore, the molecular generation model is a generative AI model, which is a graph neural network generation model or a sequence generation model.
[0009] Note: By using a generative AI model as the molecular generation model, we can fully utilize its powerful probability distribution learning and structure generation capabilities to automatically learn molecular topological features and chemical laws from existing flotation reagent data, thereby efficiently generating chemically effective and structurally diverse candidate reagent molecules.
[0010] Furthermore, the construction of the molecular generation model includes: obtaining existing molecular structure data of flotation reagents as training data; constructing a molecular generation model architecture based on the training data; and training and validating the molecular generation model architecture using the training data to obtain the molecular generation model.
[0011] Note: The above method is based on existing flotation reagent molecular structure data to construct and train a molecular generation model, which enables the model to fully learn the chemical characteristics and structural distribution rules of known high-efficiency reagents, thereby generating candidate molecules that are chemically reasonable and structurally novel, in line with the application background of flotation reagents.
[0012] Furthermore, the environmental constraints include: avoiding low biodegradability and high ecotoxicity; The safety constraints include: avoiding health toxicity and avoiding physicochemical hazards; The synthesis constraints include ensuring the feasibility of the synthesis route, the availability of raw materials, and the safety of the synthesis process.
[0013] Note: By setting the above constraints, we can ensure that the designed candidate molecules not only have excellent flotation performance, but also meet the actual needs of green mine construction and industrial application, effectively avoiding the technical defects of traditional methods that make it difficult to balance performance and environmental safety.
[0014] Further, the step of screening flotation reagent molecules from the plurality of candidate molecular structures based on the target performance and constraints of the flotation reagent molecules includes: A flotation performance prediction model is established; the input of the flotation performance prediction model is the molecular structure of the candidate reagent molecules and the flotation process parameters, and the output is the predicted flotation recovery rate and the selectivity of the flotation reagent to the target mineral; Multiple candidate molecular structures are input into the flotation performance prediction model to obtain the flotation performance of each candidate molecular structure; Based on the flotation performance of each candidate molecular structure, flotation reagent molecules that meet the target performance and constraint conditions are screened from multiple candidate molecular structures.
[0015] Note: The above method enables virtual high-throughput screening of a large number of candidate molecules by establishing a flotation performance prediction model. This method can pre-evaluate the flotation performance of molecules, significantly reducing R&D costs and time cycles, while ensuring that the screened candidate molecules strictly meet the preset performance indicators and constraints, thus improving the accuracy and efficiency of flotation reagent molecule design.
[0016] Furthermore, the flotation performance prediction model employs a deep learning model.
[0017] Furthermore, the establishment of the flotation performance prediction model includes: Obtain data on the molecular structure and flotation performance of known flotation reagents; The molecular structure of a known flotation reagent is converted into a numerical feature representation, which includes molecular fingerprints, molecular descriptors, and / or molecular graph data. Data on the molecular structure and flotation performance of known flotation reagents are preprocessed and divided to obtain a training set for training the model and a validation set for adjusting hyperparameters and preventing overfitting. Model training and validation are then performed to obtain a flotation performance prediction model.
[0018] Explanation: The above method converts the molecular structure of known flotation reagents into numerical feature representations such as molecular fingerprints, descriptors, or molecular graph data, and employs a training set and validation set partitioning strategy for training and validating deep learning models, thus constructing a structured model development process. This method ensures that the flotation performance prediction model can fully learn the mapping relationship between molecular features and flotation performance, while effectively preventing overfitting through validation set monitoring, thereby improving the model's prediction accuracy, generalization ability, and reliability, laying a solid foundation for subsequent high-throughput virtual screening of candidate molecules.
[0019] Furthermore, the step of screening flotation reagent molecules that meet the target performance and constraint conditions from multiple candidate molecular structures based on the flotation performance of each candidate molecular structure includes: The predicted flotation recovery rate of each candidate molecular structure is compared with a preset recovery rate threshold. Candidate molecular structures with predicted flotation recovery rates not lower than the preset recovery rate threshold are retained to obtain the first candidate molecule set. Each candidate molecule structure in the first candidate molecule set is checked for environmental and safety constraints, and candidate molecule structures containing environmental risk structural units or health toxicity structural units are eliminated to obtain the second candidate molecule set. For each candidate molecule structure in the second candidate molecule set, the synthesis constraints are checked, and the candidate molecule structures whose molecular weight is within a preset range and whose number of synthesis steps does not exceed a preset upper limit are retained to obtain the third candidate molecule set. A multi-objective comprehensive score is performed on each candidate molecular structure in the third candidate molecule set. The candidate molecular structures are ranked according to the score results, and the candidate molecular structures in the top 1% are selected as flotation reagent molecules that meet the target performance and constraint conditions.
[0020] Note: The above method sequentially performs flotation recovery threshold determination, environmental safety constraint verification, synthesis feasibility verification, and multi-objective comprehensive scoring and ranking of candidate molecules, realizing a step-by-step refined screening from performance compliance to environmental safety and then to synthetic feasibility.
[0021] Furthermore, the multi-objective comprehensive score includes: obtaining the predicted flotation recovery rate, environmental friendliness score, and synthesis feasibility score for each candidate molecular structure in the third candidate molecule set; normalizing the performance indicators to obtain normalized values for flotation recovery, selectivity, environmental friendliness, and synthesis feasibility; and weighting and summing the normalized values based on preset weighting coefficients to obtain a comprehensive score for each candidate molecular structure.
[0022] Explanation: By eliminating dimensional differences between different performance indicators through normalization and using preset weighting coefficients for weighted summation, the quantitative unification and flexible weighting of multi-dimensional heterogeneous indicators such as flotation recovery rate, environmental friendliness, and synthesis feasibility are achieved.
[0023] Furthermore, the environmental friendliness score is determined based on the predicted values of biodegradability and ecotoxicity of the molecular structure; the synthesis feasibility score is determined based on the number of steps in the synthesis route, the availability rating of raw materials, and the complexity of the reaction conditions.
[0024] Note: The above method refines the environmental friendliness score into a comprehensive quantification of biodegradability and ecotoxicity prediction values, and decomposes the synthesis feasibility score into a multi-dimensional assessment of the number of synthesis route steps, raw material availability, and reaction condition complexity, thus achieving a refined and calculable characterization of environmental performance and synthesis difficulty.
[0025] The beneficial effects of this invention are as follows: the method of this invention automatically explores the potential molecular structure space of reagents through an artificial intelligence-driven molecular generation model. Combined with flotation performance prediction and a multi-objective constraint screening mechanism, it realizes efficient and intelligent design of flotation reagent molecular structures. Compared with the traditional design method that relies on human experience, this method significantly improves design efficiency and innovation. It can meet the performance requirements of flotation recovery rate and selectivity while taking into account environmental friendliness and synthetic feasibility. It effectively solves the technical problem that manual methods can hardly explore the huge molecular space systematically or achieve multi-objective synergistic optimization.
Description of the Attached Figures
[0026] Figure 1 is a flowchart of the design method of Embodiment 1 of the present invention.
Detailed Implementation
[0027] To further illustrate the methods and effects of this invention, the technical solution of this invention will be clearly and completely described below in conjunction with experiments.
[0028] Example 1: An automatic design method for the molecular structure of mineral processing flotation reagents based on artificial intelligence, comprising the following steps: S1. Construct a molecular generation model; the input of the molecular generation model is the string, atoms and bonds or molecular fingerprint and descriptor of the existing flotation reagent molecular structure, and the output is the candidate molecular structure; In this embodiment of the invention, the molecular generation model is a generative AI model, which is a graph neural network generation model or a sequence generation model.
[0029] The construction of the molecular generation model includes: 1) Obtaining existing molecular structure data of flotation reagents as training data. For example, molecular structure data of existing flotation reagents are collected from flotation reagent data sources (including PubChem, ChEMBL, internal laboratory data, and industrial field data). In this embodiment, molecular structure data on the order of 10^3 entries are collected for typical flotation reagents, including xanthates, dioxins, fatty acids, and amines. Structures are represented as SMILES strings; for example, the SMILES of isopropyl xanthate is CC(C)OC(=S)S. Molecular fingerprints are also generated, for example Morgan fingerprints (radius 2, 2048 bits) represented as vectors. 2) Constructing the molecular generation model architecture based on the training data. A hybrid architecture combining a variational autoencoder (VAE) and a graph neural network (GNN) is employed. For example, the molecular generation model first extracts and aggregates the features of the input flotation reagent molecular graph data (including node features such as atom type and charge, and edge features such as bond type) through a three-layer graph convolutional network (GCN) with a hidden dimension of 256. It then uses the variational autoencoder to encode the graph structure into a 128-dimensional latent vector that follows a normal distribution. Finally, a graph decoder reconstructs a new molecular graph structure from the latent vector. The final output is the SMILES string of a novel flotation reagent candidate molecule that conforms to chemical rules. 3) Training and validating the molecular generation model architecture using the training data to obtain the molecular generation model.
[0030] Specifically, the data are divided into a training set and a validation set, with 80% used for training and 20% for validation. The specific training and validation methods are similar to those of the existing technology. For example, the training parameters are a learning rate of 0.001, a batch size of 64, 500 training epochs, and an early-stopping patience of 50 epochs. The loss function includes a reconstruction loss, a KL divergence regularization term, and a chemical validity penalty. The validation metrics require a molecular validity of at least 95%, a uniqueness of at least 90%, and a novelty of at least 80%. After training, the model can generate novel flotation reagent molecular structures that conform to chemical rules. It should be understood that the above parameters are merely examples; conventional training methods in the field can also be used and are not elaborated here.
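As a rough illustration of the encoder described above, the three graph convolution layers and the 128-dimensional latent sampling can be written as follows. This is only a sketch: the embodiment does not state which graph convolution or sampling scheme is used, so the widely used Kipf-Welling propagation rule, the ReLU nonlinearity, the graph-level readout h_G, and the linear heads f_mu and f_sigma are all assumptions.

```latex
% Assumed GCN propagation (Kipf-Welling form), three layers, hidden width 256
H^{(l+1)} = \mathrm{ReLU}\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)}\,W^{(l)}\right),
\qquad l = 0, 1, 2, \qquad H^{(1)}, H^{(2)}, H^{(3)} \in \mathbb{R}^{n \times 256}

% Assumed VAE reparameterization into the 128-dimensional latent space
\mu = f_{\mu}(h_G), \quad \log \sigma^{2} = f_{\sigma}(h_G), \qquad
z = \mu + \sigma \odot \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, I_{128})
```

Here \tilde{A} = A + I is the adjacency matrix of the molecular graph with self-loops, \tilde{D} is its degree matrix, H^{(0)} holds the node features (atom type, charge), n is the number of atoms, and h_G is a pooled summary of H^{(3)}. The plain form above propagates node features only; how the bond-type features enter the convolution is not specified in the embodiment.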
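The three generation-quality thresholds above (at least 95%, 90%, and 80%) can be checked with a few lines of code. The sketch below is illustrative only: the function names and the `is_valid` callback are hypothetical, and it assumes the common convention that uniqueness is measured among valid molecules and novelty among unique valid molecules, which the embodiment does not spell out.

```python
from typing import Callable, Iterable

# Thresholds quoted in the embodiment (as fractions).
THRESHOLDS = {"validity": 0.95, "uniqueness": 0.90, "novelty": 0.80}

def generation_metrics(generated: Iterable[str],
                       training_set: Iterable[str],
                       is_valid: Callable[[str], bool]) -> dict:
    """Compute validity, uniqueness, and novelty of generated SMILES strings.

    `is_valid` is a hypothetical validity checker supplied by the caller
    (in practice a cheminformatics toolkit would parse each string).
    Strings are compared literally here; a real pipeline would first
    convert them to a canonical form.
    """
    generated = list(generated)
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }

def meets_thresholds(metrics: dict) -> bool:
    """True only if every metric reaches its threshold."""
    return all(metrics[name] >= limit for name, limit in THRESHOLDS.items())
```

A model checkpoint would be accepted only when `meets_thresholds(generation_metrics(...))` returns `True` on a sample of generated structures.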
[0031] S2. Set the target performance and constraints for the flotation reagent molecules; the target performance includes the flotation recovery rate and the selectivity of the flotation reagent for the target mineral; the constraints include environmental constraints, safety constraints, and synthesis constraints. The environmental constraints include avoiding low biodegradability and high ecotoxicity. The safety constraints include avoiding health toxicity and physicochemical hazards. The synthesis constraints include ensuring the feasibility of the synthesis route, the availability of raw materials, and the safety of the synthesis process.
[0032] For example, the target performance parameters are as follows: the predicted flotation recovery of the candidate flotation reagent under given flotation process conditions is not less than 85%. The candidate flotation reagent has good selectivity for the target mineral, with a selectivity index of not less than 12. The copper grade in the concentrate is not less than 25%.
[0033] The specific environmental constraints are as follows: avoid structural units characteristic of persistent organic pollutants such as polychlorinated biphenyls (PCBs) and polycyclic aromatic hydrocarbons (PAHs). The predicted biodegradability value must be no less than 0.3. The 96-hour median lethal concentration (LC50) for acute toxicity in fish must be greater than 10 mg / L.
[0034] The specific safety constraints are as follows: avoid highly reactive groups (such as nitro, azo, etc.) that may introduce safety risks. Avoid known carcinogenic structures such as the benzo[a]pyrene skeleton. The skin corrosivity and irritation classification must not be Class 1 or Class 2.
[0035] The specific constraints for synthesis are as follows: the molecular weight should be controlled between 150 and 500 g / mol to avoid reduced solubility due to excessive molecular weight. The number of synthesis steps should not exceed 5, calculated based on commercially available raw materials. Key intermediates must be available from mainstream supplier catalogs. Avoid using pressures exceeding 10 MPa, temperatures exceeding 200°C, or highly toxic reagents.
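The numeric targets and constraints listed for step S2 lend themselves to a small configuration object. The following is a minimal sketch that covers only a subset of the thresholds; the class name, field names, and candidate dictionary keys are hypothetical, and the structural rules (for example excluded substructures or supplier availability) are deliberately left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DesignTargets:
    """Numeric thresholds quoted in the embodiment (field names are illustrative)."""
    min_recovery_pct: float = 85.0
    min_selectivity_index: float = 12.0
    min_biodegradability: float = 0.3
    min_fish_lethal_conc_mg_l: float = 10.0  # must be strictly exceeded
    min_mol_weight: float = 150.0            # g/mol
    max_mol_weight: float = 500.0            # g/mol
    max_synthesis_steps: int = 5
    max_pressure_mpa: float = 10.0
    max_temperature_c: float = 200.0

def violations(c: dict, t: DesignTargets = DesignTargets()) -> list:
    """Return the names of all thresholds that candidate `c` fails."""
    checks = [
        ("recovery_pct", c["recovery_pct"] >= t.min_recovery_pct),
        ("selectivity_index", c["selectivity_index"] >= t.min_selectivity_index),
        ("biodegradability", c["biodegradability"] >= t.min_biodegradability),
        ("fish_lethal_conc_mg_l", c["fish_lethal_conc_mg_l"] > t.min_fish_lethal_conc_mg_l),
        ("mol_weight", t.min_mol_weight <= c["mol_weight"] <= t.max_mol_weight),
        ("synthesis_steps", c["synthesis_steps"] <= t.max_synthesis_steps),
        ("max_pressure_mpa", c["max_pressure_mpa"] <= t.max_pressure_mpa),
        ("max_temperature_c", c["max_temperature_c"] <= t.max_temperature_c),
    ]
    return [name for name, ok in checks if not ok]
```

Keeping the thresholds in one frozen object makes it straightforward to reuse the same settings in the later screening stage or to swap them for a different ore type.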
[0036] S3. Based on the molecular generation model, output multiple candidate molecular structures. Under typical model parameter settings and exemplary calculation conditions, a single round of generation can produce on the order of 10^3 to 10^4 structures; in this embodiment, approximately 9,500 chemically valid molecular structures are generated. Chemical validity is checked using cheminformatics tools to eliminate structures that cannot be parsed or that have unreasonable valences.
[0037] S4. Based on the target performance and constraints of the flotation reagent molecules, screen flotation reagent molecules from the plurality of candidate molecular structures, including: (1) Establish a flotation performance prediction model; the input of the flotation performance prediction model is the molecular structure of the candidate reagent molecules and the flotation process parameters, and the output is the predicted flotation recovery rate and the selectivity of the flotation reagent for the target mineral. The flotation performance prediction model adopts a deep learning model.
[0038] The establishment of the flotation performance prediction model includes: Obtain data on the molecular structure and flotation performance of known flotation reagents; The molecular structure of a known flotation reagent is converted into a numerical feature representation, which includes molecular fingerprints, molecular descriptors, and / or molecular graph data. Data on the molecular structure and flotation performance of known flotation reagents are preprocessed and divided to obtain a training set for training the model and a validation set for adjusting hyperparameters and preventing overfitting. Model training and validation are then performed to obtain a flotation performance prediction model.
[0039] For example, the specific process of establishing the flotation performance prediction model is as follows. Historical flotation test data are collected, covering the molecular structures of 320 known reagents and their corresponding flotation performance (recovery, grade, and selectivity). Feature engineering is then performed: a 2048-dimensional Morgan molecular fingerprint is generated; 12-dimensional molecular descriptors are extracted, such as molecular weight, oil-water partition coefficient, number of hydrogen bond donors and acceptors, polar surface area, and number of rotatable bonds; and 3-dimensional flotation process parameters are included, such as pulp pH, collector dosage, and grinding fineness. The model adopts a deep neural network architecture. The input layer receives the molecular features and process parameters. Three fully connected layers follow, with 1024, 512, and 256 neurons respectively, all using batch normalization, dropout regularization, and the rectified linear unit (ReLU) activation function. The output layer performs two regression tasks, recovery prediction and selectivity prediction, both using a root mean square error loss function. Training results show that the coefficient of determination for recovery prediction is 0.89 with a root mean square error of 3.2%, and the coefficient of determination for selectivity prediction is 0.85 with a root mean square error of 1.8. These accuracy figures illustrate the engineering prediction capability of the method of this invention and are exemplary results, not a limitation on the model's performance.
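The two accuracy figures reported above, the coefficient of determination and the root mean square error, follow from their standard definitions. The sketch below shows how they could be computed on a validation set; the function names are illustrative, and the sample numbers in the usage lines are made up purely for demonstration rather than taken from the embodiment.

```python
import math
from typing import Sequence

def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up recovery values (%), for demonstration only.
measured = [80.0, 85.0, 90.0, 95.0]
predicted = [82.0, 84.0, 91.0, 93.0]
print(round(rmse(measured, predicted), 3), round(r_squared(measured, predicted), 3))
```

Because recovery is expressed in percent, its error carries the same unit, whereas the selectivity error is a plain number on the scale of the selectivity index.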
[0040] (2) Input multiple candidate molecular structures into the flotation performance prediction model to obtain the flotation performance of each candidate molecular structure; (3) Based on the flotation performance of each candidate molecular structure, select flotation reagent molecules that meet the target performance and constraint conditions from multiple candidate molecular structures; including: The predicted flotation recovery rate of each candidate molecular structure is compared with a preset recovery rate threshold. Candidate molecular structures with predicted flotation recovery rates not lower than the preset recovery rate threshold are retained to obtain the first candidate molecule set. Each candidate molecule structure in the first candidate molecule set is checked for environmental and safety constraints, and candidate molecule structures containing environmental risk structural units or health toxicity structural units are eliminated to obtain the second candidate molecule set. For each candidate molecule structure in the second candidate molecule set, the synthesis constraints are checked, and the candidate molecule structures whose molecular weight is within a preset range and whose number of synthesis steps does not exceed a preset upper limit are retained to obtain the third candidate molecule set. A multi-objective comprehensive score is performed on each candidate molecular structure in the third candidate molecule set. Based on the score results, the candidate molecular structures are ranked, and the top 1% of candidate molecular structures are selected as flotation reagent molecules that meet the target performance and constraint conditions. 
The multi-objective comprehensive score includes: obtaining the predicted flotation recovery rate, environmental friendliness score, and synthesis feasibility score for each candidate molecular structure in the third candidate molecule set; normalizing the performance indicators to obtain normalized values for flotation recovery, selectivity, environmental friendliness, and synthesis feasibility; and weighting and summing the normalized values based on preset weighting coefficients to obtain a comprehensive score for each candidate molecular structure.
[0041] The environmental friendliness score is determined based on the predicted values of biodegradability and ecotoxicity of the molecular structure; the synthesis feasibility score is determined based on the number of steps in the synthesis route, the availability rating of raw materials, and the complexity of the reaction conditions.
[0042] For example, first, with the recovery threshold set at no less than 85%, 2100 molecules are retained from the 9500 candidate molecules to form the first candidate molecule set. Then, environmental and safety constraints are checked using a toxicity prediction model pre-trained on a publicly available toxicity database. Molecules with environmentally risky structures such as halogenated aromatics, polycyclic aromatic hydrocarbon cores, and nitro groups are excluded through pattern matching. The median lethal concentration (LC50) and median effective concentration (EC50) are predicted using a quantitative structure-activity relationship model, and highly toxic molecules are excluded. At this stage, approximately 450 molecules with risky structures are eliminated, leaving 1650 molecules to form the second candidate molecule set. Next, synthesis constraints are checked and verified with retrosynthetic analysis tools. The molecular weight check retains molecules in the range of 150 to 500 g / mol. Synthetic route planning limits the number of reaction steps to no more than 5, and the starting materials must be commercially available reagents. After verification, 1280 molecules are retained to form the third candidate molecule set. A multi-objective comprehensive scoring is then performed. The scoring system includes four indicators: flotation recovery (from the prediction model, weight 35%), selectivity index (from the prediction model, weight 25%), environmental friendliness (determined by the biodegradability and toxicity predictions, weight 20%), and synthetic feasibility (determined by the number of steps, raw material rating, and condition complexity, weight 20%).
The screening quantities, scoring results, and representative molecular structures mentioned above are exemplary technical results obtained under the conditions of this embodiment, used to illustrate the effectiveness of the high-throughput screening process, and do not constitute a limitation on specific values or molecular structure ranges.
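The staged screening described in this example can be sketched as three successive filters. This is an illustration rather than the embodiment's actual implementation: the dictionary keys are hypothetical, and the boolean flags stand in for the outputs of the pattern-matching, toxicity-prediction, and retrosynthesis tools, which are not reproduced here.

```python
def screen(candidates, recovery_min=85.0, mw_range=(150.0, 500.0), max_steps=5):
    """Return the first, second, and third candidate sets, in that order."""
    # Stage 1: predicted recovery not lower than the preset threshold.
    first = [c for c in candidates if c["recovery"] >= recovery_min]
    # Stage 2: drop molecules flagged for risky substructures or high toxicity.
    second = [c for c in first
              if not (c["risk_substructure"] or c["highly_toxic"])]
    # Stage 3: molecular weight within range and synthesis steps within limit.
    third = [c for c in second
             if mw_range[0] <= c["mol_weight"] <= mw_range[1]
             and c["steps"] <= max_steps]
    return first, second, third

# Toy candidates with made-up values, for demonstration only.
toy = [
    {"id": "a", "recovery": 91.0, "risk_substructure": False,
     "highly_toxic": False, "mol_weight": 221.3, "steps": 3},
    {"id": "b", "recovery": 80.0, "risk_substructure": False,
     "highly_toxic": False, "mol_weight": 250.0, "steps": 2},   # fails stage 1
    {"id": "c", "recovery": 88.0, "risk_substructure": True,
     "highly_toxic": False, "mol_weight": 300.0, "steps": 4},   # fails stage 2
    {"id": "d", "recovery": 87.0, "risk_substructure": False,
     "highly_toxic": False, "mol_weight": 620.0, "steps": 4},   # fails stage 3
]
first, second, third = screen(toy)
print(len(first), len(second), len(third))  # 3 2 1
```

Ordering the filters from cheapest to most expensive means the costly synthesis checks run only on molecules that have already passed the performance and safety gates, which mirrors the 9500, 2100, 1650, 1280 funnel reported in the example.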
[0043] All indicators are normalized by min-max scaling, i.e., the minimum value is subtracted from the value and the result is divided by the difference between the maximum and minimum values. The environmental friendliness score is calculated as 0.6 times the normalized value of biodegradability plus 0.4 times the normalized value of toxicity reduction. Biodegradability is predicted using a dedicated model that outputs a value from 0 to 1, with 1 indicating easy degradation. Ecotoxicity is predicted using a dedicated model that estimates the median lethal concentration (LC50) for fish. The synthetic feasibility score is calculated as 0.4 times (1 minus the number of steps divided by 5), plus 0.4 times the raw material rating, plus 0.2 times (1 minus the condition complexity divided by 10). The raw material rating is 1 for high availability, 0.5 for medium, and 0 for low.
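The scoring rules in this paragraph, together with the 35/25/20/20 weights from the preceding example, translate directly into code. The sketch below is one reading of the text: it interprets the step and complexity terms as (1 - steps/5) and (1 - complexity/10), and it assumes the condition complexity lies on a 0 to 10 scale, which the embodiment implies but does not state. All function names are illustrative.

```python
RAW_MATERIAL_RATING = {"high": 1.0, "medium": 0.5, "low": 0.0}

def min_max(values):
    """Min-max normalize a list of values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # avoid division by zero on constant input
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def environmental_score(biodeg_norm, tox_reduction_norm):
    """0.6 x normalized biodegradability + 0.4 x normalized toxicity reduction."""
    return 0.6 * biodeg_norm + 0.4 * tox_reduction_norm

def synthesis_score(steps, raw_rating, condition_complexity):
    """Assumed reading: 0.4(1 - steps/5) + 0.4 rating + 0.2(1 - complexity/10)."""
    return (0.4 * (1 - steps / 5)
            + 0.4 * RAW_MATERIAL_RATING[raw_rating]
            + 0.2 * (1 - condition_complexity / 10))

def comprehensive_score(recovery_norm, selectivity_norm, env, synth,
                        weights=(0.35, 0.25, 0.20, 0.20)):
    """Weighted sum of the four indicators using the example's weights."""
    w_rec, w_sel, w_env, w_syn = weights
    return (w_rec * recovery_norm + w_sel * selectivity_norm
            + w_env * env + w_syn * synth)
```

For instance, a three-step route from highly available raw materials with a condition complexity of 4 would score 0.4 x 0.4 + 0.4 x 1 + 0.2 x 0.6 = 0.68 under this reading.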
[0044] A comprehensive score was calculated for 1280 candidate molecules, and they were sorted in descending order of score. The top 1%, or 13 molecules, were selected as the preferred molecule set. The molecule ranked first among the preferred molecules is CC(C)CSC(=S)N(C)CC(=O)O, with a predicted recovery of 91.2%, selectivity of 14.5, environmental score of 0.85, synthesis score of 0.78, and comprehensive score of 0.89. The molecule ranked second is CCCCSC(=S)SCC(C)C, with a predicted recovery of 88.7%, selectivity of 13.2, environmental score of 0.82, synthesis score of 0.85, and comprehensive score of 0.86. The molecule ranked third is c1cc(ccc1CS(=O)(=O)O)C(=S)S, with a predicted recovery of 87.5%, selectivity of 15.1, environmental score of 0.79, synthesis score of 0.72, and comprehensive score of 0.84.
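The ranking and top-1% selection can be sketched as follows. Rounding the cutoff up is an assumption made here because 1% of 1280 is 12.8 and the embodiment selects 13 molecules; the function name is hypothetical.

```python
import math


def select_top_fraction(scored, fraction=0.01):
    """Sort (molecule, score) pairs by descending score and keep the top fraction.

    The cutoff is rounded up (assumption) and at least one molecule is kept.
    """
    k = max(1, math.ceil(len(scored) * fraction))
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```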
[0045] S5. Experimental verification: Thirteen candidate molecules were selected and subjected to laboratory synthesis and flotation experiments. Taking the top-ranked molecule as an example, the synthetic route consisted of three steps: thiolation, condensation, and hydrolysis, with a total yield of 67%. The structure was verified by 1H NMR, 13C NMR, and mass spectrometry. The flotation conditions were as follows: the ore sample was a copper ore containing 1.2% copper and 8% pyrite; the grinding fineness was 75% passing 74 μm; the pulp concentration was 30%; the pH was 9.5, adjusted with lime; the collector dosage was 80 g/t; and the frother dosage was 20 g/t.
[0046] The experimental results showed that molecule 1 had a predicted recovery rate of 91.2%, an actual recovery rate of 89.5%, a predicted selectivity of 14.5, an actual selectivity of 13.8, and a recovery prediction error of -1.8%. Molecule 2 had a predicted recovery rate of 88.7%, an actual recovery rate of 86.2%, a predicted selectivity of 13.2, an actual selectivity of 12.5, and a recovery prediction error of -2.8%.
[0047] Data feedback and model update: The experimental results for the 13 molecules were added to the training dataset, increasing the size of the training set from 320 to 333. The flotation performance prediction model was retrained using the expanded dataset. After the update, the model's coefficient of determination increased from 0.89 to 0.91, and the root mean square error decreased to 2.8%.
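The two metrics reported for the retrained model, the coefficient of determination and the root mean square error, follow their standard definitions. A minimal Python sketch is given below; the function name is hypothetical, and the inputs would be the measured and predicted recovery rates on a held-out set, which this embodiment does not list.

```python
import math


def r2_and_rmse(actual, predicted):
    """Return (coefficient of determination, root mean square error)."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual sum of squares: prediction error.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse
```

Computing both metrics on the same validation split before and after retraining makes comparisons such as the 0.89 to 0.91 change meaningful.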
[0048] Molecules with excellent experimental performance were used as positive samples, and molecules with poor performance were used as negative samples. A reinforcement learning algorithm was used to fine-tune the molecule generation model. The reward function was a comprehensive score, calculated as the sum of four terms: 0.4 × (actual recovery rate / 100), 0.3 × (actual selectivity / 20), 0.15 × (environmental score), and 0.15 × (synthesis score). After fine-tuning, the probability of the model generating high-recovery molecules in the next generation round increased by approximately 25%.
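The reward used for fine-tuning can be written as a single function. The sketch below is illustrative only: the function name is hypothetical, and the example call combines the measured recovery and selectivity of molecule 1 from paragraph [0046] with its environmental and synthesis scores from paragraph [0044].

```python
def reward(actual_recovery, actual_selectivity, env_score, synth_score):
    """Weighted reward: recovery is in percent, selectivity is scaled by 20."""
    return (0.4 * actual_recovery / 100
            + 0.3 * actual_selectivity / 20
            + 0.15 * env_score
            + 0.15 * synth_score)


# Molecule 1: 0.358 + 0.207 + 0.1275 + 0.117, i.e. about 0.81.
example = reward(89.5, 13.8, 0.85, 0.78)
```

Dividing the selectivity by 20 keeps that term at or below 0.3 for selectivity values up to 20, so each term is on a comparable scale before weighting.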
[0049] Iterative loop: Steps 3 through 5 were repeated for a total of 5 iterations. Each iteration generated 1000 candidate molecules, of which 10 to 15 preferred molecules were experimentally validated. As data accumulated, the accuracy of the prediction model continuously improved, reaching a coefficient of determination of 0.94 in the 5th iteration.
[0050] After five rounds of iterative optimization, three high-performance flotation reagent molecules were obtained. Among them, the optimal molecule achieved a copper recovery rate of 90.3% in an exemplary industrial verification scenario, which is 4.5 percentage points higher than that of traditional xanthates; a selectivity index of 15.2, which is 20% higher than that of traditional reagents; a biodegradability rate of 72% after 28 days, which meets the easy degradation standard; and a synthesis cost that is 12% lower than that of traditional reagents.
[0051] This embodiment achieves efficient exploration of the chemical space through a closed-loop process of generation, prediction, screening, verification, and optimization, generating over 9500 novel molecular structures from this space; accurately predicts flotation performance with prediction errors controlled within ±3%; performs multi-objective balance optimization, simultaneously optimizing flotation performance, environmental friendliness, and synthetic feasibility; and continuously evolves and improves, with model performance constantly increasing with data accumulation. This method shortens the traditional flotation reagent development cycle from 3 to 5 years to 8 to 12 months, and is expected to reduce development costs by more than 60%, providing key technical support for the construction of smart mines. The above experimental data, industrial test results, and performance improvement are exemplary verification results obtained under the conditions of this embodiment, used to illustrate the feasibility and potential technical effects of the method in engineering applications, and do not constitute a limitation on specific industrial indicators or application effects.
Claims
1. An automatic design method for the molecular structure of mineral processing flotation reagents based on artificial intelligence, characterized in that, The method includes the following steps: Construct a molecular generation model; the input of the molecular generation model is a string representation, atoms and bonds, or molecular fingerprints and descriptors of existing flotation reagent molecular structures, and the output is candidate molecular structures; Set the target performance and constraints of the flotation reagent molecules; the target performance includes the flotation recovery rate and the selectivity of the flotation reagent for target minerals; the constraints include environmental constraints, safety constraints, and synthesis constraints; Output multiple candidate molecular structures based on the molecular generation model; Screen flotation reagent molecules from the plurality of candidate molecular structures based on the target performance and constraints of the flotation reagent molecules.
2. The method for automatic design of molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 1, characterized in that, The molecular generation model is a generative AI model, which is either a graph neural network generation model or a sequence generation model.
3. The method for automatically designing the molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 2, characterized in that, The construction of the molecular generation model includes: Acquire existing molecular structure data of flotation reagents as training data; Construct a molecular generation model architecture based on the training data; Train and validate the molecular generation model architecture using the training data to obtain the molecular generation model.
4. The method for automatic design of molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 1, characterized in that, The environmental constraints include: ensuring biodegradability and avoiding ecotoxicity; The safety constraints include: avoiding health toxicity and avoiding physicochemical hazards; The synthesis constraints include ensuring the feasibility of the synthesis route, the availability of raw materials, and the safety of the synthesis process.
5. The method for automatic design of molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 1, characterized in that, The step of screening flotation reagent molecules from the plurality of candidate molecular structures based on the target performance and constraints of the flotation reagent molecules includes: A flotation performance prediction model is established; the input of the flotation performance prediction model is the molecular structure of the candidate reagent molecules and the flotation process parameters, and the output is the predicted flotation recovery rate and the selectivity of the flotation reagent to the target mineral; Multiple candidate molecular structures are input into the flotation performance prediction model to obtain the flotation performance of each candidate molecular structure; Based on the flotation performance of each candidate molecular structure, flotation reagent molecules that meet the target performance and constraint conditions are screened from multiple candidate molecular structures.
6. The method for automatically designing the molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 5, characterized in that, The flotation performance prediction model adopts a deep learning model.
7. The method for automatic design of molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 6, characterized in that, The establishment of the flotation performance prediction model includes: Obtain data on the molecular structure and flotation performance of known flotation reagents; The molecular structure of a known flotation reagent is converted into a numerical feature representation, which includes molecular fingerprints, molecular descriptors, and / or molecular graph data. Data on the molecular structure and flotation performance of known flotation reagents are preprocessed and divided to obtain a training set for training the model and a validation set for adjusting hyperparameters and preventing overfitting. Model training and validation are then performed to obtain a flotation performance prediction model.
8. The method for automatically designing the molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 5, characterized in that, The step of screening flotation reagent molecules that meet the target performance and constraint conditions from multiple candidate molecular structures based on the flotation performance of each candidate molecular structure includes: The predicted flotation recovery rate of each candidate molecular structure is compared with a preset recovery rate threshold. Candidate molecular structures with predicted flotation recovery rates greater than or equal to the preset recovery rate threshold are retained to obtain the first candidate molecule set. Each candidate molecule structure in the first candidate molecule set is checked for environmental and safety constraints, and candidate molecule structures containing environmental risk structural units or health toxicity structural units are eliminated to obtain the second candidate molecule set. For each candidate molecule structure in the second candidate molecule set, the synthesis constraints are checked, and the candidate molecule structures whose molecular weight is within a preset range and whose number of synthesis steps does not exceed a preset upper limit are retained to obtain the third candidate molecule set. A multi-objective comprehensive score is performed on each candidate molecular structure in the third candidate molecule set. The candidate molecular structures are ranked according to the score results, and the candidate molecular structures in the top 1% are selected as flotation reagent molecules that meet the target performance and constraint conditions.
9. The method for automatically designing the molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 8, characterized in that, The multi-objective comprehensive score includes: Obtain the predicted flotation recovery rate, environmental friendliness score, and synthesis feasibility score for each candidate molecular structure in the third candidate molecule set; Normalize the performance indicators to obtain normalized values for flotation recovery, selectivity, environmental friendliness, and synthesis feasibility; Based on preset weighting coefficients, weight and sum the normalized values to obtain a comprehensive score for each candidate molecular structure.
10. The method for automatically designing the molecular structure of mineral processing flotation reagents based on artificial intelligence as described in claim 8 or 9, characterized in that, The environmental friendliness score is determined based on the predicted values of biodegradability and ecotoxicity of the molecular structure; the synthesis feasibility score is determined based on the number of steps in the synthesis route, the availability rating of raw materials, and the complexity of the reaction conditions.