[0033] Embodiment 1
[0034] As shown in Figure 1, an embodiment of the present invention provides a flow chart of a method of constructing a plant metabolite database, the method comprising:
[0035] Step S11. Export all plant metabolite data from a public database. Optionally, the public database may be Metlin, HMDB, MassBank, etc. Taking HMDB as an example, Python is used to export every layer of information for all metabolites on the HMDB website; the DISPOSITION layer of each metabolite is located, the Biological column in that layer is found, and the compound is confirmed to be of plant origin if it carries Plant information.
[0036] Step S12. Screen the exported data against preset screening conditions to obtain a plant metabolite data set. Specifically, the exported data are screened based on elemental composition, molecular weight, status, the number of nitrogen atoms, the number of sulfur atoms, and/or the number of phosphorus atoms.
[0037] In preferred embodiments of the present embodiment, the preset screening conditions include one or more of the following:
First screening condition: the compound must not be an elemental substance; all single-element substances are removed.
Second screening condition: the molecular weight of the compound should be less than 1500.
Third screening condition: the record of the compound in the STATUS layer information should be detected, or quantified, or detected and quantified.
Fourth screening condition: the number of nitrogen atoms in the compound should be less than or equal to 7.
Fifth screening condition: the number of sulfur atoms in the compound should be less than or equal to 2.
Sixth screening condition: the number of phosphorus atoms in the compound should be less than or equal to 3.
Seventh screening condition: when the compound contains phosphorus and the number of phosphorus atoms is 1, the number of oxygen atoms should be greater than or equal to 4.
Eighth screening condition: when the compound contains phosphorus and the number of phosphorus atoms is 2, the number of oxygen atoms should be greater than or equal to 7.
Ninth screening condition: when the compound contains phosphorus and the number of phosphorus atoms is 3, the number of oxygen atoms should be greater than or equal to 9.
Tenth screening condition: when the compound contains no phosphorus atom, the sum of the numbers of nitrogen atoms and oxygen atoms should be less than or equal to the number of carbon atoms.
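As a non-limiting illustration, the ten screening conditions can be sketched in Python as a single predicate over a molecular formula, a molecular weight and a status string. The function names, the simple formula parser (no brackets or charges) and the exact spelling of the accepted status strings are assumptions made for this sketch; they are not taken from the HMDB export itself.

```python
import re
from collections import Counter

# Assumed spellings of the accepted STATUS values (third condition).
ACCEPTED_STATUS = {"detected", "quantified", "detected and quantified"}
# Minimum oxygen count required for 1, 2 or 3 phosphorus atoms (conditions 7 to 9).
MIN_OXYGEN_FOR_PHOSPHORUS = {1: 4, 2: 7, 3: 9}


def parse_formula(formula):
    """Count atoms in a plain formula such as 'C6H12O6' (no brackets, no charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def passes_screening(formula, molecular_weight, status):
    """Return True if a compound satisfies all ten screening conditions."""
    atoms = parse_formula(formula)
    if len(atoms) < 2:  # condition 1: not an elemental substance
        return False
    if molecular_weight >= 1500:  # condition 2
        return False
    if status.strip().lower() not in ACCEPTED_STATUS:  # condition 3
        return False
    if atoms["N"] > 7 or atoms["S"] > 2 or atoms["P"] > 3:  # conditions 4 to 6
        return False
    if atoms["P"] > 0:  # conditions 7 to 9
        return atoms["O"] >= MIN_OXYGEN_FOR_PHOSPHORUS[atoms["P"]]
    return atoms["N"] + atoms["O"] <= atoms["C"]  # condition 10
```

Under these assumptions, a hexose such as C6H12O6 passes (N + O = 6 does not exceed C = 6), whereas an elemental entry such as S8 is rejected by the first condition.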
[0038] Further, after the screening is completed, an information list of the target plant compounds and the SDF structure file of each compound are obtained, and all SDF files are merged to form a data set comprising 6000+ plant metabolites.
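As a non-limiting illustration, merging per-compound SDF files can be sketched as a plain concatenation, relying on the SDF convention that each record ends with a line containing only `$$$$`. The function name `merge_sdf` and the handling of files that lack a terminator are assumptions of this sketch.

```python
from pathlib import Path

RECORD_END = "$$$$"  # SDF record terminator line


def merge_sdf(sdf_paths, merged_path):
    """Concatenate SDF files into one file and return the number of records written."""
    records = 0
    with open(merged_path, "w", encoding="utf-8") as merged:
        for path in sdf_paths:
            lines = Path(path).read_text(encoding="utf-8").splitlines()
            while lines and not lines[-1].strip():
                lines.pop()  # drop trailing blank lines
            if not lines:
                continue  # skip empty files
            if lines[-1].strip() != RECORD_END:
                lines.append(RECORD_END)  # assumed repair for a missing terminator
            merged.write("\n".join(lines) + "\n")
            records += sum(1 for line in lines if line.strip() == RECORD_END)
    return records
```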
[0039] Step S13. Collect and process plant tissue samples to extract mass spectrum data that match the plant metabolite data set. Specifically, tissue samples of the roots, stems, leaves, flowers and/or fruits of preselected plants are collected and pre-treated; the pre-treatment methods include solid phase extraction, restricted-access media, immersion solid-phase extraction, ultrafiltration or immunoaffinity. Liquid chromatography-mass spectrometry is then used to acquire sample mass spectrum data and sample chromatographic data from the pre-treated plant tissue samples. The sample mass spectrum data are mapped against the plant metabolite data set, wherein the mapping is performed based on the positive ion candidate adduct forms, the negative ion candidate adduct forms, the precursor ion mass deviation range, the fragment ion mass deviation range, the total mapping match score range and/or the fragment match score, and the mass spectrum data that match the plant metabolite data set are finally extracted.
[0040] In some examples, stems, leaves, flowers or fruits of 23 common plants such as wheat, sand, sunflower, rape and blueberry are collected, and the pre-treatment is summarized as follows:
[0041] A. Weigh 80 mg of sample, add 20 μL of internal standard (L-2-chlorophenylalanine, 0.3 mg/mL; Lyso PC17:0, 0.01 mg/mL) and 600 μL of methanol-water (V1:V2 = 7:3).
[0042] B. Add two small steel beads, pre-cool at -20 °C for 2 min, and grind in a grinder (60 Hz, 2 min).
[0043] C. Extract in an ice-water bath for 30 min, then hold at -20 °C for 20 min.
[0044] D. Centrifuge for 10 min (13000 rpm, 4 °C), and transfer all of the supernatant into a 1.5 mL EP tube.
[0045] E. Add another 400 μL of methanol-water (V3:V4 = 7:3) to the residue.
[0046] F. Extract ultrasonically in an ice-water bath for 20 min, then hold at -20 °C for 20 min.
[0047] G. Centrifuge for 10 min (13000 rpm, 4 °C), and combine all of the supernatant with the supernatant from step D, giving 1 mL of supernatant in total.
[0048] H. Take 300 μL, filter through a membrane filter, and transfer to a vial.
[0049] I. Take 300 μL of the supernatant, evaporate to dryness, redissolve in 300 μL of pure water, centrifuge, filter the upper layer, and transfer to a vial.
[0050] J. Store the remaining 400 μL of supernatant in a -80 °C freezer.
[0051] Further, data are acquired from the above tissue samples on a liquid chromatography-mass spectrometer (such as the AB 6600PLUS and Thermo QE instruments) to obtain the mass spectrum data and chromatographic data of the plant tissue samples. The mass spectrum data of the plant tissue samples are then analyzed (for example, using Waters' Progenesis QI analysis software) with the following settings: positive ion candidate adduct forms [M+H]+, M+, [2M+H]+, [M+K]+, [M+Na]+, [M+NH4]+ and [M-H2O+H]+; negative ion candidate adduct forms [2M-H]-, [M-H2O-H]-, [M+FA-H]-, [M+Cl]- and [M-H]-; precursor ion mass deviation ≤ 5 ppm; fragment ion mass deviation ≤ 10 ppm; total mapping match score ≥ 40; and fragment match score ≥ 10. The data are mapped against the data set of 6000+ plant metabolites.
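As a non-limiting illustration of how the candidate adduct forms and the precursor mass deviation interact, the following Python sketch computes the theoretical m/z of each listed adduct from a neutral monoisotopic mass and checks an observed m/z against a ppm tolerance. The function names are assumptions of this sketch, all adducts are treated as singly charged, and the calculation is not a description of how the analysis software itself performs the mapping.

```python
# Monoisotopic masses in Da (standard values).
ELECTRON = 0.00054858
H, C, N, O = 1.00782503, 12.0, 14.00307401, 15.99491462
NA, K, CL = 22.98976928, 38.96370649, 34.96885268

PROTON = H - ELECTRON
WATER = 2 * H + O
FORMIC_ACID = C + 2 * H + 2 * O  # FA = HCOOH

# adduct label -> (number of M units, mass shift); all singly charged
POSITIVE_ADDUCTS = {
    "[M+H]+": (1, PROTON),
    "M+": (1, -ELECTRON),
    "[2M+H]+": (2, PROTON),
    "[M+K]+": (1, K - ELECTRON),
    "[M+Na]+": (1, NA - ELECTRON),
    "[M+NH4]+": (1, N + 4 * H - ELECTRON),
    "[M-H2O+H]+": (1, PROTON - WATER),
}
NEGATIVE_ADDUCTS = {
    "[2M-H]-": (2, -PROTON),
    "[M-H2O-H]-": (1, -WATER - PROTON),
    "[M+FA-H]-": (1, FORMIC_ACID - PROTON),
    "[M+Cl]-": (1, CL + ELECTRON),
    "[M-H]-": (1, -PROTON),
}


def adduct_mz(neutral_mass, adduct, table):
    """Theoretical m/z of a singly charged adduct of a neutral molecule."""
    multiplier, shift = table[adduct]
    return multiplier * neutral_mass + shift


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def matching_adducts(observed_mz, neutral_mass, table, tolerance_ppm=5.0):
    """Adduct labels whose theoretical m/z lies within the ppm tolerance."""
    return [
        adduct
        for adduct in table
        if abs(ppm_error(observed_mz, adduct_mz(neutral_mass, adduct, table))) <= tolerance_ppm
    ]
```

The default tolerance of 5 ppm mirrors the precursor ion setting above; passing `tolerance_ppm=10.0` would correspond to the fragment ion setting.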
[0052] In a preferred embodiment of the present embodiment, the plant metabolite data set is also matched against the biological source information of the plant tissue sample, and the matching result is positively correlated with the total mapping match score. For example, the biological source information of a candidate compound is matched against the tissue sample from which the spectrum data were acquired; if the candidate compound comes from the same subject as the tissue sample, 5 is added to the total mapping match score.
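As a non-limiting illustration, the source-based bonus and the two score thresholds can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the bonus of 5 is added to the total score before the thresholds (total ≥ 40, fragment ≥ 10) are checked, which the description does not state explicitly.

```python
SOURCE_BONUS = 5          # added when the biological source matches
MIN_TOTAL_SCORE = 40      # total mapping match score threshold
MIN_FRAGMENT_SCORE = 10   # fragment match score threshold


def adjusted_total_score(total_score, candidate_sources, sample_source):
    """Add the bonus when the sample's source is among the candidate's recorded sources."""
    known_sources = {source.lower() for source in candidate_sources}
    if sample_source.lower() in known_sources:
        return total_score + SOURCE_BONUS
    return total_score


def keep_candidate(total_score, fragment_score, candidate_sources, sample_source):
    """Apply both thresholds after the source bonus (assumed order of operations)."""
    total = adjusted_total_score(total_score, candidate_sources, sample_source)
    return total >= MIN_TOTAL_SCORE and fragment_score >= MIN_FRAGMENT_SCORE
```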
[0053] Further, the spectral information corresponding to the successfully matched metabolites is exported in the form of a data matrix, and the collected summary is saved as an MSP file that stores the mass spectrum information of the 6000+ plant metabolites.
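As a non-limiting illustration, writing matched spectra to an MSP text file can be sketched in Python as follows. MSP is a plain-text format in which each record starts with a `Name:` line and ends with a `Num Peaks:` line followed by one m/z and intensity pair per line; the additional fields chosen here (`PrecursorMZ`, `Precursor_type`, `Formula`) and the function names are assumptions of this sketch, and the fields actually expected by a given analysis software may differ.

```python
def format_msp_record(name, precursor_mz, adduct, formula, peaks):
    """Format one spectrum as an MSP record; peaks is a list of (mz, intensity) pairs."""
    lines = [
        f"Name: {name}",
        f"PrecursorMZ: {precursor_mz:.4f}",
        f"Precursor_type: {adduct}",
        f"Formula: {formula}",
        f"Num Peaks: {len(peaks)}",
    ]
    # One peak per line, ordered by increasing m/z.
    lines += [f"{mz:.4f} {intensity:g}" for mz, intensity in sorted(peaks)]
    return "\n".join(lines) + "\n"


def write_msp(records, path):
    """Write records (dicts with the keyword arguments above), separated by blank lines."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(format_msp_record(**record) for record in records))
```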
[0054] Step S14. Chromatographic data that match the plant metabolite data set are obtained based on the retention times of the standards and of the plant tissue samples. Specifically, based on the raw data matrix of the plant tissue samples corresponding to the matched metabolites described above (optionally using Waters' Progenesis QI analysis software), the identifiers of the successfully matched metabolites and their corresponding retention times are exported as a CSV retention time list. Likewise, the raw data matrix of the standards is analyzed to form a CSV retention time list of the standard metabolites and their retention times. The two retention time lists are then merged to obtain the complete chromatographic data of the 6000+ plant metabolites.
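As a non-limiting illustration, merging the two CSV retention time lists can be sketched in Python as follows. The column names `compound_id` and `retention_time_min` are hypothetical, and the rule that a retention time measured on a standard overrides one derived from a tissue sample for the same compound is an assumption of this sketch.

```python
import csv
import io


def read_retention_times(csv_text):
    """Parse CSV text with 'compound_id' and 'retention_time_min' columns into a dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["compound_id"]: float(row["retention_time_min"]) for row in reader}


def merge_retention_times(sample_csv, standard_csv):
    """Merge both lists; values from standards take precedence (assumed policy)."""
    merged = read_retention_times(sample_csv)
    merged.update(read_retention_times(standard_csv))
    return merged
```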
[0055] Step S15. The plant metabolite database is constructed based on the plant metabolite data set, the mass spectrum data and the chromatographic data. In some examples, Waters' Progenesis QI analysis software is used to load the data set of 6000+ plant metabolites, the MSP file of the spectral information and the CSV file of the chromatographic information, which are integrated to form the complete plant metabolite database. Preferably, the biological source information corresponding to each compound in the database is compiled into a separate Excel sheet for convenient lookup. To use the database, it is loaded in the QI analysis software with the retention time deviation set to ≤ 0.1 min, the precursor ion mass deviation set to ≤ 100 ppm, and the fragment ion mass deviation set to ≤ 10 ppm.
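As a non-limiting illustration of how the three tolerances act together when the database is used, the following Python sketch filters library entries by retention time and precursor m/z and then counts matching fragment ions. The `LibraryEntry` structure, the function names and the ranking by number of matched fragments are assumptions of this sketch; the scoring actually performed by the analysis software is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryEntry:
    name: str
    precursor_mz: float
    retention_time_min: float
    fragment_mzs: list = field(default_factory=list)


def within_ppm(observed, reference, tolerance_ppm):
    """True if the observed value lies within the ppm tolerance of the reference."""
    return abs(observed - reference) / reference * 1e6 <= tolerance_ppm


def search_library(library, precursor_mz, retention_time_min, fragment_mzs,
                   rt_tolerance_min=0.1, precursor_tolerance_ppm=100.0,
                   fragment_tolerance_ppm=10.0):
    """Return (name, matched fragment count) pairs, best match first."""
    hits = []
    for entry in library:
        if abs(retention_time_min - entry.retention_time_min) > rt_tolerance_min:
            continue  # retention time outside the allowed deviation
        if not within_ppm(precursor_mz, entry.precursor_mz, precursor_tolerance_ppm):
            continue  # precursor ion outside the allowed deviation
        matched = sum(
            any(within_ppm(observed, reference, fragment_tolerance_ppm)
                for observed in fragment_mzs)
            for reference in entry.fragment_mzs
        )
        hits.append((entry.name, matched))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Because the retention time filter is applied first, two library entries with identical precursor and fragment ions but different retention times yield at most one hit for a given feature.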
[0056] To further illustrate the advantages of the plant metabolite database (self-built library) constructed in the present invention, raw data of blueberry seedlings were acquired according to the plant tissue sample procedure described above, and the raw data of the blueberry seedlings were searched in the QI software against both a public library and the self-built library.
[0057] Figures 2A and 2B are schematic diagrams of the traceability of the blueberry seedling annotations in the public library and the self-built library, respectively. In Figure 2A, for the public library, some of the metabolites marked in black are predicted compounds that have never been reported in plants; some are derived from animal-specific metabolic pathways; and some are non-natural products that originate from environmental pollutants or from pharmaceuticals not synthesized by plants. Figure 2B shows the traceability for the self-built library. In clear contrast to the public library, the annotated compounds are all natural plant metabolites that can be traced back to HMDB web page records or to the literature.
[0058] Figures 3A and 3B are schematic diagrams of the spectrum matching of blueberry seedlings in the public library and the self-built library, respectively. The comparison shows that the spectrum match for blueberry seedlings in the self-built library (Figure 3B) is higher. The resulting spectral and chromatographic information is richer, with completely matched fragment ion information, parent ion values accurate to the decimal places, and a retention time error of no more than 0.1 min.
[0059] Figures 4A and 4B show the annotation results for an isomer in one example in the public library and the self-built library, respectively. In Figure 4A, the annotation results from the public library contain candidate compounds that are substantially identical and therefore difficult to distinguish. In Figure 4B, a comparison along the retention time dimension allows the annotated target compound to be distinguished easily.
[0060] Figures 5A and 5B are schematic diagrams of the file contents of the public library and the self-built library, respectively. In Figure 5A, the public library, opened with Notepad, is relatively sparse: apart from the spectrum matrix information, it contains only basic information such as the compound molecular formula, InChIKey and classification. In Figure 5B, the self-built library, when opened, contains hyperlinks in addition to the spectrum information, compound molecular formula, compound adduct form, InChIKey, classification, etc., which makes lookup convenient.
[0061] Table 1 compares the annotation results for different plant tissue samples in the public library and the self-built library. Analysis of the search results for the different plant tissue samples shows that, although fewer compounds are annotated with the self-built library than with the public library, only about 35% of the compounds annotated with the public library are plant metabolites, so its results contain too many false positives. In contrast, the self-built library avoids this problem and ensures that the annotated results are plant-derived metabolites.
[0062] Table 1. Comparison of annotation results for different plant tissue samples in the public library and the self-built library
[0063]
[0064] In some embodiments, the method can be applied to a controller, such as an ARM (Advanced RISC Machines) controller, an FPGA (Field Programmable Gate Array) controller, an SoC (System on Chip) controller, a DSP (Digital Signal Processing) controller, or an MCU (Microcontroller Unit) controller. In some embodiments, the method can also be applied to a computer comprising components such as a memory, a memory controller, one or more processing units (CPUs), peripheral interfaces, RF circuits, audio circuitry, speakers, microphones, input/output (I/O) subsystems, a display, other output or control devices, and external ports; such computers include, but are not limited to, desktop computers, laptops, tablets, smartphones, smart TVs and personal digital assistants (PDAs). In other embodiments, the method can also be applied to a server, which may be deployed on one or more physical servers depending on function, load and other factors, or may consist of a distributed or centralized server cluster.