Development and application of molecular marker for major-effect QTL qHP3.7 of maize kernel protein
By locating the major-effect QTL qHP3.7 on maize chromosome 3 and developing PARMS marker primers, the problem of low protein content in maize kernels was solved, enabling efficient screening and improvement of maize kernel protein content, and increasing the protein content and yield of maize varieties.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INST OF FOOD CROPS HUBEI ACAD OF AGRI SCI
- Filing Date
- 2026-05-13
- Publication Date
- 2026-06-23
AI Technical Summary
Existing maize varieties have low kernel protein content and an unbalanced amino acid composition, which raises livestock and poultry farming costs, leaves high-quality protein resources in short supply, and restricts the upgrading of the maize deep-processing industry.
The major QTL qHP3.7 was located on chromosome 3 of maize using genome-wide association analysis, and PARMS marker primers closely linked to it were developed for marker-assisted selection breeding to screen for high-protein maize varieties.
It enables efficient screening and improvement of maize kernel protein content, enhances the protein content and yield of maize varieties, and provides genetic resources and selection targets for the creation of high-protein varieties.
Abstract
Description
Technical Field
[0001] This invention belongs to the field of molecular biology, specifically relating to the acquisition of the major QTL qHP3.7 of maize kernel protein, and the development and application of its molecular marker primers.
Background Technology
[0002] Maize is a globally important food, feed, and industrial raw material crop. The protein content of its kernels is a core trait determining its feed efficiency, nutritional quality, and industrial added value. Common maize kernels have low protein content and an unbalanced amino acid composition, so livestock and poultry farmers must supplement feed with large amounts of soybean meal, which significantly increases farming costs. At the same time, maize protein is a high-quality raw material for high-value-added industries such as food, pharmaceuticals, and textiles, and the insufficient supply of high-quality protein resources restricts the upgrading and value enhancement of the maize deep-processing industry. However, few functional genes regulating kernel protein have been clearly identified in maize, and understanding of the molecular mechanisms of maize kernel protein formation remains limited. Therefore, accurately identifying loci underlying the high-protein kernel trait supports the creation of new high-yield, high-quality maize varieties and provides targets for breeding high-protein varieties, which has significant production and scientific research value.
[0003] On this basis, 567 backbone inbred lines were collected from the maize-producing areas of Southwest China and the Huang-Huai-Hai Plain to establish a population of superior germplasm with diverse characteristics. Whole-genome resequencing was completed at a depth of 20X on the DNBSEQ-T7 / PE150 sequencing platform, and kernel protein content was determined with a near-infrared spectroscopy analyzer. Superior allelic variations were then identified through association analysis, and specific functional markers were developed and used to create intermediate breeding materials, providing superior allelic resources for high-protein molecular breeding of maize.
Summary of the Invention
[0004] The purpose of this invention is to provide a reagent for detecting the base at position 183842554 of chromosome 3 of the maize genome and its application in screening breeding for maize kernel protein content.
[0005] Another objective of this invention is to provide the application of a reagent for detecting the base at position 183842554 of chromosome 3 of the maize genome in the preparation of a screening kit for maize kernel protein content.
[0006] The final objective of this invention is to provide a method for screening breeding of maize kernel protein content.
[0007] To achieve the above objectives, the present invention is implemented through the following technical solution:
[0008] Obtaining the major QTL qHP3.7 controlling maize kernel protein:
[0009] 1) Through genome-wide association analysis, this invention detected the major QTL qHP3.7, which regulates maize kernel protein, on chromosome 3 of maize. This region contains 5 tightly linked SNP sites with a total length of 80.658 Kb (Chr3:183761896-183842554).
[0010] 2) The applicant developed a closely linked functional molecular marker, a PARMS marker for the optimal haplotype, at physical location Chr3:183842554. The PARMS marker primers are: qHP3.7R: AGCTGAGAGATCCGGCAGG, qHP3.7F1 (low-protein haplotype G): GAAGGTGACCAAGTTCATGCT CAAAGAAGCCCTGCTGGTG and qHP3.7F2 (high-protein haplotype T): GAAGGTCGGAGTCAACGGATT CAAAGAAGCCCTGCTGGTT.
[0011] 3) By analyzing the kernel protein phenotypes of the 567 maize inbred lines under seven environmental conditions, the applicant verified that the superior haplotype of qHP3.7 explained 10.23% of the phenotypic variation in maize kernel protein in the natural population and was strongly selected in breeding practice. This superior haplotype provides genetic resources for the creation of high-protein maize lines.
[0012] The scope of protection of this invention also includes:
[0013] Application of reagents for detecting base 183842554 on chromosome 3 of the maize genome in screening breeding for maize kernel protein content trait.
[0014] Application of reagents for detecting base 183842554 on chromosome 3 of the maize genome in the preparation of screening kits for maize kernel protein content traits.
[0015] In the above-described application, if a homozygote is detected with a base T at position 183842554 on chromosome 3 of the maize genome, the maize is determined to be a high-protein sample with a protein content greater than or equal to 12%. If a homozygote is detected with a base G at position 183842554 on chromosome 3 of the maize genome, the maize is determined to be a low-protein sample with a protein content less than 12%.
[0016] Application of reagents for detecting base 183842554 on chromosome 3 of the maize genome in screening breeding for maize kernel protein content and yield traits.
[0017] In the above-described application, if a homozygote is detected with a base of T at position 183842554 on chromosome 3 of the maize genome, the maize is determined to have a protein content greater than or equal to 12% and a low yield. If a homozygote is detected with a base of G at position 183842554 on chromosome 3 of the maize genome, the maize is determined to have a protein content less than 12% and a high yield.
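The decision rule stated in the applications above can be written down as a small lookup. The following is a minimal sketch, not part of the patent: the function and type names are hypothetical, and heterozygous or failed calls return `None` because the text only assigns classes to the two homozygotes.

```python
from typing import NamedTuple, Optional

PROTEIN_CUTOFF_PERCENT = 12.0  # threshold stated in the text


class Call(NamedTuple):
    protein_class: str  # "high" (>= 12%) or "low" (< 12%)
    yield_class: str    # "low" or "high", as paired with protein in the text


def classify_by_genotype(genotype: str) -> Optional[Call]:
    """Map a genotype at Chr3:183842554 to the classes described above.

    Only T/T and G/G are covered by the text; anything else returns None
    rather than a guessed class.
    """
    alleles = genotype.upper().replace("/", "").replace(" ", "")
    if alleles == "TT":
        return Call(protein_class="high", yield_class="low")
    if alleles == "GG":
        return Call(protein_class="low", yield_class="high")
    return None
```

For example, `classify_by_genotype("T/T")` yields the high-protein, low-yield class, while `classify_by_genotype("G/T")` yields `None`.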
[0018] In the above applications, the preferred reagent is a primer.
[0019] The primers described above are preferably PARMS detection primers, and more preferably the primers provided by this invention: qHP3.7R: AGCTGAGAGATCCGGCAGG, qHP3.7F1 (low-protein haplotype G): GAAGGTGACCAAGTTCATGCT CAAAGAAGCCCTGCTGGTG and qHP3.7F2 (high-protein haplotype T): GAAGGTCGGAGTCAACGGATT CAAAGAAGCCCTGCTGGTT.
[0020] A method for screening and breeding maize kernel protein content includes detecting base 183842554 on chromosome 3 of the maize genome using conventional methods in the art. These conventional methods include, but are not limited to: sequencing, TaqMan probe method, AS-PCR method, molecular beacon method, high-resolution melting curve method, CAPS method, SNaPshot method, KASP method, PARMS method, gene chip method, and mass spectrometry.
[0021] The reference genome of maize used in this invention is Zm-B73-REFERENCE-NAM-5.0.
[0022] Compared with the prior art, the present invention has the following advantages:
[0023] This invention is the first to finely map a novel major QTL controlling maize kernel protein based on genome-wide association analysis. The major QTL is located on chromosome 3 and contains five closely linked SNP loci spanning 80.658 kb (Chr3:183761896-183842554). These loci belong to the major QTL controlling phenotypic variation in maize kernel protein content, which explains 10.23% of the phenotypic variation. The PARMS marker developed from its optimal allele can be used for marker-assisted selection breeding.
Attached Figure Description
[0024] Figure 1 Box plots showing the normal distribution of maize kernel protein content in 567 inbred line populations under 7 environmental conditions.
[0025] Figure 2 is a distribution map of BLUP values for grain protein content in the association population.
[0026] Figure 3 is a schematic diagram of the association analysis at the qHP3.7 locus in the Ezhou, Hubei (EZ) environment in 2023;
[0027] Association analysis between the grain protein content phenotype of 567 inbred lines and 13.2 million polymorphic sites with a minor allele frequency greater than 0.05, shown at the qHP3.7 locus. Each dot represents a polymorphic site.
[0028] Figure 4 is a schematic diagram of the association analysis at the qHP3.7 locus in the Shihezi, Xinjiang (XJ) environment in 2024;
[0029] Association analysis between the grain protein content phenotype of 567 inbred lines and 13.2 million polymorphic sites with a minor allele frequency greater than 0.05, shown at the qHP3.7 locus. Each dot represents a polymorphic site.
[0030] Figure 5 is a schematic diagram of the association analysis at the qHP3.7 locus based on BLUP values of grain protein content.
[0031] Association analysis between the grain protein content phenotype of 567 inbred lines and 13.2 million polymorphic sites with a minor allele frequency greater than 0.05, shown at the qHP3.7 locus. Each dot represents a polymorphic site.
[0032] Figure 6 A schematic diagram illustrating the superior haplotype effect of qHP3.7;
[0033] A comparative analysis of protein content was conducted on 105 inbred lines with the qHP3.7 genotype (haplotype T) and 450 inbred lines with the qhp3.7 genotype (haplotype G). Each box represents the median and interquartile range, extended to the maximum and minimum values. The significance of the differences was estimated by one-way ANOVA.
[0034] Figure 7 A stratified haplotype frequency analysis plot of the qHP3.7 protein content gradient.
[0035] Figure 8 A schematic diagram illustrating the genetic effects of qHP3.7 on other traits;
[0036] Scatter dots represent the distribution of protein content in families. qHP3.7 represents the favorable haplotype T of qHP3.7, and qhp3.7 represents the unfavorable haplotype G of qHP3.7. Each box represents the median and interquartile range, extended to the maximum and minimum values. Error bars represent SD. The significance of differences was estimated by one-way ANOVA.
[0037] Figure 9 A schematic diagram illustrating the development and utilization of the optimal haplotype functional marker for qHP3.7;
[0038] In the diagram: green dots indicate that the qHP3.7F2 family has high protein content in its grains, while blue dots indicate that the qHP3.7F1 family has low protein content in its grains.
Detailed Implementation
[0039] Unless otherwise specified, the technical solutions described in this invention are conventional solutions in the field; unless otherwise specified, the reagents or materials described are all publicly available.
[0040] The reference genome for maize in this invention is Zm-B73-REFERENCE-NAM-5.0 (MaizeGDB GenomeCenter).
[0041] Example 1:
[0042] Obtaining the major QTL qHP3.7 of corn kernel protein:
[0043] 1. Materials and Methods
[0044] 1.1 Materials
[0045] A total of 567 backbone inbred lines from maize producing areas in Southwest China and the Huang-Huai-Hai Plain were collected, and a population of superior germplasm with different characteristics was established. The whole genome was resequenced at a depth of 20X on the DNBSEQ-T7 / PE150 sequencing platform, and 13.2 million high-quality SNP markers were obtained.
[0046] 1.2 Experimental Methods
[0047] 1.2.1 Phenotypic Identification
[0048] The protein content of corn kernels was determined using a near-infrared spectroscopy analyzer. Each sample was measured twice, and the average value was taken.
[0049] 1.2.1 Genome-wide association analysis of maize kernel protein loci
[0050] Raw sequencing data were first subjected to quality control to obtain high-quality clean data, which were then aligned to the reference genome for variant detection. The reference genome used was AGPv5, downloaded from http://plants.ensembl.org/Zea_mays/Info/Index. BWA was used to align the PE reads to the B73 reference genome sequence, producing alignments in SAM format. The SAM file was converted to BAM format with samtools. The reads in the BAM file were then sorted with the SortSam tool in Picard, and PCR duplicates were removed to obtain the final BAM file for variant calling. Finally, the HaplotypeCaller module of GATK was used to detect variants, including SNPs and InDels. The population structure (Q) and kinship (K) matrices were calculated using STRUCTURE and TASSEL 5.0, respectively. After correction, a P-value of 1.0 × 10⁻⁵ was set as the significance threshold for the GWAS results.
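The alignment and variant-calling steps named above can be sketched as an ordered list of command lines for one sample. This is an illustrative sketch only: the patent does not give its command lines, so the file names, the read-group values, the `picard.jar` path, and the choice of options are assumptions. The function only assembles the commands and does not run them.

```python
from pathlib import Path
from typing import List


def build_variant_calling_commands(
    sample: str,
    reference: str,
    fastq1: str,
    fastq2: str,
    outdir: str = "out",
    picard_jar: str = "picard.jar",  # hypothetical location of the Picard jar
) -> List[List[str]]:
    """Return the per-sample commands in order, without executing them."""
    out = Path(outdir)
    sam = str(out / f"{sample}.sam")
    bam = str(out / f"{sample}.bam")
    sorted_bam = str(out / f"{sample}.sorted.bam")
    dedup_bam = str(out / f"{sample}.dedup.bam")
    metrics = str(out / f"{sample}.dup_metrics.txt")
    vcf = str(out / f"{sample}.vcf.gz")
    # GATK needs a read group; these field values are placeholders.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:DNBSEQ"
    return [
        # 1. Align paired-end reads with BWA, writing SAM.
        ["bwa", "mem", "-R", read_group, "-o", sam, reference, fastq1, fastq2],
        # 2. Convert SAM to BAM with samtools.
        ["samtools", "view", "-b", "-o", bam, sam],
        # 3. Coordinate-sort with Picard SortSam.
        ["java", "-jar", picard_jar, "SortSam",
         f"I={bam}", f"O={sorted_bam}", "SORT_ORDER=coordinate"],
        # 4. Remove PCR duplicates with Picard MarkDuplicates.
        ["java", "-jar", picard_jar, "MarkDuplicates",
         f"I={sorted_bam}", f"O={dedup_bam}", f"M={metrics}",
         "REMOVE_DUPLICATES=true"],
        # 5. Index the final BAM so GATK can read it.
        ["samtools", "index", dedup_bam],
        # 6. Call SNPs and InDels with GATK HaplotypeCaller.
        ["gatk", "HaplotypeCaller", "-R", reference, "-I", dedup_bam, "-O", vcf],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` once the reference has been indexed for BWA, samtools, and GATK. Population-scale calling across all 567 lines would need a joint-genotyping step that the text does not describe.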
[0051] 1.2.2 Development of optimal haplotype molecular markers
[0052] Based on the B73 genome and the differential site information provided by qHP3.7 sequencing, specific primers were designed. Using PARMS detection technology, the adapter sequence that matches FAM fluorescence is GAAGGTGACCAAGTTCATGCT, and the adapter sequence that matches HEX fluorescence is GAAGGTCGGAGTCAACGGATT.
[0053] 1.2.3 Genotype Analysis
[0054] Small-scale DNA extraction from maize was performed using the CTAB (Cetyltrimethyl Ammonium Bromide) method (Saghai-Maroof et al 1984), followed by PARMS SNP detection.
[0055] 2. Results and Analysis
[0056] 2.1 Identification of grain protein content in the association population of 567 inbred lines
[0057] Grain protein data for the 567 inbred lines were collected in seven environments across two years (2023-2024) at Shihezi (XJ), Lingshui (HN), Ezhou (EZ), and Gucheng (GC). The results are shown in Table 1. Grain protein content ranged from 7.56% to 16.26%. As shown in Figure 1, protein content in all seven environments follows a normal distribution, and the population shows rich phenotypic variation.
[0058] Table 1. Phenotypic Statistical Analysis of Associated Groups
[0059] .
[0060] Based on phenotypic data from seven environments, the best linear unbiased prediction (BLUP) method was used to estimate the grain protein content of the association population. The data generally conformed to a normal distribution. There were 2 inbred lines with a protein content ≥14%, 35 inbred lines with a protein content between 13% and 14%, 142 inbred lines with a protein content between 12% and 13%, 255 inbred lines with a protein content between 11% and 12%, 115 inbred lines with a protein content between 10% and 11%, and 8 inbred lines with a protein content <10% (Figure 2).
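The grouping of BLUP values into one-percentage-point classes can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the input values are made up, and the text does not say how boundary values (for example exactly 12%) were assigned, so lower-inclusive bins are assumed here.

```python
from collections import Counter
from typing import Iterable


def protein_bin(blup_percent: float) -> str:
    """Assign a BLUP protein value (%) to one of the classes used in the text."""
    if blup_percent < 10:
        return "<10%"
    if blup_percent >= 14:
        return ">=14%"
    lower = int(blup_percent)  # floor, valid because the value is positive
    return f"{lower}%-{lower + 1}%"


def bin_counts(blup_values: Iterable[float]) -> Counter:
    """Count inbred lines per protein class."""
    return Counter(protein_bin(v) for v in blup_values)


# Hypothetical BLUP values, for illustration only.
example_values = [9.6, 10.4, 11.2, 11.9, 12.0, 12.7, 13.5, 14.1]
example_counts = bin_counts(example_values)
```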
[0061] 2.2 Identification and Genetic Effect Analysis of Major-Effect QTL qHP3.7 in Maize Kernel Protein
[0062] This application identifies a novel major QTL controlling maize kernel protein based on genome-wide association analysis. This major QTL, named qHP3.7, is located on chromosome 3, and the region contains five tightly linked SNPs spanning 80.658 kb (Chr3:183761896-183842554) (Table 2; Figures 3, 4, and 5).
[0063] The lead SNP of the qHP3.7 locus, 183842554 (T / G), showed a significant association (p=2.12E-06). It is located at base 183842554 on chromosome 3 of the maize genome (maize B73 reference genome Zm-B73-REFERENCE-NAM-5.0, referred to in this invention as the maize B73V5 reference genome) and explains 10.23% of the phenotypic variation.
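As a quick arithmetic check of the interval quoted for qHP3.7, the span can be recomputed from its two coordinates. This is a sketch with hypothetical variable names. The span is taken as end minus start because that reproduces the stated 80.658 kb.

```python
REGION_START = 183_761_896  # first SNP of the qHP3.7 interval on Chr3
REGION_END = 183_842_554    # last SNP of the interval
LEAD_SNP = 183_842_554      # position of the lead SNP (T/G)


def span_kb(start: int, end: int) -> float:
    """Distance between two coordinates in kilobases (end minus start)."""
    return (end - start) / 1000


region_kb = span_kb(REGION_START, REGION_END)
lead_snp_in_region = REGION_START <= LEAD_SNP <= REGION_END
```

The lead SNP coincides with the right-hand boundary of the interval, so the PARMS marker sits at the edge of the mapped region rather than at its center.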
[0064] PARMS primers were designed for the above SNP sites as follows:
[0065] (1) For the lead (peak) SNP closely linked to qHP3.7, 183842554 (T / G), the sequences 200 bp upstream and downstream of position 183842554 on chromosome 3 of the maize B73V5 reference genome were extracted. The PARMS marker detection primer sequences were designed according to primer design principles as follows:
[0066] qHP3.7R: AGCTGAGAGATCCGGCAGG, as shown in SEQ ID NO.3;
[0067] qHP3.7F1: GAAGGTGACCAAGTTCATGCT CAAAGAAGCCCTGCTGGTG, as shown in SEQ ID NO.4;
[0068] qHP3.7F2: GAAGGTCGGAGTCAACGGATT CAAAGAAGCCCTGCTGGTT, as shown in SEQ ID NO.5;
[0069] The underlined part (shown here as the segment before the space) indicates the adapter sequence.
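The structure of the two allele-specific forward primers can be made explicit: each is a fluorescence-matched adapter, followed by a shared locus-specific stretch, followed by the allele base at the 3' end. The sketch below reassembles qHP3.7F1 and qHP3.7F2 from those parts. The sequences are taken from the text; the function and variable names are hypothetical.

```python
FAM_ADAPTER = "GAAGGTGACCAAGTTCATGCT"  # adapter matching FAM fluorescence
HEX_ADAPTER = "GAAGGTCGGAGTCAACGGATT"  # adapter matching HEX fluorescence
SHARED_CORE = "CAAAGAAGCCCTGCTGGT"     # locus-specific bases common to F1 and F2


def allele_specific_primer(adapter: str, core: str, allele: str) -> str:
    """Concatenate adapter, shared core, and the 3'-terminal allele base."""
    if allele not in {"A", "C", "G", "T"}:
        raise ValueError(f"unexpected allele base: {allele!r}")
    return adapter + core + allele


# F1 detects the low-protein G haplotype, F2 the high-protein T haplotype.
qhp37_f1 = allele_specific_primer(FAM_ADAPTER, SHARED_CORE, "G")
qhp37_f2 = allele_specific_primer(HEX_ADAPTER, SHARED_CORE, "T")
```

Written this way, the two primers differ only in their adapter and in the final base, which is what lets each fluorescence channel report one allele.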
[0070] (2) Using the genomic DNA of the maize inbred line population as a template, the above primers were used to perform real-time PCR amplification. The FAM and HEX signals were scanned using a Tecan F200, the results were output, and the fluorescence signals were then converted to genotypes.
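The text does not say how the FAM and HEX readings were converted to genotypes. One common approach for two-channel allele-specific assays is to call genotypes from the angle of each sample in the FAM/HEX plane, sketched below. The function name, the signal cutoff, and the angle boundaries are assumptions for illustration, and inputs are assumed to be background-subtracted, normalized, non-negative intensities.

```python
import math


def call_genotype(
    fam: float,
    hex_signal: float,
    min_signal: float = 0.2,     # hypothetical cutoff for failed reactions
    het_low_deg: float = 30.0,   # hypothetical lower bound of the het cluster
    het_high_deg: float = 60.0,  # hypothetical upper bound of the het cluster
) -> str:
    """Call a genotype from two fluorescence intensities.

    FAM reports the qHP3.7F1 primer (G allele) and HEX reports the
    qHP3.7F2 primer (T allele), following the primer definitions in the text.
    """
    if math.hypot(fam, hex_signal) < min_signal:
        return "NoCall"
    # 0 degrees means pure FAM signal, 90 degrees means pure HEX signal.
    angle = math.degrees(math.atan2(hex_signal, fam))
    if angle < het_low_deg:
        return "G/G"
    if angle > het_high_deg:
        return "T/T"
    return "G/T"
```

In practice the cluster boundaries would be set from the observed scatter of each plate, including no-template controls, rather than from fixed angles.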
[0071] Using the primers described above, the sequence amplified from the low-protein material S121 (parent of Liangyu 66) is:
[0072] GAAGGTCGGAGTCAACGGATT CAAAGAAGCCCTGCTGGTGGCCTGCCGGATCTCTCAGCT, as shown in SEQ ID NO.1;
[0073] The amplification product sequence of the high-protein material CT1669 (Yufeng 303 parent stock) is as follows:
[0074] GAAGGTGACCAAGTTCATGCT CAAAGAAGCCCTGCTGGTTGCCTGCCGGATCTCTCAGCT, as shown in SEQ ID NO.2.
[0075] Amplification system:
[0076] .
[0077] Amplification parameters:
[0078] .
[0079] SNP 183842554 (T / G) defines two haplotypes, and the difference in grain protein content between them was compared using the BLUP values of grain protein content. Of the inbred lines, 105 were 183842554 (T / T), referred to as the qHP3.7 allele, and 450 were 183842554 (G / G), referred to as the qhp3.7 allele. The average protein content of haplotype qHP3.7 inbred lines was significantly higher than that of haplotype qhp3.7 lines (p<0.05) (Figure 6). The protein content of inbred lines carrying the T allele (12.38% ± 0.79%) was significantly higher than that of inbred lines carrying the G allele (11.78% ± 0.67%), an average increase of 0.60 percentage points (P = 1.15e-14), and the phenotypic variation explained (PVE) was 10.23%. The frequency of the superior haplotype of qHP3.7 increased with grain protein content: it was 8.33% in the 10%-11% protein range, 9.40% in the 11%-12% range, 30.69% in the 12%-13% range, and 42.50% in the high-protein range of 13%-15%, indicating that the superior haplotype is significantly enriched among high-protein inbred lines (Figure 7). This indicates that qHP3.7 is a key locus for improving maize kernel protein content and has great potential for high-protein genetic improvement. BLUP analysis of multi-year, multi-environment yield data showed that the two haplotypes at the qHP3.7 locus differed highly significantly in the key yield traits of 100-kernel weight and single-ear grain weight (p < 0.01) (Figure 8). For 100-grain weight (HKW), the average of haplotype qHP3.7 was 26.63 g, while the average of haplotype qhp3.7 was 27.24 g (p=0.01); for single-ear grain weight (KWPE), the average of haplotype qHP3.7 was 51.97 g, while the average of haplotype qhp3.7 was 53.14 g (p=0.18). These results indicate that qHP3.7 negatively affects yield traits such as 100-grain weight and single-ear grain weight, suggesting linkage drag between superior alleles regulating grain protein content and yield-related trait loci. qHP3.7 is a pleiotropic locus, exhibiting a significant selection effect on yield traits such as 100-grain weight and single-ear grain weight.
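The haplotype comparison can be roughly cross-checked from the summary statistics alone. The sketch below computes a two-group one-way ANOVA F statistic and the between-group share of variance from group sizes, means, and spreads. It assumes the "±" values are sample standard deviations, which the text does not state. It is a reconstruction, not the authors' calculation, and the function name is hypothetical.

```python
from typing import Tuple


def one_way_from_summary(
    n1: int, mean1: float, sd1: float,
    n2: int, mean2: float, sd2: float,
) -> Tuple[float, float]:
    """Return (F statistic, between-group share of total sum of squares)."""
    grand_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    ss_between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_within = n1 + n2 - 2
    f_stat = ss_between / (ss_within / df_within)  # df_between is 1
    share = ss_between / (ss_between + ss_within)
    return f_stat, share


# T/T lines: n = 105, 12.38 +/- 0.79; G/G lines: n = 450, 11.78 +/- 0.67.
f_stat, variance_share = one_way_from_summary(105, 12.38, 0.79, 450, 11.78, 0.67)
```

Under that assumption the between-group share comes out at roughly one tenth of the total variance, the same order as the reported PVE of 10.23%. The two figures need not match exactly, since the reported value may come from the association model rather than from a plain ANOVA.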
[0080] Table 2. SNP sites in the qHP3.7 region that are significantly associated with grain protein content.
[0081] .
[0082] a Physical location of the qHP3.7 site (V5). b Single nucleotide variation. c Minor allele. d Minor allele frequency.
[0083] Example 2:
[0084] Application of the superior haplotype molecular marker primers for the major QTL qHP3.7 of maize kernel protein:
[0085] Fifty inbred lines were randomly selected from a population of 567 inbred lines, with eight plants from each family pooled together. DNA was extracted, and the grain protein content of the samples provided in Table 3 was determined using a near-infrared spectroscopy analyzer. Each sample was measured twice, and the average value was taken. Simultaneously, the PARMS primers provided in Example 1 of this invention were used to detect the genotype of the maize samples.
[0086] The results are shown in Tables 3 and 4 and in Figure 9. Comparison with the resequencing results showed 100% agreement between PARMS marker typing and resequencing. When the allele at position 183842554 on chromosome 3 of maize was homozygous for T, the tested sample was high-protein material; when the allele at position 183842554 was homozygous for G, the tested sample was low-protein material. High-protein samples had a protein content ≥12%, and low-protein samples had a protein content <12%.
[0087] In Figure 9, green dots indicate HEX fluorescence from the adapter primer, meaning that the family carries qHP3.7F2 and has a high grain protein content; blue dots indicate FAM fluorescence from the adapter primer, meaning that the family carries qHP3.7F1 and has a low grain protein content. This demonstrates that the PARMS marker can effectively distinguish between the two haplotypes.
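The agreement between PARMS typing and resequencing can be expressed as a simple concordance rate over the samples typed by both methods. The sketch below uses made-up sample names and calls purely for illustration. The function name is hypothetical, and the patent's Tables 3 and 4 are not reproduced here.

```python
from typing import Dict


def concordance_percent(parms: Dict[str, str], reseq: Dict[str, str]) -> float:
    """Percentage of shared samples whose two genotype calls are identical."""
    shared = [sample for sample in parms if sample in reseq]
    if not shared:
        raise ValueError("no samples were typed by both methods")
    matches = sum(parms[sample] == reseq[sample] for sample in shared)
    return 100.0 * matches / len(shared)


# Hypothetical calls for four lines; not data from the patent.
parms_calls = {"line01": "T/T", "line02": "G/G", "line03": "G/G", "line04": "T/T"}
reseq_calls = {"line01": "T/T", "line02": "G/G", "line03": "G/G", "line04": "T/T"}
agreement = concordance_percent(parms_calls, reseq_calls)
```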
[0088] The above results confirm that the developed functional markers can be used for marker-assisted selection of genetic improvement of protein traits in maize kernels, providing selection targets for creating new high-protein maize germplasm and breeding new high-protein varieties.
[0089] Table 3. Protein content of maize kernels and identified genotypes
[0090] .
[0091] Table 4. Protein content of maize kernels and identified genotypes
[0092] .
[0093] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. The application of a reagent for detecting base 183842554 on chromosome 3 of the maize genome in screening breeding for maize kernel protein content trait, characterized in that: if a homozygote with a base of T at position 183842554 on chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be greater than or equal to 12%; if a homozygote with a base of G at position 183842554 on chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be less than 12%. The maize genome is Zm-B73-REFERENCE-NAM-5.0.
2. The application of a reagent for detecting base 183842554 on chromosome 3 of the maize genome in the preparation of a screening kit for maize kernel protein content traits, characterized in that: if a homozygote with a base of T at position 183842554 on chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be greater than or equal to 12%; if a homozygote with a base of G at position 183842554 on chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be less than 12%. The maize genome is Zm-B73-REFERENCE-NAM-5.0.
3. The application of a reagent for detecting base position 183842554 on chromosome 3 of the maize genome in screening breeding for maize kernel protein content and yield traits, characterized in that: if a homozygote is detected with a base of T at position 183842554 on chromosome 3 of the maize genome, the maize is determined to have a protein content greater than or equal to 12% and a low yield; if a homozygote is detected with a base of G at position 183842554 on chromosome 3 of the maize genome, the maize is determined to have a protein content less than 12% and a high yield. The maize genome is Zm-B73-REFERENCE-NAM-5.0.
4. The application according to claim 1, 2 or 3, characterized in that: The reagent mentioned is a primer.
5. The application according to claim 4, wherein the primers are: qHP3.7R: AGCTGAGAGATCCGGCAGG, qHP3.7F1: GAAGGTGACCAAGTTCATGCT CAAAGAAGCCCTGCTGGTG and qHP3.7F2: GAAGGTCGGAGTCAACGGATT CAAAGAAGCCCTGCTGGTT.
6. A method for screening and breeding maize kernels based on protein content, comprising detecting the base at position 183842554 of chromosome 3 of the maize genome, wherein the detection method is: sequencing, TaqMan probe method, AS-PCR method, molecular beacon method, high-resolution melting curve method, CAPS method, SNaPshot method, KASP method, PARMS method, gene chip method, or mass spectrometry. If a homozygous individual with a base of T at position 183842554 of chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be greater than or equal to 12%. If a homozygous individual with a base of G at position 183842554 of chromosome 3 of the maize genome is detected, the protein content of the maize is determined to be less than 12%. The maize genome is Zm-B73-REFERENCE-NAM-5.0.