A multi-site discrimination method for wild and domesticated populations of Pseudosciaena crocea based on SNaPshot technology
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- YELLOW SEA FISHERIES RES INST CHINESE ACAD OF FISHERIES SCI
- Filing Date
- 2024-12-25
- Publication Date
- 2026-06-26
Abstract
Description
Technical Field
[0001] This invention belongs to the field of fish genetic identification technology, and specifically relates to a multi-site identification method for wild and domesticated large yellow croaker populations based on SNaPshot technology.
Background Technology
[0002] Large yellow croaker (Larimichthys crocea) belongs to the order Perciformes, family Sciaenidae, and genus Larimichthys. It is also commonly known as yellow croaker, large fresh yellow croaker, or large yellow flower. It is a warm-temperate migratory coastal fish of the northwestern Pacific Ocean. In China, large yellow croaker is currently mainly divided into three geographical groups, the Daiqu group, the Min-Yuedong group, and the Naozhou group, which are distributed across many areas. Large yellow croaker is an important aquaculture fish in China, valued for its tender and delicious flesh, rich nutrition, and attractive appearance, which make it a popular banquet dish. It is rich in protein, minerals, amino acids, and unsaturated fatty acids, making it an ideal protein source for humans. Wild large yellow croaker has higher levels of beneficial minerals such as selenium (Se) and zinc (Zn), as well as umami amino acids and polyunsaturated fatty acids (EPA, DHA, etc.), than farmed large yellow croaker, while farmed large yellow croaker tends to have higher protein and fatty acid content.
[0003] Resequencing is a genomic method in which the genome of an organism is sequenced and compared with an existing reference genome, and differences in sequence are identified to explore the genetic, evolutionary, and biological characteristics of the organism. Genome resequencing can identify a large number of single nucleotide polymorphism (SNP), short insertion/deletion (indel), structural variation (SV), and copy number variation (CNV) sites. These data help reveal the characteristics of population differentiation and domestication selection regions, narrow down the locations of candidate genes, and lay the foundation for molecular breeding research. With the rise of next-generation sequencing platforms, their high throughput and low cost have led to more and more species being sequenced. Detecting variations in individuals or populations of a species and constructing a genetic variation database for that species is of great significance for promoting germplasm resource preservation and gene bank management. Depending on the genomic information required, resequencing can be divided into whole-genome resequencing (WGS) and reduced-representation genome sequencing (RRGS). RRGS obtains partial sequences of the genome and is typically used for detecting single nucleotide polymorphisms (SNPs), but its accuracy is relatively low for detecting other mutation types. WGS refers to the complete sequencing of an organism's genome using high-throughput sequencing technology and its comparison with a reference genome. This technology can be used to analyze genetic differences between individuals, identify disease-related genes or gene mutation sites, and reveal genome structure. With the continuous development of high-throughput sequencing technology, WGS has also been widely used.
Advances in sequencing technology have strongly promoted the development of genomics, enabling the accurate sequencing and precise assembly of complex genome sequences. Whole-genome resequencing has also gone through three stages along with the development of sequencing technology: Sanger sequencing, NGS, and PacBio and Nanopore technologies. In recent years, whole-genome resequencing has been increasingly applied to population genetic studies of various vertebrates, such as Korean cattle (Bos taurus var. coreana), domestic pigs (Sus scrofa var. domesticus), and red junglefowl (Gallus gallus). Whole-genome resequencing is increasingly being used in fish, including carp (Cyprinus carpio), Atlantic salmon, and goldfish (Carassius auratus). Researchers have used whole-genome resequencing to analyze fish population structure and population history evolution, and by screening candidate genes and population genome datasets related to genomic regions, they have elucidated the molecular mechanisms of fish genetic traits.
[0004] Genome-wide selection signals refer to statistically significant genetic variations detected in a species' genome. These variations may be the result of natural or artificial selection, reflecting the advantages of different genotypes in adapting to the environment. Genome selection signal detection is an important research area, used to determine which genes and genomic regions are affected by natural or artificial selection. Currently commonly used methods mainly include allele frequency spectrum-based methods, linkage disequilibrium-based methods, and population differentiation-based methods. Allele frequency spectrum-based methods are statistical methods for detecting positive selection signals. They detect selection signals by comparing differences in allele frequency spectra between two or more populations. This method is mainly used to detect strong selection signals. Commonly used allele frequency spectra include single nucleotide polymorphism (SNP) frequency spectra and haplotype frequency spectra. Linkage disequilibrium-based methods utilize linkage disequilibrium relationships between different loci on the genome to detect selection signals. This method can detect selection signals in low-frequency alleles. It typically uses haplotype frequency-based methods to determine linkage disequilibrium relationships on the genome and detects selection signals by comparing structural differences in linkage disequilibrium between different populations. Population differentiation-based methods are statistical approaches that utilize the degree of differentiation of genomes across different populations to detect selection signals. These methods can detect weak selection signals and can be used to identify artificial selection. They are commonly used to detect the effects of natural and artificial selection on species genomes.
[0005] Accurately and rapidly identifying wild and domesticated large yellow croaker populations is crucial for the identification of large yellow croaker germplasm resources. In 2015, Ao et al. used a strategy combining bacterial artificial chromosome sequencing and whole-genome shotgun sequencing to sequence the entire large yellow croaker genome, obtaining a fine genome map. In 2019, Chen et al. assembled a large yellow croaker reference genome using third-generation sequencing technology (PacBio single-molecule sequencing) and high-throughput chromosome conformation capture technology. This highly accurate, chromosome-level reference genome provides important genomic resources for supporting the identification and evaluation of large yellow croaker germplasm resources. Currently, the development of genetically specific molecular markers has been applied in large yellow croaker. In 2022, Yu et al. developed sex-specific molecular markers for the large yellow croaker Daiqu population through the genome between dmrt1 and cfap157, providing a valuable tool for promoting sex-controlled breeding of the large yellow croaker Daiqu population.
[0006] In 2023, Chen et al., based on genome resequencing and comparison of SNP marker data from large yellow croaker populations distributed in the coastal waters of eastern and southern China, found that climate-driven habitat change may have occurred between the Naozhou and Min-Yuedong populations of large yellow croaker. In 2024, Yuan et al. analyzed genetic structure using whole-genome resequencing data from a large sample including domesticated and wild populations, and showed that wild populations along the Chinese coast lack a clear geographical structure. This study overturned the long-held view of dividing them into three genetic management units. However, there is currently a lack of accurate and rapid methods for identifying wild and domesticated large yellow croaker populations. Developing such methods using genetically specific markers could effectively promote research on the distribution of large yellow croaker populations in China.
Summary of the Invention
[0007] The first objective of this invention is to provide a set of SNP molecular markers for the identification of wild and domesticated populations of large yellow croaker.
[0008] A second aspect of the present invention is to provide a primer set for amplifying the SNP molecular markers of the first aspect of the present invention.
[0009] A third aspect of the present invention is to provide a detection reagent, gene chip, or kit.
[0010] The fourth aspect of this invention aims to provide the application of the SNP molecular markers of the first aspect of this invention, the primer sets of the second aspect of this invention, or the detection reagents, gene chips, or kits of the third aspect of this invention in the identification of wild and / or domesticated populations of large yellow croaker.
[0011] The fifth aspect of this invention aims to provide a method for identifying wild and / or domesticated populations of large yellow croaker.
[0012] The sixth aspect of this invention aims to provide the application of the method for identifying wild and / or domesticated populations of large yellow croaker, as described in the fifth aspect of this invention, in the identification and evaluation of large yellow croaker germplasm resources.
[0013] To achieve the above objectives, the technical solution adopted by the present invention is as follows:
[0014] In a first aspect, the present invention provides a set of SNP molecular markers for the identification of wild and domesticated populations of large yellow croaker, wherein the SNP molecular markers include at least three of the following: SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958, and SNP15-22344715.
[0015] Among them, SNP1-30187747, SNP1-30206146 and SNP1-30225064 are located at positions 30187747, 30206146 and 30225064 of chromosome 1 of large yellow croaker, respectively, and their polymorphisms are A / G, A / G and T / C, respectively.
[0016] SNP13-42432874, SNP13-43081647, SNP13-43107208, and SNP13-43228958 are located at positions 42432874, 43081647, 43107208, and 43228958 on chromosome 13 of large yellow croaker, respectively, with polymorphisms of G / A, C / T, G / A, and G / A, respectively.
[0017] SNP15-22344715 is located at position 22344715 on chromosome 15 (NC_040025.1) of large yellow croaker, and its polymorphism is G / A.
[0018] In some embodiments of the present invention, the SNP molecular markers are composed of SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958 and SNP15-22344715.
[0019] In some embodiments of the present invention, the SNP molecular markers are composed of SNP1-30187747, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958 and SNP15-22344715.
[0020] In some embodiments of the present invention, the SNP molecular markers are composed of SNP1-30187747, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208 and SNP13-43228958.
[0021] In some embodiments of the present invention, the SNP molecular markers are composed of SNP1-30187747, SNP1-30225064, SNP13-42432874 and SNP13-43228958.
[0022] In some embodiments of the present invention, the SNP molecular markers consist of SNP1-30187747, SNP1-30225064 and SNP13-42432874.
[0023] The above 8 SNP molecular markers were obtained through the following steps:
[0024] S1. After whole-genome resequencing of 395 wild and domesticated large yellow croaker individuals, population genetic selection signal analysis was used to screen for SNPs that differ between wild and domesticated large yellow croaker.
[0025] S2. Strong selection signals based on Fst and the π ratio were detected across the genome to explore genetic differences caused by the different living environments of wild and domesticated large yellow croaker. Regions in the top 2% of Fst and the top 5% of the π ratio were identified as selection signal regions between wild and domesticated large yellow croaker. Within these regions, candidate genes with extremely significant differences in allele frequencies between wild and domesticated large yellow croaker populations (P<0.001) were screened, resulting in the identification of 139 genes. Furthermore, eight candidate SNPs causing non-synonymous mutations in exons were identified for further SNP validation.
[0026] Based on the above screening, the SNP loci are located on three different chromosomes of large yellow croaker (chromosome 1 (NC_040011.1), chromosome 13 (NC_040023.1), and chromosome 15 (NC_040025.1)), with the following location information: SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958, SNP15-22344715. A set of SNaPshot marker primers was designed for the identification of wild and domesticated populations of large yellow croaker.
[0027] S3. Genotyping experiments were performed using singleplex / multiplex PCR reaction systems to verify whether the reaction system of the 8 SNPs can identify wild and domesticated populations of large yellow croaker.
[0028] S4. The accuracy and detection rate of the SNaPshot results were compared with the results of known marker detection to determine whether the above combination of 8 SNPs can identify wild and domesticated populations of large yellow croaker.
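The marker names above encode chromosome and position directly (SNP<chromosome>-<position>). The following minimal sketch, not part of the claimed method, shows how such names could be parsed into structured records. The function name parse_marker and the dictionary layout are hypothetical; the accessions and polymorphisms are copied from the description above.

```python
import re

# Chromosome number -> RefSeq accession, as listed in the screening description.
CHROM_ACCESSION = {1: "NC_040011.1", 13: "NC_040023.1", 15: "NC_040025.1"}

# Marker name -> polymorphism, as listed for the first aspect.
POLYMORPHISM = {
    "SNP1-30187747": "A/G",
    "SNP1-30206146": "A/G",
    "SNP1-30225064": "T/C",
    "SNP13-42432874": "G/A",
    "SNP13-43081647": "C/T",
    "SNP13-43107208": "G/A",
    "SNP13-43228958": "G/A",
    "SNP15-22344715": "G/A",
}


def parse_marker(name):
    """Split a marker name into chromosome, accession, position and alleles.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"SNP(\d+)-(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized marker name: {name!r}")
    chrom, position = int(match.group(1)), int(match.group(2))
    return {
        "name": name,
        "chrom": chrom,
        "accession": CHROM_ACCESSION[chrom],
        "position": position,
        "alleles": tuple(POLYMORPHISM[name].split("/")),
    }


if __name__ == "__main__":
    for marker in POLYMORPHISM:
        print(parse_marker(marker))
```

Keeping the panel in one table like this makes it straightforward to build the reduced panels (7, 6, 4 or 3 markers) by selecting a subset of names.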
[0029] In a second aspect, the present invention provides a primer set for amplifying the SNP molecular marker of the first aspect of the present invention, the nucleotide sequence of the primer set being shown in SEQ ID NO:1 to SEQ ID NO:16.
[0030] In some embodiments of the present invention, SEQ ID NO:1 to SEQ ID NO:16 are taken in order, with every two consecutive nucleic acid sequences forming one primer pair.
[0031] A third aspect of the present invention provides detection reagents, gene chips, or kits comprising the primer set of the second aspect of the present invention.
[0032] In some embodiments of the present invention, the kit further comprises one or more of dNTPs, DNA polymerase, PCR reaction buffer, and Taq.
[0033] In some embodiments of the present invention, the kit further includes SAP and ExoI.
[0034] In some embodiments of the present invention, the kit further includes single-base extension primers.
[0035] In some embodiments of the present invention, the single-base extension primers are shown in SEQ ID NO:17 to SEQ ID NO:24.
[0036] A fourth aspect of the present invention provides the application of the SNP molecular marker of the first aspect of the present invention, the primer set of the second aspect of the present invention, or the detection reagent, gene chip, or kit of the third aspect of the present invention in the identification of wild and / or domesticated populations of large yellow croaker.
[0037] A fifth aspect of the present invention provides a method for identifying wild and / or domesticated populations of large yellow croaker, comprising the step of detecting SNP molecular markers of the first aspect of the present invention in the population of large yellow croaker to be tested using primer sets of the second aspect of the present invention or detection reagents, gene chips or kits of the third aspect of the present invention.
[0038] This identification method was developed based on multiple single nucleotide polymorphism (SNP) sites identified through whole genome resequencing and population genetic selection signal analysis of large yellow croaker. The method was designed and tested using SNaPshot, a genotyping technique based on fluorescently labeled single-base extension. The system integrates eight candidate indicator SNPs into a single multiplex PCR reaction, specifically amplifying and genotyping them to determine their accuracy in two population groups. Validation using SNaPshot on 240 large yellow croakers with known genotypes demonstrated that the eight SNP marker system can successfully identify wild / domesticated large yellow croaker populations in the vast majority of groups. Compared to current identification methods and markers, this marker system and detection method significantly improve the accuracy of identifying wild and domesticated large yellow croaker populations. It enables rapid and accurate genotyping on various genetic analyzers, automates SNP analysis, and facilitates convenient and efficient high-throughput detection.
[0039] In some embodiments of the present invention, the identification method includes the following steps:
[0040] (1) Using the DNA of the large yellow croaker sample to be tested as a template, PCR amplification is performed using the primer set of the second aspect of the present invention or the detection reagent, gene chip or kit of the third aspect of the present invention to obtain PCR amplification products.
[0041] (2) Perform SNaPshot sequencing analysis on the PCR amplification products to obtain the genotype of the SNP molecular markers of the first aspect of the present invention in the genome of the large yellow croaker to be tested;
[0042] (3) Analyze the frequency of the genotypes of the SNP molecular markers, score them, construct ROC curves, and determine whether the large yellow croaker population to be tested is wild large yellow croaker or domesticated large yellow croaker.
[0043] In some embodiments of the present invention, the scoring rules in step (3) are as follows:
[0044] If the frequency of the AA genotype of SNP1-30187747 in the SNP molecular marker is greater than that of the GG genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0045] If the frequency of the GG genotype of SNP1-30206146 in the SNP molecular marker is greater than that of the AA genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0046] If the frequency of the TT genotype of SNP1-30225064 in the SNP molecular marker is greater than that of the CC genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0047] If the frequency of the GG genotype of SNP13-42432874 in the SNP molecular marker is greater than that of the AA genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0048] If the frequency of the TT genotype of SNP13-43081647 in the SNP molecular marker is greater than that of the CC genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0049] If the frequency of the GG genotype of SNP13-43107208 in the SNP molecular marker is greater than that of the AA genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0050] If the frequency of the GG genotype of SNP13-43228958 in the SNP molecular marker is greater than that of the AA genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0051] If the frequency of the GG genotype of SNP15-22344715 in the SNP molecular marker is greater than that of the AA genotype, then 1 point is awarded; otherwise, 0 points are awarded.
[0052] Generally, if a discrimination method is constructed based on x loci (x≤8), the maximum total score of the discrimination method is 2x (number of loci × 2 alleles).
[0053] In some embodiments of the present invention, the cut-off value of the ROC curve (i.e., the score at which the sum of sensitivity (%) and specificity (%) is maximal) is set as the discrimination threshold. When the total score of the large yellow croaker population to be tested is higher than the threshold, it is a wild large yellow croaker population; otherwise, it is a domesticated large yellow croaker population.
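The scoring and threshold rule can be illustrated with a short sketch. It follows one reading of the rules above, which is an assumption: each copy of the allele named first in a scoring rule (taken here to be the wild-associated allele) contributes 1 point, so a diploid individual scores 0 to 2 per locus and at most 2x over x loci. The names WILD_ALLELE, total_score and classify are hypothetical, and the threshold must come from the ROC analysis rather than from this code.

```python
# Wild-associated allele per marker. ASSUMPTION: inferred from the scoring
# rules, taking the genotype named first in each rule (AA, GG, TT, ...).
WILD_ALLELE = {
    "SNP1-30187747": "A",
    "SNP1-30206146": "G",
    "SNP1-30225064": "T",
    "SNP13-42432874": "G",
    "SNP13-43081647": "T",
    "SNP13-43107208": "G",
    "SNP13-43228958": "G",
    "SNP15-22344715": "G",
}


def total_score(genotypes):
    """Sum 1 point per wild-associated allele over the typed markers.

    `genotypes` maps marker name -> two-letter genotype such as "AG".
    """
    return sum(g.count(WILD_ALLELE[m]) for m, g in genotypes.items())


def classify(genotypes, threshold):
    """Scores strictly above the threshold are called wild."""
    return "wild" if total_score(genotypes) > threshold else "domesticated"


if __name__ == "__main__":
    sample = {
        "SNP1-30187747": "AG",   # one wild-associated allele -> 1 point
        "SNP1-30225064": "TT",   # two wild-associated alleles -> 2 points
        "SNP13-42432874": "AA",  # none -> 0 points
    }
    print(total_score(sample), classify(sample, threshold=2))
```

Because the function only iterates over the markers supplied, the same code applies unchanged to the 8-marker panel and to the reduced panels.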
[0054] In some embodiments of the present invention, the PCR amplification in step (1) is singleplex or multiplex PCR amplification.
[0055] In some embodiments of the present invention, the PCR amplification reaction program is as follows: pre-denaturation at 92-96°C for 3-6 min; denaturation at 92-96°C for 28-35 s, annealing at 58-62°C for 28-35 s, extension at 70-72°C for 20-30 s, 32-37 cycles; and extension at 70-72°C for 8-12 min.
[0056] In some embodiments of the present invention, the PCR amplification products are subjected to digestion, single-base extension, purification, and other treatments before sequencing.
[0057] In some embodiments of the present invention, the digestion reaction system includes 8–12 μL of PCR amplification product, 0.2–0.4 U SAP and 0.05–0.2 U ExoI; the digestion reaction conditions are 35–38 °C for 55–65 min; 70–77 °C for 12–16 min.
[0058] In some embodiments of the present invention, the reaction conditions for the single base extension are 94-97°C for 8-13 s, 46-52°C for 3-7 s, 58-62°C for 28-32 s, and 25-30 cycles.
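As a consistency check, the digestion and single-base extension conditions reported in Example 2 can be compared against the ranges stated above. The sketch below is illustrative only; the step labels (digest_incubation, ext_anneal and so on) and the function out_of_range are hypothetical names, and the functional interpretation of each temperature step is an assumption.

```python
# Inclusive ranges from the embodiments above. Step names are hypothetical labels.
CLAIMED = {
    "digest_incubation": {"temp_c": (35, 38), "minutes": (55, 65)},
    "digest_inactivation": {"temp_c": (70, 77), "minutes": (12, 16)},
    "ext_denature": {"temp_c": (94, 97), "seconds": (8, 13)},
    "ext_anneal": {"temp_c": (46, 52), "seconds": (3, 7)},
    "ext_extend": {"temp_c": (58, 62), "seconds": (28, 32)},
    "ext_cycles": {"count": (25, 30)},
}

# Conditions reported in Example 2 of this description.
EXAMPLE_2 = {
    "digest_incubation": {"temp_c": 37, "minutes": 60},
    "digest_inactivation": {"temp_c": 75, "minutes": 15},
    "ext_denature": {"temp_c": 96, "seconds": 10},
    "ext_anneal": {"temp_c": 50, "seconds": 5},
    "ext_extend": {"temp_c": 60, "seconds": 30},
    "ext_cycles": {"count": 27},
}


def out_of_range(protocol, claimed=CLAIMED):
    """Return (step, setting, value) for every setting outside its claimed range."""
    problems = []
    for step, settings in protocol.items():
        for key, value in settings.items():
            low, high = claimed[step][key]
            if not low <= value <= high:
                problems.append((step, key, value))
    return problems


if __name__ == "__main__":
    print(out_of_range(EXAMPLE_2))
```

An empty result means every reported setting falls inside the corresponding range.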
[0059] The sixth aspect of the present invention provides the application of the method for identifying wild and / or domesticated populations of large yellow croaker, as described in the fifth aspect of the present invention, in the identification and evaluation of large yellow croaker germplasm resources.
[0060] The beneficial effects of this invention are:
[0061] This invention provides a set of SNP molecular markers for differentiating wild and domesticated populations of large yellow croaker. By detecting these SNP molecular markers, the identification of wild and domesticated large yellow croaker populations can be achieved with only a single multiplex PCR reaction system and SNaPshot genotyping sequencing. The method can be automated by operating on various genetic analysis instruments, enabling the identification of wild and domesticated large yellow croaker populations. Compared to current methods for identifying large yellow croaker populations, this method is faster, has higher throughput, and, more importantly, boasts extremely high accuracy and cumulative exclusion rate.
Description of the Drawings
[0062] Figure 1 shows the selection signal analysis of wild and domesticated large yellow croaker populations, where A is the distribution of Fst values between the wild and domesticated populations (Wild and Breed) in 50 kb sliding windows with a step size of 10 kb, and B is the distribution of the π ratio between the wild and domesticated populations (Wild and Breed) in 50 kb sliding windows with a step size of 10 kb.
[0063] Figure 2 shows the results of individual amplification (top) and mixed amplification (bottom) using the SNaPshot primers designed for the SNPs. In the figure, C3, B7, Y1, and Y2 are samples from known domesticated populations of large yellow croaker, X142, X143, A91, and E93 are samples from known wild populations of large yellow croaker, N is a negative control without sample, and M is a DNA marker.
[0064] Figure 3 is a peak diagram of SNaPshot detection.
[0065] Figure 4 shows the ROC curves of the multi-site identification methods for wild / domesticated large yellow croaker populations constructed based on the SNPs, where H1 to H8 denote the eight SNPs in the order listed in paragraph [0014]; A is the ROC curve of the identification method constructed based on H1, H2, H3, H4, H5, H6, H7 and H8, B is the ROC curve of the identification method constructed based on H1, H2, H3, H4, H5, H6 and H7, C is the ROC curve of the identification method constructed based on H1, H3, H4, H5, H6, H7 and H8, D is the ROC curve of the identification method constructed based on H1, H3, H4, H5, H6 and H7, E is the ROC curve of the identification method constructed based on H1, H3, H4 and H7, and F is the ROC curve of the identification method constructed based on H1, H3 and H4.
Detailed Implementation
[0066] The present invention will be further described in detail below through specific embodiments.
[0067] It should be understood that these embodiments are for illustrative purposes only and are not intended to limit the scope of the invention.
[0068] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below. Where specific conditions are not specified in the embodiments, conventional conditions or conditions recommended by the manufacturer shall apply. Reagents or instruments whose manufacturers are not specified are all conventional products that can be purchased commercially.
[0069] The features and performance of the present invention will be further described in detail below with reference to embodiments.
[0070] Example 1
[0071] This example provides a set of SNPs for identifying wild and domesticated large yellow croaker populations based on SNaPshot technology. The SNPs are located on three different chromosomes of large yellow croaker (chromosome 1 (NC_040011.1), chromosome 13 (NC_040023.1), and chromosome 15 (NC_040025.1)), with the following location information: SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958, and SNP15-22344715.
[0072] The above 8 SNPs were obtained through whole-genome resequencing and population genetic selection signal analysis, as follows:
[0073] A total of 576.43G of raw whole-genome resequencing data was generated from 395 large yellow croaker individuals, with an average of 13405.42M of raw data per sample. After filtering, a total of 574.60G of clean data remained, with an average of 13362.97M per sample. 99.48% of reads were mapped to the large yellow croaker reference genome, yielding a total of 78,370,338 raw SNPs. After filtering, 2,041,211 high-quality SNPs were obtained. Strong selection signals based on the fixation index (Fst) and the nucleotide diversity ratio (π ratio) were detected in the SNP dataset, reflecting significant genetic differences between wild and domesticated large yellow croaker populations (Figure 1). The specific screening criteria were the top 2% of Fst values and the top 5% of π ratios, resulting in 622 differential genes. Further, based on selection signal analysis, 139 candidate genes showing highly significant differences in allele frequencies between wild and domesticated large yellow croaker populations (P<0.001) were screened within the selection signal regions. Among these, 8 SNPs causing non-synonymous mutations in exons were identified and carried forward as candidate SNPs for validation.
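The window-level screening criterion can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used to obtain the results above: it assumes per-window Fst and π ratio values have already been computed, and it assumes a window is selected only when it is in the top 2% for Fst and also in the top 5% for the π ratio. The names top_fraction_cutoff and select_windows and the dictionary keys are hypothetical.

```python
def top_fraction_cutoff(values, fraction):
    """Return the smallest value that still belongs to the top `fraction`."""
    ranked = sorted(values, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[keep - 1]


def select_windows(windows, fst_top=0.02, pi_top=0.05):
    """Keep windows in the top `fst_top` of Fst AND the top `pi_top` of pi ratio.

    ASSUMPTION: the two criteria are combined by intersection.
    """
    fst_cut = top_fraction_cutoff([w["fst"] for w in windows], fst_top)
    pi_cut = top_fraction_cutoff([w["pi_ratio"] for w in windows], pi_top)
    return [
        w for w in windows
        if w["fst"] >= fst_cut and w["pi_ratio"] >= pi_cut
    ]


if __name__ == "__main__":
    # Toy data: 100 windows whose Fst and pi ratio both increase with the index.
    toy = [{"id": i, "fst": i / 100, "pi_ratio": float(i)} for i in range(100)]
    print([w["id"] for w in select_windows(toy)])
```

In practice the selected windows would then be intersected with gene annotations to obtain the candidate gene list.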
[0074] The specific genotypes of the 8 SNPs are shown in Table 1.
[0075] Table 1 Genotypes of the 8 SNPs
[0076]
[0077] For the above 8 SNPs, corresponding peripheral amplification primer sequences and extension primers were designed respectively. The primer sequences are shown in Table 2 and Table 3.
[0078] Table 2 Amplification primer sequences corresponding to the 8 SNPs
[0079]
[0080]
[0081] Table 3 Extension primer sequences corresponding to the 8 SNPs
[0082]
[0083] Note: R is a degenerate base representing G or A.
[0084] Example 2
[0085] A multi-site identification method for wild and domesticated large yellow croaker populations based on SNaPshot technology includes the following steps:
[0086] S1: Genomic DNA was extracted from the fin tissue of the large yellow croaker population using the phenol-chloroform extraction method;
[0087] S2: Construct a multiplex PCR reaction system for genotyping experiments. The multiplex PCR reaction system is shown in Table 4, and the reaction procedure is shown in Table 5.
[0088] Table 4 Multiplex PCR reaction system
[0089]
[0090] Table 5 Multiplex PCR reaction procedure
[0091]
[0092] The amplified products were digested. The digestion system is shown in Table 6. The digestion conditions were 37℃ for 60 min and 75℃ for 15 min.
[0093] Table 6 Digestion reaction system
[0094]
[0095] The digested product was subjected to single-base extension. The extension reaction system is shown in Table 7. The extension conditions were 96℃ for 10 s, 50℃ for 5 s, and 60℃ for 30 s, for 27 cycles.
[0096] Table 7 Extension reaction system
[0097]
[0098] The extension product was purified by adding 0.5 μL of CIP to 6 μL of the extension product; 37℃ for 1.0 h, then 75℃ for 15 min.
[0099] S3: 3730XL sequencer detection
[0100] 1) Add 9 μL of a mixture of molecular weight internal standard and formamide to each well of a 96-well plate, and 1 μL of product;
[0101] 2) After 3 minutes at 95℃, place in an ice bath and then detect using a 3730 sequencer;
[0102] 3) Data analysis: Import the raw data files obtained from the detection into the analysis software for analysis.
[0103] S4: Based on the SNaPshot validation results, the differences in allele frequencies of the above 8 SNPs between wild and domesticated large yellow croaker populations were analyzed (P<0.05).
[0104] S5: Construct a method for identifying wild / domesticated populations of large yellow croaker based on the above 8 SNPs.
[0105] Based on the genotypes in Table 1, alleles with a higher frequency in the wild population of large yellow croaker than in the domesticated population are scored as 1 point, and alleles with a lower frequency in the wild population than in the domesticated population are scored as 0 points. On this basis, a method for distinguishing between wild and domesticated large yellow croaker populations based on the above 8 SNP loci is constructed. Generally, if a method is constructed based on x loci (x≤8), the maximum total score of the discrimination method is 2x (number of loci × 2 alleles).
[0106] Scoring was performed on all test groups, and total score datasets for the wild and domesticated groups were constructed based on the total scores of the 8 SNPs. These datasets were then input into GraphPad Prism 8 to construct receiver operating characteristic (ROC) curves (an area under the ROC curve (AUC) greater than 0.9 indicates a feasible method). The cut-off value, i.e., the score at which the sum of sensitivity% and specificity% of the ROC curve is maximal, was set as the discrimination threshold. Groups with scores above the threshold were classified as wild large yellow croaker, while those with scores below the threshold were classified as domesticated large yellow croaker.
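The ROC analysis described above was carried out in GraphPad Prism 8. As a stand-in for illustration only, the following pure-Python sketch computes the same quantities from two lists of total scores: the AUC, and the cut-off that maximizes sensitivity + specificity, with scores strictly above the cut-off called wild. The function names and the toy scores are hypothetical and are not data from this description.

```python
def roc_points(wild_scores, dom_scores):
    """List (threshold, sensitivity, specificity); score > threshold means wild."""
    observed = sorted(set(wild_scores) | set(dom_scores))
    thresholds = [observed[0] - 1] + observed  # include a cut-off below all scores
    points = []
    for t in thresholds:
        sensitivity = sum(s > t for s in wild_scores) / len(wild_scores)
        specificity = sum(s <= t for s in dom_scores) / len(dom_scores)
        points.append((t, sensitivity, specificity))
    return points


def auc(wild_scores, dom_scores):
    """Probability that a wild score exceeds a domesticated score (ties count half)."""
    wins = 0.0
    for w in wild_scores:
        for d in dom_scores:
            if w > d:
                wins += 1.0
            elif w == d:
                wins += 0.5
    return wins / (len(wild_scores) * len(dom_scores))


def best_cutoff(wild_scores, dom_scores):
    """Return the first ROC point with the largest sensitivity + specificity."""
    return max(roc_points(wild_scores, dom_scores), key=lambda p: p[1] + p[2])


if __name__ == "__main__":
    wild = [10, 12, 14, 9]   # toy total scores, not real data
    dom = [4, 6, 9, 5]
    print(auc(wild, dom), best_cutoff(wild, dom))
```

When several cut-offs tie on sensitivity + specificity, this sketch returns the lowest one; other tools may break ties differently.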
[0107] Example 3
[0108] The multi-site identification method for wild and domesticated large yellow croaker populations based on SNaPshot technology, as described in Example 2, was used to identify known wild and domesticated large yellow croaker, with 120 wild and 120 domesticated individuals.
[0109] Electrophoresis images of PCR reaction products from some large yellow croaker samples are shown in Figure 2. Partial SNaPshot peak diagrams obtained from SNaPshot analysis of the above large yellow croaker are shown in Figure 3. The specific genotype analysis results are shown in Table 8. The ROC curves of the discrimination methods based on the above SNPs are shown in Figure 4. With the combination of all 8 SNPs (Figure 4, A), the area under the ROC curve (AUC) was 0.9469, the sensitivity at the optimal threshold was 93.3% (i.e., the probability of a wild large yellow croaker being identified by this evaluation method was 93.3%), and the specificity was 89.2% (i.e., the probability of a domesticated large yellow croaker being excluded by this evaluation method was 89.2%). An AUC greater than 0.9 indicates that the method is feasible.
[0110] Furthermore, when the number of SNPs was reduced to 7 (e.g., a combination of H1, H2, H3, H4, H5, H6, and H7, or a combination of H1, H3, H4, H5, H6, H7, and H8), 6 (e.g., a combination of H1, H3, H4, H5, H6, and H7), 4 (e.g., a combination of H1, H3, H4, and H7), and 3 (e.g., a combination of H1, H3, and H4), the steps of Example 2 were used to construct the discrimination methods. Validation showed that the AUC of each discrimination method was greater than 0.9, with a sensitivity of over 90% and a specificity of over 70% (Figure 4, B to F). This indicates that the above methods can all be applied to distinguish between wild and domesticated large yellow croaker.
[0111] Table 8. Genotypic analysis results of wild and domesticated large yellow croaker populations.
[0112]
[0113] Example 4
[0114] To further verify the universality of the multi-site identification method for wild and domesticated large yellow croaker populations based on SNaPshot technology in Example 2, an additional 30 large yellow croaker populations of known wild or domesticated origin were selected for a second round of screening. Based on the SNaPshot detection results for the 30 populations, 12 were identified as wild large yellow croaker and 18 as domesticated large yellow croaker, consistent with their known origins. This indicates that the combined identification method using 8 SNPs is universally applicable for distinguishing wild and domesticated large yellow croaker populations.
[0115] The embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the above embodiments, and various changes can be made within the scope of knowledge possessed by those skilled in the art without departing from the spirit of the present invention. Furthermore, the embodiments of the present invention and the features thereof can be combined with each other unless otherwise specified.
Claims
1. Application of an SNP molecular marker combination in distinguishing wild and domesticated populations of large yellow croaker, wherein:
the SNP molecular marker combination is SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958, and SNP15-22344715; or
the SNP molecular marker combination is SNP1-30187747, SNP1-30206146, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, and SNP13-43228958; or
the SNP molecular marker combination is SNP1-30187747, SNP1-30225064, SNP13-42432874, SNP13-43081647, SNP13-43107208, SNP13-43228958, and SNP15-22344715; or
the SNP molecular marker combination is SNP1-30187747, SNP1-30225064, SNP13-42432874, SNP13-43081647, and SNP13-43228958; or
the SNP molecular marker combination is SNP1-30187747, SNP1-30225064, SNP13-42432874, and SNP13-43228958; or
the SNP molecular marker combination is SNP1-30187747, SNP1-30225064, and SNP13-42432874;
wherein SNP1-30187747, SNP1-30206146, and SNP1-30225064 are located at positions 30187747, 30206146, and 30225064 on chromosome 1 of large yellow croaker, respectively, with polymorphisms of A / G, A / G, and T / C, respectively;
SNP13-42432874, SNP13-43081647, SNP13-43107208, and SNP13-43228958 are located at positions 42432874, 43081647, 43107208, and 43228958 on chromosome 13 of large yellow croaker, respectively, with polymorphisms of G / A, C / T, G / A, and G / A, respectively; and
SNP15-22344715 is located at position 22344715 on chromosome 15 of large yellow croaker NC_040023.1, and its polymorphism is G / A.
2. A primer set, characterized in that the nucleotide sequences of the primer set are shown in SEQ ID NO:1 to SEQ ID NO:16.
3. The primer set according to claim 2, characterized in that SEQ ID NO:1 to SEQ ID NO:16, taken in sequence, form primer pairs, each pair consisting of two nucleic acid sequences.
4. A kit comprising the primer set as described in claim 2 or 3.
5. The use of the primer set of claim 2 or 3 or the kit of claim 4 in the identification of wild and / or domesticated populations of large yellow croaker.
6. A method for identifying wild and / or domesticated populations of large yellow croaker, comprising the step of using the primer set of claim 2 or 3 or the kit of claim 4 to detect the SNP molecular marker combination of claim 1 in the population of large yellow croaker to be tested.