Tetraploid Pacific oyster 40K breeding liquid phase chip, and preparation method and application thereof
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- LUDONG UNIVERSITY
- Filing Date
- 2025-11-17
- Publication Date
- 2026-06-26
Figure CN121496065B_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of gene chips, and specifically relates to a 40K breeding liquid phase chip for tetraploid Pacific oysters, its preparation method, and its application.
Background Technology
[0002] The Pacific oyster (Crassostrea gigas) is the world's largest-scale and highest-yielding farmed marine shellfish. It offers high nutritional value, rapid growth, and ease of cultivation, and therefore occupies an important position in China's marine aquaculture industry. Compared to diploid oysters, triploid Pacific oysters have even higher nutritional value, faster growth rates, and their poor fertility ensures long-term high gonadal maturity, allowing them to meet market demand year-round. Tetraploid oysters are the core germplasm resource for producing triploid Pacific oysters; crossing them with diploid oysters reliably yields triploid Pacific oysters. Methods for obtaining tetraploid Pacific oysters have evolved from drug induction to propagation within self-sustaining tetraploid populations, but challenges remain, including high breeding costs, long cycles, low efficiency, and difficulty in preserving superior traits. Therefore, developing economical and efficient breeding and preservation techniques for superior tetraploid Pacific oyster varieties is of great significance for promoting the development of the Pacific oyster industry.
[0003] Molecular markers are a core technology in molecular breeding, identifying genetic polymorphisms in individuals by marking variations in nucleotide sequences. With the development of molecular marker technology, molecular markers such as restriction fragment length polymorphism (RFLP), random amplified polymorphic DNA (RAPD), and simple sequence repeats (SSR) have been proposed and used, driving the development of the field of molecular genetic breeding. However, these technologies are complex to operate and have long sequencing cycles, making it difficult to support the needs of large-scale molecular genetic breeding. Single nucleotide polymorphisms (SNPs) are a new generation of molecular marker technology that can mark variations in a single nucleotide in the genome. SNPs are widely distributed, have high genome coverage, high throughput, high accuracy, and are easy to analyze, playing an important role in genetic breeding techniques such as population genetic structure analysis, genome selection analysis, and germplasm identification. Currently, SNP-based genetic analysis and selection are widely used in marine organisms and have shown good results.
[0004] Currently, common methods for obtaining SNPs include whole-genome resequencing, 2b-RAD sequencing, and reduced-representation genome sequencing. These sequencing technologies can identify SNPs across the entire genome and uncover loci that can be used for genetic breeding. However, although they identify a large number of loci, they are costly and have relatively low analysis efficiency. SNP microarrays have therefore gained widespread attention and use due to their high analysis efficiency, high depth, and low cost. SNP liquid-phase microarrays based on genotyping by target sequencing (GBTS) achieve high-depth sequencing of specific loci by reducing library complexity. Probes complementary to the sequence around each target locus are designed, and these probes are used to capture the specific fragments for sequencing, thus enabling detection of the loci within those fragments. Liquid-phase microarrays significantly reduce sequencing costs, increase sequencing depth and efficiency at specific loci, impose no requirement on the number of loci, and allow loci to be added or removed after development. These advantages have given GBTS-based liquid-phase microarrays a crucial role in the breeding and genetic analysis of numerous plants and animals.
[0005] Currently, GBTS-based SNP liquid-phase chips have been developed and applied in a large number of fish and a small number of marine mollusks, promoting the development of marine animal genetic breeding. However, no liquid-phase chip has been developed specifically for the breeding of tetraploid Pacific oysters. The development of a 40K breeding liquid-phase chip for tetraploid Pacific oysters will significantly reduce the cost of genotyping in tetraploid Pacific oysters, improve the accuracy and efficiency of genotyping, and provide technical support for the breeding of tetraploid Pacific oysters and the development of oyster farming.
Summary of the Invention
[0006] To address the shortcomings of the existing technologies, this invention provides a 40K breeding liquid phase chip for tetraploid Pacific oysters based on targeted capture sequencing, its preparation method, and its applications. The breeding liquid phase chip of this invention can effectively reduce the cost of genotyping and breeding of tetraploid Pacific oysters, improve breeding accuracy and efficiency, and promote the development of genomic selection for tetraploid Pacific oysters.
[0007] The specific technical solution is as follows:
[0008] One of the objectives of this invention is to provide a 40K breeding liquid phase chip for tetraploid Pacific oysters, which includes probes for detecting 41,270 SNP sites; the location information of the 41,270 SNP sites is determined by sequence alignment based on a reference version of the Pacific oyster genome, which is GCA_011032805.1.
[0009] Specifically, the information on the 41,270 SNP loci is shown in Tables 1 to 10 of the specific implementation. Tables 1 to 10 are, respectively, tables of SNP loci information on chromosomes CHR01 to CHR10.
[0010] Specifically, each SNP site corresponds to two single-stranded nucleotide probes for specific detection of that site, which are used to target and capture that site in genotyping, genome selection, or genome-wide association studies of tetraploid oysters.
[0011] Specifically, the probe preferably has a biotin group modified at its 5' end.
[0012] Specifically, the 41,270 SNP loci include 7,904 core SNP loci and 33,366 associated SNP loci.
[0013] The second objective of this invention is to provide a method for preparing the above-mentioned tetraploid oyster 40K breeding liquid phase chip, which includes the following steps:
[0014] S1. Extract DNA from tetraploid oysters and resequence it;
[0015] S2. Align the resequencing results with the reference version of the Pacific oyster genome and perform genotyping;
[0016] S3. Filter to obtain the background SNP site set, and use the MOLO method to filter out the core SNP sites from the background SNP site set;
[0017] S4. Screen associated SNP sites from the background SNP site set based on their association with shell length, shell width, shell height, body weight and soft body weight of tetraploid Pacific oysters;
[0018] S5. Design probes for detecting core SNP sites and associated SNP sites.
[0019] Furthermore, in step S3, the screening criteria for the background SNP locus set are: biallelic SNPs; MAF ≥ 0.05; missing rate = 0; and QUAL ≥ 600.
[0020] Furthermore, in step S3, the screening criteria for core SNP sites are: biallelic SNPs; MAF ≥ 0.05; missing rate = 0; QUAL ≥ 800; and uniform distribution on the chromosome.
[0021] Furthermore, in step S3: when screening the background SNP site set, Indels are filtered out.
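The step S3 screening criteria can be sketched in Python as follows. This is only an illustration under stated assumptions, not the pipeline used in the invention: the `Variant` record, its field names, and the helper functions are hypothetical, and the MAF and missing rate are assumed to have been computed beforehand. The uniform-distribution requirement for core sites is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    chrom: str
    pos: int
    ref: str
    alts: tuple          # alternate alleles, e.g. ("T",)
    qual: float          # QUAL value reported by the variant caller
    maf: float           # minor allele frequency, precomputed
    missing_rate: float  # fraction of samples without a genotype call


def is_biallelic_snp(v):
    """Single-base REF with exactly one single-base ALT (this also excludes Indels)."""
    return len(v.ref) == 1 and len(v.alts) == 1 and len(v.alts[0]) == 1


def passes(v, min_qual):
    """Apply the shared criteria: biallelic SNP, MAF >= 0.05, missing rate = 0, QUAL >= min_qual."""
    return (is_biallelic_snp(v) and v.maf >= 0.05
            and v.missing_rate == 0 and v.qual >= min_qual)


def background_set(variants):
    """Background SNP site set: QUAL >= 600."""
    return [v for v in variants if passes(v, 600)]


def core_candidates(variants):
    """Candidates for core SNP sites: QUAL >= 800."""
    return [v for v in variants if passes(v, 800)]
```

Because the two sets differ only in the QUAL cutoff, every core candidate is also a member of the background set.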
[0022] Furthermore, in step S3: using the MOLO method, based on maximizing the average Shannon information entropy of the site, core SNP sites evenly distributed on the chromosome are screened out. In a specific embodiment of the present invention, 10,000 core SNP sites are obtained through screening.
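For a biallelic site with allele frequencies p and 1 - p, the Shannon information entropy is H = -p·log2(p) - (1 - p)·log2(1 - p), which reaches its maximum of 1 bit at p = 0.5. The sketch below shows how a per-site entropy and its average over a selected set could be computed. It does not reproduce the MOLO algorithm; the use of allele frequencies and of a base-2 logarithm is an assumption, since the text does not specify either, and the function names are hypothetical.

```python
import math


def site_entropy(p):
    """Shannon entropy (in bits) of a biallelic site with alternate-allele frequency p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a monomorphic site carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def mean_entropy(freqs):
    """Average entropy over a set of sites, the quantity a selection would try to maximize."""
    return sum(site_entropy(p) for p in freqs) / len(freqs)
```

Under this definition, sites with allele frequencies closer to 0.5 contribute more to the average than sites near the MAF cutoff of 0.05.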
[0023] Furthermore, in step S4: when screening associated SNP sites, SNP sites that are evenly distributed on the chromosome are selected.
[0024] Furthermore, in step S4: GWASpoly software is used to perform genome-wide association analysis to screen, from the background SNP locus set, for loci highly associated with shell length, shell width, shell height, body weight, and soft body weight of tetraploid Pacific oysters. In a specific embodiment of the present invention, the screened associated SNP loci are merged with the core SNP loci (duplicates are removed), resulting in a total of 52,320 high-quality SNP loci.
[0025] Furthermore, in step S5, the probe design criteria are: GC content between 30% and 70%; and the number of homologous regions ≤ 5. More specifically, in a specific embodiment of the present invention, the probe length is 110 bp. In a specific embodiment of the present invention, 41,270 SNP sites suitable for chip development were ultimately screened.
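The probe design criteria in step S5 can be checked with a short Python sketch such as the one below. The function names are hypothetical, and the number of homologous regions is assumed to come from an external alignment of the probe against the reference genome, which is not shown here.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def probe_ok(seq, homologous_regions, length=110):
    """Check probe length, GC content of 30% to 70%, and at most 5 homologous regions."""
    return (len(seq) == length
            and 0.30 <= gc_fraction(seq) <= 0.70
            and homologous_regions <= 5)
```

A site whose candidate probes fail these checks would be dropped, which is consistent with the reduction from 52,320 merged sites to 41,270 sites on the chip.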
[0026] A third objective of this invention is to provide the application of the above-mentioned tetraploid oyster 40K breeding liquid phase chip in the genotyping of tetraploid oysters. The liquid phase chip exhibits high individual detection rate and high site polymorphism, indicating good quality and enabling high-quality and efficient genotyping.
[0027] Specifically, the above-mentioned tetraploid Pacific oyster 40K breeding liquid phase chip was used for genotyping of tetraploid Pacific oysters, and the steps are as follows:
[0028] S1. Extract DNA from tetraploid oysters and construct a high-throughput sequencing library;
[0029] S2. DNA fragments containing target sites in a library were captured using a 40K breeding liquid phase chip for tetraploid Pacific oysters, and then amplified and purified.
[0030] S3. Perform high-throughput sequencing and align the results to the reference genome of Pacific oyster (reference version GCA_011032805.1) to obtain the genotype of the tetraploid Pacific oyster to be tested at the target site.
[0031] The fourth objective of this invention is to provide the application of the aforementioned 40K breeding liquid phase chip for tetraploid Pacific oysters in genomic selection. Using the SNP sites in this liquid phase chip, relatively accurate breeding value estimation can be performed, and the predictions are stable across various breeding models, demonstrating that this breeding liquid phase chip can be applied to the molecular breeding of tetraploid Pacific oysters.
[0032] The fifth objective of this invention is to provide the application of the aforementioned 40K breeding liquid-phase chip for tetraploid Pacific oysters in genome-wide association analysis (GWAS) of tetraploid Pacific oysters. This liquid-phase chip can be used to discover SNPs that are significantly associated with growth traits of tetraploid Pacific oysters, playing an important role in improving liquid-phase chip technology and breeding high-quality oysters.
[0033] Compared with the prior art, the present invention has the following beneficial effects:
[0034] This invention is the first to develop a liquid-phase microarray for breeding tetraploid Pacific oysters using targeted capture sequencing. The microarray includes core SNP sites and associated SNP sites highly correlated with shell length, shell width, shell height, body weight, and soft body weight in tetraploid Pacific oysters. This liquid-phase microarray exhibits high sequencing quality and high genotyping accuracy, overcoming the problem of poor genotyping results in tetraploid Pacific oysters when sequencing quality is low. The liquid-phase microarray uses strict criteria to screen SNP sites suitable for microarray development. During probe design, GC content is controlled between 30% and 70%, and the number of homologous regions is limited to no more than five, significantly improving the hybridization stability of the sequences containing the sites and the capture rate of DNA fragments captured by the microarray.
[0035] This liquid-phase microarray, developed based on targeted capture technology, allows for the addition of new loci as needed, offering high flexibility. At the same time, low-abundance genotyping significantly reduces sequencing costs, providing technical support for large-scale sequencing and genotyping of tetraploid oysters. The loci in this liquid-phase microarray performed well in estimating breeding values for growth traits in tetraploid Pacific oysters, demonstrating high and stable accuracy, and can be used for screening high-quality individuals and constructing superior populations. This liquid-phase microarray can also be used for genome-wide association analysis of tetraploid Pacific oysters, identifying loci significantly associated with traits, promoting the development of new varieties, and further improving the accuracy of breeding.
Brief Description of the Drawings
[0036] Figure 1 is a distribution diagram of the SNP sites on each chromosome for the tetraploid Pacific oyster 40K breeding liquid phase chip in Example 1;
[0037] Figure 2 is a statistical chart of the genotyping detection rate of tetraploid Pacific oysters in Example 2;
[0038] Figure 3 is a statistical chart of the genotyping MAF of tetraploid Pacific oysters in Example 2;
[0039] Figure 4 is a statistical chart of the accuracy of genomic selection for body weight in tetraploid Pacific oysters in Example 3;
[0040] Figure 5 is a Manhattan plot of the genome-wide association analysis of body weight in tetraploid Pacific oysters in Example 4.
Detailed Implementation
[0041] The embodiments of the present invention will be described in further detail below with reference to the accompanying drawings and examples. The following examples are for illustrative purposes only and should not be construed as limiting the scope of the invention.
[0042] Example 1
[0043] The preparation of a liquid-phase microarray for tetraploid Pacific oyster 40K breeding, including the screening of breeding sites and probe preparation for the tetraploid Pacific oyster 40K breeding liquid-phase microarray, is as follows:
[0044] S1. Three hundred 1.5-year-old tetraploid adult Pacific oysters were randomly selected from Kongtong Island Industrial Co., Ltd. DNA was extracted and a resequencing library of tetraploid Pacific oysters was constructed using the GenoBaits® DNA Library Prep Kit for ILM. The tetraploid Pacific oysters were resequenced on the BGI-2000 sequencing platform in PE150 mode, with an average sequencing depth of 30×.
[0045] S2. Use Fastp to filter the obtained raw reads into clean reads, then use BWA-mem and Samtools to align the clean reads to the reference genome of Pacific oyster GCA_011032805.1, and use GATK for genotyping.
[0046] S3. Filter out Indels and any SNPs that are not biallelic or that have a MAF < 0.05, a missing rate > 0, or a QUAL < 600, to obtain the background SNP locus set. Using the MOLO method, based on maximizing the average Shannon information entropy of the loci, select from the background SNP locus set 10,000 core loci that are biallelic, have a MAF ≥ 0.05, a missing rate = 0, and a QUAL ≥ 800, and are evenly distributed on the chromosomes.
[0047] S4. Genome-wide association analysis was performed using GWASpoly software. Loci highly associated with shell length, shell width, shell height, body weight, and soft body weight of tetraploid Pacific oysters were screened from the background SNP locus set. SNP loci evenly distributed on chromosomes were further screened to obtain associated SNP loci. After merging associated SNP loci with core SNP loci (removing duplicates), a total of 52,320 high-quality SNP loci were obtained.
[0048] S5. Probes were designed targeting core SNP sites and associated SNP sites (merged into 52,320 high-quality SNP sites). Probes were designed based on GC content between 30% and 70% and the number of homologous regions ≤ 5. A total of 41,270 SNP sites were finally selected, and their distribution is shown in Figure 1.
[0049] Information on 41,270 SNP loci is shown in Tables 1 to 10. Tables 1 to 10 are the SNP loci information tables on chromosomes CHR01 to CHR10, respectively.
[0050] Table 1 shows the SNP information for chromosome CHR01. The site information in Table 1 represents: site located in CHR01 / reference base / mutant base.
[0051] Table 1 CHR01 site information (site located in CHR01 / reference base / mutant base)
[0052]
[0053]
[0054]
[0055]
[0056]
[0057]
[0058] Table 2 shows the SNP information for chromosome CHR02. The site information in Table 2 represents: site located on CHR02 / reference base / mutant base.
[0059] Table 2 CHR02 site information (site at CHR02 / reference base / mutant base)
[0060]
[0061]
[0062]
[0063]
[0064]
[0065]
[0066]
[0067]
[0068]
[0069]
[0070]
[0071] Table 3 shows the SNP information for chromosome CHR03. The site information in Table 3 represents: site located in CHR03 / reference base / mutant base.
[0072] Table 3 CHR03 site information (site at CHR03 / reference base / mutant base)
[0073]
[0074]
[0075]
[0076]
[0077]
[0078]
[0079]
[0080]
[0081] Table 4 shows the SNP information for chromosome CHR04. The site information in Table 4 represents: site located at CHR04 / reference base / mutant base.
[0082] Table 4 CHR04 site information (site at CHR04 / reference base / mutant base)
[0083]
[0084]
[0085]
[0086]
[0087]
[0088]
[0089]
[0090]
[0091] Table 5 shows the SNP information for chromosome CHR05. The site information in Table 5 represents: site located at CHR05 / reference base / mutant base.
[0092] Table 5 CHR05 site information (site at CHR05 / reference base / mutant base)
[0093]
[0094]
[0095]
[0096]
[0097]
[0098]
[0099]
[0100]
[0101]
[0102]
[0103]
[0104]
[0105]
[0106]
[0107] Table 6 shows the SNP information for chromosome CHR06. The site information in Table 6 represents: site located at CHR06 / reference base / mutant base.
[0108] Table 6 CHR06 site information (site at CHR06 / reference base / mutant base)
[0109]
[0110]
[0111]
[0112]
[0113]
[0114]
[0115]
[0116]
[0117]
[0118] Table 7 shows the SNP information for chromosome CHR07. The site information in Table 7 represents: site located on CHR07 / reference base / mutant base.
[0119] Table 7 CHR07 site information (site at CHR07 / reference base / mutant base)
[0120]
[0121]
[0122]
[0123]
[0124]
[0125]
[0126]
[0127]
[0128]
[0129]
[0130] Table 8 shows the SNP information for chromosome CHR08. The site information in Table 8 represents: site located on CHR08 / reference base / mutant base.
[0131] Table 8 CHR08 site information (site at CHR08 / reference base / mutant base)
[0132]
[0133]
[0134]
[0135]
[0136]
[0137]
[0138]
[0139]
[0140]
[0141]
[0142] Table 9 shows the SNP information for chromosome CHR09. The site information in Table 9 represents: site located on CHR09 / reference base / mutant base.
[0143] Table 9 CHR09 site information (site at CHR09 / reference base / mutant base)
[0144]
[0145]
[0146]
[0147]
[0148]
[0149]
[0150]
[0151]
[0152]
[0153]
[0154] Table 10 shows the SNP information located on chromosome CHR10. The site information in Table 10 represents: site located on CHR10 / reference base / mutant base.
[0155] Table 10 CHR10 site information (sites in CHR10 / reference bases / mutant bases)
[0156]
[0157]
[0158]
[0159]
[0160]
[0161]
[0162] For each screened probe site, two nucleotide sequences with 60% to 70% overlap, each covering the probe site, were designed. Subsequently, two 110 bp single-stranded nucleotide probes with a biotin group modified at the 5' end were synthesized for each site. The obtained probes were mixed in equimolar amounts and then diluted with EDTA and Tris-HCl solution to a concentration of 3 pmol/mL to give the tetraploid Pacific oyster 40K probe mixture, which is the tetraploid Pacific oyster 40K breeding liquid phase chip.
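One way to place two overlapping probes around a site is sketched below in Python. The symmetric placement (overlap region centered on the site), the default overlap of 72 bp (about 65% of 110 bp), and the function name are assumptions for illustration; the text specifies only the probe length, the 60% to 70% overlap, and that both sequences cover the site. Clipping at chromosome ends is not handled.

```python
def probe_pair(site, length=110, overlap=72):
    """Return two 1-based inclusive (start, end) intervals that both cover `site`."""
    frac = overlap / length
    if not 0.60 <= frac <= 0.70:
        raise ValueError("overlap must be 60% to 70% of the probe length")
    # Center the shared region on the site so that both probes contain it.
    ov_start = site - overlap // 2
    ov_end = ov_start + overlap - 1
    first = (ov_end - length + 1, ov_end)       # extends upstream of the shared region
    second = (ov_start, ov_start + length - 1)  # extends downstream of the shared region
    return first, second
```

With these defaults the two probes together span 2 × 110 - 72 = 148 bp around the site.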
[0163] Example 2
[0164] The tetraploid Pacific oyster 40K breeding liquid phase chip prepared in Example 1 was used for genotyping of tetraploid Pacific oysters, and the steps are as follows:
[0165] S1. 408 tetraploid adult Pacific oysters were collected from Kongtong Island Industrial Co., Ltd., DNA was extracted, and an initial high-throughput sequencing library was constructed;
[0166] S2. Following the standard procedure of targeted capture technology, the DNA fragments containing the target site in the library were captured using the tetraploid oyster 40K breeding liquid phase chip. The captured library was then amplified and purified by PCR to obtain the final library for sequencing.
[0167] S3. The library obtained in step S2 was sequenced on the BGI-2000 sequencing platform. The raw reads were quality-controlled with Fastp to obtain clean reads. BWA-mem and Samtools were then used to align the clean reads to the Pacific oyster reference genome GCA_011032805.1. Finally, Sentieon was used for genotyping to obtain the genotypes at the 41,270 target loci, and the detection rate of each individual and the MAF of each locus were calculated.
[0168] The detection rates of the 408 individuals are shown in Figure 2. As shown in Figure 2, the detection rate ranged from 95.93% to 98.79%; 66.91% of the individuals (273) had a detection rate between 97% and 98%, and only 0.49% (2 individuals) had a detection rate below 96%. The MAF of the loci is shown in Figure 3. As shown in Figure 3, 92.87% of the loci (38,327) had a MAF higher than 0.05, mainly distributed between 0.3 and 0.5. In summary, this liquid-phase microarray exhibits a high individual detection rate and high locus polymorphism, indicating good quality and the ability to perform high-quality and efficient genotyping.
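The two statistics reported above can be computed from a genotype table as in the following Python sketch. It assumes, for illustration only, that each genotype is stored as the count of the alternate allele (0 to 4 in a tetraploid) and that a missing call is stored as `None`; the function names are hypothetical and the actual output format of the genotyping software may differ.

```python
PLOIDY = 4  # tetraploid: each genotype carries four allele copies


def detection_rate(genotypes):
    """Fraction of loci with a genotype call for one individual (None = no call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(dosages, ploidy=PLOIDY):
    """MAF at one locus from alternate-allele dosages across individuals."""
    called = [d for d in dosages if d is not None]
    if not called:
        return None  # no individual was genotyped at this locus
    p = sum(called) / (ploidy * len(called))
    return min(p, 1.0 - p)
```

For example, a detection rate of 97% for one individual would correspond to roughly 40,000 of the 41,270 target loci receiving a genotype call.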
[0169] Example 3
[0170] The tetraploid Pacific oyster 40K breeding liquid phase chip prepared in Example 1 was used for the genome selection of tetraploid Pacific oysters.
[0171] To further evaluate the value of this 40K breeding liquid phase chip in actual tetraploid oyster breeding, it was applied to the estimation of breeding values for body weight, an important growth trait of tetraploid oysters.
[0172] The 41,270 loci obtained from genotyping in Example 2 were filtered to remove Indels, multiallelic sites, SNPs with MAF < 0.05, and SNPs with a locus missing rate or individual missing rate higher than 0.1%, leaving 37,760 SNP loci for analysis. After calculating heritability and constructing the G matrix, the accuracy of breeding value estimation for tetraploid oyster body weight was calculated using the GBLUP, BayesA, BayesB, BayesC, Bayes Lasso, and RKHS models, respectively. The results are shown in Figure 4. As shown in Figure 4, the accuracy of the six breeding models ranged from 57.67% to 58.25%. The results indicate that the SNP sites in this liquid-phase chip can be used for relatively accurate breeding value estimation, and that the predictions are stable across all breeding models, showing that this breeding liquid-phase chip can be applied to the molecular breeding of tetraploid oysters.
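The text does not state how the G matrix was constructed. As one possible illustration, the Python sketch below builds an additive genomic relationship matrix using a VanRaden-type formula generalized to ploidy 4: dosages are centered by ploidy × allele frequency and the cross-products are scaled by ploidy × Σ p(1 - p). The formula choice, the function name, and the assumption of a complete dosage table without missing calls are all assumptions, not details taken from the invention.

```python
def g_matrix(dosages, ploidy=4):
    """Additive genomic relationship matrix from alternate-allele dosages.

    dosages: one list per individual, each holding counts in 0..ploidy, no missing values.
    """
    n = len(dosages)
    m = len(dosages[0])
    # Alternate-allele frequency at each locus.
    freqs = [sum(ind[j] for ind in dosages) / (ploidy * n) for j in range(m)]
    # Center each dosage by its expected value, ploidy * p.
    z = [[ind[j] - ploidy * freqs[j] for j in range(m)] for ind in dosages]
    denom = ploidy * sum(p * (1.0 - p) for p in freqs)
    if denom == 0:
        raise ValueError("all loci are monomorphic")
    return [[sum(a * b for a, b in zip(z[i], z[k])) / denom for k in range(n)]
            for i in range(n)]
```

A matrix of this kind is the usual input to GBLUP; the Bayesian and RKHS models listed above work from the marker data in other ways.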
[0173] Example 4
[0174] The tetraploid Pacific oyster 40K breeding liquid phase chip prepared in Example 1 was used for genome-wide association analysis of tetraploid Pacific oyster.
[0175] To further evaluate the value of this 40K breeding liquid phase chip in actual tetraploid oyster breeding, it was applied to a genome-wide association analysis of body weight in tetraploid oysters.
[0176] For the 37,760 SNP loci obtained in Example 3, GWASpoly software was used to identify SNP loci significantly associated with body weight in tetraploid oysters. As shown in Figure 5, 14 significantly associated SNPs were identified using the Bonferroni method, and 14 SNPs related to body weight were also identified at the suggestive threshold. These results indicate that this liquid-phase microarray can be used to mine SNPs significantly associated with growth traits in tetraploid oysters, playing an important role in improving liquid-phase microarrays and breeding high-quality oysters.
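A Bonferroni threshold divides the chosen significance level by the number of tests. The Python sketch below computes the corresponding cutoff on the -log10(p) scale used in a Manhattan plot. The significance level of 0.05 and the use of the 37,760 loci as the number of tests are assumptions, since the text does not state them; the function names are hypothetical, and the software may count tests differently when several genetic models are fitted.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Bonferroni cutoff expressed as -log10(p) for plotting on a Manhattan plot."""
    return -math.log10(alpha / n_tests)


def significant(p_values, n_tests, alpha=0.05):
    """Indices of tests whose p-value falls below the Bonferroni-corrected cutoff."""
    cutoff = alpha / n_tests
    return [i for i, p in enumerate(p_values) if p < cutoff]
```

Under these assumptions, `bonferroni_threshold(37760)` gives the height of the significance line for the loci analyzed in this example.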
[0177] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A 40K breeding liquid phase chip for tetraploid Pacific oysters, characterized in that it includes probes for detecting 41,270 SNP sites; the location information of the 41,270 SNP sites is determined by sequence alignment based on a reference version of the Pacific oyster genome, which is GCA_011032805.1; information on the 41,270 SNP sites is shown in Tables 1 to 10.
2. The tetraploid oyster 40K breeding liquid phase chip according to claim 1, characterized in that each SNP site corresponds to two single-stranded nucleotide probes for specific detection of that site.
3. The application of the tetraploid oyster 40K breeding liquid phase chip as described in claim 1 or 2 in the genotyping of tetraploid oysters.
4. The application of the tetraploid Pacific oyster 40K breeding liquid phase chip as described in claim 1 or 2 in the genome selection of tetraploid Pacific oysters.
5. The application of the tetraploid Pacific oyster 40K breeding liquid phase chip as described in claim 1 or 2 in the whole genome association analysis of tetraploid Pacific oyster.