A brine shrimp (Artemia) 5K liquid-phase chip based on targeted capture sequences, and a preparation method and application thereof
By designing a 5K liquid-phase chip for Artemia based on targeted capture sequences, the problems of insufficient marker quantity and low identification accuracy in the development of SNP chips for Artemia were solved, enabling efficient and low-cost genotyping and population genetics research of Artemia.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- TIANJIN UNIV OF SCI & TECH
- Filing Date
- 2026-05-29
- Publication Date
- 2026-06-30
AI Technical Summary
Existing SNP chip development for Artemia spp. suffers from problems such as a limited number of SNP markers, low identification accuracy, high cost, and high sample detection requirements. Furthermore, the Artemia genus is quite complex, which makes the development and application of liquid-phase chips difficult.
A 5K liquid-phase chip for Artemia based on targeted capture sequences was designed, containing 4329 SNP sites. Using double-stranded DNA probes, the selection of SNP sites with uniform distribution across the genome and the synthesis of probes were achieved through screening and nucleotide sequence design, combined with whole-genome sequencing data from 25 Artemia populations.
It increases the number of SNP markers, improves identification accuracy, reduces detection costs, and supports flexible addition of marker sites, making it suitable for large-scale Artemia genotyping and population genetics studies.
Abstract
Description
Technical Field
[0001] This invention belongs to the fields of molecular biology, germplasm resources and molecular breeding technology, and in particular relates to a 5K liquid-phase chip for Artemia based on targeted capture sequences, its preparation method and its application.
Background Technology
[0002] SNP liquid-phase microarray technology addresses single nucleotide polymorphisms (SNPs), which are polymorphisms in nucleic acid sequences caused by changes in a single nucleotide base. SNPs are the most common type of polymorphism in animal and plant genomes, characterized by their large number, wide distribution, genetic stability, and high coverage density, making them widely applicable in biology, medicine, and genetics. With the development of high-throughput sequencing technology, the genomes of more and more species are now being sequenced. By developing SNP markers based on the genome and designing microarrays, and hybridizing randomly fragmented genomic DNA fragments with oligonucleotide probes on the microarray, the SNP genotype at the corresponding loci can be determined.
[0003] Currently, there are no SNP chips developed and used for Artemia salina. The published invention patent "CN117821611A, A KASP-based Artemia core molecular marker combination and its application" reports a fingerprint developed from 13 Artemia SNP molecular markers, which can be used for analyzing the phylogenetic relationships of Artemia resources. However, fingerprints developed from so few SNP molecular markers have relatively low accuracy in identifying populations. Furthermore, fingerprint typing is costly and requires a large sample volume. In addition, the Artemia genus is quite complex, generally considered to contain six sexually reproducing species and multiple parthenogenetic Artemia populations, which creates significant difficulties for SNP marker development and liquid-phase chip development, further limiting their development and application.
[0004] Genotyping by Target Sequencing (GBTS) is a technique that achieves deep resequencing of only target loci by reducing library abundance. It significantly reduces the cost of genotyping, and SNP liquid-phase chips based on this technology contain more molecular markers, resulting in high population detection accuracy. New loci can be added later, allowing for testing with a small sample size.
Summary of the Invention
[0005] The purpose of this invention is to overcome the shortcomings of the prior art and provide an Artemia 5K liquid-phase chip based on targeted capture sequences, its preparation method, and its application.
[0006] The technical solution adopted by this invention to solve its technical problem is: A 5K liquid-phase chip for Artemia salina based on targeted capture sequences, the liquid-phase chip comprising 4329 SNP sites, the positions of the 4329 SNP molecular markers are shown in Table 1.
[0007] Furthermore, the Artemia 5K liquid phase chip consists of an independently packaged Artemia 5K probe mixture and a hybridization capture reagent. The probes are DNA double-stranded probes, and the nucleotide sequences are designed and synthesized based on the selected SNP sites. The SNP sites are selected by comparing the whole genome sequencing results of Artemia to the Artemia reference genome.
[0008] The method for preparing the Artemia 5K liquid-phase chip as described above includes the following steps: Site selection and liquid-phase chip construction: Background sites were selected following the principle of uniform distribution across the genome. Based on reduced-representation genome sequencing data of 250 individuals from 25 Artemia populations, background SNP sites were added to the chip. The VCF files from resequencing were filtered as follows: (1) missing rate ≤ 50%; (2) maf ≥ 0.05; (3) het_rate ≤ 50%; (4) remove non-biallelic SNP sites; (5) remove INDEL sites. The site selection criteria are: (1) top 200 sites ranked by ΔRAF; (2) miss_rate ≤ 30% (within a single subpopulation); (3) maf ≥ 0.01; (4) ΔRAF difference between subpopulations ≥ 0.1. The site evaluation principles are: (1) probe length 110 bp; (2) probe GC content 30-70%; (3) number of homologous regions ≤ 5. Using the above methods, a total of 4329 SNP sites evenly distributed across the Artemia genome were selected. Based on the locations of the 4329 SNP sites and their flanking sequences, primers were designed and probes were synthesized to obtain the Artemia 5K liquid-phase chip.
[0009] Furthermore, the 250 Artemia individuals from the 25 Artemia populations are shown in Table 2.
[0010] The application of the 5K liquid phase chip for Artemia salina as described above in population genetic analysis and / or kinship identification in Artemia salina producing areas.
[0011] The advantages and positive effects of this invention are as follows: 1. Compared with previous fingerprint spectra developed based on SNP molecular markers, the brine shrimp liquid phase chip of this invention contains more SNP markers, has higher identification accuracy, lower post-detection costs, and is flexible in design, allowing for the addition of marker sites of interest at any time.
[0012] 2. Compared with whole-genome resequencing, this study has completed the development of a large number of SNP markers for Artemia and the fabrication of the first liquid-phase chip. This liquid-phase chip has a significant price advantage and can be used for large-scale genotyping and population genetics research of Artemia in the future.
[0013] 3. This invention selects 250 individuals from 25 representative Artemia populations worldwide. The sources are wide-ranging and highly representative. High-quality SNP loci are screened to construct liquid-phase chips, which can realize accurate identification and genetic research of Artemia populations in different ways with low cost and high efficiency.
Attached Figure Description
[0014] Figure 1 is a PCA diagram showing the differentiation of 15 Artemia populations using the Artemia 5K liquid-phase chip of this invention; Figure 2 is a clustering diagram of the phylogenetic relationships of 15 Artemia populations using the Artemia 5K liquid-phase chip of this invention.
Detailed Implementation
[0015] The present invention will be further described below with reference to the embodiments. The following embodiments are descriptive and not limiting, and should not be used to limit the scope of protection of the present invention.
[0016] The various experimental operations involved in the specific embodiments are all conventional techniques in the field. For parts not specifically annotated in this document, those skilled in the art can refer to various commonly used reference books, scientific and technological documents or related instructions and manuals prior to the filing date of this invention to carry out the operations.
[0017] A 5K liquid-phase chip for Artemia salina based on targeted capture sequences, the liquid-phase chip comprising 4329 SNP sites, the positions of the 4329 SNP molecular markers are shown in Table 1.
[0018] Preferably, the Artemia 5K liquid phase chip consists of an independently packaged Artemia 5K probe mixture and a hybridization capture reagent. The probes are double-stranded DNA probes, and the nucleotide sequences are designed and synthesized based on the selected SNP sites. The SNP sites are selected by comparing the whole genome sequencing results of Artemia to the Artemia reference genome.
[0019] The Artemia parthenogenetica reference genome used in this invention is a publicly available reference genome, namely the "reference genome" disclosed in the Chinese authorized invention patent "CN117821611A, A KASP-based Artemia parthenogenetica core molecular marker combination and its application". That is, the Artemia parthenogenetica reference genome used in this invention is completely consistent with that described in Chinese patent publication CN117821611A, specifically the contig sequence version of Artemia parthenogenetica anchored to chromosomes (paragraph [0020] of the specification of Chinese patent publication CN117821611A clearly states: "The physical positions of the 13 SNP molecular markers are determined based on the contig sequence of Artemia parthenogenetica attached to the chromosome as the reference genome"). Furthermore, this reference genome has also been uploaded to the Genome Warehouse database (https://ngdc.cncb.ac.cn/gwh), accession number GWHJJES00000000.1.
[0020] This invention ultimately identified 4329 SNP sites evenly distributed within the Artemia genome. For those skilled in the art, once the reference genome version and the physical location of the SNP sites are determined, the sequence information flanking each SNP site can be extracted from the reference genome using conventional bioinformatics methods. This invention explicitly states: "Based on the locations of the 4329 SNP sites and the flanking sequence information, primers were designed and probes were synthesized." Furthermore, the site evaluation principles clearly specify a probe length of 110 bp, a GC content of 30–70%, and ≤5 homologous regions. These parameters are standard practice for designing targeted capture probes in this field. Targeted capture probe design is a conventional technique in this field, and various mature commercial software and online tools (such as Agilent SureDesign and Illumina Design Studio) can automatically design suitable probes based on the input SNP site locations and reference genome sequences. For example, the applicant could commission Shijiazhuang Borui Biotechnology Co., Ltd. to design primers and synthesize probes using targeted capture sequencing technology, which is also a standard commercial service in this field. Therefore, those skilled in the art can design and synthesize the required double-stranded DNA probes using conventional techniques based on the SNP site locations, reference genome version, and probe design parameters disclosed in this application. Since the Artemia reference genome version used in this patent application is clearly defined, all SNP site information can be uniquely resolved, enabling probe design and liquid-phase chip fabrication. Therefore, the details of the double-stranded DNA probes are not further described here.
[0021] The method for preparing the Artemia 5K liquid-phase chip as described above includes the following steps: Site selection and liquid-phase chip construction: Background sites were selected following the principle of uniform distribution across the genome. Based on reduced-representation genome sequencing data of 250 individuals from 25 Artemia populations, background SNP sites were added to the chip. The VCF files from resequencing were filtered as follows: (1) missing rate ≤ 50%; (2) maf ≥ 0.05; (3) het_rate ≤ 50%; (4) remove non-biallelic SNP sites; (5) remove INDEL sites. The site selection criteria are: (1) top 200 sites ranked by ΔRAF; (2) miss_rate ≤ 30% (within a single subpopulation); (3) maf ≥ 0.01; (4) ΔRAF difference between subpopulations ≥ 0.1. The site evaluation principles are: (1) probe length 110 bp; (2) probe GC content 30-70%; (3) number of homologous regions ≤ 5. Using the above methods, a total of 4329 SNP sites evenly distributed across the Artemia genome were selected. Based on the locations of the 4329 SNP sites and their flanking sequences, primers were designed and probes were synthesized to obtain the Artemia 5K liquid-phase chip.
[0022] Preferably, the 250 Artemia individuals from the 25 Artemia populations are shown in Table 2.
[0023] The application of the 5K liquid phase chip for Artemia salina as described above in population genetic analysis and / or kinship identification in Artemia salina producing areas.
[0024] Specifically, the relevant preparation and testing methods are as follows: A 5K liquid phase microarray based on targeted capture sequences for Artemia salina includes 4329 SNP sites. This low-density liquid phase microarray can be used for population genetic analysis and phylogenetic identification in major Artemia salina producing regions worldwide. The Artemia salina 5K liquid phase microarray consists of individually packaged Artemia salina 5K probe mixtures and hybridization capture reagents. The probes are double-stranded DNA probes, with nucleotide sequences designed and synthesized based on the selected SNP sites. The SNP sites are selected by aligning the whole genome sequencing results of Artemia salina to an Artemia salina reference genome.
[0025] (I) Site selection and liquid-phase chip construction. Background sites were selected following the principle of uniform distribution across the genome. Based on reduced-representation genome sequencing data of 250 Artemia individuals from 25 Artemia populations (Table 2), background SNP sites were added to the chip.
[0026] The VCF files from resequencing were filtered as follows: (1) missing rate ≤ 50%; (2) maf ≥ 0.05; (3) het_rate ≤ 50%; (4) remove non-biallelic SNP sites; (5) remove INDEL sites.
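The five record-level filters listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the applicant's pipeline: the function names `parse_gt` and `passes_filters` are hypothetical, the first threshold is read as the fraction of samples without a genotype call, and each VCF data line is parsed as plain tab-separated text with diploid genotypes.

```python
def parse_gt(sample_field):
    """Return the two allele indices of a VCF sample field, or None if missing."""
    gt = sample_field.split(":")[0].replace("|", "/")
    alleles = gt.split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return int(alleles[0]), int(alleles[1])


def passes_filters(vcf_line, max_missing=0.5, min_maf=0.05, max_het=0.5):
    """Apply the five filters to one VCF data line (hypothetical helper)."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    # (4) keep biallelic sites only: exactly one ALT allele
    if "," in alt or alt in (".", "*"):
        return False
    # (5) remove INDELs: both alleles must be single bases
    if len(ref) != 1 or len(alt) != 1:
        return False
    calls = [parse_gt(s) for s in fields[9:]]
    called = [c for c in calls if c is not None]
    if not called:
        return False
    # (1) missing rate across all samples
    if 1 - len(called) / len(calls) > max_missing:
        return False
    # (2) minor allele frequency among called genotypes
    alt_freq = sum(a + b for a, b in called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < min_maf:
        return False
    # (3) heterozygosity rate among called genotypes
    het_rate = sum(1 for a, b in called if a != b) / len(called)
    return het_rate <= max_het
```

In practice such filtering is usually delegated to established VCF tools; the sketch only makes the order and meaning of the thresholds concrete.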
[0027] The site selection principles are: (1) top 200 sites ranked by ΔRAF; (2) miss_rate ≤ 30% (within a single subgroup); (3) maf ≥ 0.01; (4) ΔRAF difference between subgroups ≥ 0.1.
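The text does not define ΔRAF. The sketch below assumes RAF is the reference-allele frequency within one subgroup and ΔRAF is the spread (maximum minus minimum) of RAF across subgroups; both readings, the function names, and the input layout (site ID mapped to subgroup name mapped to a list of genotype tuples, with `None` for a missing call) are assumptions made for illustration only.

```python
def raf_and_missing(genotypes):
    """Reference-allele frequency and missing rate for one subgroup."""
    if not genotypes:
        return None, 1.0
    called = [g for g in genotypes if g is not None]
    missing = 1 - len(called) / len(genotypes)
    if not called:
        return None, missing
    ref_alleles = sum(g.count(0) for g in called)
    return ref_alleles / (2 * len(called)), missing


def select_sites(sites, top_n=200, max_missing=0.3, min_maf=0.01, min_delta=0.1):
    """Rank sites by assumed ΔRAF and keep the top_n that pass all thresholds."""
    scored = []
    for site_id, pops in sites.items():
        stats = [raf_and_missing(gts) for gts in pops.values()]
        # (2) every subgroup must pass the missing-rate threshold
        if any(raf is None or miss > max_missing for raf, miss in stats):
            continue
        # (3) minor allele frequency pooled over all called genotypes
        called = [g for gts in pops.values() for g in gts if g is not None]
        ref_freq = sum(g.count(0) for g in called) / (2 * len(called))
        if min(ref_freq, 1 - ref_freq) < min_maf:
            continue
        # (4) assumed ΔRAF: spread of RAF across subgroups
        rafs = [raf for raf, _ in stats]
        delta = max(rafs) - min(rafs)
        if delta >= min_delta:
            scored.append((delta, site_id))
    # (1) keep the top_n sites by ΔRAF, ties broken by site ID
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [site_id for _, site_id in scored[:top_n]]
```

Whether the top 200 are taken overall or per pairwise subgroup comparison is not stated in the text; the sketch takes them overall.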
[0028] The site evaluation principles are: (1) probe length 110bp; (2) probe GC content: 30-70%; (3) number of homologous regions ≤ 5.
[0029] A total of 4329 SNP sites were selected using the methods described above, evenly distributed across the Artemia genome. Their specific genomic coordinates are shown in Table 1. Based on the locations of these 4329 SNP sites and their flanking sequence information, Shijiazhuang Borui Biotechnology Co., Ltd. designed primers and synthesized probes using targeted capture sequencing technology, thereby obtaining a low-density SNP liquid-phase chip for Artemia.
[0030] (II) Application of the Artemia SNP liquid-phase chip in the detection of major Artemia populations worldwide. To assess the accuracy of the chip, 61 samples from 15 Artemia populations were selected and tested using the Artemia 5K liquid-phase chip. The specific methods were as follows: (1) Extraction of Artemia genomic DNA: Adult Artemia were selected, and the whole genome extraction kit from Tianjin Lanrui Biotechnology Co., Ltd. was used to extract genomic DNA. (2) DNA sample quality detection: DNA concentration was measured using an Eppendorf BioPhotometer plus, and DNA integrity was checked by 1% agarose gel electrophoresis. The samples were then stored at -80℃ for later use. (3) Liquid-phase chip detection: The standard procedure for liquid-phase chip detection was followed (http://www.molbreeding.com/index.php/Technology/GenoBaits.html). (4) Data analysis: The raw data were quality controlled using FASTP software. The sequencing data were then aligned to the Artemia reference genome using BWA software. SNPs were detected using the standard procedure of GATK4 software, and genotyping was performed. The SNP sets obtained from genotyping were then subjected to principal component analysis and phylogenetic tree construction.
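Both principal component analysis and distance-based tree construction start from a numeric encoding of the genotype calls. The text does not say which encoding or distance was used, so the following is only a sketch under assumptions: genotypes are encoded as alternate-allele counts (0, 1, 2), and the pairwise distance is the mean absolute dosage difference scaled to the range 0 to 1. All function names are hypothetical.

```python
def dosage(gt):
    """Convert a GT string such as '0/1' to an alt-allele count, or None if missing."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2 or "." in alleles:
        return None
    return sum(1 for a in alleles if a != "0")


def allele_sharing_distance(x, y):
    """Mean per-site dosage difference scaled to [0, 1], skipping missing calls."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return None
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(samples):
    """Return sorted sample names and their pairwise distance matrix."""
    names = sorted(samples)
    vectors = {n: [dosage(g) for g in samples[n]] for n in names}
    matrix = [[allele_sharing_distance(vectors[a], vectors[b]) for b in names]
              for a in names]
    return names, matrix
```

A matrix of this kind can be passed to a neighbour-joining or hierarchical clustering routine, while the dosage vectors themselves are the usual input to PCA.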
[0031] After the data were analyzed using the above methods, the PCA results for the different Artemia populations are shown in Figure 1, and the phylogenetic tree results are shown in Figure 2. As shown in the results, samples R_4 and U_2 had low identification probabilities and their identification results were unreliable; therefore, they were not identified. Sample BGC6 was predicted to belong to the LGC population. Overall, the prediction accuracy for the test samples was 98.3%. The results show that this chip can effectively distinguish different Artemia populations.
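The text does not show how the 98.3% figure was computed. One reading that reproduces it, offered here as an assumption rather than the applicant's stated calculation, is that the two unidentified samples (R_4 and U_2) are excluded from the denominator and BGC6 is counted as the single incorrect assignment:

```python
tested = 61          # samples run on the chip
unassigned = 2       # R_4 and U_2, excluded as unreliable
misassigned = 1      # assumption: BGC6 (predicted as LGC) counted as incorrect

assigned = tested - unassigned
accuracy = (assigned - misassigned) / assigned
print(f"{accuracy:.1%}")  # 58 of 59 -> 98.3%
```

Counting all 61 samples in the denominator with one error would give 98.4%, so the 58 of 59 reading matches the reported value more closely.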
[0032] Example (1) GenoBaits experimental procedure: Take a quantitative amount of DNA, fragment it using restriction endonucleases, and ligate the fragmented DNA with A tails after end repair. Use ligase to ligate the A-tailed DNA fragments to sequencing adapters, and then purify the library using carboxyl-modified magnetic beads. Add the ligation product to the sequencing primers with barcodes and a high-fidelity PCR reaction system for PCR amplification. Different barcodes are used to distinguish different samples. After purification with carboxyl magnetic beads, the amplified product can be used for probe hybridization experiments. Take 500 ng of the constructed sequencing library, freeze-dry it, add probes and hybridization reagents, denature it, and incubate it at 65°C for 2 hours to complete the hybridization reaction. After washing the hybridization product with washing buffer, perform another round of PCR to complete the construction of the hybridization capture library.
[0033] Example (2) GenoPlexs experimental procedure: Add multiplex PCR panel mix and multiplex PCR amplification enzyme system to the quantitative DNA, and place it on a PCR instrument to complete the PCR reaction. After the PCR product is purified using carboxyl magnetic beads, it is again added to the sequencing primers with barcodes and the high-fidelity PCR reaction system for PCR amplification. Different barcodes are used to distinguish different samples. After purification with carboxyl magnetic beads, the amplified product completes the multiplex PCR capture and library construction.
[0034] Table 1. Locations of 4329 SNP molecular markers
[0035] Table 2. Artemia populations and numbers of individuals used for reduced-representation genome sequencing
[0036] Currently, there is no development or application of liquid-phase chips for Artemia. This invention can detect more SNP sites and has a significant price advantage compared to whole genome resequencing. The samples used to design the chip comprise 250 Artemia individuals from 25 populations around the world, which are widely sourced and highly representative. This is beneficial for Artemia strain identification, population genetics research, and the research and protection of germplasm resources.
[0037] Although embodiments of the invention have been disclosed for illustrative purposes, those skilled in the art will understand that various substitutions, variations, and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the scope of the invention is not limited to the contents disclosed in the embodiments.
Claims
1. A 5K liquid-phase chip for Artemia based on targeted capture sequences, characterized in that: the liquid-phase chip includes 4329 SNP sites, and the molecular marker positions of the 4329 SNPs are shown below:
2. The 5K liquid-phase chip for Artemia according to claim 1, characterized in that: the Artemia 5K liquid-phase chip consists of an individually packaged Artemia 5K probe mixture and a hybridization capture reagent. The probes are double-stranded DNA probes, and the nucleotide sequences are designed and synthesized based on the selected SNP sites. The SNP sites are selected by comparing the whole genome sequencing results of Artemia to the Artemia reference genome.
3. The method for preparing the 5K liquid-phase chip for Artemia as described in claim 1 or 2, characterized in that it includes the following steps: Site selection and liquid-phase chip construction: Background sites were selected following the principle of uniform distribution across the genome. Based on reduced-representation genome sequencing data of 250 individuals from 25 Artemia populations, background SNP sites were added to the chip. The VCF files from resequencing were filtered as follows: (1) missing rate ≤ 50%; (2) maf ≥ 0.05; (3) het_rate ≤ 50%; (4) remove non-biallelic SNP sites; (5) remove INDEL sites. The site selection criteria are: (1) top 200 sites ranked by ΔRAF; (2) miss_rate ≤ 30% (within a single subpopulation); (3) maf ≥ 0.01; (4) ΔRAF difference between subpopulations ≥ 0.1. The site evaluation principles are: (1) probe length 110 bp; (2) probe GC content 30-70%; (3) number of homologous regions ≤ 5. Using the above methods, a total of 4329 SNP sites evenly distributed across the Artemia genome were selected. Based on the locations of the 4329 SNP sites and their flanking sequences, primers were designed and probes were synthesized to obtain the Artemia 5K liquid-phase chip.
4. The preparation method according to claim 2, characterized in that: the 250 Artemia individuals from the 25 Artemia populations are:
5. The application of the 5K liquid-phase chip for Artemia as described in claim 1 or 2 in population genetic analysis and/or kinship identification in Artemia producing areas.