Methods and systems for predicting the likelihood of preterm birth

By performing 16S rRNA gene sequencing and harmonizing vaginal microbiome data, the method addresses the challenges of predicting preterm birth, offering a reliable and convenient approach for identifying at-risk pregnancies.

WO2026136934A1 · PCT designated stage · Publication Date: 2026-06-25 · RGT UNIV OF CALIFORNIA +5

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
RGT UNIV OF CALIFORNIA
Filing Date
2025-12-19
Publication Date
2026-06-25

AI Technical Summary

Technical Problem

Current clinical tools lack an effective and reliable method for early and quantitative assessment of the risk of preterm birth, and the use of vaginal microbiome data for predictive modeling is hindered by biological and technical challenges such as high dimensionality and variability, leading to a risk of model overfitting.

Method used

Perform 16S rRNA gene sequencing on vaginal fluid samples, harmonize the microbiome data, and transform it into features suitable for predictive modeling using a software harmonization workflow, enabling the development of robust predictive models for preterm birth.

Benefits of technology

The method provides a convenient and accurate way to identify pregnancies at risk for preterm birth, using vaginal fluid samples collected by the subject, thereby facilitating targeted interventions and improving prediction accuracy.

✦ Generated by Eureka AI based on patent content.


Abstract

Provided are methods of predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for a subject. The methods comprise performing 16S rRNA gene sequencing on vaginal fluid sample DNA to obtain vaginal microbiome 16S rRNA gene sequencing data, and harmonizing the vaginal microbiome 16S rRNA gene sequencing data. The methods further comprise transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features, inputting the vaginal microbiome features into a predictive model, and predicting, using the predictive model, the likelihood of PTB or ePTB for the subject from whom the vaginal fluid sample was obtained. Computer readable media and systems that find use in practicing the methods of the present disclosure are also provided.

Description

[0001] Atty. Docket: UCSF-794WO (SF2024-148-2-PCT-0)

[0002] METHODS AND SYSTEMS FOR PREDICTING THE LIKELIHOOD OF PRETERM BIRTH

CROSS-REFERENCE TO RELATED APPLICATIONS

[0003] This application claims the benefit of U.S. Provisional Patent Application No. 63/737,496, filed December 20, 2024, which application is incorporated herein by reference in its entirety.

[0004] INTRODUCTION

[0005] Preterm birth (PTB) is the leading cause of infant morbidity and mortality worldwide. Globally, approximately 11% of infants are born preterm every year, defined as birth prior to 37 weeks of gestation, totaling nearly 15 million births. In addition to the emotional and financial toll on families, preterm births result in higher rates of neonatal death, nearly 1 million deaths each year, and long-term health consequences for some children. Infants born preterm are at risk for a variety of adverse outcomes, such as respiratory illnesses, cerebral palsy, infections, and blindness, with infants born early preterm (i.e., before 32 weeks) at increased risk of these conditions. Thus, the ability to accurately identify women at risk for PTB is a first step in the development and implementation of treatment and prevention strategies. Currently, available treatments for pregnant women at risk of preterm delivery include corticosteroids for fetal maturation and magnesium sulfate provided prior to 32 weeks to prevent cerebral palsy. Progesterone supplementation may also be administered as early as the second trimester to reduce the risk of PTB.

[0006] There are several known factors associated with PTB, including history of PTB, a short cervix, extremes of maternal age and body mass index (BMI), low socio-economic status, smoking, and genetic polymorphisms. Nevertheless, there is a need for additional clinical tools that enable the early and reliable assessment of the risk of preterm birth for an individual with quantitative rigor. Machine learning (ML) modeling has demonstrated potential to aid in the determination of individuals at risk of conditions and diseases across medical domains. By applying ML methods to large amounts of heterogeneous data, patterns in data can be discerned that would be otherwise difficult for humans to distinguish. Moreover, deducing which features contribute most to the predictive performance of an ML model allows for the identification of biomarkers that can be important for a condition or disease. There are a variety of ML algorithms that can be used individually, or combined into an ensemble approach to improve prediction performance. After ML modeling has been applied to and optimized on a training dataset, the model is ideally tested on an independent dataset to assess how well the model is able to generalize to data it has never seen before. The validation on independent data is a critical step to guard against overfitting and hence optimistically biased accuracy estimates.

[0007] In the past several decades, applications of machine learning approaches to various types of clinical, molecular, and other data have been explored to predict complications of pregnancy including preterm birth. The results of these works to date demonstrate that the prediction of PTB from varied data types including metabolites in amniotic fluid and maternal blood and urine, ultrasound images, and electronic health records, appears to be feasible to a certain extent.

[0008] There is some indication that the vaginal microbiome is associated with adverse pregnancy outcomes, specifically PTB. Previous studies have shown that there are significant differences between the vaginal microbiome of patients who deliver at term and those who deliver prematurely. Vaginal microbiomes with increased diversity as well as communities where Lactobacillus is not dominant were more frequent in patients with PTB. Therefore, the vaginal microbiome is a tempting source of data to use for predictive modeling of PTB. However, there are significant biological and technical challenges to using microbiome data for predictive modeling. Biologically, human-associated microbiomes (including the vaginal microbiome) are incredibly variable, with any two individuals typically sharing less than half of their microbes at the sequence-variant level of resolution. Thus, microbiome data, particularly compositional microbiome data, is both highly dimensional and sparse. These microbiome data attributes contribute to a substantial risk of model overfitting. Meta-analysis, together with rigorous evaluation of models on independent validation data, is a robust approach to contend with these biological challenges with microbiome data. However, there are significant technical challenges in aggregating and combining microbiome data across studies, and therefore few studies have taken on this task.

[0009] SUMMARY

[0010] Provided are methods of predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for a subject. The methods comprise performing 16S rRNA gene sequencing on vaginal fluid sample DNA to obtain vaginal microbiome 16S rRNA gene sequencing data, and harmonizing the vaginal microbiome 16S rRNA gene sequencing data. The methods further comprise transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features, inputting the vaginal microbiome features into a predictive model, and predicting, using the predictive model, the likelihood of PTB or ePTB for the subject from whom the vaginal fluid sample was obtained. Computer readable media and systems that find use in practicing the methods of the present disclosure are also provided.

[0011] BRIEF DESCRIPTION OF THE FIGURES

FIG. 1A-1C: Study Design and Challenge Overview and Data Harmonization. A) Left is the depiction of the assembled training and test data sets, harmonization of the data, transformation into feature tables and the outcomes posed to the participating teams. Right are the two sub-challenges, the global locations of the participating teams, the number of participants per sub-challenge, assessment process, and analysis of the better-performing models. B) Uniform Manifold Approximation and Projection (UMAP) ordination plots of the aggregated data before (left) and after (right) harmonization where each dot represents one vaginal microbiome sample colored by study. C) Violin plots of Shannon alpha diversity by trimester before (top) and after (bottom) harmonization stratified by study.

[0012] FIG. 2A-2C: Data visualization of microbiome features by outcome. A) Uniform Manifold Approximation and Projection (UMAP) ordination plots of the vaginal microbiome colored by outcome; B) Violin plot of diversity before (left) and after (right) harmonization stratified and colored by outcome; and C) Alluvial plot of community state type (CST) frequencies across time stratified by birth outcome.

[0013] FIG. 3A-3B: Prediction accuracy of models against sequestered validation data from two independent studies not available to modeling teams. Bootstrapped area under the receiver operator characteristics (AUROC) curves and Bayes factors for A) sub-challenge 1 and B) sub-challenge 2 of the best-performing model of each team for each sub-challenge and the organizer’s baseline model (purple) against bootstrapped data (n=1000) with replacement from the two validation studies harmonized post hoc into the same feature sets. Bootstrapping was done by pregnancy, not by specimen. The left column shows box-and-whisker plots of the bootstrapped AUROC values; the middle column shows the Bayes Factors when compared to the top-performing model; the right column shows the Bayes Factors when compared against the organizer’s model. Yellow marks the two ‘best performing’ models for each sub-challenge. Blue marks models with a Bayes Factor <= 20 when compared to the top-performing model.
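Bootstrapping by pregnancy rather than by specimen means that whole pregnancies are resampled with replacement, so that all specimens from one pregnancy enter or leave a resample together. The following is a minimal Python sketch of that idea under stated assumptions: the function names, the (pregnancy_id, label, score) record layout, and the pairwise AUROC computation are hypothetical choices made for illustration and are not taken from the present disclosure.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when only one class
    is present, because the AUROC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_by_pregnancy(records, n_boot=1000, seed=0):
    """Resample pregnancies (not specimens) with replacement.

    records: list of (pregnancy_id, label, score), one entry per specimen
    (hypothetical layout). Returns the list of AUROC values, one per
    resample in which both outcome classes were present.
    """
    rng = random.Random(seed)
    by_pregnancy = {}
    for pregnancy_id, label, score in records:
        by_pregnancy.setdefault(pregnancy_id, []).append((label, score))
    ids = sorted(by_pregnancy)
    values = []
    for _ in range(n_boot):
        # Draw as many pregnancies as in the original data, with replacement.
        sampled_ids = [rng.choice(ids) for _ in ids]
        labels, scores = [], []
        for pid in sampled_ids:
            for label, score in by_pregnancy[pid]:
                labels.append(label)
                scores.append(score)
        value = auroc(labels, scores)
        if value is not None:
            values.append(value)
    return values
```

Resampling at the pregnancy level keeps correlated specimens from the same pregnancy together, which avoids understating the spread of the bootstrapped AUROC values.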

[0014] FIG. 4A-4B: Feature Sets and Individual Compositional Features used by Top Performing Models. Top performing models here are defined as those with a bootstrapped area under the receiver operator curve greater than 0.64 or 0.8 for sub-challenge 1 and 2, respectively, further limited to models that could make a prediction in less than 10 seconds on a twelve-core AMD Ryzen 3900X processor. A) Feature tables used by the top performing models for sub-challenge 1 (left) and sub-challenge 2 (right) to make their predictions of preterm birth and early preterm birth, respectively. Filled-in blocks indicate that this feature table (by row) was used by a given model (columns) to make the prediction. Unfilled blocks are for feature tables that, when randomized, did not affect the prediction.

[0015] B) For the six sub-challenge 2 models evaluated by feature permutation that also made use of phylotypes at 0.1 distance, 32 of the phylotypes were used by all six models and 73 by five of the six models (right Venn diagram). The 32 phylotypes used by all six models are grouped by the closest species for that phylotype (left).
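The test of whether a feature table was used by a model, namely randomizing the table and checking whether the prediction changes, can be sketched as follows. This is an illustrative Python sketch only: the function name, the dictionary-of-tables layout, the row-shuffling scheme, and the number of repeats are assumptions and are not the evaluation code of the present disclosure.

```python
import random


def tables_used_by_model(predict, tables, n_repeats=5, seed=0, tol=1e-12):
    """Flag each feature table as used (True) or unused (False) by a model.

    predict: callable taking {table_name: list of per-sample rows} and
             returning one numeric prediction per sample (hypothetical API).
    tables:  {table_name: list of per-sample rows}, same sample order in
             every table.
    A table counts as used if shuffling its rows across samples changes
    at least one prediction in any of n_repeats shuffles.
    """
    rng = random.Random(seed)
    baseline = predict(tables)
    used = {}
    for name, rows in tables.items():
        changed = False
        for _ in range(n_repeats):
            shuffled = list(rows)  # copy so the caller's table is untouched
            rng.shuffle(shuffled)
            permuted = dict(tables)
            permuted[name] = shuffled
            new = predict(permuted)
            if any(abs(a - b) > tol for a, b in zip(baseline, new)):
                changed = True
                break
        used[name] = changed
    return used
```

The same loop can be run per column instead of per table to ask which individual features, such as single phylotypes, affect a model's predictions.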

[0016] FIG. 5: Ensemble Model Results. For a) sub-challenge 1 and b) sub-challenge 2, the area under the receiver operator characteristics (AUROC, left) curve and area under the precision-recall curve (AUPRG, right) of three ensemble models (‘ensemble_top2’: top two performing models; ‘ensemble_top2’: models with Bayes factor less than 20; and ‘ensemble_all’: all models), as well as first place, second place, and baseline models, colored by model.
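How the individual models are combined into an ensemble is not specified in this figure description. One simple possibility, assumed here purely for illustration, is to average the predicted probabilities of the member models for each sample. The function name and data layout below are hypothetical.

```python
def ensemble_mean(predictions_by_model, members=None):
    """Average per-sample predicted probabilities across member models.

    predictions_by_model: {model_name: list of probabilities}, with the
                          same sample order for every model.
    members: optional list of model names to include; all models if None.
    Simple averaging is an assumption, not the disclosed ensemble method.
    """
    names = list(members) if members is not None else sorted(predictions_by_model)
    if not names:
        raise ValueError("an ensemble needs at least one member model")
    n_samples = len(predictions_by_model[names[0]])
    for name in names:
        if len(predictions_by_model[name]) != n_samples:
            raise ValueError("all models must predict the same samples")
    return [
        sum(predictions_by_model[name][i] for name in names) / len(names)
        for i in range(n_samples)
    ]
```

Passing a subset of model names as `members` gives a restricted ensemble, for example one limited to the best-performing models, while omitting it averages over all models.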

[0017] FIG. 6: Schematic illustration of a method according to some embodiments of the present disclosure.

[0018] DETAILED DESCRIPTION

[0019] Before the methods of the present disclosure are described in greater detail, it is to be understood that the methods are not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the methods will be limited only by the appended claims.

[0020] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the methods. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges and are also encompassed within the methods, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the methods.

[0021] Certain ranges are presented herein with numerical values being preceded by the term “about.” The term “about” is used herein to provide literal support for the exact number that it precedes, as well as a number that is near to or approximately the number that the term precedes. In determining whether a number is near to or approximately a specifically recited number, the near or approximating unrecited number may be a number which, in the context in which it is presented, provides the substantial equivalent of the specifically recited number.

[0022] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the methods belong. Although any methods similar or equivalent to those described herein can also be used in the practice or testing of the methods, representative illustrative methods are now described.

[0023] All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the materials and/or methods in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present methods are not entitled to antedate such publication, as the date of publication provided may be different from the actual publication date which may need to be independently confirmed.

[0024] It is noted that, as used herein and in the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements, or use of a “negative” limitation.

[0025] It is appreciated that certain features of the methods, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the methods, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination. All combinations of the embodiments are specifically embraced by the present disclosure and are disclosed herein just as if each and every combination was individually and explicitly disclosed, to the extent that such combinations embrace operable processes and/or compositions. In addition, all sub-combinations listed in the embodiments describing such variables are also specifically embraced by the present methods and are disclosed herein just as if each and every such sub-combination was individually and explicitly disclosed herein.

[0026] As will be apparent to those of skill in the art upon reading this disclosure, each of the individual embodiments described and illustrated herein has discrete components and features which may be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present methods. Any recited method can be carried out in the order of events recited or in any other order that is logically possible.

[0027] METHODS OF PREDICTING PRETERM BIRTH

[0028] The present disclosure provides methods of predicting preterm birth. In some embodiments, the methods comprise harmonizing vaginal microbiome 16S rRNA gene sequencing data, and transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features, and inputting the vaginal microbiome features into a predictive model. The methods further comprise, using the predictive model, predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for the subject from whom the vaginal fluid sample was obtained.

[0029] Every year, 11% of infants are born preterm, with significant health consequences, and the vaginal microbiome is a risk factor for preterm birth. PTB, particularly ePTB (before 32 weeks of gestation), remains a potentially devastating outcome of pregnancy. Without a clear way of identifying pregnancies at risk for PTB, it remains difficult to target interventions. The methods of the present disclosure are based in part on the development of robust predictive models to identify pregnancies at risk for PTB. The inventors applied a scientific and technical schema, implemented in a software harmonization workflow and improving upon previous efforts using phylogenetic placement, for harmonizing microbiome data at the sequence level, even when generated with different underlying primers and sequencing platforms, to transform the raw data into a stable and generalizable set of features suitable for predictive modeling. Moreover, because the methods utilize vaginal fluid samples, the methods conveniently enable the pregnant subject to obtain her own sample (e.g., using a vaginal swab, e.g., at the subject’s home) which can then be mailed to the healthcare provider and/or testing center, obviating the need for a blood draw requiring a visit to the healthcare provider and/or testing center. A non-limiting exemplary workflow is schematically illustrated in FIG. 6. Details regarding embodiments of the present disclosure will now be provided.

[0030] In certain embodiments, the methods comprise performing 16S rRNA gene sequencing on vaginal fluid sample DNA to obtain vaginal microbiome 16S rRNA gene sequencing data. Approaches for obtaining vaginal fluid samples, isolating DNA therefrom, and amplification (e.g., using primers targeting one or more variable regions of the 16S rRNA gene) to produce amplicons for subsequent sequencing are known, and non-limiting examples are described in the Experimental section herein.

[0031] Sequencing may be performed using any of a variety of available high throughput nucleic acid sequencing machines and systems. Illustrative sequencing systems include those available from: Illumina, Inc. (San Diego, Calif.), non-limiting examples of which include the Illumina iSeq 100, MiniSeq, MiSeq series, NextSeq series (e.g., NextSeq 500 series, NextSeq 1000, NextSeq 2000), and NovaSeq sequencing systems; MGI Tech Co. Ltd. (or “MGI”), non-limiting examples of which include the DNBSEQ T series or DNBSEQ T series G series nucleic acid sequencing systems; Oxford Nanopore Technologies (Oxford, UK), non-limiting examples of which include the MinION™, GridIONx5™, PromethION™, and SmidgION™ nanopore-based sequencing systems; and Pacific Biosciences (“PacBio”, Menlo Park, Calif.), non-limiting examples of which include the REVIO, ONSO, and SEQUEL IIe sequencing systems.

[0033] Illumina sequencing technology leverages clonal array formation and reversible terminator technology for large-scale sequencing. For cluster generation, sequencing templates are immobilized on a flow cell surface designed to present the DNA in a manner that facilitates access to enzymes while ensuring high stability of surface-bound template and low non-specific binding of fluorescently labeled nucleotides. Solid-phase amplification creates up to 1,000 identical copies of each single template molecule in close proximity (diameter of one micron or less). Because this process does not involve photolithography, mechanical spotting, or positioning of beads into wells, densities on the order of ten million single-molecule clusters per square centimeter are achieved. The next phase involves sequencing by synthesis (SBS) utilizing four fluorescently-labeled nucleotides to sequence the tens of millions of clusters on the flow cell surface in parallel. During each sequencing cycle, a single labeled deoxynucleoside triphosphate (dNTP) is added to the nucleic acid chain. The nucleotide label serves as a terminator for polymerization, so after each dNTP incorporation, the fluorescent dye is imaged to identify the base and then enzymatically cleaved to allow incorporation of the next nucleotide. Since all four reversible terminator-bound dNTPs (A, C, T, G) are present as single, separate molecules, natural competition minimizes incorporation bias. Base calls are made directly from signal intensity measurements during each cycle.

[0034] In some embodiments, a flow cell for clustering in the Illumina platform is a glass slide with lanes. Each lane is a glass channel coated with a lawn of two types of oligos (e.g., P5 and P7' oligos). Hybridization is enabled by the first of the two types of oligos on the surface. This oligo may be complementary to the first completed sequencing adapter or the second completed sequencing adapter. A polymerase creates a complementary strand of the hybridized sequencing template. The double-stranded molecule is denatured, and the original template strand is washed away. The remaining strand, in parallel with many other remaining strands, is clonally amplified through bridge amplification.

[0035] In bridge amplification and other sequencing methods involving clustering, a strand folds over, and a second adapter region on a second end of the strand hybridizes with the second type of oligos on the flow cell surface.

[0036] A polymerase generates a complementary strand, forming a double-stranded bridge molecule. This double-stranded molecule is denatured resulting in two single-stranded molecules tethered to the flow cell through two different oligos. The process is then repeated iteratively, and occurs simultaneously for millions of clusters resulting in clonal amplification of all the sequencing templates. After bridge amplification, the reverse strands are cleaved and washed off, leaving only the forward strands. The 3' ends are blocked to prevent unwanted priming.

[0037] After clustering, sequencing starts with extending a first sequencing primer to generate the first read. With each cycle, fluorescently tagged nucleotides compete for addition to the growing chain. Only one is incorporated based on the sequence of the template. After the addition of each nucleotide, the cluster is excited by a light source, and a characteristic fluorescent signal is emitted. The number of cycles determines the length of the read. The emission wavelength and the signal intensity determine the base call. For a given cluster all identical strands are read simultaneously. Hundreds of millions of clusters may be sequenced in a massively parallel manner. At the completion of the first read, the read product is washed away.

[0038] In the next step of protocols involving two index primers, an index 1 primer is introduced and hybridized to an index 1 region on the template. Index regions provide identification of fragments, which is useful for de-multiplexing samples in a multiplex sequencing process. The index 1 read is generated similar to the first read. After completion of the index 1 read, the read product is washed away and the 3' end of the strand is de-protected. The template strand then folds over and binds to a second oligo on the flow cell. An index 2 sequence is read in the same manner as index 1. Then an index 2 read product is washed off at the completion of the step.

[0039] After reading two indices, read 2 initiates by using polymerases to extend the second flow cell oligos, forming a double-stranded bridge. This double-stranded DNA is denatured, and the 3' end is blocked. The original forward strand is cleaved off and washed away, leaving the reverse strand. Read 2 begins with the introduction of a read 2 sequencing primer. As with read 1, the sequencing steps are repeated until the desired length is achieved. The read 2 product is washed away. This entire process generates millions of reads, representing all the fragments. Sequences from pooled sample libraries are separated based on the unique indices introduced during sample preparation. For each sample, reads of similar stretches of base calls are locally clustered. Forward and reversed reads are paired creating contiguous sequences. These contiguous sequences are aligned to the reference genome for variant identification.
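The separation of pooled reads by their unique indices (de-multiplexing) can be illustrated with a short Python sketch. It assumes exact matching of an (index 1, index 2) pair against a sample sheet; the function name, the tuple layout of a read, and the "undetermined" bin are hypothetical and are not drawn from any particular sequencing vendor's software.

```python
def demultiplex(reads, index_to_sample):
    """Assign each read to a sample using its two index sequences.

    reads: iterable of (index1, index2, sequence) tuples (hypothetical layout).
    index_to_sample: {(index1, index2): sample_name}, i.e. a sample sheet.
    Reads whose index pair is not in the sample sheet are collected under
    the key "undetermined". Exact matching is assumed; production tools
    typically also tolerate a small number of index mismatches.
    """
    bins = {}
    for index1, index2, sequence in reads:
        sample = index_to_sample.get((index1, index2), "undetermined")
        bins.setdefault(sample, []).append(sequence)
    return bins
```

Using both indices together, rather than either one alone, allows many samples to share a sequencing run while keeping each sample's reads distinguishable.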

[0040] The sequencing by synthesis may involve paired end reads. Paired end sequencing involves 2 reads from the two ends of a fragment. Paired end reads are used to resolve ambiguous alignments. Paired-end sequencing allows users to choose the length of the DNA to be sequenced and sequence either end of the DNA, generating high-quality, alignable sequence data. Because the distance between each paired read is known, alignment algorithms can use this information to map reads over repetitive regions more precisely. This results in better alignment of the reads, especially across difficult-to-sequence, repetitive regions of the genome. Paired-end sequencing can detect rearrangements, including insertions and deletions (indels) and inversions.

[0041] DNA nanoball sequencing (e.g., employing a sequencing system available from MGI Tech Co., Ltd. (“MGI”)) involves the amplification of genomic DNA into nanoballs, followed by sequencing by synthesis (SBS) using fluorescently labeled nucleotides. In nanoball sequencing, DNA fragments are amplified by rolling circle amplification. The original circular DNA fragment serves as a template for the amplification of each clonal copy of DNA. This results in a spherical "nanoball" of amplified DNA. The negatively charged nanoballs are then hybridized to positively charged binding spots on a patterned flow cell. The sequencing process then proceeds in a similar fashion to standard SBS sequencing. The nucleotides (A, C, G, or T) are added one at a time to the flow cell, and the incorporated nucleotides are detected by a camera. Fluorescently labeled nucleotides, or alternatively, nucleotides labeled with fluorescently labeled antibodies, can be used.

[0042] In nanopore sequencing (e.g., implemented using a sequencer available from Oxford Nanopore Technologies), the nanopore serves as a biosensor and provides the sole passage through which an ionic solution on the cis side of the membrane contacts the ionic solution on the trans side. A constant voltage bias (trans side positive) produces an ionic current through the nanopore and drives ssDNA or ssRNA in the cis chamber through the pore to the trans chamber. A processive enzyme (e.g., a helicase, polymerase, nuclease, or the like) may be bound to the polynucleotide such that its step-wise movement controls and ratchets the nucleotides through the small-diameter nanopore, nucleobase by nucleobase. Because the ionic conductivity through the nanopore is sensitive to the presence of the nucleobase’s mass and its associated electrical field, the ionic current levels through the nanopore reveal the sequence of nucleobases in the translocating strand. A patch clamp, a voltage clamp, or the like, may be employed. Details for obtaining raw sequencing reads of nucleic acid molecules using nanopores are described, e.g., in Feng et al. (2015) Genomics, Proteomics & Bioinformatics 13(1):4-16. Nanopore-based sequencing systems are available and include the MinION™, Flongle, GridIONx5™, PromethION™, and SmidgION™ nanopore-based sequencing systems available from Oxford Nanopore Technologies Limited.

[0043] Additional nanopore-based sequencing systems are available and include the QNome-9604 sequencer available from Qitan Technology, the AXP100 sequencer available from Axbio Biotechnology, the PolyseqOne sequencer available from Polyseq, the Gseq-500 sequencer available from Geneus-tech, and the CycloneSEQ sequencer available from Beijing Genomics Institute (BGI). Detailed design considerations and protocols for performing nucleic acid sequencing are provided with such systems.

[0044] In zero mode waveguide (ZMW)-based sequence analysis (e.g., using a sequencer available from Pacific Biosciences (PacBio)), the ZMW is a nanoscale-sized well that serves as an optical confinement that allows observation of individual polymerase molecules. As a result, nucleotide incorporation events provide observation of an incorporating nucleotide analog that is readily distinguishable from non-incorporated nucleotide analogs. For a description of ZMWs and their application in nucleic acid sequencing, see, e.g., U.S. Patent Application Publication No. 2003/0044781 and U.S. Pat. No. 6,917,726, each of which is incorporated herein by reference in its entirety for all purposes.

[0045] See also Levene et al. (2003) “Zero-mode waveguides for single-molecule analysis at high concentrations” Science 299:682-686, Eid et al. (2009) “Real-time DNA sequencing from single polymerase molecules” Science 323:133-138, and U.S. Pat. Nos. 7,056,676, 7,056,661, 7,052,847, 7,033,764, and 7,907,800, the full disclosures of which are incorporated herein by reference in their entirety for all purposes.

[0047] The methods of the present disclosure may comprise harmonizing the vaginal microbiome 16S rRNA gene sequencing data. In some instances, harmonizing the vaginal microbiome 16S rRNA gene sequencing data comprises phylogenetic placement of amplicon sequence variants (ASVs) onto a maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles. Approaches for performing such harmonization include Nextflow-based workflows. A non-limiting example of a Nextflow-based workflow that finds use in harmonizing 16S rRNA gene sequencing data is MaLiAmPi (see, e.g., Minot et al. (2022) Preprint at bioRxiv, doi: 10.1101/2022.07.26.501561).

[0048] The methods of the present disclosure may comprise transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features. In some instances, the methods comprise transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into feature tables.

[0049] Vaginal microbiome features of interest include, but are not limited to, one or more diversity measures, one or more community state types, one or more phylotypes, one or more taxons (e.g., at the family, genus, and/or species levels), or any combination thereof.

[0050] For example, the features may comprise one or more diversity measures, one or more community state types, and one or more phylotypes, optionally wherein the features further comprise one or more taxons.
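To make the notion of a feature table concrete, the following Python sketch derives two kinds of features from per-taxon read counts for a single sample: a Shannon alpha diversity value and genus-level relative abundances. The input layout (a mapping from genus name to read count), the function names, and the feature naming scheme are assumptions for illustration; the harmonized data and feature definitions of the present disclosure may differ.

```python
import math


def relative_abundance(counts):
    """Convert {taxon: read_count} into {taxon: fraction of total reads}."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with nonzero counts."""
    proportions = relative_abundance(counts).values()
    return -sum(p * math.log(p) for p in proportions if p > 0)


def feature_row(counts, taxa):
    """Build one row of a feature table for a sample.

    counts: {genus: read_count} for the sample (hypothetical layout).
    taxa:   genera to report; genera absent from the sample get 0.0.
    """
    rel = relative_abundance(counts)
    row = {"alpha_diversity_shannon": shannon_diversity(counts)}
    for taxon in taxa:
        row["taxonomy_genus_" + taxon] = rel.get(taxon, 0.0)
    return row
```

A call such as `feature_row({"Lactobacillus": 75, "Mobiluncus": 25}, ["Mobiluncus", "Prevotella"])` yields a row containing one diversity feature and one relative-abundance feature per requested genus, and stacking such rows across samples gives a feature table.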

[0051] According to some embodiments, the vaginal microbiome features comprise one or more of the features listed in FIG. 4B. In some instances, the features comprise two or more, 5 or more, 10 or more, 15 or more, or 20 or more of the features listed in FIG. 4B.

[0052] In certain embodiments, the vaginal microbiome features comprise one or more of the features listed in Table 1. In some instances, the vaginal microbiome features comprise two or more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, or 100 or more of the features listed in Table 1. According to some embodiments, the vaginal microbiome features comprise two or more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or more, or 50 or more of the first 103 features listed in Table 1. In certain embodiments, the vaginal microbiome features comprise 103 or fewer, 90 or fewer, 80 or fewer, 70 or fewer, 60 or fewer, 50 or fewer, or 40 or fewer of the first 103 features listed in Table 1. According to any of the embodiments of the methods of the present disclosure, the vaginal microbiome features may comprise Taxonomy (Genus): Mobiluncus.
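A subset of features of a given size, as contemplated above, could be selected by ranking. The sketch below assumes that the “Ensemble” percentage in Table 1 serves as the ranking score, which is an assumption made here for illustration. The helper name `top_features` is hypothetical, and only four entries of Table 1 are reproduced.

```python
def top_features(scored_features, n):
    """Return the names of the n highest-scoring features.
    Ties keep their input order because Python's sort is stable."""
    ranked = sorted(scored_features, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Excerpt of Table 1: (feature name, ensemble percentage).
table_1_excerpt = [
    ("Taxonomy (Genus): Mobiluncus", 1.08),
    ("Alpha Diversity: quadratic", 0.35),
    ("Alpha Diversity: shannon", 0.31),
    ("Taxonomy (Species): Lactobacillus crispatus", 0.31),
]
selected = top_features(table_1_excerpt, 2)
```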

[0053] Table 1 - Vaginal Microbiome Features

[0054] Row Ensemble (bf<20) Feature

[0056] 1 1.08% Taxonomy (Genus): Mobiluncus

[0057] 2 0.37% Alpha Diversity: rooted_pd

[0058] 3 0.37% Phylotypes (0.1): pt _ 00048 (Mobiluncus curtisii | Mobiluncus mulieris | Mobiluncus curtisii / holmesii | Mobiluncus holmesii)

[0060] 4 0.35% Alpha Diversity: quadratic

[0063] 0.34% Phylotypes (0.5): pt _ 00006 (Prevotella salivae | Prevotella oulorum | Prevotella brunnea | Prevotella disiens | Prevotella scopos | Prevotella fusca | Prevotella jejuni | Prevotella pallens | Prevotella bivia | Prevotella maculosa | Prevotella melaninogenica / scopos | Prevotella veroralis | Prevotella multiformis | Prevotella copri | Prevotella denticola | Prevotella histicola / veroralis | Prevotella oris | Prevotella corporis | Prevotella melaninogenica | Prevotella ihumii | Prevotella enoeca | Prevotella amnii / corporis | Prevotella histicola)

[0065] 0.33% Phylotypes (0.5): pt _ 00048 (Fusobacterium massiliense | Fusobacterium equinum / gonidiaformans | Fusobacterium hwasookii / nucleatum | Fusobacterium periodonticum | Fusobacterium nucleatum | Fusobacterium equinum | Fusobacterium nucleatum / periodonticum | Fusobacterium gonidiaformans)

[0067] 0.33% Phylotypes (0.5): pt _ 00009 (Prevotella timonensis | Prevotella buccalis | Prevotella nanceiensis | Prevotella stercorea | Prevotella shahii | Prevotella enoeca | Prevotella micans | Prevotella bivia / veroralis | Prevotella conceptionensis)

[0068] 0.32% Phylotypes (1): pt _ 00005 (Prevotella salivae | Prevotella buccae / nigrescens | Prevotella oulorum | Prevotella brunnea | Prevotella baroniae | Prevotella nanceiensis | Prevotella disiens | Prevotella intermedia | Prevotellamassilia timonensis | Prevotella scopos | Prevotella oralis | Prevotella fusca | Prevotella jejuni | Prevotella pallens | Prevotella bergensis | Prevotella micans | Prevotella marseillensis | Prevotella bivia | Prevotella maculosa | Prevotella melaninogenica / scopos | Prevotella bivia / veroralis | Prevotella conceptionensis | Prevotella veroralis | Prevotella pleuritidis | Prevotella saccharolytica | Prevotella timonensis | Prevotella multiformis | Prevotella copri | Alloprevotella rava | Bacteroides vulgatus | Prevotella bergensis / corporis | Prevotella colorans | Prevotella nigrescens | Prevotella denticola | Prevotella histicola / veroralis | Prevotella marshii | Prevotella amnii | Prevotella corporis | Prevotella oris | Prevotella stercorea | Paraprevotella clara | Prevotella buccae | Prevotella melaninogenica | Prevotella ihumii | Petrimonas sulfuriphila |

[0071] Prevotella shahii | Prevotella enoeca | Prevotella amnii / corporis | Prevotella histicola)

[0072] 0.32% CST (Valencia): CST III-A

[0073] 0.31% Phylotypes (0.1): pt _ 00019 (Prevotella bivia)

[0074] 0.31% Alpha Diversity: shannon

[0075] 0.31% Alpha Diversity: phylo_entropy

[0076] 0.31% Taxonomy (Species): Lactobacillus crispatus

[0077] 0.30% Taxonomy (Family): Lactobacillaceae

[0078] 0.30% Alpha Diversity: inv simpson

[0079] 0.30% CST (Valencia): CST I-A

[0080] 0.30% Alpha Diversity: unrooted_pd

[0081] 0.30% Alpha Diversity: bwpd

[0082] 0.29% Phylotypes (0.1): pt _ 00009 (Lactobacillus crispatus / helveticus | Lactobacillus crispatus / gallinarum | Lactobacillus crispatus)

[0084] 0.29% Phylotypes (0.5): pt _ 00002 (Lactobacillus acidophilus | Lactobacillus crispatus / helveticus | Lactobacillus gallinarum | Lactobacillus amylovorus / crispatus | Lactobacillus crispatus / gallinarum | Lactobacillus crispatus | Lactobacillus gallinarum / helveticus | Lactobacillus acidophilus / crispatus / gallinarum | Lactobacillus acidophilus / crispatus | Lactobacillus helveticus | Lactobacillus amylovorus)

[0085] 0.28% Phylotypes (0.5): pt _ 00005 (Lactobacillus iners)

[0088] 0.28% Phylotypes (0.5): pt _ 00016 (Streptococcus fryi | Streptococcus salivarius | Streptococcus mutans | Streptococcus salivarius / vestibularis | Streptococcus parasanguinis | Streptococcus equinus / pasteurianus | Streptococcus mitis / oralis | Streptococcus pseudopneumoniae | Streptococcus halichoeri | Streptococcus sanguinis | Streptococcus thermophilus | Streptococcus agalactiae | Streptococcus sanguinis / sinensis | Streptococcus urinalis | Streptococcus periodonticum | Streptococcus pneumoniae | Streptococcus dysgalactiae | Streptococcus equinus / gallolyticus / pasteurianus | Streptococcus pseudoporcinus | Streptococcus mitis | Streptococcus equinus / gallolyticus | Streptococcus infantis | Streptococcus oralis | Streptococcus salivarius / thermophilus | Streptococcus peroris / sanguinis | Streptococcus cristatus | Streptococcus intermedius | Streptococcus gordonii | Streptococcus timonensis | Streptococcus anginosus | Streptococcus mitis / pneumoniae | Streptococcus mitis / pseudopneumoniae | Lactococcus chungangensis / raffinolactis)

[0089] 0.28% Phylotypes (0.1): pt _ 00004 (Lactobacillus jensenii / psittaci | Lactobacillus psittaci | Lactobacillus jensenii)

[0090] 0.28% Phylotypes (0.5): pt _ 00025 (Prevotella timonensis | Prevotella buccalis | Prevotella stercorea)

[0093] 0.27% Phylotypes (1): pt _ 00017 (Enterobacter cloacae | Enterobacter cloacae / hormaechei | Haemophilus parainfluenzae | Serratia quinivorans | Klebsiella oxytoca | Citrobacter amalonaticus / murliniae | Haemophilus sputorum | Pantoea ananatis / dispersa | Serratia fonticola | Citrobacter koseri | Enterobacter_kobei / Klebsiella_aerogenes | Haemophilus_haemolyticus / Aggregatibacter_segnis | Aeromonas caviae | Citrobacter braakii / freundii | Escherichia coli / fergusonii | Yokenella regensburgei | Escherichia_coli / Shigella_sonnei | Shigella dysenteriae / sonnei | Raoultella planticola | Kluyvera ascorbata / cryocrescens | Proteus mirabilis | Shewanella putrefaciens | Escherichia_coli / Shigella_dysenteriae | Serratia marcescens | Morganella morganii | Vibrio litoralis / rumoiensis | Enterobacter_cloacae / Escherichia_coli | Shigella flexneri / sonnei | Haemophilus haemolyticus | Escherichia coli | Kluyvera intermedia | Serratia proteamaculans / quinivorans | Haemophilus parahaemolyticus | Escherichia fergusonii | Pantoea agglomerans | Enterobacter roggenkampii | Klebsiella aerogenes | Kosakonia_cowanii / Salmonella_bongori | Aeromonas popoffii / salmonicida | Citrobacter freundii | Enterobacter hormaechei | Kluyvera ascorbata | Erwinia aphidicola / persicina | Klebsiella pneumoniae / variicola | Aggregatibacter segnis | Shigella flexneri | Shewanella baltica | Pseudoalteromonas marina / piratica | Aeromonas veronii | Haemophilus influenzae | Aeromonas caviae / popoffii / veronii | Enterobacter asburiae | Haemophilus quentini | Aggregatibacter aphrophilus | Escherichia_coli / Shigella_flexneri | Pseudoalteromonas porphyrae | Haemophilus parainfluenzae / pittmaniae | Shigella sonnei | Pantoea_agglomerans / Escherichia_coli | Erwinia_billingiae / [Pantoea]_cedenensis | Haemophilus pittmaniae | Haemophilus haemolyticus / influenzae | Aeromonas caviae / hydrophila | Enterobacter asburiae / cloacae | Shewanella decolorationis / seohaensis | Pararheinheimera tangshanensis | Pararheinheimera soli | Klebsiella milletis / pneumoniae | Salmonella enterica | Citrobacter amalonaticus |

[0096] Enterobacter_hormaechei / Enterobacter_cloacae / Escherichia_coli | Klebsiella pneumoniae)

[0097] 0.27% Phylotypes (1): pt _ 00007 (Prevotella amnii | Prevotella timonensis | Prevotella buccalis | Prevotella stercorea | Paraprevotella clara | Prevotellamassilia timonensis | Prevotella buccae | Prevotella colorans | Prevotella buccae / nigrescens | Prevotella nigrescens | Prevotella bergensis | Prevotella oulorum | Prevotella marseillensis | Prevotella baroniae | Prevotella marshii | Prevotella pleuritidis)

[0099] 0.27% Taxonomy (Family): Prevotellaceae

[0100] 0.27% Phylotypes (0.1): pt _ 00056 (Fusobacterium nucleatum)

[0101] 0.27% Taxonomy (Genus): Prevotella

[0104] 0.27% Phylotypes (0.5): pt _ 00004 (Corynebacterium thomssenii | Corynebacterium pyruviciproducens | Corynebacterium timonense | Corynebacterium urinapleomorphum | Corynebacterium senegalense | Corynebacterium dentalis | Corynebacterium neomassiliense / variabile | Corynebacterium aurimucosum / minutissimum | Corynebacterium mucifaciens / ureicelerivorans | Corynebacterium pseudogenitalium / tuberculostearicum | Corynebacterium glyciniphilum | Corynebacterium kroppenstedtii | Corynebacterium imitans | Corynebacterium tuscaniense | Corynebacterium freneyi | Corynebacterium matruchotii | Corynebacterium argentoratense | Lawsonella clevelandensis | Corynebacterium simulans | Corynebacterium accolens / macginleyi | Corynebacterium neomassiliense | Corynebacterium phoceense | Corynebacterium coyleae / genitalium | Corynebacterium atypicum | Corynebacterium sundsvallense | Corynebacterium pilbarense | Corynebacterium hadale / imitans | Corynebacterium appendicis | Corynebacterium aurimucosum | Corynebacterium coyleae | Corynebacterium urealyticum | Dietzia timorensis | Corynebacterium ammoniagenes / casei | Corynebacterium mucifaciens | Corynebacterium sundsvallense / thomssenii | Corynebacterium tuberculostearicum | Corynebacterium massiliense | Corynebacterium coyleae / mucifaciens | Corynebacterium riegelii | Corynebacterium minutissimum | Corynebacterium fournierii / mucifaciens | Corynebacterium lipophiloflavum | Corynebacterium pseudogenitalium | Corynebacterium variabile | Corynebacterium jeikeium | Corynebacterium coyleae / pilbarense | Corynebacterium genitalium | Corynebacterium hadale | Corynebacterium freneyi / xerosis | Corynebacterium frankenforstense | Corynebacterium mastitidis | Dietzia cinnamea | Corynebacterium striatum | Corynebacterium simulans / striatum | Corynebacterium diphtheriae | Corynebacterium otitidis | Corynebacterium amycolatum / lactis | Corynebacterium amycolatum | Corynebacterium mycetoides | Corynebacterium glucuronolyticum)

[0105] 0.27% Phylotypes (0.5): pt _ 00031 (Sneathia amnii)

[0108] 0.27% Phylotypes (0.5): pt _ 00017 (Lactobacillus gasseri | Lactobacillus iners)

[0109] 0.26% Phylotypes (1): pt _ 00015 (Streptococcus fryi | Streptococcus salivarius | Streptococcus mutans | Streptococcus salivarius / vestibularis | Streptococcus parasanguinis | Streptococcus equinus / pasteurianus | Streptococcus mitis / oralis | Streptococcus pseudopneumoniae | Streptococcus halichoeri | Streptococcus sanguinis | Streptococcus thermophilus | Streptococcus agalactiae | Streptococcus sanguinis / sinensis | Streptococcus urinalis | Streptococcus periodonticum | Streptococcus pneumoniae | Streptococcus dysgalactiae | Streptococcus equinus / gallolyticus / pasteurianus | Streptococcus pseudoporcinus | Streptococcus mitis | Streptococcus equinus / gallolyticus | Streptococcus infantis | Streptococcus oralis | Streptococcus salivarius / thermophilus | Streptococcus peroris / sanguinis | Lactococcus lactis | Streptococcus intermedius | Streptococcus cristatus | Streptococcus gordonii | Streptococcus timonensis | Streptococcus anginosus | Streptococcus mitis / pneumoniae | Streptococcus mitis / pseudopneumoniae | Lactococcus chungangensis / raffinolactis)
0.26% Phylotypes (0.5): pt _ 00007 (Megasphaera hexanoica | Megasphaera micronuciformis | Colibacter massiliensis | Anaeroglobus geminatus | Megasphaera stantonii | Megasphaera massiliensis)

[0110] 0.26% Phylotypes (0.5): pt _ 00027 (Peptoniphilus vaginalis | Peptoniphilus harei | Peptoniphilus asaccharolyticus / harei | Peptoniphilus grossensis / harei | Peptoniphilus asaccharolyticus)

[0113] 0.26% Phylotypes (1): pt _ 00001 (Lactobacillus coleohominis | Lactobacillus acidophilus | Lactobacillus crispatus / helveticus | Lactobacillus amylovorus / crispatus | Lactobacillus gasseri / paragasseri | Lactobacillus acidophilus / crispatus / gallinarum | Lactobacillus acidophilus / crispatus | Lactobacillus plantarum | Lactobacillus salivarius | Lactobacillus helveticus | Lactobacillus oris / reuteri | Lactobacillus reuteri | Lactobacillus jensenii / psittaci | Lactobacillus brevis | Lactobacillus pontis | Lactobacillus gasseri | Lactobacillus crispatus / gallinarum | Lactobacillus psittaci | Lactobacillus vaginalis | Lactobacillus gasseri / johnsonii | Lactobacillus fermentum | Lactobacillus taiwanensis | Lactobacillus kitasatonis | Lactobacillus rhamnosus | Lactobacillus iners | Lactobacillus gasseri / johnsonii / paragasseri | Lactobacillus ruminis | Lactobacillus crispatus | Lactobacillus johnsonii | Lactobacillus delbrueckii | Lactobacillus casei / paracasei | Lactobacillus gallinarum / helveticus | Lactobacillus oris | Lactobacillus amylovorus | Lactobacillus paracasei | Lactobacillus jensenii | Lactobacillus gallinarum)

[0114] 0.26% Phylotypes (0.1): pt _ 00045 (Streptococcus intermedius | Streptococcus anginosus)

[0115] 0.26% Phylotypes (0.5): pt _ 00003 (Lactobacillus jensenii / psittaci | Lactobacillus jensenii | Lactobacillus acidophilus | Lactobacillus kitasatonis | Lactobacillus gasseri / johnsonii / paragasseri | Lactobacillus iners | Lactobacillus gasseri | Lactobacillus delbrueckii | Lactobacillus johnsonii | Lactobacillus gasseri / paragasseri | Lactobacillus psittaci | Lactobacillus gasseri / johnsonii | Lactobacillus taiwanensis)

[0117] 0.26% Phylotypes (0.5): pt _ 00026 (Finegoldia magna)

[0118] 0.26% Phylotypes (0.5): pt _ 00001 (Bifidobacterium tsurumiense | Bifidobacterium pseudocatenulatum | Bifidobacterium longum | Bifidobacterium animalis | Bifidobacterium animalis / dentium | Bifidobacterium bifidum | Bifidobacterium adolescentis | Bifidobacterium breve | Gardnerella vaginalis | Bifidobacterium scardovii | Bifidobacterium angulatum | Bifidobacterium dentium)

[0122] 0.26% Phylotypes (0.5): pt _ 00034 (Candidatus Peptoniphilus massiliensis | Peptoniphilus coxii | Peptoniphilus coxii / pacaensis | Peptoniphilus urinimassiliensis | Peptoniphilus pacaensis | Urinicoccus massiliensis | Peptoniphilus_urinimassiliensis / Candidatus_Peptoniphilus_massiliensis)

[0124] 0.26% Phylotypes (0.5): pt _ 00022 (Gardnerella vaginalis | Varibaculum cambriense | Alloscardovia omnicolens)

[0125] 0.26% Phylotypes (0.5): pt _ 00024 (Ureaplasma parvum / urealyticum | Ureaplasma parvum | Ureaplasma urealyticum)

[0127] 0.26% Phylotypes (0.5): pt _ 00010 (Eubacterium ramulus | Shuttleworthia satelles)

[0128] 0.25% Phylotypes (0.1): pt _ 00021 (Lactobacillus crispatus | Lactobacillus amylovorus / crispatus | Lactobacillus amylovorus)

[0129] 0.25% Phylotypes (1): pt _ 00019 (Gordonibacter urolithinfaciens | Collinsella aerofaciens | Olegusella massiliensis | Eggerthella lenta | Cryptobacterium curtum | Atopobium parvulum | Collinsella tanakaei | Olsenella uli | Atopobium deltae | Atopobium rimae | Atopobium vaginae | Atopobium minutum | Asaccharobacter_celatus / Adlercreutzia_equolifaciens)

[0130] 0.25% Phylotypes (0.5): pt _ 00030 (Prevotella bergensis / corporis | Prevotella bergensis | Prevotella colorans | Prevotella melaninogenica)

[0131] 0.25% Phylotypes (0.5): pt _ 00014 (Anaerococcus lactolyticus | Anaerococcus nagyae | Anaerococcus vaginalis | Anaerococcus octavius | Anaerococcus hydrogenalis / jeddahensis | Anaerococcus rubeinfantis / vaginalis | Anaerococcus hydrogenalis / vaginalis | Anaerococcus hydrogenalis | Anaerococcus prevotii | Anaerococcus marasmi / prevotii | Anaerococcus marasmi | Anaerococcus hydrogenalis / rubeinfantis | Anaerococcus senegalensis / vaginalis | Anaerococcus rubeinfantis | Anaerococcus senegalensis | Anaerococcus tetradius | Anaerococcus prevotii / tetradius)

[0134] 0.25% Phylotypes (1): pt _ 00006 (Corynebacterium thomssenii |

[0135] Corynebacterium pyruviciproducens | Rhodococcus erythropolis | Corynebacterium urinapleomorphum | Corynebacterium senegalense | Corynebacterium dentalis | Corynebacterium timonense | Corynebacterium neomassiliense / variabile | Corynebacterium aurimucosum / minutissimum | Corynebacterium pseudogenitalium / tuberculostearicum | Mycolicibacterium iranicum | Corynebacterium mucifaciens / ureicelerivorans | Corynebacterium glyciniphilum | Mycolicibacterium_obuense / Mycobacterium_kyogaense | Corynebacterium kroppenstedtii | Corynebacterium imitans | Corynebacterium tuscaniense | Corynebacterium freneyi | Corynebacterium matruchotii | Corynebacterium argentoratense | Lawsonella clevelandensis | Corynebacterium simulans | Corynebacterium accolens / macginleyi | Corynebacterium neomassiliense | Corynebacterium phoceense | Corynebacterium atypicum | Corynebacterium coyleae / genitalium | Corynebacterium sundsvallense | Corynebacterium pilbarense | Corynebacterium hadale / imitans | Corynebacterium aurimucosum | Corynebacterium appendicis | Corynebacterium coyleae | Corynebacterium urealyticum | Dietzia timorensis | Corynebacterium ammoniagenes / casei | Corynebacterium mucifaciens | Corynebacterium sundsvallense / thomssenii | Corynebacterium tuberculostearicum | Corynebacterium massiliense | Corynebacterium coyleae / mucifaciens | Corynebacterium riegelii | Corynebacterium minutissimum | Corynebacterium fournierii / mucifaciens | Mycolicibacterium mucogenicum | Corynebacterium lipophiloflavum | Corynebacterium pseudogenitalium | Corynebacterium coyleae / pilbarense | Corynebacterium jeikeium | Corynebacterium variabile | Corynebacterium genitalium | Corynebacterium hadale | Corynebacterium freneyi / xerosis | Corynebacterium frankenforstense | Dietzia cinnamea | Corynebacterium mastitidis | Corynebacterium striatum | Corynebacterium simulans / striatum | Corynebacterium diphtheriae | Corynebacterium otitidis | Corynebacterium amycolatum / lactis | Corynebacterium

[0138] amycolatum | Corynebacterium mycetoides | Corynebacterium glucuronolyticum)

[0139] 0.25% Phylotypes (0.1): pt _ 00001 (Lactobacillus iners)

[0140] 0.25% Phylotypes (0.5): pt _ 00054 (Haemophilus parainfluenzae | Haemophilus sputorum | Aggregatibacter segnis | Haemophilus pittmaniae | Haemophilus haemolyticus / influenzae | Haemophilus_haemolyticus / Aggregatibacter_segnis | Haemophilus influenzae | Haemophilus haemolyticus | Haemophilus parahaemolyticus | Haemophilus quentini | Frederiksenia canicola | Aggregatibacter aphrophilus | Haemophilus parainfluenzae / pittmaniae)

[0142] 0.25% CST (Valencia): CST III-B

[0143] 0.24% CST (Valencia): IV-B_sim

[0144] 0.24% Phylotypes (0.5): pt _ 00008 (Dialister invisus | Dialister micraerophilus | Dialister hominis / massiliensis | Allisonella histaminiformans | Negativicoccus succinicivorans | Dialister propionicifaciens | Dialister pneumosintes)

[0146] 0.24% Phylotypes (0.1): pt _ 00020 (Peptoniphilus vaginalis | Peptoniphilus harei | Peptoniphilus asaccharolyticus / harei | Peptoniphilus grossensis / harei | Peptoniphilus asaccharolyticus)

[0147] 0.24% Phylotypes (1): pt _ 00011 (Eubacterium ramulus | Shuttleworthia satelles)

[0148] 0.24% Phylotypes (0.1): pt _ 00013 (Ureaplasma parvum | Ureaplasma parvum / urealyticum | Ureaplasma urealyticum)

[0149] 0.24% Phylotypes (0.1): pt _ 00003 (Caecibacter spp | Megasphaera spp)

[0150] 0.24% CST (Valencia): CST IV-B

[0153] 0.24% Phylotypes (0.5): pt _ 00015 (Staphylococcus haemolyticus / hominis |

[0154] Staphylococcus warneri | Staphylococcus caprae / epidermidis | Salinicoccus qingdaonensis / salitudinis | Staphylococcus hominis / lugdunensis | Staphylococcus aureus | Staphylococcus capitis | Staphylococcus pseudintermedius | Staphylococcus auricularis | Staphylococcus caprae | Staphylococcus epidermidis / lugdunensis | Staphylococcus cohnii | Staphylococcus haemolyticus | Staphylococcus hominis | Staphylococcus hyicus | Staphylococcus aureus / hyicus | Staphylococcus saccharolyticus | Staphylococcus devriesei / epidermidis | Staphylococcus carnosus | Staphylococcus simulans | Staphylococcus epidermidis / hyicus | Staphylococcus lugdunensis | Staphylococcus pettenkoferi | Staphylococcus carnosus / condimenti | Staphylococcus epidermidis)

[0155] 0.24% Phylotypes (0.5): pt _ 00013 (Aerococcus urinaeequi / viridans | Granulicatella elegans | Aerococcus christensenii | Enterococcus faecalis / faecium | Aerococcus urinaeequi | Enterococcus cecorum | Enterococcus faecalis | Lactobacillus sakei | Aerococcus urinae | Granulicatella adiacens | Aerococcus sanguinicola | Brochothrix thermosphacta | Enterococcus durans / faecium | Enterococcus casseliflavus / gallinarum | Enterococcus italicus | Enterococcus faecium)

[0156] 0.24% Phylotypes (0.5): pt _ 00018 (Caecibacter spp | Megasphaera spp)

[0157] 0.24% Taxonomy (Species): Lactobacillus gasseri

[0160] 0.24% Phylotypes (1): pt _ 00004 (Staphylococcus haemolyticus / hominis | Staphylococcus warneri | Aerococcus urinaeequi / viridans | Leuconostoc citreum | Bacillus nealsonii | Eremococcus coleocola | Staphylococcus caprae / epidermidis | Bacillus foraminis | Salinicoccus qingdaonensis / salitudinis | Aeribacillus pallidus | Globicatella sulfidifaciens | Enterococcus durans / faecium | Dolosigranulum pigrum | Enterococcus casseliflavus / gallinarum | Facklamia ignava | Staphylococcus hominis / lugdunensis | Staphylococcus aureus | Bacillus simplex | Gemella morbillorum | Staphylococcus capitis | Granulicatella elegans | Aerococcus christensenii | Lactobacillus sakei | Gemella asaccharolytica | Staphylococcus auricularis | Nosocomiicoccus ampullae | Staphylococcus pseudintermedius | Staphylococcus epidermidis / lugdunensis | Staphylococcus caprae | Aerococcus urinae | Staphylococcus cohnii | Staphylococcus haemolyticus | Anoxybacillus kestanbolensis | Brochothrix thermosphacta | Gemella sanguinis | Nosocomiicoccus massiliensis | Bacillus thermoamylovorans | Gemella haemolysans | Facklamia hominis | Geobacillus stearothermophilus | Facklamia languida | Leuconostoc gelidum | Gemella haemolysans / sanguinis | Granulicatella adiacens | Staphylococcus hominis | Aerococcus sanguinicola | Abiotrophia defectiva | Staphylococcus hyicus | Bacillus pumilus | Enterococcus faecium | Aerococcus urinaeequi | Staphylococcus aureus / hyicus | Staphylococcus saccharolyticus | Nosocomiicoccus ampullae / massiliensis | Enterococcus faecalis / faecium | Turicibacter sanguinis | Enterococcus faecalis | Leuconostoc mesenteroides | Exiguobacterium acetylicum | Enterococcus cecorum | Staphylococcus devriesei / epidermidis | Planococcus rifietoensis | Staphylococcus carnosus | Staphylococcus simulans | Staphylococcus epidermidis / hyicus | Staphylococcus lugdunensis | Staphylococcus pettenkoferi | Enterococcus italicus | Gemella taiwanensis | Staphylococcus carnosus / condimenti | Staphylococcus epidermidis)

[0164] 0.24% Phylotypes (1): pt _ 00030 (Fusobacterium massiliense | Fusobacterium equinum / gonidiaformans | Fusobacterium hwasookii / nucleatum | Fusobacterium periodonticum | Fusobacterium nucleatum | Fusobacterium equinum | Fusobacterium nucleatum / periodonticum | Fusobacterium gonidiaformans)

[0165] 0.24% Phylotypes (0.5): pt _ 00019 (Olsenella urininfantis | Olegusella massiliensis | Atopobium parvulum | Olsenella uli | Atopobium deltae | Atopobium rimae | Atopobium vaginae | Atopobium minutum)

[0166] 0.24% Phylotypes (0.5): pt _ 00049 (Terrisporobacter glycolicus | Peptostreptococcus stomatis | Paeniclostridium sordellii | Clostridioides difficile | Intestinibacter bartlettii | Peptostreptococcus anaerobius)

[0167] 0.24% Taxonomy (Species): Corynebacterium tuberculostearicum

[0168] 0.24% Phylotypes (1): pt _ 00021 (Leptotrichia trevisanii | Leptotrichia massiliensis | Leptotrichia hofstadii | Leptotrichia goodfellowii | Leptotrichia shahii | Leptotrichia buccalis | Leptotrichia wadei | Leptotrichia buccalis / massiliensis / shahii | Sneathia amnii | Sneathia sanguinegens)

[0169] 0.24% Phylotypes (0.1): pt _ 00008 (Prevotella timonensis)

[0170] 0.24% Taxonomy (Family): Lachnospiraceae

[0171] 0.24% Phylotypes (0.5): pt _ 00021 (Oligella urethralis | Achromobacter spanius | Achromobacter insuavis / xylosoxidans | Alcaligenes faecalis | Achromobacter xylosoxidans | Achromobacter spanius / xylosoxidans)
0.24% Phylotypes (0.1): pt _ 00068 (Prevotella timonensis)

[0172] 0.24% Phylotypes (1): pt _ 00008 (Selenomonas sputigena | Selenomonas noxia | Selenomonas infelix)

[0173] 0.23% Taxonomy (Species): Lactobacillus jensenii

[0174] 0.23% Phylotypes (0.5): pt _ 00036 (Slackia exigua | Gordonibacter urolithinfaciens | Eggerthella lenta | Asaccharobacter_celatus / Adlercreutzia_equolifaciens | Slackia isoflavoniconvertens)

[0177] 0.23% Phylotypes (0.1): pt _ 00042 (Fenollaria timonensis | Fenollaria massiliensis / timonensis)

[0178] 0.23% Taxonomy (Family): Pasteurellaceae

[0179] 0.23% Phylotypes (0.1): pt _ 00041 (Lactobacillus crispatus / helveticus | Lactobacillus crispatus)

[0181] 0.23% Taxonomy (Species): Prevotella buccalis

[0182] 0.23% Phylotypes (1): pt _ 00016 (Peptoniphilus vaginalis | Helcococcus kunzii | Peptoniphilus harei | Lagierella massiliensis | Peptoniphilus catoniae | Peptoniphilus asaccharolyticus / harei | Peptoniphilus obesi | Peptoniphilus grossensis / harei | Helcococcus sueciensis | Peptoniphilus asaccharolyticus)

[0183] 0.23% Phylotypes (0.5): pt _ 00011 (Lactobacillus coleohominis | Lactobacillus pontis | Lactobacillus oris / reuteri | Lactobacillus vaginalis | Lactobacillus oris | Lactobacillus panis | Lactobacillus fermentum | Lactobacillus reuteri)
0.23% Phylotypes (0.5): pt _ 00055 (Mycoplasma salivarium | Mycoplasma hominis)

[0184] 0.23% Phylotypes (0.5): pt _ 00029 (Gemella asaccharolytica | Gemella haemolysans / sanguinis | Gemella sanguinis | Gemella haemolysans | Gemella morbillorum)

[0187] 0.23% Phylotypes (1): pt _ 00009 (Selenomonas felix | Dialister micraerophilus |

[0188] Selenomonas infelix | Colibacter massiliensis | Veillonella dispar | Anaeroglobus geminatus | Megasphaera stantonii | Veillonella seminalis | Veillonella atypica / dispar | Acidaminococcus intestini | Dialister pneumosintes | Dialister invisus | Veillonella denticariosi / parvula | Veillonella dispar / parvula | Veillonella parvula | Veillonella dispar / tobetsuensis | Veillonella montpellierensis | Phascolarctobacterium faecium | Veillonella atypica / parvula / rogosae | Megasphaera massiliensis | Veillonella rogosae | Megasphaera hexanoica | Dialister hominis / massiliensis | Negativicoccus succinicivorans | Veillonella atypica / rogosae | Dialister propionicifaciens | Veillonella parvula / tobetsuensis | Phascolarctobacterium succinatutens | Selenomonas sputigena | Megasphaera micronuciformis | Allisonella histaminiformans | Veillonella atypica)

[0189] 0.23% Phylotypes (0.1): pt _ 00014 (Gardnerella vaginalis)

[0190] 0.23% Phylotypes (0.5): pt _ 00037 (Parvimonas micra)

[0193] 0.23% Phylotypes (0.5): pt _ 00033 (Citrobacter amalonaticus / murliniae |

[0194] Enterobacter cloacae / hormaechei | Serratia quinivorans | Pantoea ananatis / dispersa | Klebsiella oxytoca | Enterobacter cloacae | Serratia fonticola | Citrobacter koseri | Enterobacter_kobei / Klebsiella_aerogenes | Citrobacter braakii / freundii | Escherichia coli / fergusonii | Yokenella regensburgei | Escherichia_coli / Shigella_sonnei | Shigella dysenteriae / sonnei | Raoultella planticola | Kluyvera ascorbata / cryocrescens | Proteus mirabilis | Serratia marcescens | Escherichia_coli / Shigella_dysenteriae | Morganella morganii | Enterobacter_cloacae / Escherichia_coli | Shigella flexneri / sonnei | Escherichia coli | Serratia proteamaculans / quinivorans | Escherichia fergusonii | Kluyvera intermedia | Pantoea agglomerans | Enterobacter roggenkampii | Klebsiella aerogenes | Kosakonia_cowanii / Salmonella_bongori | Citrobacter freundii | Kluyvera ascorbata | Erwinia aphidicola / persicina | Enterobacter hormaechei | Klebsiella pneumoniae / variicola | Shigella flexneri | Enterobacter asburiae | Escherichia_coli / Shigella_flexneri | Shigella sonnei | Erwinia_billingiae / [Pantoea]_cedenensis | Pantoea_agglomerans / Escherichia_coli | Franconibacter helveticus | Enterobacter asburiae / cloacae | Klebsiella milletis / pneumoniae | Salmonella enterica | Enterobacter_hormaechei / Enterobacter_cloacae / Escherichia_coli | Klebsiella pneumoniae)

[0195] 0.23% Phylotypes (0.1): pt _ 00002 (Gardnerella vaginalis)

[0196] 0.23% Phylotypes (0.1): pt _ 00006 (Pseudomonas fluorescens / protegens / veronii | Pseudomonas fragi | Pseudomonas veronii | Pseudomonas fluorescens / veronii | Pseudomonas fragi / weihenstephanensis)

[0197] 0.23% Phylotypes (0.5): pt _ 00088 (Moryella indoligenes | Moryella_indoligenes / Fusobacterium_naviforme | Stomatobaculum longum)

[0199] 0.23% Taxonomy (Species): Lachnospiraceae

[0202] 0.23% Phylotypes (0.5): pt _ 00023 (Peptoniphilus lacrimalis | Peptoniphilus phoceensis / timonensis | Peptoniphilus grossensis | Peptoniphilus harei | Peptoniphilus duerdenii | Peptoniphilus grossensis / lacydonensis | Peptoniphilus senegalensis | Peptoniphilus harei / lacydonensis | Peptoniphilus lacydonensis)

[0203] 0.23% Phylotypes (0.5): pt _ 00012 (Pseudomonas oleovorans | Pseudomonas fluorescens | Pseudomonas entomophila / putida | Pseudomonas thivervalensis | Pseudomonas fluorescens / protegens | Pseudomonas guguanensis | Pseudomonas protegens | Pseudomonas fluorescens / protegens / veronii | Pseudomonas fragi | Pseudomonas graminis | Pseudomonas oleovorans / peli | Pseudomonas brenneri | Pseudomonas plecoglossicida / putida | Pseudomonas alcaligenes | Pseudomonas nitroreducens | Pseudomonas putida | Pseudomonas oryzihabitans | Pseudomonas cichorii / putida | Pseudomonas monteilii / putida | Pseudomonas xanthomarina | Pseudomonas aeruginosa | Pseudomonas stutzeri | Pseudomonas alcaliphila / mendocina | Pseudomonas rhizosphaerae | Pseudomonas japonica | Pseudomonas luteola | Pseudomonas veronii | Pseudomonas fluorescens / veronii | Pseudomonas fragi / weihenstephanensis | Pseudomonas koreensis | Pseudomonas formosensis | Pseudomonas rhizosphaerae / vranovensis | Pseudomonas guezennei / otitidis)

[0206] 0.23% Phylotypes (1): pt _ 00003 (Alloscardovia omnicolens | Brachybacterium timonense | Actinotignum schaalii / timonense | Kytococcus schroeteri | Micrococcus luteus | Winkia neuii | Brachybacterium paraconglomeratum | Actinomyces hongkongensis | Brevibacterium luteolum | Actinomyces naeslundii | Actinomyces viscosus | Brevibacterium celere / sanguinis | Actinomyces massiliensis | Schaalia_odontolytica / Actinomyces_pacaensis | Pseudoclavibacter faecalis | Bifidobacterium scardovii | Rothia aeria | Brevibacterium casei / sanguinis | Rothia mucilaginosa | Janibacter indicus | Pseudoglutamicibacter albus | Gardnerella vaginalis | Gleimia europaea | Actinotignum schaalii | Bifidobacterium pseudocatenulatum | Dermacoccus nishinomiyaensis | Rothia aeria / dentocariosa | Varibaculum timonense | Kocuria rhizophila | Brevibacterium paucivorans | Actinomyces oris | Trueperella bernardiae | Trueperella pyogenes | Varibaculum cambriense / timonense | Dermabacter hominis / jinjuensis | Bifidobacterium breve | Pseudoglutamicibacter albus / cumminsii | Arcanobacterium ihumii | Varibaculum anthropi / massiliense | Actinomyces urogenitalis | Bifidobacterium longum | Dermabacter jinjuensis | Actinotignum timonense | Varibaculum vaginae | Mobiluncus curtisii | Flaviflexus huanghaiensis / massiliensis / salsibiostraticola | Brevibacterium sanguinis | Micrococcus luteus / terreus | Rothia dentocariosa | Bifidobacterium dentium | Dermabacter hominis | Actinomyces hongkongensis / pacaensis | Clavibacter michiganensis | Bifidobacterium bifidum | Brachybacterium muris | Arthrobacter russicus | Actinobaculum massiliense | Actinomyces_hongkongensis / Rothia_dentocariosa | Brevibacterium ravenspurgense | Bifidobacterium animalis | Rothia amarae | Rhodoluna lacicola | Pseudoclavibacter bifida | Micrococcus luteus / yunnanensis | Varibaculum cambriense | Varibaculum anthropi / cambriense | Arcanobacterium urinimassiliense | Varibaculum anthropi | Mobiluncus curtisii / holmesii | Bifidobacterium tsurumiense | Kocuria palustris | Mobiluncus mulieris | Varibaculum timonense / vaginae | Kytococcus sedentarius | Micrococcus yunnanensis | Schaalia odontolytica | Brevibacterium mcbrellneri | Actinomyces oris / viscosus | Bifidobacterium angulatum | Micrococcus lylae | Pseudoglutamicibacter cumminsii | Trueperella bernardiae / pyogenes | Nesterenkonia alba | Bifidobacterium adolescentis | Bifidobacterium animalis / dentium | Schaalia turicensis)

[0210] 0.23% Phylotypes (0.1): pt _ 00090 (Dialister micraerophilus)

[0211] 0.23% Phylotypes (1): pt _ 00020 (Ureaplasma parvum / urealyticum |

[0212] Ureaplasma parvum | Ureaplasma urealyticum)

[0213] 0.23% Phylotypes (0.1): pt _ 00058 (Mycoplasma hominis)

[0214] 0.23% Phylotypes (1): pt _ 00014 (Slackia exigua | Olsenella urininfantis |

[0215] Senegalimassilia anaerobia | Atopobium vaginae | Slackia isoflavoniconvertens)

[0218] 0.23% Phylotypes (1): pt _ 00002 (Peptoniphilus lacrimalis | Anaerococcus rubeinfantis / vaginalis | Parvimonas micra | Anaerococcus hydrogenalis | [Bacteroides] coagulans | Levyella massiliensis | Ndongobacter massiliensis | Urinicoccus massiliensis | Anaerococcus provencensis | Anaerococcus mediterraneensis | Ezakiella massiliensis / peruensis | Anaerococcus lactolyticus | Anaerococcus nagyae | Neofamilia massiliensis | Peptoniphilus coxii | Peptoniphilus harei | Finegoldia magna | Anaerococcus marasmi | Anaerococcus hydrogenalis / rubeinfantis | Lagierella massiliensis | Anaerococcus rubeinfantis | Peptoniphilus_urinimassiliensis / Candidatus_Peptoniphilus_massiliensis | Fenollaria massiliensis | Murdochiella vaginalis | Anaerococcus tetradius | Anaerococcus prevotii / tetradius | Anaerococcus octavius | Anaerococcus hydrogenalis / jeddahensis | Peptoniphilus grossensis | Murdochiella asaccharolytica | Peptoniphilus urinimassiliensis | Anaerococcus marasmi / prevotii | Peptoniphilus grossensis / lacydonensis | Anaerococcus senegalensis / vaginalis | Peptoniphilus obesi | Peptoniphilus phoceensis / timonensis | Anaerococcus senegalensis | Fenollaria massiliensis / timonensis | Murdochiella_asaccharolytica / Levyella_massiliensis | Candidatus Peptoniphilus massiliensis | Anaerococcus vaginalis | Anaerococcus hydrogenalis / vaginalis | Murdochiella massiliensis | Peptoniphilus coxii / pacaensis | Fenollaria timonensis | Anaerococcus prevotii | Peptoniphilus pacaensis | Peptoniphilus duerdenii | Peptoniphilus senegalensis | Ezakiella massiliensis | Peptoniphilus harei / lacydonensis | Anaerococcus urinomassiliensis | Peptoniphilus lacydonensis | Anaerococcus lactolyticus / mediterraneensis)

[0219] 0.23% Phylotypes (0.1): pt _ 00016 (Finegoldia magna)

[0220] 0.23% Phylotypes (0.5): pt _ 00040 (Veillonella rogosae | Veillonella denticariosi / parvula | Veillonella parvula / tobetsuensis | Veillonella dispar / parvula | Veillonella parvula | Veillonella dispar / tobetsuensis | Veillonella dispar | Veillonella seminalis | Veillonella atypica / rogosae | Veillonella montpellierensis | Veillonella atypica / parvula / rogosae | Veillonella atypica / dispar | Veillonella atypica)

[0223] 0.23% Phylotypes (0.1): pt _ 00034 (Sneathia amnii)

[0224] 0.22% Phylotypes (0.1): pt _ 00036 (Sneathia sanguinegens)

[0225] 0.22% Phylotypes (0.1): pt _ 00030 (Corynebacterium pseudogenitalium |

[0226] Corynebacterium pseudogenitalium / tuberculostearicum | Corynebacterium tuberculostearicum)

[0227] 0.22% Phylotypes (0.5): pt _ 00071 (Porphyromonas endodontalis)

[0228] 0.22% Phylotypes (0.1): pt _ 00007 (Staphylococcus haemolyticus / hominis |

[0229] Staphylococcus warneri | Staphylococcus caprae / epidermidis | Staphylococcus hominis / lugdunensis | Staphylococcus aureus | Staphylococcus capitis | Staphylococcus pseudintermedius | Staphylococcus auricularis | Staphylococcus caprae | Staphylococcus epidermidis / lugdunensis | Staphylococcus haemolyticus | Staphylococcus hominis | Staphylococcus hyicus | Staphylococcus aureus / hyicus | Staphylococcus saccharolyticus | Staphylococcus devriesei / epidermidis | Staphylococcus epidermidis / hyicus | Staphylococcus lugdunensis | Staphylococcus pettenkoferi | Staphylococcus epidermidis)

[0230] 0.22% Taxonomy (Genus): Sneathia

[0231] 0.22% CST (Valencia): V_sim

[0232] 0.22% Taxonomy (Species): Prevotella bivia

[0235] 0.22% Phylotypes (1): pt _ 00035

[0236] (Agrobacterium tumefaciens / Rhizobium pusense | Oceanicola granulosus | Pleomorphomonas koreensis / oryzae | Brevundimonas aurantiaca / mediterranea | Aureimonas ureilytica | Caulobacter vibrioides | Bosea lupini | Methylobacterium adhaesivum | Methylobacterium radiotolerans | Methylorubrum populi / zatmanii | Methylobacterium aquaticum | Methylorubrum rhodesianum | Afipia_massiliensis / Bradyrhizobium_jicamae | Bradyrhizobium elkanii | Methylorubrum rhodinum / salsuginis | Pannonibacter carbonis | Agrobacterium larrymoorei | Brevundimonas lenta / subvibrioides | Shinella zoogloeoides | Paracoccus solventivorans | Rhizobium grahamii | Mesorhizobium loti | Microvirga calopogonii / flocculans / lotononidis | Rhizobium arenae | Paracoccus siganidrum | Paracoccus yeei | Microvirga guangxiensis | Bosea vestrisii | Caulobacter segnis / vibrioides | Bosea robiniae / thiooxidans | Methylobacterium goesingense | Paracoccus marcusii | Microvirga flocculans / makkahensis | Chelatococcus composti / daeguensis | Bosea thiooxidans | Ochrobactrum pituitosum / rhizosphaerae | Microvirga aerilata | Afipia birgiae | Rhizobium rosettiformans / Agrobacterium tumefaciens | Methylobacterium adhaesivum / goesingense | Aquabacter spiritensis | Brevundimonas vesicularis | Paracoccus contaminans | Afipia genosp._1 / genosp._2 | Methylocystis hirsuta / parvus | Phyllobacterium myrsinacearum | Paracoccus marinus / siganidrum | Brevundimonas terrae | Aquamicrobium lusatiense | Brevundimonas diminuta / olei | Brevundimonas bullata | Agrobacterium tumefaciens | Brevundimonas diminuta | Bosea robiniae | Paracoccus pantotrophus | Bradyrhizobium canariense | Methylorubrum extorquens / zatmanii | Rhizobium flavum / halotolerans | Phenylobacterium koreense | Bradyrhizobium jicamae | Phreatobacter stygius | Bradyrhizobium japonicum | Rhizobium petrolearium)

[0237] 0.22% Taxonomy (Genus): Veillonella

[0240] 0.22% Phylotypes (0.5): pt _ 00052 (Varibaculum anthropi / cambriense |

[0241] Mobiluncus curtisii / holmesii | Varibaculum anthropi / massiliense | Mobiluncus mulieris | Mobiluncus holmesii | Mobiluncus curtisii | Varibaculum cambriense)

[0242] 0.22% Phylotypes (1): pt _ 00013 (Ralstonia insidiosa | Alcaligenes faecalis |

[0243] Herbaspirillum lusitanum | Acidovorax caeni | Pseudacidovorax intermedius | Conchiformibius steedae | Massilia eurypsychrophila | Lautropia dentalis | Comamonas denitrificans / testosteroni | Noviherbaspirillum suwonense | Azospira oryzae | Lautropia mirabilis | Neisseria cinerea | Parasutterella excrementihominis | Sutterella wadsworthensis | Pelomonas aquatica / puraquae | Achromobacter insuavis / xylosoxidans | Vogesella perlucida | Neisseria elongata | Neisseria bacilliformis | Massilia consociata | Neisseria flavescens / perflava | Acidovorax temperans | Curvibacter lanceolatus | Massilia timonae | Telluria mixta | Candidatus Methylopumilus turicensis | Xylophilus ampelinus | Tepidiphilus succinatimandens | Acidovorax wautersii | Neisseria gonorrhoeae | Neisseria flavescens / meningitidis | Neisseria mucosa | Acidovorax avenae / wautersii | Comamonas koreensis / sediminis | Massilia_eurypsychrophila / Oligella_urethralis | Massilia aerilata | Comamonas aquatica | Achromobacter xylosoxidans | Neisseria oralis | Comamonas denitrificans | Mesosutterella multiformis | Neisseria meningitidis | Comamonas aquatica / jiangduensis / kerstersii | Mitsuaria chitosanitabida | Massilia consociata / varians | Morococcus cerebrosus | Microvirgula aerodenitrificans | Oligella urethralis | Variovorax ginsengisoli | Variovorax_soli / Xenophilus_aerolatus | Achromobacter spanius / xylosoxidans | Dechloromonas agitata | Variovorax guangxiensis / paradoxus | Eikenella corrodens | Massilia consociata / niastensis | Methylophilus leisingeri | Acidovorax delafieldii | Achromobacter spanius | Parasutterella secunda | Kingella oralis | Pelomonas puraquae / saccharophila | Schlegelella thermodepolymerans | Neisseria flavescens / pharyngis | Janthinobacterium lividum | Massilia putida | Tepidimonas aquatica | Comamonas koreensis | Neisseria flavescens / subflava | Neisseria flava | Zoogloea oryzae | Ralstonia pickettii | Neisseria mucosa / pharyngis | Pelomonas aquatica | Comamonas testosteroni | Massilia varians | Paraburkholderia caledonica / strydomiana | Massilia armeniaca / dura | Neisseria perflava | Neisseria sicca | Massilia aurea / timonae | Neisseria subflava | Ralstonia insidiosa / pickettii | Comamonas jiangduensis | Burkholderia cepacia | Brachymonas denitrificans | Neisseria mucosa / sicca | Massilia haematophila | Polynucleobacter acidiphobus | Sutterella stercoricanis | Acidovorax radicis | Methyloversatilis universalis | Paraburkholderia caballeronis | Oxalobacter formigenes | Neisseria flavescens | Nitrosospira briensis | Melaminivora alkalimesophila | Herbaspirillum huttiense)

[0247] 0.22% Taxonomy (Family): Hungateiclostridiaceae

[0248] 0.22% Phylotypes (0.5): pt _ 00057 (Lactobacillus acidophilus | Lactobacillus crispatus / helveticus | Lactobacillus crispatus | Lactobacillus delbrueckii)

0.22% Phylotypes (0.5): pt _ 00043 (Sneathia amnii | Sneathia sanguinegens)

[0249] 0.22% Phylotypes (0.5): pt _ 00038 (Atopobium vaginae)

[0250] 0.22% Taxonomy (Family): Mycoplasmataceae

[0251] 0.22% Phylotypes (0.5): pt _ 00020 (Anaerococcus lactolyticus | Anaerococcus octavius | Anaerococcus marasmi | Anaerococcus provencensis | Anaerococcus urinomassiliensis | Anaerococcus mediterraneensis | Anaerococcus lactolyticus / mediterraneensis)

[0254] 0.22% Phylotypes (1): pt _ 00010 (Varibaculum anthropi / cambriense |

[0255] Mobiluncus curtisii / holmesii | Actinomyces pacaensis | Arcanobacterium ihumii | Actinomyces_urinae / Gleimia_europaea | Varibaculum cambriense / timonense | Scardovia wiggsiae | Mobiluncus holmesii | Varibaculum timonense | Schaalia turicensis | Actinomyces hongkongensis | Actinomyces oris | Gardnerella vaginalis | Mobiluncus curtisii | Varibaculum cambriense | Actinotignum schaalii)

[0256] 0.22% Phylotypes (0.1): pt _ 00012 (Prevotella buccalis)

[0257] 0.22% Phylotypes (0.5): pt _ 00039 (Saccharofermentans acetigenes)

0.22% Taxonomy (Genus): Streptococcus

[0258] 0.22% Phylotypes (0.5): pt _ 00050 (Gardnerella vaginalis | Scardovia wiggsiae)

[0259] 0.22% Taxonomy (Genus): Staphylococcus

[0260] 0.22% Taxonomy (Species): Megasphaera

[0261] 0.22% Taxonomy (Family): Leptotrichiaceae

[0262] 0.22% Taxonomy (Species): Finegoldia magna

[0263] 0.22% Taxonomy (Genus): Lactobacillus

[0264] 0.22% Taxonomy (Species): Parvimonas micra

[0265] 0.22% Phylotypes (0.1): pt _ 00069 (Anaerococcus vaginalis)

[0266] 0.22% Phylotypes (0.5): pt _ 00104 (Porphyromonas asaccharolytica / uenonis |

[0267] Porphyromonas uenonis | Porphyromonas asaccharolytica)

[0268] 0.22% Taxonomy (Genus): Enterococcus

[0269] 0.22% Taxonomy (Family): Streptococcaceae

[0270] 0.22% Phylotypes (0.5): pt _ 00032 (Arcanobacterium ihumii | Arcanobacterium urinimassiliense | Actinomyces_hongkongensis / Rothia_dentocariosa | Flaviflexus huanghaiensis / massiliensis / salsibiostraticola | Trueperella pyogenes | Actinotignum schaalii / timonense | Varibaculum timonense | Actinotignum timonense | Trueperella bernardiae / pyogenes | Actinomyces hongkongensis | Trueperella bernardiae | Actinobaculum massiliense | Actinotignum schaalii)

[0273] 0.22% Taxonomy (Genus): Gardnerella

[0274] 0.22% Phylotypes (0.1): pt _ 00039 (Lactobacillus crispatus / helveticus |

[0275] Lactobacillus crispatus / gallinarum | Lactobacillus crispatus | Lactobacillus helveticus)

[0276] 0.22% Phylotypes (0.5): pt _ 00041 (Fenollaria massiliensis | Fenollaria timonensis | Fenollaria massiliensis / timonensis)

[0277] 0.22% Phylotypes (0.1): pt _ 00027 (Lactobacillus coleohominis)

[0278] 0.22% Phylotypes (0.5): pt _ 00044 (Actinomyces urogenitalis | Varibaculum timonense | Winkia neuii | Actinomyces hongkongensis | Mobiluncus mulieris)

[0279] 0.22% Phylotypes (0.1): pt _ 00029 (Dialister micraerophilus)

[0280] 0.22% Phylotypes (0.5): pt _ 00028 (Prevotella amnii | Prevotella bivia)

[0281] 0.22% Phylotypes (0.5): pt _ 00065 (Porphyromonas asaccharolytica / uenonis | Porphyromonas asaccharolytica | Porphyromonas asaccharolytica / bennonis / uenonis | Porphyromonas uenonis)

[0282] 0.22% Phylotypes (0.1): pt _ 00053 (Prevotella disiens)

[0283] 0.22% Phylotypes (0.1): pt _ 00005 (Lactobacillus gasseri / johnsonii / paragasseri | Lactobacillus gasseri | Lactobacillus johnsonii | Lactobacillus gasseri / paragasseri | Lactobacillus gasseri / johnsonii)

0.22% Phylotypes (0.5): pt _ 00045 (Campylobacter hominis | Campylobacter rectus / showae | Campylobacter gracilis | Campylobacter ureolyticus | Campylobacter showae | Campylobacter concisus | Campylobacter gracilis / hominis | Campylobacter canadensis | Campylobacter hominis / sputorum | Campylobacter sputorum | Campylobacter canadensis / hominis)

[0284] 0.21% Taxonomy (Genus): Haemophilus

[0285] 0.21% Taxonomy (Genus): Aerococcus

[0286] 0.21% Phylotypes (0.1): pt _ 00025 (Lactobacillus oris / reuteri | Lactobacillus reuteri)

[0289] 0.21% Phylotypes (0.5): pt _ 00046 (Enterocloster asparagiformis / lavalensis |

[0290] Lactonifactor longoviformis | Dorea formicigenerans | Ruminococcus lactaris | Massilistercora timonensis | [Ruminococcus] torques | Coprococcus_phoceensis / Tyzzerella_nexilis | Lachnoclostridium phocaeense | Eubacterium ventriosum | [Ruminococcus] gnavus | Dorea longicatena | Mediterraneibacter massiliensis | Enterocloster bolteae / clostridioformis | Coprococcus comes | Fusicatenibacter saccharivorans | Hungatella hathewayi | Faecalimonas umbilicata | [Clostridium] symbiosum | Dorea phocaeensis | Enterocloster citroniae | Enterocloster aldensis | Sellimonas intestinalis | Blautia hominis | Faecalicatena contorta | Ruminococcus faecis | Lacrimispora amygdalina / saccharolytica | Lachnospira eligens | Anaerostipes hadrus | Lachnoclostridium pacaense)

[0291] 0.21% Taxonomy (Genus): Dialister

[0292] 0.21% Taxonomy (Species): Dialister micraerophilus

[0293] 0.21% Phylotypes (1): pt _ 00029 (Terrisporobacter glycolicus |

[0294] Peptostreptococcus stomatis | Paeniclostridium sordellii | Clostridioides difficile | Romboutsia timonensis | Intestinibacter bartlettii | Peptostreptococcus anaerobius)

[0295] 0.21% Phylotypes (0.1): pt _ 00017 (Atopobium vaginae)

[0298] 0.21% Phylotypes (1): pt _ 00012 (Halomonas hamiltonii | Psychrobacter faecalis / pulmonis | Psychrobacter piechaudii / sanguinis | Moraxella catarrhalis / nonliquefaciens | Acinetobacter modestus | Pseudomonas nitroreducens | Acinetobacter baumannii / calcoaceticus | Pseudomonas cichorii / putida | Pseudomonas monteilii / putida | Pseudomonas alcaliphila / mendocina | Pseudomonas veronii | Psychrobacter alimentarius | Pseudomonas formosensis | Pseudomonas guezennei / otitidis | Acinetobacter calcoaceticus / pittii | Psychrobacter sanguinis | Acinetobacter harbinensis / lwoffii | Pseudomonas oleovorans | Psychrobacter cibarius | Pseudomonas guguanensis | Acinetobacter schindleri | Acinetobacter radioresistens | Alkanindiges hongkongensis | Pseudomonas plecoglossicida / putida | Pseudomonas brenneri | Psychrobacter urativorans | Pseudomonas alcaligenes | Acinetobacter bereziniae / guillouiae | Pseudomonas oryzihabitans | Acinetobacter bereziniae / guillouiae / junii | Marinomonas arctica / rhizomae | Pseudomonas rhizosphaerae | Prolinoborus_fasciculus / Acinetobacter_lwoffii | Acinetobacter guillouiae / junii | Pseudomonas fluorescens / veronii | Acinetobacter calcoaceticus / schindleri | Acinetobacter bouvetii / johnsonii | Marinobacter hydrocarbonoclasticus | Pseudomonas fluorescens | Pseudomonas thivervalensis | Psychrobacter namhaensis | Pseudomonas fluorescens / protegens | Moraxella osloensis | Pseudomonas fluorescens / protegens / veronii | Acinetobacter haemolyticus | Pseudomonas graminis | Acinetobacter lwoffii | Pseudomonas oleovorans / peli | Cardiobacterium hominis | Psychrobacter alimentarius / aquaticus | Pseudomonas xanthomarina | Acinetobacter ursingii | Acinetobacter calcoaceticus | Pseudomonas stutzeri | Acinetobacter pittii | Halomonas meridiana | Halomonas ventosae | Pseudomonas japonica | Pseudomonas luteola | Acinetobacter parvus | Pseudomonas koreensis | Cellvibrio mixtus | Pseudomonas fragi / weihenstephanensis | Pseudomonas rhizosphaerae / vranovensis | Pseudomonas entomophila / putida | Pseudomonas protegens | Moraxella catarrhalis | Pseudomonas fragi | Psychrobacter phenylpyruvicus | Pseudomonas putida | Acinetobacter baumannii | Acinetobacter johnsonii | Pseudomonas aeruginosa | Acinetobacter guillouiae | Psychrobacter cryohalolentis / fozii | Acinetobacter junii | Salinicola acroporae / salarius)

[0302] 0.21% Phylotypes (0.5): pt _ 00144 (Peptococcus niger | Peptococcus simiae)

[0303] 0.21% Taxonomy (Genus): Peptoniphilus

[0306] 0.21% Phylotypes (0.5): pt _ 00035 (Brevibacterium casei / sanguinis | Rothia mucilaginosa | Dermabacter jinjuensis | Brachybacterium timonense | Janibacter indicus | Pseudoglutamicibacter albus | Kytococcus schroeteri | Micrococcus luteus | Dermacoccus nishinomiyaensis | Brevibacterium sanguinis | Rothia aeria / dentocariosa | Kocuria rhizophila | Brachybacterium paraconglomeratum | Brevibacterium paucivorans | Micrococcus luteus / terreus | Rothia dentocariosa | Kocuria palustris | Kytococcus sedentarius | Brevibacterium luteolum | Dermabacter hominis | Dermabacter hominis / jinjuensis | Micrococcus yunnanensis | Brevibacterium celere / sanguinis | Brachybacterium muris | Pseudoglutamicibacter albus / cumminsii | Brevibacterium mcbrellneri | Arthrobacter russicus | Micrococcus lylae | Brevibacterium ravenspurgense | Rothia amarae | Pseudoglutamicibacter cumminsii | Nesterenkonia alba | Micrococcus luteus / yunnanensis | Rothia aeria)

0.21% Taxonomy (Genus): Escherichia

[0307] 0.21% Phylotypes (0.5): pt _ 00060 (Eremococcus coleocola | Facklamia hominis | Facklamia languida)

[0308] 0.21% Phylotypes (0.5): pt _ 00047 ([Bacteroides] coagulans | Ezakiella massiliensis / peruensis | Ezakiella massiliensis)

[0309] 0.21% Phylotypes (0.1): pt _ 00064 (Atopobium vaginae)

[0310] 0.21% Taxonomy (Species): Aerococcus christensenii

[0311] 0.21% Taxonomy (Genus): Atopobium

[0312] 0.21% Phylotypes (0.1): pt _ 00055 (Peptostreptococcus anaerobius)

0.21% CST (Valencia): III-A_sim

[0313] 0.21% Taxonomy (Species): Dialister

[0314] 0.21% Phylotypes (0.5): pt _ 00042 (Varibaculum anthropi | Varibaculum anthropi / cambriense | Varibaculum anthropi / massiliense | Varibaculum cambriense / timonense | Varibaculum timonense | Varibaculum vaginae | Varibaculum cambriense | Varibaculum timonense / vaginae)

[0315] 0.21% Taxonomy (Species): Gardnerella vaginalis

[0318] 0.21% Phylotypes (1): pt _ 00018 (Enterocloster asparagiformis / lavalensis |

[0319] Ruminococcus lactaris | Lactonifactor longoviformis | Dorea formicigenerans | Massilistercora timonensis | Roseburia intestinalis | [Ruminococcus] torques | Blautia wexlerae | Coprococcus_phoceensis / Tyzzerella_nexilis | Blautia massiliensis | Lachnoclostridium phocaeense | Eubacterium ventriosum | [Clostridium] scindens | [Eubacterium] rectale | Eubacterium ramulus | Oribacterium parvum | Anaerobutyricum soehngenii | Shuttleworthia satelles | Roseburia hominis | [Ruminococcus] gnavus | Dorea longicatena | Anaerobutyricum hallii | Mediterraneibacter massiliensis | Lachnoanaerobaculum saburreum | Coprococcus eutactus | Roseburia faecis | Oribacterium asaccharolyticum | Blautia luti / obeum | Eisenbergiella tayi | Enterocloster bolteae / clostridioformis | Blautia faecis | Anaerotignum lactatifermentans | Eisenbergiella massiliensis | Coprococcus comes | Blautia glucerasea | Roseburia inulinivorans | Fusicatenibacter saccharivorans | Lachnoanaerobaculum orale | Blautia hominis / massiliensis | Moryella_indoligenes / Fusobacterium_naviforme | [Clostridium] symbiosum | Faecalimonas umbilicata | Hungatella hathewayi | Lachnobacterium bovis / Eubacterium ramulus | Stomatobaculum longum | Blautia luti | Dorea phocaeensis | Enterocloster citroniae | Enterocloster aldensis | Frisingicoccus caecimuris | Moryella indoligenes | Sellimonas intestinalis | Lachnoanaerobaculum gingivalis | Blautia hominis | Blautia obeum | Kineothrix_alysoides / Eisenbergiella_massiliensis | Faecalicatena contorta | Anaerotignum faecicola | Ruminococcus faecis | Lacrimispora amygdalina / saccharolytica | Butyrivibrio hungatei | Lachnoanaerobaculum umeaense | Oribacterium sinus | Blautia obeum / wexlerae | Lachnospira eligens | Anaerostipes hadrus | Lachnoclostridium pacaense)

[0320] 0.21% Taxonomy (Genus): Pseudomonas

[0321] 0.21% Phylotypes (0.5): pt _ 00099 (Sutterella stercoricanis | Mesosutterella multiformis | Sutterella wadsworthensis | Parasutterella secunda)

[0324] 0.21% Phylotypes (0.5): pt _ 00075 (Acinetobacter bouvetii / johnsonii |

[0325] Acinetobacter haemolyticus | Acinetobacter schindleri | Acinetobacter lwoffii | Acinetobacter radioresistens | Alkanindiges hongkongensis | Acinetobacter modestus | Acinetobacter bereziniae / guillouiae | Acinetobacter baumannii / calcoaceticus | Acinetobacter baumannii | Acinetobacter ursingii | Acinetobacter johnsonii | Acinetobacter calcoaceticus | Acinetobacter bereziniae / guillouiae / junii | Acinetobacter pittii | Acinetobacter guillouiae | Prolinoborus_fasciculus / Acinetobacter_lwoffii | Acinetobacter parvus | Acinetobacter guillouiae / junii | Acinetobacter calcoaceticus / schindleri | Acinetobacter junii | Acinetobacter calcoaceticus / pittii | Acinetobacter harbinensis / lwoffii)

[0326] 0.21% Taxonomy (Family): Corynebacteriaceae

[0327] 0.21% Phylotypes (0.5): pt _ 00058 (Howardella spp)

[0328] 0.21% Phylotypes (0.5): pt _ 00089 (Lagierella massiliensis)

[0329] 0.21% Phylotypes (0.1): pt _ 00010 (Aerococcus christensenii)

[0330] 0.21% Phylotypes (0.5): pt _ 00064

[0331] (Schaalia_odontolytica / Actinomyces_pacaensis | Gleimia europaea | Schaalia turicensis | Schaalia odontolytica)

[0332] 0.21% Phylotypes (0.1): pt _ 00018 (Dialister propionicifaciens)

[0333] 0.21% Taxonomy (Genus): Achromobacter

[0334] 0.21% Phylotypes (0.1): pt _ 00060 (Prevotella corporis)

[0335] 0.21% Phylotypes (0.1): pt _ 00028 (Parvimonas micra)

[0336] 0.21% Phylotypes (0.1): pt _ 00044 (Corynebacterium aurimucosum)

[0337] 0.21% Phylotypes (0.1): pt _ 00011 (Prevotella amnii)

[0338] 0.21% Phylotypes (1): pt _ 00022 (Fastidiosipila sanguinis |

[0339] Saccharofermentans acetigenes)

[0340] 0.21% Taxonomy (Genus): Fusobacterium

[0341] 0.21% Phylotypes (0.1): pt _ 00031 (Anaerococcus prevotii | Anaerococcus tetradius | Anaerococcus prevotii / tetradius)

[0342] 0.20% CST (Valencia): IV-C3_sim

[0345] 0.20% Taxonomy (Family): Terrabacteria group

[0346] 0.20% Taxonomy (Genus): Mycoplasma

[0347] 0.20% Phylotypes (0.1): pt _ 00066 (Peptoniphilus coxii | Peptoniphilus coxii / pacaensis)

[0348] 0.20% Phylotypes (0.1): pt _ 00050 ([Bacteroides] coagulans)

[0349] 0.20% Taxonomy (Genus): Lachnospiraceae

[0350] 0.20% Taxonomy (Genus): Dermabacter

[0351] 0.20% Taxonomy (Genus): Anaerococcus

[0352] 0.20% Phylotypes (1): pt _ 00023 (Porphyromonas catoniae | Porphyromonas bennonis | Porphyromonas asaccharolytica / uenonis | Porphyromonas endodontalis | Porphyromonas asaccharolytica / bennonis / uenonis | Porphyromonas uenonis | Porphyromonas asaccharolytica)

[0353] 0.20% Taxonomy (Family): Atopobiaceae

[0354] 0.20% Phylotypes (0.1): pt _ 00080 (Enterococcus faecium | Enterococcus faecalis / faecium | Enterococcus faecalis)

[0355] 0.20% Phylotypes (0.5): pt _ 00066 (Murdochiella massiliensis | Murdochiella asaccharolytica | Levyella massiliensis | Ndongobacter massiliensis | Murdochiella vaginalis | Murdochiella_asaccharolytica / Levyella_massiliensis)

[0356] 0.20% Phylotypes (0.1): pt _ 00097 (Porphyromonas asaccharolytica / uenonis | Porphyromonas uenonis | Porphyromonas asaccharolytica)

[0357] 0.20% Phylotypes (0.1): pt _ 00035 (Peptoniphilus lacrimalis)

[0358] 0.20% Phylotypes (0.1): pt _ 00023 (Prevotella colorans)

[0359] 0.20% Taxonomy (Genus): Megasphaera

[0360] 0.20% Phylotypes (1): pt _ 00026 (Campylobacter hominis | Campylobacter rectus / showae | Campylobacter gracilis | Campylobacter ureolyticus | Campylobacter showae | Campylobacter concisus | Campylobacter gracilis / hominis | Campylobacter hominis / sputorum | Campylobacter canadensis | Campylobacter sputorum | Campylobacter canadensis / hominis)

[0363] 0.20% CST (Valencia): IV-C1_sim

[0364] 0.20% Taxonomy (Genus): Ureaplasma

[0365] 0.20% Taxonomy (Species): Prevotella timonensis

[0366] 0.20% Phylotypes (1): pt _ 00036 (Mycoplasma hominis)

[0367] 0.20% Taxonomy (Species): Peptoniphilus lacrimalis

[0368] 0.20% Phylotypes (1): pt _ 00048 (Cytophagaceae spp)

[0369] 0.20% Taxonomy (Genus): Coriobacteriia

[0370] 0.20% Phylotypes (0.1): pt _ 00024 (Anaerococcus mediterraneensis)

0.20% Taxonomy (Genus): Peptostreptococcus

[0371] 0.20% Taxonomy (Genus): Actinomyces

[0372] 0.20% Phylotypes (0.5): pt _ 00059 (Porphyromonas bennonis)

[0373] 0.20% Taxonomy (Family): Actinomycetaceae

[0374] 0.20% Phylotypes (0.1): pt _ 00358 (Lachnospiraceae spp)

[0375] 0.20% Phylotypes (0.1): pt _ 00046 (Corynebacterium thomssenii |

[0376] Corynebacterium sundsvallense | Corynebacterium sundsvallense / thomssenii)

[0377] 0.20% Phylotypes (0.5): pt _ 00544 (Actinomyces_urinae / Gleimia_europaea)

[0378] 0.20% Phylotypes (0.1): pt _ 00015 (Achromobacter insuavis / xylosoxidans |

[0379] Achromobacter xylosoxidans | Achromobacter spanius | Achromobacter spanius / xylosoxidans)

[0380] 0.20% Taxonomy (Genus): Neisseria

[0381] 0.20% Phylotypes (0.5): pt _ 00067 (Bacteroides cellulosilyticus | Bacteroides rodentium | Bacteroides finegoldii / thetaiotaomicron | Bacteroides acidifaciens | Bacteroides uniformis | Bacteroides caecimuris / rodentium | Bacteroides salyersiae | Bacteroides faecis | Bacteroides caccae | Bacteroides fragilis | Bacteroides intestinalis | Bacteroides graminisolvens | Bacteroides stercoris | Bacteroides ovatus | Bacteroides finegoldii | Bacteroides cellulosilyticus / intestinalis | Bacteroides eggerthii | Bacteroides thetaiotaomicron | Bacteroides xylanisolvens)

[0384] 0.20% Phylotypes (1): pt _ 00031 (Porphyromonas gingivalis | Parabacteroides merdae | Porphyromonas catoniae | Porphyromonas bennonis | Parabacteroides goldsteinii | Parabacteroides distasonis | Porphyromonas endodontalis | Porphyromonas asaccharolytica | Porphyromonas uenonis | Parabacteroides johnsonii | Porphyromonas somerae | Tannerella forsythia)

[0385] 0.20% Phylotypes (0.1): pt _ 00091 (Prevotella buccalis)

[0386] 0.20% Phylotypes (0.1): pt _ 00038 (Gardnerella vaginalis)

[0387] 0.20% Taxonomy (Genus): Corynebacterium

[0388] 0.20% Taxonomy (Genus): Bifidobacterium

[0389] 0.20% Phylotypes (0.1): pt _ 00073 (Atopobium vaginae)

[0390] 0.20% Taxonomy (Species): Lactobacillus iners

[0391] 0.20% Taxonomy (Family): Aerococcaceae

[0392] 0.20% Phylotypes (0.1): pt _ 00668 (Anaerococcus lactolyticus | Anaerococcus lactolyticus / mediterraneensis)

[0393] 0.20% Taxonomy (Genus): Parvimonas

[0394] 0.20% Taxonomy (Genus): Schaalia

[0395] 0.20% Taxonomy (Family): Bifidobacteriaceae

[0396] 0.20% Phylotypes (0.1): pt _ 00033 (Escherichia_coli / Shigella_dysenteriae |

[0397] Shigella flexneri | Kosakonia_cowanii / Salmonella_bongori | Shigella flexneri / sonnei | Escherichia coli | Escherichia fergusonii | Escherichia coli / fergusonii | Escherichia_coli / Shigella_sonnei | Escherichia_coli / Shigella_flexneri | Shigella dysenteriae / sonnei | Shigella sonnei)

[0398] 0.20% Taxonomy (Genus): Howardella

[0399] 0.20% Phylotypes (0.1): pt _ 00037 (Varibaculum anthropi | Varibaculum anthropi / cambriense | Varibaculum anthropi / massiliense | Varibaculum cambriense / timonense | Varibaculum cambriense)

[0400] 0.20% Taxonomy (Genus): Arcanobacterium

[0401] 0.20% Taxonomy (Genus): Winkia

[0404] 0.20% Taxonomy (Genus): Finegoldia

[0405] 0.20% Phylotypes (1): pt _ 00028 (Intestinibacillus massiliensis |

[0406] Pseudoflavonifractor capillosus | Agathobaculum butyriciproducens | Dysosmobacter welbionis | Mageeibacillus indolicus | Fastidiosipila sanguinis | Papillibacter cinnamivorans | Flavonifractor plautii | Intestinimonas timonensis | Intestinimonas butyriciproducens)

[0407] 0.20% Phylotypes (0.5): pt _ 00086 (Bacteroides coprocola | Bacteroides vulgatus | Bacteroides dorei | Bacteroides massiliensis | Bacteroides plebeius)

[0408] 0.20% Phylotypes (1): pt _ 00025 (Leptotrichia wadei | Sneathia amnii | Leptotrichia hongkongensis)

[0410] 0.20% CST (Valencia): IV-C0_sim

[0411] 0.20% Taxonomy (Family): Firmicutes

[0412] 0.20% Phylotypes (0.1): pt _ 00026 (Gemella asaccharolytica)

[0413] 0.20% Phylotypes (0.1): pt _ 04188 (Lactobacillus johnsonii)

[0414] 0.20% Taxonomy (Family): Staphylococcaceae

[0415] 0.20% Phylotypes (0.5): pt _ 00074 (Eremococcus coleocola | Abiotrophia defectiva | Globicatella sulfidifaciens | Facklamia ignava | Facklamia languida)

[0416] 0.20% Phylotypes (0.1): pt _ 00040 (Gardnerella vaginalis)

[0417] 0.20% CST (Valencia): I-A_sim

[0418] 0.20% Phylotypes (0.1): pt _ 00092 (Corynebacterium amycolatum / lactis | Corynebacterium amycolatum)

[0420] 0.20% Taxonomy (Species): Anaerococcus mediterraneensis

[0421] 0.20% CST (Valencia): IV-C2_sim

[0422] 0.20% Phylotypes (0.1): pt _ 00121 (Corynebacterium simulans / striatum | Corynebacterium simulans | Corynebacterium striatum)

[0424] 0.20% Phylotypes (0.1): pt _ 00078 (Anaerococcus vaginalis | Anaerococcus hydrogenalis)

[0425] 0.20% CST (Valencia): score

[0428] 0.20% Phylotypes (0.1): pt _ 00166 (Dermabacter hominis | Dermabacter hominis / jinjuensis | Dermabacter jinjuensis)

[0429] 0.20% Phylotypes (0.1): pt _ 00022 (Lactobacillus acidophilus | Lactobacillus gallinarum | Lactobacillus crispatus)

[0430] 0.20% Taxonomy (Genus): Peptococcus

[0431] 0.20% Phylotypes (0.1): pt _ 00225 (Dialister hominis / massiliensis)

[0432] 0.20% Phylotypes (0.1): pt _ 00077 (Pseudoglutamicibacter albus / cumminsii | Pseudoglutamicibacter albus | Pseudoglutamicibacter cumminsii)

[0433] 0.20% Phylotypes (0.1): pt _ 00075 (Lawsonella clevelandensis)

[0434] 0.20% Phylotypes (1): pt _ 00050 (Duncaniella spp)

[0435] 0.20% Phylotypes (0.1): pt _ 00198 (Bacteroides vulgatus)

[0436] 0.20% CST (Valencia): II_sim

[0437] 0.20% Phylotypes (0.1): pt _ 00049 (Bifidobacterium breve)

[0438] 0.20% Taxonomy (Family): Alcaligenaceae

[0439] 0.20% CST (Valencia): I-B_sim

[0440] 0.19% Taxonomy (Species): Pseudomonas veronii

[0441] 0.19% Phylotypes (0.1): pt _ 00054 (Corynebacterium tuscaniense | Corynebacterium genitalium)

[0443] 0.19% Phylotypes (1): pt _ 00039 ([Eubacterium] infirmum | Alterileibacterium massiliense | [Eubacterium] saphenum | [Eubacterium] nodatum | Mogibacterium timidum | [Eubacterium] sulci | Mogibacterium diversum / pumilum | Mogibacterium pumilum | [Eubacterium] brachy)

[0444] 0.19% Phylotypes (0.5): pt _ 00069 (Ruthenibacterium lactatiformans | Faecalibacterium prausnitzii)

[0445] 0.19% Phylotypes (0.1): pt _ 00059 (Campylobacter ureolyticus)

[0446] 0.19% Taxonomy (Genus): Sutterella

[0447] 0.19% Phylotypes (0.5): pt _ 00073 (Fastidiosipila sanguinis)

[0448] 0.19% Taxonomy (Family): Bacteria <bacteria>

[0451] 0.19% Phylotypes (0.1): pt _ 00089 (Corynebacterium pyruviciproducens | Corynebacterium glucuronolyticum)

[0453] 0.19% Phylotypes (0.1): pt _ 00052 (Varibaculum timonense | Winkia neuii)

[0454] 0.19% CST (Valencia): III-B_sim

[0455] 0.19% Taxonomy (Family): Peptoniphilaceae

[0456] 0.19% Taxonomy (Family): Peptostreptococcaceae

[0457] 0.19% Phylotypes (0.1): pt _ 00144 (Corynebacterium imitans | Corynebacterium hadale / imitans | Corynebacterium hadale)

[0459] 0.19% Taxonomy (Species): Atopobium vaginae

[0460] 0.19% Phylotypes (0.1): pt _ 00032 (Eggerthellaceae spp)

[0461] 0.19% Phylotypes (1): pt _ 00032 (Bacteroides cellulosilyticus | Bacteroides coprocola | Bacteroides coprophilus | Bacteroides rodentium | Bacteroides finegoldii / thetaiotaomicron | Bacteroides nordii | Bacteroides acidifaciens | Bacteroides dorei | Bacteroides uniformis | Bacteroides caecimuris / rodentium | Bacteroides salyersiae | Bacteroides faecis | Bacteroides caccae | Bacteroides fragilis | Bacteroides graminisolvens | Bacteroides intestinalis | Bacteroides vulgatus | Bacteroides massiliensis | Bacteroides plebeius | Bacteroides stercoris | Bacteroides ovatus | Bacteroides finegoldii | Bacteroides oleiciplenus / stercorirosoris | Bacteroides cellulosilyticus / intestinalis | Bacteroides eggerthii | Bacteroides thetaiotaomicron | Bacteroides xylanisolvens)

[0462] 0.19% Phylotypes (1): pt _ 00027 ([Eubacterium] siraeum | Faecalibacterium prausnitzii | Neglecta timonensis | Ruminococcus bicirculans | Negativibacillus massiliensis | Ruthenibacterium lactatiformans | Anaeromassilibacillus senegalensis | Anaerotruncus rubiinfantis | Pseudoruminoccoccus massiliensis | Ruminococcus champanellensis | [Clostridium] leptum | Ruminococcus callidus | Anaerotruncus colihominis | Ruminococcus bromii)

[0463] 0.19% Phylotypes (0.5): pt _ 00130 (Peptoniphilus catoniae | Peptoniphilus obesi)

[0464] 0.19% Taxonomy (Family): Bacillales Family XI. Incertae Sedis

[0467] 0.19% Phylotypes (0.1): pt _ 00104 (Peptoniphilus spp)

[0468] 0.19% Phylotypes (0.1): pt _ 00102 (Anaerococcus octavius)

[0469] 0.19% Phylotypes (0.5): pt _ 00077 (Porphyromonas bennonis | Porphyromonas somerae)

[0470] 0.19% CST (Valencia): IV-C4_sim

[0471] 0.19% Phylotypes (0.5): pt _ 00092 (Blautia luti | Blautia luti / obeum | Blautia faecis | Blautia wexlerae | Blautia massiliensis | Blautia obeum / wexlerae | Blautia glucerasea | Blautia obeum | Blautia hominis / massiliensis)

0.19% Taxonomy (Species): Ureaplasma

[0472] 0.19% Taxonomy (Family): Veillonellaceae

[0473] 0.19% Taxonomy (Species): Achromobacter

[0474] 0.19% Taxonomy (Genus): Gemella

[0475] 0.19% Phylotypes (0.1): pt _ 00143 (Peptoniphilus harei / lacydonensis | Peptoniphilus harei | Peptoniphilus lacydonensis)

[0477] 0.19% Phylotypes (0.1): pt _ 00061 (Corynebacterium coyleae)

[0478] 0.19% Taxonomy (Species): Peptoniphilus asaccharolyticus / harei

[0479] 0.19% Phylotypes (0.5): pt _ 00072 ([Eubacterium] infirmum | Alterileibacterium massiliense | [Eubacterium] sulci | Mogibacterium diversum / pumilum | Mogibacterium pumilum)

[0480] 0.19% Phylotypes (1): pt _ 00068 (Peptococcus niger | Peptococcus simiae)

[0481] 0.19% Phylotypes (0.1): pt _ 00082 (Corynebacterium senegalense | Corynebacterium mycetoides)

[0483] 0.19% Phylotypes (0.1): pt _ 00062 (Peptoniphilus duerdenii)

[0484] 0.19% Phylotypes (0.1): pt _ 00076 (Facklamia hominis)

[0485] 0.19% Phylotypes (0.1): pt _ 00074 (Porphyromonas bennonis)

[0486] 0.19% Taxonomy (Family): Pseudomonadaceae

[0487] 0.19% Phylotypes (0.1): pt _ 00279 (Coriobacteriia spp)

[0490] 0.19% Phylotypes (0.1): pt _ 00057 (Veillonella denticariosi / parvula | Veillonella parvula / tobetsuensis | Veillonella dispar / parvula | Veillonella parvula | Veillonella dispar / tobetsuensis | Veillonella dispar | Veillonella atypica / rogosae | Veillonella atypica / parvula / rogosae | Veillonella atypica / dispar | Veillonella atypica)

[0491] 0.19% Taxonomy (Family): Fusobacteriaceae

[0492] 0.19% Phylotypes (0.1): pt _ 00103 (Levyella massiliensis | Murdochiella massiliensis | Murdochiella asaccharolytica | Murdochiella_asaccharolytica / Levyella_massiliensis)

0.19% Phylotypes (0.1): pt _ 00186 (Sutterella stercoricanis)

[0493] 0.19% Taxonomy (Species): Anaerococcus prevotii / tetradius

[0494] 0.19% Phylotypes (0.1): pt _ 00085 (Brevibacterium ravenspurgense)

0.19% Phylotypes (0.1): pt _ 00079 (Streptococcus mitis | Streptococcus infantis | Streptococcus oralis | Streptococcus mitis / oralis | Streptococcus mitis / pneumoniae | Streptococcus pneumoniae | Streptococcus pseudopneumoniae | Streptococcus peroris / sanguinis | Streptococcus mitis / pseudopneumoniae)

[0495] 0.19% Taxonomy (Species): Prevotella colorans

[0496] 0.19% Phylotypes (1): pt _ 00042 (Monoglobus pectinilyticus)

[0497] 0.19% Taxonomy (Family): unclassified Clostridiales

[0498] 0.19% Taxonomy (Species): Lactobacillus gasseri / paragasseri

[0499] 0.19% Phylotypes (0.1): pt _ 00128 (Lagierella massiliensis)

[0500] 0.19% Phylotypes (0.1): pt _ 02028 (Howardella spp)

[0501] 0.19% Phylotypes (0.1): pt _ 00112 (Campylobacter hominis | Campylobacter gracilis / hominis)

[0502] 0.19% Phylotypes (0.1): pt _ 00260 (Prevotella bergensis)

[0503] 0.19% Taxonomy (Family): Neisseriaceae

[0504] 0.19% Taxonomy (Family): Enterobacteriaceae

[0505] 0.19% Phylotypes (0.1): pt _ 00109 (Urinicoccus massiliensis)

[0506] 0.19% Taxonomy (Species): Pseudomonas fluorescens / veronii

[0508] 0.19% Phylotypes (1): pt _ 00060 (Rarimicrobium hominis | Pyramidobacter piscolens | Jonquetella anthropi)

0.19% Phylotypes (0.1): pt _ 00087 (Candidatus Peptoniphilus massiliensis | Peptoniphilus_urinimassiliensis / Candidatus_Peptoniphilus_massiliensis | Peptoniphilus pacaensis)

0.19% Phylotypes (0.1): pt _ [illegible] (Fastidiosipila sanguinis)

0.19% Phylotypes (0.1): pt _ [illegible] (Actinomyces_urinae / Gleimia_europaea)

0.19% Phylotypes (0.1): pt _ 00117 (Corynebacterium jeikeium)

0.19% Phylotypes (0.1): pt _ [illegible] (Anaerococcus provencensis)

0.19% Taxonomy (Family): Porphyromonadaceae

0.19% CST (Valencia): IV-A_sim

0.19% Phylotypes (0.1): pt _ [illegible] (Peptoniphilus obesi)

0.19% Taxonomy (Family): Coriobacteriaceae

0.19% Phylotypes (0.1): pt _ [illegible] (Olegusella massiliensis)

0.19% Taxonomy (Family): Synergistaceae

0.19% Taxonomy (Family): Sutterellaceae

0.19% Phylotypes (0.1): pt _ 00123 (Casaltella spp)

0.19% Taxonomy (Genus): [illegible]

0.19% Taxonomy (Family): Flavobacteriaceae

0.19% Taxonomy (Family): Moraxellaceae

0.18% Taxonomy (Family): Erysipelotrichaceae

0.18% Taxonomy (Family): Campylobacteraceae

0.18% Taxonomy (Family): Enterococcaceae

0.18% Taxonomy (Family): Peptococcaceae

0.18% Taxonomy (Family): Clostridiales Family XIII. Incertae Sedis

0.18% Taxonomy (Genus): Urinicoccus

0.18% Taxonomy (Genus): Varibaculum

0.18% Taxonomy (Genus): unclassified Lachnospiraceae

[0517] 0.18% Taxonomy (Genus): Terrabacteria group

[0518] 0.18% Taxonomy (Genus): unclassified Bacteroidales

[0519] 0.18% Taxonomy (Genus): Casaltella

[0520] 0.18% Taxonomy (Genus): Bacteroidales

[0521] 0.18% Taxonomy (Genus): Firmicutes

[0522] 0.18% Taxonomy (Genus): Campylobacter

[0523] 0.18% Taxonomy (Genus): Eggerthellaceae

[0524] 0.18% Taxonomy (Genus): Blautia

[0525] 0.18% Taxonomy (Genus): Bacteroides

[0526] 0.18% Taxonomy (Genus): Ruminococcaceae

[0527] 0.18% Taxonomy (Genus): Bacteria <bacteria>

[0528] 0.18% Taxonomy (Genus): Actinomyces / Gleimia

[0529] 0.18% Taxonomy (Genus): Acinetobacter

[0530] 0.18% Taxonomy (Genus): Facklamia

[0531] 0.18% Taxonomy (Genus): Fastidiosipila

[0532] 0.18% Taxonomy (Genus): Ezakiella

[0533] 0.18% Taxonomy (Genus): Brevibacterium

[0534] 0.18% Taxonomy (Genus): Hungateiclostridiaceae

[0535] 0.18% Taxonomy (Genus): Olegusella

[0536] 0.18% Taxonomy (Genus): Murdochiella / Levyella

[0537] 0.18% Taxonomy (Genus): Lawsonella

[0538] 0.18% Taxonomy (Genus): Lagierella

[0539] 0.18% Taxonomy (Genus): Negativicoccus

[0540] 0.18% Taxonomy (Genus): Helcococcus

[0541] 0.18% Taxonomy (Genus): Porphyromonas

[0542] 0.18% Taxonomy (Genus): Pseudoglutamicibacter

[0543] 0.18% Taxonomy (Species): Peptoniphilus obesi

[0544] 0.18% Taxonomy (Species): Porphyromonas asaccharolytica

[0547] 0.18% Taxonomy (Species): Porphyromonas uenonis

[0548] 0.18% Taxonomy (Species): Porphyromonas

[0549] 0.18% Taxonomy (Species): Peptostreptococcus anaerobius

[0550] 0.18% Taxonomy (Species): Peptoniphilus pacaensis

[0551] 0.18% Taxonomy (Species): Porphyromonas bennonis

[0552] 0.18% Taxonomy (Species): Mycoplasma hominis

[0553] 0.18% Taxonomy (Family): Micrococcaceae

[0554] 0.18% Taxonomy (Species): Peptoniphilus harei / lacydonensis

[0555] 0.18% Taxonomy (Species): Olegusella massiliensis

[0556] 0.18% Taxonomy (Species): Peptoniphilus duerdenii

[0557] 0.18% Taxonomy (Species): Peptoniphilus coxii

[0558] 0.18% Taxonomy (Family): Rikenellaceae

[0559] 0.18% Taxonomy (Species): Prevotella

[0560] 0.18% Taxonomy (Species): Peptoniphilus

[0561] 0.18% Taxonomy (Species): Murdochiella_asaccharolytica / Levyella_massiliensis

[0563] 0.18% Taxonomy (Family): Ruminococcaceae

[0564] 0.18% Taxonomy (Species): Peptoniphilus grossensis / harei

[0565] 0.18% Taxonomy (Species): Prevotella corporis

[0566] 0.18% Taxonomy (Species): Prevotella amnii

[0567] 0.18% Taxonomy (Species): Prevotella bergensis

[0568] 0.18% Taxonomy (Species): Winkia neuii

[0569] 0.18% Taxonomy (Species): Varibaculum anthropi / cambriense

[0570] 0.18% Taxonomy (Species): Urinicoccus massiliensis

[0571] 0.18% Taxonomy (Family): Bacteroidaceae

[0572] 0.18% Taxonomy (Species): Terrabacteria group

[0573] 0.18% Taxonomy (Species): Sutterella

[0574] 0.18% Taxonomy (Species): Streptococcus mitis

[0577] 0.18% Taxonomy (Species): Streptococcus anginosus

[0578] 0.18% Taxonomy (Species): Staphylococcus lugdunensis

[0579] 0.18% Taxonomy (Species): Staphylococcus haemolyticus

[0580] 0.18% Taxonomy (Species): Staphylococcus epidermidis

[0581] 0.18% Taxonomy (Species): Sneathia sanguinegens

[0582] 0.18% Taxonomy (Species): Sneathia amnii

[0583] 0.18% Taxonomy (Species): Ruminococcaceae

[0584] 0.18% Taxonomy (Family): Bacteroidales

[0585] 0.18% Taxonomy (Family): Brevibacteriaceae

[0586] 0.18% Taxonomy (Species): Pseudoglutamicibacter albus / cumminsii

[0587] 0.18% Taxonomy (Family): Coriobacteriia

[0588] 0.18% Taxonomy (Species): Prevotella disiens

[0589] 0.18% Taxonomy (Family): unclassified Bacteroidales

[0590] 0.18% Taxonomy (Family): Dermabacteraceae

[0591] 0.18% CST (Valencia): CST

[0592] 0.18% Taxonomy (Family): Eggerthellaceae

[0593] 0.18% Taxonomy (Species): Mobiluncus curtisii

[0594] 0.18% CST (Valencia): subCST

[0595] 0.18% Taxonomy (Species): Lawsonella clevelandensis

[0596] 0.18% Taxonomy (Species): Bifidobacterium breve

[0597] 0.18% Taxonomy (Species): Corynebacterium jeikeium

[0598] 0.18% Taxonomy (Species): Corynebacterium coyleae

[0599] 0.18% Taxonomy (Species): Corynebacterium aurimucosum

[0600] 0.18% Taxonomy (Species): Corynebacterium amycolatum / lactis

[0601] 0.18% Taxonomy (Species): Coriobacteriia

[0602] 0.18% Taxonomy (Species): Casaltella

[0603] 0.18% Taxonomy (Species): Campylobacter ureolyticus

[0604] 0.18% Taxonomy (Species): Campylobacter hominis

[0607] 0.18% Taxonomy (Species): Brevibacterium ravenspurgense

[0608] 0.18% Taxonomy (Species): Bacteroides vulgatus

[0609] 0.18% Taxonomy (Species): Lagierella

[0610] 0.18% Taxonomy (Species): Bacteroidales

[0611] 0.18% Taxonomy (Species): Bacteria <bacteria>

[0612] 0.18% Taxonomy (Species): Anaerococcus vaginalis

[0613] 0.18% Taxonomy (Species): Anaerococcus provencensis

[0614] 0.18% Taxonomy (Species): Anaerococcus octavius

[0615] 0.18% Taxonomy (Species): Anaerococcus lactolyticus

[0616] 0.18% Taxonomy (Species): Anaerococcus hydrogenalis

[0617] 0.18% Taxonomy (Species): Anaerococcus

[0618] 0.18% Taxonomy (Species): Actinomyces_urinae / Gleimia_europaea

[0619] 0.18% Taxonomy (Species): Corynebacterium pyruviciproducens

[0620] 0.18% Taxonomy (Species): Corynebacterium senegalense

[0621] 0.18% Taxonomy (Species): Corynebacterium sundsvallense / thomssenii

[0622] 0.18% Taxonomy (Species): Corynebacterium tuscaniense

[0623] 0.18% Taxonomy (Species): Lactobacillus reuteri

[0624] 0.18% Taxonomy (Species): Lactobacillus johnsonii

[0625] 0.18% Taxonomy (Family): unclassified Corynebacteriales

[0626] 0.18% Taxonomy (Family): unclassified Tissierellia

[0627] 0.18% Taxonomy (Species): Lactobacillus coleohominis

[0628] 0.18% Taxonomy (Species): Hungateiclostridiaceae

[0629] 0.18% Taxonomy (Species): Howardella

[0630] 0.18% Taxonomy (Species): Helcococcus

[0631] 0.18% Taxonomy (Species): Gemella asaccharolytica

[0632] 0.18% Taxonomy (Species): Fusobacterium nucleatum

[0633] 0.18% Taxonomy (Species): Firmicutes

[0634] 0.18% Taxonomy (Species): Fenollaria massiliensis / timonensis

[0637] 466 0.18% Taxonomy (Species): Fenollaria

[0638] 467 0.18% Taxonomy (Species): Fastidiosipila sanguinis

[0639] 468 0.18% Taxonomy (Species): Facklamia hominis

[0640] 469 0.18% Taxonomy (Species): Ezakiella

[0641] 470 0.18% Taxonomy (Species): Enterococcus faecalis

[0642] 471 0.18% Taxonomy (Species): Eggerthellaceae

[0643] 472 0.18% Taxonomy (Species): Dialister propionicifaciens

[0644] 473 0.18% Taxonomy (Species): unclassified Bacteroidales
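By way of illustration only, entries of the form listed above can be read programmatically. The following Python sketch parses one printed line into its leading percentage, feature group, parenthetical qualifier, and name. The `Table1Entry` type, the `parse_entry` function, and the regular expression are hypothetical names introduced for this example and are not part of the disclosed workflow. The sketch assumes nothing about what the leading percentage represents beyond its printed value.

```python
import re
from typing import NamedTuple, Optional


class Table1Entry(NamedTuple):
    percent: float   # leading percentage exactly as printed
    group: str       # e.g. "Taxonomy", "Phylotypes", "CST"
    qualifier: str   # text in parentheses, e.g. "Genus", "0.1", "Valencia"
    name: str        # remainder of the entry after the colon


# Hypothetical pattern: optional "[0582]" paragraph marker, optional row
# index, then "<percent>% <group> (<qualifier>): <name>".
ENTRY_RE = re.compile(
    r"^\s*(?:\[\d+\]\s*)?(?:\d+\s+)?"
    r"(\d+(?:\.\d+)?)%\s+(\w+)\s+\(([^)]*)\):\s*(.*\S)\s*$"
)


def parse_entry(line: str) -> Optional[Table1Entry]:
    """Parse one listed feature line; return None if the line is not an entry."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    percent, group, qualifier, name = match.groups()
    return Table1Entry(float(percent), group, qualifier, name)


entry = parse_entry("[0582] 0.18% Taxonomy (Species): Sneathia amnii")
```

Lines that carry only a paragraph marker, or a footer rather than a feature, return `None` and can be skipped by the caller.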

[0646] According to some embodiments, the vaginal fluid sample was obtained by the subject from whom the vaginal fluid sample was obtained. According to some embodiments, the vaginal fluid sample was obtained at the subject’s home.

[0647] The methods of the present disclosure may further comprise providing the predicted likelihood of PTB or ePTB to the subject from whom the vaginal fluid sample was obtained. The predicted likelihood of PTB or ePTB may be provided to the subject in the form of a report. In some embodiments, the profile report is characterized as having an encoding selected from the group consisting of “.doc”; “.pdf”; “.xml”; “.html”; “.jpg”; “.aspx”; “.php”, and any combination thereof. Alternatively, or additionally, a hard copy (paper copy) of the report may be provided to the subject, e.g., delivered to the subject’s home.

[0648] In some instances, the methods of the present disclosure comprise assessing (e.g., by the subject’s healthcare provider) the quality of the predicted likelihood of PTB or ePTB, e.g., prior to any providing (e.g., via a report) of the predicted likelihood of PTB or ePTB to the subject.

[0649] In certain embodiments, the predicted likelihood of PTB or ePTB meets a threshold value, and the method further comprises administering one or more PTB or ePTB interventions to the subject and / or fetus based on the predicted likelihood. According to some embodiments, the one or more PTB or ePTB interventions comprise administering a corticosteroid to the subject, e.g., for fetal lung maturation. In some instances, the one or more PTB or ePTB interventions comprise administering magnesium sulfate to the subject, e.g., to reduce the risk of cerebral palsy. The one or more PTB or ePTB interventions may comprise one or more of those described in: Medley et al. (2018) Interventions during pregnancy to prevent preterm birth: an overview of Cochrane systematic reviews. Cochrane Database Syst Rev. 2018(11): CD012505; Wastnedge et al. (2021) J Glob Health. 11:04050; Shapiro-Mendoza et al. (2016) CDC Grand Rounds: Public Health Strategies to Prevent Preterm Birth. MMWR Morb Mortal Wkly Rep 2016;65:826-830; and Rundell et al. (2017) Am Fam Physician. 95(6):366-372; the disclosures of which are incorporated herein by reference in their entireties.
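By way of illustration only, the threshold comparison described above can be sketched in Python as follows. The function name `meets_threshold` and the example threshold of 0.5 are hypothetical and are not values taken from the present disclosure; "meets" is assumed here to mean "greater than or equal to".

```python
def meets_threshold(predicted_likelihood: float, threshold: float) -> bool:
    """Return True when the predicted likelihood is at or above the threshold.

    Both values are assumed to be probabilities in [0, 1]; the comparison
    direction (>=) is an assumption made for this sketch.
    """
    if not 0.0 <= predicted_likelihood <= 1.0:
        raise ValueError("predicted_likelihood must lie in [0, 1]")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return predicted_likelihood >= threshold


# Hypothetical values, for illustration only.
flagged = meets_threshold(predicted_likelihood=0.62, threshold=0.5)
```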

[0651] COMPUTER-READABLE MEDIA AND SYSTEMS

[0652] Aspects of the present disclosure further include non-transitory computer-readable media and systems.

[0653] In certain aspects, provided are one or more computer-readable media comprising instructions stored thereon, which when executed by one or more processors, cause the one or more processors to perform operations. According to some embodiments, the operations comprise one, two, three or each of: (a) harmonizing vaginal microbiome 16S rRNA gene sequencing data obtained by 16S rRNA gene sequencing on vaginal fluid sample DNA; (b) transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features; (c) inputting the vaginal microbiome features into a predictive model; and (d) using the predictive model, predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for the subject from whom the vaginal fluid sample was obtained.

[0654] In some embodiments, the instructions cause the one or more processors to harmonize the vaginal microbiome 16S rRNA gene sequencing data by phylogenetic placement of amplicon sequence variants (ASVs) onto a maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles. In certain embodiments, the instructions cause the one or more processors to transform the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features by transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into feature tables.

[0655] According to some embodiments, the vaginal microbiome features comprise one or more diversity measures, one or more community state types, one or more phylotypes, one or more taxons (e.g., at the family, genus, and / or species levels), or any combination thereof. For example, the features may comprise one or more diversity measures, one or more community state types, and one or more phylotypes, optionally wherein the features further comprise one or more taxons.
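By way of illustration only, feature groups of the kinds named above can be flattened into a single table of named features. In the following Python sketch, the function `name_features`, the "group: name" labelling convention (modelled on the entries of Table 1), and all numeric values are assumptions made for this example and are not the disclosed implementation.

```python
from typing import Dict


def name_features(groups: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Flatten {group: {name: value}} into {"group: name": value}."""
    named: Dict[str, float] = {}
    for group, table in groups.items():
        for name, value in table.items():
            key = f"{group}: {name}"
            if key in named:
                raise ValueError(f"duplicate feature name: {key}")
            named[key] = float(value)
    return named


# Hypothetical values for one specimen.
features = name_features({
    "Taxonomy (Genus)": {"Mobiluncus": 0.02},
    "CST (Valencia)": {"score": 0.40},
    "Phylotypes (0.1)": {"pt_00038": 0.15},
})
```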

[0656] According to some embodiments, the vaginal microbiome features comprise one or more of the features listed in FIG. 4B. In some instances, the features comprise two or more, 5 or more, 10 or more, 15 or more, or 20 or more of the features listed in FIG. 4B.

[0657] In certain embodiments, the vaginal microbiome features comprise one or more of the features listed in Table 1. In some instances, the vaginal microbiome features comprise two or more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or more, 50 or more, 60 or more, 70 or more, 80 or more, 90 or more, or 100 or more of the features listed in Table 1. According to some embodiments, the vaginal microbiome features comprise two or more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40 or more, or 50 or more of the first 103 features listed in Table 1. In certain embodiments, the vaginal microbiome features comprise 103 or fewer, 90 or fewer, 80 or fewer, 70 or fewer, 60 or fewer, 50 or fewer, or 40 or fewer of the first 103 features listed in Table 1. According to any of the embodiments of the methods of the present disclosure, the vaginal microbiome features may comprise Taxonomy (Genus): Mobiluncus.

[0658] In other aspects, provided are systems for predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for a subject. In some embodiments, such systems comprise one or more processors and one or more computer-readable media, wherein the one or more computer-readable media may be any of the one or more computer-readable media of the present disclosure, e.g., any of the one or more computer-readable media described in the preceding paragraphs. By virtue of comprising such computer-readable media, the systems are capable of performing one or more steps of any of the methods of the present disclosure.

[0659] A variety of processor-based systems may be employed to implement the embodiments of the present disclosure. Such systems may include system architecture wherein the components of the system are in electrical communication with each other using a bus. System architecture can include a processing unit (CPU or processor), as well as a cache, that are variously coupled to the system bus. The bus couples various system components, including system memory (e.g., read only memory (ROM) and random access memory (RAM)), to the processor.

[0660] System architecture can include a cache of high-speed memory connected directly with, in close proximity to, or integrated as part of the processor. System architecture can copy data from the memory and / or the storage device to the cache for quick access by the processor. In this way, the cache can provide a performance boost that avoids processor delays while waiting for data. These and other modules can control or be configured to control the processor to perform various actions. Other system memory may be available for use as well. Memory can include multiple different types of memory with different performance characteristics. Processor can include any general purpose processor and a hardware module or software module, such as first, second and third modules stored in the storage device, configured to control the processor as well as a special-purpose processor where software instructions are incorporated into the actual processor design. The processor may essentially be a completely self-contained computing system, containing multiple cores or processors, a bus, memory controller, cache, etc. A multicore processor may be symmetric or asymmetric.

[0662] To enable user interaction with the computing system architecture, an input device can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech and so forth. An output device can also be one or more of a number of output mechanisms. In some instances, multimodal systems can enable a user to provide multiple types of input to communicate with the computing system architecture. A communications interface can generally govern and manage the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed.

[0663] The storage device is typically a non-volatile memory and can be a hard disk or other types of computer-readable media which can store data that are accessible by a computer, such as magnetic cassettes, flash memory cards, solid state memory devices, digital versatile disks, cartridges, random access memories (RAMs), read only memory (ROM), and hybrids thereof.

[0664] The storage device can include software modules for controlling the processor. Other hardware or software modules are contemplated. The storage device can be connected to the system bus. In one aspect, a hardware module that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as the processor, bus, output device, and so forth, to carry out various functions of the disclosed technology.

[0665] Embodiments within the scope of the present disclosure may also include tangible and / or non-transitory computer-readable storage media or devices for carrying or having computer-executable instructions or data structures stored thereon. Such tangible computer-readable storage devices can be any available device that can be accessed by a general purpose or special purpose computer, including the functional design of any special purpose processor as described above. By way of example, and not limitation, such tangible computer-readable devices can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other device which can be used to carry or store desired program code in the form of computer-executable instructions, data structures, or processor chip design. When information or instructions are provided via a network or another communications connection (either hardwired, wireless, or combination thereof) to a computer, the computer properly views the connection as a computer-readable medium. Thus, any such connection is properly termed a computer-readable medium. Combinations of the above should also be included within the scope of the computer-readable storage devices.

[0667] Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, components, data structures, objects, and the functions inherent in the design of special-purpose processors, etc. that perform tasks or implement abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.

[0668] Other embodiments of the disclosure may be practiced in network computing environments with many types of computer system configurations, including personal computers, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

[0669] The following examples are offered by way of illustration and not by way of limitation.

[0670] EXPERIMENTAL

[0671] Example 1 - Data Aggregation and Processing

[0672] The training dataset was constructed by aggregating and processing vaginal microbiome data from the public domain, leveraging resources including dbGaP as well as the MOD Database for Preterm Birth Research. The final dataset included data from nine studies, representing 3,578 samples from 1,268 individuals. Of these patients, 851 delivered at term and 417 delivered preterm (before 37 weeks of gestation), including 170 whose deliveries were early preterm (before 32 weeks of gestation). While all of these studies focused on profiling the 16S rRNA gene, the primers targeting different variable regions of the 16S rRNA gene, the PCR conditions, and the sequencers all varied. Combining microbiome data from different studies, particularly those using different underlying techniques, is a challenging task that has hindered prior efforts at meta-analysis of microbiome data in a manner distinct from other forms of ‘omics data, like transcriptomics and genotyping. Illustrating that it is not biologically correct to combine technically diverse 16S-based microbiome studies at the raw sequence level, ordination based on pseudo-counts of denoised and then dereplicated amplicon sequence variants (ASVs) results in specimens clustering by technique, as expected when non-overlapping variable regions of the 16S rRNA gene are being amplified (Figure 1B, left). Thus, the first focus was on harmonizing the data from the nine studies that comprised the training set into a common set of features that were not reliant upon taxonomy, but instead based on phylogenetic placement of the ASVs onto a common de novo maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles, using a Nextflow-based workflow called MaLiAmPi. After processing with MaLiAmPi, most of the technique-based noise was overcome and the data were successfully harmonized into one cohesive set of compositional features, as evidenced by the integration of the studies after UMAP ordination, this time based on phylotype-counts (Figure 1B, right). Ultimately, the true relationship between specimens is unknown. After harmonization, however, it is reassuring that the specimens from each study now overlap and broadly represent the topography after UMAP ordination, as well as after tSNE and MDS ordination (data not shown). Additional dimensionality reduction plots demonstrate the successful integration of the data.

[0674] A similar challenge arises when comparing alpha-diversity (metrics of richness and/or unevenness in a microbial community) between studies, where the estimates can be affected by the total reads recovered per specimen as well as by the specific variable region of the 16S rRNA gene being targeted. To demonstrate this point, the Shannon alpha-diversity estimates based on denoised and dereplicated ASV counts were inconsistent in range between the studies (Figure 1C, top). Alpha diversity can be estimated after phylogenetic placement of ASVs, including the estimation of Shannon and inverse-Simpson alpha diversity via Chao numbers, but this cannot fully overcome all of the limitations of cross-study comparison of alpha diversity (Figure 1C, bottom). In particular, project F (pyrosequencing-based) resulted in an estimated Shannon diversity an order of magnitude higher than would typically be expected. Participating teams were presented with these post-harmonization results as well as raw ASV counts in case they wished to re-derive the alpha diversity metrics.

[0675] The separation between samples by outcome (term, preterm, and early preterm deliveries) is not clearly evident (Figure 2A and 2B). There are some distinct differences observed

[0676] with respect to community state types (CSTs) and outcome (Figure 2C). Leveraging different types of microbial features, including phylotype relative abundance, diversity measures, and CST membership, provides an opportunity to apply ML techniques to these data for PTB prediction.

[0677] To build an independent test set for evaluating the models submitted by participants in the DREAM challenge, an unpublished dataset from Wayne State University was included, consisting of 159 samples across 60 individuals, among whom 40 (66.7%) had term deliveries and 20 (33.3%) had preterm deliveries, including 5 (8.3%) who had early preterm deliveries (Table 1). Most patients in this test set had three longitudinal samples. A second validation dataset was generated that comprised 172 vaginal microbiome samples from 88 individuals, with up to three samples (one sample per trimester) for each individual; 48 individuals (54.5%) had term deliveries and 40 individuals (45.5%) had preterm deliveries, including 8 (9.1%) who had early preterm deliveries. DNA extraction, V4 16S rRNA gene library preparation, and 16S rRNA gene sequencing (2x150 paired-end sequencing on the Illumina NextSeq platform) of these samples were performed by the UCSF Benioff Center for Microbiome Medicine, with most samples yielding over 100,000 reads (see Methods for details). Validation datasets became available only after the training dataset was generated and distributed to teams. Thus, the resultant reads had to be integrated post hoc into the same feature set as the training data. Using MaLiAmPi, the training data were generated first, preserving the features (e.g., phylotypes, alpha diversity, etc.) (Figure 2A, 2A), and the validation datasets were then integrated into those features. The generalizability of these features across studies, including new study data, has allowed the application of the ML models to these independent validation sets, and enables the use of the models on data to be generated in the future.

[0678] Example 2 - DREAM Challenge Results

[0679] The Preterm Birth Microbiome Prediction DREAM Challenge featured two sub-challenges: sub-challenge 1, prediction of PTB (before 37 weeks of gestation), and sub-challenge 2, prediction of early PTB (before 32 weeks of gestation). The validation dataset for the second sub-challenge included only data from samples collected no later than 28 weeks of gestation (to reduce trivial predictions based upon later-in-gestation specimens being available from a pregnancy). A baseline random-forest-based model was developed by the organizers with the training data to provide participants with an example, including packaging of the model within a Docker container. Performance metrics that were used to evaluate the prediction models submitted by the teams include area under the receiver operating characteristic (AUROC) curve, area under the precision-recall (AUPR) curve, accuracy, sensitivity, specificity, and Matthews Correlation

[0680] Coefficient (MCC). All values were determined on bootstrapped validation data, with the mean bootstrapped value used to evaluate the model. The primary scoring metric was set at the outset to be AUROC, followed by AUPR to break ties.
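The scoring scheme described above (a metric computed on bootstrap resamples of the validation data, with the mean used for ranking) can be illustrated with a minimal sketch. The function names, the resampling-with-replacement scheme, the skipping of single-class resamples, and the 0.5 decision threshold for MCC are assumptions made for illustration; they are not taken from the Challenge scoring code.

```python
import math
import random


def auroc(labels, scores):
    """Rank-based AUROC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


def mcc(labels, scores, threshold=0.5):
    """Matthews correlation coefficient after thresholding the scores (assumed cutoff)."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def bootstrap_mean(metric, labels, scores, n_boot=1000, seed=0):
    """Mean of `metric` over bootstrap resamples drawn with replacement."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # assumption: skip resamples that contain a single class
        values.append(metric(ys, [scores[i] for i in idx]))
    return sum(values) / n_boot
```

For example, `bootstrap_mean(auroc, labels, scores)` yields a mean bootstrapped AUROC for one model's predictions, and swapping in `mcc` yields the corresponding thresholded metric.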

[0681] There were 318 participants from all over the world with 136 and 110 submissions for subchallenges 1 and 2, respectively. The prediction models with top-ranking submissions achieved mean bootstrapped AUROC scores of 0.688 and 0.868 respectively for the 2 sub-challenges (Figure 3A-3B). Several techniques were carried out in order to ensure the robustness of the resulting rankings including test set label inversion, bootstrapping, oversampling, and undersampling (see Methods).

[0682] A few patterns emerged in the best-performing predictive models for sub-challenge 1 and sub-challenge 2. Nearly all of the models used tree-based approaches (typically as implemented in the Python scikit-learn package), such as random forest and its relatives. A few models used regression approaches that included gestational age at sampling (with feature pruning and clustering), or neural networks. All of these modeling approaches are notable for their aggressive pruning or consolidation of features, which makes them well suited to handling data that are both sparse and high-dimensional. Therefore, avoiding overfitting to the training data was a shared and likely essential attribute of the best-performing models.

[0683] Predictive Features:

[0684] Microbiome features were identified that the best-performing models relied upon to make their predictions. The best-performing models were judged by a mean bootstrapped AUROC cutoff of 0.64 and 0.8 for sub-challenges 1 and 2, respectively, with one model per team and limited to models that could make a prediction in a tractable time, resulting in the evaluation of three models for sub-challenge 1 and eight models for sub-challenge 2. Feature permutation was used as a means of empirically identifying the feature tables, and in turn the specific features, that the models depended upon for their predictions, with an emphasis on identifying features used by multiple independently developed models. Teams were provided multiple feature tables but had no requirement to use any specific table for making predictions. These included a table of alpha-diversity metrics; composition via phylotypes at three distinct resolutions; composition via taxonomy at species, genus, or family level; and VALENCIA community state types. Feature permutation at the table level revealed broad use of alpha-diversity metrics (2 out of 3 and 7 out of 8 for sub-challenges 1 and 2, respectively), VALENCIA community state types (3 out of 3 and 7 out of 8), and composition via phylotypes (2 out of 3 and 8 out of 8). In contrast, composition via taxonomy was used by three of the 8 better-performing models for sub-challenge 2 (Figure 4A).
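Table-level feature permutation, as described above, can be sketched as follows: all columns belonging to one feature table are shuffled together across specimens, and the resulting drop in a performance metric indicates whether the model depends on that table. The function name, the toy model, the toy feature names, and the use of accuracy in place of AUROC are hypothetical choices made to keep the example short; they are not the procedure actually run on the submitted models.

```python
import random


def table_permutation_drop(predict, metric, rows, labels, tables, n_repeats=20, seed=0):
    """Mean drop in `metric` when each table's columns are shuffled jointly across rows.

    rows   : list of dicts mapping feature name -> value (one dict per specimen)
    tables : dict mapping table name -> list of feature names in that table
    """
    rng = random.Random(seed)
    baseline = metric(labels, [predict(r) for r in rows])
    drops = {}
    for table, columns in tables.items():
        total = 0.0
        for _ in range(n_repeats):
            donors = list(range(len(rows)))
            rng.shuffle(donors)
            permuted = []
            for row, j in zip(rows, donors):
                new_row = dict(row)
                for col in columns:
                    new_row[col] = rows[j][col]  # take this table's values from a donor row
                permuted.append(new_row)
            total += baseline - metric(labels, [predict(r) for r in permuted])
        drops[table] = total / n_repeats
    return drops


def accuracy(labels, preds):
    return sum(int(y == p) for y, p in zip(labels, preds)) / len(labels)


def toy_model(row):
    """Hypothetical model that only looks at Shannon diversity."""
    return int(row["shannon"] > 1.0)


# Toy data: eight specimens, two single-column feature tables.
values = [(0.2, 1), (0.4, 0), (0.5, 1), (0.7, 0), (2.1, 1), (2.4, 0), (2.8, 1), (3.0, 0)]
rows = [{"shannon": s, "cst_I": c} for s, c in values]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
tables = {"alpha_diversity": ["shannon"], "cst": ["cst_I"]}
drops = table_permutation_drop(toy_model, accuracy, rows, labels, tables)
```

In this toy setting, shuffling the table the model ignores leaves the metric unchanged, whereas shuffling the table it relies on degrades it; a table with a drop near zero is one the model does not appear to use.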

[0685] Specific features used by the top-performing predictive models in both sub-challenges were next identified via feature permutation, narrowing in on alpha-diversity and compositional features used consistently by the better-performing models. For compositional features (phylotypes or taxa), only individual features found in at least 10% of the specimens in the training set were considered further (to reduce the computational load). There was notable convergence between the features used by the models to make their predictions. For sub-challenge 1, the two models that made use of alpha-diversity made use of the same seven metrics. For the two models that made use of genus-level compositional data, 68 genera were used by both models. For sub-challenge 2, all seven models that used alpha-diversity had predictions that were sensitive to rooted phylogenetic diversity. Of the six models that made use of the phylotypes binned at 0.1 distance (of which 105 phylotypes were above 10% density), 32 of these phylotypes were used by all six models for their predictions (Figure 4B). As expected based on the parallel evaluation of phylotypes relative to taxonomy, a given species is frequently split among multiple phylotypes when binned at a distance of 0.1. Four Lactobacillus crispatus-like phylotypes, three Gardnerella vaginalis-like phylotypes, and two Prevotella timonensis-like phylotypes were used by all six models when making predictions. One model for sub-challenge 2 (USF biostat) was able to make quite accurate predictions while only making use of phylotypes binned at 0.5 distance, with prediction performance comparable to that of models making use of a much broader set of feature tables (Figure 4B).

[0686] Univariate correlation with PTB or ePTB was performed for features used by at least one of the better-performing sub-challenge 1 or sub-challenge 2 models, respectively. For each alpha-diversity metric and VALENCIA CST, generalized linear modeling was used to estimate the univariate correlation. For taxa and phylotypes, detected / not-detected status was chosen to account for these features being sparse (detected only in a small minority of microbiota). To account for repeated sampling, values were averaged per pregnancy within each trimester. Overall, as expected, the univariate analysis revealed complex and trimester-dependent relationships with PTB and ePTB that the ML models were able to overcome.

[0687] Sensitivity analysis on gestational age at sampling:

[0688] To ensure that the best-performing models were not overly reliant upon the gestational week of collection of specimens, a sensitivity analysis was performed by removing gestational age at sampling or permuting gestational age values (Table 2). Model performance was only modestly affected by removing model access to the gestational age at collection, indicating that the predictions were primarily based on other attributes.

[0689] Post-challenge ensemble models:

[0690] Several ensemble models were created by combining the results of (a) the winning teams, (b) the teams with Bayes factor < 20 (Tables 3 and 4), and (c) all the participants across the two sub-challenges (Figure 5). The underlying models (as noted above) make use of many of the same microbiome features. Still, the ensemble models were evaluated against the two validation studies unavailable to the model developers to avoid artificially improved scores due to overfitting. An improvement in performance was observed across the board, with the ensemble models of teams with Bayes factor < 20 performing the best (AUROC 0.74 and AUROC 0.91 for sub-challenges 1 and 2, respectively).

[0691] Method Details

[0692] Resource Availability

[0693] Data and code availability

[0694] • All tokenized and harmonized training and validation data used for this study, including paired metadata and outcomes data, are available at pretermdb.org under accession SDY2187.

[0695] • Sequence data and associated metadata for Study SDY465 were downloaded from ImmPort via the March of Dimes Preterm Birth database. Sequence data and associated metadata for BioProjects PRJNA242473, PRJNA294119, PRJNA393472, and PRJNA430482 were downloaded from the NCBI Sequence Read Archive. Additional associated metadata for PRJNA430482 were requested through and obtained from the RAMS Registry (ramsregistry(.)vcu(.)edu).

[0696] • Sequence data and associated metadata for Projects PRJEB11895, PRJEB12577, PRJEB21325, and PRJEB30642 were downloaded from the Sequence Read Archive of the European Nucleotide Archive, with associated metadata for PRJEB11895 and PRJEB12577 downloaded from Additional Files 4 and 6 of the paper by Kindinger et al. Additional associated metadata for Projects PRJEB11895, PRJEB12577, PRJEB21325, and PRJEB30642 were requested from the senior author.

[0697] • Sequence data and associated metadata for accession number phs001739.v1.p1 were downloaded from the database of Genotypes and Phenotypes (dbGaP).

[0698] • The training dataset representing 7 of the 9 aggregated studies and the validation dataset for our Challenge are available under Study ID SDY2187 from the MOD Preterm Birth Research Database (pretermbirthdb(.)org / mod / studydata). Two of the nine training datasets

[0699] (PRJNA430482 and phs001739.v1.p1) are exclusively available via dbGaP after following the application procedures there.

[0700] • The processed dataset is also available as an R Shiny visualization application, VMAP (Vaginal Microbiome in Pregnancy), at vmapapp(.)org.

[0701] • The code for the microbiome data harmonization tool, MaLiAmPi, is available at github(.)com / jgolob / maliampi.

[0702] • DREAM challenge participants’ code for sub-challenge 1 and sub-challenge 2 is in their docker submissions.

[0703] Validation Data Generation

[0704] Wayne State University - Study design, sample collection

[0705] The microbiome dataset from Wayne State University School of Medicine included in the challenge was a randomly selected subset of 20 cases and 40 controls from a larger retrospective longitudinal case-control study described in detail elsewhere (www(.)researchsquare(.)com / article / rs-2359402 / v1)64. The 20 spontaneous PTB cases included both spontaneous preterm labor with intact membranes (PTL) and preterm prelabor rupture of membranes (PPROM) resulting in delivery at 20 to 36+6 weeks. Cases had 3 or 4 longitudinal samples collected from 10 to 36 weeks of gestation, which were matched with samples from controls (2 to 4 samples per patient). Term controls were defined as women who delivered between 38 and 42 weeks of gestation without congenital anomalies or obstetrical, medical, or surgical complications. Samples of vaginal fluid were collected using a Dacron swab (Medical Packaging Corp., Camarillo, CA). Vaginal swabs were stored at -80°C until the time of DNA extraction, following established standard operating procedures. The study was conducted at the Perinatology Research Branch, an intramural program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, U.S. Department of Health and Human Services, Wayne State University (Detroit, MI), and the Detroit Medical Center (Detroit, MI). The collection of samples was approved by the Institutional Review Boards of the National Institute of Child Health and Human Development and Wayne State University (#110605MP2F(RCR)). All participating women provided written informed consent prior to sample collection.

[0706] DNA extraction from vaginal swabs

[0707] Genomic DNA was extracted from vaginal swabs using a Qiagen MagAttract PowerMicrobiome DNA / RNA EP extraction kit (Qiagen, Germantown, MD), with minor

[0708] modifications to the manufacturer’s protocols as described in (www(.)researchsquare(.)com / article / rs-2359402 / v1). The purified DNA was transferred to the provided 96-well microplates and stored at -20°C.

[0709] 16S rRNA gene sequencing and processing

[0710] The V4 region of the 16S rRNA gene was amplified from vaginal swab and control DNA extracts and sequenced at Michigan State University’s Research Technology Support Facility (rtsf(.)natsci(.)msu.edu / ) using the dual indexing sequencing strategy developed by Kozich et al. The forward primer was 515F: 5’-GTGCCAGCMGCCGCGGTAA-3’ and the reverse primer was 806R: 5’-GGACTACHVGGGTWTCTAAT-3’.

[0711] Stanford University - Study design, sample collection

[0712] The Stanford University microbiome dataset included in the challenge consisted of 40 cases and 48 controls from a repository of specimens from women enrolled in a longitudinal study conducted by the March of Dimes Prematurity Research Center at Stanford University. Samples of vaginal fluid were collected using a 2x Sterile Catch-All™ Sample Collection Swab (Epicentre Biotechnologies #QEC091H, Madison, WI). Vaginal swabs were placed into tubes and then immediately placed on ice or in a household freezer (-20°C). After samples arrived at the March of Dimes Prematurity Center, they were immediately placed on dry ice, inventoried, and then stored at -80°C at the Stevenson Laboratory until the time of DNA extraction. The study was conducted at Stanford Hospital and Clinics. The collection of samples was approved by the Institutional Review Board of Stanford University (Study number 21956). All participating women provided written informed consent prior to sample collection.

[0713] Vaginal swab DNA extraction and 16S rRNA sequencing

[0714] Genomic DNA extraction and microbial sequencing were performed at the Microbial Genomics CoLab Plug-in Facility within the Benioff Center for Microbiome Medicine at University of California, San Francisco. First, vaginal swabs were aseptically transferred to 2 mL tubes prefilled with 300 µL sterile molecular-grade water. Vaginal samples were vortexed with the swab remaining in the tube. 200 µL of vaginal suspension was withdrawn from the tube for downstream processing using the QIAamp BiOstic DNA Kit (QIAGEN, Hilden, Germany). DNA from all samples and several extraction blanks was extracted according to the manufacturer's protocol and eluted in 50 µL EB buffer. DNA concentrations were quantified using the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific, MA), diluted to 5 ng/µL, and stored at -20°C.

[0715] The V4 hypervariable region of the 16S rRNA gene was amplified using 515F and 806R primers with PCR conditions previously described. Amplicon reactions were quantified using the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific, MA), and pooled at equimolar concentrations. The pooled library was cleaned and concentrated using the Agencourt AMPure XP beads (Beckman-Coulter), quality checked with the Bioanalyzer DNA 1000 Kit (Agilent, Santa Clara, CA), quantified using the KAPA Library Quantification Kit (KAPA Biosystems), and diluted to 2 nM. Library was denatured according to manufacturer’s protocol and spiked in with 40% PhiX control prior to loading onto the NextSeq 550 platform (Illumina, San Diego, CA) for 2 x 150bp sequencing.

[0716] Training Data Acquisition and Processing

[0717] The following vaginal microbiome studies were identified by leveraging the March of Dimes Preterm Birth database, the NCBI Sequence Read Archive, the European Nucleotide Archive, and the database of Genotypes and Phenotypes (dbGaP). Sequence data and associated metadata for the DiGiulio et al. cohort were downloaded from ImmPort, under Study SDY465, in May 2016. Sequence data and associated metadata for the Romero et al. cohort were downloaded from the NCBI Sequence Read Archive under BioProject PRJNA242473 in May 2016. Sequence data and associated metadata for the Callahan et al. cohort were downloaded from the NCBI Sequence Read Archive under BioProject PRJNA393472 in January 2018. Sequence data and associated metadata for the Stout et al. cohort were downloaded from the NCBI Sequence Read Archive under BioProject PRJNA294119 in January 2018. Sequence data for the Kindinger et al. cohort were downloaded from the Sequence Read Archive of the European Nucleotide Archive under Projects PRJEB11895 and PRJEB12577 in June 2020, and associated metadata were downloaded from Additional Files 4 and 6 of the paper, with some additional metadata requested from the senior author. Sequence data and associated metadata for the Brown et al. (2018) cohort were downloaded from the Sequence Read Archive of the European Nucleotide Archive under Project PRJEB21325 in June 2020, with some additional metadata requested from the senior author. Sequence data and associated metadata for the Brown et al. (2019) cohort were downloaded from the Sequence Read Archive of the European Nucleotide Archive under Project PRJEB30642 in June 2020, with some additional metadata requested from the senior author. Sequence data and associated metadata for the Elovitz et al. cohort were downloaded from the database of Genotypes and Phenotypes (dbGaP) under accession number phs001739.v1.p1 in September 2021. Sequence data and associated metadata for the Fettweis et al. cohort were downloaded from the NCBI Sequence Read Archive under BioProject ID

[0718] PRJNA430482 in January 2022, and associated metadata were requested through and obtained from the RAMS Registry (ramsregistry(.)vcu.edu).

[0719] Data Processing and Harmonization

[0720] MaLiAmPi was applied to both training and test data to process and aggregate the datasets. In brief, MaLiAmPi uses DADA2 to assemble each project’s raw reads into amplicon sequence variants (ASVs). These ASVs are used to recruit full-length 16S rRNA gene alleles from a repository of cached 16S rRNA alleles derived from the NCBI NT database. The objective is to recruit ten full-length 16S rRNA alleles for each ASV with equal sequence identity to the ASV (e.g., bounded best hits), with most ASVs recruiting multiple 16S rRNA alleles with 100% sequence identity for the region of the ASV. These recruits are assembled into a de novo maximum-likelihood phylogeny with RAxML, and the ASVs are placed onto this common phylogenetic tree with EPA-ng. Finally, these placements are used to determine the alpha diversity of communities (diversity measures include Shannon, inverse Simpson, balance-weighted phylogenetic diversity (bwpd), phylogenetic entropy, quadratic, unrooted phylogenetic diversity, and rooted phylogenetic diversity) via the guppy utility in the pplacer package, to determine the phylogenetic (KR) distance between communities, to provide taxonomic assignments (via the guppy ‘hybrid 2’ classifier) to each ASV, and to cluster ASVs into phylotypes (based on phylogenetic distance between ASVs). Sequence variant counts were also determined. In addition, VALENCIA was used to provide the community state type (CST) of each sample, and alluvial plots were made using the ggalluvial R package in order to visualize CST composition by trimester. UMAP representations of the data and violin plots of Shannon alpha diversity before and after processing of the data with MaLiAmPi were visualized to gauge data harmonization. The Python seaborn visualization package was used extensively for figure preparation.
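Two of the alpha-diversity measures named above, Shannon and inverse Simpson, have simple closed forms, $H = -\sum_i p_i \ln p_i$ and $1 / \sum_i p_i^2$, where $p_i$ is the relative abundance of feature $i$ in a specimen. The sketch below computes them directly from a vector of per-feature counts. This is a simplification for illustration only: in the workflow described above these values are derived from phylogenetic placements via guppy, and the function names here are hypothetical.

```python
import math


def relative_abundance(counts):
    """Convert per-feature counts for one specimen into proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("specimen has no reads")
    return [c / total for c in counts]


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p), skipping features with zero count."""
    return -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)


def inverse_simpson(counts):
    """Inverse Simpson diversity 1 / sum(p^2)."""
    return 1.0 / sum(p * p for p in relative_abundance(counts))
```

A specimen dominated by a single feature gives values near 0 (Shannon) and 1 (inverse Simpson), while an even community of k features gives ln k and k, respectively.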

[0721] DREAM Challenge

[0722] Overall Challenge structure.

[0723] The overview of the Challenge is shown in Figure 1. All Challenge elements were supported by the Synapse platform (www(.)synapse.org), including documentation, access to the data, submission of models, leaderboards, and the discussion forum. To gain access to the data, teams were required to comply with a data use agreement, which restricted use of the data outside the Challenge and provided guidelines on ethical participation in the Challenge. Teams were provided the training data, built their models, dockerized their environments, and submitted their models through the Synapse platform. Models were run on the test data and performance metrics were returned to the teams. Teams were limited to 5 total submissions, with the top

[0724] performing model selected as the final submission to be scored and ranked. Leaderboards were provided throughout the open phase of the Challenge, which provided teams with real-time feedback and comparative performance rankings. After the close of the Challenge, models were evaluated for completeness and reproducibility. For teams to be included in the Preterm Birth DREAM Community, they were required to make their code public, provide a method write-up, and participate in a post-challenge survey to collect information on method development and features of the data important to the model.

[0725] Participant engagement.

[0726] Information about our challenge was shared through the DREAM Challenges website (dreamchallenges(.)org). Challenge organizers also shared information about the challenge through listservs such as the ML-news Google News Group and social media outlets including Facebook, LinkedIn, Reddit, and Twitter.

[0727] In order to preserve model environments for portability of models, we required participants to submit Docker environments. These environments contain the necessary programming dependencies and models for each sub-challenge that can run on a processed and prepared microbiome dataset folder arranged in a standardized format. The organizers prepared an example Docker container for participants to utilize as a starting template and held occasional seminars to describe the data and answer questions from participants. Organizers also engaged with participants through the forums to help answer questions throughout the challenge.

[0728] Sub-challenge 1 - Top performing teams:

[0729] Team UWisc-Madison

[0730] For predicting PTB, a LightGBM-based pipeline was built using an ensemble strategy tailored for vaginal microbiome data collected from multiple projects. The model was developed using specimens collected no later than 32 weeks of gestation and included five types of features: counts of taxa at different taxonomic levels, counts of phylotypes, microbiome community states, alpha diversity metrics, and metadata (age, collection week, and race). In particular, the counts of taxa at the family, genus, and species levels, the counts of phylotypes defined at phylogenetic distances of 0.5 and 1, and the alpha diversity metrics including Shannon index, inverse Simpson index, phylogenetic entropy, balance-weighted phylogenetic diversity, and rooted / unrooted / quadratic phylogenetic diversity were used. To obtain scale-invariant values, the centered log-ratio (CLR) transformation was applied to each type of microbiome count data. Rare microbial features with fewer than 5 nonzero counts in any of the studies of the training set were removed. The LightGBM model was chosen as the prediction model due to its well-known

[0731] efficiency. Each specimen was one training sample and each training sample had a total of 1,991 features. Five-fold cross-validation at the subject level was used to tune hyperparameters. Because Project G had a very different sequencing depth profile (the average sequencing depth of Project G is 185,010, whereas the value is below 50,000 for other projects), two prediction models were built: one was trained using specimens from all projects (Model 1) and one was trained using only specimens from Project G (Model 2). When making a prediction for a given specimen, the ensembling weights of Model 1 and Model 2 were generated by a logistic regression model with sequencing depth and collection week as features. As one subject is likely to have multiple vaginal microbiome specimens, a customized weighting method was designed to aggregate predictions from multiple specimens from one subject. If a subject has multiple specimens, then the weight of each specimen equals the collection week of the specimen divided by the sum of the collection weeks of all specimens from the subject. In other words, the closer a sample was to delivery, the more impact it had on the final prediction. The pipeline achieved an AUROC of 0.69 and an AUPRC of 0.58 when tested on the validation dataset for sub-challenge 1.
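Two steps of this pipeline are specified closely enough to sketch: the centered log-ratio transformation of count data and the collection-week weighting used to combine per-specimen predictions into one per-subject prediction. The function names are hypothetical, and the pseudocount added before taking logarithms is an assumption, since the text does not state how zero counts were handled.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one specimen's counts.

    The pseudocount (an assumption) keeps zero counts finite under the log.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def aggregate_by_collection_week(predictions, weeks):
    """Per-subject prediction: each specimen weighted by week / sum(weeks)."""
    if len(predictions) != len(weeks):
        raise ValueError("one collection week is needed per prediction")
    total = sum(weeks)
    if total <= 0:
        raise ValueError("collection weeks must sum to a positive value")
    return sum(p * w for p, w in zip(predictions, weeks)) / total
```

For a subject with specimens at weeks 10 and 30 and per-specimen predictions of 0.2 and 0.8, the weights are 0.25 and 0.75, so the aggregated prediction is 0.65 rather than the unweighted mean of 0.5.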

[0732] Team AI4knowledgeLAB

[0733] To predict the risk of PTB, a workflow based on an ensemble of random forest models with oversampling of the minority class was used. For the implementation of the model, both metadata and characteristic data of the vaginal microbiome were used. Concerning metadata, information on race and ethnicity and the gestational week when the sample was collected were included in the analysis. Microbiome data included relative abundances of clusters of variants measured at three different phylogenetic distances (0.1, 0.5, 1), alpha-diversity metrics, and “VALENCIA Community State Types” (CST).

[0734] The first step was to eliminate samples collected after the 32nd week of gestation. A model was then built that takes three different matrices as input, one for each phylogenetic distance, to create three independent models that output three different predictions for the same individual, which are then combined using an ensemble strategy. The input matrices had 9743, 3651, and 1871 features; each matrix of phylotype relative abundances was supplemented with features related to alpha-diversity (7), CST (11), and demographics (8).

[0735] To make the dataset more balanced, a data augmentation algorithm, SMOTE (Synthetic Minority Over-sampling Technique), was adopted. As a classification algorithm, random forest was chosen, using the default parameters of the scikit-learn Python package, due to its efficiency

[0736] in handling datasets with a high number of features. The final output was obtained as the average of the three probability values, and the associated class was obtained from the probability value by imposing the classic threshold of 0.5. The prediction model achieved an AUROC of 0.64 and an AUPRC of 0.48 on the DREAM Challenge validation dataset.
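The final combination step described above (averaging the three per-distance probabilities and thresholding at 0.5) can be written out as a short sketch. The function name and the dictionary keyed by phylogenetic distance are hypothetical, and treating a mean of exactly 0.5 as the positive class is an assumption.

```python
def combine_distance_models(prob_by_distance, threshold=0.5):
    """Average per-distance PTB probabilities and assign a class label.

    prob_by_distance: dict mapping phylogenetic distance -> predicted probability
    Returns (mean probability, label), where label 1 means predicted preterm.
    """
    if not prob_by_distance:
        raise ValueError("at least one model probability is required")
    mean_prob = sum(prob_by_distance.values()) / len(prob_by_distance)
    return mean_prob, int(mean_prob >= threshold)  # assumption: ties go to class 1
```

For example, probabilities of 0.7, 0.4, and 0.6 from the models at distances 0.1, 0.5, and 1 average to about 0.57, which is labeled preterm under the 0.5 threshold.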

[0737] Sub-challenge 2 - Top performing Teams:

[0738] Team Techtmann Lab

[0739] To predict early PTB, a basic random forest classifier was employed using python’s Scikit-learn package. Training data included relative abundances clustered phylogenetically at a distance of 0.1, race of the patient, VALENCIA community state types, diversity metrics, and collection week. This model used default Scikit-learn parameters and involved no additional feature selection or hyperparameter tuning. When tested on the competition validation dataset, the model reported an AUROC of 0.87 and an AUPRC of 0.45.

[0740] When investigating feature importance, diversity metrics, race, community state type, sample collection week, and some phylotypes were found to be the most important features in the model’s decision-making. Specifically, the relative abundances of five phylotypes were identified as important for predicting early PTB: Lactobacillus jensenii, Lactobacillus iners, Lactobacillus crispatus, Prevotella bivia, and Ureaplasma urealyticum. This approach is hypothesized to have resulted in a model that was not over-tuned to the training data, allowing it to generalize well to the competition validation dataset.

[0741] Team KBJ

[0742] In the approach of team KBJ for sub-challenge 2, several processes were applied to improve the model prediction performance. First, samples were filtered by the same collection-week conditions as the test dataset, and all corresponding features were aggregated. Here, one feature type was selected among several for taxonomy and for phylotypes: genus level and 0.1 phylogenetic distance, respectively. Race information was also considered, while pairwise distance was excluded. Next, significant features were selected using minimum redundancy maximum relevance, which considers the mutual information of features with respect to the response variable (i.e., early preterm versus non-preterm). The feature selection was conducted for phylotypes, sequence variants, and taxonomy, whose dimensions are relatively large compared to the data size. Then, an ensemble model was constructed from the five algorithms (Linear Support Vector Classification, Support Vector Classification, Quadratic Discriminant Analysis, Calibrated Classifier, and Passive Aggressive Classifier) that individually performed the best in cross-validation. All compared models were tested with default parameters using the Lazy Predict and scikit-learn

[0743] Python packages. The prediction model constructed by team KBJ achieved an AUROC of 0.841 and an AUPRC of 0.270 on the DREAM Challenge validation dataset. Specifically, the model showed good balanced accuracy (sensitivity: 0.77; specificity: 0.79).

[0744] Quantification and Statistical Analysis

[0745] Performance metrics used to evaluate the teams included the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR, also written AUPRC herein). On the held-out external validation dataset, accuracy, sensitivity, and specificity were also computed. These metrics were shown on the final public rankings.
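As an illustration of these metrics, the following is a minimal pure-Python sketch. It assumes binary labels (1 = preterm, 0 = term) and real-valued model scores; the AUPR is computed as average precision, which is one common estimator; and the 0.5 decision threshold is an assumption made only for this example.

```python
def auroc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve; requires at least one positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
roc = auroc(labels, scores)
ap = average_precision(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
```

Because AUPR depends on class prevalence while AUROC does not, the two can diverge sharply on imbalanced data such as early preterm birth.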

[0746] The reproducibility of the models, including the baseline, was determined by calculating the Bayes factor over 1000 bootstrapped iterations on a random sampling of the data. For each sub-challenge, the best-performing models from each team were rerun to obtain scores on the random sampling. These scores were then used to calculate the Bayes factor, using the computeBayesFactor function from the challengescoring R package, comparing them to the top-performing model as well as the baseline model.
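The idea of a bootstrapped Bayes factor can be sketched as follows. This is a simplified illustration and not a port of the computeBayesFactor function: it assumes the Bayes factor is taken as the ratio of bootstrap iterations in which a reference model scores at least as well as a comparison model to iterations in which it scores worse, with both models evaluated on the same resampled participants. All names are hypothetical.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC; ties between a positive and a negative count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_bayes_factor(labels, ref_scores, other_scores, n_boot=1000, seed=0):
    """Ratio of bootstrap wins of the reference model over the comparison model."""
    rng = random.Random(seed)
    n = len(labels)
    ref_wins = other_wins = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUROC is undefined when only one class was drawn
        ref = auroc(y, [ref_scores[i] for i in idx])
        other = auroc(y, [other_scores[i] for i in idx])
        if ref >= other:
            ref_wins += 1
        else:
            other_wins += 1
    return float("inf") if other_wins == 0 else ref_wins / other_wins


# Toy data (hypothetical): one model ranks perfectly, the other ranks in reverse.
labels = [0] * 5 + [1] * 5
perfect = [0.0, 0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 1.0]
reverse = [1.0 - p for p in perfect]
bf_strong = bootstrap_bayes_factor(labels, perfect, reverse)
bf_weak = bootstrap_bayes_factor(labels, reverse, perfect)
```

Under this convention a large value indicates that the reference model is consistently better across resamples, while a value near 1 indicates that the two models are not distinguishable.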

[0747] To increase the certainty of DREAM Challenge participants’ rankings whose models’ performances could have been affected by prediction threshold and class imbalance in our validation dataset, we employed the following strategies to validate participants' models for both sub-challenges on the external dataset: inverting labels, bootstrapped random subsampling, bootstrapped under-sampling, and bootstrapped over-sampling.

[0748] Inverted labels: Invert the class labels for the external dataset and the prediction model outputs (i.e., classifying preterm or early preterm births as term births, and vice versa), and compute AUROC / AUPR curves.

[0749] Bootstrapped random subsampling: Randomly sample a subset of 100 from the 152 participants of the external dataset, and run the prediction models on the validation data subset, bootstrapped 1000 times.

[0750] Bootstrapped undersampling: Undersample the external dataset (n = 152) to balance the minority (Preterm, n = 63; Early preterm, n = 13) and majority (i.e., Term, n = 89) classes by randomly sampling from the minority and the majority groups to have the same number in each group (n = 50 for Preterm and n = 50 for Term in sub-challenge 1, and n = 13 for Early Preterm and n = 13 for Term in sub-challenge 2), and then compute AUROC / AUPRC on the undersampled external validation dataset, bootstrapped 1000 times.

[0751] Bootstrapped oversampling: Oversample the external dataset to balance the preterm or early preterm and term classes by randomly sampling per group (n = 200 for Preterm and n = 200 for Term in sub-challenge 1, and n = 200 for Early Preterm and n = 200 for Term in sub-challenge 2), and then compute AUROC / AUPRC on the oversampled external dataset, bootstrapped 1000 times.
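The three resampling schemes above can be sketched with index-based helpers. This is a minimal illustration under stated assumptions: random subsampling and undersampling draw without replacement, oversampling draws with replacement, and the function names are hypothetical. The class sizes are those given for sub-challenge 1.

```python
import random


def random_subsample(n_total, size, rng):
    """Indices of `size` participants drawn without replacement."""
    return rng.sample(range(n_total), size)


def balanced_resample(labels, per_class, rng):
    """Draw `per_class` indices from each class.

    Draws without replacement when the class is large enough (undersampling)
    and with replacement otherwise (oversampling).
    """
    indices = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        if per_class <= len(members):
            indices += rng.sample(members, per_class)
        else:
            indices += rng.choices(members, k=per_class)
    return indices


def bootstrap(draw, score, n_boot=1000, seed=0):
    """Apply `score` to `n_boot` index sets produced by `draw(rng)`."""
    rng = random.Random(seed)
    return [score(draw(rng)) for _ in range(n_boot)]


# External dataset for sub-challenge 1: 63 preterm (1) and 89 term (0) participants.
labels = [1] * 63 + [0] * 89

# Example: fraction of preterm participants in each random subsample of 100.
preterm_fraction = bootstrap(
    lambda rng: random_subsample(len(labels), 100, rng),
    lambda idx: sum(labels[i] for i in idx) / len(idx),
)
```

In practice the `score` callback would compute AUROC or AUPRC from the model predictions at the drawn indices, so that each scheme yields a distribution of 1000 scores per model.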

[0753] DREAM challenge participants and teams were surveyed to gather information on how they developed their models.

[0754] Sensitivity analysis was carried out removing gestational age at sampling as a feature. As with previous DREAM Challenges, ensemble models were generated to explore the "wisdom of the crowds" phenomenon, by aggregating the best-performing models from each team. For each sub-challenge, 3 ensemble models were experimented with by calculating the mean estimation from: 1) top two performing models; 2) models with Bayes factor less than 20; 3) all models.
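The mean-based ensembles can be sketched as follows. This is a minimal illustration that assumes each model outputs one probability per sample in the same sample order; the model names, Bayes factor values, and helper names are hypothetical.

```python
from statistics import fmean


def mean_ensemble(predictions):
    """Average per-sample predictions across models.

    predictions: dict mapping a model name to its list of per-sample probabilities.
    """
    return [fmean(per_sample) for per_sample in zip(*predictions.values())]


def select_by_bayes_factor(bayes_factors, cutoff=20):
    """Names of models whose Bayes factor is below the cutoff."""
    return [name for name, bf in bayes_factors.items() if bf < cutoff]


# Toy predictions for three samples from three hypothetical models.
predictions = {
    "model_a": [0.2, 0.8, 0.5],
    "model_b": [0.4, 0.6, 0.7],
    "model_c": [0.9, 0.1, 0.3],
}
bayes_factors = {"model_a": 0.0, "model_b": 3.5, "model_c": 42.0}

kept = select_by_bayes_factor(bayes_factors)
ensemble_all = mean_ensemble(predictions)
ensemble_kept = mean_ensemble({name: predictions[name] for name in kept})
```

Averaging tends to help when the component models make partly independent errors, which is the premise behind the "wisdom of the crowds" comparison.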

[0755] Feature Permutation

[0756] Feature permutation was employed to empirically determine which of the microbiome feature sets, and in turn which specific features, the models used to make their predictions. Feature importance was determined across the best-performing models for sub-challenges 1 and 2 that demonstrated predictive performance at a threshold of 0.64 for sub-challenge 1 and a threshold of 0.80 for sub-challenge 2, and that could also be run in a bootstrapped manner in a tractable amount of time (e.g., offer a prediction in under ten seconds on a 12-core AMD Ryzen 3900X processor). Three models for sub-challenge 1 and eight models for sub-challenge 2 fit these criteria and were evaluated. A staged approach was employed: feature tables were first randomized to identify which feature tables a model used, and then, within those feature table-model pairings, individual features were randomized.

[0757] Table permutation: The alpha-diversity, taxonomy (species-, genus-, and family-level), phylotype (1, 0.5, and 0.1 binning distance), VALENCIA CSTs, and raw sequence variant count tables (with features in columns and specimens in rows) were each individually shuffled by row without replacement. After obtaining a baseline prediction from each model with unmodified feature tables, the model was rerun with a shuffled table replacing one of the feature tables, and the predictions were recorded and compared to the baseline prediction. A feature table was scored as used by that model if the prediction changed compared to the baseline prediction.
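The table permutation stage can be sketched as follows. This is a minimal illustration that treats a model as a black-box function from a dictionary of feature tables to a list of predictions; the table names, the toy model, and the helper names are hypothetical.

```python
import random


def shuffle_rows(table, rng):
    """Copy of a table (list of rows) with rows reassigned to specimens at random."""
    rows = [list(row) for row in table]
    rng.shuffle(rows)  # a permutation, i.e. sampling without replacement
    return rows


def tables_used(model, tables, seed=0):
    """Names of feature tables whose row-shuffling changes the model's predictions."""
    rng = random.Random(seed)
    baseline = model(tables)
    used = []
    for name in tables:
        permuted = dict(tables)  # shallow copy; only one table is replaced
        permuted[name] = shuffle_rows(tables[name], rng)
        if model(permuted) != baseline:
            used.append(name)
    return used


# Toy setup: 20 specimens, two single-column tables.
tables = {
    "taxonomy_genus": [[i] for i in range(20)],
    "valencia_cst": [[i % 3] for i in range(20)],
}


def toy_model(t):
    """Hypothetical model that reads only the genus-level taxonomy table."""
    return [row[0] * 2 for row in t["taxonomy_genus"]]


used = tables_used(toy_model, tables)
```

Because the comparison is on model output rather than on model internals, the same procedure applies to any submitted model regardless of the algorithm behind it.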

[0758] Feature permutation: The results of the table permutation effort described above were then used to filter down to model-feature table pairs. Again, a baseline prediction was made, and then each column (feature) was shuffled one by one and the model output recorded and compared to the baseline. If the predictions varied when that specific feature was shuffled, it was considered ‘important’ for that model to make its prediction. To reduce the computational load, only features with a density over 10% (i.e., found in at least 10% of the specimens) were considered.
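The per-feature stage, including the density filter, can be sketched as follows. This is a minimal illustration in which density is taken to be the fraction of specimens with a non-zero value and the cutoff is applied as "at least 10%"; both choices, and all names, are assumptions made for this example.

```python
import random


def density(column):
    """Fraction of specimens in which the feature is present (non-zero)."""
    return sum(1 for value in column if value != 0) / len(column)


def important_features(model, table, min_density=0.10, seed=0):
    """Features whose shuffling changes the model's predictions.

    table: dict mapping a feature name to its per-specimen values.
    Features below `min_density` are skipped to reduce the number of model runs.
    """
    rng = random.Random(seed)
    baseline = model(table)
    important = []
    for name, column in table.items():
        if density(column) < min_density:
            continue
        shuffled = list(column)
        rng.shuffle(shuffled)
        permuted = dict(table)  # shallow copy; only one column is replaced
        permuted[name] = shuffled
        if model(permuted) != baseline:
            important.append(name)
    return important


# Toy table: 20 specimens, three hypothetical features.
table = {
    "common_used": list(range(1, 21)),
    "rare_used": [5] + [0] * 19,  # present in 5% of specimens
    "common_unused": list(range(1, 21)),
}


def toy_model(t):
    """Hypothetical model that reads two of the three features."""
    return [a + b for a, b in zip(t["common_used"], t["rare_used"])]


found = important_features(toy_model, table)
```

The toy case also shows the cost of the filter: a sparse feature that the model does use is never tested, so it cannot be reported as important.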

[0760] Further experimental details may be found in Golob et al. (2024) Cell Reports Medicine 5(1):101350, the disclosure of which is incorporated herein by reference in its entirety for all purposes.

[0761] References

[0762] 1. Blencowe, H., Cousens, S., Oestergaard, M. Z., Chou, D., Moller, A.-B., Narwal, R., Adler, A., Garcia, C. V., Rohde, S., Say, L., et al. (2012). National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. The Lancet 379, 2162-2172. 10.1016 / S0140-6736(12)60820-4.

[0763] 2. Blencowe, H., Cousens, S., Chou, D., Oestergaard, M., Say, L., Moller, A.-B., Kinney, M., Lawn, J., and the Born Too Soon Preterm Birth Action Group (see acknowledgement for full list) (2013). Born Too Soon: The global epidemiology of 15 million preterm births. Reprod. Health 10, S2. 10.1186 / 1742-4755-10-S1-S2.

[0764] 3. Liu, L., Johnson, H. L., Cousens, S., Perin, J., Scott, S., Lawn, J. E., Rudan, I., Campbell, H., Cibulskis, R., Li, M., et al. (2012). Global, regional, and national causes of child mortality: an updated systematic analysis for 2010 with time trends since 2000. The Lancet 379, 2151-2161. 10.1016 / S0140-6736(12)60560-1.

[0766] 4. Norwitz, E. R., and Caughey, A. B. (2011). Progesterone Supplementation and the Prevention of Preterm Birth. Rev. Obstet. Gynecol. 4, 60-72.

[0767] 5. Lynch, A. M., Hart, J. E., Agwu, O. C., Fisher, B. M., West, N. A., and Gibbs, R. S. (2014). Association of extremes of prepregnancy BMI with the clinical presentations of preterm birth. Am. J. Obstet. Gynecol. 210, 428.e1-9. 10.1016 / j.ajog.2013.12.011.

[0768] 6. Underwood, P., Hester, L. L., Laffitte, T., and Gregg, K. V. (1965). The Relationship Of Smoking To The Outcome Of Pregnancy. Am. J. Obstet. Gynecol. 91, 270-276. 10.1016 / 0002-9378(65)90211-5.

[0769] 7. Iams, J. D., Goldenberg, R. L., Meis, P. J., Mercer, B. M., Moawad, A., Das, A., Thom, E., McNellis, D., Copper, R. L., Johnson, F., et al. (1996). The length of the cervix and the risk of spontaneous premature delivery. National Institute of Child Health and Human Development Maternal Fetal Medicine Unit Network. N. Engl. J. Med. 334, 567-572. 10.1056 / NEJM199602293340904.

[0771] 8. Fall, C. H. D., Sachdev, H. S., Osmond, C., Restrepo-Mendez, M. C., Victora, C., Martorell, R., Stein, A. D., Sinha, S., Tandon, N., Adair, L., et al. (2015). Association between maternal age at childbirth and child and adult outcomes in the offspring: a prospective study in five low-income and middle-income countries (COHORTS collaboration). Lancet Glob. Health 3, e366-e377. 10.1016 / S2214-109X(15)00038-8.

[0773] 9. Sheikh, I. A., Ahmad, E., Jamal, M. S., Rehan, M., Assidi, M., Tayubi, I. A., AlBasri, S. F., Bajouh, O. S., Turki, R. F., Abuzenadah, A. M., et al. (2016). Spontaneous preterm birth and single nucleotide gene polymorphisms: a recent update. BMC Genomics 17, 759. 10.1186 / s12864-016-3089-0.

[0774] 10. Kramer, M. S., Goulet, L., Lydon, J., Seguin, L., McNamara, H., Dassa, C., Platt, R. W., Fong Chen, M., Gauthier, H., Genest, J., et al. (2001). Socio-economic disparities in preterm birth: causal pathways and mechanisms. Paediatr. Perinat. Epidemiol. 15, 104-123. 10.1046 / j.1365-3016.2001.00012.x.

[0775] 11. Slattery, M. M., and Morrison, J. J. (2002). Preterm delivery. The Lancet 360, 1489-1497. 10.1016 / S0140-6736(02)11476-0.

[0777] 12. Mercer, B. M., Goldenberg, R. L., Moawad, A. H., Meis, P. J., Iams, J. D., Das, A. F., Caritis, S. N., Miodovnik, M., Menard, M. K., Thurnau, G. R., et al. (1999). The preterm prediction study: effect of gestational age and cause of preterm birth on subsequent obstetric outcome. National Institute of Child Health and Human Development Maternal-Fetal Medicine Units Network. Am. J. Obstet. Gynecol. 181, 1216-1221. 10.1016 / s0002-9378(99)70111-0.

[0778] 13. Suff, N., Story, L., and Shennan, A. (2019). The prediction of preterm delivery: What is new? Semin. Fetal. Neonatal Med. 24, 27-32. 10.1016 / j.siny.2018.09.006.

[0779] 14. Manz, C. R., Zhang, Y., Chen, K., Long, Q., Small, D. S., Evans, C. N., Chivers, C., Regli, S. H., Hanson, C. W., Bekelman, J. E., et al. (2023). Long-term Effect of Machine Learning-Triggered Behavioral Nudges on Serious Illness Conversations and End-of-Life Outcomes Among Patients With Cancer: A Randomized Clinical Trial. JAMA Oncol. 10.1001 / jamaoncol.2022.6303.

[0781] 15. Tomasev, N., Glorot, X., Rae, J. W., Zielinski, M., Askham, H., Saraiva, A., Mottram, A., Meyer, C., Ravuri, S., Protsyuk, I., et al. (2019). A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 572, 116-119. 10.1038 / s41586-019-1390-1.

16. Shung, D. L., Au, B., Taylor, R. A., Tay, J. K., Laursen, S. B., Stanley, A. J., Dalton, H. R., Ngu, J., Schultz, M., and Laine, L. (2020). Validation of a Machine Learning Model That Outperforms Clinical Risk Scoring Systems for Upper Gastrointestinal Bleeding. Gastroenterology 158, 160-167. 10.1053 / j.gastro.2019.09.009.

[0782] 17. Reel, P. S., Reel, S., Pearson, E., Trucco, E., and Jefferson, E. (2021). Using machine learning approaches for multi-omics data analysis: A review. Biotechnol. Adv. 49, 107739. 10.1016 / j.biotechadv.2021.107739.

[0784] 18. Akazawa, M., and Hashimoto, K. (2022). Prediction of preterm birth using artificial intelligence: a systematic review. J. Obstet. Gynaecol. 42, 1662-1668. 10.1080 / 01443615.2022.2056828.

[0786] 19. Davidson, L., and Boland, M. R. (2021). Towards deep phenotyping pregnancy: a systematic review on artificial intelligence and machine learning methods to improve pregnancy outcomes. Brief. Bioinform. 22, bbaa369. 10.1093 / bib / bbaa369.

[0787] 20. Espinosa, C., Becker, M., Maric, I., Wong, R. J., Shaw, G. M., Gaudilliere, B., Aghaeepour, N., Stevenson, D. K., Stelzer, I. A., Peterson, L. S., et al. (2021). Data-Driven Modeling of Pregnancy-Related Complications. Trends Mol. Med. 27, 762-776. 10.1016 / j.molmed.2021.01.007.

[0789] 21. Stelzer, I. A., Ghaemi, M. S., Han, X., Ando, K., Hedou, J. J., Feyaerts, D., Peterson, L. S., Rumer, K. K., Tsai, E. S., Ganio, E. A., et al. (2021). Integrated trajectories of the maternal metabolome, proteome, and immunome predict labor onset. Sci. Transl. Med. 13, eabd9898. 10.1126 / scitranslmed.abd9898.

[0791] 22. Maric, I., Contrepois, K., Moufarrej, M. N., Stelzer, I. A., Feyaerts, D., Han, X., Tang, A., Stanley, N., Wong, R. J., Traber, G. M., et al. (2022). Early prediction and longitudinal modeling of preeclampsia from multiomics. Patterns 3, 100655. 10.1016 / j.patter.2022.100655.

[0792] 23. Ghaemi, M. S., DiGiulio, D. B., Contrepois, K., Callahan, B., Ngo, T. T. M., Lee-McMullen, B., Lehallier, B., Robaczewska, A., Mcilwain, D., Rosenberg-Hasson, Y., et al. (2019). Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy. Bioinforma. Oxf. Engl. 35, 95-103. 10.1093 / bioinformatics / bty537.

24. Tarca, A. L., Pataki, B. A., Romero, R., Sirota, M., Guan, Y., Kutum, R., Gomez-Lopez, N., Done, B., Bhatti, G., Yu, T., et al. (2021). Crowdsourcing assessment of maternal blood multiomics for predicting gestational age and preterm birth. Cell Rep. Med. 2, 100323. 10.1016 / j.xcrm.2021.100323.

[0794] 25. Hyman, R. W., Fukushima, M., Jiang, H., Fung, E., Rand, L., Johnson, B., Vo, K. C., Caughey, A. B., Hilton, J. F., Davis, R. W., et al. (2014). Diversity of the Vaginal Microbiome Correlates With Preterm Birth. Reprod. Sci. 21, 32-40. 10.1177 / 1933719113488838.

[0795] 26. DiGiulio, D. B., Callahan, B. J., McMurdie, P. J., Costello, E. K., Lyell, D. J., Robaczewska, A., Sun, C. L., Goltsman, D. S. A., Wong, R. J., Shaw, G., et al. (2015). Temporal and spatial variation of the human microbiota during pregnancy. Proc. Natl. Acad. Sci. U. S. A. 112, 11060-11065. 10.1073 / pnas.1502875112.

[0797] 27. Callahan, B. J., DiGiulio, D. B., Goltsman, D. S. A., Sun, C. L., Costello, E. K., Jeganathan, P., Biggio, J. R., Wong, R. J., Druzin, M. L., Shaw, G. M., et al. (2017). Replication and refinement of a vaginal microbial signature of preterm birth in two racially distinct cohorts of US women. Proc. Natl. Acad. Sci. U. S. A. 114, 9966-9971. 10.1073 / pnas.1705899114.

[0798] 28. Structure, Function and Diversity of the Healthy Human Microbiome (2012). Nature 486, 207-214. 10.1038 / nature11234.

[0799] 29. Huang, C., Gin, C., Fettweis, J., Foxman, B., Gelaye, B., MacIntyre, D. A., Subramaniam, A., Fraser, W., Tabatabaei, N., and Callahan, B. (2022). Meta-Analysis Reveals the Vaginal Microbiome is a Better Predictor of Earlier Than Later Preterm Birth. Preprint at medRxiv, 10.1101 / 2022.09.26.22280389.

[0800] 30. Haque, M. M., Merchant, M., Kumar, P. N., Dutta, A., and Mande, S. S. (2017). First-trimester vaginal microbiome diversity: A potential indicator of preterm delivery risk. Sci. Rep. 7, 16145. 10.1038 / s41598-017-16352-y.

[0801] 31. Huo, Y., Jiang, Q., and Zhao, W. (2022). Meta-analysis of metagenomics reveals the signatures of vaginal microbiome in preterm birth. Med. Microecol. 14, 100065. 10.1016 / j.medmic.2022.100065.

[0803] 32. Kosti, I., Lyalina, S., Pollard, K. S., Butte, A. J., and Sirota, M. (2020). Meta-Analysis of Vaginal Microbiome Data Provides New Insights Into Preterm Birth. Front. Microbiol. 11, 476. 10.3389 / fmicb.2020.00476.

[0805] 33. Park, S., Oh, D., Heo, H., Lee, G., Kim, S. M., Ansari, A., You, Y.-A., Jung, Y. J., Kim, Y.-H., Lee, M., et al. (2021). Prediction of preterm birth based on machine learning using bacterial risk score in cervicovaginal fluid. Am. J. Reprod. Immunol. 86, e13435. 10.1111 / aji.13435.

[0806] 34. Kumar, M., Murugesan, S., Singh, P., Saadaoui, M., Elhag, D. A., Terranegra, A., Kabeer, B. S. A., Marr, A. K., Kino, T., Brummaier, T., et al. (2021). Vaginal Microbiota and Cytokine Levels Predict Preterm Delivery in Asian Women. Front. Cell. Infect. Microbiol. 11, 639665. 10.3389 / fcimb.2021.639665.

[0808] 35. Sharma, D., and Xu, W. (2021). phyLoSTM: a novel deep learning model on disease prediction from longitudinal microbiome data. Bioinformatics 37, 3707-3714. 10.1093 / bioinformatics / btab482.

[0810] 36. Zheng, Q., Bartow-McKenney, C., Meisel, J. S., and Grice, E. A. (2018). HmmUFOtu: An HMM and phylogenetic placement based ultra-fast taxonomic assignment and OTU picking tool for microbiome amplicon sequencing studies. Genome Biol. 19, 82. 10.1186 / s13059-018-1450-0.

[0812] 37. Janssen, S., McDonald, D., Gonzalez, A., Navas-Molina, J. A., Jiang, L., Xu, Z. Z., Winker, K., Kado, D. M., Orwoll, E., Manary, M., et al. (2018). Phylogenetic Placement of Exact Amplicon Sequences Improves Associations with Clinical Information. mSystems 3, e00021-18. 10.1128 / mSystems.00021-18.

[0814] 38. Mirarab, S., Nguyen, N., and Warnow, T. (2012). SEPP: SATe-enabled phylogenetic placement. Pac. Symp. Biocomput., 247-258. 10.1142 / 9789814366496_0024.

[0816] 39. Silverman, J. D., Washburne, A. D., Mukherjee, S., and David, L. A. (2017). A phylogenetic transform enhances analysis of compositional microbiota data. eLife 6. 10.7554 / eLife.21887.

40. Mailman, M. D., Feolo, M., Jin, Y., Kimura, M., Tryka, K., Bagoutdinov, R., Hao, L., Kiang, A., Paschall, J., Phan, L., et al. (2007). The NCBI dbGaP database of genotypes and phenotypes. Nat. Genet. 39, 1181-1186. 10.1038 / ng1007-1181.

[0817] 41. Sirota, M., Thomas, C. G., Liu, R., Zuhl, M., Banerjee, P., Wong, R. J., Quaintance, C. C., Leite, R., Chubiz, J., Anderson, R., et al. (2018). Enabling precision medicine in neonatology, an integrated repository for preterm birth research. Sci. Data 5, 180219. 10.1038 / sdata.2018.219.

[0818] 42. Minot, S. S., Garb, B., Roldan, A., Tang, A., Oskotsky, T., Rosenthal, C., Hoffman, N. G., Sirota, M., and Golob, J. L. (2022). Robust Harmonization of Microbiome Studies by Phylogenetic Scaffolding with MaLiAmPi. Preprint at bioRxiv, 10.1101 / 2022.07.26.501561.

[0819] 43. Willis, A. D. (2019). Rarefaction, Alpha Diversity, and Statistics. Front. Microbiol. 10, 2407. 10.3389 / fmicb.2019.02407.

[0821] 44. Matsen, F. A., Kodner, R. B., and Armbrust, E. V. (2010). pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics 11, 538. 10.1186 / 1471-2105-11-538.

[0822] 45. Chao, A., Chiu, C.-H., and Jost, L. (2014). Unifying Species Diversity, Phylogenetic Diversity, Functional Diversity, and Related Similarity and Differentiation Measures Through Hill Numbers. Annu. Rev. Ecol. Evol. Syst. 45, 297-324. 10.1146 / annurev-ecolsys-120213-091540.

[0823] 46. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825-2830.

[0824] 47. Minot, S. S., Garb, B., Roldan, A., Tang, A., Oskotsky, T., Rosenthal, C., Hoffman, N. G., Sirota, M., and Golob, J. L. (2022). Robust Harmonization of Microbiome Studies by Phylogenetic Scaffolding with MaLiAmPi (Bioinformatics) 10.1101 / 2022.07.26.501561.

[0825] 48. Honest, H., Forbes, C. A., Duree, K. H., Norman, G., Duffy, S. B., Tsourapas, A., Roberts, T. E., Barton, P. M., Jowett, S. M., Hyde, C. J., et al. (2009). Screening to prevent spontaneous preterm birth: systematic reviews of accuracy and effectiveness literature with economic modelling. Health Technol. Assess. Winch. Engl. 13, 1-627. 10.3310 / hta13430.

[0826] 49. Budelier, M. M., and Hubbard, J. A. (2023). The regulatory landscape of laboratory developed tests: Past, present, and a perspective on the future. J. Mass Spectrom. Adv. Clin. Lab 28, 67-69. 10.1016 / j.jmsacl.2023.02.008.

[0827] 50. Golob, J. L., Margolis, E., Hoffman, N. G., and Fredricks, D. N. (2017). Evaluating the accuracy of amplicon-based microbiome computational pipelines on simulated human gut microbial communities. BMC Bioinformatics 18, 283. 10.1186 / s12859-017-1690-0.

[0828] 51. Brown, R. G., Al-Memar, M., Marchesi, J. R., Lee, Y. S., Smith, A., Chan, D., Lewis, H., Kindinger, L., Terzidou, V., Bourne, T., et al. (2019). Establishment of vaginal microbiota composition in early pregnancy and its association with subsequent preterm prelabor rupture of the fetal membranes. Transl. Res. J. Lab. Clin. Med. 207, 30-43. 10.1016 / j.trsl.2018.12.005.

52. Roberto, R., Tarca, A., Winters, A., Panzer, J., Lin, H., Gudicha, D., Galaz, J., Farias-Jofre, M., Kracht, D., Chaiworapongsa, T., et al. (2022). The Vaginal Microbiota in Early Pregnancy Identifies a Subset of Women at Risk for Early Preterm Prelabor Rupture of Membranes and Preterm Birth. 10.21203 / rs.3.rs-2359402 / v1.

[0829] 53. Gihawi, A., Ge, Y., Lu, J., Puiu, D., Xu, A., Cooper, C. S., Brewer, D. S., Pertea, M., and Salzberg, S. L. (2023). Major data analysis errors invalidate cancer microbiome findings. mBio, e0160723. 10.1128 / mbio.01607-23.

[0830] 54. Bhattacharya, S., Andorf, S., Gomes, L., Dunn, P., Schaefer, H., Pontius, J., Berger, P., Desborough, V., Smith, T., Campbell, J., et al. (2014). ImmPort: disseminating data to the public for the future of immunology. Immunol. Res. 58, 234-239. 10.1007 / s12026-014-8516-1.

[0831] 55. Leinonen, R., Sugawara, H., Shumway, M., and International Nucleotide Sequence Database Collaboration (2011). The sequence read archive. Nucleic Acids Res. 39, D19-21. 10.1093 / nar / gkq1019.

[0833] 56. Leinonen, R., Akhtar, R., Birney, E., Bower, L., Cerdeno-Tarraga, A., Cheng, Y., Cleland, I., Faruque, N., Goodgame, N., Gibson, R., et al. (2011). The European Nucleotide Archive. Nucleic Acids Res. 39, D28-D31. 10.1093 / nar / gkq967.

[0834] 57. Kindinger, L. M., Bennett, P. R., Lee, Y. S., Marchesi, J. R., Smith, A., Cacciatore, S., Holmes, E., Nicholson, J. K., Teoh, T. G., and MacIntyre, D. A. (2017). The interaction between vaginal microbiota, cervical length, and vaginal progesterone treatment for preterm birth risk. Microbiome 5, 6. 10.1186 / s40168-016-0223-9.

[0835] 58. Kozich, J. J., Westcott, S. L., Baxter, N. T., Highlander, S. K., and Schloss, P. D. (2013). Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl. Environ. Microbiol. 79, 5112-5120. 10.1128 / AEM.01043-13.

[0836] 59. Caporaso, J. G., Lauber, C. L., Walters, W. A., Berg-Lyons, D., Lozupone, C. A., Turnbaugh, P. J., Fierer, N., and Knight, R. (2011). Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. Proc. Natl. Acad. Sci. 108, 4516-4522. 10.1073 / pnas.1000080107.

[0837] 60. Fujimura, K. E., Sitarik, A. R., Havstad, S., Lin, D. L., Levan, S., Fadrosh, D., Panzer, A. R., LaMere, B., Rackaityte, E., Lukacs, N. W., et al. (2016). Neonatal gut microbiota associates with childhood multisensitized atopy and T cell differentiation. Nat. Med. 22, 1187-1191. 10.1038 / nm.4176.

[0839] 61. Romero, R., Hassan, S. S., Gajer, P., Tarca, A. L., Fadrosh, D. W., Bieda, J., Chaemsaithong, P., Miranda, J., Chaiworapongsa, T., and Ravel, J. (2014). The vaginal microbiota of pregnant women who subsequently have spontaneous preterm labor and delivery and those with a normal delivery at term. Microbiome 2, 18. 10.1186 / 2049-2618-2-18.

[0840] 62. Stout, M. J., Zhou, Y., Wylie, K. M., Tarr, P. I., Macones, G. A., and Tuuli, M. G. (2017). Early pregnancy vaginal microbiome trends and preterm birth. Am. J. Obstet. Gynecol. 217, 356.e1-356.e18. 10.1016 / j.ajog.2017.05.030.

[0841] 63. Brown, R. G., Marchesi, J. R., Lee, Y. S., Smith, A., Lehne, B., Kindinger, L. M., Terzidou, V., Holmes, E., Nicholson, J. K., Bennett, P. R., et al. (2018). Vaginal dysbiosis increases risk of preterm fetal membrane rupture, neonatal sepsis and is exacerbated by erythromycin. BMC Med. 16, 9. 10.1186 / s12916-017-0999-x.

[0843] 64. Elovitz, M. A., Gajer, P., Riis, V., Brown, A. G., Humphrys, M. S., Holm, J. B., and Ravel, J. (2019). Cervicovaginal microbiota and local immune response modulate the risk of spontaneous preterm delivery. Nat. Commun. 10, 1305. 10.1038 / s41467-019-09285-9.

[0844] 65. Fettweis, J. M., Serrano, M. G., Brooks, J. P., Edwards, D. J., Girerd, P. H., Parikh, H. I., Huang, B., Arodz, T. J., Edupuganti, L., Glascock, A. L., et al. (2019). The vaginal microbiome and preterm birth. Nat. Med. 25, 1012-1021. 10.1038 / s41591-019-0450-2.

[0845] 66. Evans, S. N., and Matsen, F. A. (2010). The phylogenetic Kantorovich-Rubinstein metric for environmental sequence samples. ArXiv10051699 Q-Bio.

[0846] 67. France, M. T., Ma, B., Gajer, P., Brown, S., Humphrys, M. S., Holm, J. B., Waetjen, L. E., Brotman, R. M., and Ravel, J. (2020). VALENCIA: a nearest centroid classification method for vaginal microbial communities based on composition. Microbiome 8, 166. 10.1186 / s40168-020-00934-6.

[0847] 68. Brunson, J. C., and Read, Q. D. (2020). ggalluvial: Alluvial Plots in “ggplot2.” Version 0.12.3.

[0848] 69. Aitchison, J. (1986). The Statistical Analysis of Compositional Data (Chapman & Hall Ltd.).

[0849] 70. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Liu, T.-Y. (2017). LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems (Curran Associates, Inc.), pp. 3146-3154.

[0850] 71. Breiman, L. (2001). Random Forests. Mach. Learn. 45, 5-32. 10.1023 / A:1010933404324.

[0851] 72. Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 16, 321-357. 10.1613 / jair.953.

[0852] 73. Monaco, A., Pantaleo, E., Amoroso, N., Lacalamita, A., Lo Giudice, C., Fonzino, A., Fosso, B., Picardi, E., Tangaro, S., Pesole, G., et al. (2021). A primer on machine learning techniques for genomic applications. Comput. Struct. Biotechnol. J. 19, 4345-4359. 10.1016 / j.csbj.2021.07.021.

[0854] 74. Ding, C., and Peng, H. (2005). Minimum redundancy feature selection from microarray gene expression data. J. Bioinform. Comput. Biol. 3, 185-205.

[0855] 75. Hearst, M. A., Dumais, S. T., Osuna, E., Platt, J., and Scholkopf, B. (1998). Support vector machines. IEEE Intell. Syst. Their Appl. 13, 18-28. 10.1109 / 5254.708428.

[0856] 76. Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning (Springer) 10.1007 / 978-0-387-84858-7.

[0857] 77. Platt, J. and others (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Margin Classif. 10, 61-74.

[0858] 78. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., and Singer, Y. (2006). Online passive aggressive algorithms.

[0859] 79. Pandala, S. R. (2023). Lazy Predict.

[0860] 80. Sage Bionetworks (2021). challengescoring. (Sage Bionetworks).

[0861] 81. Sirota, M., Thomas, C. G., Liu, R., Zuhl, M., Banerjee, P., Wong, R. J., Quaintance, C. C., Leite, R., Chubiz, J., Anderson, R., et al. (2018). Enabling precision medicine in neonatology, an integrated repository for preterm birth research. Sci. Data 5, 180219. 10.1038 / sdata.2018.219.

[0862] 82. Romero, R., Hassan, S. S., Gajer, P., Tarca, A. L., Fadrosh, D. W., Bieda, J., Chaemsaithong, P., Miranda, J., Chaiworapongsa, T., and Ravel, J. (2014). The vaginal microbiota of pregnant women who subsequently have spontaneous preterm labor and delivery and those with a normal delivery at term. Microbiome 2, 18. 10.1186 / 2049-2618-2-18.

[0864] 83. Fettweis, J. M., Serrano, M. G., Brooks, J. P., Edwards, D. J., Girerd, P. H., Parikh, H. I., Huang, B., Arodz, T. J., Edupuganti, L., Glascock, A. L., et al. (2019). The vaginal microbiome and preterm birth. Nat. Med. 25, 1012-1021. 10.1038 / s41591-019-0450-2.

[0865] 84. Callahan, B. J., DiGiulio, D. B., Goltsman, D. S. A., Sun, C. L., Costello, E. K., Jeganathan, P., Biggio, J. R., Wong, R. J., Druzin, M. L., Shaw, G. M., et al. (2017). Replication and refinement of a vaginal microbial signature of preterm birth in two racially distinct cohorts of US women. Proc. Natl. Acad. Sci. U. S. A. 114, 9966-9971. 10.1073 / pnas.1705899114.

[0866] 85. Liao, J., Shenhav, L., Urban, J. A., Serrano, M., Zhu, B., Buck, G. A., and Korem, T. (2023). Microdiversity of the Vaginal Microbiome is Associated with Preterm Birth. BioRxiv Prepr. Serv. Biol., 2023.01.13.523991. 10.1101 / 2023.01.13.523991.

[0867] 86. Kindinger, L. M., MacIntyre, D. A., Lee, Y. S., Marchesi, J. R., Smith, A., McDonald, J. A. K., Terzidou, V., Cook, J. R., Lees, C., Israfil-Bayli, F., et al. (2016). Relationship between vaginal microbial dysbiosis, inflammation, and pregnancy outcomes in cervical cerclage. Sci. Transl. Med. 8, 350ra102. 10.1126 / scitranslmed.aag1026.

[0869] 87. Brown, R. G., Marchesi, J. R., Lee, Y. S., Smith, A., Lehne, B., Kindinger, L. M., Terzidou, V., Holmes, E., Nicholson, J. K., Bennett, P. R., et al. (2018). Vaginal dysbiosis increases risk of preterm fetal membrane rupture, neonatal sepsis and is exacerbated by erythromycin. BMC Med. 16, 9. 10.1186 / s12916-017-0999-x.

[0871] 88. Brown, R. G., Al-Memar, M., Marchesi, J. R., Lee, Y. S., Smith, A., Chan, D., Lewis, H., Kindinger, L., Terzidou, V., Bourne, T., et al. (2019). Establishment of vaginal microbiota composition in early pregnancy and its association with subsequent preterm prelabor rupture of the fetal membranes. Transl. Res. J. Lab. Clin. Med. 207, 30-43. 10.1016 / j.trsl.2018.12.005.

89. Minot, S. S., Garb, B., Roldan, A., Tang, A. S., Oskotsky, T. T., Rosenthal, C., Hoffman, N. G., Sirota, M., and Golob, J. L. (2023). MaLiAmPi enables generalizable and taxonomy-independent microbiome features from technically diverse 16S-based microbiome studies. Cell Rep. Methods, 100639. 10.1016 / j.crmeth.2023.100639.

[0872] Accordingly, the preceding merely illustrates the principles of the present disclosure. It will be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary embodiments shown and described herein.

Claims

WHAT IS CLAIMED IS:

1. A method comprising: performing 16S rRNA gene sequencing on vaginal fluid sample DNA to obtain vaginal microbiome 16S rRNA gene sequencing data; harmonizing the vaginal microbiome 16S rRNA gene sequencing data; transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features; inputting the vaginal microbiome features into a predictive model; and using the predictive model, predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for the subject from whom the vaginal fluid sample was obtained.

2. The method of claim 1, wherein harmonizing the vaginal microbiome 16S rRNA gene sequencing data comprises phylogenetic placement of amplicon sequence variants (ASVs) onto a maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles.

3. The method of claim 1 or 2, wherein transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features comprises transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into feature tables.

4. The method of any one of claims 1-3, wherein the vaginal microbiome features comprise one or more diversity measures, one or more community state types, one or more phylotypes, one or more taxons, or any combination thereof.

5. The method of claim 4, wherein the vaginal microbiome features comprise one or more diversity measures, one or more community state types, one or more phylotypes, and one or more taxons.

6. The method of claim 4 or 5, wherein the vaginal microbiome features comprise one or more of the features listed in FIG. 4B.

7. The method of claim 4 or 5, wherein the vaginal microbiome features comprise one or more of the features listed in Table 1, optionally wherein the one or more features are from those of rows 1-103.

8. The method of claim 7, wherein the vaginal microbiome features comprise 10 or more of the features listed in Table 1, optionally wherein the 10 or more features are from those of rows 1-103.

9. The method of claim 8, wherein the vaginal microbiome features comprise 25 or more of the features listed in Table 1, optionally wherein the 25 or more features are from those of rows 1-103.

10. The method of any one of claims 1-9, wherein the one or more features comprise Taxonomy (Genus): Mobiluncus.

11. The method of any one of claims 1-10, wherein the vaginal fluid sample was obtained by the subject from whom the vaginal fluid sample was obtained.

12. The method of claim 11, wherein the vaginal fluid sample was obtained at the subject’s home.

13. The method of any one of claims 1-12, further comprising providing the predicted likelihood of PTB or ePTB to the subject from whom the vaginal fluid sample was obtained.

14. The method of claim 13, further comprising, prior to providing the predicted likelihood of PTB or ePTB to the subject, assessing the quality of the predicted likelihood of PTB or ePTB.

15. The method of any one of claims 1-14, wherein the predicted likelihood of PTB or ePTB meets a threshold value, and wherein the method further comprises administering one or more PTB or ePTB interventions to the subject and / or fetus based on the predicted likelihood.

16. The method of claim 15, wherein the intervention comprises administering corticosteroids or magnesium sulfate to the subject.

17. One or more computer-readable media comprising instructions stored thereon, which when executed by one or more processors, cause the one or more processors to perform operations comprising:
(a) harmonizing vaginal microbiome 16S rRNA gene sequencing data obtained by 16S rRNA gene sequencing on vaginal fluid sample DNA;
(b) transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features;
(c) inputting the vaginal microbiome features into a predictive model; and
(d) using the predictive model, predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for the subject from whom the vaginal fluid sample was obtained.

18. The one or more computer-readable media of claim 17, wherein harmonizing the vaginal microbiome 16S rRNA gene sequencing data comprises phylogenetic placement of amplicon sequence variants (ASVs) onto a maximum likelihood phylogenetic tree comprised of full-length 16S rRNA alleles.

19. The one or more computer-readable media of claim 17 or 18, wherein transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into vaginal microbiome features comprises transforming the harmonized vaginal microbiome 16S rRNA gene sequencing data into feature tables.

20. The one or more computer-readable media of any one of claims 17-19, wherein the vaginal microbiome features comprise one or more diversity measures, one or more community state types, one or more phylotypes, one or more taxons, or any combination thereof.

21. The one or more computer-readable media of claim 20, wherein the vaginal microbiome features comprise one or more diversity measures, one or more community state types, one or more phylotypes, and one or more taxons.

22. The one or more computer-readable media of claim 20 or 21, wherein the vaginal microbiome features comprise one or more of the features listed in FIG. 4B.

23. The one or more computer-readable media of claim 20 or 21, wherein the vaginal microbiome features comprise one or more of the features listed in Table 1, optionally wherein the one or more features are from those of rows 1-103.

24. The one or more computer-readable media of claim 23, wherein the vaginal microbiome features comprise 10 or more of the features listed in Table 1, optionally wherein the 10 or more features are from those of rows 1-103.

25. The one or more computer-readable media of claim 24, wherein the vaginal microbiome features comprise 25 or more of the features listed in Table 1, optionally wherein the 25 or more features are from those of rows 1-103.

26. The one or more computer-readable media of any one of claims 17-25, wherein the one or more features comprise Taxonomy (Genus): Mobiluncus.

27. A system for predicting the likelihood of preterm birth (PTB) or early preterm birth (ePTB) for a subject, comprising:
one or more processors; and
the one or more computer-readable media of any one of claims 17-26.
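
The transforming, inputting, and predicting operations recited in claims 1 and 17, together with the threshold comparison of claim 15, can be pictured with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: the taxon counts, the function names, the use of a logistic model, and the values of `INTERCEPT`, `WEIGHTS`, and `THRESHOLD` are all hypothetical. Shannon diversity merely stands in for the "one or more diversity measures" of claim 4, and the relative abundance of Mobiluncus stands in for the genus-level feature of claim 10. The harmonizing step (for example, the phylogenetic placement of claim 2) is assumed to have already produced per-taxon counts.

```python
import math

# Hypothetical model parameters, chosen only so the sketch runs.
# They are NOT values from the disclosure.
INTERCEPT = -2.0
WEIGHTS = {"shannon": 1.0, "genus_Mobiluncus": 5.0}
THRESHOLD = 0.5


def relative_abundance(counts):
    """Convert harmonized per-taxon read counts into relative abundances."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_diversity(rel_abund):
    """Shannon index H = -sum(p * ln p) over taxa with nonzero abundance."""
    return -sum(p * math.log(p) for p in rel_abund.values() if p > 0)


def build_features(counts):
    """Transform per-taxon counts into a small feature dictionary."""
    rel = relative_abundance(counts)
    return {
        "shannon": shannon_diversity(rel),
        "genus_Mobiluncus": rel.get("Mobiluncus", 0.0),
    }


def predict_likelihood(features):
    """Logistic model (an assumed model type) mapping features to a 0-1 score."""
    z = INTERCEPT + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def meets_threshold(likelihood, threshold=THRESHOLD):
    """Return True when the predicted likelihood reaches the threshold."""
    return likelihood >= threshold


if __name__ == "__main__":
    # Hypothetical harmonized counts for one vaginal fluid sample.
    sample = {"Lactobacillus": 900, "Gardnerella": 80, "Mobiluncus": 20}
    feats = build_features(sample)
    score = predict_likelihood(feats)
    print(feats, score, meets_threshold(score))
```

A dictionary of named features is used here so that each entry can be traced back to a claimed feature type; a fitted model in practice would take its feature set, coefficients, and threshold from training data rather than from constants like these.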