Method for determining the risk of developing a liver disease using single nucleotide polymorphisms

The detection of specific SNPs in combination with clinical data allows for early identification of liver disease risk, addressing the lack of effective genetic markers for liver disease management.

WO2026139664A2 · PCT designated stage · Publication Date: 2026-07-02 · INST DE SALUD CARLOS III +1

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
INST DE SALUD CARLOS III
Filing Date
2025-12-26
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Current methods fail to effectively identify genetic markers for early detection and management of liver diseases, particularly in individuals with Alpha-1 antitrypsin deficiency, due to insufficient understanding of genetic factors contributing to fatty liver disease progression.

Method used

A method involving the detection of specific single nucleotide polymorphisms (SNPs) such as rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967, and rs17846713, combined with clinical data, to determine the risk of developing liver diseases like steatosis, fibrosis, and cirrhosis, using polygenic risk models and logistic regression.

Benefits of technology

Enables early identification of individuals at higher risk for liver diseases, facilitating timely intervention and improved clinical management through genetic and clinical data integration.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure IMGF000013_0001
  • Figure IMGF000020_0001
  • Figure IMGF000028_0001
Patent Text Reader

Abstract

The present invention relates to methods for determining the risk of developing a liver disease based on single nucleotide polymorphism detection. More specifically, the present invention comprises the use of rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and / or rs17846713. The present invention therefore relates to the field of biomedicine.

Description

[0001] DESCRIPTION

[0002] Method for determining the risk of developing liver disease using single nucleotide polymorphisms

[0003] The present invention relates to methods for determining the risk of developing liver disease based on the detection of single nucleotide polymorphisms. Therefore, the present invention falls within the field of biomedicine.

[0004] BACKGROUND OF THE INVENTION

[0005] The liver is one of the largest and most important organs in the human body. It is involved in various vital processes, such as breaking down food into nutrients, converting those nutrients into energy, and eliminating toxins from the body.

[0006] Liver diseases are those that affect the liver in some way, preventing it from working or functioning properly. Liver diseases can vary in severity, from mild conditions like steatosis (fatty liver disease) to more serious diseases such as hepatitis (unregulated inflammation), cirrhosis (severe scarring), and hepatocellular carcinoma. Early diagnosis and appropriate treatment are crucial to prevent disease progression and improve patient health outcomes.

[0007] The development of liver disease follows a progressive process that can vary depending on the underlying cause. However, many liver diseases share a common pattern involving stages of liver damage, steatosis, inflammation, fibrosis, and, in severe cases, cirrhosis or cancer.

[0008] Hepatic steatosis, or fatty liver, is a condition characterized by the excessive accumulation of fat in liver cells. It is generally defined as more than 5% of the liver's weight being fat. It is classified into non-alcoholic fatty liver disease (NAFLD), which can be related to factors such as obesity, type 2 diabetes, and metabolic syndrome, and alcoholic fatty liver disease (AFLD), related to excessive alcohol consumption. In its early stages, steatosis is usually asymptomatic, but if left untreated, it can progress to inflammation (steatohepatitis), fibrosis, cirrhosis, and liver cancer.

[0009] In the early stages, treatment may focus on dietary and lifestyle changes, while other cases may require lifelong medication for management. Starting treatment early enough can often prevent permanent damage. However, patients may not experience any symptoms in the early stages. Advanced liver disease is more complex to treat (Martin P. Approach to the patient with liver disease. In: Goldman L, Schafer AI, eds. Goldman-Cecil Medicine. 26th ed. Philadelphia, PA: Elsevier; 2020:chap 137; Williams MJ, Gordon-Walker TT. Hepatology. In: Penman ID, Ralston SH, Strachan MWJ, Hobson RP, eds. Davidson's Principles and Practice of Medicine. 24th ed. Philadelphia, PA: Elsevier; 2023:chap 24).

[0010] Some liver diseases have a genetic origin, due to inherited mutations that affect the metabolism or regulation of key substances in the liver. Alpha-1 antitrypsin deficiency (AATD) is caused by a genetic alteration that impairs the production of AAT in the liver. Patients with AATD can develop steatosis, hepatitis, fibrosis, and in some cases, progress to cirrhosis. The factors that trigger the development of liver disease in patients with AATD are currently unknown.

[0011] It has also been described that common genetic variations in the population can be associated with a higher risk of developing certain liver diseases. In this sense, the study of certain polymorphisms has already proven important in liver pathology, such as rs738409 of the PNPLA3 gene and rs58542926 of the TM6SF2 gene. In a disease as important as chronic hepatitis C, this study is presented as a necessity to better understand the disease progression in these patients and, eventually, serve as a tool to select those who need treatment more urgently, especially today, when highly effective direct-acting antiviral drugs exist, but their cost prevents them from being offered universally (Urzúa et al. (2015). Association between polymorphisms in the PNPLA3 and TM6SF2 genes and presence of fibrosis in patients with chronic hepatitis C virus infection. Revista Hospital Clínico Universidad de Chile, 26(4), 329-35). The genetic factors that contribute to fatty liver disease are not yet fully understood. Therefore, there is a need to establish new biomarkers that allow for the rapid and efficient identification of a risk of developing liver disease, thereby improving the management and treatment of these conditions.

[0012] DESCRIPTION OF THE INVENTION

[0013] The present inventors sequenced the complete exome of subjects with Alpha-1 Antitrypsin Deficiency (AATD), all with the ZZ genotype, to identify genetic variants that modify AATD. One group had developed only liver disease (ZZ-HIG), and another group only lung disease (ZZ-PUL). Subsequently, the allele frequencies of the variants were compared between the two groups (cases with liver disease, ZZ-HIG, and cases with lung disease, ZZ-PUL) using Fisher's exact test or logistic regression. From the variants significantly associated with liver disease, seven single nucleotide polymorphisms (SNPs) of interest were selected due to their potential relationship with lipid accumulation in hepatocytes, representing biomarkers for identifying subjects at risk of developing liver disease. Finally, a polygenic risk score (PRS) model was built to assess the genetic predisposition of a ZZ individual to develop liver disease based on the sum of the effects of the selected genetic variants.

[0014] The inventors found that the presence of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967, and / or rs17846713 (biomarkers or SNPs of the invention), of interest due to their relationship with different aspects of lipid metabolism, is associated with the development of liver disease. Therefore, in light of the inventors' findings, the individual presence of these seven SNPs or their combinations, detected in a single biological sample from a subject, represents a novel biomarker for determining the risk of developing liver disease, thus enabling prevention and / or early diagnosis, leading to improved clinical management of individuals with such diseases.

[0015] Therefore, in one aspect the present invention relates to an in vitro method for determining the risk of developing liver disease in a subject, hereinafter referred to as “Method I of the invention,” wherein said method comprises the following steps: a) detecting the presence of a single nucleotide polymorphism (SNP) in a biological sample isolated from the subject, wherein the SNP is selected from among rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713; and

[0016] b) determining the subject's risk of developing liver disease based on the results of step a).

[0017] The term "risk" implies the presence of a characteristic or factor (or several) that increases the likelihood of adverse consequences. In the context of the present invention, risk constitutes a measure of the statistical probability that a subject will suffer from or develop a disease in the future, more specifically, a liver disease. Thus, the presence of a risk factor signifies an increased risk of developing a disease in the future, compared to subjects who do not possess the risk factor.

[0018] In the present invention, the terms “liver disease,” “hepatopathy,” or “liver damage,” used interchangeably, refer to a type of liver injury or disease. When the injury is long-lasting, chronic liver disease develops. Chronic liver disease can progress through approximately four stages: 1. Hepatitis; 2. Fibrosis; 3. Cirrhosis; 4. Liver failure. Examples of liver diseases include, but are not limited to, non-alcoholic fatty liver disease (NAFLD), fibrosis, non-alcoholic steatohepatitis (NASH), metabolic dysfunction-associated fatty liver disease (MAFLD), cirrhosis, and hepatocellular carcinoma associated with metabolic dysfunction.

[0019] To determine the risk of developing liver disease in method I of the invention, step a) is first carried out, which consists of detecting the presence of a single nucleotide polymorphism (SNP) in a biological sample isolated from the subject, wherein the SNP is selected from among rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713.

[0020] A single nucleotide polymorphism, or SNP, is a variation in the DNA sequence that occurs when a single nucleotide in the genome differs between members of a biological species or between paired chromosomes in an individual. SNPs are used as genetic markers for variant alleles. An allele is one of a series of two or more different gene sequences that occupy the same position, or locus, on a chromosome.

[0021] "Variant" refers to a sequence of polynucleotides or polypeptides that differs from a wild-type or reference sequence by one or more nucleotides or one or more amino acids.

[0022] As used in this document, "rs" in "rs3859093", "rs34474737", "rs10893", "rs2070666", "rs7087728", "rs55907967" or "rs17846713" is the rs number (Reference SNP Identification Number) from the National Center for Biotechnology Information's (NCBI) Single Nucleotide Polymorphism Database (dbSNP).

[0023] The identification of genetic variants in an isolated biological sample can be carried out using methods known to the expert in the field, such as, but not limited to, biochemical tests (to measure the proteins produced by the genes) and molecular tests (to detect small mutations in DNA).

[0024] Thus, in a particular embodiment of Method I of the invention, the presence of the SNP is determined by a technique selected from the list consisting of genome sequencing, exome sequencing, TaqMan assays, allele-specific primer extension, and PCR. In a more particular embodiment, the presence of the SNP is determined by exome sequencing. Even more particularly, the presence of the SNP is determined by next-generation sequencing (NGS).

[0025] In the context of the present invention, "detecting a single nucleotide polymorphism (SNP)" refers to the process by which the presence of a specific allele or alleles (in homozygosity or heterozygosity) of a nucleic acid variant, preferably DNA, is determined, where the allele or alleles may correspond to the nucleotides A (adenine), G (guanine), C (cytosine), or T (thymine) when the nucleic acid is DNA. Thus, when A (adenine) is detected as an allele in a specific genomic region, it is described as "allele A"; when T (thymine) is detected as an allele, it is described as "allele T"; when G (guanine) is detected as an allele, it is described as "allele G"; and when C (cytosine) is detected as an allele, it is described as "allele C". The presence of one or two alleles in an SNP may be associated with a higher or lower predictive value of the SNP. Thus, in the present invention, "detecting a single nucleotide polymorphism (SNP)" preferably comprises determining both alleles of a given SNP or set of SNPs, thereby providing the genotype for each of the SNPs.

[0026] “Genotype” means the genetic makeup of a cell, organism, or individual at a specific position in its genome. With reference to the invention, the genotype of an individual is determined as heterozygous or homozygous for one or more variant alleles of interest.

[0027] As observed in the experiments section, each of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713 has an allele associated with a higher risk of liver disease. Specifically, the presence of at least one C allele in rs3859093, one G allele in rs34474737, one G allele in rs10893, one A allele in rs2070666, one A allele in rs7087728, one G allele in rs55907967 and / or one G allele in rs17846713 is associated with a higher risk of liver disease. Preferably, the alleles included in this paragraph are referred to as “risk alleles”.

[0028] Following step a), method I of the invention comprises step b) of determining the subject's risk of developing liver disease based on the results of step a).

[0029] Thus, in a particular embodiment of method I of the invention, it is determined in step b) that the subject has a higher risk of developing liver disease when:

[0030] rs3859093 has at least one C allele or a CC genotype,

[0031] rs34474737 has at least one G allele or a GG genotype,

[0032] rs10893 has at least one G allele or a GG genotype,

[0033] rs2070666 has at least one A allele or an AA genotype,

[0034] rs7087728 has at least one A allele or an AA genotype,

[0035] rs55907967 has at least one G allele or a GG genotype, and / or

[0036] rs17846713 has at least one G allele or a GG genotype.
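The allele conditions listed above can be sketched in code. The following minimal Python sketch is illustrative only and is not part of the claimed method: it assumes each genotype is supplied as a two-letter string (for example "CT") reported on the same strand as the risk alleles named in the description, and the function names `risk_allele_count` and `flagged_snps` are hypothetical.

```python
# Risk alleles as listed in the description (one per SNP of the invention).
RISK_ALLELES = {
    "rs3859093": "C",
    "rs34474737": "G",
    "rs10893": "G",
    "rs2070666": "A",
    "rs7087728": "A",
    "rs55907967": "G",
    "rs17846713": "G",
}


def risk_allele_count(rsid: str, genotype: str) -> int:
    """Return how many copies (0, 1 or 2) of the risk allele a genotype carries.

    Assumption: `genotype` is a two-letter string such as "CT" or "cc".
    """
    if rsid not in RISK_ALLELES:
        raise KeyError(f"{rsid} is not one of the seven SNPs")
    alleles = genotype.upper()
    if len(alleles) != 2 or any(a not in "ACGT" for a in alleles):
        raise ValueError(f"expected a two-letter genotype such as 'CT', got {genotype!r}")
    return alleles.count(RISK_ALLELES[rsid])


def flagged_snps(genotypes: dict[str, str]) -> list[str]:
    """List the SNPs for which the subject carries at least one risk allele."""
    return [rsid for rsid, gt in genotypes.items() if risk_allele_count(rsid, gt) >= 1]
```

A subject would be flagged for a given SNP when `risk_allele_count` returns 1 or 2; the count itself is the allele dosage, which distinguishes heterozygous from homozygous carriers.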

[0037] Additional copies of the risk alleles of the SNPs of the present invention are associated with an increased risk of liver disease. Therefore, as can be seen in the examples described below, when the genotype of the SNP identified by rs3859093 in the subject is CC, the subject has a higher risk of developing liver disease than if rs3859093 has a single C allele.

[0038] When the genotype of the SNP identified by rs34474737 in the subject is GG, the subject has a higher risk of developing liver disease than if rs34474737 has a single G allele.

[0039] When the genotype of the SNP identified by rs10893 in the subject is GG, the subject has a higher risk of developing liver disease than if rs10893 has a single G allele.

[0040] When the genotype of the SNP identified by rs2070666 in the subject is AA, the subject has a higher risk of developing liver disease than if rs2070666 has a single A allele.

[0041] When the genotype of the SNP identified by rs7087728 in the subject is AA, the subject has a higher risk of developing liver disease than if rs7087728 has a single A allele.

[0042] When the genotype of the SNP identified by rs55907967 in the subject is GG, the subject has a higher risk of developing liver disease than if rs55907967 has a single G allele.

[0043] When the genotype of the SNP identified by rs17846713 in the subject is GG, the subject has a higher risk of developing liver disease than if rs17846713 has a single G allele.

[0044] In another particular embodiment, step a) of method I of the invention comprises detecting the presence of the combination of at least two, three, four, five or six SNPs selected from rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713.

[0045] In another more particular embodiment of method I of the invention, alone or in combination with each of the above particular embodiments, step a) comprises detecting the presence of the SNP rs3859093 and at least one of the SNPs selected from rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713.

[0046] In another particular embodiment, alone or in combination with each of the above particular embodiments, step a) comprises detecting the presence of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713.

[0047] Thus, step b) of method I of the invention allows the determination of the subject's risk of developing liver disease based on the SNP or set of SNPs from step a).

[0048] Furthermore, in the method for determining the risk of developing liver disease of the present invention, the risk of developing liver disease can be determined based on other factors or clinical data in addition to the detection of a SNP, wherein the SNP is selected from rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713.

[0049] In the present invention, the term “clinical factor(s)” or “clinical data” refers to data related to demographic, physical, familial, habit, or substance exposure characteristics. Examples of clinical data include age, sex, body mass index, or tobacco exposure.

[0050] Thus, in another particular embodiment of method I of the invention, alone or in combination with each of the preceding particular embodiments, step a) further comprises determining at least one clinical data point of the subject. More particularly, it comprises determining the clinical data point selected from among age, sex, body mass index, and / or whether the subject is a smoker. More particularly, it comprises determining the clinical data of age, sex, body mass index, and whether the subject is a smoker.

[0051] In the present invention, a “sample” means a part or small quantity of something that is considered representative of the whole and that is taken or separated from it for study, analysis, or experimentation. In particular, in the present invention, the term sample encompasses samples of biological origin, which are isolated from a subject, including, but not limited to, a tissue sample (e.g., skin), saliva, blood, and serum.

[0052] Thus, in another particular embodiment of method I of the invention, alone or in combination with each of the preceding particular embodiments, the biological sample is selected from the list consisting of a tissue sample, blood, serum, and saliva; preferably where the tissue sample is skin.

[0053] In the present invention, the biological sample contains nucleic acids, preferably deoxyribonucleic acid (DNA). The nucleic acids from the patient's biological sample can be extracted prior to the step of determining the presence of a SNP or set of SNPs. As is known to those skilled in the art, the extraction of nucleic acids, preferably DNA, can be carried out by known methods such as, without limitation, extraction with organic solvents (phenol-chloroform or isoamyl alcohol), inorganic solvents, salt extraction, chromatography, or silica gel extraction.

[0054] Preferably, prior to the step of detecting the presence of an SNP or set of SNPs, DNA is extracted from the biological sample isolated from the subject.

[0055] In another particular embodiment of Method I of the invention, alone or in combination with any of the preceding particular embodiments, the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis. In a more particular embodiment, the liver disease is steatosis.

[0056] Fatty liver disease (FLD) is a condition characterized by the accumulation of fat in the liver, which can progress to more severe stages of liver damage. Approximately 20–30% of people with NAFLD develop steatohepatitis, a liver inflammation that can lead to fibrosis and, eventually, cirrhosis or even liver cancer. Risk factors for progression include obesity, type 2 diabetes, insulin resistance, and genetic factors.

[0057] In the present invention, the subject may suffer from some other condition. Thus, in another particular embodiment of the invention, alone or in combination with each of the preceding particular embodiments, the subject suffers from Alpha-1 Antitrypsin deficiency.

[0058] In the present invention, the expression “the subject suffers from Alpha-1 Antitrypsin Deficiency” means that the subject has been previously diagnosed with Alpha-1 Antitrypsin Deficiency. Alpha-1 antitrypsin deficiency is a genetic disorder caused by mutations in the SERPINA1 gene. The Z mutation is the most common and clinically significant variant associated with AATD, caused by a point mutation that replaces glutamic acid with lysine (p.Glu342Lys). Liver disease associated with AATD is caused by the accumulation of mutated AAT protein in liver cells. This can lead to liver inflammation, cirrhosis, and impaired liver function. In individuals with AATD, the liver cannot properly process and release the AAT protein into the bloodstream. As a result, the abnormal protein accumulates in liver cells, causing damage. This can lead to a variety of liver problems, ranging from mild liver dysfunction to severe liver disease. Some people with AATD may develop liver disease in childhood, while others may not experience symptoms until adulthood or may never develop liver disease at all. The most severe liver disease may require a liver transplant as the only treatment option.

[0059] Furthermore, the present inventors have developed a multivariable model that allows calculating a liver disease risk score based on the different SNP variables of the invention, where the calculation of said risk score allows determining whether the subject has a higher risk of suffering from liver disease.

[0060] Thus, in another aspect, the invention relates to an in vitro method for determining the risk of developing liver disease in a subject, hereinafter referred to as “method II of the invention”, where said method comprises the following steps:

[0061] a) Detecting the presence of single nucleotide polymorphisms (SNPs) in a biological sample isolated from the subject, where the SNPs are rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713;

[0062] b) calculating a liver disease risk score based on the data from step a), and

[0063] c) comparing the risk score from step b) with a threshold value,

[0064] where a liver disease risk score value greater than the threshold value indicates that the subject has a higher risk of developing liver disease. To determine the risk of developing liver disease in method II of the invention, step a) is first carried out, which consists of detecting the presence of single nucleotide polymorphisms (SNPs) in a biological sample isolated from the subject, where the SNPs are rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713.

[0065] The identification of genetic variants in an isolated biological sample can be carried out using methods known to the expert in the field, such as, but not limited to, biochemical tests (to measure the proteins produced by the genes) and molecular tests (to detect small mutations in DNA).

[0066] Thus, in a particular embodiment of Method II of the invention, the presence of the SNP is determined by a technique selected from the list consisting of genome sequencing, exome sequencing, TaqMan assays, allele-specific primer extension, and PCR. In a more particular embodiment, the presence of the SNP is determined by exome sequencing. Even more particularly, the presence of the SNP is determined by next-generation sequencing (NGS).

[0067] In another particular embodiment of method II of the invention, alone or in combination with each of the preceding particular embodiments, the subject has a greater risk of developing liver disease when:

[0068] rs3859093 has at least one C allele or a CC genotype,

[0069] rs34474737 has at least one G allele or a GG genotype,

[0070] rs10893 has at least one G allele or a GG genotype,

[0071] rs2070666 has at least one A allele or an AA genotype,

[0072] rs7087728 has at least one A allele or an AA genotype,

[0073] rs55907967 has at least one G allele or a GG genotype, and / or

[0074] rs17846713 has at least one G allele or a GG genotype.

[0075] Following step a), method II of the invention comprises step b) of calculating a liver disease risk score based on the data from step a).

[0076] In the present invention, the term “risk score” refers to a statistical metric that allows for the stratification or classification of a subject or population based on the risk of a particular event occurring (e.g., the development of liver disease). As those skilled in the art know, the risk score can be calculated using a predictive model comprising multiple biomarker variables (e.g., different SNPs). In multivariable models, a score can be assigned to each of the variables (e.g., each of the SNPs) that are subsequently included in the risk score calculation, for example, without limitation, as a sum; as a weighted sum (e.g., in a regression model); using any linear or generalized linear model that takes normalized biomarker scores as inputs and produces, based on the input normalized biomarker scores, an output indicative of the risk of developing liver disease; or using any statistical model (e.g., a neural network model, a Bayesian regression model, an adaptive nonlinear regression model, a support vector regression model, a Gaussian mixture model, a random forest regression and / or any other mixture model of suitable types) taking normalized biomarker scores as inputs and producing, based on the input biomarker scores, an output indicative of the risk of developing liver disease.

[0077] In another particular embodiment of Method II of the invention, alone or in combination with each of the preceding particular embodiments, the risk score is calculated using a polygenic risk score (PRS). As those skilled in the art know, PRS models are typically constructed as the weighted sum of a collection of genetic variants, generally single nucleotide polymorphisms (SNPs) defined as single-base-pair variations of the reference genome (Choi, SW, Mak, TSH, & O'Reilly, PF (2020). Nature protocols, 15(9), 2759-2772; Collister, JA, Liu, X., & Clifton, L. (2022). Frontiers in genetics, 13, 818574).

[0078] Therefore, in step b) of method II of the invention, the liver disease risk score is calculated using a PRS model with the data from step a) (i.e., the detection of the polymorphisms rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713).

[0079] In another particular embodiment, step b) is carried out using a PRS model with the data from step a) using formula (I):

[0080] (I) $\mathrm{PRS} = \sum_{i} \beta_i \cdot \mathrm{genotype}_i$

[0081] where $\beta_i$ is the effect of variant i, and $\mathrm{genotype}_i$ is the number of risk alleles present in that variant for the individual. The effect $\beta_i$ of each variant can be determined using odds ratio (OR) values, calculating the natural logarithm of each OR as $\beta_i = \ln(\mathrm{OR}_i)$.
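As an illustration of formula (I), read here as a weighted sum in which each weight is the natural logarithm of the variant's odds ratio and each genotype term is the risk-allele count, a minimal Python sketch follows. The odds ratios in the usage note are hypothetical placeholders chosen for the example, not values from this document, and the two-letter genotype encoding and the function name `polygenic_risk_score` are likewise assumptions.

```python
import math

# Risk alleles as listed in the description.
RISK_ALLELES = {
    "rs3859093": "C",
    "rs34474737": "G",
    "rs10893": "G",
    "rs2070666": "A",
    "rs7087728": "A",
    "rs55907967": "G",
    "rs17846713": "G",
}


def polygenic_risk_score(genotypes: dict[str, str], odds_ratios: dict[str, float]) -> float:
    """Weighted sum over the seven SNPs: sum of ln(OR_i) * risk-allele count_i.

    Assumptions: genotypes are two-letter strings (e.g. "CT") and
    odds_ratios maps each rs number to a positive odds ratio.
    """
    score = 0.0
    for rsid, risk_allele in RISK_ALLELES.items():
        beta = math.log(odds_ratios[rsid])                   # effect of variant i
        dosage = genotypes[rsid].upper().count(risk_allele)  # 0, 1 or 2 risk alleles
        score += beta * dosage
    return score
```

For example, with a hypothetical odds ratio of 2.0 for every SNP, a subject who is CC at rs3859093 and carries no other risk allele would score 2 × ln(2), roughly 1.39.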

[0082] Following step b), method II of the invention comprises a step c) of comparing the risk score from step b) with a threshold value, wherein a liver disease risk score value greater than the threshold value indicates that the subject has a higher risk of developing liver disease.

[0083] In the present invention, the term “threshold value” refers to a predetermined value that is statistically predictive of the risk of developing liver disease. The threshold value can be predetermined by comparing data obtained from a subject or group of subjects who do not develop liver disease with data from a subject or group of subjects who do develop liver disease, thus establishing a threshold value from data of already classified subjects that allows differentiation between them. In this comparison, the threshold value can be obtained from analysis using statistical methodologies known to those skilled in the art, such as Receiver Operating Characteristic (ROC) curves, which determine a threshold value that allows for prediction or diagnosis with adequate sensitivity and specificity. In the present invention, the terms “threshold value”, “reference limit” and “cut-off reference value” are equivalent and may be used interchangeably.
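One common way to derive a cut-off from a ROC analysis is to pick the score that maximizes Youden's J statistic (sensitivity + specificity − 1). The document does not state which criterion is used, so the choice of Youden's J in the following Python sketch is an assumption, as are the function name `youden_threshold` and the toy data in the usage note.

```python
def youden_threshold(scores: list[float], labels: list[int]) -> tuple[float, float]:
    """Return (cut-off, J) maximizing sensitivity + specificity - 1.

    A subject is called "higher risk" when score > cut-off, matching the
    comparison described for step c). labels: 1 = developed liver disease,
    0 = did not. Ties are resolved in favour of the lowest cut-off.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are needed to choose a threshold")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        # True positives: diseased subjects scoring above the cut-off.
        tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
        # True negatives: non-diseased subjects scoring at or below it.
        tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

On perfectly separated toy data such as scores `[0.1, 0.4, 0.5, 0.9, 1.2, 1.5]` with labels `[0, 0, 0, 1, 1, 1]`, the function selects the highest score among the non-diseased subjects (0.5) as the cut-off. Real cohorts overlap, so the chosen cut-off trades sensitivity against specificity.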

[0084] In the context of the invention, the term “higher risk of developing liver disease” or “increased risk of developing liver disease” indicates that the subject is expected, i.e., predicted, to develop liver disease, or is at high risk of developing it. Preferably, the term “higher,” in the context of this application, refers to a risk greater than the average risk for a heterogeneous population. Furthermore, in Method II for determining the risk of developing liver disease of the present invention, the risk of developing liver disease can be determined based on clinical data in addition to the detection of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967, and rs17846713.

[0085] Thus, in another particular embodiment of Method II of the invention, alone or in combination with each of the preceding particular embodiments, step a) further comprises determining at least one clinical data point of the subject. More particularly, it comprises determining the clinical data point selected from age, sex, body mass index, or whether the subject is a smoker.

[0086] When step a) of method II of the invention further comprises determining at least one clinical data point or set of clinical data of the subject, the liver disease risk score in step b) is calculated based on both the polymorphism data and the clinical data.

[0087] Thus, preferably in step b) of method II of the invention, the liver disease risk score is calculated using a PRS model with the data from step a), i.e., with the data from the detection of the polymorphisms rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713, together with at least one clinical data point; more preferably with the clinical data of age, sex, body mass index and whether the subject is a smoker.

[0088] In another particular embodiment, step b) of method II of the invention is calculated from the data of step a), i.e., with the data from the detection of the polymorphisms rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713, together with the clinical data of age, sex, body mass index and whether the subject is a smoker; preferably, the calculation of step b) is carried out by means of a logistic regression model according to formula (II):

[0089] (II) $Y_i = \beta_0 + \beta_1 \cdot \mathrm{PRS}_i + \beta_2 \cdot \mathrm{age}_i + \beta_3 \cdot \mathrm{sex}_i + \beta_4 \cdot \mathrm{BMI}_i + \beta_5 \cdot \mathrm{tobacco}_i + \varepsilon_i$

[0090] where,

[0091] Y: liver disease risk score,

[0092] PRS: polygenic risk score calculated according to formula (I); sex: value assigned according to whether the subject is male (“1”) or female (“2”),

[0093] age: numerical value of the subject's age,

[0094] BMI: body mass index,

[0095] tobacco: value assigned according to whether the subject is a smoker (“1”) or not (“0”),

[0096] $\beta_0$: intercept,

[0097] $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5$: values of the factors that multiply each of the aforementioned variables, where these factors indicate the effect of each variable on the risk of liver disease. The $\beta$ values can be calculated through routine practice by experts in the field, using data from subjects who have or will develop liver disease.

[0098] $\varepsilon_i$: error,

[0099] i: subject.

[0100] In another particular embodiment of method II of the invention, alone or in combination with each of the preceding particular embodiments, the biological sample is selected from the list consisting of tissue, blood, serum and saliva; preferably where the tissue is skin.

[0101] Thus, in another particular embodiment of Method II of the invention, alone or in combination with any of the preceding particular embodiments, the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis. In a more particular embodiment, the liver disease is steatosis.

[0102] In another particular embodiment of method II of the invention, alone or in combination with each of the preceding particular embodiments, the subject suffers from Alpha-1 Antitrypsin deficiency.

[0103] In another aspect, the present invention relates to a computer-implemented method for determining the risk of developing liver disease in a subject, hereinafter referred to as the “computer-implemented method of the invention”, wherein said method comprises the following steps:

[0104] a) receiving or accessing, by means of a processor, the single nucleotide polymorphism (SNP) data detected in an isolated biological sample of the subject, wherein the SNPs are rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713, and carrying out steps b) and c) of method II of the invention from the data of step a), wherein a liver disease risk score value greater than the threshold value indicates that the subject has a higher risk of developing liver disease.
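As an illustration of the data-receiving and thresholding steps described above, the following minimal Python sketch checks that genotype data for the seven SNPs are present and compares a risk score with a threshold. The function names, the encoding of each genotype as a risk-allele count (0, 1 or 2) and the threshold values are assumptions made for this example; the description does not fix a data format or a threshold value.

```python
# The seven SNPs named in the description.
REQUIRED_SNPS = (
    "rs3859093", "rs34474737", "rs10893", "rs2070666",
    "rs7087728", "rs55907967", "rs17846713",
)


def validate_genotypes(genotypes):
    """Return the seven genotypes, assuming each is a risk-allele count of 0, 1 or 2."""
    missing = [snp for snp in REQUIRED_SNPS if snp not in genotypes]
    if missing:
        raise ValueError(f"missing SNP data: {missing}")
    for snp in REQUIRED_SNPS:
        if genotypes[snp] not in (0, 1, 2):
            raise ValueError(f"{snp}: expected 0, 1 or 2 risk alleles")
    return {snp: genotypes[snp] for snp in REQUIRED_SNPS}


def has_higher_risk(risk_score, threshold):
    """Step c): a score strictly above the threshold indicates a higher risk."""
    return risk_score > threshold
```

The risk score itself (step b) would be computed separately from the validated genotypes and any clinical data.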

[0105] In a particular embodiment of the computer-implemented method of the invention, step a) further comprises receiving or accessing at least one clinical data point from the subject. More particularly, it comprises receiving or accessing the clinical data point selected from among those consisting of age, sex, body mass index, and / or whether the subject is a smoker.

[0106] In another particular embodiment of the computer-implemented method of the invention, alone or in combination with each of the above particular embodiments, the biological sample is selected from the list consisting of a tissue sample, blood, serum, and saliva; preferably where the tissue sample is skin.

[0107] In another particular embodiment of the computer-implemented method of the invention, alone or in combination with any of the preceding particular embodiments, the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis. In a more particular embodiment, the liver disease is steatosis.

[0108] In another particular embodiment of the computer-implemented method of the invention, alone or in combination with each of the above particular embodiments, the subject suffers from Alpha-1 Antitrypsin deficiency.

[0109] In another aspect, the invention relates to a computer program comprising instructions that, when the program is executed on a computer, cause the computer to carry out the steps of the computer-implemented method described in the present invention.

[0110] In another aspect, the invention relates to the in vitro use of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and / or rs17846713 to determine the risk of developing liver disease in a subject, hereinafter referred to as “use of the SNPs of the invention”. In a particular embodiment of the use of the SNPs of the invention, the subject suffers from Alpha-1 Antitrypsin deficiency.

[0111] In the present invention, the term “subject” refers to a human or non-human animal, preferably where the non-human animal is a mammal, of any age, sex, or race.

[0112] Thus, in preferred embodiments of methods I, II, and the computer-implemented method of the invention, the subject is a human or a non-human animal. In a more preferred embodiment, the subject is a human.

[0113] The present invention contemplates each of the preferred embodiments of the different aspects, either alone or in combination with the other preferred embodiments.

[0114] DESCRIPTION OF THE FIGURES

[0115] Figure 1. Allele frequencies of the selected variants comparing cases with liver disease, with lung disease and the frequency in the general population (gnomAD).

[0116] Figure 2. Distribution of cases according to the genotype for CES1 rs3859093 and according to the type of disease in ZZ patients.

[0117] Figure 3. Distribution of cases according to the genotype for LIPG rs34474737 and according to the type of disease in ZZ patients.

[0118] Figure 4. Distribution of cases according to the genotype for AOC1 rs10893 and according to the type of disease in ZZ patients.

[0119] Figure 5. Distribution of cases according to the APOC3 rs2070666 genotype and according to the type of disease in ZZ patients.

[0120] Figure 6. Distribution of cases according to the genotype for MAT1A rs7087728 and according to the type of disease in ZZ patients.

Figure 7. Distribution of cases according to the genotype for SOAT2 rs55907967 and according to the type of disease in ZZ patients.

[0121] Figure 8. Distribution of cases according to the genotype for ATP1A2 rs17846713 and according to the type of disease in ZZ patients.

[0122] EXAMPLES

[0123] The invention is illustrated by means of the following examples, which should be interpreted as merely illustrative and not limiting to the scope of the invention.

[0124] MATERIALS AND METHODS

[0125] Samples

[0126] To identify genetic modifiers in Alpha-1 Antitrypsin deficiency (AATD), the complete exome (Twist Human Core Exome Plus) was sequenced using NGS in DNA samples obtained from peripheral blood of a group of 72 patients with AATD, all with the ZZ genotype. Of these patients, 13 had liver disease (ZZ-HIG) and 59 had only lung disease (ZZ-PUL). Both groups were compared according to sex, age, body mass index (BMI), serum AAT levels, and smoking status.

[0127] Table 1. Characteristics of the cases included in the study.

[0130] Association study: After annotating the variant alleles following exome sequencing, an association analysis was performed comparing the allele frequencies of the variant alleles in each of the groups: cases with liver disease (ZZ-HIG) and cases with lung disease (ZZ-PUL). The allele frequencies of the variant alleles throughout the entire exome were compared between both groups using Fisher's exact test or logistic regression to identify significant associations.
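The allele-frequency comparison with Fisher's exact test can be sketched as follows. This is a minimal standard-library illustration, not the software used in the study: the function name and the 2x2 layout (variant and reference allele counts in cases, then in controls) are assumptions for the example, and the two-sided p-value is taken as the sum of all table probabilities no larger than that of the observed table.

```python
from math import comb


def fisher_exact_two_sided(case_alt, case_ref, control_alt, control_ref):
    """Two-sided Fisher's exact test p-value for a 2x2 table of allele counts."""
    row_cases = case_alt + case_ref
    row_controls = control_alt + control_ref
    col_alt = case_alt + control_alt
    total = row_cases + row_controls

    def table_probability(x):
        # Hypergeometric probability of x variant alleles among the cases.
        return comb(row_cases, x) * comb(row_controls, col_alt - x) / comb(total, col_alt)

    lowest = max(0, col_alt - row_controls)
    highest = min(row_cases, col_alt)
    observed = table_probability(case_alt)
    # Sum every table at least as unlikely as the observed one (small tolerance for ties).
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)
```

For example, `fisher_exact_two_sided(3, 1, 1, 3)` evaluates a small hypothetical table; real inputs would be the allele counts of one variant in the ZZ-HIG and ZZ-PUL groups.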

[0131] To estimate the effects of the SNPs, the following logistic regression model (formula III) was carried out for each of the SNPs:

[0132] (III) $Y_i = \beta_0 + \beta_1 \cdot SNP_i + \varepsilon_i$

[0133] where,

[0134] Y = risk of liver disease,

[0135] i = subject,

[0136] $\beta_0$ = intercept or constant term of the model, which represents the average value of Y when $SNP_i = 0$,

[0137] $\beta_1$ = effect of the polymorphic variant, which indicates the average change in Y for each unit change in $SNP_i$,

[0138] $\varepsilon_i$ = error that captures the variations in $Y_i$ that are not explained by $SNP_i$ or the model.

[0139] Knowing the odds ratios (ORs), $\beta$ is calculated for each of the SNPs as the natural logarithm of the OR, $\beta = \ln(OR)$.

[0140] Odds ratios (ORs) compare the odds of liver disease in the presence of the SNP with the odds of liver disease in its absence. The odds of an event are the probability of the event occurring divided by the probability of it not occurring.

[0141] If the OR = 1: there is no association between the SNP and liver disease. If the OR > 1: there is a positive association, suggesting that the SNP increases the likelihood of disease.

[0142] If the OR < 1: there is a negative association, indicating that the SNP could be a protective factor, decreasing the likelihood of liver disease. If the OR = 2: the odds of developing liver disease are twice as high in those with the SNP as in those without it.

[0143] For significant variants, Odds Ratios (OR) were calculated to estimate the magnitude of the association between each genetic variant and the disease, along with 95% Confidence Intervals (95% CI). A p-value less than 0.05 was considered statistically significant.
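The odds ratio and its 95% confidence interval can be illustrated with a 2x2 table of allele counts. The sketch below assumes counts back-calculated from the MAT1A rs7087728 allele frequencies reported in the Results (10 of 26 alleles in the 13 ZZ-HIG cases and 15 of 118 alleles in the 59 ZZ-PUL cases); these counts are not stated in the description. The Woolf (log-scale) interval is likewise an assumed method. Under these assumptions the result is approximately 4.29 (1.65 to 11.19), in line with the values reported for this variant.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Allelic odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    # Standard error of ln(OR) from the four cell counts.
    standard_error = sqrt(1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Assumed allele counts for MAT1A rs7087728 (inferred, not given in the source).
mat1a_or, mat1a_lower, mat1a_upper = odds_ratio_with_ci(10, 16, 15, 103)
```

The same function applies to any of the seven variants once its allele counts in both groups are known; a cell count of zero would need a continuity correction that this sketch does not include.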

[0144] Polygenic risk model

[0145] Of the variants significantly associated with cases of liver disease, 7 SNPs of interest were selected due to their possible relationship with the accumulation of lipids in hepatocytes (Table 2). These SNPs could serve as biomarkers to identify AATD ZZ patients at risk of developing liver disease, as opposed to cases that only develop lung disease.

[0146] Possible confounding factors, such as age, sex, BMI, or being a smoker, were taken into account for the risk estimation.

[0147] A polygenic risk model (PRS) was developed to assess the genetic predisposition of a ZZ individual to develop liver disease based on the sum of the effects of selected genetic variants. The estimated effect of each SNP is obtained by estimating the odds ratios (ORs) and beta coefficients. Each SNP is weighted by its beta coefficient, and for each individual, the risk is calculated by summing the products of the genotypes and their corresponding beta coefficients.

[0148] The polygenic risk score (PRS) is calculated by summing the effects of each variant, weighted by the presence of each variant in the individual's genome. The score was calculated for each individual by summing the individual's risk alleles (from the 7 previously selected SNPs), each weighted by the effect size of the risk allele derived from the initial association study data (see the Association Study described in the Materials and Methods section of the Examples in this description), i.e., multiplied by its specific beta association coefficient.

[0149] The PRS was calculated using formula (I):

[0150] (I) $PRS = \sum_{i=1}^{n} \beta_i \cdot genotype_i$

[0151] where $\beta_i$ is the effect of variant i, $genotype_i$ is the number of risk alleles present in that variant for the individual, and n is the number of variants included (here, 7). The effect $\beta$ of each variant can be determined using odds ratio (OR) values, calculating the natural logarithm of each OR as $\beta = \ln(OR)$.
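A minimal Python sketch of the PRS calculation follows. It weights each risk-allele count by the natural logarithm of an odds ratio and sums over the seven SNPs. The weights used here are the allelic ORs quoted for each variant in the Results; whether these exact values were the weights used by the authors is an assumption, and the example genotype is hypothetical.

```python
from math import log

# Allelic odds ratios quoted per variant in the Results (assumed here to be the PRS weights).
ALLELIC_OR = {
    "rs3859093": 4.292,
    "rs34474737": 5.076,
    "rs10893": 4.466,
    "rs2070666": 10.89,
    "rs7087728": 4.292,
    "rs55907967": 8.077,
    "rs17846713": 11.22,
}


def polygenic_risk_score(genotypes, odds_ratios=ALLELIC_OR):
    """Sum of ln(OR) times the number of risk alleles (0, 1 or 2) for each SNP."""
    return sum(log(odds_ratios[snp]) * genotypes[snp] for snp in odds_ratios)


# Hypothetical subject: heterozygous for two of the variants, no risk alleles elsewhere.
example_genotypes = {snp: 0 for snp in ALLELIC_OR}
example_genotypes["rs3859093"] = 1
example_genotypes["rs2070666"] = 1
example_prs = polygenic_risk_score(example_genotypes)
```

A subject carrying no risk alleles has a PRS of 0, and each additional risk allele adds the ln(OR) of its variant.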

[0152] Finally, a polygenic risk model of liver disease was constructed based on a logistic regression model that includes the variables: age, sex, body mass index (BMI), being a smoker or not, and the polygenic risk score (PRS), according to formula (II):

[0153] (II) $Y_i = \beta_0 + \beta_1 \cdot PRS_i + \beta_2 \cdot age_i + \beta_3 \cdot sex_i + \beta_4 \cdot BMI_i + \beta_5 \cdot tobacco_i + \varepsilon_i$

[0154] where,

[0155] Y: liver disease risk score,

[0156] PRS: polygenic risk score calculated according to formula (I),

Sex: value assigned according to whether the subject is male (“1”) or female (“2”),

[0157] Age: numerical value of the subject's age,

[0158] BMI: body mass index,

[0159] tobacco: value assigned according to whether the subject is a smoker (“1”) or not (“0”),

[0160] $\beta_0$ = intercept,

[0161] $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5$: coefficients that multiply each of the aforementioned variables and indicate the effect of each variable on the risk of liver disease. The $\beta$ values can be calculated through routine practice by experts in the field, using data from subjects who have or will develop liver disease.

[0162] $\varepsilon_i$ = error.
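The logistic regression model of formula (II) can be sketched in Python as shown below. The coefficients are the beta values reported in Table 12 of this description; the intercept is not reported, so the value 0.0 is a placeholder assumption, and the example subject is hypothetical. The conversion to a probability is the standard logistic transform, which the description does not spell out.

```python
from math import exp

# Beta values as reported in Table 12; the intercept is NOT reported (0.0 is a placeholder).
BETA = {
    "intercept": 0.0,
    "prs": 0.971,
    "age": -0.041,
    "sex": -0.821,
    "bmi": -0.094,
    "tobacco": -1.661,
}


def risk_score(prs, age, sex, bmi, tobacco, beta=BETA):
    """Linear predictor of formula (II), without the error term.

    sex: 1 = male, 2 = female; tobacco: 1 = smoker, 0 = non-smoker.
    """
    return (
        beta["intercept"]
        + beta["prs"] * prs
        + beta["age"] * age
        + beta["sex"] * sex
        + beta["bmi"] * bmi
        + beta["tobacco"] * tobacco
    )


def to_probability(score):
    """Standard logistic transform of the linear predictor."""
    return 1.0 / (1.0 + exp(-score))


# Hypothetical subject: PRS 2.0, 50 years old, male, BMI 25, non-smoker.
example_score = risk_score(prs=2.0, age=50, sex=1, bmi=25.0, tobacco=0)
```

Because the intercept is a placeholder, the absolute value of `example_score` is not meaningful; only differences between subjects reflect the reported coefficients.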

[0163] To evaluate the predictive capacity of the model, a performance analysis was carried out, obtaining indicators of the quality of the model fit such as the pseudo-R², the ROC curve analysis to evaluate the ability of the PRS to discriminate between cases and controls, the Area Under the Curve (AUC) to quantify the accuracy of the PRS, as well as the sensitivity, the specificity, and the percentages of false positives and false negatives.

[0164] RESULTS

[0165] Genetic variants associated with liver disease

[0166] Specific genetic markers, or SNPs, associated with the development of liver disease have been identified in patients with AATD. AATD is most frequently associated with the development of lung disease, primarily COPD (chronic obstructive pulmonary disease), and less frequently with liver disease.

[0167] Comparing the two groups of patients with AATD with ZZ genotype, one with pulmonary disease (ZZ-PUL, 59 cases) and the other with hepatic disease (ZZ-HIG, 13 cases), variants significantly associated with hepatic disease have been identified. Seven variants of interest have been selected due to their relationship with different aspects of lipid metabolism (Table 2).

[0168] Table 2. List of genetic variants selected for polygenic risk estimation.

[0175] The selected variants have a higher allelic frequency in cases with liver disease compared to cases with lung disease and to the frequency in the general population (gnomAD) (Figure 1).

[0176] The genetic factors that may influence the development of liver disease in cases of AATD (ZZ genotype) are currently unknown. Identifying these factors could help pinpoint patients with a high probability of developing liver abnormalities. These patients could then benefit from better disease management, allowing for earlier control of liver disease progression.

[0177] CES1, Carboxylesterase 1, rs3859093

[0178] CES1, carboxylesterase 1, is encoded in humans by the CES1 gene. CES1 is a key enzyme in the detoxification of xenobiotics and in the activation of ester and amide prodrugs. It hydrolyzes aromatic and aliphatic esters and exhibits fatty acid ethyl ester synthase activity. Furthermore, it converts monoacylglycerols into free fatty acids and glycerol, and acts on 2-arachidonoylglycerol and prostaglandins.

[0179] The variant rs3859093 in CES1 has been found to be associated with an increased risk of developing liver disease in patients with Alpha-1 Antitrypsin Deficiency.

[0180] The rs3859093 variant, when considered individually, was significantly associated with a risk of liver disease (OR: 4.292; 95% confidence interval, CI: 1.647–11.19). 76% of the analyzed cases with liver disease had a C / T genotype, while only 23% of the cases with lung disease had this C / T genotype (Figure 2 and Table 3).

[0181] The CES1 rs3859093 variant has a population allele frequency for the minor allele of 0.278 (27.8%). The worldwide prevalence of non-alcoholic fatty liver disease is approximately 25%.

[0182] Table 3. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for CES1 rs3859093 and according to the type of disease in ZZ patients.

[0183] CES1 rs3859093

[0184] Sensitivity 0.76

[0185] Specificity 0.76

[0186] Positive predictive value 0.46

[0187] Negative predictive value 0.92

[0188] % of false negatives 23.53

[0189] % of false positives 23.81

[0190] Accuracy 0.76

[0191] LIPG, Lipase G, Endothelial, rs34474737

[0192] Endothelial lipase (LIPG) is a vascular lipase synthesized in endothelial cells. LIPG hydrolyzes phospholipids from VLDL, chylomicrons, and HDL; it also exhibits triglyceride lipase activity.

[0193] The variant rs34474737 in the LIPG gene has been found to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 5.076, 95% CI: 2.055–12.54). The allele frequency of variant rs34474737 in the general population is 0.293 (29.3%), very similar to that found in ZZ samples with lung disease (27.1%). However, the allele frequency in cases with liver disease is 0.653 (65.3%) (Figure 3 and Table 4).

[0194] Table 4. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for LIPG rs34474737 and according to the type of disease in ZZ patients.

[0195] LIPG rs34474737

[0196] Sensitivity 1.00

[0197] Specificity 0.56

[0198] Positive predictive value 0.33

[0199] Negative predictive value 1.00

[0200] % of false negatives 0.00

[0201] % of false positives 44.07

[0202] Accuracy 0.64

[0203] AOC1, Copper-containing amine oxidase 1, rs10893

[0204] AOC1, also known as histaminase, is a key enzyme in the degradation of biogenic amines, primarily histamine, which regulates allergic and inflammatory responses. It also breaks down other polyamines such as putrescine, participating in the control of cell growth and differentiation. It is found in various tissues, including the gastrointestinal tract, kidneys, and liver, where it helps prevent histamine intolerance and maintain biogenic amine homeostasis, thus contributing to detoxification and the balance of the immune system and other biological systems in the body.

[0205] The variant rs10893 in the AOC1 gene appears to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 4.466, 95% CI: 1.79 - 11.14).

[0206] The allele frequency in the general population for the variant rs10893 is 0.375 (37.5%). The allele frequency found in ZZ samples with pulmonary disease is 16.1%. However, the allele frequency in cases with hepatic disease is 0.461 (46.1%) (Figure 4 and Table 5).

Table 5. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for AOC1 rs10893 and according to the type of disease in ZZ patients.

[0207] AOC1 rs10893

[0208] Sensitivity 0.69

[0209] Specificity 0.69

[0210] Positive predictive value 0.33

[0211] Negative predictive value 0.91

[0212] % of false negatives 30.77

[0213] % of false positives 30.51

[0214] Accuracy 0.69
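The indicators listed in these per-variant tables all derive from a 2x2 classification table. The following sketch reproduces the AOC1 rs10893 values, assuming counts back-calculated from the reported percentages and group sizes (9 true positives and 4 false negatives among the 13 ZZ-HIG cases; 18 false positives and 41 true negatives among the 59 ZZ-PUL cases). These counts are inferred for illustration and are not stated in the description.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic indicators from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive_predictive_value": tp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        "false_negative_pct": 100 * fn / (tp + fn),
        "false_positive_pct": 100 * fp / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed counts for AOC1 rs10893 (inferred from the reported percentages).
aoc1_metrics = diagnostic_metrics(tp=9, fn=4, fp=18, tn=41)
```

Rounded to two decimals, these assumed counts give the sensitivity, specificity, predictive values, error percentages and accuracy listed for AOC1 rs10893.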

[0215] APOC3, Apolipoprotein C3, rs2070666

[0216] The APOC3 gene encodes a protein component of triglyceride-rich lipoproteins, including very low-density lipoproteins (VLDL), high-density lipoproteins (HDL), and chylomicrons. It plays a crucial role in the metabolism of these lipoproteins. It has been shown to promote VLDL secretion, inhibit lipoprotein lipase enzyme activity, and delay triglyceride catabolism. Mutations in this gene are associated with low plasma triglyceride levels and a reduced risk of ischemic cardiovascular disease and hyperalphalipoproteinemia, which is characterized by elevated levels of high-density lipoproteins (HDL) and HDL cholesterol.

[0217] The variant rs2070666 in the APOC3 gene has been found to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 10.89, 95% CI: 3.226 - 36.75).

[0218] The allele frequency in the general population for the rs2070666 variant is 0.17 (17%). The allele frequency found in ZZ samples with pulmonary disease is 5.7%. However, the allele frequency in cases with hepatic disease is 0.4 (40%) (Figure 5 and Table 6).

[0219] Table 6. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for APOC3 rs2070666 and according to the type of disease in ZZ patients.

APOC3 rs2070666

[0220] Sensitivity 0.46

[0221] Specificity 0.90

[0222] Positive predictive value 0.50

[0223] Negative predictive value 0.88

[0224] % of false negatives 53.85

[0225] % of false positives 10.17

[0226] Accuracy 0.82

[0227] MAT1A, Methionine Adenosyltransferase 1A, rs7087728

[0228] The MAT1A gene encodes the enzyme Methionine Adenosyltransferase 1A. MAT1A catalyzes the formation of S-adenosylmethionine from methionine and ATP. The reaction consists of two steps catalyzed by the same enzyme: formation of S-adenosylmethionine and triphosphate, and subsequent hydrolysis of the triphosphate. S-adenosylmethionine is the source of methyl groups for most biological methylations. These methylation reactions are essential for the regulation of gene expression, and the synthesis and modification of proteins and nucleic acids.

[0229] The rs7087728 variant in the MAT1A gene has been found to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 4.292, 95% CI: 1.647 - 11.19).

[0230] The allele frequency in the general population for variant rs7087728 is 0.2007 (20%). The allele frequency found in ZZ samples with pulmonary disease is 12.7%. However, the allele frequency in cases with hepatic disease is 0.3846 (38.4%) (Figure 6 and Table 7).

[0231] Table 7. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for MAT1A rs7087728 and according to the type of disease in ZZ patients.

[0232] MAT1A rs7087728

[0233] Sensitivity 0.69

[0234] Specificity 0.76

[0235] Positive predictive value 0.39

[0236] Negative predictive value 0.92

[0237] % of false negatives 30.77

% of false positives 23.73

[0238] Accuracy 0.75

[0239] SOAT2, Sterol O-Acyltransferase 2, rs55907967

[0240] The SOAT2 gene encodes a member of a small family of acyl-coenzyme A: cholesterol acyltransferases. It is located in the endoplasmic reticulum membrane where it produces intracellular cholesterol esters from long-chain fatty acids and cholesterol, which are stored within the cell as lipid vesicles. SOAT2 is involved in cholesterol absorption in the intestine and in the assembly and secretion of apolipoprotein B-containing lipoproteins, such as very low-density lipoproteins (VLDL). A proper balance between free and esterified cholesterol is crucial for maintaining lipid homeostasis.

[0241] The variant rs55907967 in the SOAT2 gene has been found to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 8.077, 95% CI: 2.224 - 29.34).

[0242] The allele frequency in the general population for variant rs55907967 is 0.211 (21%). The allele frequency found in ZZ samples with pulmonary disease is 6.2%. However, the allele frequency in cases with hepatic disease is 0.35 (35%) (Figure 7 and Table 8).

[0243] Table 8. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for SOAT2 rs55907967 and according to the type of disease in ZZ patients.

[0244] SOAT2 rs55907967

[0245] Sensitivity 0.54

[0246] Specificity 0.90

[0247] Positive predictive value 0.54

[0248] Negative predictive value 0.90

[0249] % of false negatives 46.15

[0250] % of false positives 10.17

[0251] Accuracy 0.83

[0252] ATP1A2, Na+ / K+ ATPase Transporter Alpha Subunit 2, rs17846713

The protein encoded by the ATP1A2 gene belongs to the cation-transporting ATPases family, specifically the Na+ / K+ ATPases subfamily. ATP1A2 is an integral membrane protein responsible for establishing and maintaining electrochemical gradients of Na+ and K+ ions across the plasma membrane. These gradients are essential for osmoregulation, for sodium-coupled transport of a variety of organic and inorganic molecules, and for the electrical excitability of nerves and muscles. This enzyme is composed of two subunits: a large catalytic (alpha) subunit and a smaller glycoprotein (beta) subunit.

[0253] The rs17846713 variant in the ATP1A2 gene has been found to be significantly associated with ZZ cases with liver disease compared to cases with lung disease (OR: 11.22, 95% CI: 2.428 - 51.86).

[0254] The allele frequency in the general population for variant rs17846713 is 0.0324 (3.2%). The allele frequency found in ZZ samples with pulmonary disease is 2.8%. However, the allele frequency in cases with hepatic disease is 0.25 (25%) (Figure 8 and Table 9).

[0255] Table 9. Evaluation of the predictive capacity associated with the distribution of cases according to the genotype for ATP1A2 rs17846713 and according to the type of disease in ZZ patients.

[0256] ATP1A2 rs17846713

[0257] Sensitivity 0.31

[0258] Specificity 0.95

[0259] Positive predictive value 0.57

[0260] Negative predictive value 0.86

[0261] % of false negatives 69.23

[0262] % of false positives 5.08

[0263] Accuracy 0.83

[0264] Contribution of each individual SNP to the risk of liver disease associated with AATD.

[0265] To estimate the risk of developing liver disease in individuals with AATD, the odds ratio (OR) was calculated individually for each of the genetic variants analyzed. The OR compares the probabilities of developing the disease between those with the genetic variant and those without it.

[0266] A log-additive model was applied, which assumes that each additional copy of the risk variant (risk allele) has a multiplicative effect on the risk of developing the disease.

[0267] Table 10. Log-additive model to evaluate the contribution of each individual SNP to the risk of liver disease associated with AATD.

[0268] Individual SNP (log-additive) | N | OR¹ | 95% CI¹ | p-value
CES1 rs3859093 | 72 | 9.78 | 2.60 - 48.1 | 0.002
LIPG rs34474737 | 72 | 5.01 | 1.98 - 15.1 | 0.002
AOC1 rs10893 | 72 | 4.69 | 1.76 - 14.8 | 0.003
APOC3 rs2070666 | 72 | 6.76 | 2.10 - 26.1 | 0.003
MAT1A rs7087728 | 72 | 5.12 | 1.74 - 17.6 | 0.005
SOAT2 rs55907967 | 72 | 10.3 | 2.66 - 43.6 | <0.001
ATP1A2 rs17846713 | 72 | 7.27 | 1.69 - 41.1 | 0.013
¹ OR = Odds Ratio, CI = Confidence Interval

[0269] All selected SNPs contribute significantly to the risk of having AATD-associated liver disease (Table 10).
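Under a log-additive model, the odds ratio for a genotype is the per-allele odds ratio raised to the number of risk alleles carried, which is the same as adding ln(OR) once per allele on the log-odds scale. The short sketch below illustrates this with the per-allele OR reported for CES1 rs3859093 in Table 10; the function name is an assumption for the example.

```python
from math import exp, log


def genotype_odds_ratio(per_allele_or, risk_allele_count):
    """Odds ratio versus zero risk alleles under a log-additive model."""
    # Additive on the log-odds scale, hence multiplicative on the odds scale.
    return exp(risk_allele_count * log(per_allele_or))


ces1_one_copy = genotype_odds_ratio(9.78, 1)
ces1_two_copies = genotype_odds_ratio(9.78, 2)
```

With zero risk alleles the function returns 1, the reference level, and with two copies it returns the square of the per-allele OR.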

[0270] Polygenic risk model for liver disease associated with AAT deficiency

[0271] The PRS model includes the 7 selected SNPs (rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713).

[0272] To establish the PRS model, clinical variables of the patients that could be confounding factors and influence the development of liver and / or lung disease were taken into account. Specifically, the following were considered for this model: Sex, Age, BMI, serum AAT level, and smoking status.

[0273] First, the contribution of each variable was estimated individually (Table 11). Age, BMI, and smoking status all significantly contributed to the difference between liver and lung disease. However, the highest odds ratio (OR) was obtained in the PRS genetic model (Table 11), indicating that the selected genetic variants contributed more significantly to the development of liver disease.

[0274] Table 11. Individual association of each clinical variable and of the PRS, with the development of liver disease.

[0275] Variable | N | OR¹ | 95% CI¹ | p-value
SEX | | | |
Male | 39 | | |
Female | 32 | 0.48 | 0.12 - 1.64 | 0.3
AGE | 71 | 0.93 | 0.89 - 0.96 | <0.001
BMI | 71 | 0.82 | 0.69 - 0.95 | 0.012
AAT LEVEL | 71 | 1.00 | 0.95 - 1.06 | 0.9
SMOKER² | | | |
0 | 24 | | |
1 | 47 | 0.10 | 0.02 - 0.36 | 0.001
PRS | 72 | 2.40 | 1.63 - 4.50 | <0.001
ps new³ (Validation) | 72 | 2.13 | 1.54 - 3.44 | <0.001
¹ OR = Odds Ratio, CI = Confidence Interval
² SMOKER: 0 = Non-Smoker; 1 = Smoker or Ex-Smoker
³ ps new: PRS model with leave-one-out validation

[0282] Multivariable logistic regression analysis taking into account clinical variables together with the genetic PRS shows a significant contribution of the SNPs included in the PRS to the development of liver disease (Table 12).

[0283] Table 12. Multivariable model taking into account clinical and PRS variables.

[0284] Variable | N | beta | OR¹ | 95% CI¹ | p-value
SEX | | | | |
Male | 39 | | | |
Female | 32 | -0.821 | 0.44 | 0.01 - 9.53 | 0.6
AGE | 71 | -0.041 | 0.96 | 0.87 - 1.06 | 0.4
BMI | 71 | -0.094 | 0.91 | 0.59 - 1.31 | 0.7
SMOKER | | | | |
No | 24 | | | |
Yes | 47 | -1.661 | 0.19 | 0.01 - 5.93 | 0.3
PRS | 71 | 0.971 | 2.64 | 1.57 - 7.04 | 0.008
¹ OR = Odds Ratio, CI = Confidence Interval
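In a logistic regression, each odds ratio is the exponential of the corresponding beta coefficient, which is the inverse of the relation beta = ln(OR) used for the PRS weights. The sketch below applies this to the beta and OR pairs reported in Table 12; the dictionary layout and names are assumptions for the example.

```python
from math import exp

# (beta, reported OR) pairs as given in Table 12.
TABLE_12 = {
    "sex (female)": (-0.821, 0.44),
    "age": (-0.041, 0.96),
    "bmi": (-0.094, 0.91),
    "smoker (yes)": (-1.661, 0.19),
    "prs": (0.971, 2.64),
}


def odds_ratio_from_beta(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return exp(beta)


recomputed = {name: odds_ratio_from_beta(beta) for name, (beta, _) in TABLE_12.items()}
```

Each recomputed value agrees with the reported OR to two decimal places, which is a quick consistency check on the table.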

[0288] To avoid potential overfitting induced by the assessment of the relationship between PRS and liver disease, a leave-one-out cross-validation was performed, a technique for evaluating predictive models that is particularly useful when a small dataset is available (Table 13). An alternative PRS was calculated for each participant based on the regression coefficients for each SNP estimated from the other participants. This procedure was repeated sequentially for all participants, and the results were combined across the entire genetic sample to obtain a nearly unbiased estimate of the expected association in an independent sample from the same population.

[0289] Table 13. Multivariable model taking into account clinical and PRS variables, performing a leave-one-out validation.

[0290] Variable | N | beta | OR¹ | 95% CI¹ | p-value

[0291]

[0292] BMI | 71 | -0.105 | 0.90 | 0.60 - 1.24 | 0.6
SMOKER | | | | |
No | 24 | | | |
Yes | 47 | -1.561 | 0.21 | 0.01 - 4.48 | 0.3
PRS | 71 | 0.806 | 2.24 | 1.49 - 4.58 | 0.003
¹ OR = Odds Ratio, CI = Confidence Interval
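The leave-one-out procedure can be sketched as follows: for each participant, the SNP weights are re-estimated from all other participants and then applied to the held-out participant's genotypes. This is an illustrative simplification, not the authors' code. It uses the allelic log odds ratio as a stand-in for the per-SNP regression coefficient, adds a 0.5 continuity correction to avoid empty cells (both assumptions), and the function names are hypothetical.

```python
from math import log


def allelic_log_or(genotypes, labels):
    """ln(OR) of the risk allele, cases (label 1) versus controls (label 0)."""
    case_alt = sum(g for g, y in zip(genotypes, labels) if y == 1)
    case_ref = sum(2 - g for g, y in zip(genotypes, labels) if y == 1)
    control_alt = sum(g for g, y in zip(genotypes, labels) if y == 0)
    control_ref = sum(2 - g for g, y in zip(genotypes, labels) if y == 0)
    # 0.5 is added to every cell so that the ratio is defined when a cell is empty.
    a, b, c, d = (x + 0.5 for x in (case_alt, case_ref, control_alt, control_ref))
    return log((a * d) / (b * c))


def leave_one_out_prs(genotype_matrix, labels):
    """PRS of each subject using weights estimated without that subject."""
    n_snps = len(genotype_matrix[0])
    scores = []
    for held_out in range(len(labels)):
        train_rows = [row for i, row in enumerate(genotype_matrix) if i != held_out]
        train_labels = [y for i, y in enumerate(labels) if i != held_out]
        weights = [
            allelic_log_or([row[j] for row in train_rows], train_labels)
            for j in range(n_snps)
        ]
        scores.append(sum(w * g for w, g in zip(weights, genotype_matrix[held_out])))
    return scores
```

Here `genotype_matrix` holds one row per subject with the risk-allele count (0, 1 or 2) for each SNP; the resulting scores would then enter the multivariable model in place of the original PRS.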

[0294] The performance analysis of the developed predictive model, including indicators of robustness, accuracy, and model performance, is as follows (Table 14):

[0295] Table 14. Comparison of the PRS Model and PRS Model (leave-one-out validation).

[0296] Indicator | PRS Model | PRS Model (leave-one-out validation)
pseudo.rsq | 0.72 | 0.67
AIC | 31.10 | 34.44
AUC | 0.97 | 0.96
Sensitivity | 0.95 (0.86 - 0.99) | 0.88 (0.77 - 0.95)
Specificity | 0.92 (0.64 - 1.00) | 0.92 (0.64 - 1.00)
Diagnostic Accuracy | 0.94 (0.86 - 0.98) | 0.89 (0.79 - 0.95)
Positive Predictive Value | 0.98 (0.90 - 1.00) | 0.98 (0.90 - 1.00)
Negative Predictive Value | 0.80 (0.52 - 0.96) | 0.63 (0.38 - 0.84)

[0299] Both our multivariable PRS model and its leave-one-out validation yield indicators of good predictive capacity (Table 14).

[0300] Pseudo-R²

[0301] The pseudo-R² is a measure of the goodness of fit of logistic regression predictive models. It indicates the proportion of variability explained by the model. In our model it is 0.72 (0.67 for the validated model), indicating that the model explains a substantial portion of the variability observed in the data.
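Several definitions of pseudo-R² exist and the description does not state which one was used. As one possibility, the sketch below computes McFadden's pseudo-R², which compares the log-likelihood of the fitted model with that of a null model predicting the overall case rate for everyone. The choice of McFadden's definition, the function names and the toy data are assumptions for illustration.

```python
from math import log


def log_likelihood(labels, probabilities):
    """Bernoulli log-likelihood of binary labels given predicted probabilities."""
    return sum(log(p) if y == 1 else log(1.0 - p) for y, p in zip(labels, probabilities))


def mcfadden_pseudo_r2(labels, probabilities):
    """1 - LL(model) / LL(null), where the null model predicts the base rate."""
    base_rate = sum(labels) / len(labels)
    ll_null = log_likelihood(labels, [base_rate] * len(labels))
    ll_model = log_likelihood(labels, probabilities)
    return 1.0 - ll_model / ll_null


# Toy data: four subjects with predicted probabilities of liver disease.
toy_r2 = mcfadden_pseudo_r2([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.3])
```

Values close to 1 mean the fitted model assigns much higher likelihood to the observed outcomes than the null model does; a value of 0 means no improvement.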

[0302] AUC (Area Under the Curve)

[0303] The AUC is a measure of the performance of a binary classification model. It refers to the area under the ROC (Receiver Operating Characteristic) curve, which represents the true positive rate (sensitivity) versus the false positive rate (1 - specificity) at various classification thresholds.

[0304] The area under the curve (AUC) of our model is 0.97 (0.96 with leave-one-out validation; Table 14), which indicates a very high predictive capacity and therefore a good ability to distinguish individuals with a high genetic risk of developing liver disease from those without it.
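The AUC can also be read as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The following sketch computes it directly from that pairwise definition, counting ties as one half; the function name and the toy data are assumptions for illustration, and for large datasets a rank-based implementation would be more efficient.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case/control pairs in which the case scores higher."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one control")
    # A tie between a case and a control counts as half a concordant pair.
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return concordant / (len(positives) * len(negatives))


toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls.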

Claims

1. An in vitro method for determining the risk of developing liver disease in a subject, wherein said method comprises the following steps: a) detecting the presence of a single nucleotide polymorphism (SNP) in a biological sample isolated from the subject, where the SNP is selected from among rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713; and b) determining the subject's risk of developing liver disease based on the results of step a).

2. Method according to claim 1, wherein step a) comprises detecting the presence of the SNP rs3859093 and at least one of the SNPs selected from rs34474737, rs10893, rs2070666, rs7087728, rs55907967 or rs17846713.

3. Method according to claim 1 or 2, wherein step a) comprises detecting the presence of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713.

4. Method according to any one of claims 1 to 3, wherein step a) further comprises determining at least one clinical data point of the subject.

5. Method according to claim 4, wherein the clinical data is selected from the list consisting of age, sex, body mass index and whether the subject is a smoker.

6. Method according to claim 4 or 5, wherein step a) comprises determining the clinical data of age, sex, body mass index and whether the subject is a smoker.

7. Method according to any one of claims 1 to 6, wherein the biological sample is selected from the list consisting of tissue, blood, serum, and saliva.

8. Method according to any one of claims 1 to 7, wherein the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis.

9. Method according to claim 8, wherein the liver disease is steatosis.

10. Method according to any one of claims 1 to 9, wherein the subject suffers from Alpha-1 Antitrypsin deficiency.

11. Method according to any one of claims 1 to 10, wherein the presence of the SNP is determined by a technique selected from the list consisting of genome sequencing, exome sequencing, TaqMan assays, allele-specific primer extension, and PCR.

12. An in vitro method for determining the risk of developing liver disease in a subject, wherein said method comprises the following steps: a) detecting the presence of single nucleotide polymorphisms (SNPs) in a biological sample isolated from the subject, where the SNPs are rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713; b) calculating a liver disease risk score based on the data from step a); and c) comparing the risk score from step b) with a threshold value, where a liver disease risk score value higher than the threshold value indicates that the subject has a higher risk of developing liver disease.

13. Method according to claim 12, wherein step a) further comprises determining at least one item of clinical data of the subject.

14. Method according to claim 13, wherein the clinical data is selected from the list consisting of age, sex, body mass index and whether the subject is a smoker.

15. Method according to claim 13 or 14, wherein step a) comprises determining the clinical data of age, sex, body mass index and whether the subject is a smoker.

16. Method according to any one of claims 12 to 15, wherein the biological sample is selected from the list consisting of tissue, blood, serum, and saliva.

17. Method according to any one of claims 12 to 16, wherein the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis.

18. Method according to claim 17, wherein the liver disease is steatosis.

19. Method according to any one of claims 12 to 18, wherein the subject suffers from Alpha-1 Antitrypsin deficiency.

20. Method according to any one of claims 12 to 19, wherein the presence of the SNP is determined by a technique selected from the list consisting of genome sequencing, exome sequencing, TaqMan assays, allele-specific primer extension, and PCR.

21. A computer-implemented method for determining the risk of developing liver disease in a subject, wherein said method comprises the following steps: a) receiving or accessing, by means of a processor, the single nucleotide polymorphism (SNP) data detected in a biological sample isolated from the subject, wherein the SNPs are rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and rs17846713; and carrying out steps b) and c) of the method according to any one of claims 12 to 20, wherein a liver disease risk score value higher than the threshold value indicates that the subject has a higher risk of developing liver disease.

22. A computer program comprising instructions that, when the program is executed on a computer, cause the computer to carry out the steps of the method according to claim 21.

23. In vitro use of the SNPs rs3859093, rs34474737, rs10893, rs2070666, rs7087728, rs55907967 and / or rs17846713 to determine the risk of developing liver disease in a subject.

24. Use according to claim 23, wherein the liver disease is selected from the list consisting of steatosis, fibrosis, hepatitis, and cirrhosis.

25. Use according to claim 24, wherein the liver disease is steatosis.

26. Use according to any one of claims 23 to 25, wherein the subject suffers from Alpha-1 Antitrypsin deficiency.
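
Illustrative sketch (not part of the claims): steps b) and c) of claims 12 and 21 can be pictured as a logistic score computed from the seven SNP genotypes and then compared with a threshold. Everything numeric below is an assumption made only for illustration: the genotype coding (0, 1 or 2 copies of an allele), the weights, the intercept and the threshold are hypothetical placeholders, not values disclosed in this document, and the function names are likewise invented.

```python
import math

# The seven SNPs named in claims 12 and 21.
SNPS = (
    "rs3859093", "rs34474737", "rs10893", "rs2070666",
    "rs7087728", "rs55907967", "rs17846713",
)

# HYPOTHETICAL placeholders: the claims do not state weights,
# an intercept or a threshold. Replace with fitted model values.
PLACEHOLDER_WEIGHTS = {snp: 0.1 for snp in SNPS}
PLACEHOLDER_INTERCEPT = -1.0
PLACEHOLDER_THRESHOLD = 0.5


def risk_score(genotypes, weights=PLACEHOLDER_WEIGHTS,
               intercept=PLACEHOLDER_INTERCEPT):
    """Step b): logistic score from per-SNP allele counts.

    `genotypes` maps each rs identifier to an assumed allele
    count (0, 1 or 2); this coding is an assumption.
    """
    missing = [snp for snp in SNPS if snp not in genotypes]
    if missing:
        raise ValueError(f"missing genotype data for: {missing}")
    linear = intercept + sum(weights[snp] * genotypes[snp] for snp in SNPS)
    # Logistic (sigmoid) transform maps the linear term into (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


def is_higher_risk(score, threshold=PLACEHOLDER_THRESHOLD):
    """Step c): a score above the threshold indicates higher risk."""
    return score > threshold


if __name__ == "__main__":
    example = {snp: 1 for snp in SNPS}  # hypothetical subject
    score = risk_score(example)
    print(round(score, 3), is_higher_risk(score))
```

Clinical data such as age, sex, body mass index and smoking status (claims 13 to 15) could be added as further weighted terms in the same linear sum, again with weights that would have to come from a fitted model.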