Introduction
Acute-on-chronic liver failure (ACLF) is widely recognized as a distinct clinical syndrome characterized by acute deterioration in patients with chronic liver disease and a high risk of short-term mortality.1–4 A defining advance in the field was the recognition that ACLF is not simply severe decompensation, but a state of organ failure associated with sharp inflections in mortality risk. Early frameworks therefore linked organ-specific thresholds to clinically meaningful outcomes, enabling separation of dysfunction from failure and providing a basis for grading disease severity.1,4
However, subsequent definitions have evolved from different populations, etiologies, and conceptual models.2,5,6 As a result, multiple ACLF frameworks now coexist, each capturing overlapping but non-identical patient populations.7 While this diversity reflects biological and geographical heterogeneity,8 it has also created persistent uncertainty regarding disease definition, limiting comparability across studies and complicating the assessment of prognostic scores,9 development of targeted therapies, and regulatory pathways.
Recent efforts have therefore focused on harmonization.10,11 The Kyoto consensus acknowledged that existing definitions remain valid within their original contexts and proposed convergence through prospective international validation rather than replacement.10 In parallel, the 2025 Gastroenterology consensus on organ failures in cirrhosis proposed harmonized organ failure criteria and a unifying framework for ACLF, while explicitly recognizing that key elements, including liver failure thresholds, biological mechanisms, and organ system interactions, require further validation.11
These initiatives have brought into focus a critical tension in ACLF research.12 Broadening definitions may improve global applicability and inclusiveness but may also blur the distinction between dysfunction and failure, reduce biological specificity, and weaken the prognostic gradients that underpin clinical decision-making and trial design.12 This concern is particularly relevant for lower thresholds for liver failure based on bilirubin and international normalized ratio (INR), the inclusion of grade II encephalopathy as cerebral failure, more permissive respiratory criteria, and the conceptual reclassification of infection from a precipitating event to a form of organ dysfunction.11 Whether such changes enhance clinical utility or simply expand the diagnostic label remains uncertain and requires direct empirical evaluation.
The A-TANGO score was developed to address these challenges by recalibrating organ failure definitions against short-term mortality.4 Using large international cohorts for derivation and validation, A-TANGO defined organ-specific thresholds based on mortality inflection points, incorporated updated concepts of acute kidney injury (AKI), simplified the handling of low-grade encephalopathy, and introduced an additional severity grade to better capture heterogeneity at the highest-risk end. Importantly, A-TANGO identified a broader population while preserving prognostic performance across diverse cohorts.4
What remains unknown is how such outcome-calibrated frameworks compare with the newly proposed consensus definition.12 This comparison is critical because the value of harmonization ultimately depends not on agreement in definition alone, but on whether clinically meaningful high-risk patients are appropriately identified. Accordingly, we performed a head-to-head comparison of the consensus ACLF framework and A-TANGO across two independent cohorts from India and China. We specifically examined: (i) differences in case capture; (ii) clinical characteristics and outcomes of patients identified by each framework; and (iii) whether patients classified as non-ACLF by the consensus definition could be further stratified into distinct risk groups using A-TANGO.
Methods
Design and setting
This was a multinational observational cohort study including the TIH cohort (N = 2,398) from India and the CATCH-LIFE cohort (N = 2,568) from China. The TIH cohort included patients from PGIMER, Chandigarh, and Amrita, Kochi, between 2015 and 2025. Patients were enrolled retrospectively between 2015 and 2020 and prospectively between 2021 and 2025. Hospitalized patients with chronic liver disease aged ≥18 years, admitted non-electively with acute decompensation (AD) as per EASL criteria, were enrolled. Patients with chronic kidney disease, hepatocellular carcinoma, severe cardiopulmonary dysfunction, refractory ascites or hydrothorax, HIV infection, pregnancy, lactation, missing information, or lack of consent were excluded. All patients were managed with standard of care and intensive care support when needed and were followed up for 90 days after admission. None of the recruited patients underwent liver transplantation during the study period. Informed consent was obtained from prospectively recruited patients. A waiver of consent was granted for the retrospective cohort. The study was approved by the Institute Ethics Committee (PGI/IEC/2021/001346).
The CATCH-LIFE study was approved by the Ethics Committee of the leading center, Shanghai Jiao Tong University School of Medicine, Renji Hospital (approval numbers: (2014) 148 k for the investigation study and (2016) 142 k for the validation study). Written informed consent was obtained from all participants. The study was registered at ClinicalTrials.gov (NCT02457637 and NCT03641872). The CATCH-LIFE cohort was a prospective study divided into two parts: an investigation study (NCT02457637), which enrolled 2,600 potential ACLF patients admitted to 14 liver centers across China from January 2015 to December 2016, and a validation study (NCT03641872), which enrolled 1,370 patients admitted to 13 centers from September 2018 to January 2019. The key inclusion criteria for both studies were chronic liver disease, with or without cirrhosis, and acute liver injury or AD. All patients were followed up for 90 days. A total of 1,144 patients without cirrhosis, 226 patients without AD, and 32 patients with missing baseline data (total bilirubin, INR, and creatinine) were excluded. Finally, 2,568 patients with cirrhosis were included in the CATCH-LIFE cohort. Among them, 219 patients (8.5%) underwent liver transplantation within 90 days.
We performed a head-to-head comparative analysis of A-TANGO and the consensus framework for ACLF in the two cohorts. Both cohorts had previously been used in A-TANGO validation analyses; however, 343 new patients were added to the TIH cohort. The present study was designed as a classification and outcome comparison and not for the development of a new score.
Variables and definitions
A-TANGO ACLF was applied according to the published A-TANGO organ failure framework.4 In this framework, organ failure corresponds to subscore 3 for each organ, with thresholds derived to correspond to approximately ≥15% 28-day mortality: bilirubin ≥20 mg/dL for liver; creatinine ≥2 mg/dL and/or AKI grade 1b or renal replacement therapy for kidney; West Haven grade 3–4 encephalopathy or mechanical ventilation for cerebral failure; INR ≥2.2 for coagulation; vasopressor requirement for circulatory failure; and PaO2/FiO2 ≤225, SpO2/FiO2 ≤325, or mechanical ventilation for respiratory failure. A single liver, kidney, coagulation, or respiratory failure was sufficient to diagnose ACLF, whereas isolated brain or circulatory failure required an additional organ dysfunction to meet the ACLF threshold. A-TANGO also graded ACLF from 1 to 4, with grade 4 used to resolve heterogeneity within the highest-risk group.
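The A-TANGO rule described above can be sketched in code. This is a minimal illustration, not the published implementation: the `Patient` fields, the function names, and the `other_organ_dysfunction` flag are hypothetical, because the subscore-2 (dysfunction) thresholds and the grading rules are not reproduced in this section. The sketch also assumes that a second organ failure satisfies the "additional organ dysfunction" requirement, and it does not distinguish the indication for mechanical ventilation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Patient:
    """Hypothetical container for the variables named in the text."""
    bilirubin_mg_dl: float
    creatinine_mg_dl: float
    inr: float
    west_haven_grade: int = 0            # 0-4
    aki_grade_1b: bool = False
    renal_replacement: bool = False
    vasopressors: bool = False
    mechanical_ventilation: bool = False
    pao2_fio2: Optional[float] = None
    spo2_fio2: Optional[float] = None
    # Assumption: subscore-2 dysfunction in another organ, supplied by the
    # caller because its thresholds are not given in this section.
    other_organ_dysfunction: bool = False


def atango_failures(p: Patient) -> Dict[str, bool]:
    """Apply the subscore-3 (failure) thresholds listed in the text."""
    low_pf = p.pao2_fio2 is not None and p.pao2_fio2 <= 225
    low_sf = p.spo2_fio2 is not None and p.spo2_fio2 <= 325
    return {
        "liver": p.bilirubin_mg_dl >= 20,
        "kidney": p.creatinine_mg_dl >= 2 or p.aki_grade_1b or p.renal_replacement,
        "brain": p.west_haven_grade >= 3 or p.mechanical_ventilation,
        "coagulation": p.inr >= 2.2,
        "circulatory": p.vasopressors,
        "respiratory": low_pf or low_sf or p.mechanical_ventilation,
    }


def atango_aclf(p: Patient) -> bool:
    """Binary ACLF status; grades 1 to 4 are not modelled here."""
    f = atango_failures(p)
    # Any one of these failures is sufficient on its own.
    if f["liver"] or f["kidney"] or f["coagulation"] or f["respiratory"]:
        return True
    # Brain or circulatory failure needs an additional organ involved.
    if f["brain"] or f["circulatory"]:
        second_failure = f["brain"] and f["circulatory"]  # assumption
        return second_failure or p.other_organ_dysfunction
    return False
```

Under this sketch, a patient with bilirubin 22 mg/dL and no other abnormality is classified as ACLF, whereas isolated grade 3 encephalopathy is not unless another organ is at least dysfunctional.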
The comparator was an operationalized version of the recently proposed consensus ACLF framework based on the 2025 organ failure consensus.11 The organ domains were defined as follows: liver failure as bilirubin ≥7.5 mg/dL plus INR ≥1.5; brain failure as West Haven grade ≥2; kidney failure as stage 2 AKI or need for hemodialysis; circulatory failure as MAP <65 mmHg or vasopressor use for hypotension; and respiratory failure as PaO2/FiO2 ≤300 or SpO2/FiO2 ≤315. Immune and gastrointestinal failure were not included in the operationalized ACLF definition. Coagulation was not defined as a separate failure domain in the consensus framework. In the TIH and CATCH-LIFE cohorts, baseline creatinine was unavailable; therefore, kidney organ failure was not directly estimable. To perform the analysis, and because patients with chronic kidney disease were excluded, baseline creatinine was approximated as 1 mg/dL, and patients with a ≥2-fold increase in creatinine reaching ≥2 mg/dL or requiring renal replacement therapy were considered to have kidney failure. Consensus ACLF was defined, as per published literature, as liver failure together with at least one extrahepatic organ failure.11
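As a sketch of the operationalized consensus rule, assuming the thresholds listed above and the 1 mg/dL baseline creatinine approximation, the classification could be written as follows. Function and argument names are hypothetical, and AKI staging is reduced to the creatinine rule used in this analysis.

```python
from typing import Dict, Optional

# Approximation stated in the text: baseline creatinine taken as 1 mg/dL.
ASSUMED_BASELINE_CREATININE = 1.0


def consensus_organ_failures(
    bilirubin: float,
    inr: float,
    west_haven_grade: int,
    creatinine: float,
    rrt: bool = False,
    map_mmhg: Optional[float] = None,
    vasopressors: bool = False,
    pao2_fio2: Optional[float] = None,
    spo2_fio2: Optional[float] = None,
) -> Dict[str, bool]:
    """Organ failures under the operationalized consensus definition."""
    doubled = creatinine >= 2 * ASSUMED_BASELINE_CREATININE
    return {
        # Liver failure needs both criteria.
        "liver": bilirubin >= 7.5 and inr >= 1.5,
        "brain": west_haven_grade >= 2,
        # >=2-fold rise reaching >=2 mg/dL, or renal replacement therapy.
        "kidney": (doubled and creatinine >= 2.0) or rrt,
        "circulatory": (map_mmhg is not None and map_mmhg < 65) or vasopressors,
        "respiratory": (pao2_fio2 is not None and pao2_fio2 <= 300)
        or (spo2_fio2 is not None and spo2_fio2 <= 315),
    }


def consensus_aclf(**measurements) -> bool:
    """Consensus ACLF: liver failure plus at least one extrahepatic failure."""
    f = consensus_organ_failures(**measurements)
    extrahepatic = any(f[k] for k in ("brain", "kidney", "circulatory", "respiratory"))
    return f["liver"] and extrahepatic
```

Under this sketch, a patient with bilirubin 25 mg/dL and INR 2.5 but no extrahepatic failure is not classified as consensus ACLF, because liver failure alone does not meet the definition.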
Outcomes
The main outcomes were 28-day and 90-day transplant-free mortality, as documented during follow-up. We assessed overall case capture, overlap between definitions, baseline clinical phenotypes across reclassified groups, short-term mortality, severity cross-tabulation, and diagnostic performance measures including sensitivity, specificity, positive predictive value, and negative predictive value where applicable. Because the most clinically relevant question was whether consensus non-ACLF status concealed clinically important risk, we pre-specified a subgroup analysis splitting consensus-negative patients into A-TANGO-negative and A-TANGO-positive strata.
Statistical analysis
Continuous variables were presented as mean ± standard deviation or median with interquartile range, as appropriate, and categorical variables as counts with percentages. Categorical data were compared between groups using the Chi-square test or Fisher’s exact test, and numerical data using the Mann–Whitney U test or t-test, depending on data distribution. Normality was assessed using the Shapiro–Wilk test. Cox regression was used to compare the hazard of mortality across definitions. The net reclassification improvement (NRI) was calculated based on cross-classification of patients between the two binary definitions. Event and non-event NRI components were calculated separately to assess reclassification among individuals with and without events. All tests were two-sided, with P < 0.05 considered statistically significant. Analyses were performed using RStudio (version 1.5) and IBM SPSS (version 22).
Results
Overall cohort
The TIH cohort included 2,398 patients, with a median age of 44 years; 11.3% were female; alcohol was the main etiology of cirrhosis and precipitating event; and the median MELD score was 27.2. At 28 and 90 days of follow-up, 905 (37.7%) and 1,122 (46.8%) patients had died, respectively. The CATCH-LIFE cohort included 2,568 patients with a median age of 51 years, of whom 27% were female. The proportion of patients with hepatitis B virus (HBV) or hepatitis C virus (HCV) as the sole etiology was 39.4%. When coexisting etiologies were taken into account, the prevalence of HBV-related cases was 66.9%, while that of HCV-related cases was 4.1%. Bacterial infection was the most commonly identifiable precipitating factor (16.2%), while 40.5% of patients had no defined precipitating event. The median MELD score was 17.6. Patients who died in both cohorts had more frequent infections and hepatic encephalopathy and showed worse baseline laboratory and severity scores, including lower hemoglobin and platelet counts, higher white blood cell count (WBCC), INR, bilirubin, and creatinine, and higher MELD, MELD-sodium (MELD-Na), Chronic Liver Failure Consortium Organ Failure (CLIF-C OF), and A-TANGO-OF scores. Organ failures classified by A-TANGO-OF score were also consistently more common among non-survivors (Table 1).
Table 1. Baseline characteristics of the cohorts stratified by 28-day outcomes
| Characteristic | TIH: Total (n = 2,398) | TIH: Survived (n = 1,493, 62.3%) | TIH: Died (n = 905, 37.7%) | TIH: P-value | CATCH-LIFE: Total (n = 2,568) | CATCH-LIFE: Survived/LT (n = 2,330, 90.7%) | CATCH-LIFE: Died (n = 238, 9.3%) | CATCH-LIFE: P-value |
|---|---|---|---|---|---|---|---|---|
| Age, median (IQR) | 44.0 (36.0–52.0) | 45.0 (36.0–51.0) | 44.0 (36.0–54.0) | 0.207 | 51.4 (44–60) | 51.2 (43.8–60) | 53 (45.8–61.5) | 0.024 |
| Female sex, n/N (%) | 271 (11.3) | 147 (9.8) | 124 (13.7) | 0.005 | 693 (27) | 642 (27.6) | 51 (21.4) | 0.051 |
| Etiology of cirrhosis, n/N (%) |
| Alcohol | 1,709 (71.3) | 1,099 (73.6) | 610 (67.4) | 0.012 | 167 (6.5) | 156 (6.7) | 11 (4.6) | 0.215 |
| MASLD | 149 (6.2) | 81 (5.4) | 68 (7.5) | 0.012 | 3 (0.1) | 2 (0.1) | 1 (0.4) | 0.215 |
| Hepatitis B/C | 242 (10.1) | 153 (10.2) | 89 (9.8) | 0.012 | 1,011 (39.4) | 919 (39.4) | 92 (38.7) | 0.215 |
| Alcohol+ CMRF | 43 (1.8) | 23 (1.5) | 20 (2.2) | 0.012 | NA | | | 0.215 |
| AIH | 125 (5.2) | 66 (4.4) | 59 (6.5) | 0.012 | 45 (1.8) | 42 (1.8) | 3 (1.3) | 0.215 |
| Cryptogenic | 79 (3.3) | 46 (3.1) | 33 (3.6) | 0.012 | 99 (3.9) | 85 (3.6) | 14 (5.9) | 0.215 |
| Alcohol+ Hepatitis B/C | 48 (2.0) | 23 (1.5) | 25 (2.8) | 0.012 | 128 (5) | 112 (4.8) | 16 (6.7) | 0.215 |
| Others | 3 (0.1) | 2 (0.1) | 1 (0.1) | 0.012 | 1,115 (43.4) | 1,014 (43.5) | 101 (42.4) | 0.215 |
| Precipitating events, n/N (%) |
| AH | 854 (35.6) | 568 (38.0) | 286 (31.6) | 0.002 | 116 (4.5) | 108 (4.6) | 8 (3.4) | <0.001 |
| Infections | 380 (15.8) | 202 (13.5) | 178 (19.7) | 0.002 | 417 (16.2) | 359 (15.4) | 58 (24.4) | <0.001 |
| AH+ others | 631 (26.3) | 380 (25.5) | 251 (27.7) | 0.002 | 143 (5.6) | 126 (5.4) | 17 (7.1) | <0.001 |
| DILI | 148 (6.2) | 94 (6.3) | 54 (6.0) | 0.002 | 69 (2.7) | 61 (2.6) | 8 (3.4) | <0.001 |
| Viral hepatitis | 187 (7.8) | 122 (8.2) | 65 (7.2) | 0.002 | 135 (5.3) | 119 (5.1) | 16 (6.7) | <0.001 |
| AIH | 36 (1.5) | 24 (1.6) | 12 (1.3) | 0.002 | NA | | | <0.001 |
| UGIB/LGIB | 33 (1.4) | 19 (1.3) | 14 (1.5) | 0.002 | 266 (10.4) | 254 (10.9) | 12 (5) | <0.001 |
| Others | 129 (5.4) | 84 (5.6) | 45 (5.0) | 0.002 | 1,422 (55.4) | 1,303 (55.9) | 119 (50) | <0.001 |
| Decompensations, n/N (%) |
| Ascites | 2,301 (96.0) | 1,421 (95.2) | 880 (97.2) | 0.018 | 1,711 (66.6) | 1,530 (65.7) | 181 (76.1) | 0.002 |
| Hepatic encephalopathy | 1,514 (63.1) | 821 (55.0) | 693 (76.6) | <0.001 | 110 (4.3) | 93 (4) | 17 (7.1) | 0.034 |
| Lab investigations |
| Hemoglobin (g/dL) | 9.6 (7.8–11.2) | 10.0 (8.3–11.5) | 8.8 (7.3–10.6) | <0.001 | 10.8 (8.6–12.5) | 10.7 (8.5–12.5) | 11.1 (9.0–12.9) | 0.014 |
| WBCC (×10⁹/L) | 11.0 (7.5–16.5) | 10.0 (6.8–14.7) | 13.3 (8.9–19.8) | <0.001 | 4.9 (3.3–7.2) | 4.7 (3.2–6.9) | 7.1 (5.3–10.3) | <0.001 |
| Platelet counts (×10⁹/L) | 102.0 (66.3–159.0) | 109.0 (70.0–167.0) | 92.0 (60.0–143.0) | <0.001 | 74 (49–113) | 74 (49–113) | 80 (49.2–116) | 0.454 |
| INR | 2.0 (1.7–2.6) | 1.9 (1.6–2.4) | 2.2 (1.8–2.8) | <0.001 | 1.5 (1.3–1.9) | 1.5 (1.3–1.8) | 2.4 (1.8–2.9) | <0.001 |
| AST (U/L) | 117.0 (76.0–184.0) | 117.4 (75.0–185.0) | 116.0 (78.0–178.0) | 0.896 | 72.8 (37.7–165.3) | 68 (36–150.8) | 144.1 (66.3–352.5) | <0.001 |
| ALT (U/L) | 49.8 (32.0–91.0) | 52.0 (31.0–93.0) | 47.0 (32.0–88.0) | 0.077 | 49.4 (25–137) | 46.9 (24.4–123.7) | 90.7 (38.2–368.7) | <0.001 |
| Bilirubin (mg/dL) | 15.0 (7.1–24.0) | 13.0 (6.6–22.0) | 18.0 (8.4–26.9) | <0.001 | 4.7 (1.6–15.7) | 3.9 (1.5–13) | 18.9 (9.5–29.6) | <0.001 |
| Creatinine (mg/dL) | 1.2 (0.8–2.1) | 1.0 (0.7–1.7) | 1.5 (1.0–2.6) | <0.001 | 0.8 (0.7–1) | 0.8 (0.6–1) | 0.9 (0.7–1.4) | <0.001 |
| Sodium (mmol/L) | 132.6 (128.1–136.5) | 132.8 (128.6–136.0) | 132.0 (128.0–138.0) | 0.171 | 137.9 (134–140.2) | 138 (134.8–140.4) | 134.6 (130–138) | <0.001 |
| MAP (mmHg) | 85.3 (78.0–93.3) | 86.0 (78.7–93.3) | 84.7 (77.0–93.3) | 0.014 | 89 (82.3–94.3) | 89 (82.7–94.3) | 89 (81.7–94.6) | 0.797 |
| Disease severity scores |
| MELD score | 27.2 (23.0–32.7) | 25.9 (22.0–30.0) | 30.0 (26.0–36.0) | <0.001 | 17.6 (12.1–24.2) | 16.8 (11.7–23.1) | 28.6 (23–33.6) | <0.001 |
| MELD-Na score | 30.0 (26.0–34.2) | 28.5 (24.8–32.4) | 32.3 (28.3–37.0) | <0.001 | 19.1 (13–25.8) | 18 (12.5–24.4) | 30 (24.7–34.6) | <0.001 |
| CLIF-C OF score | 10.0 (8.0–12.0) | 9.0 (8.0–10.5) | 12.0 (10.0–13.0) | <0.001 | 7 (6–8) | 7 (6–8) | 10 (8–10) | <0.001 |
| A-TANGO-OF score | 10.0 (9.0–12.0) | 9.0 (8.0–11.0) | 12.0 (11.0–14.0) | <0.001 | 7 (6–9) | 7 (6–8) | 10 (9–11) | <0.001 |
| Organ failures as per A-TANGO criteria, n/N (%) |
| Liver failure | 877 (36.6) | 479 (32.1) | 398 (44.0) | <0.001 | 472 (18.4) | 356 (15.3) | 116 (48.7) | <0.001 |
| Kidney failure | 632 (26.4) | 289 (19.4) | 343 (37.9) | <0.001 | 90 (3.5) | 61 (2.6) | 29 (12.2) | <0.001 |
| Brain failure | 314 (13.1) | 72 (4.8) | 242 (26.7) | <0.001 | 62 (2.4) | 39 (1.7) | 23 (9.7) | <0.001 |
| Circulatory failure | 696 (29.0) | 252 (16.9) | 444 (49.1) | <0.001 | 448 (17.4) | 313 (13.4) | 135 (56.7) | <0.001 |
| Respiratory failure | 408 (17.0) | 162 (10.9) | 246 (27.2) | <0.001 | 26 (1) | 25 (1.1) | 1 (0.4) | 0.536 |
| Coagulation failure | 1,019 (42.6) | 534 (35.8) | 485 (53.7) | <0.001 | 87 (3.4) | 54 (2.3) | 33 (13.9) | <0.001 |
A-TANGO captured a much larger ACLF population than the consensus framework
In the TIH cohort, A-TANGO classified 1,899 of 2,398 patients as having ACLF (79.2%), whereas the operationalized consensus framework classified 1,014 (42.3%) as having ACLF (Fig. 1). In CATCH-LIFE, the difference was even more striking: 807 of 2,568 patients (31.4%) were classified as ACLF by A-TANGO, compared with only 150 (5.8%) by the consensus framework (Fig. 1). As expected, compared with patients without ACLF, patients classified as having ACLF by either framework were sicker, with more frequent ascites, hepatic encephalopathy, organ failures, and higher severity scores (Supplementary Tables 1 and 2).
In the TIH cohort, every patient classified as consensus ACLF was also classified as A-TANGO ACLF, but 885 additional patients (36.9%) who were classified as consensus non-ACLF were classified as ACLF by A-TANGO (Supplementary Table 3). In CATCH-LIFE, a similar pattern was observed: 139 of 150 consensus ACLF cases were also A-TANGO ACLF, while 668 additional patients (26.0%) classified as consensus non-ACLF were captured by A-TANGO as ACLF (Supplementary Table 4). The 28-day mortality was 26.9% in patients classified as non-ACLF by the consensus definition but as ACLF by A-TANGO, compared with 10.6% in those negative by both frameworks and 60.6% in those positive by both frameworks in the TIH cohort (P < 0.001). A similar pattern of mortality was observed in the CATCH-LIFE cohort (18.1% vs. 3.2% vs. 43.2%, P < 0.001).
ACLF definitions and risk of mortality
A-TANGO ACLF was strongly associated with 28-day and 90-day mortality in both cohorts. In the TIH cohort, the hazard ratio (HR) for death associated with ACLF was higher under the A-TANGO definition (HR 5.22, 95% confidence interval [CI] 3.96–6.90; P < 0.001) than under the consensus criteria (HR 3.84, 95% CI 3.33–4.41; P < 0.001). Corresponding 28-day mortality was 44.9% vs. 60.6% in the A-TANGO and consensus ACLF groups, and 90-day mortality was 54.8% vs. 69.5%, respectively. In the CATCH-LIFE cohort, both definitions showed strong associations with mortality, with A-TANGO again showing a higher hazard ratio (HR 7.72, 95% CI 5.74–10.39; P < 0.001) than the consensus definition (HR 7.01, 95% CI 5.23–9.40; P < 0.001). Corresponding 28-day mortality was 22.4% vs. 40.7%, and 90-day mortality was 37.4% vs. 54.7%, respectively (Fig. 1; Supplementary Tables 1 and 2).
When 28-day mortality was used as the outcome, A-TANGO showed substantially higher sensitivity than the consensus definition in both cohorts (Fig. 1). In TIH, sensitivity for 28-day mortality was 94.1% for A-TANGO versus 67.8% for the consensus definition; at 90 days, the corresponding values were 92.7% and 62.8%. In CATCH-LIFE, the contrast was even larger: 76.1% versus 25.6% for 28-day mortality (Fig. 1). In contrast, the consensus definition was more specific, with specificity of 73.2% versus 29.9% in TIH and 96.2% versus 73.1% in CATCH-LIFE.
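The 28-day sensitivities and specificities can be cross-checked from the counts in Tables 1 and 3. The sketch below assumes those counts (deaths among ACLF-positive patients, ACLF-positive totals, total deaths, cohort size); the function name is illustrative, and patients listed as survived or transplanted are treated as non-events.

```python
from typing import Tuple


def sensitivity_specificity(
    deaths_in_aclf: int, n_aclf: int, deaths_total: int, n_total: int
) -> Tuple[float, float]:
    """Sensitivity and specificity of an ACLF label for 28-day death."""
    true_pos = deaths_in_aclf                     # ACLF-positive, died
    false_neg = deaths_total - true_pos           # ACLF-negative, died
    false_pos = n_aclf - true_pos                 # ACLF-positive, survived
    true_neg = (n_total - deaths_total) - false_pos
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# 28-day counts taken from Tables 1 and 3:
# (deaths among ACLF, ACLF total, total deaths, cohort size)
counts = {
    ("TIH", "A-TANGO"): (852, 1899, 905, 2398),
    ("TIH", "Consensus"): (614, 1014, 905, 2398),
    ("CATCH-LIFE", "A-TANGO"): (181, 807, 238, 2568),
    ("CATCH-LIFE", "Consensus"): (61, 150, 238, 2568),
}

results = {
    key: tuple(round(100 * v, 1) for v in sensitivity_specificity(*args))
    for key, args in counts.items()
}

for (cohort, definition), (sens, spec) in results.items():
    print(f"{cohort:10s} {definition:9s} sensitivity={sens}% specificity={spec}%")
```

With these inputs the sketch reproduces the values reported above, for example 94.1% sensitivity and 29.9% specificity for A-TANGO in the TIH cohort.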
To assess whether these performance differences were consistent across diverse patient populations, we performed subgroup analyses stratified by etiology. As shown in Supplementary Table 5, the higher sensitivity of A-TANGO for predicting 28-day mortality compared with the consensus definition was consistently observed across all major etiologic subgroups in both cohorts, including alcohol-related liver disease, viral hepatitis, and autoimmune hepatitis. These findings indicate that the higher sensitivity of the A-TANGO framework is robust and not driven by a specific etiology.
Consensus negative but A-TANGO positive population represented a high-risk cohort
Further analysis revealed that patients missed by the consensus criteria but captured by A-TANGO were at high risk of mortality. In the TIH cohort, patients who were consensus-negative but A-TANGO-positive had a 28-day mortality of 26.9% and a 90-day mortality of 37.9%, compared with 10.6% and 16.4% in those negative by both frameworks (Supplementary Tables 3, 4, 6, and 7). Their risk remained lower than that of patients positive by both frameworks, who had 28-day and 90-day mortality rates of 60.6% and 69.5%, respectively. A similar gradient was observed in the CATCH-LIFE cohort: consensus-negative/A-TANGO-positive patients had 28-day and 90-day mortality of 18.1% and 33.2%, versus 3.2% and 6.7% in patients negative by both frameworks; patients positive by both frameworks had mortality rates of 43.2% and 57.6%, respectively. This reclassification translated into an NRI of 17.1% in the TIH cohort and 27.4% in the CATCH-LIFE cohort, indicating improved identification of patients at risk with A-TANGO compared with the consensus framework (Supplementary Tables 3 and 4). In both cohorts, consensus non-ACLF was therefore not a homogeneous low-risk state, with 28-day mortality increasing from 10.6% to 26.9% in TIH and from 3.2% to 18.1% in CATCH-LIFE after re-stratification using the A-TANGO framework.
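The NRI reported above can be illustrated with the standard two-category formula, which is assumed here because the exact computation is not spelled out in the text. The CATCH-LIFE 28-day counts below are derived by subtraction from Tables 1, 3, and 4 (for example, 181 − 121 = 60 deaths among double-positive patients, which leaves 1 death among the 11 consensus-positive/A-TANGO-negative patients); the function name is illustrative.

```python
from typing import Tuple


def binary_nri(
    up_events: int, down_events: int, n_events: int,
    up_nonevents: int, down_nonevents: int, n_nonevents: int,
) -> Tuple[float, float, float]:
    """Two-category NRI when moving from a reference to a new definition.

    "Up" means reference-negative but new-positive; "down" is the reverse.
    Event NRI     = P(up | event)        - P(down | event)
    Non-event NRI = P(down | non-event)  - P(up | non-event)
    """
    event_nri = (up_events - down_events) / n_events
    nonevent_nri = (down_nonevents - up_nonevents) / n_nonevents
    return event_nri, nonevent_nri, event_nri + nonevent_nri


# CATCH-LIFE, 28 days, consensus (reference) vs. A-TANGO (new definition).
deaths_total = 238                    # Table 1
n_nonevents = 2568 - deaths_total     # 2,330 survived or transplanted

up_n, up_deaths = 668, 121            # consensus-negative / A-TANGO-positive (Table 4)
down_n = 150 - 139                    # consensus-positive / A-TANGO-negative = 11
down_deaths = 61 - (181 - 121)        # consensus deaths minus double-positive deaths = 1

event_nri, nonevent_nri, total_nri = binary_nri(
    up_events=up_deaths,
    down_events=down_deaths,
    n_events=deaths_total,
    up_nonevents=up_n - up_deaths,
    down_nonevents=down_n - down_deaths,
    n_nonevents=n_nonevents,
)
print(f"event NRI={event_nri:.1%}, non-event NRI={nonevent_nri:.1%}, total={total_nri:.1%}")
```

With these inputs the event component is about +50.4%, the non-event component about −23.0%, and the total about 27.4%, in line with the CATCH-LIFE value reported above. The positive event component reflects deaths newly captured by A-TANGO, while the negative non-event component reflects survivors who are also relabeled as ACLF.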
To identify the reasons for discordance, we compared characteristics of patients classified as non-ACLF by the consensus and A-TANGO frameworks. The consensus non-ACLF group had worse laboratory and disease severity profiles, including higher WBCC, INR, bilirubin, creatinine, MELD, MELD-Na, CLIF-C OF, and A-TANGO-OF scores, and higher 28-day and 90-day mortality. In contrast, the A-TANGO non-ACLF group showed milder disease overall, with slightly higher hemoglobin and sodium levels and substantially lower short-term mortality (Table 2). Likewise, patients classified as ACLF by the consensus framework were sicker and had higher mortality than those classified as ACLF by A-TANGO (Table 3), suggesting that the consensus criteria captured patients with more advanced disease.
Table 2. Baseline characteristics of the non-ACLF population stratified by defining criteria
| Characteristic | TIH: A-TANGO non-ACLF (n = 499, 20.8%) | TIH: Consensus non-ACLF (n = 1,384, 57.7%) | TIH: P-value | CATCH-LIFE: A-TANGO non-ACLF (n = 1,761, 68.6%) | CATCH-LIFE: Consensus non-ACLF (n = 2,418, 94.2%) | CATCH-LIFE: P-value |
|---|---|---|---|---|---|---|
| Age, median (IQR) | 45.0 (39.0–52.0) | 45.0 (37.0–53.0) | 0.313 | 52.8 (45–61) | 51.6 (44–60) | 0.006 |
| Female sex, n/N (%) | 54 (10.8) | 153 (11.1) | 0.953 | 528 (30) | 673 (27.8) | 0.138 |
| Etiology of cirrhosis, n/N (%) |
| Alcohol | 376 (75.4) | 988 (71.4) | 0.588 | 130 (7.4) | 151 (6.2) | 0.801 |
| MASLD | 25 (5.0) | 95 (6.9) | | 3 (0.2) | 3 (0.1) | |
| Hepatitis B/C | 43 (8.6) | 122 (8.8) | | 695 (39.5) | 951 (39.3) | |
| Alcohol+ CMRF | 9 (1.8) | 34 (2.5) | | | | |
| AIH | 21 (4.2) | 75 (5.4) | | 36 (2) | 44 (1.8) | |
| Cryptogenic | 16 (3.2) | 39 (2.8) | | 71 (4) | 94 (3.9) | |
| Alcohol+ Hepatitis B/C | 9 (1.8) | 29 (2.1) | | 85 (4.8) | 119 (4.9) | |
| Others | 0 (0.0) | 2 (0.1) | | 741 (42.1) | 1,056 (43.7) | |
| Precipitating events, n/N (%) |
| AH | 182 (36.5) | 444 (32.1) | 0.388 | 81 (4.6) | 110 (4.5) | 0.05 |
| Infections | 77 (15.4) | 264 (19.1) | | 238 (13.5) | 383 (15.8) | |
| AH+ others | 135 (27.1) | 377 (27.2) | | 98 (5.6) | 127 (5.3) | |
| DILI | 32 (6.4) | 79 (5.7) | | 40 (2.3) | 62 (2.6) | |
| Viral hepatitis | 35 (7.0) | 88 (6.4) | | 76 (4.3) | 124 (5.1) | |
| AIH | 7 (1.4) | 27 (2.0) | | | | |
| UGIB/LGIB | 10 (2.0) | 27 (2.0) | | 242 (13.7) | 263 (10.9) | |
| Others | 21 (4.2) | 78 (5.6) | | 986 (56) | 1,349 (55.8) | |
| Clinical events at inclusion, n/N (%) |
| Ascites | 464 (93.0) | 1,318 (95.2) | 0.073 | 1,151 (65.4) | 1,610 (66.6) | 0.429 |
| Hepatic encephalopathy | 270 (54.1) | 802 (57.9) | 0.152 | 72 (4.1) | 104 (4.3) | 0.795 |
| Lab investigations |
| Hemoglobin (g/dL) | 9.9 (8.1–11.5) | 9.6 (7.8–11.3) | 0.039 | 10.6 (8.4–12.3) | 10.8 (8.5–12.5) | 0.037 |
| WBCC (×10⁹/L) | 8.3 (6.1–11.6) | 9.5 (6.5–14.3) | <0.001 | 4.3 (3–6.2) | 4.7 (3.2–6.9) | <0.001 |
| Platelet counts (×10⁹/L) | 105.0 (67.0–158.0) | 97.0 (65.0–152.2) | 0.122 | 72 (48–110.8) | 74 (49–113.4) | 0.302 |
| INR | 1.7 (1.5–1.9) | 1.9 (1.6–2.4) | <0.001 | 1.4 (1.2–1.6) | 1.5 (1.3–1.9) | <0.001 |
| AST (U/L) | 99.0 (62.0–159.5) | 103.2 (65.0–167.8) | 0.123 | 55.9 (32.3–116) | 70 (36.3–156) | <0.001 |
| ALT (U/L) | 45.0 (28.0–78.0) | 45.1 (29.0–84.0) | 0.329 | 39 (22.7–87) | 47.4 (24.9–131.2) | <0.001 |
| Bilirubin (mg/dL) | 7.1 (4.3–12.0) | 8.6 (5.0–18.1) | <0.001 | 2.5 (1.3–6.8) | 4 (1.5–13.4) | <0.001 |
| Creatinine (mg/dL) | 0.9 (0.7–1.2) | 1.0 (0.8–1.6) | <0.001 | 0.8 (0.6–0.9) | 0.8 (0.6–1) | 0.022 |
| Sodium (mmol/L) | 134.0 (130.0–137.0) | 133.0 (129.0–137.0) | 0.011 | 138.3 (135.5–141) | 138 (134.7–140.3) | <0.001 |
| MAP (mmHg) | 87.0 (80.0–92.7) | 86.0 (78.3–93.1) | 0.247 | 89 (83–94) | 89 (82.7–94.3) | 0.809 |
| Disease severity scores |
| MELD score | 21.0 (18.0–23.0) | 24.6 (21.0–28.2) | <0.001 | 14.3 (10.8–18.5) | 17 (11.8–23.3) | <0.001 |
| MELD-Na score | 24.0 (20.6–27.0) | 27.7 (23.7–31.0) | <0.001 | 15.2 (11.3–20.2) | 18.3 (12.7–24.6) | <0.001 |
| CLIF-C OF score | 8.0 (7.0–8.0) | 9.0 (8.0–10.0) | <0.001 | 6 (6–7) | 7 (6–8) | <0.001 |
| A-TANGO-OF score | 8.0 (7.0–8.0) | 9.0 (8.0–10.0) | <0.001 | 7 (6–7) | 7 (6–8) | <0.001 |
| Mortality, n/N (%) | | | | | | |
| 28-day mortality | 53 (10.6) | 291 (21.0) | <0.001 | 57 (3.2) | 177 (7.3) | <0.001 |
| 90-day mortality | 82 (16.4) | 417 (30.1) | <0.001 | 120 (6.8) | 340 (14.1) | <0.001 |
Table 3. Baseline characteristics of the ACLF population stratified by the defining criteria
| Characteristic | TIH: A-TANGO ACLF (n = 1,899, 79.2%) | TIH: Consensus ACLF (n = 1,014, 42.3%) | TIH: P-value | CATCH-LIFE: A-TANGO ACLF (n = 807, 31.4%) | CATCH-LIFE: Consensus ACLF (n = 150, 5.8%) | CATCH-LIFE: P-value |
|---|---|---|---|---|---|---|
| Age, median (IQR) | 44.0 (36.0–52.0) | 43.0 (35.0–51.0) | 0.148 | 48.7 (41.4–57) | 49.4 (42.4–57.2) | 0.643 |
| Female sex, n/N (%) | 217 (11.4) | 118 (11.6) | 0.914 | 165 (20.4) | 20 (13.3) | 0.056 |
| Etiology of cirrhosis, n/N (%) |
| Alcohol | 1,333 (70.2) | 721 (71.1) | 0.351 | 37 (4.6) | 16 (10.7) | 0.07 |
| MASLD | 124 (6.5) | 54 (5.3) | 0.351 | | | 0.07 |
| Hepatitis B/C | 199 (10.5) | 120 (11.8) | 0.351 | 316 (39.2) | 60 (40) | 0.07 |
| Alcohol+ CMRF | 34 (1.8) | 9 (0.9) | 0.351 | | | 0.07 |
| AIH | 104 (5.5) | 50 (4.9) | 0.351 | 9 (1.1) | 1 (0.7) | 0.07 |
| Cryptogenic | 63 (3.3) | 40 (3.9) | 0.351 | 28 (3.5) | 5 (3.3) | 0.07 |
| Alcohol+ Hepatitis B/C | 39 (2.1) | 19 (1.9) | 0.351 | 43 (5.3) | 9 (6) | 0.07 |
| Others | 3 (0.2) | 1 (0.1) | 0.351 | 374 (46.3) | 59 (39.3) | 0.07 |
| Precipitating events, n/N (%) |
| AH | 672 (35.4) | 410 (40.4) | 0.002 | 35 (4.3) | 6 (4) | 0.351 |
| Infections | 303 (16.0) | 116 (11.4) | 0.002 | 179 (22.2) | 34 (22.7) | 0.351 |
| AH+ others | 496 (26.1) | 254 (25.0) | 0.002 | 45 (5.6) | 16 (10.7) | 0.351 |
| DILI | 116 (6.1) | 69 (6.8) | 0.002 | 29 (3.6) | 7 (4.7) | 0.351 |
| Viral hepatitis | 152 (8.0) | 99 (9.8) | 0.002 | 59 (7.3) | 11 (7.3) | 0.351 |
| AIH | 29 (1.5) | 9 (0.9) | 0.002 | | | 0.351 |
| UGIB/LGIB | 23 (1.2) | 6 (0.6) | 0.002 | 24 (3) | 3 (2) | 0.351 |
| Others | 108 (5.7) | 51 (5.0) | 0.002 | 436 (54) | 73 (48.7) | 0.351 |
| Clinical events at inclusion, n/N (%) |
| Ascites | 1,837 (96.7) | 983 (96.9) | 0.847 | 560 (69.4) | 101 (67.3) | 0.686 |
| Hepatic encephalopathy | 1,244 (65.5) | 712 (70.2) | 0.011 | 38 (4.7) | 6 (4) | 0.866 |
| Lab investigations |
| Hemoglobin (g/dL) | 9.4 (7.8–11.1) | 9.5 (7.8–11.1) | 0.781 | 11.1 (9.1–12.9) | 11.0 (9.1–12.7) | 0.45 |
| WBCC (×10⁹/L) | 11.7 (8.1–17.6) | 13.3 (9.3–19.2) | <0.001 | 6.5 (4.4–9.3) | 7.9 (5.6–12) | <0.001 |
| Platelet counts (×10⁹/L) | 102.0 (66.0–159.0) | 110.0 (70.0–169.0) | 0.025 | 80 (50–119) | 74.5 (45.8–110.5) | 0.15 |
| INR | 2.2 (1.8–2.7) | 2.3 (1.9–2.8) | 0.001 | 2.3 (1.6–2.8) | 2.4 (2–3.1) | <0.001 |
| AST (U/L) | 123.1 (80.0–190.0) | 137.0 (91.0–200.8) | <0.001 | 132 (67.8–293.9) | 126.7 (68.2–265) | 0.996 |
| ALT (U/L) | 51.0 (32.0–95.0) | 56.0 (35.0–100.0) | 0.010 | 101.2 (42–351) | 92 (42–364.8) | 0.778 |
| Bilirubin (mg/dL) | 18.2 (9.0–26.0) | 21.8 (15.0–28.3) | <0.001 | 21.8 (11.7–28.5) | 25.3 (15.6–32) | <0.001 |
| Creatinine (mg/dL) | 1.3 (0.8–2.4) | 1.5 (0.9–2.6) | <0.001 | 0.8 (0.7–1.2) | 1.1 (0.7–2) | <0.001 |
| Sodium (mmol/L) | 132.0 (128.0–136.0) | 132.0 (128.0–136.0) | 0.588 | 136 (131.3–139) | 133 (129–137) | <0.001 |
| MAP (mmHg) | 85.0 (78.0–93.3) | 85.0 (77.0–94.0) | 0.684 | 88.7 (81.3–94.7) | 87.3 (77–93.3) | 0.069 |
| Disease severity scores |
| MELD score | 29.0 (25.1–34.3) | 32.3 (27.4–37.4) | <0.001 | 27.4 (23.8–30.9) | 30.7 (27.3–37.3) | <0.001 |
| MELD-Na score | 31.5 (28.0–35.6) | 34.0 (29.9–38.0) | <0.001 | 28.6 (24.7–32.3) | 32.7 (28.5–37.8) | <0.001 |
| CLIF-C OF score | 11.0 (9.0–12.0) | 12.0 (11.0–13.0) | <0.001 | 9 (8–10) | 11 (10–12) | <0.001 |
| A-TANGO-OF score | 11.0 (10.0–13.0) | 12.0 (11.0–14.0) | <0.001 | 9 (9–10) | 11 (11–12) | <0.001 |
| Mortality, n/N (%) | | | | | | |
| 28-day mortality | 852 (44.9) | 614 (60.6) | <0.001 | 181 (22.4) | 61 (40.7) | <0.001 |
| 90-day mortality | 1,040 (54.8) | 705 (69.5) | <0.001 | 302 (37.4) | 82 (54.7) | <0.001 |
Comparative analysis of consensus-negative/A-TANGO-positive versus double-negative patients further confirmed substantially greater disease severity and short-term mortality in the reclassified group across both cohorts (Table 4).
Table 4. Baseline characteristics of the consensus non-ACLF population captured by the A-TANGO framework
| Characteristic | TIH: Consensus-/A-TANGO- (n = 499, 20.8%) | TIH: Consensus-/A-TANGO+ (n = 885, 36.9%) | TIH: P-value | CATCH-LIFE: Consensus-/A-TANGO- (n = 1,750, 68.1%) | CATCH-LIFE: Consensus-/A-TANGO+ (n = 668, 26%) | CATCH-LIFE: P-value |
|---|---|---|---|---|---|---|
| Age, median (IQR) | 45.0 (39.0–52.0) | 45.0 (37.0–53.0) | 0.141 | 52.8 (45–60.9) | 48.1 (41.3–57.1) | <0.001 |
| Female sex, n/N (%) | 54 (10.8) | 99 (11.2) | 0.906 | 526 (30.1) | 147 (22) | <0.001 |
| Etiology of cirrhosis, n/N (%) |
| Alcohol | 376 (75.4) | 612 (69.2) | 0.139 | 127 (7.3) | 24 (3.6) | 0.008 |
| MASLD | 25 (5.0) | 70 (7.9) | 0.139 | 3 (0.2) | 0 (0) | 0.008 |
| Hepatitis B/C | 43 (8.6) | 79 (8.9) | 0.139 | 691 (39.5) | 260 (38.9) | 0.008 |
| Alcohol+ CMRF | 9 (1.8) | 25 (2.8) | 0.139 | | | 0.008 |
| AIH | 21 (4.2) | 54 (6.1) | 0.139 | 36 (2.1) | 8 (1.2) | 0.008 |
| Cryptogenic | 16 (3.2) | 23 (2.6) | 0.139 | 71 (4.1) | 23 (3.4) | 0.008 |
| Alcohol+ Hepatitis B/C | 9 (1.8) | 20 (2.3) | 0.139 | 84 (4.8) | 35 (5.2) | 0.008 |
| Others | 0 (0.0) | 2 (0.2) | 0.139 | 982 (56.1) | 367 (54.9) | 0.008 |
| Precipitating events, n/N (%) |
| AH | 182 (36.5) | 262 (29.6) | 0.031 | 80 (4.6) | 30 (4.5) | <0.001 |
| Infections | 77 (15.4) | 187 (21.1) | 0.031 | 236 (13.5) | 147 (22) | <0.001 |
| AH+ others | 135 (27.1) | 242 (27.3) | 0.031 | 96 (5.5) | 31 (4.6) | <0.001 |
| DILI | 32 (6.4) | 47 (5.3) | 0.031 | 39 (2.2) | 23 (3.4) | <0.001 |
| Viral hepatitis | 35 (7.0) | 53 (6.0) | 0.031 | 75 (4.3) | 49 (7.3) | <0.001 |
| AIH | 7 (1.4) | 20 (2.3) | 0.031 | | | <0.001 |
| UGIB/LGIB | 10 (2.0) | 17 (1.9) | 0.031 | 242 (13.8) | 21 (3.1) | <0.001 |
| Others | 21 (4.2) | 57 (6.4) | 0.031 | 80 (4.6) | 30 (4.5) | <0.001 |
| Clinical events at inclusion, n/N (%) |
| Ascites | 464 (93.0) | 854 (96.5) | 0.005 | 1,144 (65.4) | 466 (69.8) | 0.046 |
| Hepatic encephalopathy | 270 (54.1) | 532 (60.1) | 0.034 | 72 (4.1) | 32 (4.8) | 0.535 |
| Lab investigations |
| Hemoglobin (g/dL) | 9.9 (8.1–11.5) | 9.4 (7.7–11.1) | 0.003 | 10.6 (8.4–12.3) | 11.2 (9.1–12.9) | <0.001 |
| WBCC (×10⁹/L) | 8.3 (6.1–11.6) | 10.3 (6.8–15.4) | <0.001 | 4.3 (2.9–6.2) | 6.1 (4.3–8.6) | <0.001 |
| Platelet counts (×10⁹/L) | 105.0 (67.0–158.0) | 93.0 (63.0–148.0) | 0.024 | 72 (48–111) | 81 (51–120) | 0.015 |
| INR | 1.7 (1.5–1.9) | 2.2 (1.6–2.6) | <0.001 | 1.4 (1.2–1.6) | 2.3 (1.6–2.7) | <0.001 |
| AST (U/L) | 99.0 (62.0–159.5) | 110.0 (68.0–169.0) | 0.025 | 55.4 (32.2–115) | 132.5 (67.6–306.5) | <0.001 |
| ALT (U/L) | 45.0 (28.0–78.0) | 46.0 (29.0–89.0) | 0.155 | 39 (22.4–85.7) | 103.9 (42–351) | <0.001 |
| Bilirubin (mg/dL) | 7.1 (4.3–12.0) | 10.8 (5.3–22.0) | <0.001 | 2.5 (1.3–6.6) | 20.9 (9.7–27.5) | <0.001 |
| Creatinine (mg/dL) | 0.9 (0.7–1.2) | 1.1 (0.8–1.9) | <0.001 | 0.8 (0.6–0.9) | 0.8 (0.7–1.1) | <0.001 |
| Sodium (mmol/L) | 134.0 (130.0–137.0) | 132.1 (128.0–136.0) | <0.001 | 138.3 (135.5–141) | 136.2 (132–139) | <0.001 |
| MAP (mmHg) | 87.0 (80.0–92.7) | 85.0 (78.0–93.3) | 0.092 | 89 (83–94) | 89 (81.7–95) | 0.471 |
| Disease severity scores |
| MELD score | 21.0 (18.0–23.0) | 27.0 (24.0–30.2) | <0.001 | 14.2 (10.7–18.4) | 26.7 (23.1–30) | <0.001 |
| MELD-Na score | 24.0 (20.6–27.0) | 30.0 (26.5–32.3) | <0.001 | 15.1 (11.3–20) | 27.8 (24–31.1) | <0.001 |
| CLIF-C OF score | 8.0 (7.0–8.0) | 9.0 (8.0–10.0) | <0.001 | 6 (6–7) | 9 (8–10) | <0.001 |
| A-TANGO-OF score | 8.0 (7.0–8.0) | 10.0 (9.0–11.0) | <0.001 | 6 (6–7) | 9 (9–10) | <0.001 |
| Mortality, n/N (%) | | | | | | |
| 28-day mortality | 53 (10.6) | 238 (26.9) | <0.001 | 56 (3.2) | 121 (18.1) | <0.001 |
| 90-day mortality | 82 (16.4) | 335 (37.9) | <0.001 | 118 (6.7) | 222 (33.2) | <0.001 |
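The mortality percentages in Table 4 can be re-derived from the reported counts. The following Python sketch recomputes the 28-day mortality in each consensus-negative subgroup and adds an unadjusted risk ratio. The risk ratio is not reported in the source and is shown only as an illustrative arithmetic check; the function and variable names are hypothetical.

```python
# Counts copied from Table 4 as (28-day deaths, group size).
# Dictionary keys and function names are illustrative, not from the study.
table4_28d = {
    "TIH": {"double_negative": (53, 499), "atango_only": (238, 885)},
    "CATCH-LIFE": {"double_negative": (56, 1750), "atango_only": (121, 668)},
}


def mortality_pct(deaths, n):
    """Crude mortality as a percentage of the group."""
    return 100.0 * deaths / n


def risk_ratio(exposed, reference):
    """Unadjusted ratio of two crude risks, each given as (deaths, n)."""
    return (exposed[0] / exposed[1]) / (reference[0] / reference[1])


for cohort, groups in table4_28d.items():
    neg = groups["double_negative"]
    pos = groups["atango_only"]
    print(
        cohort,
        round(mortality_pct(*neg), 1),
        round(mortality_pct(*pos), 1),
        round(risk_ratio(pos, neg), 2),
    )
```

Because the ratio is crude, it does not account for the baseline differences between the two subgroups listed in the same table.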
A-TANGO grading exposed heterogeneity within consensus non-ACLF
Additional evidence for the clinical value of the A-TANGO framework compared with the consensus definition came from severity sub-stratification within the consensus non-ACLF group (Fig. 2). In the TIH cohort, consensus non-ACLF was stratified by A-TANGO into grade 0 (36.1%), I (39.6%), II (17.2%), III (5.1%), and IV (2.1%). These consensus-negative cases had 28-day mortality rates of 10.6%, 15.5%, 33.2%, 70.0%, and 86.2% across A-TANGO grades 0–IV, respectively. The CATCH-LIFE cohort showed a similar pattern: non-ACLF was sub-stratified into grade 0 (72.4%), I (20.1%), II (7.4%), and III (0.1%), and 28-day mortality rose from 3.2% to 12.6%, 32.8%, and 50% across A-TANGO grades 0–III. These data indicate that consensus non-ACLF contains several distinct high-risk subgroups and should not be treated as a single low-risk category (Fig. 2).
Organ failure cut-offs and mortality
Among patients with A-TANGO ACLF, the median (interquartile range) bilirubin and INR values were 18.2 mg/dL (9.0–26.0) and 2.2 (1.8–2.7), respectively. These values were well above the thresholds used to define liver and coagulation failure in the consensus definition and in previously described APASL frameworks (Table 3).
In the TIH cohort (Supplementary Table 1), the frequencies of coagulation, liver, circulatory, kidney, respiratory, and brain failure among patients with A-TANGO ACLF were 53.7%, 46.2%, 36.5%, 33.3%, 21.5%, and 16.4%, respectively. In contrast, among patients with consensus ACLF, the corresponding organ failures—based on consensus-specific cut-offs—were present in 0%, 100%, 48.4%, 42.6%, 35.4%, and 53.8% of patients, respectively. Similarly, in the CATCH-LIFE cohort (Supplementary Table 2), liver and coagulation failure were the most frequent organ failures among patients with A-TANGO ACLF, occurring in 58.5% and 55.5% of cases, respectively. By contrast, liver and brain failure were the most common organ failures among patients with consensus ACLF, highlighting that the two definitions identify fundamentally different populations with organ failure.
Notable differences were also observed in organ failure-specific mortality across the two cohorts. In the TIH cohort, liver failure was associated with lower mortality when defined by the consensus cut-offs than by the A-TANGO cut-offs, whereas brain failure showed higher mortality with the consensus definition than with the A-TANGO definition. Similarly, in the CATCH-LIFE cohort, liver failure defined by the consensus cut-offs was associated with lower mortality than liver failure defined by the A-TANGO cut-offs (Supplementary Tables 8 and 9).
Discussion
This study provides the first large-scale, head-to-head comparison between the recently proposed consensus ACLF framework and the A-TANGO classification across two independent, geographically and etiologically distinct cohorts. Three key findings emerge. First, the consensus framework identifies a substantially smaller subset of patients with ACLF compared with A-TANGO in both cohorts, but the differences are particularly dramatic in the CATCH-LIFE cohort. Second, the discordance between definitions is not random but concentrated within a large intermediate-risk population classified as non-ACLF by consensus but as ACLF by A-TANGO. Third, these patients are not clinically trivial; they exhibit substantial short-term mortality (18%–27% at 28 days and 33%–38% at 90 days) and a markedly worse clinical profile than those negative by both definitions, placing them closer to ACLF than to uncomplicated AD.
Both frameworks largely agree on the sickest patients with advanced multi-organ failure, who have the highest mortality. The clinically important difference lies outside this group. Across both cohorts, 26%–37% of patients with AD were classified as ACLF by A-TANGO but not by the consensus definition, and mortality in this group was well within the range historically associated with ACLF.1 This observation directly challenges the assumption that consensus non-ACLF11 represents a low-risk state. Instead, it conceals a sizeable population with clinically meaningful risk that is only revealed by outcome-calibrated A-TANGO definitions. This is further supported by the NRI of 17%–27%, indicating that A-TANGO4 more accurately reallocates patients into clinically relevant risk strata. Importantly, these reclassified patients were not borderline cases: compared with double-negative patients, they had substantially higher MELD and MELD-Na scores, higher leukocyte counts, and more pronounced coagulopathy and organ dysfunction.
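For readers less familiar with the net reclassification improvement (NRI), the sketch below shows the standard two-category form in Python. It is a generic illustration, not the authors' analysis code: the function name, the dictionary layout, and all counts are hypothetical, and "up" is assumed to mean reclassification from non-ACLF to ACLF by the new definition.

```python
def binary_nri(events, nonevents):
    """Two-category NRI.

    Each argument is a dict with keys:
      'up'   - patients moved to the higher-risk class by the new definition
      'down' - patients moved to the lower-risk class by the new definition
      'n'    - total patients in that outcome group
    """
    # Among patients who died, upward moves are correct reclassifications.
    event_term = (events["up"] - events["down"]) / events["n"]
    # Among survivors, downward moves are correct reclassifications.
    nonevent_term = (nonevents["down"] - nonevents["up"]) / nonevents["n"]
    return event_term + nonevent_term


# Hypothetical counts for illustration only; NOT data from either cohort.
deaths = {"up": 30, "down": 2, "n": 100}
survivors = {"up": 40, "down": 5, "n": 300}
print(round(binary_nri(deaths, survivors), 3))
```

A positive value means that the gain in correctly up-classified deaths outweighs the loss from up-classified survivors.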
The current push toward harmonization reflects an understandable desire to unify definitions across regions and facilitate research.10,11 However, our findings highlight a fundamental limitation: harmonization does not necessarily equate to improved clinical utility. The consensus framework,11 by anchoring ACLF to liver failure defined by relatively permissive bilirubin and INR thresholds combined with an extrahepatic failure, identifies a smaller, more specific, and more liver-centered phenotype. While this improves specificity, it does so at the cost of sensitivity. In both cohorts, consensus criteria failed to identify a large proportion of patients who subsequently died, with markedly lower sensitivity for 28-day mortality compared with A-TANGO. In contrast, A-TANGO retains the original conceptual strength of ACLF definitions, linking organ failure thresholds to sharp inflections in mortality1,4 and therefore captures patients earlier along the trajectory of organ failure. This distinction reflects two fundamentally different conceptual models: a late, highly specific definition of ACLF centered on advanced liver failure (consensus criteria), versus an earlier, mortality-calibrated syndromic definition (A-TANGO score) aimed at identifying patients at clinically actionable risk. These models are not interchangeable, and attempts to harmonize them risk diluting the strengths of each.
A major insight from this analysis is that consensus non-ACLF is neither biologically nor prognostically homogeneous. Application of A-TANGO criteria to this population consistently decomposed it into graded risk strata, with stepwise increases in mortality across grades. Notably, even within this consensus non-ACLF group, 28-day mortality ranged from 3%–11% in A-TANGO grade 0 to 50%–86% in the highest grade observed in each cohort.
These findings support the concept that ACLF is a dynamic process rather than a binary state.13 Patients classified as consensus-negative but A-TANGO-positive appear to represent an evolving phase of organ failure, which is further underscored by the observation that progression to ACLF within seven days is associated with a marked increase in mortality. In this context, A-TANGO4 does not overextend the definition of ACLF but rather identifies patients earlier in their disease trajectory, when intervention may still be possible.
The divergence between frameworks also reflects differences in how organ failure is conceptualized. The consensus definition is inherently liver-centric and does not recognize coagulation as an independent organ failure domain, whereas A-TANGO allows extrahepatic and coagulation failures to define ACLF when associated with mortality risk. Our data indicate that this distinction is clinically meaningful. A substantial proportion of high-risk patients identified by A-TANGO did not meet consensus liver failure thresholds but exhibited significant extrahepatic dysfunction. Conversely, patients classified as consensus ACLF had more advanced disease and higher mortality, reflecting a later-stage phenotype.
Importantly, A-TANGO does not ignore liver dysfunction. Median bilirubin and INR levels in A-TANGO ACLF were well above thresholds proposed in other frameworks,2 indicating that hepatic dysfunction is intrinsically captured when present. Rather than requiring liver failure as an entry criterion for the diagnosis of ACLF, A-TANGO integrates it within a broader multi-organ model, consistent with current understanding of ACLF as a systemic inflammatory and multi-organ dysfunction syndrome.4,14
The observed differences can be interpreted through a sensitivity–specificity trade-off. A-TANGO demonstrates substantially higher sensitivity for mortality, whereas consensus criteria are more specific. This trade-off should be interpreted in the context of the intended application of these scoring systems. For defining extreme phenotypes or epidemiological standardization, higher specificity may be desirable. However, for clinical decision-making, early escalation of care, and trial enrichment, sensitivity is critical. Under-diagnosing patients with a 20%–30% short-term mortality is clinically consequential. From this perspective, A-TANGO provides a more useful framework for identifying patients at a stage where interventions may still alter outcomes.
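The trade-off described above can be made concrete with a small Python sketch that computes sensitivity and specificity for 28-day death from a 2×2 table. The counts are hypothetical and chosen only to contrast a broader with a narrower definition; they are not taken from either cohort, and the function name is illustrative.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) of a definition against 28-day death.

    tp: died and met the definition      fn: died but did not meet it
    fp: survived but met the definition  tn: survived and did not meet it
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT data from either cohort.
broad = sensitivity_specificity(tp=80, fn=20, fp=150, tn=250)
narrow = sensitivity_specificity(tp=45, fn=55, fp=40, tn=360)
print("broad definition:", broad)
print("narrow definition:", narrow)
```

In this toy example the broader definition captures more of the deaths at the price of labeling more survivors, which is the pattern the paragraph attributes to the two frameworks.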
These findings have direct implications for clinical trials and regulatory pathways. Definitions that preferentially select late-stage disease may limit the therapeutic window and reduce the likelihood of demonstrating benefit.15 Additionally, underdiagnosis of ACLF negatively impacts patient recruitment, sample size, and the evaluation of resolution of ACLF as a potential endpoint in clinical trials.4 In contrast, outcome-calibrated frameworks such as A-TANGO4 may enable earlier identification and more effective enrichment strategies. Moreover, regulatory acceptance increasingly depends on reproducible relationships between diagnostic criteria and clinically meaningful endpoints. The stronger HRs and consistent mortality gradients observed with A-TANGO across both cohorts support its suitability as a stratification and enrichment tool in interventional studies.
This study benefits from a large sample size, the inclusion of two independent cohorts that differ in etiology, geography, clinical practice, and mortality rates, and findings that were consistent across both populations. The head-to-head comparison directly addresses issues with the consensus definitions of organ failure and ACLF in comparison with the evidence-based A-TANGO framework. Several limitations should be acknowledged. The consensus framework was operationalized without the immune and gastrointestinal failure domains, which have been suggested to constitute individual organ failures in the consensus framework. Under the consensus criteria,11 “immune failure” was defined by the presence of sepsis, septic shock, invasive fungal infection, Vibrio infection, or spontaneous bacterial peritonitis. However, these entities are imprecisely defined and difficult to ascertain consistently in patients with acutely decompensated cirrhosis. There is also a conceptual challenge in classifying infection as a precipitating event, a complication, or an organ failure in itself, creating a risk of circularity and double counting within the ACLF construct.12 Similarly, “gastrointestinal failure” was loosely defined and has not been validated in ACLF. In addition, the consensus definition of kidney failure requires knowledge of baseline creatinine, which is often unavailable or incompletely captured in real-world datasets. Therefore, in the absence of documented chronic kidney disease in our datasets, we approximated baseline creatinine as 1 mg/dL and defined stage 2 AKI, corresponding to creatinine levels above 2 mg/dL, as kidney failure for the operationalized consensus ACLF definition. This approximation may modestly underestimate the sensitivity of the consensus definition if baseline creatinine was lower than 1 mg/dL, or overestimate it if baseline was higher, and prospective studies with complete baseline values are needed to validate its exact performance.
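The effect of the baseline creatinine approximation can be illustrated with a minimal Python sketch. It assumes that the kidney failure cut-off scales as twice the baseline value, which reproduces the 2 mg/dL threshold at the assumed baseline of 1 mg/dL; the function name, the constant names, and the example patients are hypothetical and do not come from the study datasets.

```python
ASSUMED_BASELINE_MG_DL = 1.0  # approximation stated in the text
STAGE2_MULTIPLIER = 2.0       # assumption: cut-off is twice the baseline value


def kidney_failure(creatinine, baseline=ASSUMED_BASELINE_MG_DL):
    """True if creatinine exceeds the stage 2 cut-off for the given baseline."""
    return creatinine > STAGE2_MULTIPLIER * baseline


# Hypothetical patient with a true baseline below 1 mg/dL:
# missed under the approximation, flagged with the true baseline.
print(kidney_failure(1.6), kidney_failure(1.6, baseline=0.7))

# Hypothetical patient with a true baseline above 1 mg/dL:
# flagged under the approximation, not flagged with the true baseline.
print(kidney_failure(2.2), kidney_failure(2.2, baseline=1.3))
```

The two examples correspond to the under- and overestimation of sensitivity described in the preceding paragraph.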
Finally, the two cohorts differed not only in etiology and disease severity but also in healthcare practices, referral patterns, and demographic factors. While these regional differences could theoretically affect the operating characteristics of any ACLF classification system, the consistent advantage of A-TANGO in sensitivity and risk stratification across both cohorts argues against major bias driving our conclusions. This robustness is further supported by our subgroup analyses stratified by etiology. Together, these findings support the generalizability of our results despite cross-regional heterogeneity.