Introduction
Paleontological and anthropological evidence indicates that the use of medicinal plants for therapeutic purposes dates back approximately 60,000 years.1 In the 21st century, traditional herbal medicines (THMs) have transitioned from localized ancestral practices to a cornerstone of the global therapeutic landscape.2 Currently, an estimated 80% of the global population uses THMs, either as primary healthcare or within the framework of complementary and alternative medicine (CAM).3 This shift is reflected in the rapid expansion of the worldwide herbal industry, which is projected to exceed USD 328 billion by 2030, driven by technological advancements in standardized extraction and a systemic consumer shift toward preventative, plant-based healthcare.4–6
While the Asia-Pacific region, led by the deep-rooted systems of Ayurveda and traditional Chinese medicine (TCM), remains the epicenter of production, the clinical implications of herbal use are now a universal concern. Within this expanding landscape, the liver has emerged as a critical focal point. As the primary site for the biotransformation of xenobiotics, the liver’s metabolic pathways are uniquely vulnerable to reactive metabolites generated during the detoxification of complex herbal extracts. This vulnerability has manifested in a rising global incidence of herb-induced liver injury (HILI), a condition that challenges the widespread but scientifically unsubstantiated perception of herbal products as inherently benign.7,8
The epidemiological burden of HILI is now recognized as a non-linear global crisis, though its presentation varies by region. In the Asia-Pacific region, the integration of the Roussel Uclaf Causality Assessment Method (RUCAM) into clinical research has facilitated the documentation of extensive HILI cohorts, with China and Korea currently leading the global literature in the number of published, RUCAM-assessed cases associated with complex herbal matrices.9 However, recent data from North America and Europe indicate a shifting demographic, where herbal and dietary supplements (HDS)—particularly those marketed for weight loss and bodybuilding—now account for an estimated 16% to 20% of all drug-induced liver injury (DILI) cases.10 In the European market, mature regulatory frameworks are increasingly flagging hepatotoxicity linked to products such as Chelidonium majus and green tea extracts within the EudraVigilance network.11 Conversely, in Africa, the burden remains largely “silent”; despite the vast reliance on traditional African medicine (TAM), fragmented toxicovigilance infrastructure masks a significant incidence of acute hepatic failure linked to regional botanicals.12
This fundamental disconnect—between traditional empirical practice and modern safety requirements across diverse global populations—demands a sophisticated, proactive solution.13 The integration of ancient herbal wisdom with the precision of contemporary “omics” technologies and artificial intelligence (AI) is strategically positioned to bridge this gap.14
This review delineates the complexities inherent in herbal safety assessment, details the roles of genomics, transcriptomics, and metabolomics in molecular-level risk stratification, and demonstrates the essential function of AI in synthesizing these disparate, heterogeneous datasets. By harmonizing global clinical signals with high-dimensional biological data, this “omics”-guided approach defines a new paradigm for integrative liver health, transforming empirical practice into evidence-based precision medicine.
The pharmacovigilance challenge in THM
While THM is a cornerstone of global healthcare, its integration into modern pharmacovigilance (PV) systems is fraught with structural and scientific hurdles. Most PV frameworks were originally calibrated for chemically defined synthetic drugs; applying these same paradigms to complex, multi-component botanical matrices reveals significant gaps in safety assessment and regulatory oversight.15
Phytochemical complexity and causal ambiguity
The primary barrier to effective PV in this sector is the inherent multi-component nature of polyherbal products. Unlike conventional pharmaceuticals with isolated active ingredients, herbal preparations contain a vast array of bioactive metabolites, making it exceedingly difficult to pinpoint a specific causative agent during an adverse event. Identifying whether hepatic stress—a common complication—is the result of a single botanical, a toxic contaminant, or a synergistic interaction among multiple phytochemicals remains a formidable task.16 This complexity often prevents the establishment of definitive “causal relationships”, leaving clinicians and regulators unable to distinguish between intrinsic pharmacological toxicity and extrinsic product defects.
Product heterogeneity and quality control
Standardizing toxicological profiles for herbal medicine is complicated by extreme product variability. A plant’s chemical fingerprint is not static; it is influenced by a convergence of intrinsic and extrinsic factors. Genetic polymorphisms and phenological stages (intrinsic), combined with soil composition, climate, and geographical provenance (extrinsic), result in diverse phytochemical profiles within the same species. This heterogeneity leads to inconsistent pharmacokinetic and pharmacodynamic properties across different batches. For instance, Ginkgo biloba extracts often show marked compositional disparities based on regional harvesting and processing methods, which directly impact patient safety and therapeutic efficacy.17–20
Nomenclature confusion and underreporting
Inconsistencies in taxonomic nomenclature represent a critical impediment to accurate adverse drug reaction surveillance in herbal pharmacovigilance systems. The absence of standardized botanical identification protocols across diverse cultural and linguistic contexts generates systematic errors in adverse event attribution, compromising the integrity of safety databases and hindering effective risk signal detection. This nomenclatural ambiguity is compounded by prevalent self-medication practices and the pervasive misconception that botanical preparations are inherently benign because of their natural origin.21,22
Widespread self-medication and underreporting
Accurate risk signal detection is further hampered by a global lack of standardized botanical nomenclature. The use of varied common names across different cultures leads to systematic errors in adverse drug reaction (ADR) attribution.23 This technical ambiguity is exacerbated by the “natural is safe” myth. Because patients frequently perceive herbal remedies as inherently benign, they often engage in unsupervised self-medication. Research indicates that over 76% of herbal users in certain regions operate without professional guidance and rarely disclose such use to their physicians.24 Consequently, ADR reporting rates for herbal products remain alarmingly low—ranging from a mere 0.03% to 29.84%—thereby preventing the development of robust safety databases.25–27
Regulatory divergence and safety oversight
The safety of the herbal supply chain is further undermined by “regulatory gaps” in which products are classified as dietary supplements rather than medicinal agents. In jurisdictions such as the United States, the Food and Drug Administration (FDA)’s categorization of herbal products as food supplements bypasses the rigorous premarket approval and Good Manufacturing Practice (GMP) standards required for pharmaceuticals.28,29 This multi-regional disparity in licensing and post-market surveillance creates a fragmented safety landscape in which products of unknown quality are distributed globally with minimal oversight.30–32
Clinical implications: ADRs and contamination
A pooled analysis of 428 clinical cases found that kava and green tea were each associated with 16–19 documented cases of liver injury.33 A systematic review analyzing 936 HILI cases across 446 references confirmed green tea extract and kava as among the most common hepatotoxic supplements.34 Green tea specifically shows hepatocellular injury patterns at epigallocatechin gallate intakes ranging from 140 to 1,000 mg/day.35
Heavy metal contamination in herbal medicines has been confirmed as a widespread global threat with documented health risks, supported by multiple large-scale studies. A comprehensive analysis of 1,773 samples collected worldwide revealed that 30.51% of herbal products contained at least one over-limit metal (Pb: 5.75%, Cd: 4.96%, As: 4.17%, Hg: 3.78%, Cu: 1.75%).36 An extensive analysis of 2,245 batches of herbal products revealed the presence of several heavy metals, with mean concentrations of Pb (1.566 mg/kg), Cd (0.299 mg/kg), As (0.391 mg/kg), Hg (0.074 mg/kg), and Cu (8.386 mg/kg).37 These findings highlight a consistent baseline of metallic contaminants within the herbal supply chain. While some individual concentrations may fall within permissible limits, cumulative exposure, especially to elements such as lead and cadmium, poses a significant risk of chronic organ toxicity and may exacerbate herb-induced hepatic adverse drug reactions. The detection and risk assessment of these contaminants, often requiring sophisticated methods like inductively coupled plasma mass spectrometry, force the PV system to operate as a retroactive GMP enforcement mechanism.38 A significant proportion of reported adverse events in herbal medicine and supplements are attributable to product quality failure (contamination or adulteration) rather than inherent pharmacological toxicity.
This necessitates that PV centers acquire specialized technical expertise and access to analytical laboratories to differentiate between intrinsic pharmacological signals and extrinsic product defects, a capacity often lacking in national monitoring centers.
To move beyond the “black box” of traditional toxicology and identify causative agents within complex mixtures, PV must shift the point of detection “upstream”. Table 1 provides a comparative framework for this technological evolution, contrasting the limitations of clinical symptom-based monitoring with the advantages of proactive, biomarker-focused molecular surveillance.
Table 1. Evolution of PV in the context of integrative liver health
| Feature | Conventional PV | Integrative liver-centric PV |
|---|---|---|
| Primary focus | Single-molecule synthetic drugs. | Complex phytochemical matrices and multi-herb interactions. |
| Causality assessment | Reactive; relies on manual clinical correlation. | Predictive; based on molecular signatures and relational modeling. |
| Diagnostic resolution | Macroscopic; focused on overt organ dysfunction (e.g., jaundice). | Subcellular; near single-cell resolution of the intrahepatic proteome and transcriptome. |
| Toxicological markers | Traditional enzymes (ALT/AST) with low early-stage sensitivity. | Molecular biomarkers (e.g., GDF-15) and stress response signatures. |
| Data integrity | Manual retrospective reviews. | Automated signal detection via AI and immutable blockchain ledgers. |
| Standardization | Established GMP for standardized chemicals. | Post-market “retroactive GMP” to detect contaminants (Pb, Cd, As). |
| Risk stratification | “One-size-fits-all” population-level assessment. | Personalized medicine based on individual CYP450 polymorphic profiles. |
| Regulatory Framework | Stringent pre-market approval. | Global harmonization of Real-World Evidence and “omics” data. |
| Traceability | Historical; reliant on vernacular names and documentation. | Molecular; dual “DNA × metabolite” batch-release standards. |
| Methodology | Observational, expert-based, and localized. | High-throughput technologies (Genomics, Transcriptomics, Metabolomics). |
| Response type | Reactive; responding after clinical symptoms are reported. | Proactive; identifying emerging risks and subclinical perturbations early. |
| Analytical model | Reductionist; characterizing effects through isolated components. | Systems-level; integrating molecular, clinical, and real-world data. |
| Data modalities | Structured adverse event reports and clinical narratives. | Heterogeneous; multi-modal data from genomics, wearables, and EHRs. |
| Safety threshold | “Natural is safe” perception; delayed identification of rare reactions. | Risk-based credibility; automated “signal-from-noise” detection. |
This shift from reactive to proactive monitoring is operationalized through the deployment of multi-layered “omics” platforms. As detailed in the following section, these technologies offer the high-resolution, molecular-level insights necessary to detect subclinical perturbations significantly before the manifestation of overt clinical symptoms.
“omics” applications in herbal hepatotoxicity
The safety assessment of traditional herbal medicine (THM) has historically been hindered by a “black box” dilemma, where the synergistic interactions of hundreds of chemical constituents render standard toxicology ineffective for isolating causative agents. To overcome this, PV must shift the point of detection “upstream” through the use of multi-omics platforms. These approaches have fundamentally transformed hepatotoxicity research by enabling the early detection of biochemical damage signatures—often manifesting within 5–10 hours of exposure—significantly before traditional aminotransferase markers (ALT/AST) rise or overt clinical symptoms appear. Evidence supporting this molecular shift is robust across multiple high-resolution investigations.39,40
Genomics: Unraveling the plant and the patient
Genomics offers a dual-pronged approach to mitigating HILI risk by focusing on both the quality of the herbal material and the genetic susceptibility of the patient. The integration of herb genomics provides a robust framework for the “accurate identification” of medicinal plant species, underpinned by the development of comprehensive genomic databases derived from global pharmacopoeias. These digital repositories are essential for the molecular authentication of raw materials, ensuring that botanical products meet stringent quality standards. Furthermore, genomic methodologies facilitate “high-quality herb cultivation” by addressing chronic quality control challenges, such as chemical inconsistency and the presence of contaminants, thereby stabilizing the safety profile of herbal medicines at the source.41
Host susceptibility to HILI is mediated by a polygenic architecture encompassing human leukocyte antigen (HLA) polymorphisms and non-HLA genetic variants that function as primary risk modulators. The genetic heterogeneity within cytochrome P450 (CYP450) enzyme systems, particularly at the CYP3A4, CYP2C19, and CYP2D6 loci, represents a key determinant of individual susceptibility through the differential metabolic processing of phytochemical compounds. Polymorphic variants at these loci influence the biotransformation pathways of herbal xenobiotics, determining whether metabolic activation toward reactive intermediates or detoxification predominates. Single nucleotide polymorphisms (SNPs) within these pharmacogenomic loci alter drug exposure kinetics and metabolite profiles, establishing a genotype-dependent predisposition to idiosyncratic hepatotoxicity following herbal product consumption.42
The research literature provides strong evidence supporting the critical role of plant genomics in addressing botanical challenges that confound HILI case assessment. A clinical case study highlights the severe consequences of botanical misidentification, specifically involving the confusion of pyrrolizidine alkaloid (PA)-producing Gynura japonica with non-toxic TCM herbs such as Sedum aizoon L. (Tu-San-Qi) and Panax notoginseng (San-Qi). This frequent misuse stems from overlapping herbal nomenclature, morphological similarities, and shared medicinal indications. The clinical impact of this “black box” authentication failure is substantial, directly contributing to over 2,156 cases of hepatic sinusoidal obstruction syndrome in China between 1980 and 2019.43 Techniques such as DNA barcoding allow for the rapid and unambiguous identification of herbal species, even in processed powder form, thereby helping to secure a sustainable supply of high-quality raw materials with the desired therapeutic traits.44
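As a rough illustration of how barcode-based authentication can flag a substitution such as the Gynura japonica case, the sketch below compares a sample sequence against reference barcodes by percent identity. It is a minimal sketch under stated assumptions: the sequences are invented toy fragments (not real barcodes), the sequences are assumed to be pre-aligned and of equal length, and the 97% identity threshold and the function names are hypothetical choices, not a published standard.

```python
def percent_identity(query, reference):
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(query) != len(reference) or not query:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(query)


def identify_species(query, references, threshold=97.0):
    """Return (species, identity) for the best match, or (None, identity) if below threshold.

    The threshold is an assumed illustrative cut-off, not a regulatory value.
    """
    best_species, best_identity = max(
        ((name, percent_identity(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    if best_identity < threshold:
        return None, best_identity
    return best_species, best_identity


# Toy 20-base fragments, invented for illustration only.
REFERENCES = {
    "Gynura japonica": "ATGCCGTTAGCTAGGCTTAA",
    "Panax notoginseng": "ATGACGTAAGCTTGGCATAA",
}

if __name__ == "__main__":
    print(identify_species("ATGCCGTTAGCTAGGCTTAA", REFERENCES))
```

In practice, barcode matching relies on curated reference libraries and proper sequence alignment; the position-by-position comparison here only conveys the principle that a processed powder can be assigned to a species, or rejected as unidentified, from sequence data alone.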
Transcriptomics: The genetic “early warning sign”
Transcriptomics quantifies mRNA expression in hepatic tissue to detect hepatotoxicity through stress response pathways and can capture interindividual variability in toxicodynamic responses, although the evidence for “exceptional sensitivity” specifically in idiosyncratic hepatotoxicity is more nuanced.45 Recent research has demonstrated that transcriptomics can effectively map interindividual variability by analyzing responses across 50 human hepatocyte donors. This mapping revealed striking differences in stress response activation, up to 864-fold, between individuals. Such findings provide a critical molecular basis for understanding idiosyncratic hepatotoxicity, highlighting how an individual’s unique genetic landscape dictates susceptibility to liver injury.46 Leveraging these inter-individual differences enables “personalized DILI risk analyses”. However, recent perspectives emphasize that the “idiosyncratic nature of DILI complexes its mechanistic studies”, suggesting that detection remains a significant challenge despite the advanced capabilities offered by transcriptomics.47 Transcriptomics has also been shown to detect differentially expressed genes with greater sensitivity than existing models, although the study in question did not specifically address idiosyncratic cases.48
The primary utility of this approach lies in its ability to identify the activation of discrete “toxicity pathways” significantly before traditional clinical markers, such as alanine aminotransferase (ALT) or aspartate aminotransferase (AST), become elevated in the systemic circulation. For instance, in cases of triptolide-induced injury, transcriptomic profiling has revealed substantial downregulation of the Nrf2-mediated antioxidant pathway, evidence that the compound compromises the liver’s endogenous defenses against oxidative stress. By capturing the host’s cellular signaling response to botanical xenobiotics before physical tissue damage occurs, and by mapping interindividual variability in that response, this layer provides a molecular basis for understanding the idiosyncratic nature of HILI.49–52
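To make the notion of an "864-fold" difference between donors concrete, the sketch below shows one plausible way such a figure could be derived: compute each donor's fold induction of a stress-response gene (treated over control expression), then take the ratio between the most and least responsive donors. The expression values and function names are hypothetical, and the cited study may have used a different normalization.

```python
import math


def fold_induction(treated, control):
    """Fold induction of a gene: treated expression divided by control expression."""
    if treated <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    return treated / control


def donor_spread(inductions):
    """Ratio between the most and least responsive donors (a fold difference)."""
    values = list(inductions.values())
    return max(values) / min(values)


if __name__ == "__main__":
    # Invented expression values (arbitrary units) for two donors.
    inductions = {
        "donor_A": fold_induction(treated=20.0, control=10.0),   # 2-fold
        "donor_B": fold_induction(treated=640.0, control=10.0),  # 64-fold
    }
    spread = donor_spread(inductions)
    print(f"between-donor spread: {spread:.0f}-fold (log2 = {math.log2(spread):.1f})")
```

Reporting the spread on a log2 scale is a common convention in expression analysis because it treats up- and down-regulation symmetrically.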
Proteomics: Mapping the mechanism of injury
Proteomics provides a functional window into the state of the liver by characterizing global changes in protein expression and post-translational modifications.53,54 The primary utility of this layer lies in biomarker discovery, identifying specific protein signatures, such as inter-alpha-trypsin inhibitor and serum amyloid P-component, that are released into the systemic circulation during active hepatic stress. Furthermore, proteomics can identify markers of specific organelle dysfunction, such as growth differentiation factor 15 (GDF-15), which serves as a sensitive diagnostic indicator of mitochondrial distress. Unlike transcriptomics, which captures early stress signaling, proteomics identifies the proteins that reflect active apoptosis, inflammation, and HILI pathogenesis.55–57
Metabolomics: The functional readout
Metabolomics represents the final “omics” layer, providing a functional readout of the entire biological system’s response to an herbal intervention. Since metabolites are the end-products of cellular regulatory processes, changes in the metabolome directly reflect the physiological consequences of exposure.58 For instance, spatially resolved metabolomics has been used to visualize changes in metabolic profiles within liver tissue. This approach has successfully identified biomarkers such as taurine, taurocholic acid, and acyl-carnitines, which correspond directly to disorders involving cholestasis, mitochondrial damage, oxidative stress, and energy metabolism.59,60 Liver metabolomics has further elucidated the biochemical drivers of herbal toxicity, particularly in the case of Polygonum multiflorum (PM). Mechanistic investigations have revealed that PM-induced hepatotoxicity is characterized by significant interference with purine metabolism. This metabolic disruption is marked by the up-regulation of xanthosine and xanthine, coupled with the concomitant down-regulation of nucleotidases, providing a molecular fingerprint for the observed liver injury.61 The functional readout capability of metabolomics is further underscored by the identification of specific sphingolipid biomarkers that exhibit greater sensitivity than traditional ALT and AST markers.62 Furthermore, metabolomic profiling has proven effective in isolating idiosyncratically hepatotoxic components and characterizing the associated endogenous metabolic biomarkers.63 These advances allow the detection of subtle biochemical shifts that precede overt clinical injury, facilitating a more nuanced understanding of individual susceptibility to herbal toxicity. Collectively, these studies demonstrate that metabolomic alterations serve as direct functional indicators of disruptions in cellular regulation.
Enrichment analyses of metabolic pathways have revealed significant perturbations in linolenic acid metabolism, carnitine synthesis, and branched-chain fatty acid oxidation, each corresponding to specific molecular mechanisms of toxicity. This high-resolution mapping allows complex metabolic shifts to be translated into actionable toxicological insights.64 As the final integrative “omics” layer, metabolomics provides a dynamic map of the host’s biochemical response to herbal intervention. Rather than characterizing the herb itself, this layer quantifies the “downstream” physiological consequences—such as perturbations in bile acid synthesis, lipid peroxidation, and amino acid metabolism—that signal early-stage hepatic stress. In the case of Polygonum multiflorum, metabolomic profiling distinguishes therapeutic tolerance from active injury by identifying specific endogenous biomarkers such as hypoxanthine and taurochenodesoxycholic acid in patient biofluids.65
Collectively, these “omics” layers offer the resolution needed to bridge the gap between traditional herbal wisdom and modern precision safety. By combining genomic data on patient susceptibility and plant quality, proteomic insights into mechanistic injury pathways, and metabolomic functional readouts, the path toward a safer, more evidence-based practice of integrative liver health becomes clearer.
Recent research has employed a multi-omics strategy to pinpoint specific genes linked to lipid metabolism abnormalities and oxidative stress in cases of dictamnine-induced hepatotoxicity.66 A multi-omics investigation also identified 19 differential metabolites associated with altered bile acid biosynthesis pathways in cases of herbal extract toxicity.67 Research on the hepatotoxicity of Fructus Psoraleae has revealed 575 significantly altered proteins and 14 key indicators, including glutathione and bile acids.68 Studies have shown that combined metabolomic and transcriptomic analyses can identify markers of liver injury in plasma and urine within 5–10 hours after toxicant exposure. This represents a notable advance in early detection, as these “omics” signals appear considerably earlier than traditional aminotransferase markers. Research has further identified specific metabolite biomarkers that facilitate the early detection of liver injury induced by herbal compounds, corroborating the utility of metabolomic approaches in detecting toxicity before traditional clinical markers become apparent.69,70 Table 2 summarizes the technical specificities of these layers and their functional roles in bridging the gap between traditional wisdom and modern precision safety.
Table 2. Functional roles of multi-omics layers in decoding the molecular signatures of herbal hepatotoxicity
| “omics” layer | Functional safety goal | Key molecular targets and mechanisms | Clinical utility & diagnostics |
|---|---|---|---|
| Genomics | Botanical authentication and host risk stratification | Whole-Genome Sequencing (WGS), DNA Barcoding, GWAS | Ensures precise species identification to prevent toxic substitution and identifies human genetic markers (e.g., HLA-B*35:01, CYP450 SNPs) that predispose individuals to injury. |
| Transcriptomics | Early detection of subclinical stress responses | RNA-Seq, Microarrays | Quantifies mRNA expression to detect the activation of “toxicity pathways” significantly before traditional clinical markers like ALT or AST become elevated. |
| Proteomics | Discovery of organ-specific injury biomarkers | LC-MS/MS, 2D-PAGE | Characterizes changes in protein expression and post-translational modifications to identify mechanistic markers of mitochondrial stress, such as GDF-15. |
| Metabolomics | Functional readout of the biological system | NMR Spectroscopy, LC-HRMS, UPLC-Q-TOF-MS | Detects metabolic disturbances and toxic metabolites in biological fluids, providing a final “molecular fingerprint” of the liver’s metabolic state. |
AI-powered herbal PV
AI and machine learning (ML) serve as the indispensable analytical bridge required to synthesize the high-dimensional complexity of “omics” data with clinical outcomes.71 This integrated workflow, spanning botanical and patient-specific inputs to AI-driven risk stratification, is conceptualized in Figure 1.
To move beyond the “black box” of traditional toxicology and systematically identify the molecular drivers of liver injury, each layer of the “omics” hierarchy is deployed for a distinct safety purpose. As illustrated in the high-dimensional input layer of the integrated “omics”-AI paradigm (Fig. 1), these technologies provide the raw biological data required for predictive modeling. The specific mapping of these technologies—genomics, transcriptomics, proteomics, and metabolomics—to their respective safety evaluation objectives is detailed in Table 2.
Predictive toxicology and personalized risk assessment
The efficacy of the integrated “omics”-AI paradigm is substantiated by the high-resolution performance of diverse computational architectures in toxicity modeling. Extensive evaluations across large-scale datasets (comprising 1,200 to 2,500 compounds) demonstrate that ML models—specifically support vector machines, random forests, and deep belief networks—achieve predictive accuracies ranging from 76.7% to 83.8%. Furthermore, integrated multi-omics classifiers have demonstrated substantial diagnostic precision in early-detection tasks, with area under the curve (AUC) values consistently ranging from 0.81 to 0.87. These metrics, coupled with ensemble approaches achieving recall rates as high as 93%, underscore the reliability of AI-driven frameworks in managing the high-dimensional data characteristic of both herbal and synthetic toxicological assessments.72–76
These models use calculated molecular descriptors, such as lipophilicity and specific molecular fingerprints, to accelerate early-stage toxicity screening. For complex TCM mixtures, analytical descriptor-based ML can predict toxicity, while molecular docking simulations can explore interactions with key protein targets, such as hepatic CYP450 3A4.77,78
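Fingerprint-based screening of the kind mentioned above often rests on a similarity measure between compounds. The sketch below uses Tanimoto similarity on sets of "on" fingerprint bits and assigns a query compound the label of its most similar known compound. It is only an illustration: the bit sets and labels are invented, the nearest-neighbour rule is a deliberately simple stand-in for the support vector machine, random forest, and deep belief network models cited in the text, and real fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor_label(query, labeled):
    """Return (label, similarity) of the labeled fingerprint most similar to the query."""
    fingerprint, label = max(labeled, key=lambda item: tanimoto(query, item[0]))
    return label, tanimoto(query, fingerprint)


if __name__ == "__main__":
    # Invented bit positions and labels, for illustration only.
    known = [
        (frozenset({1, 4, 7, 8}), "hepatotoxic"),
        (frozenset({2, 3, 5}), "non-hepatotoxic"),
    ]
    print(nearest_neighbor_label(frozenset({1, 4, 7, 9}), known))
```

A similarity score close to 1 suggests the query shares most structural features with a known compound; low scores for every reference compound indicate that the prediction falls outside the model's domain of applicability.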
AI-driven multi-omics integration significantly improves clinical prediction accuracy and patient stratification compared with traditional approaches, as supported by multiple recent studies. Integrated multi-omics classifiers can achieve AUC values ranging from 0.81 to 0.87 in early-detection tasks, underscoring their substantial predictive capability.79 AI is also recognized as one of the most efficient tools for processing and interpreting multi-omics datasets to generate individualized patient insights: it facilitates data integration, elucidates comprehensive molecular disease pathways, and considerably improves the identification of precise diagnostic and prognostic biomarkers.
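The performance figures quoted in this section (accuracy, recall, and AUC) measure different things, and a small worked example may help in reading them. The sketch below computes all three from scratch for a toy set of predictions; the labels and scores are invented, and the AUC is calculated with the rank-based (Mann-Whitney) formulation, i.e., the probability that a randomly chosen toxic compound receives a higher score than a randomly chosen non-toxic one.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of truly positive (toxic) cases that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented labels (1 = hepatotoxic) and model scores, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold
    print(accuracy(y_true, y_pred), recall(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and recall depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the three metrics can diverge for the same model.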
Data integration, signal detection, and pattern analysis
AI/ML can automate adverse signal detection from diverse data sources, although evidence that this overcomes underreporting remains limited. Salas et al.80 reviewed 66 articles and found that 57.6% focused on identifying ADEs/ADRs through automated processes, concluding that “automation and ML models can optimize PV processes”. Hu et al.81 demonstrated strong ML performance in predicting ADEs from electronic health records, with an average AUC of 76.68% across 59 studies. However, critical gaps remain. Salas et al.80 noted that “more research is needed to identify if this optimization has an impact on the quality of safety analyses”. Importantly, while García-Abeijon et al.82 and Costa et al.83 extensively document the causes of underreporting, neither source directly evaluates whether AI/ML solutions mitigate this problem in practice.
AI is used for the automated extraction and analysis of adverse event data from sources including spontaneous safety reports, electronic health records (EHRs), and unstructured data such as social media posts, often leveraging natural language processing (NLP) techniques. Salas et al.80 found that among 66 reviewed articles, 21.2% focused on processing safety reports and 57.6% on identifying ADEs/ADRs through automated processes. Kim et al.84 analyzed 72 articles and found that electronic medical record data were “exclusively analyzed using the regression method”, whereas FDA Adverse Event Reporting System data were analyzed using disproportionality methods. For social media and NLP applications, Painter et al.85 reported that real-world data and social media comprised 63% (21/33) of industry-based PV papers. Pilipiec et al.86 specifically reviewed NLP applications and found that 14 of 16 publications “reported positive findings with respect to the identification of adverse drug reactions” from user-generated textual content, concluding that “natural language processing can be used effectively and accurately”. Systematic reviews of EHR-based ML applications demonstrate robust performance across multiple cohorts, with predictive models achieving a mean AUC of 76.68%. Furthermore, the integration of advanced signal detection methodologies within spontaneous reporting systems has validated the efficacy of automated analysis in clinical PV. These findings underscore a significant shift toward high-throughput, data-driven surveillance that enhances the identification of adverse events compared with manual retrospective review.87,88 In addition, various ML techniques, such as lasso shrinkage regression, Bayesian borrowing algorithms, and temporal scan statistics, are applicable to signal detection.
This integrated approach allows AI to identify subtle syndromic patterns and trends in HILI cases that might be missed by manual review, thereby enhancing proactive PV. Recent evidence demonstrates that ML architectures based on gradient boosting can accelerate PV by detecting confirmed safety signals up to six months earlier than conventional human-led analysis. In comparative evaluations across two drug products, these models maintained sensitivity rates ranging from 50% to 55.6%, highlighting their potential to improve the timeliness of toxicity surveillance and early-stage risk mitigation.89 The application of Bayesian networks in PV centers has also markedly enhanced causality assessment processes, reducing processing times from many days to only a few hours. This shift to probabilistic modeling enables faster and more effective handling of safety signals, thereby improving real-time surveillance of adverse drug feedback.90
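The disproportionality methods mentioned above for spontaneous reporting data can be illustrated with the two standard measures, the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), both computed from a 2×2 table of report counts. The sketch below is a minimal illustration: the counts are invented, the function names are hypothetical, and real signal detection adds minimum case counts and other safeguards not shown here.

```python
import math


def prr(a, b, c, d):
    """Proportional reporting ratio.

    a: reports with the product and the event
    b: reports with the product and other events
    c: reports with other products and the event
    d: reports with other products and other events
    """
    return (a / (a + b)) / (c / (c + d))


def ror_with_ci(a, b, c, d, z=1.96):
    """Reporting odds ratio with an approximate 95% confidence interval."""
    ror = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(ROR)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


if __name__ == "__main__":
    # Invented counts: liver injury reports for one herbal product vs. all others.
    a, b, c, d = 20, 80, 100, 9800
    print(f"PRR = {prr(a, b, c, d):.1f}")
    ror, lower, upper = ror_with_ci(a, b, c, d)
    print(f"ROR = {ror:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

A ratio well above 1, with a confidence interval whose lower bound also exceeds 1, indicates that the event is reported disproportionately often for the product; it flags a hypothesis for review rather than establishing causality.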
AI-driven quality control and fraud prevention
AI algorithms contribute directly to mitigating some of the primary causes of HILI, including poor herbal product quality, misidentification, and adulteration.91 The convergence of advanced analytical platforms—including chromatography, spectroscopy, and mass spectrometry—with chemometric modeling provides a robust framework for comprehensive herbal authentication. This integrated approach facilitates the high-resolution detection of variations in geographical origin, processing methodologies, and deliberate adulteration, thereby helping to ensure the chemical integrity of herbal products through multifaceted data analysis.92 Evidence indicates that the most successful approach for detecting herbal adulteration is a carefully planned combination of analytical and molecular methods. The simultaneous use of DNA barcoding, DNA metabarcoding, mass spectrometry, and high-performance liquid chromatography offers a comprehensive analytical framework. This combined approach enables high-resolution authentication by leveraging the specificity of genetic sequencing together with the sensitivity of chemical profiling to maintain the integrity of herbal products.93 Nevertheless, despite the revolutionary promise of these technologies, their translation into clinical and regulatory practice continues to be hindered by persistent obstacles. Significant constraints include inconsistencies in data quality, limited model interpretability, and a complex process for obtaining formal regulatory approval. Moreover, the generalizability of these predictive models is often undermined by limited sample sizes and substantial regional disparities in both herbal chemical composition and human genetic variation.94
Clinical and integrative applications
Evidence increasingly supports the capacity of multi-omics platforms to translate complex molecular data into actionable clinical insights, thereby facilitating the development of personalized therapeutic strategies. The integration of pharmacogenetics into precision PV allows treatments to be tailored according to individual genetic profiles, while AI-driven multi-omics integration elucidates the critical gene-gene and gene-environment interactions that determine therapeutic outcomes. This holistic approach enables the optimization of treatment efficacy and the mitigation of adverse effects through more precise host stratification.95,96
However, while omics technologies have been applied to investigate the mechanisms of traditional medicine across various pathologies,97 evidence for their specific application in validating the gut-liver axis remains limited. Although recent frameworks have begun to integrate nutrigenomics and microbiome data with pharmacogenomics (elements inherently relevant to gut-liver interactions), explicit validation of this axis through these technologies has not yet been fully established in the literature.98 Consequently, although the transformative potential of omics in traditional medicine is well supported, specific empirical validation of the gut-liver axis remains a research gap.99
While AI frameworks demonstrate significant potential for assessing herb-drug interactions (HDIs) by providing mechanistic insights, empirical evidence for specific clinical scenarios remains limited in the current literature. Computational models are capable of processing high-dimensional datasets—encompassing chemical structures, pharmacological properties, molecular pathways, and established interaction patterns—to predict potential interference.
However, the inherent complexity of herbal products, characterized by multi-constituent mixtures with frequently ill-defined pharmacological profiles, presents a substantial challenge for predictive modeling. This data scarcity, combined with chemical heterogeneity, currently limits the precision of AI in characterizing specific HDI outcomes, necessitating further integration of standardized botanical data into existing algorithmic frameworks.
Case studies
The translation of multi-omics and AI outputs into actionable clinical insights is best illustrated through the following clinical scenarios:
Scenario I: Pre-prescription risk stratification
In a personalized medicine clinic, a patient’s CYP450 polymorphic profile (e.g., CYP2D6 or CYP3A4) is screened before a high-risk botanical extract is prescribed. If the patient is identified as a “poor metabolizer”, the AI-driven system flags a potential risk of idiosyncratic HILI, allowing the clinician to either adjust the dosage or select a safer alternative therapy.
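The decision logic of Scenario I can be sketched as a simple rule-based check. This is a minimal illustration, not a validated clinical tool: the extract name, the pathway lookup table, the phenotype labels, and the function name are all hypothetical, and a real system would draw on curated pharmacogenomic knowledge rather than a hard-coded dictionary.

```python
from dataclasses import dataclass, field

# Hypothetical lookup: enzymes assumed to metabolize each extract (illustrative only).
EXTRACT_PATHWAYS = {
    "extract_X": {"CYP2D6", "CYP3A4"},
}


@dataclass
class Patient:
    patient_id: str
    # Maps enzyme name to a phenotype label, e.g. "poor metabolizer".
    phenotypes: dict = field(default_factory=dict)


def flag_hili_risk(patient, extract):
    """Flag the prescription if the patient is a poor metabolizer for any
    enzyme assumed to be involved in clearing the extract."""
    pathways = EXTRACT_PATHWAYS.get(extract, set())
    impaired = sorted(
        enzyme for enzyme in pathways
        if patient.phenotypes.get(enzyme) == "poor metabolizer"
    )
    if impaired:
        action = "review: adjust dosage or select an alternative therapy"
    else:
        action = "no pharmacogenomic flag"
    return {"flag": bool(impaired), "enzymes": impaired, "action": action}


at_risk = Patient("P001", {"CYP2D6": "poor metabolizer", "CYP3A4": "normal metabolizer"})
typical = Patient("P002", {"CYP2D6": "normal metabolizer", "CYP3A4": "normal metabolizer"})
print(flag_hili_risk(at_risk, "extract_X"))
print(flag_hili_risk(typical, "extract_X"))
```

The output is deliberately advisory, so the final choice between dose adjustment and an alternative therapy stays with the clinician, as in the scenario.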
Scenario II: Early signal detection via wearable integration
Longitudinal monitoring through EHRs and wearable data allows the early identification of subtle transitions from health to disease. Transformer architectures can detect a “creeping” rise in mitochondrial stress markers, such as GDF-15, weeks before a patient develops jaundice or elevated ALT/AST levels, enabling a proactive “stop-treatment” order to prevent serious hepatic injury.
Scenario III: Managing herb-drug interactions
For patients on multi-drug regimens, AI frameworks process high-dimensional datasets to predict interference between conventional drugs and complex herbal matrices. By analyzing the metabolic footprint of both agents, the system provides mechanistic insights that help clinicians manage the risks associated with the gut-liver axis and concurrent pharmaceutical use.
Personalized medicine clinics and risk stratification
Leveraging the pharmacogenomic foundations of metabolic heterogeneity, personalized medicine approaches are investigating pre-prescription genotyping as a potential risk-stratification framework for HILI prevention. For herbal compounds metabolized through polymorphic pathways with established genetic associations, patient-specific genomic profiling may eventually enable individualized risk assessment beyond conventional population-based prescribing practices. Advanced computational approaches, including AI and ML algorithms, represent promising tools for integrating multi-dimensional datasets encompassing genetic polymorphisms, demographic variables, and environmental exposures into comprehensive risk prediction models, thereby enhancing the precision of clinical decision-making.100
Integration into conventional healthcare and safeguards
The successful incorporation of herbal medicine into conventional healthcare settings demands the establishment of appropriate safeguards rooted in evidence-based rigor. Srisittiratkul et al.101 explicitly state that “comprehensive control of herb, disease, and patient factors is crucial for improving herbal therapy outcomes and that this multivariable approach will facilitate the successful integration of herbal medicine into modern evidence-based healthcare”. Recent perspectives emphasize the critical need for robust PV frameworks specifically tailored to herbal medicines. To address the global complexities of herbal safety, there is strong advocacy for the implementation of a collaborative international approach. Such a harmonized, multi-jurisdictional strategy is considered essential for enhancing safety monitoring, standardizing reporting, and facilitating the global exchange of toxicological data.102 Observational studies have emerged as a valuable alternative methodology for assessing HDIs in real-world clinical settings. This approach facilitates the collection of high-quality, real-world evidence that captures the complexities of patient behavior and multi-herb use—variables often excluded from controlled experimental settings. The integration of such observational data is increasingly recognized as a necessary prerequisite for establishing comprehensive safety profiles and informed clinical guidelines for herbal medicine.103
The integration of multi-omics data with EHRs provides a transformative framework for the realization of precision medicine. This synergy enables a more comprehensive understanding of complex biological systems by bridging high-resolution molecular profiles with longitudinal clinical phenotypes. Such multi-dimensional data integration allows for the identification of subtle transitions from health to disease and supports highly personalized clinical decision-making.104 The integration of high-dimensional healthcare data necessitates a rigorous focus on ethical governance, particularly with respect to privacy, confidentiality, and data autonomy. To mitigate the risks associated with the synthesis of multi-omics and clinical data, the implementation of advanced technical safeguards, such as differential privacy and robust encryption protocols, is essential. The development of comprehensive ethical frameworks is critical to ensuring that the transition toward data-driven precision medicine maintains patient trust while protecting sensitive biological and longitudinal health information.105
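As an illustration of the differential privacy safeguard named above, the following sketch applies the Laplace mechanism to a counting query. It is a minimal example under stated assumptions: the record layout, the function names, and the privacy budget `epsilon` are hypothetical, and a production system would rely on an audited privacy library rather than hand-written noise sampling.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Release a count with epsilon-differential privacy.

    Adding or removing one patient changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical cohort: 37 of 1,000 patients carry a risk variant and had a HILI event.
cohort = [{"variant": i < 37, "hili": i < 37} for i in range(1000)]
noisy = private_count(
    cohort, lambda r: r["variant"] and r["hili"], epsilon=1.0, rng=random.Random(0)
)
print(round(noisy, 2))
```

Smaller values of `epsilon` add more noise and give stronger privacy, so the budget has to be weighed against the accuracy that safety surveillance requires.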
For real-world evidence to influence regulatory decision-making, rigorous validation and smooth integration into existing healthcare infrastructures are imperative. Regulatory bodies such as the European Medicines Agency (EMA) are actively defining the acceptability and use of real-world evidence derived from non-traditional sources for safety signal evaluation. Moreover, fostering global collaboration among bodies like the World Health Organization (WHO), the FDA, and the EMA remains vital for standardizing PV reporting, sharing crucial safety information, and harmonizing regulatory expectations for these novel data types internationally.106
The operational pipeline: Bridging high-dimensional data and clinical action
To operationalize the paradigm illustrated in Figure 1, the “Processing” phase is executed through a four-stage analytical pipeline. This systematic approach ensures that high-dimensional biological data are accurately translated into actionable clinical insights107:
Stage I: High-dimensional data acquisition
The pipeline begins with the synchronous capture of multi-layered biological and chemical data. Unlike conventional pharmacovigilance, this stage requires high-resolution characterization of both the host and the intervention.
Host genomics
Mapping SNPs to identify genetic predispositions to HILI or nephrotoxicity.
Transcriptomics
Quantifying mRNA expression to detect cellular stress signatures before clinical symptoms manifest.
Phytochemical fingerprinting
Advanced analytical methodologies, including high-performance liquid chromatography-mass spectrometry and nuclear magnetic resonance (NMR) spectroscopy, can be employed to generate comprehensive phytochemical profiles that serve as molecular fingerprints for botanical preparations. This analytical approach prioritizes the systematic identification and quantification of bioactive constituents, potentially hepatotoxic compounds, and interactive phytochemical complexes within raw herbal materials, thereby establishing quality assurance protocols prior to human consumption.108
Stage II: Data harmonization and normalization
Heterogeneity is the primary obstacle in bioinformatic integration. Stage II employs advanced AI-based preprocessing to normalize disparate datasets. This involves converting qualitative reports of traditional herbal use, quantitative chemical concentrations, and categorical genomic variants into a unified, machine-readable format. AI-driven imputation techniques are used at this stage to address missing values in EHRs, ensuring that the metadata remains robust across diverse population cohorts.109
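The harmonization step described for Stage II can be sketched as follows. The example converts a qualitative report of herbal use, a quantitative constituent concentration, and a categorical genotype into a single numeric record, and fills a missing EHR value. The field names and the ordinal coding are hypothetical, and simple median imputation stands in here for the AI-driven imputation techniques the text refers to.

```python
from statistics import mean, median, pstdev

# Hypothetical ordinal coding for qualitative reports of herbal use.
USE_SCALE = {"rarely": 0, "weekly": 1, "daily": 2}


def harmonize(records):
    """Turn mixed qualitative, quantitative, and categorical fields into
    numeric rows that share one schema."""
    concentrations = [r["conc_mg_g"] for r in records]
    mu = mean(concentrations)
    sigma = pstdev(concentrations) or 1.0  # avoid division by zero
    # Median imputation: a simple stand-in for model-based imputation.
    alt_median = median(r["alt"] for r in records if r["alt"] is not None)
    genotypes = sorted({r["genotype"] for r in records})

    rows = []
    for r in records:
        row = {
            "use_level": USE_SCALE[r["use"]],
            "conc_z": (r["conc_mg_g"] - mu) / sigma,
            "alt": r["alt"] if r["alt"] is not None else alt_median,
            "alt_imputed": r["alt"] is None,  # keep track of imputed values
        }
        for g in genotypes:  # one-hot encode the categorical genotype
            row[f"genotype_{g}"] = 1 if r["genotype"] == g else 0
        rows.append(row)
    return rows


raw = [
    {"use": "daily", "conc_mg_g": 2.0, "genotype": "PM", "alt": 30.0},
    {"use": "weekly", "conc_mg_g": 4.0, "genotype": "NM", "alt": None},
    {"use": "rarely", "conc_mg_g": 6.0, "genotype": "NM", "alt": 50.0},
]
harmonized = harmonize(raw)
for row in harmonized:
    print(row)
```

Recording which values were imputed, as the `alt_imputed` field does, lets downstream models and reviewers treat filled-in values with appropriate caution.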
Stage III: Algorithmic synthesis and signal detection
At the core of the pipeline is the computational fusion of multi-omics layers with longitudinal clinical data, using advanced deep learning architectures to model complex, non-linear toxicological relationships.
Relational modeling via graph neural networks
To address the “interactome” of herbal medicine, graph neural networks (specifically graph convolutional networks) propagate information across knowledge graphs in which nodes represent botanical constituents and biological pathways. This allows the system to predict HILI by “borrowing” relational insights from chemically similar toxicophores even when direct clinical data are sparse.110
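The way a graph convolutional layer lets a node “borrow” information from its neighbors can be shown with a single propagation step. The sketch below uses the standard symmetric normalization of the adjacency matrix with self-loops, followed by a weight matrix and a ReLU. The four-node graph, the initial toxicity feature, and the identity weight are assumptions made for illustration; a real model would learn its weights from data.

```python
import math


def normalized_adjacency(edges, n):
    """Return D^(-1/2) (A + I) D^(-1/2) for an undirected graph with n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loop so each node keeps part of its own signal
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(x, y):
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in z]


# Hypothetical graph: node 0 = constituent with a known hepatotoxicity signal,
# node 1 = chemically similar constituent without clinical data,
# node 2 = shared biological pathway, node 3 = unrelated constituent.
edges = [(0, 1), (1, 2)]
features = [[1.0], [0.0], [0.0], [0.0]]  # only node 0 carries evidence
weights = [[1.0]]                        # identity weight, not a trained value

a_hat = normalized_adjacency(edges, n=4)
updated = gcn_layer(a_hat, features, weights)
print(updated)
```

After one step, the data-poor node 1 receives part of the signal from node 0, while the unconnected node 3 stays at zero. Stacking further layers would carry the signal on to the pathway node.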
Temporal surveillance via transformers
Because pharmacovigilance is inherently longitudinal, transformer architectures are employed to analyze EHRs. By using attention mechanisms, the pipeline weighs the significance of clinical events over time, identifying subtle, gradual elevations in liver enzymes that precede overt clinical symptoms.111
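The attention mechanism referred to above can be reduced to a few lines for a single query. The sketch computes scaled dot-product attention weights over a short series of clinic visits, using the most recent visit as the query. The two visit features and their values are hypothetical, and the raw features are used directly as queries, keys, and values, whereas a trained transformer would first apply learned projections.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys
    ]
    weights = softmax(scores)
    context = [
        sum(w * value[j] for w, value in zip(weights, values))
        for j in range(len(values[0]))
    ]
    return weights, context


# Hypothetical visits: [ALT as a multiple of the upper limit of normal,
#                       change since the previous visit]
visits = [[0.8, 0.0], [0.9, 0.1], [1.1, 0.2], [1.4, 0.3]]
weights, context = attention(query=visits[-1], keys=visits, values=visits)
print([round(w, 3) for w in weights])
```

The weights form a distribution over past visits, which is what allows a model of this kind to emphasize the visits most relevant to a gradual enzyme rise instead of treating every time point equally.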
Uncertainty quantification via Bayesian neural networks
To ensure clinical interpretability, Bayesian neural networks (BNNs) assign a probability distribution to each prediction. If a detected signal is associated with noisy data, the BNN assigns a high “uncertainty score”, flagging the case for manual clinical review or further in vitro validation rather than definitive ADR classification.112
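The triage rule described above can be sketched by treating a set of sampled predictions as draws from the model's predictive distribution. In this minimal illustration the samples are supplied by hand, where a Bayesian neural network would generate them by sampling its weights, and both thresholds are hypothetical values rather than validated cut-offs.

```python
from statistics import mean, pstdev


def triage(samples, prob_threshold=0.5, uncertainty_threshold=0.15):
    """Route a candidate signal by predicted probability and its spread.

    samples: predicted probabilities of an ADR from repeated stochastic
    forward passes (assumed to come from a Bayesian model).
    """
    probability = mean(samples)
    uncertainty = pstdev(samples)  # spread of the draws as the uncertainty score
    if uncertainty > uncertainty_threshold:
        decision = "manual review"  # too uncertain for automatic classification
    elif probability >= prob_threshold:
        decision = "signal"
    else:
        decision = "no signal"
    return {
        "mean_prob": probability,
        "uncertainty": uncertainty,
        "decision": decision,
    }


confident = triage([0.90, 0.92, 0.88, 0.91])
noisy_case = triage([0.10, 0.90, 0.40, 0.80])
negative = triage([0.05, 0.04, 0.06, 0.05])
print(confident["decision"], noisy_case["decision"], negative["decision"])
```

The uncertainty check runs before the probability check, so a noisy case is sent for manual review regardless of its mean prediction, in line with the text's preference for review over a definitive ADR classification.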
Stage IV: Signal benchmarking and clinical validation
To ensure that AI-detected signatures are not merely correlations, the final stage subjects all signals to rigorous benchmarking. This involves validating molecular findings against established clinical causality tools, specifically RUCAM. This calibration ensures that the pipeline’s outputs align with the gold standard of hepatotoxicity assessment, providing clinicians with high-confidence evidence to inform therapeutic decisions and regulatory actions.
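Benchmarking against RUCAM can be sketched as a comparison between AI flags and RUCAM causality grades. The example uses the published RUCAM score bands (0 or below: excluded; 1 to 2: unlikely; 3 to 5: possible; 6 to 8: probable; 9 or above: highly probable). Treating “probable” or higher as the reference standard, along with the case data and function names, is an assumption made for illustration.

```python
def rucam_category(score):
    """Map a RUCAM total score to its causality grade."""
    if score <= 0:
        return "excluded"
    if score <= 2:
        return "unlikely"
    if score <= 5:
        return "possible"
    if score <= 8:
        return "probable"
    return "highly probable"


def benchmark(cases, min_score=6):
    """Compare AI flags with RUCAM-confirmed cases.

    cases: list of (ai_flag, rucam_score) pairs.
    min_score: lowest RUCAM score counted as confirmed (assumed: probable).
    """
    tp = fp = fn = tn = 0
    for ai_flag, score in cases:
        confirmed = score >= min_score
        if ai_flag and confirmed:
            tp += 1
        elif ai_flag:
            fp += 1
        elif confirmed:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    ppv = tp / (tp + fp) if (tp + fp) else None
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "sensitivity": sensitivity, "ppv": ppv}


# Hypothetical cases: (flagged by the pipeline?, RUCAM score)
cases = [(True, 8), (True, 3), (False, 9), (False, 1), (True, 7)]
report = benchmark(cases)
print(report)
print([rucam_category(score) for _, score in cases])
```

Reporting sensitivity and positive predictive value against a fixed RUCAM threshold makes it explicit how often the pipeline's signals agree with the established causality standard.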
In Stage III, the transition from data harmonization to algorithmic synthesis requires architectures capable of handling the inherent non-linearity and high dimensionality of biological systems. Rather than relying on simple linear regressions, the pipeline employs advanced deep learning and relational modeling to uncover hidden toxicological signals.113
Overcoming technical and systemic bottlenecks
While the integration of “omics” and AI provides a revolutionary foundation for herbal PV, the trajectory of molecular science points toward even finer resolution and more robust security measures to overcome existing systemic barriers. Despite the clear advantages of this framework, its global implementation is currently hindered by several critical technical and regulatory bottlenecks.
Technical and systemic bottlenecks
The high-resolution promise of this paradigm is frequently limited by significant methodological gaps. Current multi-omics studies often rely on bulk tissue analysis, which provides an average signal across millions of cells and may mask critical, localized events occurring within specific cell populations, such as hepatocytes or Kupffer cells. This lack of resolution is compounded by the absence of standardized isolation methods and an incomplete understanding of complex biosynthetic pathways, which restrict the reproducibility and comparability of results across laboratories worldwide.114
Furthermore, analytical gaps remain a primary hurdle; conventional biomarkers like ALT and AST lack the sensitivity to distinguish HILI from other liver diseases or to predict progression to serious injury. Addressing these deficits often requires advanced, yet frequently inaccessible, technologies such as inductively coupled plasma mass spectrometry for detecting heavy metal contaminants. These resource disparities mean that many national monitoring centers lack the specialized technical expertise and laboratory access required to differentiate intrinsic pharmacological toxicity from extrinsic product defects. Finally, significant ethical concerns persist regarding data privacy, as integrating sensitive genomic identifiers with EHRs demands stringent protections for patient autonomy and confidentiality.115
Future directions
To maximize the safety and therapeutic potential of herbal medicine, future efforts must focus on deepening mechanistic insight and ensuring data integrity through the following frontiers:
High-resolution “omics”: Single-cell and spatial insights
Current multi-omics studies often rely on bulk tissue analysis, which provides an average signal across millions of cells and may mask critical events occurring within specific cell types, such as hepatocytes or Kupffer cells, during the early stages of HILI. The next frontier involves high-resolution technologies: single-cell and spatial “omics”.
Single-cell sequencing promises to dissect the molecular signatures of injury at the level of individual cells, identifying which specific cell populations are most vulnerable to toxic compounds and tracking the precise progression of damage. Simultaneously, spatial “omics” preserves the location of molecular changes within liver tissue, offering contextual data essential for understanding localized injury patterns. For instance, spatially resolved metabolomics, when combined with network toxicology, has already been applied to investigate the hepatotoxic mechanisms of herbs like Polygonum multiflorum (He-Shou-Wu), providing a detailed theoretical foundation for understanding molecular damage linked to cholestasis, mitochondrial injury, and lipid metabolism disorders. Continued development and standardization of platforms for spatial molecular analysis are essential to transition these powerful techniques from research into clinical and regulatory applications.117
Ethical governance and blockchain integration
While blockchain technology provides a decentralized ledger for securing “omics” data and preventing product adulteration, its clinical implementation faces significant ethical and regulatory hurdles.
Informed consent in the “omics” era
The transition to personalized medicine necessitates a shift from traditional “broad consent” to dynamic, tiered consent models. Because genomic data is a permanent biological identifier, informed consent must explicitly address the long-term risks of re-identification and the specific protocols for data withdrawal in a decentralized blockchain environment.118,119
Ethical review and data sharing agreements
The integration of multi-omics data with EHRs requires rigorous oversight by institutional review boards to ensure that data sharing agreements maintain strict domain isolation. These agreements must balance the “Open Data” mandate required for global research collaboration with the need to protect sensitive biological information through advanced encryption and differential privacy.120
A mandate for continued research and global collaboration
The successful advancement of “omics”-guided PV relies on a sustained commitment to collaborative and open science. Natural product research faces several limitations, including the lack of standardized isolation methods and a full understanding of complex biosynthetic pathways, which restrict the comparability and reproducibility of results.
Therefore, efforts must be directed toward:
Open data initiatives
Supporting global, open-source platforms like the Omics Discovery Index, which integrate and disseminate multi-omics datasets (proteomics, genomics, transcriptomics, and metabolomics) to foster shared knowledge and accelerate scientific discovery. Other open-source databases also exist specifically for natural products, further supporting end-user needs and data sharing.
Integrative expertise
Implementing a holistic approach that responsibly blends the rigorous evidence of modern science with the centuries of empirical knowledge preserved in traditional healing systems.
Harmonized regulatory frameworks
Strengthening global collaboration among regulatory bodies such as the WHO, the FDA, and the EMA. This partnership is crucial for standardizing PV reporting and harmonizing regulatory expectations regarding the novel evidence generated by “omics” and AI, ultimately ensuring safer and more effective herbal interventions worldwide.