Introduction
The breast plays a central role in female reproduction and a more limited one in men. Its structure consists primarily of epithelium and stroma, with the terminal duct-lobular unit (TDLU) as the essential component of the epithelial structure. The TDLU performs both secretory and collection functions during lactation.1 This unit is critical in adaptive responses to physiological demands; it is also where most pathological lesions originate, arising from its epithelial cells.2 Lesions are typically classified as benign or malignant, with benign conditions predominantly comprising inflammatory lesions, epithelial hyperplasia, adenomas, fibrocystic changes, and fibroadenomas. Conventional diagnosis relies primarily on histopathology of breast tissue and is limited by subjective interpretation and variability among pathologists.3,4 Consequently, objective methods are needed to improve diagnostic accuracy and reliability.
Fourier-transform infrared (FTIR) spectroscopy is a technique that involves exposing a sample to infrared light while measuring the energy that either passes through or reflects from the sample as a function of wavenumbers.5,6 During this process, infrared-active functional groups within biomolecules absorb the energy from the incoming light.3 Studies by Elshemey et al.4 and Luo et al.5 highlight FTIR spectroscopy’s sensitivity to various biomolecules, enabling it to generate biochemical signatures that differentiate proteins, lipids, and nucleic acids based on the distinct vibration energy requirements of their functional groups.4–6 Characteristic features like peak shapes, heights, and ratios are utilized for both qualitative and quantitative assessments of the samples.5–7
Consequently, FTIR spectroscopy emerges as a powerful diagnostic tool, with clinical applications extending beyond screening and diagnosis to the ongoing monitoring of treatment responses. Its use has been documented in the analysis of diverse clinical samples, including formalin-fixed paraffin-embedded tissues, in disorders such as diabetes and gastric cancer,8,9 neurodegenerative diseases such as Alzheimer’s,10 atherosclerosis,11 and breast cancer.12
Among FTIR techniques, attenuated total reflectance (ATR) spectroscopy is particularly prevalent. It typically employs a diamond crystal as the internal reflection element and gathers data from the layer of the sample adjacent to its surface.11,12 Its higher signal-to-noise ratio and reduced scattering make ATR efficient for assessing biological material, and it requires minimal sample preparation because the infrared penetration depth is unaffected by sample thickness.
While FTIR spectroscopy is increasingly utilized in Nigeria and Sub-Saharan Africa, its application to biological tissues, particularly in cancer research, remains limited.13–16 Prior studies on breast cancer using vibrational spectroscopy have primarily been conducted outside the region,17,18 and local investigations are scarce. To date, only X-ray emission techniques have been employed in Nigeria to differentiate between cancerous and non-cancerous breast tissues.19 Therefore, this pioneering study aimed to evaluate the diagnostic accuracy of ATR-FTIR spectroscopy in differentiating normal, benign, and malignant breast lesions in a Nigerian population, identifying specific spectral signatures for tumor discrimination.
Materials and methods
Ethical consideration
Ethical approval for the study was granted by the Research Ethics Committee of Ladoke Akintola University of Technology (LAUTECH), Osun, Nigeria, with reference number: UTH/REC/2022/09/649. The study was carried out with a waiver of informed consent for the use of archived, anonymized samples in accordance with the principles of the Declaration of Helsinki (as revised in 2024). The ethics committee agreed that this study did not require informed consent due to its retrospective nature and the minimal risk involved.
Tissue retrieval and sorting
This retrospective study, conducted at LAUTECH, analyzed formalin-fixed paraffin-embedded biopsy samples from female patients collected between 2022 and 2023. The sample set included 10 normal breast samples, 15 benign samples (fibroadenoma and fibrocystic changes), and 31 malignant specimens (invasive ductal carcinoma grade II), as shown in Figures 1 and 2. Using quota sampling, only primary tumors with complete information, from patients who had not received neoadjuvant therapy during the study period, were included.
Tissue section preparation
Tissue sections measuring 15 µm were prepared from sorted tissue blocks using a Leica RM2125 microtome and transferred to aluminum foil substrates adapted for spectral acquisition according to Cui et al.20 An additional 4 µm section was stained with hematoxylin and eosin using a standard procedure.21 Tumor classification and identification of suitable areas were performed by two pathologists in a blinded manner prior to FTIR analysis.
ATR-FTIR spectroscopic analysis on tissue samples
Following heat treatment at 55–60°C to dry the tissue sections, xylene (Surgipath Medical Industries, Inc.) was used for deparaffinization, followed by a descending ethanol series (Sigma-Aldrich) for rehydration, and then air drying. The Cary 630 Agilent spectrometer was calibrated against blank substrates before spectral acquisition. Point mapping was conducted on tissue sections placed on low-reflective slides as substrates, ensuring firm contact with the diamond ATR crystal at previously annotated normal and tumorous regions, following a methodology adapted from Baker et al.22 The tissue section fields were scanned across the mid-infrared range of 4,000 cm−1 to 600 cm−1, accumulating 32 scans at a resolution of 16 cm−1 and averaging ten spectra for each specimen. Spectra were preprocessed by baseline correction and smoothing. Peaks and their relative intensities were identified, and six peak ratios were analyzed as spectral biomarkers: A2922/A1632, A1632/A1543, A1632/A1080, A1080/A1543, A1237/A1080, and A1043/A1543 (the last as listed in Table 7). These were assigned to the diagnostic marker (DM),4 protein,5 cytoplasmic-nuclear ratio (CN),23 carcinogenesis marker (CM),24 phosphate,25 and glycogen,26 respectively, as depicted in Figures 3 and 4.
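The peak-ratio step described above can be illustrated with a minimal sketch. It assumes a preprocessed spectrum held as two parallel lists and reads each band at the nearest sampled wavenumber; the function names, the nearest-point lookup, and the synthetic spectrum are illustrative assumptions, not the procedure or data used in the study.

```python
from bisect import bisect_left

# Wavenumber pairs (numerator, denominator) in cm^-1, as named in the text.
# The glycogen pair follows Table 7.
PEAK_RATIOS = {
    "DM": (2922, 1632),
    "Protein": (1632, 1543),
    "CN": (1632, 1080),
    "CM": (1080, 1543),
    "Phosphate": (1237, 1080),
    "Glycogen": (1043, 1543),
}


def absorbance_at(wavenumbers, absorbances, target):
    """Absorbance at the sampled wavenumber nearest to `target`.

    `wavenumbers` must be sorted in ascending order.
    """
    i = bisect_left(wavenumbers, target)
    if i == 0:
        return absorbances[0]
    if i == len(wavenumbers):
        return absorbances[-1]
    # Choose whichever neighbouring sample is closer to the target.
    if wavenumbers[i] - target < target - wavenumbers[i - 1]:
        return absorbances[i]
    return absorbances[i - 1]


def peak_ratios(wavenumbers, absorbances):
    """Return {biomarker: A_numerator / A_denominator} for one spectrum."""
    return {
        name: absorbance_at(wavenumbers, absorbances, num)
        / absorbance_at(wavenumbers, absorbances, den)
        for name, (num, den) in PEAK_RATIOS.items()
    }


# Hypothetical flat spectrum with a few made-up band heights (not study data).
wavenumbers = list(range(600, 4001))
absorbances = [0.1] * len(wavenumbers)
made_up_bands = {2922: 0.5, 1632: 1.0, 1543: 0.8, 1237: 0.4, 1080: 0.5, 1043: 0.4}
for wn, height in made_up_bands.items():
    absorbances[wn - 600] = height

ratios = peak_ratios(wavenumbers, absorbances)
```

With spectra recorded at 16 cm−1 resolution the sampled grid is much coarser than in this toy example, so the nearest sampled point may sit several wavenumbers away from the nominal band position.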
Statistical analysis
Average values of these biomarkers were compared across normal, fibroadenoma, fibrocystic changes, and invasive ductal carcinoma grade 2 tissues, as displayed in Figure 3. Receiver operating characteristic (ROC) curves were employed to assess the sensitivity, specificity, and performance of the spectral biomarkers.4–6 The performance was quantified using area under the curve (AUC) metrics, where values < 0.5 indicate no discrimination, 0.5 represents random guessing, 0.6–0.7 indicates low discrimination, 0.7–0.8 signifies moderate discrimination, and scores between 0.9–1.0 denote excellent discriminatory capability. Data visualization and analysis were conducted using Microsoft Excel and SPSS 26, with results presented in both tables and graphs. Significance was set at a 95% confidence level (p < 0.05).
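As an illustration of the AUC metric, the sketch below computes it directly from its rank-based definition: the probability that a randomly chosen sample from one group has a higher marker value than a randomly chosen sample from the other, with ties counted as one half. The function name and the marker values are hypothetical; the study itself used SPSS 26.

```python
def roc_auc(group_a, group_b):
    """Probability that a value from group_a exceeds a value from group_b.

    Ties contribute 0.5. This equals the area under the empirical ROC curve
    when higher values are taken to indicate membership of group_a.
    """
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


# Hypothetical cytoplasmic-nuclear ratio values (not study data).
normal = [2.3, 2.2, 2.4]
malignant = [1.4, 1.5, 2.25]

auc_normal_high = roc_auc(normal, malignant)
auc_malignant_high = roc_auc(malignant, normal)
```

The two values always sum to 1, so an area close to 0 under one labeling of the groups corresponds to an area close to 1 when the direction of the test is reversed.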
Results
Distribution of breast tissue types
In the study, the sampled breast tissue types were categorized into ten normal, eight fibroadenoma, seven fibrocystic, and 31 malignant samples, as shown in Figure 1 below. The photomicrographs also showed the characteristics of different breast tissue types (Fig. 2).
Mean distribution of spectral biomarkers
The mean distribution of biomarkers was assessed across normal breast, fibroadenoma, fibrocystic, and malignant breast tissues, with the results presented in Figure 3. This provides a graphical highlight of how biomarkers differ among different breast tissue categories (normal and abnormal).
The spectral biomarker attributed to the cytoplasmic-nuclear ratio was highest in normal breast tissue (2.29), followed by fibrocystic (2.11), fibroadenoma (1.80), and malignant tissue (1.41). In contrast, glycogen levels were highest in malignant breast tissue (0.82), followed by fibroadenoma (0.58), fibrocystic (0.484), and normal tissue (0.448). The carcinogenesis marker was likewise increased in malignant tissue (0.81) relative to fibroadenoma (0.60), fibrocystic (0.52), and normal tissue (0.48). Phosphate followed a similar pattern, with higher levels in malignant breast tissue compared to fibroadenoma, fibrocystic, and normal tissues. In malignant, fibroadenoma, fibrocystic, and normal breast tissues, the protein marker was 1.31, 1.07, 1.10, and 1.10, and the diagnostic marker was 0.53, 0.40, 0.39, and 0.45, respectively. The general trend showed that invasive ductal carcinoma and fibroadenoma exhibited elevated biomarkers compared to fibrocystic and normal breast tissue, suggesting differences in the degree of differentiation between these groups.
Diagnostic model performance on matched breast samples
To assess the diagnostic value of the biomarkers in discriminating between normal and fibroadenoma, normal and fibrocystic, normal and malignant, fibroadenoma and malignant, fibrocystic and malignant, and fibroadenoma and fibrocystic tissues, ROC analysis was performed with 95% confidence intervals, as presented in Tables 1–6. Furthermore, the sensitivity and specificity of these biomarkers were evaluated to assess their ability to detect true positives (cancer cases) and to correctly exclude non-cancer cases, thereby limiting false positives (Table 7).
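For readers who want to see how a standard error and confidence interval can be attached to an AUC, the following sketch uses the Hanley and McNeil approximation with a normal-theory interval clipped to the range 0 to 1. This is an assumption made for illustration: the text does not state which estimator SPSS applied, and the function names and group sizes here are hypothetical.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil formula)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


def auc_confidence_interval(auc, n_pos, n_neg, z=1.96):
    """Normal-approximation interval, clipped to the valid range [0, 1]."""
    se = hanley_mcneil_se(auc, n_pos, n_neg)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical example: AUC of 0.5 with ten samples per group.
se_example = hanley_mcneil_se(0.5, 10, 10)
ci_example = auc_confidence_interval(0.5, 10, 10)
```

Under this approximation an AUC of exactly 1 has zero estimated standard error, so the interval collapses to a single point, and with small groups the interval is wide and frequently clipped at 0 or 1.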
Table 1. Model performance of biomarkers between normal breast tissue and fibroadenoma
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.810** | 0.142 | 0.53 | 1.000 | 0.138 |
DM | 0.810** | 0.179 | 0.466 | 1.000 | 0.138 |
CN | 1.000*** | 0.001 | 1.000 | 1.000 | 0.017 |
CM | 0.048 | 0.069 | 0.001 | 0.183 | 0.03 |
Phosphate | 0.095 | 0.107 | 0.001 | 0.305 | 0.053 |
Glycogen | 0.001 | 0.000 | 0.001 | 0.001 | 0.017 |
Table 2. Model performance of biomarkers between normal breast and fibrocystic breast tissues
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.583 | 0.237 | 0.119 | 1.000 | 0.724 |
DM | 0.917*** | 0.115 | 0.691 | 1.000 | 0.077 |
CN | 0.833** | 0.173 | 0.493 | 1.000 | 0.157 |
CM | 0.083 | 0.115 | 0.001 | 0.309 | 0.077 |
Phosphate | 0.333 | 0.272 | 0.001 | 0.867 | 0.480 |
Glycogen | 0.001 | 0.000 | 0.001 | 0.001 | 0.034 |
Table 3. Model performance of biomarkers between normal breast and malignant (invasive ductal carcinoma grade 2) breast tissues
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.305 | 0.075 | 0.158 | 0.458 | 0.256 |
DM | 0.208 | 0.087 | 0.038 | 0.377 | 0.089 |
CN | 0.990*** | 0.012 | 0.965 | 1.000 | 0.004 |
CM | 0.001 | 0.001 | 0.01 | 0.01 | 0.004 |
Phosphate | 0.005 | 0.008 | 0.001 | 0.02 | 0.004 |
Glycogen | 0.005 | 0.008 | 0.001 | 0.02 | 0.004 |
Table 4. Model performance of biomarkers between fibroadenoma and malignant (invasive ductal carcinoma grade 2) breast tissues
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.148 | 0.057 | 0.036 | 0.259 | 0.002 |
DM | 0.025 | 0.017 | 0.01 | 0.059 | 0.001 |
CN | 0.935*** | 0.028 | 0.088 | 0.991 | 0.001 |
CM | 0.037 | 0.021 | 0.001 | 0.077 | 0.001 |
Phosphate | 0.0081 | 0.013 | 0.001 | 0.043 | 0.001 |
Glycogen | 0.041 | 0.022 | 0.001 | 0.084 | 0.001 |
Table 5. Model performance of biomarkers between fibrocystic and malignant (invasive ductal carcinoma grade 2) breast tissues
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.298 | 0.088 | 0.125 | 0.47 | 0.177 |
DM | 0.008 | 0.01 | 0.001 | 0.028 | 0.001 |
CN | 0.976*** | 0.019 | 0.94 | 1.000 | 0.001 |
CM | 0.008 | 0.01 | 0.001 | 0.027 | 0.001 |
Phosphate | 0.001 | 0.001 | 0.001 | 0.002 | 0.001 |
Glycogen | 0.004 | 0.006 | 0.001 | 0.016 | 0.001 |
Table 6. Model performance of biomarkers between fibrocystic and fibroadenoma breast tissues
Biomarker | AUC | SE | 95% CI lower bound | 95% CI upper bound | p-value |
---|---|---|---|---|---|
Protein | 0.214 | 0.15 | 0.001 | 0.508 | 0.131 |
DM | 0.643* | 0.174 | 0.301 | 0.984 | 0.45 |
CN | 0.071 | 0.082 | 0.001 | 0.232 | 0.023 |
CM | 0.857** | 0.132 | 0.598 | 1.000 | 0.059 |
Phosphate | 0.821** | 0.137 | 0.552 | 1.000 | 0.089 |
Glycogen | 1.000*** | 0.001 | 1.000 | 1.000 | 0.008 |
Table 7. Peak ratios, important biomarker assignment, sensitivity, specificity, and cut-off points of comparison among normal, fibroadenoma, fibrocystic, and malignant breast tissues
Comparison | Peak ratio | Biomarker assignment | Sensitivity (%) | Specificity (%) | Cut-off point |
---|---|---|---|---|---|
Normal vs. malignant | A1632/A1080 | CN | 100 | 69 | 2.10 |
Fibroadenoma vs. malignant | A1632/A1080 | CN | 100 | 86 | 1.58 |
Fibrocystic vs. malignant | A1632/A1080 | CN | 100 | 97 | 2.10 |
Normal vs. fibroadenoma | A1632/A1535 | Protein | 100 | 71 | 1.09 |
Normal vs. fibroadenoma | A2922/A1632 | DM | 100 | 43 | 0.39 |
Normal vs. fibroadenoma | A1632/A1080 | CN | 100 | 100 | 2.10 |
Normal vs. fibrocystic | A2922/A1632 | DM | 100 | 75 | 0.39 |
Normal vs. fibrocystic | A1632/A1080 | CN | 100 | 50 | 2.08 |
Fibroadenoma vs. fibrocystic | A2922/A1632 | DM | 71 | 75 | 0.39 |
Fibroadenoma vs. fibrocystic | A1080/A1543 | CM | 86 | 100 | 0.54 |
Fibroadenoma vs. fibrocystic | A1237/A1080 | Phosphate | 82 | 100 | 0.77 |
Fibroadenoma vs. fibrocystic | A1043/A1543 | Glycogen | 100 | 100 | 0.51 |
Table 7 shows the sensitivity and specificity for each pairwise comparison. Generally, the CN marker demonstrated exceptional performance, achieving 100% sensitivity in differentiating normal breast tissue from both benign (fibroadenoma and fibrocystic) and malignant lesions, as shown in Figure S6. The protein marker also yielded 100% sensitivity when distinguishing normal tissue from fibroadenoma, although it exhibited a specificity of 71%. Additional markers, including the DM, carcinogenesis marker, phosphate, and glycogen, exhibited 100% specificity across most tumors but displayed variability in sensitivity, ranging from 45% to 75%. The CN marker also achieved perfect discrimination between normal breast tissue and fibroadenoma, with both sensitivity and specificity at 100%. While the protein and DM markers similarly differentiated normal tissue from fibroadenoma with 100% sensitivity, other markers did not reach notable performance levels (Table 1). Overall, the CN marker effectively distinguished malignant lesions from normal tissue, fibroadenoma, and fibrocystic changes, with sensitivities of 100% in each case and specificities of 69%, 86%, and 97%, respectively (Tables 3–5 and 7). Glycogen was a particularly useful discriminator between fibrocystic and fibroadenoma tissues, with 100% sensitivity and specificity (Table 6). Other markers provided limited to moderate discriminatory information, as shown in Tables 1–7.
ROC curve analysis identified optimal threshold values for distinguishing between breast tissue classes. Cytoplasmic-nuclear ratio cut-offs of 2.08 to 2.10 applied to the comparisons of normal tissue with fibroadenoma, fibrocystic changes, and invasive ductal carcinoma, and of fibrocystic changes with invasive ductal carcinoma, while a cut-off of 1.58 separated fibroadenoma from invasive ductal carcinoma. A diagnostic marker cut-off of 0.39 proved useful in discriminating between normal and fibroadenomatous or fibrocystic tissues, as well as between fibroadenomas and fibrocystic tissues (Table 7).
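How a cut-off point translates into sensitivity and specificity can be sketched as follows. The example assumes that lower marker values indicate disease (as with the CN ratio, which was lowest in malignant tissue) and selects the cut-off by maximizing Youden's J (sensitivity + specificity − 1). That selection rule is an assumption, since the text does not state the criterion used; the function names and values are hypothetical.

```python
def sensitivity_specificity(diseased, healthy, cutoff, diseased_is_lower=True):
    """Sensitivity and specificity when samples are classified at `cutoff`.

    With diseased_is_lower=True a sample is called positive when its
    value is less than or equal to the cut-off.
    """
    if diseased_is_lower:
        true_pos = sum(v <= cutoff for v in diseased)
        true_neg = sum(v > cutoff for v in healthy)
    else:
        true_pos = sum(v >= cutoff for v in diseased)
        true_neg = sum(v < cutoff for v in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)


def best_cutoff_youden(diseased, healthy, diseased_is_lower=True):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Candidate cut-offs are the observed values; on a tie the lowest
    candidate is kept.
    """
    best = None
    for cutoff in sorted(set(diseased) | set(healthy)):
        sens, spec = sensitivity_specificity(
            diseased, healthy, cutoff, diseased_is_lower
        )
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical cytoplasmic-nuclear ratio values (not study data).
malignant = [1.2, 1.3, 1.4, 1.5, 1.9]
normal = [1.6, 2.0, 2.2, 2.3]

cutoff, sens, spec = best_cutoff_youden(malignant, normal)
```

Other criteria, such as fixing sensitivity at 100% and taking the most specific cut-off that achieves it, would select a different threshold on the same data.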
Discussion
The current study analyzed breast tissue classified histopathologically as normal, benign (fibroadenoma and fibrocystic), and malignant using ATR-FTIR spectroscopy to identify spectral peak intensities as potential biomarkers for proteins, diagnostic markers, carcinogenesis markers, cytoplasmic-nuclear ratios, phosphate, and glycogen. A ROC curve statistical analysis validated these biomarkers based on their ability to distinguish various breast tissue types. In particular, biomarkers demonstrating high discriminatory power, as reflected in AUC values, were prioritized. The AUC offers a definitive assessment of a model’s performance, often surpassing mere measurements of diagnostic accuracy.27–30 Thus, ROC curve analysis was systematically employed across the selected biomarkers to establish their respective sensitivities, specificities, AUCs, and cut-off points when contrasting breast tissue types.
ROC curve analysis provides a visual and quantitative method for evaluating overall test performance by plotting sensitivity against 1 − specificity (the false-positive rate) across all possible thresholds. The AUC reflects the test’s overall diagnostic effectiveness, balancing the two metrics. Values below 0.5 suggest no discrimination capability, while values between 0.5 and 0.6 indicate random chance. AUCs from 0.6 to 0.7 reflect marginal discrimination ability, 0.7 to 0.8 suggest moderate discrimination, and those above 0.9 indicate excellent discrimination.29,30
Sensitivity measures the likelihood of correctly identifying diseased patients, while specificity assesses the capability of a test to correctly exclude healthy individuals.4,6 Reducing the proportion of false positives is crucial for ensuring proper screening.27,28 Our findings indicate that ATR-FTIR successfully differentiated nearly all breast tumor classes per histopathological classification. In Tables 3 and 7, the CN ratio, represented by the peak ratio A1632/A1080, was significantly elevated in normal compared to malignant tissues (p = 0.004, AUC = 0.990, sensitivity = 100%, specificity = 69%). This ratio also demonstrated remarkable discriminatory ability (p = 0.017, AUC = 1.00, sensitivity = 100%, specificity = 100%) between normal and fibroadenoma, as shown in Tables 1 and 7. Moreover, the protein peak ratio A1632/A1543 in Tables 1 and 7 showed an AUC of 0.810 with a sensitivity of 100% and a specificity of 71% (p = 0.138). Similarly, the diagnostic marker (A2922/A1632) achieved an AUC of 0.810 with 100% sensitivity but only 43% specificity (p = 0.138) in Tables 1 and 7.
Additionally, in Tables 2 and 7, the diagnostic marker (AUC = 0.917, sensitivity = 100%, specificity = 75%, p = 0.077) and the cytoplasmic-nuclear ratio (AUC = 0.833, sensitivity = 100%, specificity = 50%, p = 0.157) indicated considerable efficacy in differentiating normal from fibrocystic tissue, with the diagnostic marker performing better in this comparison, although neither reached statistical significance. The cytoplasmic-nuclear ratio exhibited excellent performance (AUC = 0.935, sensitivity = 100%, specificity = 86%, p = 0.001) in differentiating between fibroadenoma and malignant tumors, as shown in Tables 4 and 7, as well as between fibrocystic and malignant tissues (AUC = 0.976, sensitivity = 100%, specificity = 97%, p = 0.001), as seen in Tables 5 and 7. In discerning fibrocystic changes from fibroadenoma, as shown in Tables 6 and 7, the glycogen peak displayed outstanding discriminatory capability (AUC = 1.0, sensitivity = 100%, specificity = 100%), outperforming the carcinogenesis marker (AUC = 0.857, p = 0.059, sensitivity = 86%, specificity = 100%) and the phosphate marker (AUC = 0.821, p = 0.089, sensitivity = 82%, specificity = 100%). The weakest performance was observed in the diagnostic marker, with a sensitivity of 71%, a specificity of 75%, and a non-significant AUC of 0.643 (p = 0.45).
The study identified the cytoplasmic-nuclear ratio’s diagnostic potential, particularly in distinguishing malignant breast tissues from normal and benign types, illustrating its importance in the clinical diagnostic landscape. The cytoplasmic-nuclear ratio has traditionally been vital in histopathological diagnoses,1,2 validated by various analytical methods.31–33 This marker reached 100% specificity for differentiating normal from fibroadenoma tissue but only 50% for normal versus fibrocystic tissue (Table 7), suggesting a closer similarity between normal breast tissues and fibrocystic changes, as supported by existing literature.34
While the peak ratio A2922/A1650 has exhibited diagnostic capability in previous studies, approaching AUC values of 0.908 and 100% sensitivity in distinguishing diseased from healthy tissues,4,6 these findings were not uniformly observed in the present study. Similar 100% sensitivity patterns with 70% specificity and an AUC value of 0.810 were noted primarily between normal and fibroadenoma tissues. This suggests that further exploration of these biomarkers could yield significant insights into breast cancer dynamics, a notion echoed in complementary studies.6 The protein peaks, generally assumed to be elevated in cancerous tissues compared to normal counterparts, did not reach statistical significance in this study. The protein biomarker exhibited 100% sensitivity and 69% specificity but a low AUC of 0.305 (Table 3) between normal tissue and invasive ductal carcinoma. This suggests a limited ability to differentiate between these two tissue types, potentially due to the instability of the β-sheet secondary structure on which the ratio is based,35,36 unlike the more stable α-helix structure used in prior studies.37–41 This contrasts with reports demonstrating the utility of α:β ratios and Amide II/Amide III ratios in serum for distinguishing breast cancer patients from healthy subjects.40 Variations in peak ratios arising from differing protein structures can affect diagnostic sensitivity, and despite high sensitivity and diagnostic performance between normal and fibroadenoma tissues, statistical significance was lacking.33 Thus, the use of this protein biomarker warrants careful consideration given its lack of statistical significance, which may stem from confounding variables such as formalin fixation.39 Nonetheless, the observed trends suggest potential diagnostic utility.
Furthermore, the glycogen peak demonstrated elevated levels in cancerous tissues, as previously reported.35–38 Its presence during the G1 phase may enhance energy provision for tumors.38,39 In the present study, the glycogen marker achieved a perfect AUC of 1.00 with 100% sensitivity and specificity when separating fibrocystic changes from fibroadenoma (Table 6). Elsewhere, salivary carbohydrate profiles, specifically the peak at 1,041 cm−1, have been reported to exhibit moderate diagnostic utility in breast cancer detection.6 Ferreira et al.6 reported an AUC of 0.765–0.770, with 80% sensitivity and 70% specificity for differentiating healthy controls from breast cancer patients. The same peak also demonstrated 70% sensitivity and 70% specificity in distinguishing patients with benign breast conditions from healthy individuals using saliva samples.
Additionally, the phosphate peak was notably higher in malignant tissues compared to benign (fibroadenoma and fibrocystic) and normal tissues, aligning with prior findings that link phosphate levels to nucleic acid activity in malignant transformations.38,39 While phosphate demonstrated some diagnostic potential, it was less effective for overall tumor discrimination according to its AUC, sensitivity, and specificity metrics. Nonetheless, it showed promise in differentiating between fibrocystic and fibroadenoma tissues, with sensitivity and specificity of 82% and 100%, respectively, though this contrasts with other studies reporting less than 90% for both metrics.4,6 The carcinogenesis marker appeared to aid primarily in differentiating fibrocystic tissues from fibroadenoma,23,24 which could prove valuable for the characterization and diagnosis of benign variants.
While this study demonstrates encouraging clinical potential, several limitations warrant cautious interpretation of the findings and restrict their broad applicability. These include a modest sample size and the potential for confounding biases, particularly stemming from uncontrolled pre-analytical and analytical variables during tissue block preparation. Future investigations should address these limitations by employing larger, prospectively controlled cohorts to validate these results.