Supplementary Material for: Implementation of machine learning algorithms to screen for advanced liver fibrosis in metabolic dysfunction-associated steatotic liver disease (MASLD): an in-depth explanatory analysis
Posted on 2024-10-25, 10:23, authored by Dabbah S., Mishani I., Davidov Y., BenAri Z.
Background
This study aimed to train machine learning algorithms (MLAs) to detect advanced fibrosis (AF) in MASLD patients in the primary care setting and to explain the predictions to ensure responsible use by clinicians.
Methods
Readily available features of 618 MASLD patients followed up at a tertiary center were used to train five MLAs. AF was defined as liver stiffness ≥9.3 kPa measured via two-dimensional shear wave elastography (n=495) or as stage ≥F3 on liver biopsy (n=123). The MLAs were compared with the Fibrosis-4 index (FIB-4) and the NAFLD fibrosis score (NFS) in a validation cohort of 540 MASLD patients from the primary care setting. Feature importance, partial dependence, and Shapley additive explanations (SHAP) were used to explain the predictions.
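For orientation, the two comparator scores can be computed from routine laboratory values. The sketch below uses the commonly published FIB-4 and NFS formulas, not code from this study. The function and parameter names are illustrative assumptions, and the units (AST and ALT in U/L, platelets in 10^9/L, albumin in g/dL) are those conventionally used with these formulas.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_1e9_l):
    """Fibrosis-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def nfs(age_years, bmi, diabetes_or_ifg, ast_u_l, alt_u_l,
        platelets_1e9_l, albumin_g_dl):
    """NAFLD fibrosis score using the commonly published coefficients."""
    return (
        -1.675
        + 0.037 * age_years
        + 0.094 * bmi
        + 1.13 * (1 if diabetes_or_ifg else 0)  # impaired fasting glucose or diabetes
        + 0.99 * (ast_u_l / alt_u_l)
        - 0.013 * platelets_1e9_l
        - 0.66 * albumin_g_dl
    )


# Hypothetical patient values, for illustration only.
fib4_value = fib4(age_years=60, ast_u_l=40, alt_u_l=36, platelets_1e9_l=200)
nfs_value = nfs(age_years=50, bmi=30, diabetes_or_ifg=False, ast_u_l=40,
                alt_u_l=40, platelets_1e9_l=200, albumin_g_dl=4.0)

# Screening cut-offs named in this abstract: FIB-4 >= 1.3 and NFS >= -1.45.
fib4_positive = fib4_value >= 1.3
nfs_positive = nfs_value >= -1.45
```

Both scores need only age, a few blood tests, and (for NFS) BMI and diabetes status, which is why they serve as the primary care baseline against which the MLAs are compared.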
Results
Extreme gradient boosting (XGBoost) achieved an AUC of 0.91, outperforming FIB-4 (AUC=0.78) and NFS (AUC=0.81; both p<0.05), with a specificity of 76% vs. 59% and 48% for FIB-4 ≥1.3 and NFS ≥-1.45, respectively (p<0.05). Its sensitivity (91%) was superior to that of FIB-4 (79%). XGBoost confidently excluded AF (negative predictive value=99%) and had the highest positive predictive value (31%), superior to FIB-4 and NFS (all p<0.05). The most important features were HbA1c and GGT, with a steep increase in AF probability at HbA1c >6.5%. The strongest interaction was between AST and age. XGBoost, but not logistic regression, extracted informative patterns from ALT, LDL-c, and ALP (p<0.001). One quarter of the false positives (FP) were correctly reclassified, at the cost of only one additional false negative, based on the SHAP values of GGT, platelets, and ALT, which were found to be associated with an FP classification.
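Predictive values depend on the prevalence of AF in the screened population, which this abstract does not state. The following is a minimal sketch of that relationship, applying Bayes' rule to the XGBoost sensitivity and specificity reported above. The 10% prevalence is a hypothetical value chosen purely for illustration, and the function name is an assumption.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are taken from the abstract;
# the prevalence of 0.10 is hypothetical, not a study result.
ppv, npv = predictive_values(sensitivity=0.91, specificity=0.76, prevalence=0.10)
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")
```

At low prevalence, even a sensitive and reasonably specific test yields a modest positive predictive value alongside a very high negative predictive value, which is the pattern expected of a rule-out screening tool.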
Conclusions
An explainable XGBoost algorithm was superior to FIB-4 and NFS for screening for AF in MASLD patients in the primary care setting. The algorithm also proved safe for use, as clinicians can understand the predictions and flag FP classifications.