Abstract:
This study examines the uncertainty and interpretability of ensemble learning models in landslide susceptibility modeling. Taking the eastern coastal mountainous region of Zhejiang Province as the study area, it uses historical Google imagery and Sentinel-2A imagery to document 552 shallow landslides triggered by Super Typhoon "Megi" in 2016. First, continuous factors are handled under three scenarios: no grading, the equal interval method, and the natural breaks method; under the two grading methods, the factors are subdivided into 4, 6, 8, 12, 16, or 20 levels. The Categorical Boosting (CatBoost) model is then used to compute landslide susceptibility under each scenario. Receiver operating characteristic (ROC) curves and SHapley Additive exPlanations (SHAP) are used to examine uncertainty and interpretability in the modeling process and to identify the optimal modeling strategy. The results indicate that: (1) In the CatBoost model, aspect is the most important influencing factor, followed by factors related to water and geological conditions; (2) Under the non-grading scenario, the model achieves the highest
AUC value, reaching 0.866; (3) Compared with the equal interval method, the natural breaks method shows better generalization capability, and the model’s predictive performance improves as the number of classes increases; (4) The SHAP analysis reveals how the principal influencing factors (aspect, lithology, elevation, and distance to roads) control typhoon-induced landslides. These findings can deepen the understanding of landslide susceptibility, improve the accuracy and reliability of landslide predictions, and provide a scientific basis for disaster prevention and mitigation in the region.
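
To make the grading scenarios concrete, the following is a minimal sketch, not the study's implementation, of how equal-interval grading of one continuous factor and the AUC metric might be computed. The toy elevation values, the labels, and the function names (`equal_interval_edges`, `grade`, `auc`) are all hypothetical. As a simplifying assumption, the factor itself stands in as the susceptibility score instead of a trained CatBoost model.

```python
from bisect import bisect_right


def equal_interval_edges(values, n_levels):
    """Interior break points that split [min, max] into n_levels equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    return [lo + width * i for i in range(1, n_levels)]


def grade(values, edges):
    """Map each continuous value to a 0-based level index using the break points."""
    return [bisect_right(edges, v) for v in values]


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: elevation (m) of eight mapping units and landslide labels.
elevation = [120.0, 340.0, 560.0, 610.0, 780.0, 150.0, 900.0, 520.0]
landslide = [0, 0, 1, 1, 1, 0, 1, 0]

# Equal-interval scenario with 4 levels (break points at 315, 510, 705 m).
edges = equal_interval_edges(elevation, 4)
levels = grade(elevation, edges)

# Assumption: the factor value (raw or graded) is used directly as the score.
auc_non_graded = auc(landslide, elevation)  # 1.0 on this toy data
auc_graded = auc(landslide, levels)         # 0.9375: grading creates ties

print(levels, auc_non_graded, auc_graded)
```

On this toy data, grading places one non-landslide unit (520 m) in the same level as two landslide units, so the graded score can no longer separate them and the AUC drops. This only illustrates one way in which grading can discard information; it says nothing about the magnitude of the effect in the study itself.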