ISSN 1003-8035 CN 11-2852/P

    可解释机器学习驱动的怒江中游滑坡易发性评价

    Landslide susceptibility assessment in the middle reaches of the Nujiang River driven by explainable machine learning

    • 摘要: 怒江中游流域地质构造复杂,滑坡灾害频发,对该区域滑坡易发性评价可有效识别滑坡高易发区域,极大提高怒江中游流域的防灾减灾效率。本文基于历史数据、遥感解译和现场勘察,获取3358处中至大型滑坡灾害数据(滑坡体积>10⁵ m³),构建怒江中游流域滑坡灾害数据库。结合方差膨胀因子(VIF)和容忍度筛选出地形地貌、基础地质、水文地质、环境影响和外界触发因子共12个特征条件因子,以研究区内南侧滑坡相对密集范围的滑坡样本作为训练集,将研究区其余区域滑坡作为测试集(训练集∶测试集≈1∶1),采用随机森林(RF)、朴素贝叶斯(NB)、优化梯度提升树(XGBoost)对整个研究区域的滑坡灾害易发性情况分析预测,并分析评价模型的跨地区泛化能力。结果表明:滑坡的极高易发区和高易发区主要集中于怒江及其支流的河谷地区,受断裂、地表切割破碎和水系发育等因素影响,与研究区内滑坡的分布情况基本吻合;滑坡易发性评价表明RF模型精度最高(AUC=0.880),其次是NB(AUC=0.862)、XGBoost(AUC=0.853),并且RF模型的滑坡易发性制图具有更高的准确度(86.5%)和可靠性(Kappa=0.730);SHAP解释认为高程因子在RF、NB和XGBoost模型中对滑坡易发性评价结果的重要性最大。RF、NB和XGBoost模型均具有较高的跨地区泛化能力,但RF模型AUC值最高,能更适用于大高差地形、复杂地质环境区域的滑坡易发性评价。研究成果可为大高差、地质构造环境复杂河谷区滑坡易发性评价提供参考,同时也为怒江中游滑坡风险与防灾减灾提供理论支撑。

       

      Abstract: Landslides occur frequently in the middle reaches of the Nujiang River because of the region's complex tectonic and geological setting. Landslide susceptibility assessment can identify highly susceptible areas and thereby improve the efficiency of disaster prevention and mitigation in the region. Based on historical data, remote sensing interpretation, and field investigations, a total of 3358 medium- to large-scale landslides (volume > 10⁵ m³) were identified to construct a landslide inventory for the middle reaches of the Nujiang River. Twelve conditioning factors, covering topography and geomorphology, basic geology, hydrogeology, environmental influences, and external triggers, were selected using the variance inflation factor (VIF) and tolerance. Landslide samples from the southern part of the study area, where landslides are relatively dense, were used as the training set, and those from the rest of the study area served as the test set (training set : test set ≈ 1∶1). This spatial partitioning was used to evaluate the cross-regional generalization ability of the models. Random Forest (RF), Naive Bayes (NB), and eXtreme Gradient Boosting (XGBoost) models were applied to predict landslide susceptibility across the entire study area. The results indicate that the very high and high susceptibility zones are concentrated mainly in the valleys of the Nujiang River and its tributaries, where they are controlled by faults, strongly dissected and fragmented terrain, and well-developed drainage networks. These zones are generally consistent with the observed distribution of landslides in the study area. Among the three models, RF achieved the highest AUC (0.880), followed by NB (0.862) and XGBoost (0.853). The landslide susceptibility map generated by the RF model also showed higher accuracy (86.5%) and reliability (Kappa = 0.730). SHAP (SHapley Additive exPlanations) analysis indicates that elevation is the most important factor for the susceptibility results in all three models (RF, NB, and XGBoost). All three models show strong cross-regional generalization ability, but RF achieves the highest AUC and is therefore better suited to landslide susceptibility assessment in areas with large elevation differences and complex geological environments. The findings provide a reference for landslide susceptibility assessment in river valleys with large relief and complex tectonic settings, and theoretical support for landslide risk management and disaster prevention and mitigation in the middle reaches of the Nujiang River.
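
The accuracy, Kappa, and AUC values reported in the abstract can be related to one another with a short calculation. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function names `roc_auc` and `accuracy_and_kappa` and all sample counts are hypothetical and chosen only for illustration. It assumes a test set with equal numbers of landslide and non-landslide samples, which the abstract does not state. Under that assumption the chance agreement is 0.5, so Kappa = 2 × accuracy − 1, and an accuracy of 86.5% corresponds to Kappa = 0.730.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count positive/negative pairs ranked correctly; ties count as one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_kappa(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the row (actual) and column (predicted) marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: 1000 landslide and 1000 non-landslide test samples.
# These are NOT taken from the paper; they only reproduce 86.5% accuracy.
acc, kappa = accuracy_and_kappa(tp=870, fp=140, fn=130, tn=860)

# Tiny hypothetical score list to illustrate the pairwise AUC definition.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

With these illustrative counts, `acc` is 0.865 and `kappa` is approximately 0.73, which matches the pair of values reported for the RF map. The pairwise AUC definition is quadratic in the number of samples; for a full susceptibility map a rank-based implementation would be the more practical choice.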

       
