Exploring the Economic Damages Through Flood Prediction and Spatial Analysis: An Application of Hybrid Bagging-boosting Decision Trees Ensemble

Abstract

Flood damage reduction projects require a large national budget, so the public money involved must be spent effectively and efficiently. Reliable flood damage assessment is therefore essential for analyzing the economic aspects of such projects. Riverine floods cause significant damage to assets and adversely affect economies. This research proposes a feature selection model for hydraulic analysis, as such a model has not been proposed previously. For this purpose, three metaheuristic algorithms, Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and Genetic Algorithm (GA), are hybridized with two machine learning models, Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The hydraulic dataset is associated with floods and comprises topographic, geo-environmental, and human-induced variables; it exhibits multicollinearity, heteroscedasticity, and autocorrelation. The metaheuristic algorithms were evaluated by varying the population size. Among them, PSO performed best, yielding an appropriate number of features in fewer iterations. SVM was evaluated with four kernels (linear, radial basis function (RBF), sigmoid, and polynomial), because the original SVM is designed only for linearly separable data, whereas the hydraulic dataset also has non-linear characteristics. The accuracy of each kernel was recorded: RBF performed best and sigmoid showed the lowest accuracy for the GA, PSO, and ACO algorithms. KNN was evaluated in terms of accuracy by varying K: accuracy was low for small K-values, rose to a maximum as K increased, and then declined as K increased further.
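The K-value sweep described for KNN can be illustrated with a minimal sketch. It is not the thesis implementation: the two-feature toy data, the class centres, the seed, and the helper names (`knn_predict`, `accuracy_by_k`, `make_toy_data`) are all assumptions made for illustration, and the accuracy pattern on this toy data need not match the one reported for the hydraulic dataset.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Squared Euclidean distance is enough for ranking neighbours.
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy_by_k(train, test, k_values):
    """Map each K to the fraction of test points classified correctly."""
    result = {}
    for k in k_values:
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        result[k] = correct / len(test)
    return result


def make_toy_data(n_per_class, seed):
    """Hypothetical stand-in data: two overlapping Gaussian clusters."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


data = make_toy_data(100, seed=0)
train, test = data[:140], data[140:]
scores = accuracy_by_k(train, test, [1, 5, 15, 45, 139])
for k, acc in scores.items():
    print(f"K={k:3d}  accuracy={acc:.3f}")
```

Odd K-values are used so that a two-class vote cannot tie. Plotting `scores` against K is one way to look for the rise-then-fall pattern the abstract reports.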
Further, the research proposes a novel ensemble machine-learning model for flood susceptibility mapping. The ensemble integrates four independent machine learning models: Random Forest (RF), Logistic Model Tree (LMT), Naïve Bayes Tree (NBT), and Reduced Error Pruning Tree (REPT). For susceptibility mapping, a spatial database was prepared from 5500 flood points and 5500 non-flood points. The 14 flood conditioning factors (selected by the PSO algorithm) comprise topographic, geo-environmental, and human-induced variables. The dataset was randomly divided into 70% and 30% samples for training and validating the models, respectively. The performance of the ensemble model was evaluated using various statistical techniques and compared with that of the stand-alone models. The results revealed that the hybrid bag-boost ensemble model (RF-LMT-NBT-REPT) performed best, with an accuracy of 99.5% on the training sample and 98.9% on the validation sample. The inundation maps obtained with the hybrid bag-boost ensemble model for 2010 and 2022, together with the predicted flood of 2032, are used to provide district-level flood damages in the lower Indus basin. Moreover, the research presents a complete land use land cover (LULC) transition analysis for the study area between 2010 and 2022 and examines the spatial association between flooding and LULC transition using geographically weighted regression. Regional heterogeneities are taken into account, and a complete district-level analysis of LULC change is provided by considering each land cover transition from 2010 to 2022. Furthermore, the proposed simulated hybrid bag-boost ensemble model is employed to calculate flood depth and extent in the lower Indus basin for the assessment of the associated economic damages.
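As a rough illustration of the 70%/30% random division of the 5500 flood and 5500 non-flood points, the sketch below shuffles a list of labelled points and cuts it at the 70% mark. The point identifiers, the seed, and the function name `split_70_30` are assumptions made for this example; the abstract does not state how the random split was implemented or whether it was stratified by class.

```python
import random


def split_70_30(points, seed=42):
    """Shuffle a copy of points and return (training, validation) lists."""
    shuffled = points[:]
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point truncation at the cut point.
    n_train = len(shuffled) * 70 // 100
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical inventory: (point id, label), with 1 = flood and 0 = non-flood.
points = [(i, 1) for i in range(5500)] + [(i, 0) for i in range(5500, 11000)]

train, validation = split_70_30(points)
print(len(train), len(validation))  # 7700 3300
```

With 11,000 points in total, this gives 7,700 training and 3,300 validation points. A plain random split like this one does not guarantee equal class proportions in both samples.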
The research provides district-level loss estimates for the 2022 flood and the forecasted 2032 flood. For this purpose, the study uses LULC and administrative boundary maps of the study area. The monetary values of the assets were obtained from the concerned administrative departments and are used to assess flood damage at the district level. The proposed ensemble model and the flood conditioning factors can be used for flood potential assessment in future studies. Moreover, this technique will assist decision-making when evaluating the economic feasibility of flood damage reduction projects.

Javeria Sarwar

Meta Data

Author: Javeria Sarwar
Supervisor: Saud Ahmed Khan
Co-Supervisor: Muhammad Azmat
Internal Examiner: Zahid Asghar
External Examiner: Iftikhar Hussain Adil
