PROGRESS IN GEOGRAPHY ›› 2022, Vol. 41 ›› Issue (7): 1239-1250. doi: 10.18306/dlkxjz.2022.07.008

• Articles •

Performance of multiple machine learning model simulation of process characteristic indicators of different flood types

ZHANG Fan1, ZHANG Yongyong2,*, CHEN Junxu1, ZHAI Xiaoyan3, HU Qingfang4

    1. School of Earth Sciences, Yunnan University, Kunming 650504, China
    2. Institute of Geographic Sciences and Natural Resources Research, Key Laboratory of Water Cycle and Related Land Surface Process, CAS, Beijing 100101, China
    3. China Institute of Water Resources and Hydropower Research, Beijing 100038, China
    4. Nanjing Hydraulic Research Institute, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Nanjing 210029, China
  • Received: 2021-12-06 Revised: 2022-03-14 Online: 2022-07-28 Published: 2022-09-28
  • Contact: ZHANG Yongyong
  • Supported by:
    National Natural Science Foundation of China (42071041); National Key Research and Development Program of China (2016YFC0400902)


The characteristic indicators of a flood process include not only flood magnitude but also flood duration, dynamics, and other properties. Existing models and methods focus on simulating flood magnitude indicators, while the simulation of other indicators remains insufficient. In this study, four machine learning models (multiple linear regression, multi-layer perceptron, random forest, and support vector machine) were used to simulate seven characteristic indicators (total flood volume, peak flow, flood duration, time deviation of flood peak, proportion of high flow duration, flood rise rate, and flood fall rate) of 59 rainfall-flood events in the Changtaiguan Basin in the upper reaches of the Huaihe River. The simulation performance of the models for different flood types and characteristic indicators was evaluated. The results show that: 1) The flood processes in the Changtaiguan Basin can be divided into three types. The first type is characterized by moderate flood volume, long duration, and an earlier peak time (16 events); the second type has low flood volume, a short and fat shape, and a later flood peak (34 events); the third type has large flood volume, a sharp and thin shape, and rapid flood rise and fall (9 events). 2) Simulation of the time indicators performed the best and simulation of the dynamic indicators performed the worst. The performance of multiple linear regression and random forest increased as the values of all characteristic indicators increased. The performance of support vector machine decreased as flood duration increased, and increased as the values of the remaining characteristic indicators increased. The performance of multi-layer perceptron increased as the values of four indicators increased, namely total flood volume, peak flow, proportion of high flow duration, and flood rise rate. 3) With regard to the simulation accuracy of the characteristic indicators for the different flood types, the four models performed the best for the third type of floods and the worst for the second type; random forest showed the best performance for the first and third types, and support vector machine showed better performance for the second type. 4) In terms of comprehensive simulation accuracy, the support vector machine model performed the best, followed by random forest, multi-layer perceptron, and multiple linear regression; their relative errors for the calibration and validation periods were 23% and 98%, 21% and 109%, 37% and 75%, and 41% and 102%, respectively. The results may provide a reference for flood type simulation and countermeasures in the Huaihe River Basin.
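
The abstract reports relative errors per model but does not state the formula used. As a rough illustration only, the sketch below assumes the metric is a mean absolute relative error between observed and simulated indicator values, and ranks models by it. The function name, the model labels, and all numbers are made up for the example and are not taken from the study.

```python
from statistics import mean


def mean_relative_error(observed, simulated):
    """Mean absolute relative error in percent (assumed definition).

    Assumes every observed value is non-zero; a zero would raise
    ZeroDivisionError.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    return 100.0 * mean(abs(s - o) / abs(o) for o, s in zip(observed, simulated))


# Hypothetical peak flows (m^3/s) for four flood events; not real data.
observed = [100.0, 200.0, 400.0, 800.0]

# Hypothetical simulated values from two placeholder models.
simulated_by_model = {
    "model_a": [110.0, 180.0, 440.0, 720.0],
    "model_b": [150.0, 100.0, 600.0, 400.0],
}

# Error per model, then model names ordered from lowest to highest error.
errors = {
    name: mean_relative_error(observed, sim)
    for name, sim in simulated_by_model.items()
}
ranking = sorted(errors, key=errors.get)

for name in ranking:
    print(f"{name}: {errors[name]:.1f}%")
```

Computing the same metric separately on calibration and validation events would give paired figures of the kind quoted in the abstract, where a large gap between the two suggests weaker generalization.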

Key words: flood process, characteristic indicators, machine learning, model comparison, Huaihe River Basin