Progress in Geography ›› 2022, Vol. 41 ›› Issue (7): 1239-1250. doi: 10.18306/dlkxjz.2022.07.008

• Research Article •

Performance of multiple machine learning model simulation of process characteristic indicators of different flood types

ZHANG Fan1, ZHANG Yongyong2,*, CHEN Junxu1, ZHAI Xiaoyan3, HU Qingfang4

  1. School of Earth Sciences, Yunnan University, Kunming 650504, China
  2. Institute of Geographic Sciences and Natural Resources Research, Key Laboratory of Water Cycle and Related Land Surface Process, CAS, Beijing 100101, China
  3. China Institute of Water Resources and Hydropower Research, Beijing 100038, China
  4. Nanjing Hydraulic Research Institute, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Nanjing 210029, China
  • Received: 2021-12-06 Revised: 2022-03-14 Online: 2022-07-28 Published: 2022-09-28
  • Corresponding author: *ZHANG Yongyong (born 1981), male, from Jingshan, Hubei; PhD, professor; research focuses on the watershed water cycle and environmental hydrology. E-mail: zhangyy003@igsnrr.ac.cn
  • About the author: ZHANG Fan (born 1998), female, Han ethnicity, from Yilong, Sichuan; master's student; research focuses on hydrological simulation. E-mail: ZhangF_YN@163.com
  • Supported by:
    National Natural Science Foundation of China (42071041); National Key Research and Development Program of China (2016YFC0400902)

Abstract:

The characteristic indicators of a flood process include not only flood magnitude but also timing, shape, and dynamics. Existing models and methods focus on simulating flood magnitude indicators, while the simulation of the other indicators remains insufficient. Simulating all characteristic indicators of a flood process has therefore become a technical bottleneck in flood forecasting. In this study, four machine learning models (multiple linear regression, multi-layer perceptron, random forest, and support vector machine) were used to simulate seven characteristic indicators (total flood volume, peak flow, flood duration, peak time skewness, proportion of high-flow duration, and flood rise and fall rates) of 59 rainfall-flood events in the Changtaiguan Basin in the upper reaches of the Huaihe River, and the simulation performance of the models for different flood types and characteristic indicators was evaluated. The results show that: 1) Flood processes in the Changtaiguan Basin can be divided into three types. The first type has moderate flood volume, long duration, and an early flood peak (16 events); the second type has low flood volume, a low and broad hydrograph, and a late flood peak (34 events); the third type has large flood volume, rapid rise and recession, and a sharp, narrow hydrograph (9 events). 2) Time indicators were simulated best and dynamic indicators worst. The performance of multiple linear regression and random forest improved as the values of all characteristic indicators increased. The performance of support vector machine declined as flood duration increased and improved as the values of the remaining indicators increased. The performance of multi-layer perceptron improved as the values of four indicators increased: total flood volume, peak flow, proportion of high-flow duration, and flood rise rate. 3) With regard to the simulation accuracy for each flood type, all four models performed best for the third type of floods and worst for the second type; random forest performed best for the first and third types, and support vector machine performed best for the second type. 4) In terms of overall simulation accuracy, support vector machine performed best, followed by random forest, multi-layer perceptron, and multiple linear regression. The relative errors of these four models in the calibration and validation periods were 23% and 98%, 21% and 109%, 37% and 75%, and 41% and 102%, respectively. The study may provide a reference for in-depth analysis of basin flood processes and for formulating flood control measures.

Key words: flood process, characteristic indicators, machine learning, model comparison, Huaihe River Basin
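
The abstract reports relative errors for the calibration and validation periods but does not state the formula. As a minimal sketch, assuming the metric is the mean absolute relative error expressed in percent, the comparison between the two periods could be computed as follows. The function name `mean_relative_error` and all flow values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_relative_error(observed, simulated):
    """Mean absolute relative error in percent (assumed definition)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    return 100.0 * mean(abs(s - o) / abs(o) for o, s in zip(observed, simulated))


# Hypothetical peak flows (m^3/s), for illustration only; not data from the study.
calib_obs = [100.0, 200.0, 400.0]
calib_sim = [110.0, 180.0, 400.0]
valid_obs = [100.0, 200.0]
valid_sim = [150.0, 100.0]

calib_err = mean_relative_error(calib_obs, calib_sim)
valid_err = mean_relative_error(valid_obs, valid_sim)
print(f"calibration: {calib_err:.1f}%  validation: {valid_err:.1f}%")
```

A validation error much larger than the calibration error, as in the figures quoted in the abstract, is the kind of gap this side-by-side calculation makes visible.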