Progress in Geography (地理科学进展), 2017, Vol. 36, Issue (10): 1304-1312. doi: 10.18306/dlkxjz.2017.10.012

• Special Topic: Health and the Human Settlement Environment •

Population spatialization of the Pearl River Delta on a 30 m grid based on a random forest model

Tan Min, Liu Kai*, Liu Lin, Zhu Yuanhui, Wang Dashan

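  1. Center of Integrated Geographic Information Analysis, Guangdong Key Laboratory for Urbanization and Geo-simulation, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275
  • Publication date: 2017-10-28 Release date: 2017-10-28
  • Corresponding author: Liu Kai E-mail: tanm3@mail2.sysu.edu.cn; liuk6@mail.sysu.edu.cn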
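  • About the author:

    Tan Min (b. 1993), female, from Guangzhou, Guangdong Province; master's student working mainly on remote sensing applications for resources and the environment. E-mail: tanm3@mail2.sysu.edu.cn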

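  • Funding:
    Key Program of the National Natural Science Foundation of China (41531178); Guangzhou Science and Technology Program (201510010081); National Natural Science Foundation of China (41001291)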

Spatialization of population in the Pearl River Delta in 30 m grids using random forest model

Min TAN, Kai LIU*, Lin LIU, Yuanhui ZHU, Dashan WANG

  1. Center of Integrated Geographic Information Analysis, Guangdong Key Laboratory for Urbanization and Geo-simulation, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
  • Online: 2017-10-28 Published: 2017-10-28
  • Contact: Kai LIU E-mail: tanm3@mail2.sysu.edu.cn; liuk6@mail.sysu.edu.cn
  • Supported by:
    Key Project of National Natural Science Foundation of China, No. 41531178; Guangzhou Science and Technology Project, No. 201510010081; National Natural Science Foundation of China, No. 41001291

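Abstract:

Population spatialization is an effective way to integrate population statistics with other spatial data on the environment and resources for joint analysis. This study selects nighttime light data, road network data, water body distribution data, built-up area data, a digital elevation model (DEM), and terrain slope data as the variables influencing population distribution in the Pearl River Delta, and uses a random forest model to spatialize the region's 2010 population data onto a 30 m grid. The simulated results are compared for accuracy with three public datasets, and the factors influencing the spatial distribution of population in the Pearl River Delta are then analyzed using the variable importance reported by the random forest model. The results show that the overall accuracy of the simulation reaches 82.32%, which is better than both the WorldPop dataset and the China kilometer-grid population dataset and close to that of the GPW dataset; accuracy is highest in areas of medium population density. The variable importance measures indicate that nighttime light intensity is the most important indicator of population distribution in the Pearl River Delta, and that distance to water bodies, distance to built-up areas, and road network density also play important roles. Combining a random forest model with multi-source information makes high-spatial-resolution population spatialization possible, providing an important data source for fine-grained urban management and support for related policy decisions.

Keywords: population spatialization, random forest, population distribution, influencing factors, Pearl River Delta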

Abstract:

Gridded population data enable integrated analysis of population statistics with other spatial data on resources and the environment. Using a random forest model with nighttime lights, road network, surface water network, built-up area, slope, and DEM as covariates, the 2010 population data of the Pearl River Delta were distributed onto a 30 m grid. The estimates were compared with three public datasets, and the importance of the input variables was analyzed from the model results. The results show that the accuracy of this simulation reached 83.32%, which is better than that of the WorldPop and Population Grids of China datasets and close to that of the GPW dataset. Moreover, the 30 m resolution of our result provides detailed information on population density in the Pearl River Delta. According to the covariate importance from the random forest model, nighttime light intensity, distance to water, distance to built-up areas, and road density are important factors in modeling population distribution in the Pearl River Delta. With the random forest model and multi-source data, high-resolution population spatialization can be achieved. High-spatial-resolution gridded data can provide an important data source for fine-grained city management and policy making.

Key words: population spatialization, random forest, population distribution, impact factors, the Pearl River Delta
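
The abstract does not spell out how the model output is turned into per-cell population counts. One approach often used in random-forest population mapping is dasymetric redistribution: the model predicts a relative weight for every grid cell from the covariates, and each census unit's total is then split across its cells in proportion to those weights, so that cell values still sum to the census figure. The sketch below assumes that approach and may differ from the authors' actual procedure. The function name `redistribute`, the unit labels, and all numbers are hypothetical, and the weights are hand-written stand-ins for what a trained random forest would predict.

```python
from collections import defaultdict


def redistribute(unit_totals, cell_units, cell_weights):
    """Split each census unit's total across its cells in proportion to weight.

    unit_totals  : dict mapping unit id -> census population of that unit
    cell_units   : unit id of each grid cell
    cell_weights : non-negative weight of each grid cell (assumed to come
                   from a model such as a random forest; not computed here)
    """
    # Sum the weights of all cells belonging to each census unit.
    weight_sums = defaultdict(float)
    for unit, weight in zip(cell_units, cell_weights):
        weight_sums[unit] += weight

    cell_pop = []
    for unit, weight in zip(cell_units, cell_weights):
        total_weight = weight_sums[unit]
        if total_weight > 0:
            # Each cell receives its proportional share of the unit total.
            cell_pop.append(unit_totals[unit] * weight / total_weight)
        else:
            # Simplification: a unit with all-zero weights gets no population.
            cell_pop.append(0.0)
    return cell_pop


# Hypothetical toy data: two census units covering five grid cells.
unit_totals = {"A": 1000.0, "B": 300.0}
cell_units = ["A", "A", "A", "B", "B"]
cell_weights = [6.0, 3.0, 1.0, 1.0, 2.0]

cell_pop = redistribute(unit_totals, cell_units, cell_weights)
for unit, pop in zip(cell_units, cell_pop):
    print(unit, round(pop, 2))
```

Because the shares within a unit sum to one, the per-cell estimates add back up to the census total for that unit, which is what makes the gridded result directly comparable with the original statistics.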