Progress in Geography (地理科学进展), 2016, Vol. 35, Issue (12): 1494-1505. doi: 10.18306/dlkxjz.2016.12.006


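Application of multivariate statistical regression and geographically weighted regression methods in multi-scale population spatialization research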

王珂靖1,2, 蔡红艳1,**, 杨小唤1

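  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101
  2. Zhejiang Academy of Surveying & Mapping, Hangzhou 310012
  • Publication date: 2016-12-20 Release date: 2016-12-20
  • Corresponding author: 蔡红艳 E-mail: wkj_3210@163.com; caihy@igsnrr.ac.cn
  • About the author:

    王珂靖 (1988-), female, from Wendeng, Shandong; master's student whose research focuses on spatialization modeling of socioeconomic data. E-mail: wkj_3210@163.com

  • Funding:
    National Natural Science Foundation of China (41271173, 41301155); National Science and Technology Support Program of China (2012BAI32B06)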

Multiple scale spatialization of demographic data with multi-factor linear regression and geographically weighted regression models

Kejing WANG1,2, Hongyan CAI1,*, Xiaohuan YANG1

  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  2. Zhejiang Academy of Surveying & Mapping, Hangzhou 310012, China
  • Online:2016-12-20 Published:2016-12-20
  • Contact: Hongyan CAI E-mail:wkj_3210@163.com;caihy@igsnrr.ac.cn
  • Supported by:
    National Natural Science Foundation of China, No.41271173, No.41301155;National Science and Technology Support Program of China, No.2012BAI32B06

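Abstract (translated from Chinese):

Gridded spatialization of statistical population data shows the spatial distribution of population more intuitively, but different population spatialization modeling methods and different grid scales differ in how they express the spatialization results. Building on population-characteristic zoning, this study introduces DMSP/OLS nighttime light data to reclassify urban land, applies multivariate statistical regression and geographically weighted regression (GWR) to study multi-scale models for spatializing population statistics, generates 2010 population spatial data for Anhui Province at three scales (1 km, 5 km, and 10 km), and evaluates and compares the accuracy of the two models at the three scales. The results show that the accuracy of population spatial data is closely tied to the modeling method and is also affected by the grid scale. For the multivariate statistical regression model, the mean relative error between the estimated and actual population decreases as the scale increases, whereas the error of the GWR-based population spatial data increases as the scale increases. Overall, the GWR-based population spatial data at the 1 km scale have the lowest mean relative error (22.31%). Regional terrain and landform conditions are strongly associated with the error of the population spatial data: errors are larger in mountainous areas with complex landform types.

Keywords (translated from Chinese): population distribution, spatialization, multi-scale, multivariate statistical regression, geographically weighted regression, Anhui Province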
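The accuracy figures in this abstract are mean relative errors between estimated and actual population. The abstract does not give the formula, so the following is only a minimal sketch assuming the common definition (the mean of |estimated - actual| / actual over validation units, expressed in percent). The function name and the sample numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def mean_relative_error(estimated, actual):
    """Mean of |estimated - actual| / actual over all units, in percent.

    Assumed definition; the paper may use a variant.
    """
    estimated = np.asarray(estimated, dtype=float)
    actual = np.asarray(actual, dtype=float)
    if estimated.shape != actual.shape:
        raise ValueError("estimated and actual must have the same shape")
    if np.any(actual <= 0):
        raise ValueError("actual population must be positive in every unit")
    return float(np.mean(np.abs(estimated - actual) / actual) * 100.0)

# Hypothetical validation units, for illustration only (not data from the study).
census = [1000.0, 2000.0, 4000.0]    # reported population per unit
modeled = [1100.0, 1800.0, 4000.0]   # population summed from the modeled grid
mre = mean_relative_error(modeled, census)
print(f"mean relative error: {mre:.2f}%")
```

Because every unit contributes its own ratio, sparsely populated units with small denominators can dominate this measure.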

Abstract:

Population distribution data are essential for socioeconomic and environmental studies, such as population estimation, the spread of disease, natural disaster relief, and environmental protection. Existing research has shown that spatialized population grid data can delineate the spatial pattern of population distribution precisely, but model selection and grid size may influence the accuracy of population distribution modeling. It is therefore important to estimate population distribution using appropriate models at a proper spatial scale. This study focused on spatialization modeling of the 2010 county-level population census data of Anhui Province at three grid scales. Anhui Province was selected because of its complex landforms and the marked differences in population distribution within its area. Population regionalization was carried out as a preprocessing step: 78 counties in Anhui Province were divided into four groups. Urban residential areas were reclassified by combining land-use data with nighttime light data (DMSP/OLS) to reflect regional differences. Based on the population regionalization, multi-factor linear regression (MFLR) and geographically weighted regression (GWR) models were employed to integrate the reclassified urban residential land-use data with the rural residential land-use data. This study established three population spatial datasets at 1 km, 5 km, and 10 km grid scales. A comparison of the two models' precision at each scale shows that both the modeling method and the grid scale strongly influence the accuracy of the spatialization result. For the MFLR model, accuracy increased with the grid scale, and the highest accuracy was achieved in the 10 km grid dataset. For the GWR model, accuracy decreased as the grid scale increased, and the highest accuracy was obtained at the 1 km scale. Overall, the GWR model, which takes geographic location into account through local modeling, was more accurate than the MFLR model, with the lowest mean relative error (22.31%) at the 1 km scale. This study may provide a scientific basis for the production and application of population spatial data and a reference for the spatialization of other types of statistical data in the future.
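The contrast between the two model families can be illustrated with a small sketch: a global regression fits one coefficient per land-use class for the whole region, while GWR refits the regression at every location with distance-decaying weights. Everything below is an assumption made for illustration, not the authors' implementation: the no-intercept form, the Gaussian kernel, the fixed bandwidth, the function names `fit_global` and `fit_gwr`, and the synthetic data.

```python
import numpy as np

def fit_global(X, y):
    """Global least squares: one coefficient vector shared by all units."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_gwr(coords, X, y, bandwidth):
    """Weighted least squares refit at every unit (Gaussian kernel, assumed)."""
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    betas = np.empty(X.shape)
    for i in range(len(y)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # nearer units weigh more
        sw = np.sqrt(w)
        # Scaling rows by sqrt(w) turns weighted into ordinary least squares.
        betas[i] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return betas

# Synthetic example (not data from the study): 36 units on a 6 x 6 grid.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.linspace(0, 1, 6), np.linspace(0, 1, 6))
coords = np.column_stack([gx.ravel(), gy.ravel()])
# Columns: urban residential area, rural residential area (arbitrary units).
land = rng.uniform(0.1, 1.0, size=(36, 2))
# Urban population density is made to grow eastward; rural density is constant.
urban_density = 2000.0 + 2000.0 * coords[:, 0]
population = urban_density * land[:, 0] + 800.0 * land[:, 1]

global_beta = fit_global(land, population)
local_betas = fit_gwr(coords, land, population, bandwidth=0.3)
print("global coefficients:", global_beta)
print("local urban coefficient, west vs east:",
      local_betas[coords[:, 0] < 0.5, 0].mean(),
      local_betas[coords[:, 0] > 0.5, 0].mean())
```

In this synthetic setup the global fit returns a single urban coefficient, while the local fits recover larger urban coefficients in the east than in the west. The bandwidth controls how local each fit is; in practice it is usually chosen by cross-validation or an information criterion rather than fixed by hand.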

Key words: population distribution, spatialization, multi-scales, multi-factor linear regression, Geographically Weighted Regression (GWR), Anhui Province