Progress in Geography ›› 2018, Vol. 37 ›› Issue (6): 761-771. doi: 10.18306/dlkxjz.2018.06.003

• Research Article •

Comparison of crime hotspot prediction over different time periods using random forest and space-time kernel density methods

Lin LIU1,2,3, Wenjuan LIU1, Weiwei LIAO1, Hongjie YU1, Chao JIANG1, Rongping LIN1, Jiakai JI1, Zheng ZHANG1

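    1. Center of Integrated Geographic Information Analysis, School of Geography and Planning, Sun Yat-Sen University, Guangzhou 510275, China
    2. Center of Geographic Information Analysis for Public Security, School of Geographic Sciences, Guangzhou University, Guangzhou 510006, China
    3. Department of Geography, University of Cincinnati, Cincinnati, OH 45221-0131, USA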
  • Received: 2018-02-02 Revised: 2018-03-29 Published: 2018-06-28 Online: 2018-06-28
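  • About the author:
    Lin LIU (1965-), male, born in Xiangtan, Hunan; PhD, professor. His main research interests include geographic information science and the spatiotemporal analysis and simulation of crime. E-mail: lin.liu@uc.edu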

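  • Funding:
    Key Program of the National Natural Science Foundation of China (41531178); Research Team Program of the Natural Science Foundation of Guangdong Province (2014A030312010); National Natural Science Foundation of China (41171140); Science and Technology Program of Guangdong Province (2015A020217003)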

Comparison of random forest algorithm and space-time kernel density mapping for crime hotspot prediction

Lin LIU1,2,3, Wenjuan LIU1, Weiwei LIAO1, Hongjie YU1, Chao JIANG1, Rongping LIN1, Jiakai JI1, Zheng ZHANG1

    1. Center of Integrated Geographic Information Analysis, School of Geography and Planning, Sun Yat-Sen University, Guangzhou 510275, China
    2. Center of Geographic Information Analysis for Public Security, School of Geographic Sciences, Guangzhou University, Guangzhou 510006, China
    3. Department of Geography, University of Cincinnati, Cincinnati, OH 45221-0131, USA
  • Received: 2018-02-02 Revised: 2018-03-29 Online: 2018-06-28 Published: 2018-06-28
  • Supported by:
    Key Program of National Natural Science Foundation of China, No.41531178; Research Team Program of Natural Science Foundation of Guangdong Province, China, No.2014A030312010; National Natural Science Foundation of China, No.41171140; Science and Technology Program of Guangdong Province, China, No.2015A020217003

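Abstract (translated from the Chinese):

Crime prediction is important for formulating policing strategies and for implementing crime prevention and control. Machine learning and kernel density estimation are two mainstream approaches to crime hotspot prediction, yet few studies have systematically compared how well the two approaches predict crime over different time periods; this study attempts to fill that gap. Using historical data on public theft from 2013 to May 2016 as input, we compared the crime hotspot predictions of the random forest method and of a kernel density method based on spatiotemporal proximity over four subsequent periods: 2 weeks, 1 month, 2 months, and 3 months. The results show that, in every period, the random forest hotspot classification method is more accurate than the space-time kernel density method in both the area hit rate and the case hit rate. Both methods effectively identify the high-crime areas within the predicted hotspots: random forest identifies them more efficiently over smaller areas and shorter periods, whereas the space-time kernel density method performs better over larger areas and longer periods.

Key words: space-time kernel density, random forest algorithm, crime hotspot prediction, high-crime area identification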
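
As a rough illustration of a kernel density estimate based on spatiotemporal proximity, the sketch below scores each grid cell by summing the contributions of past cases, weighted by distance in space and by age in time. The abstract does not state the kernel functions, bandwidths, or hotspot threshold used in the study, so the Gaussian spatial kernel, the exponential temporal decay, the parameters `h_space` and `h_time`, and the `share` cutoff are all assumptions made for this example.

```python
from math import exp, hypot


def st_kernel_risk(cell_xy, events, t_now, h_space=200.0, h_time=30.0):
    """Space-time kernel density risk at one cell centre.

    events: iterable of (x, y, t), with x, y in metres and t in days.
    h_space, h_time: hypothetical bandwidths (metres, days), not from the paper.
    """
    cx, cy = cell_xy
    risk = 0.0
    for x, y, t in events:
        age = t_now - t
        if age < 0:
            continue  # ignore events dated after the prediction time
        d = hypot(x - cx, y - cy)
        spatial = exp(-0.5 * (d / h_space) ** 2)  # Gaussian decay with distance
        temporal = exp(-age / h_time)             # exponential decay with age
        risk += spatial * temporal
    return risk


def top_cells(risk_by_cell, share=0.1):
    """Label the highest-risk share of cells as hotspots (at least one cell)."""
    k = max(1, round(len(risk_by_cell) * share))
    ranked = sorted(risk_by_cell, key=risk_by_cell.get, reverse=True)
    return set(ranked[:k])


# Toy data: a 2 x 2 grid of 50 m cells and three past cases (x, y, day).
centres = {(r, c): (c * 50 + 25, r * 50 + 25) for r in range(2) for c in range(2)}
events = [(20.0, 20.0, 95.0), (30.0, 25.0, 99.0), (80.0, 80.0, 10.0)]
risk = {cell: st_kernel_risk(xy, events, t_now=100.0) for cell, xy in centres.items()}
hotspots = top_cells(risk, share=0.25)
print(hotspots)
```

In this toy case the two recent events near the lower-left corner dominate the old event in the opposite cell, so cell `(0, 0)` is the one labeled as a hotspot.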

Abstract:

Crime prediction is of great significance for formulating police tactics and for implementing crime prevention and control. Machine learning and density mapping are two common approaches to crime hotspot prediction. However, few published studies systematically compare the predictions of these two approaches over different time periods. This study aims to fill that gap. Using crime patterns uncovered from data covering 2013 to May 2016, we predicted the hotspot distribution of theft crimes for four subsequent periods (2 weeks, 1 month, 2 months, and 3 months, beginning in June 2016) with the random forest algorithm and with the traditional space-time kernel density method, and compared the two sets of predictions. The research area was divided into grid cells of 50 m × 50 m, and each cell was predicted to be either a hotspot or a non-hotspot area in the next prediction period. We then overlaid the predictions on the locations of real cases to evaluate the accuracy of the two methods. The results show that both the area hit rate and the case hit rate of the random forest hotspot classification method are higher than those of the space-time kernel density method in every period. Both methods can effectively identify the high-crime areas within the predicted hotspots. Over a relatively short period and a small area, the random forest method identifies high-crime areas more effectively than the space-time kernel density method. Over a relatively long period and a large area, however, the space-time kernel density method yields better results in identifying high-crime areas.

Key words: space-time kernel density, random forest algorithm, crime hotspot prediction, high-crime area identification
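
The overlay evaluation described in the abstract can be sketched as follows: real case locations are mapped onto the 50 m × 50 m grid and compared with the cells predicted as hotspots. The abstract does not give formulas for the two hit rates, so the definitions below are assumptions for illustration: the area hit rate is taken as the share of predicted hotspot cells that contain at least one real case, and the case hit rate as the share of real cases that fall inside predicted hotspot cells. The function names `cell_of` and `hit_rates` are hypothetical.

```python
from collections import Counter
from math import floor

CELL_SIZE = 50.0  # metres; the grid resolution stated in the abstract


def cell_of(x, y, x_min=0.0, y_min=0.0, cell_size=CELL_SIZE):
    """Map projected coordinates (metres) to a (row, col) grid cell."""
    return floor((y - y_min) / cell_size), floor((x - x_min) / cell_size)


def hit_rates(hot_cells, cases):
    """Return (area_hit_rate, case_hit_rate).

    hot_cells: set of (row, col) cells predicted as hotspots
    cases:     list of (x, y) locations of real cases in the evaluation period

    Assumed definitions (not taken from the paper):
      area hit rate = share of predicted hotspot cells with at least one case
      case hit rate = share of real cases inside predicted hotspot cells
    """
    cases_per_cell = Counter(cell_of(x, y) for x, y in cases)
    hot_with_case = sum(1 for cell in hot_cells if cases_per_cell[cell] > 0)
    cases_in_hot = sum(cases_per_cell[cell] for cell in hot_cells)
    area_hit = hot_with_case / len(hot_cells) if hot_cells else 0.0
    case_hit = cases_in_hot / len(cases) if cases else 0.0
    return area_hit, case_hit


# Toy data: two predicted hotspot cells and four real cases.
hot_cells = {(0, 0), (1, 1)}
cases = [(10.0, 10.0), (20.0, 30.0), (120.0, 120.0), (160.0, 10.0)]
area_hit, case_hit = hit_rates(hot_cells, cases)
print(area_hit, case_hit)  # prints: 0.5 0.5
```

Here one of the two predicted cells contains real cases, and two of the four cases fall inside predicted cells, so both rates are 0.5 under the assumed definitions.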