PROGRESS IN GEOGRAPHY, 2018, Vol. 37, Issue (6): 761-771. doi: 10.18306/dlkxjz.2018.06.003

• Articles •

Comparison of random forest algorithm and space-time kernel density mapping for crime hotspot prediction

Lin LIU1,2,3, Wenjuan LIU1, Weiwei LIAO1, Hongjie YU1, Chao JIANG1, Rongping LIN1, Jiakai JI1, Zheng ZHANG1

  1. Center of Integrated Geographic Information Analysis, School of Geography and Planning, Sun Yat-Sen University, Guangzhou 510275, China
  2. Center of Geographic Information Analysis for Public Security, School of Geographic Sciences, Guangzhou University, Guangzhou 510006, China
  3. Department of Geography, University of Cincinnati, Cincinnati, OH 45221-0131, USA
  • Received: 2018-02-02 Revised: 2018-03-29 Online: 2018-06-28 Published: 2018-06-28
  • Supported by:
    Key Program of National Natural Science Foundation of China, No.41531178; Research Team Program of Natural Science Foundation of Guangdong Province, China, No.2014A030312010; National Natural Science Foundation of China, No.41171140; Science and Technology Program of Guangdong Province, China, No.2015A020217003


Crime prediction is of great significance for formulating police tactics and for implementing crime prevention and control in different time periods. Machine learning and density mapping are two common approaches to crime hotspot prediction. However, few published studies have systematically compared the predictions of these two approaches. This study aims to fill that gap. Using crime patterns uncovered from data for 2013 to May 2016, we predicted the hotspot distribution of theft crimes in the first two weeks of June, July, and August 2016 with the random forest algorithm and the traditional space-time kernel density method, and compared the two sets of predictions. The study area was divided into grid cells of 50 m × 50 m, and each cell was classified as either a hotspot or a non-hotspot for the next prediction period. We then overlaid the predictions on the locations of actual cases to evaluate the accuracy of the two methods. The results show that the hit rates for both area and cases of the random forest hotspot classification method are higher than those of the space-time kernel density method across the different periods. Both methods can effectively identify high-crime areas in their hotspot predictions. For a relatively short period and a small area, the random forest hotspot classification method is more effective than the space-time kernel density method. For a relatively long period and a large area, however, the space-time kernel density crime risk estimation method yields better results in identifying high-crime areas.

Key words: space-time kernel density, random forest algorithm, crime hotspot prediction, high crime areas identification
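
The overlay evaluation described in the abstract (predicted hotspot cells on a 50 m grid compared against the locations of actual cases) could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define its hit rates, so `case_hit_rate` and `cell_hit_rate` are hypothetical definitions, the coordinates are assumed to be projected and in metres, and the toy data are invented.

```python
from math import floor

CELL_SIZE_M = 50.0  # grid resolution stated in the abstract


def cell_of(x, y, cell_size=CELL_SIZE_M):
    """Map projected coordinates (metres) to a (col, row) grid cell index."""
    return (floor(x / cell_size), floor(y / cell_size))


def case_hit_rate(predicted_hot_cells, cases, cell_size=CELL_SIZE_M):
    """Share of observed cases that fall inside predicted hotspot cells.

    Assumed definition; the abstract does not give the exact formula.
    """
    if not cases:
        return 0.0
    hits = sum(
        1 for x, y in cases
        if cell_of(x, y, cell_size) in predicted_hot_cells
    )
    return hits / len(cases)


def cell_hit_rate(predicted_hot_cells, cases, cell_size=CELL_SIZE_M):
    """Share of predicted hotspot cells containing at least one observed case.

    Also an assumed definition, used here as a stand-in for an
    area-based hit rate.
    """
    if not predicted_hot_cells:
        return 0.0
    occupied = {cell_of(x, y, cell_size) for x, y in cases}
    return len(predicted_hot_cells & occupied) / len(predicted_hot_cells)


# Hypothetical toy data: three predicted hotspot cells, four observed cases.
predicted = {(0, 0), (1, 0), (3, 2)}
observed = [(10.0, 20.0), (60.0, 5.0), (75.0, 40.0), (500.0, 500.0)]

print(case_hit_rate(predicted, observed))
print(cell_hit_rate(predicted, observed))
```

The same two functions can be applied to the hotspot sets produced by either prediction method, so that the two methods are compared on identical cases and an identical grid.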