PROGRESS IN GEOGRAPHY ›› 2020, Vol. 39 ›› Issue (7): 1140-1148. doi: 10.18306/dlkxjz.2020.07.007

• Articles •

Development and application test of a collection system for paleoclimate research documents from ResearchGate

ZHANG Xuezhen1,2, YIN Jun1, BAI Mengxin1,2, LI Yanbo1, ZHENG Jingyun1,2,*

  1. Key Laboratory of Land Surface Pattern and Simulation, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received:2019-05-23 Revised:2019-10-09 Online:2020-07-28 Published:2020-09-28
  • Contact: ZHENG Jingyun
  • Supported by:
    National Key Research and Development Program of China (2017YFA0603301); National Natural Science Foundation of China (41430528); Key Project of the Chinese Academy of Sciences (ZDRW-ZS-2017-4); Key Research Program of Frontier Sciences from CAS (QYZDB-SSW-DQC005); Youth Innovation Promotion Association, CAS (2015038)


In this study, a collection system for paleoclimate research documents (CSPD) was developed using Python (V3.6) and MySQL (V5.7) on the Linux platform. In parallel, 1450 paleoclimate research papers were manually selected from the National Climatic Data Center (NCDC). The keywords of these papers were classified and compiled into a keyword list for document collection. Using the CSPD with this keyword list, we collected 32493 paleoclimate research papers from ResearchGate. To verify the validity of the CSPD and the keyword list, we counted the frequencies of four categories of keywords in the 32493 papers from ResearchGate and in the 1450 papers from NCDC, and then compared the frequencies between the two document datasets. The four categories are temporal scale, type of proxy data, meteorological factor, and study area. The frequencies of all four categories of keywords match well between the two datasets. This result suggests that the CSPD combined with the keyword list is a valid collection method and that the resulting document dataset represents the current status of paleoclimate research. The large number of paleoclimate research documents on ResearchGate can serve as a rich source of paleoclimate reconstruction results that have not been fully included in NCDC. The CSPD thus met its design objective.
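
The frequency comparison described in the abstract can be illustrated with a minimal sketch. It is not the CSPD code: the category names, the keywords, the toy documents, and the function names below are hypothetical, and the use of a Pearson correlation to quantify how well the two sets of frequencies match is an assumption, since the abstract does not state which measure was used.

```python
import math

# Hypothetical keyword list: two of the four categories, with made-up keywords.
CATEGORIES = {
    "temporal_scale": ["holocene", "last millennium", "pleistocene"],
    "proxy_type": ["tree ring", "ice core", "stalagmite"],
}

# Toy stand-ins for the reference (NCDC) and collected (ResearchGate) datasets.
REFERENCE_DOCS = [
    "Holocene tree ring record",
    "Ice core from the Holocene",
    "Pleistocene stalagmite",
]
COLLECTED_DOCS = [
    "Tree ring chronology, Holocene",
    "Holocene ice core",
    "Holocene stalagmite",
    "Pleistocene ice core",
]


def keyword_shares(documents, keywords):
    """Return the fraction of documents that mention each keyword.

    Matching is a case-insensitive substring test (an assumption).
    """
    total = len(documents)
    shares = {}
    for kw in keywords:
        hits = sum(1 for text in documents if kw in text.lower())
        shares[kw] = hits / total if total else 0.0
    return shares


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def compare_datasets(reference_docs, collected_docs, categories):
    """Correlate per-keyword shares of the two datasets, category by category."""
    result = {}
    for name, keywords in categories.items():
        ref = keyword_shares(reference_docs, keywords)
        col = keyword_shares(collected_docs, keywords)
        result[name] = pearson(
            [ref[kw] for kw in keywords], [col[kw] for kw in keywords]
        )
    return result


agreement = compare_datasets(REFERENCE_DOCS, COLLECTED_DOCS, CATEGORIES)
```

Under these assumptions, a correlation close to 1 for a category would indicate that the collected documents emphasize the same keywords as the reference set; any other agreement measure could be substituted in `compare_datasets`.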

Key words: paleoclimate, ResearchGate, document data, collection system, application test