Progress in Geography ›› 2012, Vol. 31 ›› Issue (10): 1307-1317. doi: 10.11820/dlkxjz.2012.10.008

• Models and Methods •

Research Progress in Feature-Based Time Series Clustering Methods

SONG Ci, PEI Tao

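  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2011-10-01 Revised: 2012-03-01 Publication date: 2012-10-25 Release date: 2012-10-25
  • Corresponding author: PEI Tao (b. 1972), male, associate research fellow, working mainly on spatial data mining and spatial information statistics. E-mail: peit@lreis.ac.cn
  • About the author: SONG Ci (b. 1986), male, PhD candidate, whose main research interest is spatial data mining. E-mail: songc@lreis.ac.cn
  • Funding:

    Knowledge Innovation Program of the Chinese Academy of Sciences, Key Direction Project (KZCX2-YW-QN303); Independent Innovation Project of the Institute of Geographic Sciences and Natural Resources Research, CAS (200905004); 863 Program (2009AA12Z227).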

Research Progress in Time Series Clustering Methods Based on Characteristics

SONG Ci, PEI Tao   

  1. State Key Lab of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  • Received: 2011-10-01 Revised: 2012-03-01 Online: 2012-10-25 Published: 2012-10-25

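Abstract: Time series clustering partitions a set of objects into groups according to their similarity, revealing what objects in the same group have in common and how objects in different groups differ. When the series are high-dimensional, traditional time series clustering methods are easily affected by noise, a suitable similarity measure is hard to define, and the resulting clusters often lack a clear meaning. When the data contain missing values or the series differ in length, these methods are also difficult to apply. To address these problems, some researchers have proposed feature-based time series clustering methods, which not only overcome these difficulties but also uncover similarity in the essential characteristics of the series. Organized by the different characteristics of time series, this paper reviews the research progress of feature-based time series clustering methods, analyzes and comments on them, and finally offers an outlook on future research.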

Keywords: clustering, time series, time series characteristics, data mining

Abstract: As terabytes of time series data are generated worldwide, techniques for analyzing such data have drawn increasing attention. To understand how these data differ, time series clustering methods divide them into groups according to their similarity. Because time series are high-dimensional, traditional clustering methods designed for static data do not work well for time series: they are susceptible to noise, and a suitable similarity measure is hard to define, so the results are often meaningless. Many methods also struggle when the data contain missing values or the series are of unequal length. Characteristics-based time series clustering methods can deal with these problems and discover the essential similarities of time series from multiple perspectives. Organized by the characteristics of time series, this paper reviews the research progress of characteristics-based clustering methods for time series. First, we define the characteristics of time series and classify them. We then review the different characteristics-based clustering methods and summarize the general properties of each. Finally, we discuss some deficiencies of existing methods and outline future directions for related research.

Key words: characteristics of time series, clustering, data mining, time series
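
The abstract's central idea, that mapping each series to a fixed-length feature vector lets series with missing values or unequal lengths be clustered together, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three features (mean, standard deviation, linear trend slope), the helper names `extract_features`, `standardize`, and `kmeans`, the use of k-means with farthest-point initialization, and the toy data.

```python
import math
from statistics import fmean, pstdev


def extract_features(series):
    """Map a series of any length (None = missing) to (mean, std, slope)."""
    # Keep (time index, value) pairs for observed points only.
    pts = [(t, x) for t, x in enumerate(series) if x is not None]
    ts = [t for t, _ in pts]
    xs = [x for _, x in pts]
    mean_x, mean_t = fmean(xs), fmean(ts)
    # Least-squares slope of value against time index (linear trend).
    denom = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (x - mean_x) for t, x in pts) / denom if denom else 0.0
    return (mean_x, pstdev(xs), slope)


def standardize(vectors):
    """Z-score each feature dimension so no single feature dominates distances."""
    cols = list(zip(*vectors))
    mus = [fmean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant features
    return [tuple((v - m) / s for v, m, s in zip(vec, mus, sds)) for vec in vectors]


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(fmean(dim) for dim in zip(*members))
    return labels


# Toy data (hypothetical): two rising and two flat series,
# with different lengths and some missing observations.
series = [
    [1, 2, 3, 4, 5, 6],
    [2, 3, None, 5, 6, 7, 8, 9],
    [5, 5.1, 4.9, 5, 5.1],
    [3, 2.9, 3.1, 3, None, 3.0, 2.9],
]
features = [extract_features(s) for s in series]
labels = kmeans(standardize(features), k=2)
print(labels)
```

Because every series is reduced to the same three numbers, the clustering step never compares raw observations point by point, which is why unequal lengths and gaps pose no difficulty in this sketch. Dropping missing points before computing features is a simplification; features that depend on adjacent observations (for example autocorrelation) would need more careful handling.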