Progress in Geography (地理科学进展), 2018, 37(1): 66-78. https://doi.org/10.18306/dlkxjz.2018.01.008

Sub-disciplines of Physical Geography

The review and outlook of digital soil mapping (数字土壤制图研究综述与展望)

ZHU A-Xing (朱阿兴) 1,2,3,4,5,6, YANG Lin (杨琳) 3,7,*, FAN Naiqing (樊乃卿) 3, ZENG Canying (曾灿英) 1, ZHANG Ganlin (张甘霖) 8

1. School of Geographical Science, Nanjing Normal University, Nanjing 210023, China
2. Key Laboratory of Virtual Geographic Environment, Ministry of Education, Nanjing Normal University, Nanjing 210023, China
3. State Key Lab of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
4. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
5. State Key Laboratory Cultivation Base of Geographical Environment Evolution, Nanjing 210023, China
6. Department of Geography, University of Wisconsin-Madison, Madison, WI 53706, USA
7. School of Geographic and Oceanographic Sciences, Nanjing University, Nanjing 210023, China
8. Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China

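Corresponding author: YANG Lin (杨琳), female, from Weihai, Shandong; associate professor; works mainly on digital soil mapping and spatial sampling design. E-mail: yanglin@nju.edu.cn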

Received: 2018-01-16

Revised: 2018-01-18

Published online: 2018-01-28

Copyright: 2018 Progress in Geography (《地理科学进展》). All rights reserved.

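Funding: National Natural Science Foundation of China (L1624026, 41431177, 41471178); Strategic Research Project on Discipline Development of the Academic Divisions of the Chinese Academy of Sciences (2016-DX-C-02); Major Natural Science Research Project of Jiangsu Higher Education Institutions (14KJA170001); Outstanding Science and Technology Innovation Team Project of Jiangsu Higher Education Institutions.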

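About the author: ZHU A-Xing (朱阿兴), male, from Changxing, Zhejiang; professor; works on the fundamental theory of geographic information science and its application to digital soil mapping. E-mail: axing@njnu.edu.cn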


Abstract

The spatial distribution of soil reflects the processes of soil formation and development. Digital soil mapping is a new and efficient technique for representing the spatial distribution of soil and has developed rapidly over the last three decades. Its theoretical bases are the theory of soil-forming factors and the first law of geography. Researchers have done substantial work on the generation of environmental covariates, soil sampling methods, mapping models and methods, and the production and evaluation of soil maps. Applications have expanded from small areas to large regions and even the global scale. Future directions for digital soil mapping include: new techniques for characterizing environmental covariates, especially those expressing human activities; efficient use of new data and legacy data; new inference methods that closely integrate pedogenetic knowledge with mathematical models; and computing modes that support big data and multiple terminals.

Keywords: spatial distribution of soil; digital soil mapping; environmental covariates; soil-environment relationship


Cite this article: ZHU A-Xing, YANG Lin, FAN Naiqing, ZENG Canying, ZHANG Ganlin. The review and outlook of digital soil mapping[J]. Progress in Geography, 2018, 37(1): 66-78. https://doi.org/10.18306/dlkxjz.2018.01.008

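1 Introduction

Information on the spatial distribution of soil types and properties is basic data for eco-hydrological modelling, global change research, and resource and environmental management, and mapping is an effective way to acquire and represent it. In the past, soil surveyors built a soil-landscape model in their minds through field investigation and mapped soils by manually delineating polygons on topographic maps, aerial photographs, or satellite images (Hudson, 1992). Over the past 30 years, with the development of geographic information systems, data mining, and techniques for acquiring land-surface data, digital soil mapping has become a new and efficient way to represent the spatial distribution of soil (McBratney et al, 2003; 朱阿兴 et al, 2008; Hengl et al, 2017).

Digital soil mapping is a soil survey and mapping approach that takes the soil-landscape model as its theoretical basis and spatial analysis and mathematical methods as its technical means; it is a modern technical system distinct from traditional soil survey and mapping. In practice, a digital soil map is generated in one of two ways: from soil property data together with geographic environmental data that are related to soil formation or that co-vary spatially with soil, or by geostatistical methods that exploit the spatial autocorrelation of soil properties to predict the spatial distribution of soil. Soil maps produced in this way are usually raster-based and can therefore represent the spatial variation of soil in greater detail.

The emergence and development of computer technology and geographic information systems (GIS) promoted digital soil mapping. In 1975, the first international meeting on soil information systems was held in Wellington, New Zealand, and a working group was established; the group was later incorporated into Commission V (Soil Genesis, Classification and Geography) of the International Society of Soil Science. Since then, the digital representation of soil information has developed rapidly. In 1990 the International Union of Soil Sciences established a commission on Pedometrics, in 2005 it established a working group on Digital Soil Mapping, and in February 2009 the "Global Digital Soil Mapping Project" was formally launched in the United States.

Starting from the theoretical basis of digital soil mapping, this paper reviews the state of research and the latest developments in four areas: acquisition of environmental covariate data, sampling methods, mapping models and methods, and the production and validation of soil maps. It then discusses trends and prospects for digital soil mapping.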

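2 Theoretical basis of digital soil mapping

Digital soil mapping reflects the characteristics and patterns of the spatial distribution of soil, which in turn reflects the processes of soil formation and development. The first theoretical basis of digital soil mapping is therefore the theory of soil-forming factors, which holds that soil is the product of the combined action of five factors: parent material, climate, organisms, topography, and time (Jenny, 1941). In recent years human activity has also become an important environmental factor that alters the direction and intensity of soil formation (McBratney et al, 2003; 朱阿兴 et al, 2008; 朱鹤健 et al, 2010; 宋敏 et al, 2017). According to this theory, the soil of a given area forms from its parent material through a series of physical, chemical, and biochemical processes under particular hydrothermal conditions and biological influences. Because of this relationship, the spatial distribution of soil co-varies with that of the environmental factors (McBratney et al, 2003). A particular combination of environmental conditions produces a particular soil with particular properties. Similar soils occur under similar combinations of environmental factors and occupy corresponding locations, and the more similar the environmental combinations, the more similar the soils (Zhu et al, 2015).

Because most environmental factors vary continuously in space, soil also changes gradually and continuously in space, and points that are closer together tend to have more similar soil properties. This is the so-called "first law of geography" (Tobler, 1970), the second theoretical basis of digital soil mapping, and it has been confirmed by studies in China and abroad (Wilding et al, 1965; Burrough, 1989; 杨琳, 2009). Two adjacent soil types are usually separated not by a sharp boundary but by a transition zone. Soil in the transition zone has characteristics of both types; that is, it bears some degree of similarity to each (朱阿兴 et al, 2008).

In addition, the spatial variation of soil is scale-dependent and appears as spatial patterns: a given scale reveals only the variation associated with it, and a given spatial structure is expressed only at a particular scale. Analysis over a large spatial extent reveals the regional pattern of soil distribution but tends to mask the features seen at smaller extents. Analysis over a small extent mostly captures variation within the local environment and so complements large-extent analysis, but it tends to miss variation at the larger extent (张黎明 et al, 2011; 邓红眉, 2013). The dominant controlling factors also differ with scale (杨奇勇 et al, 2011; 邓红眉, 2013; Miller et al, 2015). Over large extents, soil distribution mainly follows changes in bioclimatic conditions. Over smaller extents, the broad bioclimatic influence on soil formation is essentially homogeneous, and soil formation and development are controlled mainly by local factors such as topography and parent material.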

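3 Current state of digital soil mapping research

Digital soil mapping generally involves four steps: generating environmental covariate information, acquiring sample data, building the mapping model or method, and producing and validating the soil map. The state of research on each is reviewed below.

3.1 Generation of environmental covariate information

Many digital soil mapping methods use, as auxiliary variables, geographic variables that reflect the spatial variation of the soil environment; these are collectively called "environmental covariates". Choosing them is a key step, and two considerations apply. First, the chosen variables should reflect the spatial variation of soil; besides the soil-forming factors, they should include other factors that express this variation, such as crop growth status. Second, spatial information on the chosen variables must be readily obtainable; variables whose spatial variation is hard to obtain, such as the time factor, generally cannot be used directly in digital soil mapping. The acquisition of spatial information for commonly used environmental variables is briefly described below.

Parent material is the material basis of soil formation, but direct information on it is usually very hard to obtain. In practice, geological or geomorphological maps are therefore commonly used in place of parent material maps (Zhu et al, 1994; Gray et al, 2016; Hengl et al, 2017); the information on these maps is usually geological types in vector form.

Climatic factors can be divided into macroclimate and microclimate. Over large extents, macroclimate is the main concern, usually represented by mean annual precipitation, mean annual temperature, accumulated temperature, or relative humidity. Over small extents, the influence of macroclimate on soil formation is essentially homogeneous and can be ignored, whereas the influence of microclimate varies spatially, mainly because of differences in landform position and topographic conditions. Over small extents, therefore, climate is generally not considered explicitly; instead, topographic and geomorphic information is used to represent the effect of microclimate on soil development (朱阿兴 et al, 2008).

Topography is the most commonly used environmental variable (McSweeney et al, 1994; Behrens et al, 2014). It includes quantitative indices that describe terrain characteristics (terrain attributes) and indices that describe landform position. Terrain attributes can be computed directly or indirectly from a digital elevation model (DEM); examples are elevation, slope gradient, aspect, curvature, distance to streams, distance to ridges, and the topographic wetness index. Landform positions are the morphologically simple basic components of terrain, such as ridge, shoulder, backslope, footslope, and valley; they are similar to the concept of landform elements (Blaszczynski, 1997; MacMillan et al, 2000) and are usually expressed as slope positions. In recent years, researchers have represented spatially gradual slope positions (such as shoulder and backslope) in fuzzy terms, producing fuzzy slope positions as a new environmental variable and applying them to predict the spatial distribution of soil properties in low-relief small catchments (Qin et al, 2009, 2012; 秦承志 et al, 2010).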
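As an illustration of how a terrain attribute can be derived from a DEM, the following minimal sketch computes slope gradient by finite differences. The function name `slope_degrees`, the tiny synthetic DEM, and the use of `numpy.gradient` are assumptions made for this example; operational terrain analysis software typically uses dedicated neighbourhood algorithms.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope gradient in degrees from a gridded DEM (finite differences)."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# Hypothetical 4 x 4 DEM: elevation rises 1 m per 1 m cell towards the east.
dem = np.tile(np.arange(4, dtype=float), (4, 1))
slope = slope_degrees(dem, cell_size=1.0)
```

On this synthetic plane every cell has the same gradient, which makes the result easy to check by hand before applying the function to real elevation data.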

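Biological factors include plants, animals, and microorganisms. Vegetation growing on different soils differs in type or in growth condition, so vegetation type or condition can be used to infer soil type or properties. Information on soil fauna and microorganisms is hard to obtain, but because they are often correlated with surface vegetation, vegetation condition is used as a substitute in practice. Vegetation information falls into two groups. One is qualitative spatial information on types, such as vegetation type. The other is quantitative spatial information on attributes, mainly vegetation indices and biophysical parameters computed from remote sensing imagery, such as the normalized difference vegetation index (NDVI), leaf area index (LAI), and canopy closure (CC) (Boettinger, 2010; Song et al, 2017).

In plains and other low-relief areas, commonly used terrain and vegetation information cannot effectively express the spatial variation of soil. Researchers have therefore proposed capturing soil spatial variation from land-surface dynamic feedback over a specific period, using remote sensing observation and quantitative pattern analysis (刘峰 et al, 2009; Zhu et al, 2010; Wang et al, 2012; Zhao et al, 2014; Guo et al, 2015, 2016; Zeng et al, 2017). The method uses time-series MODIS data to capture the dynamic spectral changes, or the diurnal temperature differences, while the land surface dries in the short period after a rainfall event, and uses them as environmental covariates to predict the spatial distribution of soil. Where terrain, vegetation, and other surface conditions are similar, spatial differences in the drying response after rain depend mainly on the soil. Land-surface dynamic feedback can therefore indicate spatial differences in soil effectively, and in recent years it has been used as a new type of environmental covariate for predicting soil distribution in low-relief areas.

Parent material, climate, organisms, and topography act on soil development through time, but information on the duration of soil formation is hard to obtain directly; it is usually reflected in other soil-forming factors (such as landform position) or in the knowledge of local soil experts. Time has therefore not yet been considered explicitly in soil mapping.

In recent years, besides land-surface dynamic feedback and fuzzy slope positions, newly developed and explored environmental variables include human activity factors, legacy soil maps, and proximal sensing data. Human activity plays an increasing role in soil spatial variation and is attracting growing attention. For example, 宋敏 et al (2017) applied a Fourier transform to NDVI time series to generate environmental variables that express crop rotation, and showed that these variables improve the accuracy of soil organic matter mapping in cultivated areas. Conventional soil maps are also used to assist predictive soil mapping: some studies use them as model inputs (Brus et al, 2008; Kempen et al, 2009), while others extract the soil-environment knowledge they contain and then update the legacy map or produce a new one (Qi et al, 2003; Yang et al, 2011; 黄魏, 2016). In addition, data from proximal soil sensors, such as electrical conductivity and multispectral data, have been used for soil mapping (Rossel et al, 2008; Besson et al, 2010; Myers et al, 2010; 史舟 et al, 2011; Shi et al, 2015).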

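3.2 Collection of soil samples

Soil sample data are obtained directly by field sampling. Sampling methods fall into three groups: (1) methods based on probability theory; (2) methods based on the spatial autocorrelation of samples; and (3) methods assisted by environmental factors.

3.2.1 Sampling methods based on probability theory

The most basic probability-based strategy is simple random sampling, in which every sample is drawn at random from the population with equal probability. Its advantages are that the selection probabilities are known and that it is easy to carry out, so it is often used where prior knowledge is scarce or absent. When the study area can be clearly divided into geographic strata, for example by parent material or land-use type, random sampling can be carried out within the strata; this is stratified random sampling. Because stratification helps to avoid the spatial clustering of samples that can occur in random sampling, it often improves sampling efficiency (Brus, 1994; Yang et al, 2018). Systematic, or regular, sampling is also common in soil sampling: the study area is divided into regular shapes (such as squares), and one sample is taken in each, either at random or at the centre. Its advantage is good coverage of geographic space.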
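Stratified random sampling as described above can be sketched in a few lines. The function `stratified_random_sample`, the twelve-cell study area, and the land-use labels are hypothetical and serve only to show the idea of drawing at random within each stratum.

```python
import numpy as np

def stratified_random_sample(strata, n_per_stratum, seed=0):
    """Draw n_per_stratum random cell indices from each stratum label."""
    rng = np.random.default_rng(seed)
    strata = np.asarray(strata)
    chosen = []
    for label in np.unique(strata):
        members = np.flatnonzero(strata == label)
        k = min(n_per_stratum, members.size)  # never ask for more cells than exist
        chosen.extend(rng.choice(members, size=k, replace=False).tolist())
    return sorted(chosen)

# Hypothetical example: 12 grid cells that belong to three land-use strata.
land_use = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
sample_idx = stratified_random_sample(land_use, n_per_stratum=2)
```

Fixing the seed makes the design reproducible; in a real survey the strata would come from a parent material or land-use map rather than a hand-written array.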

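Because classical statistics assumes that samples are independent, probability sampling usually does not consider the spatial relationships among the designed sample locations. In reality, values of the target variable at different locations are usually spatially correlated, so probability sampling may place too many redundant samples where spatial correlation is strong and too few where it is weak. Moreover, obtaining an accurate soil spatial distribution with a probability sampling design usually requires a large number of samples.

3.2.2 Sampling methods based on the spatial autocorrelation of samples

Geostatistical spatial sampling designs sample locations with the minimization of the prediction error variance (for example the maximum or mean kriging variance) as the objective function (Sacks et al, 1988; van Groenigen et al, 1998). With this objective, the method designs the optimal number and spatial configuration of samples and yields samples that are globally representative (Hughes et al, 1981; Russo, 1984; Warrick et al, 1987; Wang et al, 2009). Sampling based on a spatial autocorrelation model can give an optimal solution for the number and layout of samples, but its performance depends entirely on how well the model describes the spatial variation of the target variable. Building such a model usually requires prior knowledge of the spatial variation of the target variable (Webster et al, 1990), and that variation must satisfy the second-order stationarity assumption. In most practical situations, especially over large study areas, this prior knowledge requires many prior samples and is hard to obtain (Webster et al, 1992; Simbahan et al, 2006), and second-order stationarity is also hard to satisfy. These constraints limit the practical application of sampling designs based on spatial autocorrelation models (Isaaks et al, 1989; Goovaerts, 1999). To reduce the effect of violating second-order stationarity, 王劲峰 et al (2009) and Wang J F et al (2013) developed a stratified spatial sampling method that minimizes the within-stratum variance (the Sandwich method). At present this method is used mainly to estimate regional totals, and its application to digital soil mapping is not yet mature.

3.2.3 Sampling methods assisted by environmental factors

Sampling assisted by environmental factors rests on the covariation between soil and environmental factors and uses those factors to guide the sampling design and improve efficiency (Minasny et al, 2006; Brus et al, 2007; Zhu et al, 2008, 2010; Mulder et al, 2013; Yang et al, 2013; 韩宗伟 et al, 2014). These methods fall broadly into three groups: purposive sampling based on expert knowledge; Latin hypercube sampling based on stratification of environmental factors; and representative sampling based on environmental similarity.

(1) Purposive sampling based on expert knowledge. According to the study objective, experienced experts select a small number of samples that are "representative" or reflect "average conditions" (Webster, 1977; Webster et al, 1990; Trochim, 2006). This strategy suits areas with rich prior knowledge and can reveal the spatial distribution of soil with relatively few samples. However, it depends on the surveyor's subjective experience and is difficult to evaluate objectively.

(2) Latin hypercube sampling based on stratification of environmental factors. This method designs samples that reproduce the distribution of the environmental factors as closely as possible: by covering the attribute space of the factors, the samples capture their multivariate distribution well (Minasny et al, 2006). Conditioned Latin hypercube sampling is regarded as an effective method and is widely used (Mulder et al, 2013; Clifford et al, 2014; Reza Pahlavan Rad et al, 2014; Gao et al, 2016; Stumpf et al, 2016). The key to Latin hypercube sampling is the stratification of the input probability distribution: the cumulative probability scale (0 to 1.0) is divided into equal intervals, and a sample is then drawn at random from each interval of the input distribution. Taken together, the samples drawn from the intervals represent the whole distribution of the environmental factor.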
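The stratification of the cumulative probability scale can be sketched as follows. This is plain Latin hypercube sampling, not the conditioned variant of Minasny et al (2006), which additionally restricts the design to locations that actually exist in the covariate rasters by means of an optimization step. The function names and the synthetic covariate table are assumptions for illustration.

```python
import numpy as np

def latin_hypercube(n_samples, n_vars, seed=0):
    """Return cumulative probabilities with one draw per equal-probability interval."""
    rng = np.random.default_rng(seed)
    # Row i starts in interval [i/n, (i+1)/n) for every variable.
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_vars))) / n_samples
    # Shuffle the interval order independently for each variable.
    for j in range(n_vars):
        u[:, j] = rng.permutation(u[:, j])
    return u

def to_covariate_values(u, covariates):
    """Map cumulative probabilities onto the empirical quantiles of each covariate."""
    columns = [np.quantile(covariates[:, j], u[:, j]) for j in range(covariates.shape[1])]
    return np.column_stack(columns)

# Hypothetical covariate table: 200 grid cells with slope (degrees) and NDVI.
rng = np.random.default_rng(1)
covariates = np.column_stack([rng.gamma(2.0, 4.0, 200), rng.uniform(0.1, 0.9, 200)])
u = latin_hypercube(n_samples=5, n_vars=2)
design = to_covariate_values(u, covariates)
```

Each column of `u` contains exactly one value in each of the five equal-probability intervals, which is the defining property of a Latin hypercube design.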

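(3) Representative sampling based on environmental similarity. This method assumes that every sample carries knowledge of the soil-environment relationship and can represent areas with similar environmental conditions. A small number of typical samples, placed at locations typical of the environmental factors, can therefore provide information on the whole study area, which makes the approach efficient. The main methods are fuzzy c-means sampling (FCMS) (杨琳 et al, 2010) and multi-grade representative sampling (杨琳 et al, 2011; Yang L et al, 2013, 2016). FCMS places typical samples at the centres of clusters obtained by fuzzy c-means clustering (Dunn, 1973; Bezdek, 1981) of the environmental factors. Multi-grade representative sampling divides the environmental clusters into grades of representativeness: clusters of high grade represent the main features of soil spatial variation, and clusters of lower grade represent local detail. Samples are then laid out in order of grade so that sampling resources are allocated sensibly and sampling efficiency improves (Yang et al, 2013).

Latin hypercube sampling and representative sampling have both been widely used in recent years, and both make effective use of environmental factors. They differ in that Latin hypercube sampling partitions the environmental factor space into equal-probability strata so that the samples cover the multivariate space fully, whereas representative sampling uses clustering to find typical points that represent the soil distribution of the study area. The latter can substantially reduce the number of samples collected.

Beyond these methods, recent developments include supplementary sampling based on the uncertainty of spatial prediction and sampling that accounts for accessibility or cost. The uncertainty of spatial prediction can be divided into uncertainty in the attribute domain and uncertainty in the spatial domain. Attribute-domain approaches design supplementary samples from the uncertainty that Zhu et al (2015) computed from the individual representativeness of samples (Zhang et al, 2016); spatial-domain approaches mainly design them from the kriging variance (Brus et al, 2007; Juang et al, 2008). Schemes that consider both domains have also been proposed (Li et al, 2016). To reflect real survey conditions, some researchers now consider field cost or accessibility in the design, either by adding accessibility or cost constraints to Latin hypercube sampling to reduce sampling cost (Roudier et al, 2012; Mulder et al, 2013; Godinho Silva et al, 2014, 2015) or by using probability sampling methods that account for cost quantitatively (Yang et al, 2018).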

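3.3 Mapping methods

The mainstream digital soil mapping methods fall into three groups: methods based on correlation with environmental factors (factor correlation), methods based on spatial autocorrelation, and methods that combine the two.

3.3.1 Soil mapping based on factor correlation

Digital soil mapping based on factor correlation predicts the spatial distribution of soil types or properties from an established relationship between soil properties (or types) and environmental factors, and so produces a soil map. The methods include conventional statistical methods, machine learning and data mining, expert-knowledge-based mapping, and methods based on the individual representativeness of samples.

(1) Statistical methods

Statistical methods predict the spatial distribution of soil properties from the statistical relationship between soil and geographic environmental variables; examples are linear models and discriminant analysis (Moore et al, 1993; Odeh et al, 1994). Linear models establish a quantitative linear relationship between soil properties (or types) and the influencing factors. Commonly used models include ordinary linear models, generalized linear models, and generalized additive models (McBratney et al, 2000, 2003; Zhang et al, 2011). Discriminant analysis builds discriminant functions from a set of known samples and then uses them to assign unknown samples to classes; it is mostly used for soil type mapping (Bell et al, 1992, 1994; Dobos et al, 2001).

(2) Machine learning and data mining methods

These methods use machine learning and spatial data mining techniques, such as artificial neural networks, Bayesian models, regression and decision trees, and random forests, to capture and express the relationship between the spatial variation of soil properties and environmental variables, and then predict the spatial distribution of soil properties from that relationship (Zhu, 2000; Park et al, 2002; Grimm et al, 2008; Hengl et al, 2015, 2017; Gray et al, 2016).

Machine learning and data mining methods handle nonlinear relationships between soil and environmental factors more effectively and are among the most widely used methods to date. Most of them, however, including neural networks, Bayesian models, and random forests, are black-box or semi-black-box methods: the knowledge they acquire is hard to convert into rules, so the quantitative relationship between soil and environmental factors is hard to interpret directly. Regression trees can extract rules for the soil-environment relationship, but because the predicted property is discontinuous at every node, the resulting soil property map is not a smooth continuous surface, and with few nodes it can show abrupt changes in soil that do not match reality (McKenzie et al, 1999).

(3) Expert-knowledge-based methods

Expert-knowledge-based mapping obtains knowledge of the relationship between soil and geographic environmental variables from soil experts, combines it with a semantic model, and completes the mapping with geographic information technology; fuzzy logic inference is an example (Zhu et al, 1994, 1997, 2001; Zhu, 1997). The knowledge of soil-environment relationships is usually first expressed as membership functions. The membership functions of several factors are then combined to evaluate the membership of the soil at a location in a given soil type, so the soil at a location can have a membership (similarity) in several soil types. These memberships determine the soil type and properties at the location, and their use allows the continuity of soil spatial variation to be represented well (Zhu, 1997; Zhu et al, 1997, 2001). Expert knowledge can be obtained from experts familiar with local soil-environment relationships, or extracted by data mining from conventional soil maps, which embody such knowledge (Fayyad et al, 1996; Qi et al, 2003; Stoorvogel et al, 2017).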
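The membership-based inference described above can be sketched as follows. The bell-shaped curve, the minimum operator used to combine factors, and all optimum and width values are assumptions chosen for illustration; in practice the curve shapes and parameters come from the expert or from data mining.

```python
import numpy as np

def bell_membership(x, optimum, width):
    """Membership is 1 at the optimum and 0.5 at optimum +/- width."""
    return 1.0 / (1.0 + ((x - optimum) / width) ** 2)

def soil_membership(covariates, rules):
    """Combine per-factor memberships with the minimum (limiting-factor) operator."""
    per_factor = [bell_membership(covariates[name], optimum, width)
                  for name, (optimum, width) in rules.items()]
    return np.minimum.reduce(per_factor)

# Hypothetical covariates at three locations and rules for two soil types.
covariates = {"slope": np.array([2.0, 10.0, 25.0]),
              "elevation": np.array([100.0, 150.0, 300.0])}
rules_a = {"slope": (2.0, 5.0), "elevation": (100.0, 50.0)}   # (optimum, width)
rules_b = {"slope": (25.0, 5.0), "elevation": (300.0, 50.0)}

memberships = np.vstack([soil_membership(covariates, rules_a),
                         soil_membership(covariates, rules_b)])
hardened = memberships.argmax(axis=0)  # index of the soil type with highest membership
```

The `memberships` array corresponds to a set of fuzzy membership maps, one per soil type, and `hardened` to the crisp soil type map obtained from them.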

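(4) Methods based on the individual representativeness of samples

Under the assumption that the more similar the environmental factors, the more similar the soil properties, this approach treats each sample as a case that embodies a particular soil-environment relationship and can represent areas with a similar combination of environmental factors. The degree of representation is measured by the environmental similarity between two locations. Prediction uncertainty is derived from the environmental similarity, and the soil property value for the area that a sample can represent is computed with environmental similarity as the weight (刘京 et al, 2013; Zhu et al, 2015). The approach removes the strict requirement of existing methods that the sample set be globally representative, and offers a way to map soil properties over large areas from a limited number of arbitrarily distributed samples.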
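A minimal sketch of similarity-weighted prediction is given below. The Gaussian-type similarity function, the minimum operator across covariates, the uncertainty measure (one minus the largest similarity), and the sample values are all assumptions for illustration rather than the exact formulation of the cited papers.

```python
import numpy as np

def env_similarity(target, sample_env, scale):
    """Similarity in (0, 1] per covariate, then the minimum across covariates."""
    per_covariate = np.exp(-0.5 * ((sample_env - target) / scale) ** 2)
    return per_covariate.min(axis=1)

def predict_by_similarity(target, sample_env, sample_values, scale, threshold=0.0):
    """Similarity-weighted mean of sample values and an uncertainty estimate."""
    sim = env_similarity(target, sample_env, scale)
    uncertainty = 1.0 - sim.max()      # low when some sample resembles the target
    keep = sim > threshold             # ignore samples that are too dissimilar
    if not keep.any():
        return np.nan, uncertainty
    value = np.sum(sim[keep] * sample_values[keep]) / np.sum(sim[keep])
    return float(value), float(uncertainty)

# Hypothetical samples: covariates are slope (degrees) and NDVI; values are SOM (%).
sample_env = np.array([[5.0, 0.30], [12.0, 0.55], [20.0, 0.70]])
sample_som = np.array([3.2, 2.4, 1.5])
scale = sample_env.std(axis=0)

value, uncertainty = predict_by_similarity(np.array([5.0, 0.30]),
                                           sample_env, sample_som, scale)
```

Raising `threshold` leaves locations that no sample resembles unpredicted, which is how an uncertainty measure can be used to decide where a prediction should not be made.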

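3.3.2 Soil mapping based on spatial autocorrelation

Digital soil mapping based on spatial autocorrelation builds, on the basis of spatial autocorrelation theory, a model that describes the spatial autocorrelation of the target variable, and then uses the location of the point to be predicted to estimate the value of the target variable there (Matheron, 1963; Burgess et al, 1980; Isaaks et al, 1989; Goovaerts, 1999). Depending on the range over which autocorrelation is analysed, the methods are divided into global and local spatial autocorrelation analysis.

(1) Global spatial autocorrelation analysis

The main global method is trend surface analysis, which fits a polynomial (usually of low order) with the geographic coordinates of the samples as independent variables and the soil property values at the samples as the dependent variable. Trend surface analysis describes the global pattern of the sample set and ignores local patterns, so local features of the target variable are hard to predict; moreover, once the study area changes, the trend surface function usually no longer applies and must be refitted. Davies et al (1970) used the method to predict soil pH in Kent, England. 王会肖 et al (2007) applied trend surface analysis to regularly spaced samples to predict the spatial trend of soil moisture in the Xindiangou study area, Suide County, Shaanxi Province.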
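Fitting a trend surface amounts to an ordinary least-squares fit of a polynomial in the coordinates. The sketch below fits a second-order surface to synthetic data; the coefficient values, coordinates, and function names are hypothetical.

```python
import numpy as np

def quadratic_terms(x, y):
    """Design matrix of a second-order trend surface: 1, x, y, x^2, xy, y^2."""
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def fit_trend_surface(x, y, z):
    """Least-squares coefficients of the second-order surface."""
    coef, *_ = np.linalg.lstsq(quadratic_terms(x, y), z, rcond=None)
    return coef

def predict_trend_surface(coef, x, y):
    return quadratic_terms(x, y) @ coef

# Synthetic example: 30 sample locations with a noise-free quadratic pH surface.
rng = np.random.default_rng(0)
x, y = rng.uniform(0, 10, 30), rng.uniform(0, 10, 30)
true_coef = np.array([6.5, 0.10, -0.05, 0.01, 0.0, 0.02])
ph = quadratic_terms(x, y) @ true_coef

coef = fit_trend_surface(x, y, ph)
ph_at_centre = predict_trend_surface(coef, np.array([5.0]), np.array([5.0]))
```

Because the fitted coefficients describe only this set of coordinates and values, they have to be re-estimated for every new study area, as noted above.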

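(2) Local spatial autocorrelation analysis

Local methods include nearest neighbour, inverse distance weighting, spline interpolation, and kriging. The nearest neighbour method assigns to the prediction point the value of the closest sample. It is simple and efficient, but because it considers only the nearest sample, the result tends to change in steps across space, and accuracy in practice is often low (Nemes et al, 2006). Inverse distance weighting computes a weighted average with weights determined by the distance between the prediction point and the samples: the closer the sample, the larger its weight. It is easy to apply, but the result is sensitive to the size of the local neighbourhood, the number of samples used, and the distance decay exponent, and high interpolation accuracy is likely only where samples are dense and regularly distributed (Isaaks et al, 1989; Chang et al, 2001). Spline interpolation is a piecewise polynomial method applied by zones. With sufficient, low-redundancy data it is fast and gives smooth results (Hutchinson, 1995; Hallema et al, 2015), but the result depends strongly on the samples chosen; with many redundant samples, solving the function equations is complex and time-consuming, and values fluctuate strongly where zones join (Bishop et al, 1999).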
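Inverse distance weighting can be written compactly. The sketch below exposes the two settings the text identifies as influential, the distance decay exponent (`power`) and the number of neighbours; the function name and the sample data are hypothetical.

```python
import numpy as np

def idw_predict(sample_xy, sample_values, target_xy, power=2.0, n_neighbours=None):
    """Inverse-distance-weighted estimate at one target location."""
    d = np.linalg.norm(sample_xy - target_xy, axis=1)
    if n_neighbours is not None:
        nearest = np.argsort(d)[:n_neighbours]   # restrict to the local neighbourhood
        d, sample_values = d[nearest], sample_values[nearest]
    if np.any(d == 0):                           # target coincides with a sample
        return float(sample_values[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * sample_values) / np.sum(w))

# Hypothetical samples (coordinates in metres) with a soil property value each.
sample_xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
sample_values = np.array([10.0, 20.0, 30.0])
target = np.array([1.0, 0.0])

all_samples = idw_predict(sample_xy, sample_values, target)
two_nearest = idw_predict(sample_xy, sample_values, target, n_neighbours=2)
```

Comparing `all_samples` with `two_nearest` shows how strongly the neighbourhood definition alone can change the estimate at the same location.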

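Kriging is the most widely used method in digital soil mapping based on spatial autocorrelation. It uses the structural information of the regionalized variable revealed by the samples (the variogram, also called the semivariance function) and the sample data within a limited neighbourhood of the prediction point or block to give an unbiased optimal estimate at the prediction point, together with the estimation variance at each point (Matheron, 1963; Burgess et al, 1980; Webster et al, 1990; Loague, 1992; Zhang et al, 2011). The main variants include ordinary kriging, simple kriging, and stratified kriging (Burgess et al, 1980; Stein et al, 1988; McBratney et al, 1991; Li et al, 2011, 2014). Compared with other conventional interpolation methods, kriging is more accurate and more realistic. Its drawbacks are that it requires many, evenly distributed, representative samples, and that the structural information of the regionalized variable must satisfy the second-order stationarity assumption (Isaaks et al, 1989; Goovaerts, 1999). As this discussion shows, prediction based on spatial autocorrelation depends on determining the spatial relationship (the variogram), which is derived from all the samples; the number and spatial distribution of samples are therefore central to these methods. In general, a representative spatial relationship requires a sample set that is large enough and has good spatial coverage (de Gruijter et al, 2006; Brus et al, 2007; Li et al, 2014). Because of this dependence on the samples and the requirement of second-order stationarity, the spatial relationship obtained can rarely be transferred directly to another area; in most cases, samples must be collected in the new area to define a spatial relationship (variogram) suited to it.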
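The variogram that kriging relies on is usually estimated first as an empirical semivariogram, half the mean squared difference between pairs of samples grouped by separation distance. The sketch below computes it for distance bins; the transect data, bin edges, and function name are hypothetical, and fitting a variogram model and solving the kriging system are not shown.

```python
import numpy as np

def empirical_semivariogram(xy, z, bin_edges):
    """Semivariance per distance bin: half the mean squared pair difference."""
    i, j = np.triu_indices(len(z), k=1)           # every pair counted once
    distance = np.linalg.norm(xy[i] - xy[j], axis=1)
    squared_diff = (z[i] - z[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)   # NaN where a bin has no pairs
    for b in range(len(bin_edges) - 1):
        in_bin = (distance >= bin_edges[b]) & (distance < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = 0.5 * squared_diff[in_bin].mean()
    return gamma

# Hypothetical transect: four samples 1 m apart with a linear trend in the property.
xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
z = np.array([0.0, 1.0, 2.0, 3.0])
gamma = empirical_semivariogram(xy, z, bin_edges=np.array([0.5, 1.5, 2.5, 3.5]))
```

In this example the semivariance keeps increasing with lag and never levels off, the signature of a trend in the data and thus of a violation of the second-order stationarity assumption mentioned above.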

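3.3.3 Soil mapping combining factor correlation and spatial autocorrelation

In fact, the soil characteristics at a location are related not only to the properties of spatially neighbouring locations but also to other geographic factors (the soil environmental factors) at that location. On this basis, researchers have combined spatial autocorrelation models with factor correlation models, considering both the spatial autocorrelation of soil properties and the relationship between soil and its environmental factors. Representative methods are cokriging, regression kriging, and geographically weighted regression.

Cokriging is built on the theory of co-regionalized variables (spatial correlation). It exploits the cross-correlation between the target variable and environmental variables, modelling a cross-covariance function to make local estimates of the target variable (McBratney et al, 1983; Goulard et al, 1992; Odeh et al, 1995; Yang et al, 2016). Regression kriging regresses the soil property on environmental variables, kriges the regression residuals as a regionalized variable, and adds the kriged residuals to the regression prediction to give the final map of the soil property (Knotters et al, 1995; Odeh et al, 1995; McBratney et al, 2000; Mondal et al, 2017). Geographically weighted regression is a local linear regression in which the weight of each sample in estimating the model parameters depends on its distance from the regression point: the closer the sample, the larger its weight. It thus reflects spatial differences in the contribution of samples and environmental variables to the regression equation, which makes the result more reliable than that of a global linear regression (Kumar et al, 2012; 郭龙 et al, 2012; Wang K et al, 2013; Song et al, 2016; Zeng et al, 2016).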
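The local fitting step of geographically weighted regression can be sketched as a weighted least-squares problem solved at each regression point. The Gaussian distance kernel, the bandwidth value, and the synthetic data are assumptions for illustration; bandwidth selection is not shown.

```python
import numpy as np

def gwr_coefficients(sample_xy, covariates, values, regression_xy, bandwidth):
    """Weighted least-squares coefficients (intercept first) at one location."""
    d = np.linalg.norm(sample_xy - regression_xy, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)       # closer samples weigh more
    design = np.column_stack([np.ones(len(values)), covariates])
    weighted_t = design.T * w                     # same as design.T @ diag(w)
    return np.linalg.solve(weighted_t @ design, weighted_t @ values)

# Synthetic data: 30 samples whose property depends linearly on one covariate.
rng = np.random.default_rng(0)
sample_xy = rng.uniform(0, 100, (30, 2))
elevation = rng.uniform(200, 400, 30)
som = 2.0 + 0.01 * elevation                      # the same relation everywhere

coef = gwr_coefficients(sample_xy, elevation, som,
                        regression_xy=np.array([50.0, 50.0]), bandwidth=25.0)
```

Because the synthetic relation does not vary in space, the local coefficients equal the global ones here; with real data they would differ from one regression point to the next.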

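Because these methods consider both spatial autocorrelation and correlation with environmental variables, they can improve prediction accuracy to some extent. Their drawback is that they place high demands on the number and distribution of samples: the samples must satisfy second-order stationarity and the factor correlation must be stable (Hengl et al, 2004, 2007). Factor-correlation methods are the most widely used of the existing digital soil mapping methods; among them, random forest is a widely used data mining method, and the soil-landscape inference model (SoLIM) is a prominent knowledge-based method (Zhu et al, 2001; 朱阿兴 et al, 2008). Methods that predict the spatial distribution of soil properties from spatial autocorrelation are also widely used; they require not only a high sample density but also samples that capture the spatial autocorrelation of the property well (Isaaks et al, 1989; Goovaerts, 1999). Combined methods must satisfy the basic conditions of both (factor correlation and spatial correlation), which is often hard to achieve in practice. With the launch of the global digital soil mapping project and the needs of global change research, researchers have also carried out soil mapping at the global scale (Hengl et al, 2017).

3.4 Production and validation of soil maps

Different mapping methods produce different kinds of soil maps. In general, these include fuzzy membership maps of soil types, raster maps of soil types, and spatially continuous maps of soil properties. Fuzzy membership maps are produced mainly by the similarity-based method of Zhu et al (2001) and 朱阿兴 et al (2008); they can be hardened to give a raster soil type map, and combined with the soil properties at typical points to give a soil property map (Zhu et al, 2001). Other methods, such as kriging and data mining, mainly produce raster soil type maps or soil property maps. Raster soil type maps can also be used to compile polygon-based, conventional-style soil maps that are comparable to traditional soil maps (朱阿兴 et al, 2008).

While producing a soil map, some models also generate an uncertainty map that indicates the reliability of the result and gives clearer information for applications such as decision making and environmental assessment. Kriging yields a kriging variance for every raster cell as a measure of uncertainty. Mapping based on environmental similarity can produce two kinds of uncertainty: ignorance uncertainty and exaggeration uncertainty (Zhu, 1997). The individual-representativeness method also provides a way to compute prediction uncertainty (Zhu et al, 2015).

Soil maps are validated qualitatively or quantitatively. Qualitative evaluation assesses whether the spatial pattern of the map is reasonable or judges its correctness against expert experience; quantitative evaluation is the more common way of assessing the accuracy of digital soil maps. It compares the observed values at field samples with the predicted values. There are three main approaches: validation with independent samples, leave-one-out cross-validation, and k-fold cross-validation. Independent validation collects additional samples after mapping under a specified sampling strategy, most commonly probability sampling, and uses them to validate the result. Leave-one-out cross-validation uses each sample in turn for validation and the remaining N-1 samples for training; it is used mainly when samples are few. k-fold cross-validation randomly partitions the existing samples into k subsets; each subset in turn serves as the validation samples while the remaining subsets serve as training samples for mapping.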
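The two cross-validation schemes can share one implementation, since leave-one-out is k-fold with k equal to the number of samples. In the sketch below the "model" is deliberately trivial (the mean of the training samples), standing in for whatever mapping method is being evaluated; the function names and sample values are hypothetical.

```python
import numpy as np

def kfold_splits(n_samples, k, seed=0):
    """Yield (training, validation) index arrays for k random folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i, validation in enumerate(folds):
        training = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield training, validation

def cross_validated_rmse(values, k, seed=0):
    """RMSE of a placeholder predictor (the training mean) under k-fold CV."""
    values = np.asarray(values, dtype=float)
    errors = np.empty(values.size)
    for training, validation in kfold_splits(values.size, k, seed):
        errors[validation] = values[training].mean() - values[validation]
    return float(np.sqrt(np.mean(errors ** 2)))

# Hypothetical soil organic matter observations (%) at ten sample locations.
som = np.array([1.8, 2.1, 2.6, 3.0, 3.3, 2.2, 1.9, 2.8, 3.1, 2.4])
rmse_5fold = cross_validated_rmse(som, k=5)
rmse_loo = cross_validated_rmse(som, k=som.size)   # leave-one-out
```

Every sample is used for validation exactly once in both schemes, so the error estimate rests on all the available observations.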

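Validation indices depend on the kind of soil map. For soil type maps, classification accuracy is assessed mainly with a confusion matrix, which gives the overall accuracy, producer's accuracy, user's accuracy, and the accuracy for each soil type; these indices reflect classification accuracy from different angles (杨琳, 2006). For soil property maps, the main indices are the mean absolute error (MAE), the root mean square error (RMSE), and the coefficient of determination. Brus et al (2011) reviewed the sampling methods and evaluation indices used in map validation and recommended indices accordingly. Malone et al (2011) proposed new methods for measuring the prediction accuracy and uncertainty of digital soil maps.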
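The indices named above follow directly from the validation samples. The sketch below computes them for small hypothetical data sets; the coefficient of determination is taken here as one minus the ratio of residual to total sum of squares, which is an assumption, since some studies report the squared correlation coefficient instead.

```python
import numpy as np

def confusion_matrix(reference, predicted, n_classes):
    """Rows are reference (observed) classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (reference, predicted), 1)
    return cm

def classification_accuracy(cm):
    overall = np.trace(cm) / cm.sum()
    producers = np.diag(cm) / cm.sum(axis=1)   # correct / reference total per class
    users = np.diag(cm) / cm.sum(axis=0)       # correct / predicted total per class
    return overall, producers, users

def property_accuracy(observed, predicted):
    error = predicted - observed
    mae = np.mean(np.abs(error))
    rmse = np.sqrt(np.mean(error ** 2))
    r2 = 1.0 - np.sum(error ** 2) / np.sum((observed - observed.mean()) ** 2)
    return mae, rmse, r2

# Hypothetical validation data: three soil types and one soil property.
reference = np.array([0, 0, 1, 1, 2, 2])
predicted_type = np.array([0, 1, 1, 1, 2, 0])
overall, producers, users = classification_accuracy(
    confusion_matrix(reference, predicted_type, n_classes=3))

observed = np.array([1.0, 2.0, 3.0, 4.0])
predicted_value = np.array([1.0, 2.0, 3.0, 6.0])
mae, rmse, r2 = property_accuracy(observed, predicted_value)
```

Producer's and user's accuracy differ for the same class whenever omission and commission errors are unequal, which is why both are usually reported alongside the overall accuracy.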

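4 Future trends and outlook

Digital soil mapping has developed rapidly over the past 30 years. Researchers in China and abroad have carried out extensive work on environmental covariate data, sampling methods, and mapping models, and applications have grown from small areas to large regions and even the global scale (Hengl et al, 2014, 2017). In terms of theoretical research, possible future directions are as follows:

(1) New techniques for characterizing environmental variables. How to obtain environmental information that reflects soil spatial differences in areas with weak environmental gradients (often also areas of intense human activity) is an important topic in digital soil mapping. Remote sensing is an important means of obtaining such information; as spatial resolution improves, data from different platforms and dates reflect the spatial differences of some soil types more effectively. Vegetation growing on different soils changes differently over time, and these differences generally show up in a series of remote sensing images. Analysis of long, high-temporal-resolution multispectral time series can therefore yield information that co-varies with soil, and so improve the performance of fine-scale soil survey methods in areas with weak environmental gradients. This work has only just begun and requires much more detailed study.

(2) Effective use of new and legacy data. In recent years the rapid development of proximal soil sensing and satellite remote sensing has provided more data for obtaining information on the spatial distribution of soil, and using these data effectively can serve soil mapping better. In some areas, repeated field surveys have accumulated many sample data that differ in date, sampling design, and purpose; how to evaluate, standardize, and harmonize samples from multiple sources remains to be solved. In addition, many study areas have accumulated different types of data, including legacy soil maps, samples, and text records; combining the strengths of these data can make digital soil mapping more effective.

(3) New inference methods. Machine learning and data mining are now applied throughout digital soil mapping and achieve good mapping accuracy. They use large training sets to learn the relationship between soil and environmental factors or spatial location. However, relationships derived from samples may depend too heavily on the sample data, so pedogenetic knowledge is needed to build more accurate relationships. How to combine machine learning and data mining with pedogenetic knowledge is therefore an important research direction.

(4) Computing modes that support big data and multiple terminals. Global change research places high demands on soil property data at global or regional scales, and processing global-scale volumes of data in turn places new demands on computing modes. There is also an urgent need for soil mapping service platforms that the public can use.

(5) Wider application. This has two main aspects: producing high-resolution soil information databases over large extents, and integration with earth-science process models.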

The authors have declared that no competing interests exist.


References

[92] Qin C Z, Zhu A-X, Qiu W L, et al. 2012. Mapping soil organic matter in small low-relief catchments using fuzzy slope position information[J]. Geoderma, 171-172: 64-74. https://doi.org/10.1016/j.geoderma.2011.06.006

[93] Qin C Z, Zhu A-X, Shi X, et al. 2009. Quantification of spatial gradation of slope positions[J]. Geomorphology, 110(3-4): 152-161. https://doi.org/10.1016/j.geomorph.2009.04.003

[94] Reza Pahlavan Rad M, Toomanian N, Khormali F, et al. 2014. Updating soil survey maps using random forest and conditioned Latin hypercube sampling in the loess derived soils of northern Iran[J]. Geoderma, 232-234: 97-106. https://doi.org/10.1016/j.geoderma.2014.04.036

[95] Rossel R A V, Fouad Y, Walter C. 2008. Using a digital camera to measure soil organic carbon and iron contents[J]. Biosystems Engineering, 100(2): 149-159. https://doi.org/10.1016/j.biosystemseng.2008.02.007

[96] Roudier P, Hewitt A E, Beaudette D E. 2012. A conditioned Latin hypercube sampling algorithm incorporating operational constraints[M]//Minasny B, Malone B P, McBratney A B. Digital soil assessments and beyond: Proceedings of the 5th global workshop on digital soil mapping. London, UK: CRC Press: 227-232.

[97] Russo D. 1984. Design of an optimal sampling network for estimating the variogram[J]. Soil Science Society of America Journal, 48(4): 708-716. https://doi.org/10.2136/sssaj1984.03615995004800040054x

[98] Sacks J, Schiller S. 1988. Spatial designs[M]//Gupta S S, Berger J O. Statistical decision theory and related topics IV: Vol. 2. New York: Springer Verlag: 385-399.

[99] Shi Z, Ji W, Viscarra Rossel R A, et al. 2015. Prediction of soil organic matter using a spatially constrained local partial least squares regression and the Chinese vis-NIR spectral library[J]. European Journal of Soil Science, 66(4): 679-687. https://doi.org/10.1111/ejss.12272

[100] Simbahan G C, Dobermann A. 2006. Sampling optimization based on secondary information and its utilization in soil carbon mapping[J]. Geoderma, 133(3-4): 345-362. https://doi.org/10.1016/j.geoderma.2005.07.020

[101] Song X D, Brus D J, Liu F, et al. 2016.

Mapping soil organic carbon content by geographically weighted regression: A case study in the Heihe River Basin, China

[J]. Geoderma, 261: 11-22.

https://doi.org/10.1016/j.geoderma.2015.06.024      [Cited in this article: 1]

In large heterogeneous areas the relationship between soil organic carbon (SOC) and environmental covariates may vary throughout the area, bringing about difficulty for accurate modeling of the regional SOC variation. The benefit of local, geographically weighted regression (GWR) coefficients was tested in a case study on soil organic carbon mapping across a 50,810 km² area in northwestern China. This area is composed of an alpine ecosystem in the upper reaches and oases in the middle reaches. The benefit was quantified by comparing the quality of the maps obtained by GWR and geographically weighted ridge regression (GWRR) on the one side and multiple linear regression (MLR) on the other side. In these methods spatial dependence of model residuals is ignored. The root mean squared error (RMSE) of predictions of natural log-transformed SOC obtained with GWR was smaller than with MLR: 0.565 versus 0.618 g/kg. The use of a local ridge parameter in GWRR did not lead to an increase in accuracy. Besides we compared the quality of maps obtained by geographically weighted regression followed by simple kriging of model residuals (GWRSK) and kriging with an external drift (KED) with global regression coefficients. In these methods the spatial dependence of model residuals is incorporated in the model. The RMSE with KED was smaller than with GWRSK: 0.515 versus 0.546 g/kg. We conclude that fitting regression coefficients locally as in GWR only paid when no spatial random effect was included in the model. When a spatial random effect was included, the flexibility of local, geographically weighted regression coefficients was not needed and even undesirable as it led to less accurate predictions than KED with global regression coefficients. In comparing the accuracy of prediction methods by leave-one-out cross-validation (LOOCV) of a non-probability sample it is important to account for possible autocorrelation of pairwise differences in the prediction errors. 
The effective sample sizes were substantially smaller than the total number of sampling points, so that most pairwise differences in MSE were not significant at a significance level of 10% in a two-sided paired t-test.
[1] 邓红眉. 2013.

江汉平原土壤中、小尺度下的空间分异研究

[D]. 武汉: 华中师范大学.

[Cited in this article: 2]

[Deng H M. 2013.

Study on spatial heterogeneity of soil in mesoscale and small scale in Jianghan Plain

[D]. Wuhan, China: Central China Normal University.]

[102] Song X D, Liu F, Ju B, et al. 2017.

Mapping soil organic carbon stocks of northeastern China using expert knowledge and GIS-based methods

[J]. Chinese Geographical Science, 27(4): 516-528.

https://doi.org/10.1007/s11769-017-0869-7      [Cited in this article: 1]

The main aim of this paper was to calculate soil organic carbon stock (SOCS) with consideration of the pedogenetic horizons using expert knowledge and GIS-based methods in northeastern China. A novel prediction process was presented and was referred to as model-then-calculate with respect to the variable thicknesses of soil horizons (MCV). The model-then-calculate with fixed-thickness (MCF), soil profile statistics (SPS), pedological professional knowledge-based (PKB) and vegetation type-based (Veg) methods were carried out for comparison. With respect to the similar pedological information, nine common layers from topsoil to bedrock were grouped in the MCV. Validation results suggested that the MCV method generated better performance than the other methods considered. For the comparison of polygon based approaches, the Veg method generated better accuracy than both SPS and PKB, as limited soil data were incorporated. Additional prediction of the pedogenetic horizons within MCV benefitted the regional SOCS estimation and provided information for future soil classification and understanding of soil functions. The intermediate product, that is, horizon thickness maps were fluctuant enough and reflected many details in space. The linear mixed model indicated that mean annual air temperature (MAAT) was the most important predictor for the SOCS simulation. The minimal residual of the linear mixed models was achieved in the vegetation type-based model, whereas the maximal residual was fitted in the soil type-based model. About 95% of SOCS could be found in Argosols, Cambosols and Isohumosols. The largest SOCS was found in the croplands with vegetation of Triticum aestivum L., Sorghum bicolor (L.) Moench, Glycine max (L.) Merr., Zea mays L. and Setaria italica (L.) P. Beauv.
[103] Stein A, Hoogerwerf M, Bouma J. 1988.

Use of soil-map delineations to improve (co-) kriging of point data on moisture deficits

[J]. Geoderma, 43(2-3): 163-177.

https://doi.org/10.1016/0016-7061(88)90041-9      [Cited in this article: 1]

Predictions of 30-year average moisture deficits (MD30) were carried out by means of kriging and co-kriging, using simulations for 500 point observations in an area of 404 ha of sandy soils in The Netherlands. From the above point observations 100 points were selected at random to function as an independent test set. Attention was focused on improving the precision of kriged and co-kriged MD30-maps, as characterized by two error measures, the mean variance of the prediction error and the mean squared error of predictions. To do so the survey area was stratified by means of soil-map delineations according to soil type and water-table classes based on the groundwater table. In unstratified maps the standard deviation of the prediction error largely depends on the observation pattern. Stratification resulted in an increase of precision of predictions in strata with low MD30 variability and an apparent decrease in strata with high MD30 variability. Major soil-map delineations, as distinguished by a soil survey, had significantly different internal variability. Use of co-kriging resulted in an average increase of precision of MD30-maps of about 10%. This study illustrates the use of available soil-survey information for stratifying a survey area so as to enhance precision of predictions when using kriging and co-kriging of point data.
[2] 郭龙, 张海涛, 陈家赢, 等. 2012.

基于协同克里格插值和地理加权回归模型的土壤属性空间预测比较

[J]. 土壤学报, 49(5): 1037-1042.

[Cited in this article: 1]

选取宜昌市红花套镇作为研究区域,研究土壤pH、有机质、有效磷、速效钾、碱解氮与土壤属性指标变量之间的关系,选择与预测变量之间具有较高相关性的变量作为辅助变量用以提高预测精度,本文试图将地理加权回归模型应用于土壤属性空间模拟中,以此与协同克里格插值的预测结果进行对照,从而比较它们的预测精度以提出更适合土壤属性预测的模型。结果表明:协同克里格插值和地理加权回归模型对土壤属性的空间模拟均有较高的预测精度,在辅助变量较多的情况下地理加权回归模型具有比协同克里格插值更为简单的算法,并且比较预测值相对误差的范围跨度和标准差以及均方根误差等方面,地理加权回归模型在土壤属性指标预测方面具有更高的预测精度,也具有更大的优势。

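[Guo L, Zhang H T, Chen J Y, et al. 2012.

Comparison between Co-Kriging model and geographically weighted regression model in spatial prediction of soil attributes

[J]. Acta Pedologica Sinica, 49(5): 1037-1042.]

Taking Honghuatao Town, Yichang City, as the study area, this paper examines the relationships between soil pH, organic matter, available phosphorus, readily available potassium, and alkali-hydrolyzable nitrogen and other soil property indicators, and selects variables that are highly correlated with the target variable as auxiliary variables to improve prediction accuracy. The geographically weighted regression (GWR) model is applied to the spatial simulation of soil properties and compared with co-kriging interpolation, in order to compare their prediction accuracy and identify the model better suited to soil property prediction. The results show that both co-kriging and GWR achieve high prediction accuracy in the spatial simulation of soil properties. When many auxiliary variables are involved, GWR has a simpler algorithm than co-kriging, and in terms of the range and standard deviation of the relative error of the predicted values and the root mean square error, GWR yields higher prediction accuracy and has greater advantages for predicting soil property indicators.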
[104] Stoorvogel J J, Bakkenes M, Temme A J A M, et al. 2017.

S-World: A global soil map for environmental modelling

[J]. Land Degradation & Development, 28(1): 22-33.

https://doi.org/10.1002/ldr.2656      [Cited in this article: 1]

The research community increasingly analyses global environmental problems like climate change and desertification with models. These global environmental modelling studies require global, high resolution, spatially exhaustive, and quantitative data describing the soil profile. This study aimed to develop a pedological approach that takes stock of available legacy and auxiliary data to create a global, 30 arc second soil property database for modelling. The methodology uses the Harmonized World Soil Database and the ISRIC-WISE 3.1 soil profile database. Auxiliary information at 30 arc second resolution for various landscape properties is used to describe the variation in landscape properties (temperature and precipitation, topography, elevation, land use, and land cover). Complex mapping units of the HWSD were first disaggregated using a digital elevation model and toposequences to generate delineated areas described by a single soil type. Secondly, ranges of soil properties per soil type were determined using the soil profile data. Then a meta-analysis on a broad literature survey was used to develop a simple model that, based on landscape properties at a particular location, determines the position within these ranges and thus provides an estimation of the local soil properties. Finally, the model was implemented at the global level to determine the distribution of soil properties. The methodology, denominated S-World (Soils of the world) resulted in readily available, high resolution, global soil property maps that are now available for environmental modelling.
[105] Stumpf F, Schmidt K, Behrens T, et al. 2016.

Incorporating limited field operability and legacy soil samples in a hypercube sampling design for digital soil mapping

[J]. Journal of Plant Nutrition and Soil Science, 179(4): 499-509.

https://doi.org/10.1002/jpln.201500313      [Cited in this article: 1]

Most calibration sampling designs for Digital Soil Mapping (DSM) demarcate spatially distinct sample sites. In practical applications major challenges are often limited field accessibility and the question on how to integrate legacy soil samples to cope with usually scarce resources for field sampling and laboratory analysis. The study focuses on the development and application of an efficiency improved DSM sampling design that (1) applies an optimized sample set size, (2) compensates for limited field accessibility, and (3) enables the integration of legacy soil samples. The proposed sampling design represents a modification of conditioned Latin Hypercube Sampling (cLHS), which originally returns distinct sample sites to optimally cover a soil related covariate space and to preserve the correlation of the covariates in the sample set. The sample set size was determined by comparing multiple sample set sizes of original cLHS sets according to their representation of the covariate space. Limited field accessibility and the integration of legacy samples were incorporated by providing alternative sample sites to replace the original cLHS sites. We applied the modified cLHS design (cLHSadapt) in a small catchment (4.2 km²) in Central China to model topsoil sand fractions using Random Forest regression (RF). For evaluating the proposed approach, we compared cLHSadapt with the original cLHS design (cLHSorig). With an optimized sample set size n = 30, the results show a similar representation of the cLHS covariate space between cLHSadapt and cLHSorig, while the correlation between the covariates is preserved (r = 0.40 vs. r = 0.39). Furthermore, we doubled the sample set size of cLHSadapt by adding available legacy samples (cLHSadapt+) and compared the prediction accuracies. Based on an external validation set cLHSval (n = 20), the coefficient of determination (R²) of the cLHSadapt predictions range between 0.59 and 0.71 for topsoil sand fractions. 
The R²-values of the RF predictions based on cLHSadapt+, using additional legacy samples, are marginally increased on average by 5%.
[3] 韩宗伟, 黄魏, 张春弟, 等. 2014.

基于土壤养分—景观关系的土壤采样布局合理性研究

[J]. 华中农业大学学报, 33(1): 56-61.

[Cited in this article: 1]

[Han Z W, Huang W, Zhang C D, et al. 2014.

Rationality of sampling strategies based on soil-landscape relationships

[J]. Journal of Huazhong Agricultural University, 33(1): 56-61.]

[106] Tobler W R. 1970.

A computer movie simulating urban growth in the Detroit region

[J]. Economic Geography, 46(S1): 234-240.

https://doi.org/10.2307/143141      [Cited in this article: 1]

[107] Trochim W M K. 2006.

The qualitative debate: Research methods knowledge base

[R/OL]. New York, NY: Cornell University.

[Cited in this article: 1]

[4] 黄魏, 罗云, 汪善勤, 等. 2016.

基于传统土壤图的土壤—环境关系获取及推理制图研究

[J]. 土壤学报, 53(1): 72-80.

https://doi.org/10.11766/trxb201503260023      [Cited in this article: 1]

在数字土壤制图研究中,从历史资料中提取准确的、详细的土壤—环境关系对于土壤图的更新和修正十分重要。从传统土壤图中提取土壤类型并从地形数据中提取环境参数,采用空间数据挖掘方法建立土壤—环境关系,并进行推理制图和精度验证。以湖北省黄冈市红安县华家河镇滠水河流域为例,首先选取成土母质和基于地形数据提取的高程、坡度、坡向等7个环境因子;然后利用频率分布原理得到包含土壤类型与环境因子信息的典型样本数据1 410个;采用See5.0决策树方法进行空间数据挖掘,建立土壤—环境关系;将其导入So LIM中进行推理制图;最后利用270个实地采样点验证所得土壤图的精度。土壤图的精度提高了约11%,证明了本研究方法对土壤类型和空间分布推理的可靠性。

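[Huang W, Luo Y, Wang S Q, et al. 2016.

Knowledge of soil-landscape model obtain from a soil map and mapping

[J]. Acta Pedologica Sinica, 53(1): 72-80.]

In digital soil mapping research, extracting accurate and detailed soil-environment relationships from historical materials is important for updating and correcting soil maps. Soil types were extracted from a conventional soil map and environmental parameters from terrain data; spatial data mining was used to establish soil-environment relationships, followed by inference mapping and accuracy validation. Taking the Sheshui River watershed in Huajiahe Town, Hong'an County, Huanggang City, Hubei Province as an example, seven environmental factors were first selected, including parent material and elevation, slope, and aspect derived from terrain data. Then 1,410 typical samples containing soil type and environmental factor information were obtained using the frequency distribution principle. The See5.0 decision tree method was used for spatial data mining to establish soil-environment relationships, which were imported into SoLIM for inference mapping. Finally, 270 field sampling points were used to validate the accuracy of the resulting soil map. The accuracy of the soil map improved by about 11%, demonstrating the reliability of the method for inferring soil types and their spatial distribution.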
[108] van Groenigen J W, Stein A. 1998.

Constrained optimization of spatial sampling using continuous simulated annealing

[J]. Journal of Environmental Quality, 27: 1078-1086.

https://doi.org/10.2134/jeq1998.00472425002700050013x      [Cited in this article: 2]

Spatial sampling is an important issue in environmental studies because the sample configuration influences both costs and effectiveness of a survey. Practical sampling constraints and available pre-information can help to optimize the sampling scheme. In this paper, spatial simulated annealing (SSA) is presented as a method to optimize spatial environmental sampling schemes. Sampling schemes are optimized at the point-level, taking into account sampling constraints and preliminary observations. Two optimization criteria have been used. The first optimizes even spreading of the points over a region, whereas the second optimizes variogram estimation using a proposed criterion from the literature. For several examples it is shown that SSA is superior to conventional methods of designing sampling schemes. Improvements up to 30% occur for the first criterion, and an almost complete solution is found for the second criterion. Spatial simulated annealing is especially useful in studies with many sampling constraints. It is flexible in implementing additional, quantitative criteria.
[109] Wang D C, Zhang G L, Pan X Z, et al. 2012.

Mapping soil texture of a plain area using fuzzy-c-means clustering method based on land surface diurnal temperature difference

[J]. Pedosphere, 22(3): 394-403.

https://doi.org/10.1016/S1002-0160(12)60025-3      [Cited in this article: 1]

The use of landscape covariates to estimate soil properties is not suitable for the areas of low relief due to the high variability of soil properties in similar topographic and vegetation conditions. A new method was implemented to map regional soil texture (in terms of sand, silt and clay contents) by hypothesizing that the change in the land surface diurnal temperature difference (DTD) is related to soil texture in case of a relatively homogeneous rainfall input. To examine this hypothesis, the DTDs from moderate resolution imaging spectroradiometer (MODIS) during a selected time period, i.e., after a heavy rainfall between autumn harvest and autumn sowing, were classified using fuzzy-c-means (FCM) clustering. Six classes were generated, and for each class, the sand (> 0.05 mm), silt (0.002-0.05 mm) and clay (< 0.002 mm) contents at the location of maximum membership value were considered as the typical values of that class. A weighted average model was then used to digitally map soil texture. The results showed that the predicted map quite accurately reflected the regional soil variation. A validation dataset produced estimates of error for the predicted maps of sand, silt and clay contents at root mean of squared error values of 8.4%, 7.8% and 2.3%, respectively, which is satisfactory in a practical context. This study thus provided a methodology that can help improve the accuracy and efficiency of soil texture mapping in plain areas using easily available data sources.
[5] 李天杰, 赵烨, 张科利, 等. 2004. 土壤地理学[M]. 3版. 北京: 高等教育出版社.

[Li T J, Zhao Y, Zhang K L, et al. 2004. Turang dilixue[M]. 3rd ed. Beijing, China: Higher Education Press.]

[110] Wang J F, Li L F, Christakos G. 2009.

Sampling and kriging spatial means: Efficiency and conditions

[J]. Sensors, 9(7): 5224-5240.

https://doi.org/10.3390/s90705224      PMID: 22346694

Sampling and estimation of geographical attributes that vary across space (e.g., area temperature, urban pollution level, provincial cultivated land, regional population mortality and state agricultural production) are common yet important constituents of many real-world applications. Spatial attribute estimation and the associated accuracy depend on the available sampling design and statistical inference modelling. In the present work, our concern is areal attribute estimation, in which the spatial sampling and Kriging means are compared in terms of mean values, variances of mean values, comparative efficiencies and underlying conditions. Both the theoretical analysis and the empirical study show that the mean Kriging technique outperforms other commonly-used techniques. Estimation techniques that account for spatial correlation (dependence) are more efficient than those that do not, whereas the comparative efficiencies of the various methods change with surface features. The mean Kriging technique can be applied to other spatially distributed attributes, as well.
[111] Wang J F, Haining R, Liu T J, et al. 2013.

Sandwich estimation for multi-unit reporting on a stratified heterogeneous surface

[J]. Environment and Planning A, 45(10): 2515-2534.

https://doi.org/10.1068/a44710      [Cited in this article: 1]

Spatial sampling is widely used in environmental and social research. In this paper we consider the situation where instead of a single global estimate of the mean of an attribute for an area, estimates are required for each of many geographically defined reporting units (such as counties or grid cells) because their means cannot be assumed to be the same as the global figure. Not only may survey costs greatly increase if sample size has to be a function of the number of reporting units, estimator sampling error tends to be large if the population attribute of each reporting unit can be estimated by using only those samples actually lying inside the unit itself. This study proposes a computationally simple approach to multi-unit reporting by using analysis of variance and incorporating 'twice-stratified' statistics. We assume that, although the area is heterogeneous (the mean varies across the area), it can be zoned (or stratified) into homogeneous subareas (the mean is constant within each subarea) and, in addition, that it is possible to acquire prior knowledge about this partition. This zoning of the study area is independent of the reporting units. The zone estimates are transferred to the reporting units. We call the methodology sandwich estimation and we report two contrasting empirical studies to demonstrate the application of the methodology and to compare its performance against some other existing methods for tackling this problem. Our study shows that sandwich estimation performs well against two other frequently used, probabilistic, model-based approaches to multi-unit reporting on stratified heterogeneous surfaces whilst having the advantage of computational simplicity. We suggest those situations where sandwich estimation might be expected to do well.
[6] 刘峰, 朱阿兴, 李宝林, 等. 2009.

利用陆面反馈动态模式来识别土壤类型的空间差异

[J]. 土壤通报, 40(3): 501-508.

[Cited in this article: 1]

[Liu F, Zhu A X, Li B L, et al. 2009.

Identification of spatial difference of soil types using land surface feedback dynamic patterns

[J]. Chinese Journal of Soil Science, 40(3): 501-508.]

[112] Wang K, Zhang C R, Li W D. 2013.

Predictive mapping of soil total nitrogen at a regional scale: A comparison between geographically weighted regression and cokriging

[J]. Applied Geography, 42: 73-85.

https://doi.org/10.1016/j.apgeog.2013.04.002      [Cited in this article: 1]

Accurately mapping the spatial distribution of soil total nitrogen is important to precision agriculture and environmental management. Geostatistical methods have been frequently used for predictive mapping of soil properties. Recently, a local regression method, geographically weighted regression (GWR), got the attention of environmentalists as an alternative in spatial modeling of environmental attributes, due to its capability of incorporating various auxiliary variables with spatially varied correlation coefficients. The objective of this study is to compare GWR and ordinary cokriging (OCK) in predictive mapping of soil total nitrogen (TN) using multiple environmental variables. 353 soil samples within the surface horizon of 0–20 cm in a study area were collected, and their TN contents were measured for calibrating and validating the GWR and OCK interpolations. The environmental variables finally chosen as auxiliary data include elevation, land use types, and soil types. Results indicate that, although OCK is slightly better than GWR in global accuracy of soil TN prediction (the adjusted R² for GWR and OCK are 0.5746 and 0.6858, respectively), the soil TN map interpolated by GWR shows many details reflecting the spatial variations of major auxiliary variables while OCK smoothes out almost all local details. Geographically weighted regression could account for both the spatial trend and local variations, whilst OCK had difficulties to capture local variations. It is concluded that GWR is a more promising spatial interpolation method compared to OCK in predicting soil TN and potentially other soil properties, if a suitable set of auxiliary variables are available and selected.
[113] Warrick A W, Myers D E. 1987.

Optimization of sampling locations for variogram calculations

[J]. Water Resources Research, 23(3): 496-500.

https://doi.org/10.1029/WR023i003p00496      [Cited in this article: 1]

A method is presented and demonstrated for optimizing the selection of sample locations for variogram estimation. It is assumed that the distribution of distance classes is decided a priori and the problem therefore is to closely approximate the preselected distribution, although the dispersion within individual classes can also be considered. All of the locations may be selected or points added to an existing set of sites or to those chosen on regular patterns. In the examples, the sum of squares characterizing the deviation from the desired distribution of couples is reduced by as much as 2 orders of magnitude between random and optimized points. The calculations may be carried out on a microcomputer. Criteria for what constitutes best estimators for variogram are discussed, but a study of variogram estimators is not the object of this paper.
[7] 刘京, 朱阿兴, 张淑杰, 等. 2013.

基于样点个体代表性的大尺度土壤属性制图方法

[J]. 土壤学报, 50(1): 12-20.

[Cited in this article: 1]

大空间尺度范围的土壤属性分布信息是陆地表层过程模拟的基础信息。基于野外样点进行空间插值是获得土壤属性空间分布信息的重要手段。现有的空间插值方法通常要求所用样点对研究区土壤属性空间分布规律具有良好的全局代表性。然而,受采样经费和野外采样条件的限制,所采集的样点往往难以全面地反映研究区土壤属性的空间分布规律。基于这样的样点用现有空间插值方法得到的土壤属性分布图通常精度较低,并且由样点全局代表性差带来的推测不确定性也无法得到度量。为了合理利用这些已采集的但全局代表性不好的样点,本文提出了基于样点"个体代表性"推测土壤属性空间分布并度量推测不确定性的方法。该方法在两点环境条件越相似、土壤属性就越相似的假设下,认为每一样点可以代表与其环境条件相似的地区,并且代表程度可以由两点的环境相似度度量;通过分析环境相似度计算推测不确定性,并以环境相似度为权重计算样点可代表地区的土壤属性值。将该方法应用于推测新疆伊犁地区土壤表层有机质含量,经验证,本文方法能够有效地利用全局代表性差的样点推测样点能够代表地区的土壤属性空间分布,并且所得的推测不确定性与预测残差呈现正向关系,能够有效地指示推测结果的可靠程度。

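[Liu J, Zhu A X, Zhang S J, et al. 2013.

Large-scaled soil attribute mapping method based on individual representativeness of sample sites

[J]. Acta Pedologica Sinica, 50(1): 12-20.]

Information on the distribution of soil properties over large spatial extents is basic input for simulating land surface processes. Spatial interpolation based on field samples is an important means of obtaining the spatial distribution of soil properties. Existing spatial interpolation methods usually require that the samples, taken as a whole, represent the spatial distribution pattern of soil properties in the study area well (global representativeness). However, constrained by sampling budgets and field conditions, the collected samples often fail to reflect fully the spatial pattern of soil properties in the study area. Soil property maps produced from such samples with existing interpolation methods are usually of low accuracy, and the prediction uncertainty caused by poor global representativeness cannot be quantified. To make reasonable use of samples that have already been collected but have poor global representativeness, this paper proposes a method that predicts the spatial distribution of soil properties and quantifies prediction uncertainty based on the "individual representativeness" of samples. Under the assumption that the more similar the environmental conditions at two locations, the more similar their soil properties, the method considers that each sample can represent areas with similar environmental conditions, and that the degree of representation can be measured by the environmental similarity between the two locations. Prediction uncertainty is computed by analyzing environmental similarity, and soil property values in the areas that samples can represent are computed using environmental similarity as the weight. The method was applied to predict topsoil organic matter content in the Yili region of Xinjiang. Validation showed that the method can effectively use samples with poor global representativeness to predict the spatial distribution of soil properties in the areas the samples can represent, and that the resulting prediction uncertainty is positively related to the prediction residuals, so it can effectively indicate the reliability of the predictions.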
[114] Webster R. 1977. Quantitative and numerical methods in soil classification and survey[M]. Oxford, England: Oxford University Press.

[Cited in this article: 3]

[115] Webster R, Oliver M A. 1990. Statistical methods in soil and land resource survey[M]. Oxford, England: Oxford University Press.

[Cited in this article: 2]

[8] 秦承志, 卢岩君, 邱维理, 等. 2010.

模糊坡位信息在精细土壤属性空间推测中的应用

[J]. 地理研究, 29(9): 1706-1714.

https://doi.org/10.11821/yj2010090016      [Cited in this article: 1]

坡位的空间渐变特征影响着小流域及坡面尺度上的土壤、水文、地貌等现象和过程,因此对精细尺度下的地理建模(如土壤空间信息推理)有重要作用.虽然目前已有多种模糊坡位信息定量提取方法,但所得到的模糊坡位信息还缺乏实际应用.本文以精细尺度下的土壤属性空间分布推测为例,对此展开探索.应用模型假设:(1)在小流域内,地形因素主导着土壤属性空间分布的变化;(2)典型坡位上对应分布着典型的土壤属性值,土壤属性与坡位之间存在协同变化关系.据此建立以模糊坡位信息对各类典型坡位上土壤样点属性值的加权平均模型,推测土壤属性的空间分布.模型应用于黑龙江省嫩江流域一个地形平缓的小区(面积约60 km²),通过一个以坡位典型位置作为原型的模糊坡位定量方法提取5类坡位(山脊、坡肩、背坡、坡脚、沟谷)的空间渐变信息,对土壤表层有机质含量的空间分布进行推测.推测结果通过研究区70个土壤采样点进行评价,以推测结果与评价样点集之间的相关系数、平均绝对误差、均方根误差作为定量评价指标,与使用常用地形属性的多元线性回归模型推测结果进行对比.评价结果表明,仅使用极少建模点的加权平均模型的推测结果优于多元线性回归模型的推测结果.

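[Qin C Z, Lu Y J, Qiu W L, et al. 2010.

Application of fuzzy slope positions in predicting spatial distribution of soil property at finer scale

[J]. Geographical Research, 29(9): 1706-1714.]

The spatially gradual nature of slope position affects soil, hydrological, and geomorphic phenomena and processes at the small-watershed and hillslope scales, and therefore plays an important role in fine-scale geographic modeling (such as soil spatial information inference). Although several quantitative methods for extracting fuzzy slope position information exist, the resulting information has seen little practical application. This paper explores such application, using the prediction of the spatial distribution of soil properties at fine scale as an example. The model assumes that (1) within a small watershed, terrain dominates the variation in the spatial distribution of soil properties; and (2) typical soil property values occur at typical slope positions, and soil properties co-vary with slope position. On this basis, a weighted average model is built in which fuzzy slope position information weights the property values of soil samples at each type of typical slope position, in order to predict the spatial distribution of soil properties. The model was applied to a gently undulating area (about 60 km²) in the Nenjiang River basin, Heilongjiang Province. A fuzzy slope position quantification method that uses typical slope position locations as prototypes was used to extract the spatially gradual information of five slope positions (ridge, shoulder slope, backslope, footslope, and valley), and the spatial distribution of topsoil organic matter content was predicted. The predictions were evaluated with 70 soil sampling points in the study area, using the correlation coefficient, mean absolute error, and root mean square error between the predictions and the evaluation sample set as quantitative indices, and compared with the predictions of a multiple linear regression model using common terrain attributes. The evaluation shows that the weighted average model, which uses only a very small number of modeling points, outperforms the multiple linear regression model.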
[116] Webster R, Oliver M A. 1992.

Sample adequately to estimate variograms of soil properties

[J]. European Journal of Soil Science, 43(1): 177-192.

https://doi.org/10.1111/j.1365-2389.1992.tb00128.x      [Cited in this article: 1]

The variogram is central in the spatial analysis of soil, yet it is often estimated from few data, and its precision is unknown because confidence limits cannot be determined analytically from a single set of data. Approximate confidence intervals for the variogram of a soil property can be found numerically by simulating a large field of values using a plausible model and then taking many samples from it and computing the observed variogram of each sample. A sampling distribution of the variogram and its percentiles can then be obtained. When this is done for situations typical in soil and environmental surveys it seems that variograms computed on fewer than 50 data are of little value and that at least 100 data are needed. Our experiments suggest that for a normally distributed isotropic variable a variogram computed from a sample of 150 data might often be satisfactory, while one derived from 225 data will usually be reliable.
[117] Wilding L P, Jones R B, Schafer G M. 1965.

Variation of soil morphological properties within Miami, Celina, and Crosby mapping units in West-Central Ohio

[J]. Soil Science Society of America Journal, 29(6): 711-717.

https://doi.org/10.2136/sssaj1965.03615995002900060033x      [Cited in this article: 1]

Variation of soil morphological properties within mapping units of Miami, Celina, and Crosby soils in Ohio has been statistically summarized. Ten randomly selected profiles within each of 24 mapping delineations of these soils were sampled for this characterization. The most variable properties were horizon thickness, depth of leaching of carbonates, loess thickness, depth to mottling, pH, and class (size) of soil structure. Clay content, grade (strength) of soil structure, and soil color were least variable. The number of observations required to estimate the population mean of the above parameters within certain limits using a .95 confidence interval was computed. These soils were correctly classified with regard to great group at 96% of the 240 observations, to subgroup at 85%, to soil series at 42%, and to soil type at 39%. Parent material was mapped accurately 88% of the time; erosion, 94%; pH, 70%; solum thickness, 63%; and drainage class, 65%. Since all delineations contained 30% or more inclusions of other soils, these mapping units would be considered complex or undifferentiated units based on the present concept of a mapping unit. It is proposed that the concept of the mapping unit be modified to emphasize the dominant soil of the area rather than implying 85% mapping accuracy of a specified soil.
[9] 瞿明凯, 李卫东, 张传荣, 等. 2014.

地理加权回归及其在土壤和环境科学上的应用前景

[J]. 土壤, 46(1): 15-22.

[Qu M K, Li W D, Zhang C R, et al. 2014.

Geographically weighted regression and its application prospect in soil and environmental sciences

[J]. Soils, 46(1): 15-22.]

[118] Yang L, Brus D J, Zhu A X, et al. 2018.

Accounting for access costs in validation of soil maps: A comparison of design-based sampling strategies

[J]. Geoderma, 315: 160-169.

https://doi.org/10.1016/j.geoderma.2017.11.028      [Cited in this article: 1]

The quality of soil maps can best be estimated by collecting additional data at locations selected by probability sampling. These data can be used in design-based estimation of map quality measures such as the population mean of the squared prediction errors (MSE) for continuous soil maps and overall accuracy for categorical soil maps. In areas with large differences in access costs it can be attractive to account for these differences in selecting validation locations. In this paper two types of sampling design are compared that take access costs into account: sampling with probabilities proportional to size (pps) and stratified simple random sampling (STSI). In pps the inverse of the square root of the access costs is used as a size variable. Two estimators of MSE are applied, the Hansen-Hurwitz and Hajek estimator. In STSI optimal strata are constructed based on access costs. Simple random sampling (SI) is taken as a reference design. The sampling strategies were compared on the basis of: 1) the variance of the estimated MSE; 2) the variance of the total pointwise access costs; 3) the 95-percentile of the sampling distribution of the total access costs. The comparison was done at equal expected total pointwise access costs. The sampling strategies were compared in a simulation study and a real-world case study in Anhui, China. In the case study car travel and hiking costs were considered in computing access costs per point. The results showed that the variance of estimated MSE with pps(Hansen-Hurwitz) was larger than with pps(Hajek) and STSI. The variances of estimated MSE of pps(Hajek) and STSI were about equal and smaller than that of SI. The gain in precision compared to SI depends on the cost distribution. The larger the coefficient of variation of the costs, the larger the gain. The 95 percentile of the sampling distribution of the total pointwise access costs with STSI was smaller than with pps and SI. 
The gain in precision of pps(Hajek) and STSI was about 30% accounting for hiking costs only, and about 10% accounting for the sum of car travel and hiking costs in the case study. The proposed sampling strategies are of interest for surveying any soil property in areas with marked differences in access costs, not just for validation of soil maps.
[119] Yang L, Jiao Y, Fahmy S, et al. 2011. Updating conventional soil maps through digital soil mapping[J]. Soil Science Society of America Journal, 75(3): 1044-1053. https://doi.org/10.2136/sssaj2010.0002

Conventional soil maps, as the major data source for information on the spatial variation of soil, are limited in terms of both the level of spatial detail and the accuracy of soil attributes. These soil maps, however, contain valuable knowledge on soil-environment relationships. Such knowledge can be extracted for updating conventional soil maps through the use of available high-quality data on environmental variables and data analysis techniques. We developed a method to update conventional soil maps using digital soil mapping techniques without additional field work, which can be used in situations where the study area contains no or few soil profile descriptions at points. The basis of the method is that soil polygons on a conventional soil map correspond to landscape units, which can be considered as combinations of environmental factors. Such environmental combinations were approximated through fuzzy clustering on the environmental factors. We extracted the knowledge on soil-environment relationships by relating the environmental combinations to the mapped soil types. The extracted knowledge was then used for soil mapping using the Soil Land Inference Model (SoLIM) framework. This method was demonstrated through a case study for updating a conventional 1:20,000 soil map of Wakefield, NB, Canada. The case study showed that the updated digital soil map contained much greater spatial detail than the conventional soil map. Field validation indicated that the accuracy of the updated soil map was much higher than the conventional soil map at the level of soil associations with drainage classes, indicating that the proposed method is an effective approach to updating conventional soil maps.
[10] Shi Z, Guo Y, Jin X, et al. 2011. Advancement in study on proximal soil sensing[J]. Acta Pedologica Sinica, 48(6): 1274-1281. (in Chinese)

Techniques for rapidly acquiring soil physical and chemical parameters in the field are an important research direction in soil science and a key technical support for the shift from conventional testing of soil physical and chemical properties toward real-time field monitoring. In 2008 the International Union of Soil Sciences (IUSS) established a dedicated working group on Proximal Soil Sensing (PSS) to promote academic exchange on the principles, techniques, equipment, and applications of proximal soil sensors. This paper classifies and summarizes current domestic and international research on proximal soil sensors according to the working principles of the devices, and then points out the main problems and development trends of proximal soil sensing technology.
[11] Song M, Yang L, Zhu A X, et al. 2017. Mapping soil organic matter in farming areas with crop rotation[J]. Chinese Journal of Soil Science, 48(4): 778-785. (in Chinese)

[120] Yang L, Qi F, Zhu A X, et al. 2016. Evaluation of integrative hierarchical stepwise sampling for digital soil mapping[J]. Soil Science Society of America Journal, 80(3): 637-651. https://doi.org/10.2136/sssaj2015.08.0285

This paper presents an integrative hierarchical stepwise sampling (IHS) method and two case studies to compare it with stratified random sampling (SRS) and conditioned Latin hypercube sampling (cLHS). The first comparison between IHS and SRS was conducted for mapping sand content of two soil layers in a study area in Anhui Province, China. Two sample sets of the same sample size were collected in the field based on IHS and SRS. The second case study is a simulation study, where we compared IHS and cLHS for mapping soil series in the Raffelson watershed in Wisconsin (USA). The study used an accurate and detailed soil series map produced previously as a proxy of the real soil distribution. Virtual samples with nine sample sizes designed by IHS and cLHS were collected on the soil map. For both case studies, an individual predictive soil mapping method was employed and independent validation samples were used to evaluate the mapping accuracies. Results indicate that IHS generally performs better than SRS for capturing distributions of the environmental variables. It obtained higher mapping accuracies than SRS at different sample sizes. On the other hand, cLHS appears to provide a better representation for distributions of the environmental variables than IHS, but the mapping accuracies with IHS are higher than those with cLHS at nearly all sample sizes. Finally, both case studies showed that IHS provides valuable information on representativeness of the samples.
[121] Yang L, Zhu A X, Qi F, et al. 2013. An integrative hierarchical stepwise sampling strategy for spatial sampling and its application in digital soil mapping[J]. International Journal of Geographical Information Science, 27(1): 1-23. https://doi.org/10.1080/13658816.2012.658053

Sampling design plays an important role in spatial modeling. Existing methods often require a large amount of samples to achieve desired mapping accuracy, but imply considerable cost. When there are not enough resources for collecting a large set of samples at once, stepwise sampling approach is often the only option for collecting the needed large sample set, especially in the case of field surveying over large areas. This article proposes an integrative hierarchical stepwise sampling strategy which makes the samples collected at different stages an integrative one. The strategy is based on samples' representativeness of the geographic feature at different scales. The basic idea is to sample at locations that are representative of large-scale spatial patterns first and then add samples that represent more local patterns in a stepwise fashion. Based on the relationships between a geographic feature and its environmental covariates, the proposed sampling method approximates a hierarchy of spatial variations of the geographic feature under concern by delineating natural aggregates (clusters) of its relevant environmental covariates at different scales. The natural occurrence of such aggregates is modeled using a fuzzy c-means clustering method. We iterate through different numbers of clusters from only a few to many more to be able to reveal clusters at different spatial scales. At a particular iteration, locations that bear high similarity to the cluster prototypes are identified. If a location is consistently identified at multiple iterations, it is then considered to be more representative of the general or large-scale spatial patterns. Locations that are identified less during the iterations are representative of local patterns. The integrative stepwise sampling design then gives higher sampling priority to the locations that are more representative of the large-scale patterns than local ones. We applied this sampling design in a digital soil mapping case study. 
Different representative samples were obtained and used for soil inference. We started with samples that are the most representative of the large-scale patterns and then gradually included the samples representative of local patterns. Field evaluation indicated that the additions of more samples with lower representativeness lead to improvements of accuracy with a decreasing marginal gain. When cost-effectiveness is considered, the representative grade could provide essential information on the number and order of samples to be sampled for an effective sampling design.
[122] Yang Q Y, Luo W Q, Jiang Z C, et al. 2016. Improve the prediction of soil bulk density by cokriging with predicted soil water content as auxiliary variable[J]. Journal of Soils and Sediments, 16(1): 77-84. https://doi.org/10.1007/s11368-015-1193-4

Soil bulk density (SBD) is a key soil physical property affecting the transport of water and solutes, which is essential to estimating soil carbon and nutrient reserves. However, it is considered to b
[123] Zeng C Y, Yang L, Zhu A X, et al. 2016. Mapping soil organic matter concentration at different scales using a mixed geographically weighted regression method[J]. Geoderma, 281: 69-82. https://doi.org/10.1016/j.geoderma.2016.06.033

The present regression models in digital soil mapping usually assume that relationships between soil properties and environmental variables are always fixed (as in MLR) or varying (as in GWR) in geographical space. In reality, some of the environmental variables may be fixed in affecting soil property variation and some are local varying. In this study, a mixed geographically weighted regression (MGWR) method which can deal with fixed and varying spatial relationships between a target variable and its environmental variables were proposed and used to predict topsoil soil organic matter (SOM) concentration in two study areas (Heshan, Heilongjiang province and Xuancheng, Anhui province, China) at two scales. Three groups of sample sets were created based on the total samples in the study areas to evaluate the robustness and stability of the model. Multiple linear regression (MLR), geographically weighted regression (GWR), GWR-kriging (GWRK), local regression-kriging (LRK), kriging with an external drift (KED), and ordinary kriging (OK) were used for comparison with MGWR. The validation results showed that the use of MGWR reduced the RMSE of GWR by 10.5% and 7.6% on average, reduced the RMSE of MLR by 12.8% and 9.9% on average for Heshan and Xuancheng study areas respectively. MGWR also showed a good competitiveness when compared with GWRK, LRK, KED and OK. In Heshan study area, the influence of flow length, relative position index, foot slope and distance to the nearest drainage were constant, whereas the elevation, topographic wetness index and valley index showed different influence in different regions. In Xuancheng study area, the fixed environmental variables were profile curvature, topographic wetness index and slope, whereas the varying environmental variables were precipitation, temperature, elevation, and limestone. 
The results indicate that the accuracy of predictions can be improved by adaptive coefficient according to the variation of environmental variables as implemented in MGWR compared with others considering only the local or global relationships. It was concluded that mixed geographically weighted regression model could be a potential method for digital soil mapping.
[12] Wang H X, Zhang C. 2007. Studies on spatial variability of soil water with Matlab[J]. Chinese Journal of Eco-Agriculture, 15(1): 127-130. (in Chinese)

This paper uses MATLAB, a powerful data computation and analysis package, to analyze the spatial variability of soil water. Interpolation and trend-surface analysis of the sample-plot data readily yield the trend of soil water variation across the plot, and the results provide useful guidance for soil and water conservation and farmland irrigation.
[124] Zeng C Y, Zhu A X, Liu F, et al. 2017. The impact of rainfall magnitude on the performance of digital soil mapping over low-relief areas using a land surface dynamic feedback method[J]. Ecological Indicators, 72: 297-309. https://doi.org/10.1016/j.ecolind.2016.08.023

Previous studies have demonstrated that the pattern of land surface dynamic feedbacks (LSDF) based on remote sensing images after a rainfall event can be used to derive environmental covariates to assist in predicting soil texture variation over low-relief areas. However, the impact of the rainfall magnitude on the performance of these covariates has not been thoroughly investigated. The objective of this study was to investigate this impact during ten observation periods following rainfall events of different magnitudes (0–40 mm). An individual predictive soil mapping method (iPSM) was used to predict soil texture over space based on the environmental covariates derived from land surface dynamic feedbacks. The prediction error showed strong negative correlation with rainfall magnitude (Pearson's r between root-mean squared error of prediction and rainfall magnitude = −0.943 for percentage of sand and −0.883 for percentage of clay). When the rainfall reaches a certain magnitude, the prediction error becomes stable. The recommended rain magnitude (threshold) using LSDF method in this study area is larger than 20 mm for both sand and clay percentage. The predictive maps based on different observed periods with similar rainfall magnitudes show only slight differences. Rainfall magnitude can thus be said to have a significant impact on the prediction accuracy of soil texture mapping. Greater rainfall magnitude will improve the prediction accuracy when using the LSDF. And high wind speed, high evaporation and low relative humidity during the observed periods also improved the prediction accuracy, all by stimulating differential soil drying.
[125] Zhang C S, Tang Y, Xu X L, et al. 2011. Towards spatial geochemical modelling: Use of geographically weighted regression for mapping soil organic carbon contents in Ireland[J]. Applied Geochemistry, 26(7): 1239-1248. https://doi.org/10.1016/j.apgeochem.2011.04.014

It is challenging to perform spatial geochemical modelling due to the spatial heterogeneity features of geochemical variables. Meanwhile, high quality geochemical maps are needed for better environmental management. Soil organic C (SOC) distribution maps are required for improvements in soil management and for the estimation of C stocks at regional scales. This study investigates the use of a geographically weighted regression (GWR) method for the spatial modelling of SOC in Ireland. A total of 1310 samples of SOC data were extracted from the National Soil Database of Ireland. Environmental factors of rainfall, land cover and soil type were investigated and included as the independent variables to establish the GWR model. The GWR provided comparable and reasonable results with the other chosen methods of ordinary kriging (OK), inverse distance weighted (IDW) and multiple linear regression (MLR). The SOC map produced using the GWR model showed clear spatial patterns influenced by environmental factors and the smoothing effect of spatial interpolation was reduced. This study has demonstrated that GWR provides a promising method for spatial geochemical modelling of SOC and potentially other geochemical parameters.
[13] Wang J F, Jiang C S, Li L F, et al. 2009. Kongjian chouyang yu tongji tuiduan (Spatial sampling and statistical inference)[M]. Beijing, China: Science Press. (in Chinese)

[126] Zhang S J, Zhu A X, Liu J, et al. 2016. An heuristic uncertainty directed field sampling design for digital soil mapping[J]. Geoderma, 267: 123-136. https://doi.org/10.1016/j.geoderma.2015.12.009

Legacy samples are a valuable data source for digital soil mapping. However, these sample sets are often small in size and ad hoc in spatial distribution. Constrained by the limited representativeness of such a sample set, the obtained soil maps are often incomplete in spatial coverage with gaps at the locations which cannot be well represented by these samples. The maps may also contain areas of high prediction uncertainty. In order to extend the predicted area and reduce prediction uncertainty, additional samples are needed. This paper presents a sampling design based on prediction uncertainty to select samples which will effectively complement the sparse and ad hoc samples, and maximize the spatial coverage of prediction and minimize prediction uncertainty. A case study in China shows that this sampling scheme was effective in achieving these goals. Compared with stratified random sampling scheme, when the number of additional samples is the same, the produced map using uncertainty directed samples has larger predicted area, and the accuracy of the produced map is higher than that of the maps using stratified random samples. The finding of this study suggests that prediction uncertainty is a useful indicator to aid field sample selection and to complement the legacy data. Furthermore, the mapping accuracy produced using this method can be quantitatively related to the number of additional samples needed which opens a new horizon for digital soil mapping.
[127] Zhao M S, Rossiter D G, Li D C, et al. 2014. Mapping soil organic matter in low-relief areas based on land surface diurnal temperature difference and a vegetation index[J]. Ecological Indicators, 39: 120-133. https://doi.org/10.1016/j.ecolind.2013.12.015

Accurate estimates of the spatial variability of soil organic matter (SOM) are necessary to properly evaluate soil fertility and soil carbon sequestration potential. In plains and gently undulating terrains, soil spatial variability is not closely related to relief, and thus digital soil mapping (DSM) methods based on soil–landscape relationships often fail in these areas. Therefore, different predictors are needed for DSM in the plains. Time-series remotely sensed data, including thermal imagery and vegetation indices provide possibilities for mapping SOM in such areas. Two low-relief agricultural areas (Peixian County, 28 km × 28 km and Jiangyan County, 38 km × 50 km) in northwest and middle Jiangsu Province, east China, were chosen as case study areas. Land surface diurnal temperature difference (DTD) extracted from moderate resolution imaging spectroradiometer (MODIS) land surface temperature (LST), and soil-adjusted vegetation index (SAVI) at the peak of growing season calculated from Landsat ETM+ image were used as predictors. Regression kriging (RK) with a mixed linear model fitted by residual maximum likelihood (REML) and residuals interpolated by simple kriging (SK) were used to model and map SOM spatial distribution; ordinary kriging (OK) was used as a baseline comparison. The root mean squared error, mean error and mean absolute error calculated from leave-one-out cross-validation were used to assess prediction accuracy. Results showed that the proposed covariates provided added value to the observations. SAVI aggregated to MODIS resolution was able to identify local highs and lows not apparent from the DTD imagery alone. Despite the apparent similarity of the two areas, the spatial structure of residuals from the linear mixed models were quite different; ranges on the order of 3 km in Jiangyan but 16 km in Peixian, and accuracy of best models differed by a factor of two (3.3 g/kg and 6.3 g/kg SOM, respectively).
This suggests that time-series remotely sensed data can provide useful auxiliary variable for mapping SOM in low-relief agricultural areas, with three important cautions: (1) image dates must be carefully chosen; (2) vegetation indices should supplement diurnal temperature differences, (3) model structure must be calibrated for each area.
[14] Yang L. 2006. Jiyu mohu c junzhi julei tiqu turang: Huanjingxi zhishi de fangfa yanjiu (A method for extracting knowledge of soil-environment relationships based on fuzzy c-means clustering)[D]. Beijing, China: Beijing Normal University. (in Chinese)

[128] Zhu A X. 1997. A similarity model for representing soil spatial information[J]. Geoderma, 77(2-4): 217-242. https://doi.org/10.1016/S0016-7061(97)00023-2

A fuzzy logic based model (called a similarity model) was developed to represent soil spatial information so that soil landscape is perceived as a continuum in both the parameter space and the geographic space. The similarity model consists of two components: the similarity representation component and a raster representation scheme. The similarity representation component uses a set of prescribed soil taxonomic categories as the central concepts of the fuzzy soil classes and represents a soil at a given location as a set of similarity values to these central concepts. The collection of these similarity values forms an n-element vector called a soil similarity vector. With the use of a raster representation scheme, soil spatial information over an area can be represented as an array of soil similarity vectors. This similarity model has two main advantages for representing spatial soil information over conventional polygon-based soil maps. Firstly, the details of soil spatial information can be represented at the resolution of a raster data model rather than at the minimal mapping sizes as in conventional polygon-based soil maps. Secondly, under the similarity representation, the deviation of a soil at a given location from typical soil classes can be preserved and its properties can then take values intermediate to the typical values of the prescribed soil types. A case study conducted in the Lubrecht Experiment Forest of western Montana demonstrated that soil spatial information represented under the similarity model has a higher resolution at both the attribute level and the spatial level than that in the conventional soil map of the area.
[129] Zhu A X. 2000. Mapping soil landscape as spatial continua: the neural network approach[J]. Water Resources Research, 36(3): 663-677. https://doi.org/10.1029/1999WR900315

A neural network approach was developed to populate a soil similarity model that was designed to represent soil landscape as spatial continua for hydroecological modeling at watersheds of mesoscale size. The approach employs multilayer feed forward neural networks. The input to the network was data on a set of soil formative environmental factors; the output from the network was a set of similarity values to a set of prescribed soil classes. The network was trained using a conjugate gradient algorithm in combination with a simulated annealing technique to learn the relationships between a set of prescribed soils and their environmental factors. Once trained, the network was used to compute for every location in an area the similarity values of the soil to the set of prescribed soil classes. The similarity values were then used to produce detailed soil spatial information. The approach also included a Geographic Information System procedure for selecting representative training and testing samples and a process of determining the network internal structure. The approach was applied to soil mapping in a watershed, the Lubrecht Experimental Forest, in western Montana. The case study showed that the soil spatial information derived using the neural network approach reveals much greater spatial detail and has a higher quality than that derived from the conventional soil map. Implications of this detailed soil spatial information for hydroecological modeling at the watershed scale are also discussed.
[15] Yang L. 2009. Mudexing caiyang xia yangben sheji yu zhitu jingdu de guanxi yanjiu: Yi shuzi turang zhitu weili (Relationship between sample design and mapping accuracy under purposive sampling: A case study of digital soil mapping)[D]. Beijing, China: Institute of Geographical Sciences and Natural Resources Research, CAS. (in Chinese)

[130] Zhu A X, Band L E. 1994. A knowledge-based approach to data integration for soil mapping[J]. Canadian Journal of Remote Sensing, 20(4): 408-418. https://doi.org/10.1080/07038992.1994.10874583

Integrating spatial data from multiple sources is a very common method in a wide range of geographic analyses. It can be divided into two basic steps: syntactic data integration and semantic data integration. Syntactic integration concerns the alignment and structuring of spatial data from different sources, whereas semantic integration deals with the analysis and interpretation of multi-source spatial data used to answer specific questions. This article presents a knowledge-based method for integrating multi-source data in a semantic context. The method uses an expert-system methodology to combine empirical knowledge with other environmental data in order to derive information about a given spatial phenomenon. It also incorporates fuzzy logic into the data integration process to account for the spatially continuous nature of geographic phenomena. Use of the method is illustrated with a soil inference example. Results of a field survey carried out for this study showed that the method provided soil information of better quality than that derived from the soil map produced from a conventional field survey.
[131] Zhu A X, Band L, Vertessy R, et al. 1997. Derivation of soil properties using a soil land inference model (SoLIM)[J]. Soil Science Society of America Journal, 61(2): 523-533. https://doi.org/10.2136/sssaj1997.03615995006100020022x

SoLIM (Soil Land Inference Model) is a fuzzy inference scheme for estimating and representing the spatial distribution of soil types in a landscape. This study developed the inference method a step further to derive continuous soil property maps through two case studies. The first case illustrates the derivation of soil A horizon depth in a mountainous area in western Montana. It was found that the inferred depths are a closer fit to observed depths than those derived from the conventional soil map at both spatial and attribute levels. The second case shows the derivation of soil transmissivity values across a small catchment with a gentle environmental variation in Tumut, NSW, Australia. This case shows that the derived soil transmissivity map is comparable to the results from systematic field survey over a small area. SoLIM works well in an area where there is a good understanding of the relationships between soils and their formative environment and where the soil formative environment can be characterised using current geographical information system techniques. However, we experienced difficulty with the methodology when it was applied in an area where the environmental gradient is gentle and the soil formative environment cannot be very well described using the primitive environmental indices currently employed in SoLIM.
[16] Yang L, Zhu A X, Qin C Z, et al. 2010. A purposive sampling design method based on typical points and its application in soil mapping[J]. Progress in Geography, 29(3): 279-286. (in Chinese) https://doi.org/10.11820/dlkxjz.2010.03.004

Given the limitations of classical sampling and spatial sampling, this paper proposes a purposive sampling design method that aims to locate typical points. The method analyzes environmental factors that co-vary with the spatial distribution of the target geographic feature, extracts the typical patterns of that feature's spatial variation, and places sample points on those patterns; the resulting typical points reduce the number of samples required. Two study areas on Heshan Farm, Heilongjiang, were used as examples, with soil thickness and topsoil organic matter as the target soil properties. Fuzzy c-means clustering of four environmental factors that co-vary with the soil properties produced the environmental factor combinations corresponding to the patterns of soil property variation. Typical points were designed from the fuzzy membership values and then sampled. The property values at the typical points were combined with the fuzzy memberships in a weighted-average model to produce maps of the soil properties that reflect their continuous variation with terrain. Four evaluation indices computed on independent field validation points were used to assess the maps quantitatively. The agreement indices between predicted and observed values were high for the validation sets in both study areas, indicating that the proposed method is an effective way to lay out sample points. The study also discusses how the maps differ when different numbers of typical points are designed for each environmental combination class, and concludes that adding typical points does not necessarily improve the accuracy of spatial prediction of soil properties.
[132] Zhu A X, Hudson B, Burt J E, et al. 2001. Soil mapping using GIS, expert knowledge, and fuzzy logic[J]. Soil Science Society of America Journal, 65(5): 1463-1472. https://doi.org/10.2136/sssaj2001.6551463x

[133] Zhu A X, Liu F, Li B L, et al. 2010. Differentiation of soil conditions over low relief areas using feedback dynamic patterns[J]. Soil Science Society of America Journal, 74(3): 861-869. https://doi.org/10.2136/sssaj2008.0411

In many areas, such as plains and gently undulating terrain, easy-to-measure soil-forming factors such as landform and vegetation do not co-vary with soil conditions across space to the level that they can be effectively used in digital soil mapping. A challenging problem is how to develop a new environmental variable that co-varies with soil spatial variation under these situations. This study examined the idea that change patterns (dynamic feedback patterns) of the land surface, such as those captured daily by remote sensing images during a short period (6-7 d) after a major rain event, can be used to differentiate soil types. To examine this idea, we selected two study areas with different climates: one in northeastern China and the other in northwestern China. Images from the Moderate Resolution Imaging Spectroradiometer (MODIS) were used to capture land surface feedback. To measure feedback dynamics, we used spectral information divergence (SID). Results of an independent-samples t-test showed that there was a significant difference in SID values between pixel pairs of the same soil subgroup and those of different subgroups. This indicated that areas with different soil types (subgroup level) exhibited significantly different dynamic feedback patterns, and areas within the same soil type have similar dynamic feedback patterns. It was also found that the more similar the soil types, the more similar the feedback patterns. These findings could lead to the development of a new environmental covariate that could be used to improve the accuracy of soil mapping in low-relief areas.
[17] Yang L, Zhu A X, Qin C Z, et al. 2011. A soil sampling method based on representativeness grade of sampling points[J]. Acta Pedologica Sinica, 48(5): 938-946. (in Chinese)

[134] Zhu A X, Liu J, Du F, et al. 2015. Predictive soil mapping with limited sample data[J]. European Journal of Soil Science, 66(3): 535-547. https://doi.org/10.1111/ejss.12244

Existing predictive soil mapping (PSM) methods often require soil sample data to be sufficient to represent soil–environment relationships throughout the study area. However, in many parts of the world with only a limited quantity of soil sample data to represent the study area, this is still an issue for PSM application. This paper presents a method, named ‘individual predictive soil mapping’ (iPSM), which can make use of limited soil sample data for PSM. With the assumption that similar environmental conditions have similar soils, iPSM uses the soil–environment relationship at each individual soil sample location to predict soil properties at unvisited locations and estimate prediction uncertainty. Specifically, the environmental similarities of an unvisited location to a set of soil sample locations are used in a weighted average method to integrate the soil–environment relationships at sample locations for prediction and uncertainty estimation. As a case study, iPSM was applied to map soil organic matter (SOM) content (%) in the topsoil layer using two sets of soil samples. Compared with multiple linear regression (MLR), iPSM produced a more accurate SOM map (root mean squared error (RMSE) 1.43, mean absolute error (MAE) 1.16) than MLR (RMSE 8.54, MAE 7.34) when the ability of the sample set to represent the study area is limited and achieved a comparable accuracy (RMSE 1.10, MAE 0.69) with MLR (RMSE 1.01, MAE 0.73) when the sample set could represent the study area better. In addition, the prediction uncertainty estimated by iPSM was positively related to prediction residuals in both scenarios. This study demonstrates that iPSM is an effective alternative when existing soil samples are limited in their ability to represent the study area and the prediction uncertainty in iPSM can be used as an indicator of its prediction accuracy.
[135] Zhu A X, Yang L, English E, et al.2008.

Purposive sampling for digital soil mapping under fuzzy logic

[C]//Proceedings of 2007 International Annual Meeting. New Orleans, Louisiana: ASA.


[18] 杨奇勇, 杨劲松, 刘广明. 2011.

土壤速效养分空间变异的尺度效应

[J]. 应用生态学报, 22(2): 431-436.

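Using GIS together with classical statistics and geostatistics, this study examined the spatial variability of soil available phosphorus (AP) and available potassium (AK) in the cropland of Yucheng City at two sampling scales, county and town, through classical statistical analysis, variograms, and Kriging interpolation maps. At both sampling scales, AP and AK followed a log-normal distribution, with coefficients of variation ranging from 26.5% to 36.6%, which indicates moderate variability; the coefficients of variation of both AP and AK increased as the sampling scale decreased. At both scales, AP and AK were spatially correlated within a certain range. At the county scale the spatial autocorrelation distances of AP and AK were relatively large, 9.0 km and 26.5 km respectively, whereas at the town scale they were markedly smaller, 1.7 km and 2.8 km respectively. At both sampling scales, AP and AK were affected by structural and random factors and showed clearly different distribution patterns.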

[Yang Q Y, Yang J S, Liu G M.2011.

Scale-dependency of spatial variability of soil available nutrients

[J]. Chinese Journal of Applied Ecology, 22(2): 431-436.]

[19] 张黎明, 林金石, 史学正, 等. 2011.

中国水稻土氮密度变异性的幅度效应研究

[J]. 生态环境学报, 20(1): 1-6.

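https://doi.org/10.3969/j.issn.1674-5906.2011.01.001

Based on 1491 paddy soil profiles from the Second National Soil Survey of China, this study examined the variation of paddy soil nitrogen density at administrative extents (administrative region, province, prefecture) and soil-region extents (soil region, soil belt, soil district), and its response to expanding extent. The mean nitrogen densities of Chinese paddy soils for 0-20 cm and 0-100 cm were 18.7 t·hm⁻² and 12.4 t·hm⁻², and the spatial variability of soil nitrogen generally increased as the extent expanded. Across soil-region extents, the within-group and between-group variability of total nitrogen density at 0-20 cm decreased as the extent of the study area decreased; from the soil region to the soil district scale, the between-group variation rate of total nitrogen density at 0-100 cm fell from about 250% to less than 50%. At the administrative region and provincial scales the between-group variability of total nitrogen density changed little, with variation rates all below 100%, but at the prefecture-level city scale the between-group variation rate reached 400%, which shows that the variation rate of nitrogen density differs greatly with the extent chosen for the same area. It is therefore necessary, when designing future paddy soil surveys, to choose the sampling layout and the number of sample points according to the variation rate at each extent.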

[Zhang L M, Lin J S, Shi X Z, et al.2011.

The effect of different extents on variation of nitrogen density of paddy soils in China

[J]. Ecology and Environmental Sciences, 20(1): 1-6.]

[20] 朱阿兴, 李宝林, 裴韬, 等. 2008. 精细数字土壤普查模型与方法[M]. 北京: 科学出版社.

[Zhu A X, Li B L, Pei T, et al.2008. Jingxi shuzi turang pucha moxing yu fangfa[M]. Beijing, China: Science Press.]


[21] 朱鹤健, 陈健飞, 陈松林, 等. 2010. 土壤地理学[M]. 2版. 北京: 高等教育出版社.

[Zhu H J, Chen J F, Chen S L, et al.2010. Turang dilixue[M]. 2nd ed. Beijing, China: Higher Education Press.]


[22] Behrens T, Schmidt K, Ramirez-Lopez L, et al.2014.

Hyper-scale digital soil mapping and soil formation analysis

[J]. Geoderma, 213: 578-588.

https://doi.org/10.1016/j.geoderma.2013.07.031

Landscape characteristics show local, regional and supra-regional components. As a result pedogenesis and the spatial distribution of soil properties are both influenced by features emerging at multiple scales. To account for this effect in a predictive model, descriptors of the geomorphic signature are required at multiple scales. In this study, we present a new hyper-scale terrain analysis approach, referred to as Contextual Statistical Mapping (ConStat), which is based on statistical neighborhood measures derived for growing sparse circular neighborhoods. The statistical measures tested comprise basic descriptors such as the minimum, maximum, mean, standard deviation, and skewness, as well as statistical terrain attributes and directional components. We propose a data mining framework to determine the relevant statistical measures at the relevant scales to analyze and interpret the influence of these statistical measures and to map the geomorphic structures influencing soil formation and the regions where a statistical measure shows influence. We introduce ConStat on two landscape-scale DSM examples with different soil genesis regimes where the ConStat terrain features serve as proxies for multi-scale variations of climate and parent material conditions. The results show that ConStat provides high predictive power. The cross-validated R2 values range from 0.63 for predicting topsoil clay content in the Piracicaba area (Brazil) to 0.68 for topsoil silt content in the Rhine-Hesse area (Germany). The results obtained from data mining analysis allow for interpretations beyond conventional concepts and approaches to explain soil formation. As such it overcomes the trade-off between accuracy and interpretability of soil property predictions.
[23] Bell J C, Cunningham R L, Havens M W.1992.

Calibration and validation of a soil-landscape model for predicting soil drainage class

[J]. Soil Science Society of America Journal, 56(6): 1860-1866.

https://doi.org/10.2136/sssaj1992.03615995005600060035x

ABSTRACT A statistical model was developed that relates soil drainage classes to eight landscape parameters describing slope morphology, proximity to surface drainage features, and soil parent material. Soil profiles and landscape parameters were described at 305 randomly selected sampling points within the Mifflintown 7.5-min topographic quadrangle in central Pennsylvania. Variables defining the spatial structure of the landscape were derived from digitized maps and the data were stored in a geographic information system. These soil-landscape combinations were used to derive a statistical soil-landscape model using multivariate discriminant analysis and class frequency information. The model correctly predicted a majority of the observations within each drainage class and provided a consistent method of extrapolating point information about soils to the three-dimensional landscape.
[24] Bell J C, Cunningham R L, Havens M W.1994.

Soil drainage class probability mapping using a soil-landscape model

[J]. Soil Science Society of America Journal, 58(2): 464-470.

https://doi.org/10.2136/sssaj1994.03615995005800020031x

[25] Besson A, Cousin I, Richard G, et al.2010.

Changes in field soil water tracked by electrical resistivity

[M]//Viscarra Rossel R A, McBratney A B, Minasny B. Proximal soil sensing. Dordrecht, Netherlands: Springer.


[26] Bezdek J C.1981.

Cluster validity

[M]//Bezdek J C. Pattern recognition with fuzzy objective function algorithms. Boston, MA: Springer.


[27] Bishop T F A, McBratney A B, Laslett G M.1999.

Modelling soil attribute depth functions with equal-area quadratic smoothing splines

[J]. Geoderma, 91(1-2): 27-45.

https://doi.org/10.1016/S0016-7061(99)00003-8

Abstract The objective of this paper is to test the ability of equal-area quadratic splines to predict soil depth functions based on bulk horizon data. In addition, the possibility of improving the prediction quality by the use of additional samples from the top and/or bottom of soil profiles along with horizon data is examined. The predictive performance of the splines is compared with that of exponential decay functions, and 1st and 2nd degree polynomials. In addition, the predictive quality of the conventional horizon data is examined. The measure of predictive performance used is the root mean square error values calculated from differences between the ‘true’ depth function and the fitted depth function. The ‘true’ depth functions were derived from the intensive sampling and laboratory analysis of soil profiles. Three soil profiles were sampled; a Red Podzolic Soil (Red Kurosol), Podzol (Aeric Podosol) and Krasnozem (Red Ferrosol). The soil attributes that were measured included: pH, electrical conductivity (EC), clay %, sand %, organic carbon %, gravimetric water content at -33 kPa and air dry. The results clearly indicated the superiority of equal-area quadratic splines in predicting depth functions. Such splines depend on a parameter, λ that controls goodness-of-fit vs. roughness. Their quality of fit varied with the λ value used and it was found that a λ value of 0.1 was the best overall predictor of the depth functions. The results also showed that using additional samples from the top and/or bottom of the soil profiles improved the prediction quality of the spline functions.
[28] Blaszczynski J S.1997.

Landform characterization with geographic information systems

[J]. Photogrammetric Engineering and Remote Sensing, 63(2): 183-191.

https://doi.org/10.1016/S0031-0182(97)81130-3

GIS-based methods for mapping and classification of the landscape surface into what can be understood as fourth-order-of-relief features and include convex areas and their crests, concave areas and their troughs, open concavities and enclosed basins, and horizontal and sloping flats are suggested.
[29] Boettinger J L.2010.

Environmental covariates for digital soil mapping in the Western USA

[M]//Boettinger J L, Howell D W, Moore A C, et al. Digital soil mapping: Bridging research, environmental application, and operation. Dordrecht, Netherlands: Springer.


[30] Brus D J.1994.

Improving design-based estimation of spatial means by soil map stratification: A case study of phosphate saturation

[J]. Geoderma, 62(1-3): 233-246.

https://doi.org/10.1016/0016-7061(94)90038-8

Abstract The usefulness of soil maps and maps of land use was evaluated to estimate the spatial means of several phosphate sorption characteristics in two areas with contrasting historical phosphate loads. The maps were used to stratify the areas for random sampling. This is a way of incorporating knowledge of spatial structure into a design-based sampling strategy. Three stratifications were evaluated, namely by land use, soil map unit and by both, in combination with three methods of allocating sample points to the strata: proportional, optimum and near-optimum. The efficiency of various stratified simple random sampling designs was calculated from data of one sample from each area. The phosphate sorption characteristics were: (a) the areic mass of P2O5 sorbed by soil, i.e. the mass of P2O5 per m² actually sorbed by soil above a reference depth; (b) the maximum areic mass of P2O5 sorbed by soil, i.e. the areic mass which can potentially be sorbed by soil above a reference depth; (c) the relative mass of phosphate sorbed by soil, i.e. the ratio of (a) and (b); (d) the areal fraction of soil saturated with phosphate, i.e. the fraction of an area with a relative mass of phosphate sorbed by soil greater than a critical value. For the maximum areic mass of P2O5 and the areic mass of P2O5 sorbed by soil, stratification by soil map unit will be worthwhile in both areas. For the relative mass of phosphate sorbed by soil and the areal fraction of soil saturated with phosphate there will be a gain only where the historical phosphate load is small. The gain for the areal fraction of soil saturated with phosphate depends strongly on the critical value of the relative mass of phosphate sorbed by soil. This gain may be further increased by stratifying also according to land use.
[31] Brus D J, Bogaert P, Heuvelink G B M.2008.

Bayesian maximum entropy prediction of soil categories using a traditional soil map as soft information

[J]. European Journal of Soil Science, 59(2): 166-177.

https://doi.org/10.1111/j.1365-2389.2007.00981.x

Summary Bayesian Maximum Entropy was used to estimate the probabilities of occurrence of soil categories in the Netherlands, and to simulate realizations from the associated multi-point pdf. Besides the hard observations (H) of the categories at 8369 locations, the soil map of the Netherlands 1:50 000 was used as soft information (S). The category with the maximum estimated probability was used as the predicted category. The quality of the resulting BME(HS)-map was compared with that of the BME(H)-map obtained by using only the hard data in BME-estimation, and with the existing soil map. Validation with a probability sample showed that the use of the soft information in BME-estimation leads to a considerable and significant increase of map purity by 15%. This increase of map purity was due to the high purity of the existing soil map (71.3%). The purity of the BME(HS) was only slightly larger than that of the existing soil map. This was due to the small correlation length of the soil categories. The theoretical purity of the BME-maps overestimated the actual map purity, which can be partly explained by the biased estimates of the one-point bivariate probabilities of hard and soft categories of the same label. Part of the hard data is collected to describe characteristic soil profiles of the map units which explains the bias. Therefore, care must be taken when using the purposively selected data in soil information systems for calibrating the probability model. It is concluded that BME is a valuable method for spatial prediction and simulation of soil categories when the number of categories is rather small (say < 10). For larger numbers of categories, the computational burden becomes prohibitive, and large samples are needed for calibration of the probability model.
[32] Brus D J, Heuvelink G B M.2007.

Optimization of sample patterns for universal kriging of environmental variables

[J]. Geoderma, 138(1-2): 86-95.

https://doi.org/10.1016/j.geoderma.2006.10.016

The quality of maps obtained by interpolation of observations of a target environmental variable at a restricted number of locations, is partly determined by the spatial pattern of the sample locations. A method is presented for optimization of the sample pattern when the environmental variable is interpolated with the help of exhaustively known covariates, which are assumed to be linearly related to the target variable. In this method the spatially averaged universal kriging variance (MUKV), which incorporates trend estimation error as well as spatial interpolation error, is minimized. The optimal pattern is obtained using simulated annealing. The method requires that the covariance function or variogram of the regression-residuals is known. The method is tested in a case study on the Mean Highest Water table in a coversand area in The Netherlands. The patterns of 25, 50 and 100 sample locations are optimized and compared with the patterns optimized with the ordinary kriging (OK) model (assuming no trend) and with the multiple linear regression (MLR) model (assuming no spatial autocorrelation of residuals). The results show that the UK-patterns are a good compromise between spreading in geographic space and feature space. The MUKV for the UK-patterns is 19% (n = 25), 7% (n = 50) and 3% (n = 100) smaller than for the OK-patterns. Compared with the MLR-patterns the reduction is 2%, 4% and 4%, respectively.
[33] Brus D J, Kempen B, Heuvelink G B M.2011.

Sampling for validation of digital soil maps

[J]. European Journal of Soil Science, 62(3): 394-407.

https://doi.org/10.1111/j.1365-2389.2011.01364.x

The increase in digital soil mapping around the world means that appropriate and efficient sampling strategies are needed for validation. Data used for calibrating a digital soil mapping model typically are non-random samples. In such a case we recommend collection of additional independent data and validation of the soil map by a design-based sampling strategy involving probability sampling and design-based estimation of quality measures. An important advantage over validation by data-splitting or cross-validation is that model-free estimates of the quality measures and their standard errors can be obtained, and thus no assumptions on the spatial auto-correlation of prediction errors need to be made. The quality of quantitative soil maps can be quantified by the spatial cumulative distribution function (SCDF) of the prediction errors, whereas for categorical soil maps the overall purity and the map unit purities (user's accuracies) and soil class representation (producer's accuracies) are suitable quality measures. The suitability of five basic types of random sampling design for soil map validation was evaluated: simple, stratified simple, systematic, cluster and two-stage random sampling. Stratified simple random sampling is generally a good choice: it is simple to implement, estimation of the quality measures and their precision is straightforward, it gives relatively precise estimates, and no assumptions are needed in quantifying the standard error of the estimated quality measures. Validation by probability sampling is illustrated with two case studies. A categorical soil map on point support depicting soil classes in the province of Drenthe of the Netherlands (268 000 ha) was validated by stratified simple random sampling. Sub-areas with different expected purities were used as strata. The estimated overall purity was 58% with a standard error of 4%. This was 9% smaller than the theoretical purity computed with the model. 
Map unit purities and class representations were estimated by the ratio estimator. A quantitative soil map, depicting the average soil organic carbon (SOC) contents of pixels in an area of 81 600 ha in Senegal, was validated by random transect sampling. SOC predictions were seriously biased, and the random error was considerable. Both case studies underpin the importance of independent validation of soil maps by probability sampling, to avoid unfounded trust in visually attractive maps produced by advanced pedometric techniques.
[34] Burgess T M, Webster R.1980.

Optimal interpolation and isarithmic mapping of soil properties

[J]. Journal of Soil Science, 31(2): 333-341.

https://doi.org/10.1111/j.1365-2389.1980.tb02085.x

[35] Burrough P A.1989.

Fuzzy mathematical methods for soil survey and land evaluation

[J]. Journal of Soil Science, 40(3): 477-492.

https://doi.org/10.1111/j.1365-2389.1989.tb01290.x

SUMMARY The rigid-data model consisting of discrete, sharply bounded internally uniform entities that is used in hierarchical and relational databases of soil profiles, choropleth soil maps and land evaluation classifications ignores important aspects of reality caused by internal inhomogeneity, short-range spatial variation, measurement error, complexity and imprecision. Considerable loss of information can occur when data that have been classified according to this model are retrieved or combined using the methods of simple Boolean algebra available in most soil and geographical information systems. Fuzzy set theory, which is a generalization of Boolean algebra to situations where data are modelled by entities whose attributes have zones of gradual transition, rather than sharp boundaries, offers a useful alternative to existing methodology. The basic principles of fuzzy sets, operations on fuzzy sets and the derivation of membership functions according to the Semantic Import Model are explained and illustrated with data from case studies in Venezuela and Kenya.
[36] Chang C-W, Laird D A, Mausbach M J, et al.2001.

Near-infrared reflectance spectroscopy-principal components regression analyses of soil properties

[J]. Soil Science Society of America Journal, 65(2): 480-490.

https://doi.org/10.2136/sssaj2001.652480x

A fast and convenient soil analytical technique is needed for soil quality assessment and precision soil management. The main objective of this study was to evaluate the ability of near-infrared reflectance spectroscopy (NIRS) to predict diverse soil properties. Near-infrared reflectance spectra, obtained from a Perstorp NIR Systems 6500 scanning monochromator (Foss NIRSystems, Silver Spring, MD), and 33 chemical, physical, and biochemical properties were studied for 802 soil samples collected from four Major Land Resource Areas (MLRAs). Calibrations were based on principal component regression (PCR) using the first derivatives of optical density [log(1/R)] for the 1300- to 2500-nm spectral range. Total C, total N, moisture, cation-exchange capacity (CEC), 1.5 MPa water, basal respiration rate, sand, silt, and Mehlich III extractable Ca were successfully predicted by NIRS (r
[37] Clifford D, Payne J E, Pringle M J, et al.2014.

Pragmatic soil survey design using flexible Latin hypercube sampling

[J]. Computers & Geosciences, 67: 62-68.

https://doi.org/10.1016/j.cageo.2014.03.005

We review and give a practical example of Latin hypercube sampling in soil science using an approach we call flexible Latin hypercube sampling. Recent studies of soil properties in large and remote regions have highlighted problems with the conventional Latin hypercube sampling approach. It is often impractical to travel far from tracks and roads to collect samples, and survey planning should recognise this fact. Another problem is how to handle target sites that, for whatever reason, are impractical to sample: should one just move on to the next target or choose something in the locality that is accessible? Working within a Latin hypercube that spans the covariate space, selecting an alternative site is hard to do optimally. We propose flexible Latin hypercube sampling as a means of avoiding these problems. Flexible Latin hypercube sampling involves simulated annealing for optimally selecting accessible sites from a region. The sampling protocol also produces an ordered list of alternative sites close to the primary target site, should the primary target site prove inaccessible. We highlight the use of this design through a broad-scale sampling exercise in the Burdekin catchment of north Queensland, Australia. We highlight the robustness of our design through a simulation study where up to 50% of target sites may be inaccessible.
[38] Davies B E, Gamm S A.1970.

Trend surface analysis applied to soil reaction values from Kent, England

[J]. Geoderma, 3(3): 223-231.

https://doi.org/10.1016/0016-7061(70)90022-4

Soil pH data were obtained for a 2 km² area, containing a characteristic pattern of calcareous and non-calcareous soils, by sampling on a regular grid. The data were investigated by trend surface analysis and real cubic surfaces were produced for two soil depths. These surfaces accorded with predictions based on the nature and distribution of the soils and the patterns of residuals were explained satisfactorily by local pedogenetic factors.
[39] de Gruijter J J, Bierkens M F P, Brus D J, et al.2006.

Sampling for natural resource monitoring

[M]. Berlin, Germany: Springer-Verlag: 331.


[40] Dobos E, Montanarella L, Nègre T, et al.2001.

A regional scale soil mapping approach using integrated AVHRR and DEM data

[J]. International Journal of Applied Earth Observation and Geoinformation, 3(1): 30-42.

https://doi.org/10.1016/S0303-2434(01)85019-4

There is a growing need for small-scale but reasonably accurate soil databases. Compiling a soil database at continental or global scale requires a large amount of soil data that are accurate both spatially and thematically. The aim of this study was to test a small-scale soil mapping method in Italy using Advanced Very High Resolution Radiometer (AVHRR) data and digital elevation data. In a previous study in Hungary, the method was applied to a much smaller area with a significantly different soil-forming environment. The present study used an integrated database of 45 layers combining AVHRR and terrain data, including a digital elevation model (DEM), slope, curvature, aspect, potential drainage density, and the five AVHRR bands for eight different dates. The data were processed with Discriminant Analysis Feature Extraction (DAFE), which is based on a canonical analysis procedure. Two types of image (basic and transformed) were classified with the maximum likelihood classifier. Two test sets with identical geographic coverage but different levels of soil classification were chosen: one consisted of soil units (SU) of the revised FAO legend, and the other represented the major soil groupings (MSG). The best sets of layers, comprising 10, 15, 20, 25, 30, 35, 40 and 45 layers respectively, were selected with the Bhattacharyya feature selection method and classified, and the results of the different sets were compared. The performance obtained with the AVHRR images alone and with images based only on terrain data was also interpreted. The results indicate that terrain descriptors alone are not sufficient for soil classification. However, the feature selection algorithms always selected the DEM and its derivatives among the first layers, which underlines their importance for characterizing the soil landscape. When only AVHRR data were used, the test-class performance was 49.8% for the MSG and 48.6% for the SU. Adding terrain data to the AVHRR database produced relatively small improvements (4.6% and 2.8%). The best test-class performance was obtained when all available channels were used for classification, with 51.4% for the FAO SU and 54.4% for the MSG in the basic image, and 51.7% and 54.4% respectively in the DAFE-transformed images. The most informative AVHRR bands were those acquired in spring (April-May), while the most frequently occurring bands were band 1 (visible red) and bands 3 and 4.
[41] Dunn J C.1973.

A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters

[J]. Journal of Cybernetics, 3(3): 32-57.

https://doi.org/10.1080/01969727308546046

Two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space. In both cases, the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy, k-partitions of X, and simple iteration of T provides an algorithm which has the descent property relative to the least squared error criterion function. In the first case, the range of T consists largely of ordinary (i.e. non-fuzzy) partitions of X and the associated iteration scheme is essentially the well known ISODATA process of Ball and Hall. However, in the second case, the range of T consists mainly of fuzzy partitions and the associated algorithm is new; when X consists of k compact well separated (CWS) clusters, Xi, this algorithm generates a limiting partition with membership functions which closely approximate the characteristic functions of the clusters Xi. However, when X is not the union of k CWS clusters, the limiting partition is truly fuzzy in the sense that the values of its component membership functions differ substantially from 0 or 1 over certain regions of X. Thus, unlike ISODATA, the ‘fuzzy’ algorithm signals the presence or absence of CWS clusters in X. Furthermore, the fuzzy algorithm seems significantly less prone to the ‘cluster-splitting’ tendency of ISODATA and may also be less easily diverted to uninteresting locally optimal partitions. Finally, for data sets X consisting of dense CWS clusters embedded in a diffuse background of strays, the structure of X is accurately reflected in the limiting partition generated by the fuzzy algorithm. Mathematical arguments and numerical results are offered in support of the foregoing assertions.
[42] Fayyad U, Piatetsky-Shapiro G, Smyth P.1996.

From data mining to knowledge discovery in databases

[M]//Piatetsky-Shapiro G. Advances in knowledge discovery and data mining. Menlo Park: AAAI Press, 37.


[43] Gao B B, Pan Y C, Chen Z Y, et al.2016.

A spatial conditioned Latin hypercube sampling method for mapping using ancillary data

[J]. Transactions in GIS, 20(5): 735-754.

https://doi.org/10.1111/tgis.12176

Abstract For obtaining maps of good precision by the spatial inference method, the distribution of sampling sites in geographical and feature space is very important. For a regional variable with trends, the predicting error comes from trend estimation, variogram estimation and spatial interpolation. Based on the cLHS (conditioned Latin hypercube Sampling) method, a sampling method called scLHS (spatial cLHS) considering all these three aspects with the help of ancillary data is proposed in this article. Its advantage lies in simultaneously improving trend estimation, variogram estimation and spatial interpolation. MODIS data and simulated data were used as sampling fields to draw sample sets using scLHS, cLHS, cLHS with x and y coordinates as covariates, simple random and spatial even sampling methods, and the distribution and prediction errors of sample sets from different methods were evaluated. The results showed that scLHS performed well in balancing spreading in geographic and feature space, and can generate points pairs with small distances, and the sample sets drawn by scLHS produced smaller mapping error, especially when there were trends in the target variable.
[44] Godinho Silva S H, Owens P R, De Menezes M D, et al.2014.

A technique for low cost soil mapping and validation using expert knowledge on a watershed in Minas Gerais, Brazil

[J]. Soil Science Society of America Journal, 78(4): 1310-1319.

https://doi.org/10.2136/sssaj2013.09.0382

Understanding the soil attributes and types occurring within a region is critical for providing the best land-use decisions. Soils vary in their ability to clean and store water, provide water for plant growth, and many other ecosystem services. Soil variability is dependent on climate, parent material, organisms, time, and topography. When only topography varies within an area, the topography and redistribution of water should be the main drivers for soils differentiation. Digital soil mapping (DSM) has advantages due to computational tools and easily accessible digital elevation models (DEMs) at multiple resolutions. Terrain attributes (e.g., slope, wetness index, and profile curvature) are derived from the DEM and, in association with a soil expert, knowledge-based models can be applied to predict soil variability. The objective of this study was to create and validate a predicted Cambisol (Inceptisol) solum depth map for Lavrinha Creek Watershed (LCW) in Minas Gerais, Brazil, by applying DSM techniques for the Brazilian soil landscapes. The best available 30-m DEM was used to derive the terrain derivatives. A set of rules were formulated according to the terrain attributes, limited data, and expert knowledge to predict the solum depth behavior throughout the watershed. Conditioned Latin hypercube sampling scheme was used for allocating the validation points. In this study, 20 out of the 25 validating samples were correctly classified yielding a Kappa index of 0.616. Soil expert knowledge and Digital Soil Mapping techniques can be employed for mapping areas, especially in countries where there is limited data available, which will provide a useful soil map for planning while saving time and investments.
[45] Godinho Silva S H, Owens P R, Silva B M, et al. 2015. Evaluation of conditioned Latin hypercube sampling as a support for soil mapping and spatial variability of soil properties[J]. Soil Science Society of America Journal, 79(2): 603-611. https://doi.org/10.2136/sssaj2014.07.0299

In soil surveys, the number of collected samples is commonly reduced by factors that hamper field activities, such as rugged terrain and lack of roads. Conditioned Latin hypercube (CLH) sampling has been used to properly capture soil variability across the landscape, whereas cost-constrained conditioned Latin hypercube (CCLH) sampling limits the sampling to areas of easy access. The objectives of this work were to: (i) compare the efficiency of CLH and CCLH sampling systems to create soil maps, considering the number of soil classes covered per system, (ii) compare both systems to map soil A horizon thickness, and (iii) generate a detailed soil map of the study area to assist in decision making. The study was performed in Minas Gerais, Brazil. A digital elevation model (DEM) and its terrain derivatives were the basis for CLH and CCLH to determine the sampling points. The CCLH system also required a cost map that represented the difficulty of reaching every place in the area. At the sampling locations, soil information was observed, allowing for the creation of those maps that were further validated in the field. Kappa index, global index (GI), RMSE, 1:1 ratio graphic, and R² were the comparison parameters. Conditioned Latin hypercube presented higher accuracy than CCLH to represent both soil classes and soil attributes, although the samples were spread out in the area. Cost-constrained conditioned Latin hypercube was less representative than CLH, but it may contribute to soil sampling in areas of difficult access, common in developing countries, such as Brazil.
[46] Goovaerts P. 1999. Geostatistics in soil science: State-of-the-art and perspectives[J]. Geoderma, 89(1-2): 1-45. https://doi.org/10.1016/S0016-7061(98)00078-0

This paper presents an overview of the most recent developments in the field of geostatistics and describes their application to soil science. Geostatistics provides descriptive tools such as semivariograms to characterize the spatial pattern of continuous and categorical soil attributes. Various interpolation (kriging) techniques capitalize on the spatial correlation between observations to predict attribute values at unsampled locations using information related to one or several attributes. An important contribution of geostatistics is the assessment of the uncertainty about unsampled values, which usually takes the form of a map of the probability of exceeding critical values, such as regulatory thresholds in soil pollution or criteria for soil quality. This uncertainty assessment can be combined with expert knowledge for decision making such as delineation of contaminated areas where remedial measures should be taken or areas of good soil quality where specific management plans can be developed. Last, stochastic simulation allows one to generate several models (images) of the spatial distribution of soil attribute values, all of which are consistent with the information available. A given scenario (remediation process, land use policy) can be applied to the set of realizations, allowing the uncertainty of the response (remediation efficiency, soil productivity) to be assessed.
[47] Goulard M, Voltz M. 1992. Linear coregionalization model: Tools for estimation and choice of cross-variogram matrix[J]. Mathematical Geology, 24(3): 269-286. https://doi.org/10.1007/BF00893750

The geostatistical analysis of multivariate data involves choosing and fitting theoretical models to the empirical matrix. This paper considers the specific case of the model of linear coregionalization, and describes an automated procedure for fitting models, that are adequate in the mathematical sense, using a least-squares like technique. It also describes how to decide whether the number of parameters of the cross-variogram matrix model should be reduced to improve stability of fit. The procedure is illustrated with an analysis of the spatial relations among the physical properties of an alluvial soil. The results show the main influence of the scale and the shape of the basic models on the goodness of fit. The choice of the number of basic models appears of secondary importance, though it greatly influences the resulting interpretation of the coregionalization analysis.
[48] Gray J M, Bishop T F A, Wilford J R. 2016. Lithology and soil relationships for soil modelling and mapping[J]. CATENA, 147: 429-440. https://doi.org/10.1016/j.catena.2016.07.045

Parent material covariates are essential for the effective modelling and mapping of soil properties. Widely available lithology data have the potential for greater use in digital soil modelling and mapping (DSMM) programs. We compared the performance of the classified lithology data with other continuous, geophysical parent material covariates such as gamma radiometrics in digital soil models and maps over NSW. The lithology covariate was demonstrated to exert the greatest influence on all six soil properties, coming well ahead of all geophysical parent material and other environmental covariates. Validation statistics demonstrated strong improvement in both model and map quality when the lithology covariate was included. For example, Lin's concordance for the Cubist sum of bases model rose from 0.46 with no parent material covariates to 0.58 with the continuous geophysical covariates to a high of 0.77 when lithology was also used. The improvement was typically slightly less marked in the final digital maps than for the calibration models, probably due to the lower reliability of the lithology grid derived from broad scale polygonal geological and soil data. A process is suggested for the application of lithology data into DSMM programs. Despite the potential drawbacks of using polygonal data, properly organised categorical lithology data can be a strong covariate to complement other continuous geophysical data sources in DSMM programs, particularly where reliable and fine scale geological and soil data are available.
[49] Grimm R, Behrens T, Märker M, et al. 2008. Soil organic carbon concentrations and stocks on Barro Colorado Island: Digital soil mapping using Random Forests analysis[J]. Geoderma, 146(1-2): 102-113. https://doi.org/10.1016/j.geoderma.2008.05.008

Spatial estimates of tropical soil organic carbon (SOC) concentrations and stocks are crucial to understanding the role of tropical SOC in the global carbon cycle. They also allow for spatial variation of SOC in environmental process models. SOC is spatially highly variable. In traditional approaches, SOC concentrations and stocks have been derived from estimates for single or very few profiles and spatially linked to existing units of soil or vegetation maps. However, many existing soil profile data are incomplete and untested as to whether they are representative or unbiased. Also single means for soil or vegetation map units cannot characterize SOC spatial variability within these units. We here use the digital soil mapping approach to predict the spatial distribution of SOC. This relies on a soil inference model based on spatially referenced environmental layers of topographic attributes, soil units, parent material, and forest history. We sampled soils at 165 sites, stratified according to topography and lithology, on Barro Colorado Island (BCI), Panama, at depths of 0–10cm, 10–20cm, 20–30cm, and 30–50cm, and analyzed them for SOC by dry combustion. We applied Random Forest (RF) analysis as a modeling tool to the SOC data for each depth interval in order to compare vertical and lateral distribution patterns. RF has several advantages compared to other modeling approaches, for instance, the fact that it is neither sensitive to overfitting nor to noise features. The RF-based digital SOC mapping approach provided SOC estimates of high spatial resolution and estimates of error and predictor importance. The environmental variables that explained most of the variation in the topsoil (0–10cm) were topographic attributes. In the subsoil (10–50cm), SOC distribution was best explained by soil texture classes as derived from soil mapping units. 
The estimates for SOC stocks in the upper 30 cm ranged between 38 and 116 Mg ha⁻¹, with lowest stocks on midslope and highest on toeslope positions. This digital soil mapping approach can be applied to similar landscapes to refine the spatial resolution of SOC estimates.
[50] Guo S X, Meng L K, Zhu A-X, et al. 2015. Data-gap filling to understand the dynamic feedback pattern of soil[J]. Remote Sensing, 7(9): 11801-11820. https://doi.org/10.3390/rs70911801

Detailed and accurate information on the spatial variation of soil over low-relief areas is a critical component of environmental studies and agricultural management. Early studies show that the pattern of soil dynamics provides comprehensive information about soil and can be used as a new environmental covariate to indicate spatial variation in soil in low-relief areas. In practice, however, data gaps caused by cloud cover can lead to incomplete patterns over a large area. Missing data reduce the accuracy of soil information and make it hard to compare two patterns from different locations. In this study, we introduced a new method to fill data gaps based on historical data. A strong correlation between MODIS band 7 and cumulated reference evapotranspiration (CET0) has been confirmed by theoretical derivation and by the real data. Based on this correlation, data gaps in MODIS band 7 can be predicted by daily evaporation data. Furthermore, correlations among bands are used to predict soil reflectance in MODIS bands 1-6 from MODIS band 7. A location in northeastern Illinois with a large area of low-relief farmland was selected to examine this idea. The results show a good exponential relationship between MODIS band 7 and CET0^0.5 in most locations of the study area (with average R² = 0.55, p < 0.001, and average NRMSE 10.40%). A five-fold cross validation shows that the approach proposed in this study captures the regular pattern of soil surface reflectance change in bands 6 and 7 during the soil drying process, with a Normalized Root Mean Square Error (NRMSE) of prediction of 13.04% and 10.40%, respectively. Average NRMSE of bands 1-5 is less than 20%. This suggests that the proposed approach is effective for filling the data gaps from cloud cover and that the method reduces the data collection requirement for understanding the dynamic feedback pattern of soil, making it easier to apply to larger areas for soil mapping.
[51] Guo S X, Zhu A-X, Meng L K, et al. 2016. Unification of soil feedback patterns under different evaporation conditions to improve soil differentiation over flat area[J]. International Journal of Applied Earth Observation and Geoinformation, 49: 126-137. https://doi.org/10.1016/j.jag.2016.02.002

Detailed and accurate information on the spatial variation of soil types and soil properties are critical components of environmental research and hydrological modeling. Early studies introduced a soil feedback pattern as a promising environmental covariate to predict spatial variation over low-relief areas. However, in practice, local evaporation can have a significant influence on these patterns, making them incomparable at different locations. This study aims to solve this problem by examining the concept of transforming the dynamic patterns of soil feedback from the original time-related space to a new evaporation-related space. A study area in northeastern Illinois with large low-relief farmland was selected to examine the effectiveness of this idea. Images from MODIS in Terra for every April–May period over 12 years (2000–2011) were used to extract the soil feedback patterns. Compared to the original time-related space, the results indicate that the patterns in the new evaporation-related space tend to be more stable and more easily captured from multiple rain events regardless of local evaporation conditions. Random samples selected for soil subgroups from the SSURGO soil map show that patterns in the new space reveal a difference between different soil types, and these differences in patterns are closely related to the difference in the soil structure of the surface layer.
[52] Hallema D W, Lafond J A, Périard Y, et al. 2015. Long-term effects of peatland cultivation on soil physical and hydraulic properties: Case study in Canada[J]. Vadose Zone Journal, 14(6): 1-12. https://doi.org/10.2136/.vzj14.10.0147

Organic soils are an excellent substrate for commercial lettuce (L.) farming; however, drainage accelerates oxidation of the surface layer and reduces the water holding capacity, which is often lethal for crops that are sensitive to water stress. In this case study, we analyzed 942 peat samples from a large cultivated peatland complex (18.7 km²) in southern Quebec, Canada, and demonstrated from spatial and temporal patterns that agriculture resulted in a compacted layer below the root zone. We grouped the samples based on the year in which the corresponding fields were created on the previously undisturbed peatland (cutoff years 1970, 1980, 1990, and 2000) and discovered that bulk density has continued to increase, partly due to the overburden pressure, while organic matter has continued to decline since the fields were reclaimed and drained in phases between 1955 and 2006. Saturated hydraulic conductivity (K) in the upper 20 cm was remarkably lower on fields older than 10 yr (p = 0.0973 for Wilcoxon rank test), with more samples having a K< 2.0 X 10yr. Soil water available capacity (SWAC) was between approximately 5 and 33 cm on fields reclaimed after 2000, while samples from fields reclaimed before 2000 had a lower SWAC between 2 and 23 cm (groups discernible at p = 0.0203). It is possible, however, that the greatest rate of change in K and SWAC occurred within even a year of reclamation. The results of this study call for active measures to reduce organic soil degradation such as reducing tillage and on-field traffic or following a crop rotation scheme.
[53] Hengl T, de Jesus J M, Heuvelink G B M, et al. 2017. SoilGrids250m: Global gridded soil information based on machine learning[J]. PLoS One, 12(2): e0169748. https://doi.org/10.1371/journal.pone.0169748      PMID: 5313206

This paper describes the technical development and accuracy assessment of the most recent and improved version of the SoilGrids system at 250 m resolution (June 2016 update). SoilGrids provides global predictions for standard numeric soil properties (organic carbon, bulk density, Cation Exchange Capacity (CEC), pH, soil texture fractions and coarse fragments) at seven standard depths (0, 5, 15, 30, 60, 100 and 200 cm), in addition to predictions of depth to bedrock and distribution of soil classes based on the World Reference Base (WRB) and USDA classification systems (ca. 280 raster layers in total). Predictions were based on ca. 150,000 soil profiles used for training and a stack of 158 remote sensing-based soil covariates (primarily derived from MODIS land products, SRTM DEM derivatives, climatic images and global landform and lithology maps), which were used to fit an ensemble of machine learning methods (random forest and gradient boosting and/or multinomial logistic regression) as implemented in the R packages ranger, xgboost, nnet and caret. The results of 10-fold cross-validation show that the ensemble models explain between 56% (coarse fragments) and 83% (pH) of variation with an overall average of 61%. Improvements in the relative accuracy considering the amount of variation explained, in comparison to the previous version of SoilGrids at 1 km spatial resolution, range from 60 to 230%. Improvements can be attributed to: (1) the use of machine learning instead of linear regression, (2) considerable investments in preparing finer resolution covariate layers and (3) insertion of additional soil profiles. Further development of SoilGrids could include refinement of methods to incorporate input uncertainties and derivation of posterior probability distributions (per pixel), and further automation of spatial modeling so that soil maps can be generated for potentially hundreds of soil variables. 
Another area of future research is the development of methods for multiscale merging of SoilGrids predictions with local and/or national gridded soil products (e.g. up to 50 m spatial resolution) so that increasingly more accurate, complete and consistent global soil information can be produced. SoilGrids are available under the Open Data Base License.
[54] Hengl T, de Jesus J M, MacMillan R A, et al. 2014. SoilGrids1km: Global soil information based on automated mapping[J]. PLoS One, 9(8): e105992. https://doi.org/10.1371/journal.pone.0105992

Background: Soils are widely recognized as a non-renewable natural resource and as biophysical carbon sinks. As such, there is a growing requirement for global soil information. Although several global soil information systems already exist, these tend to suffer from inconsistencies and limited spatial detail. Methodology/Principal Findings: We present SoilGrids1km, a global 3D soil information system at 1 km resolution, containing spatial predictions for a selection of soil properties (at six standard depths): soil organic carbon (g kg⁻¹), soil pH, sand, silt and clay fractions (%), bulk density (kg m⁻³), cation-exchange capacity (cmol+/kg), coarse fragments (%), soil organic carbon stock (t ha⁻¹), depth to bedrock (cm), World Reference Base soil groups, and USDA Soil Taxonomy suborders. Our predictions are based on global spatial prediction models which we fitted, per soil variable, using a compilation of major international soil profile databases (ca. 110,000 soil profiles), and a selection of ca. 75 global environmental covariates representing soil forming factors. Results of regression modeling indicate that the most useful covariates for modeling soils at the global scale are climatic and biomass indices (based on MODIS images), lithology, and taxonomic mapping units derived from conventional soil survey (Harmonized World Soil Database). Prediction accuracies assessed using 5-fold cross-validation were between 23–51%. Conclusions/Significance: SoilGrids1km provide an initial set of examples of soil spatial data for input into global models at a resolution and consistency not previously available. Some of the main limitations of the current version of SoilGrids1km are: (1) weak relationships between soil properties/classes and explanatory variables due to scale mismatches, (2) difficulty to obtain covariates that capture soil forming factors, (3) low sampling density and spatial clustering of soil profile locations. 
However, as the SoilGrids system is highly automated and flexible, increasingly accurate predictions can be generated as new input data become available. SoilGrids1km are available for download via http://soilgrids.org under a Creative Commons Non Commercial license.
[55] Hengl T, Gruber S, Shrestha D P. 2004. Reduction of errors in digital terrain parameters used in soil-landscape modelling[J]. International Journal of Applied Earth Observation and Geoinformation, 5(2): 97-112. https://doi.org/10.1016/j.jag.2004.01.006

Quality of digital elevation models (DEMs) and DEM-derived products directly affects the quality of terrain analysis applications. The objective of this work was to review and systematise methods to improve geomorphic plausibility of DEMs and minimise artefacts and outliers in terrain parameters. Three approaches to the reduction of errors in DEM and DEM-derived products are described: (a) by using empirical knowledge, e.g. to adjust elevations using medial axes or stream networks; (b) by applying filtering operations and (c) by averaging terrain parameters derived from multiple realisations of DEM, i.e. using error propagation technique. The methods were tested using a 3.8 × 3.8 km sample area covering two distinct landscapes: hilland and plain with terraces. The DEM was produced by linear interpolation of contour data. The proportion of artefacts (padi terraces) in the unfiltered DEM was 17.3%. After the addition of medial axes, filtering of outliers and adjustment of elevation for streams, the proportion of padi terraces was reduced to 2.2%. Remaining errors in terrain parameters such as undefined pixels and local outliers were reduced using repeated filtering (with iterations). All terrain parameters were also calculated by averaging multiple realisations. Both the filtering approach and averaging multiple realisations give somewhat smoother maps of terrain parameters. The advantage of filtering of outliers is that it employs the spatial autocorrelation structure. The advantage of averaging multiple realisations is that it can be more easily automated. The reduction of errors improved the mapping of landform facets (classification) and solum thickness (regression). The classification accuracy increased from 51.3 to 72% and the R² of the regression model for the prediction of the solum thickness increased from 0.27 to 0.40. 
These methods can be used to filter DEMs derived from contour data and terrain parameters, but also to reduce errors in other types of gridded DEMs and DEM-derived products.
[56] Hengl T, Heuvelink G B M, Kempen B, et al. 2015. Mapping soil properties of Africa at 250 m resolution: Random forests significantly improve current predictions[J]. PLoS One, 10(6): e0125814. https://doi.org/10.1371/journal.pone.0125814      PMID: 4482144

80% of arable land in Africa has low soil fertility and suffers from physical soil problems. Additionally, significant amounts of nutrients are lost every year due to unsustainable soil management practices. This is partially the result of insufficient use of soil management knowledge. To help bridge the soil information gap in Africa, the Africa Soil Information Service (AfSIS) project was established in 2008. Over the period 2008–2014, the AfSIS project compiled two point data sets: the Africa Soil Profiles (legacy) database and the AfSIS Sentinel Site database. These data sets contain over 28 thousand sampling locations and represent the most comprehensive soil sample data sets of the African continent to date. Utilizing these point data sets in combination with a large number of covariates, we have generated a series of spatial predictions of soil properties relevant to agricultural management: organic carbon, pH, sand, silt and clay fractions, bulk density, cation-exchange capacity, total nitrogen, exchangeable acidity, Al content and exchangeable bases (Ca, K, Mg, Na). We specifically investigate differences between two predictive approaches: random forests and linear regression. Results of 5-fold cross-validation demonstrate that the random forests algorithm consistently outperforms the linear regression algorithm, with average decreases of 15–75% in Root Mean Squared Error (RMSE) across soil properties and depths. Fitting and running random forests models takes an order of magnitude more time and the modelling success is sensitive to artifacts in the input data, but as long as quality-controlled point data are provided, an increase in soil mapping accuracy can be expected. Results also indicate that globally predicted soil classes (USDA Soil Taxonomy, especially Alfisols and Mollisols) help improve continental scale soil property mapping, and are among the most important predictors. 
This indicates a promising potential for transferring pedological knowledge from data rich countries to countries with limited soil data.
[57] Hengl T, Heuvelink G B M, Rossiter D G. 2007. About regression-kriging: From equations to case studies[J]. Computers & Geosciences, 33(10): 1301-1315. https://doi.org/10.1016/j.cageo.2007.05.001

This paper discusses the characteristics of regression-kriging (RK), its strengths and limitations, and illustrates these with a simple example and three case studies. RK is a spatial interpolation technique that combines a regression of the dependent variable on auxiliary variables (such as land surface parameters, remote sensing imagery and thematic maps) with simple kriging of the regression residuals. It is mathematically equivalent to the interpolation method variously called “Universal Kriging” (UK) and “Kriging with External Drift” (KED), where auxiliary predictors are used directly to solve the kriging weights. The advantage of RK is the ability to extend the method to a broader range of regression techniques and to allow separate interpretation of the two interpolated components. Data processing and interpretation of results are illustrated with three case studies covering the national territory of Croatia. The case studies use land surface parameters derived from combined Shuttle Radar Topography Mission and contour-based digital elevation models and multitemporal-enhanced vegetation indices derived from the MODIS imagery as auxiliary predictors. These are used to improve mapping of two continuous variables (soil organic matter content and mean annual land surface temperature) and one binary variable (presence of yew). In the case of mapping temperature, a physical model is used to estimate values of temperature at unvisited locations and RK is then used to calibrate the model with ground observations. The discussion addresses pragmatic issues: implementation of RK in existing software packages, comparison of RK with alternative interpolation techniques, and practical limitations to using RK. The most serious constraint to wider use of RK is that the analyst must carry out various steps in different software environments, both statistical and GIS.
[58] Hudson B D. 1992. The soil survey as paradigm-based science[J]. Soil Science Society of America Journal, 56(3): 836-841. https://doi.org/10.2136/sssaj1992.03615995005600030027x

The soil-landscape model, on which the soil survey is based, is an operative paradigm. This article introduces concepts important to understanding paradigm theory and the nature of tacit knowledge. Students and field soil scientists should be provided explicit instruction concerning the paradigm on which soil mapping and interpretation are based. It is also recommended that more of the soil geographic relationships discovered while making detailed soil maps be described and published so that the knowledge can be communicated to others.
[59] Hughes J P, Lettenmaier D P. 1981. Data requirements for kriging: Estimation and network design[J]. Water Resources Research, 17(6): 1641-1650. https://doi.org/10.1029/WR017i006p01641

Kriging, a technique for interpolating nonstationary spatial phenomena, has recently been applied to such diverse hydrologic problems as interpolation of piezometric heads and transmissivities estimated from hydrogeologic surveys and estimation of mean areal precipitation accumulations. An important concern for users of this technique is the effect of sample size on the precision of estimates obtained. Comparisons made between conventional least squares and kriging estimators indicate that for samples of size less than approximately 50, kriging offered no clear advantage over least squares in a Bayesian sense, although kriging may be preferable from the minimax viewpoint. A network design algorithm was also developed; tests performed using the algorithm indicated that the information content of identified networks was relatively insensitive to the size of the pilot network. These results suggest that within the range of sample sizes typically of hydrologic interest, kriging may hold more potential for network design than for data analysis.
[60] Hutchinson M F. 1995. Interpolating mean rainfall using thin plate smoothing splines[J]. International Journal of Geographical Information Systems, 9(4): 385-403. https://doi.org/10.1080/02693799508902045

Thin plate smoothing splines provide accurate, operationally straightforward and computationally efficient solutions to the problem of the spatial interpolation of annual mean rainfall for a standard period from point data which contains many short period rainfall means. The analyses depend on developing a statistical model of the spatial variation of the observed rainfall means, considered as noisy estimates of standard period means. The error structure of this model has two components which allow separately for strong spatially correlated departures of observed short term means from standard period means and for uncorrelated deficiencies in the representation of standard period mean rainfall by a smooth function of position and elevation. Thin plate splines, with the degree of smoothing determined by minimising generalised cross validation, can estimate this smooth function in two ways. First, the spatially correlated error structure of the data can be accommodated directly by estimating the corresponding non-diagonal error covariance matrix. Secondly, spatial correlation in the data error structure can be removed by standardising the observed short term means to standard period mean estimates using linear regression. When applied to data, both methods give similar interpolation accuracy, and error estimates of the fitted surfaces are in good agreement with residuals from withheld data. Simplified versions of the data error model, which require only minimal summary data at each location, are also presented. The interpolation accuracy obtained with these models is only slightly inferior to that obtained with more complete statistical models. It is shown that the incorporation of a continuous, spatially varying, dependence on appropriately scaled elevation makes a dominant contribution to surface accuracy. Incorporating dependence on aspect, as determined from a digital elevation model, makes only a marginal further improvement.
[61] Isaaks E H, Srivastava R M. 1989. Applied geostatistics[M]. New York, NY: Oxford University Press.

[62] Jenny H. 1941. Factors of soil formation: A system of quantitative pedology[M]. New York, NY: McGraw-Hill.

[63] Juang K W, Liao W J, Liu T L, et al. 2008. Additional sampling based on regulation threshold and Kriging variance to reduce the probability of false delineation in a contaminated site[J]. Science of the Total Environment, 389(1): 20-28. https://doi.org/10.1016/j.scitotenv.2007.08.025      PMID: 17888495

Kriging-based delineation, when used to determine a cost-effective remediation plan, should be based on the spatial distribution of the pollutant. This study proposed an adaptive cluster sampling (ACS) approach based on the regulation threshold and kriging variance for additional sampling to improve the reliability of delineating a heavy-metal contaminated site. A reliability index for reducing the probability of false delineation was used to determine the size and configuration of additional samples. A data set of Ni concentrations in soil was used for illustration. The results showed that the additional sampled observations during ACS were clustered where the Ni concentrations were close to the regulation threshold of 200 mg kg⁻¹, and were located where the first-phased sampling density was low. Compared with a simple random sampling (SRS), the relative frequency of misclassification over the whole study area (RFMW) using ACS in a 100 replicates simulation was lower when the same sample number of pooled data was used. In addition, the spatial distribution of the local misclassification rate (LMR) showed that the area with a high-valued LMR could be reduced and that the LMR gradients in the region could be lowered by using ACS instead of SRS. The above results suggest that the proposed ACS approach could improve the reliability of kriging-based delineation of heavy-metal contaminated soils.
[64] Kempen B, Brus D J, Heuvelink G B M, et al. 2009. Updating the 1:50,000 Dutch soil map using legacy soil data: A multinomial logistic regression approach[J]. Geoderma, 151(3-4): 311-326.

https://doi.org/10.1016/j.geoderma.2009.04.023      [Cited in this article: 1]      Abstract:

The 1:50,000 national soil survey of the Netherlands, completed in the early 1990s after more than three decades of mapping, is gradually becoming outdated. Large-scale changes in land and water management that took place after the field surveys have had a great impact on the soil. Especially oxidation of peat soils has resulted in a substantial decline of these soils. The aim of this research was to update the national soil map for the province of Drenthe (2680 km²) without additional fieldwork through digital soil mapping using legacy soil data. Multinomial logistic regression was used to quantify the relationship between ancillary variables and soil group. Special attention was given to model-building as this is perhaps the most crucial step in digital soil mapping. A framework for building a logistic regression model was taken from the literature and adapted for the purpose of soil mapping. The model-building process was guided by pedological expert knowledge to ensure that the final regression model is not only statistically sound but also pedologically plausible. We built separate models for the ten major map units, representing the main soil groups, of the national soil map for the province of Drenthe. The calibrated models were used to estimate the probability of occurrence of soil groups on a 25 m grid. Shannon entropy was used to quantify the uncertainty of the updated soil map, and the updated soil map was validated by an independent probability sample. The theoretical purity of the updated map was 67%. The estimated actual purity of the updated map, as assessed by the validation sample, was 58%, which is 6% larger than the actual purity of the national soil map. The discrepancy between theoretical and actual purity might be explained by the spatial clustering of the soil profile observations used to calibrate the multinomial logistic regression models and by the age difference between calibration and validation observations.
[65] Knotters M, Brus D J, Oude Voshaar J H. 1995. A comparison of kriging, co-kriging and kriging combined with regression for spatial interpolation of horizon depth with censored observations[J]. Geoderma, 67(3-4): 227-246.

https://doi.org/10.1016/0016-7061(95)00011-C      [Cited in this article: 1]      Abstract:

Kriging combined with regression gave better results than co-kriging. Moreover, in kriging combined with regression fewer model parameters needed to be estimated. This would be even advantageous if two or more auxiliary variables were used.
[66] Kumar S, Lal R, Liu D S. 2012. A geographically weighted regression kriging approach for mapping soil organic carbon stock[J]. Geoderma, 189-190: 627-634.

https://doi.org/10.1016/j.geoderma.2012.05.022      [Cited in this article: 1]      Abstract:

Local variations in the model parameters can play an important explanatory role in the spatial modeling of soil organic carbon (SOC) stock. Linear regression models assume parameters to be spatially invariant and are unable to account for the spatially varying relationships in the variables. A recently developed approach, geographically weighted regression kriging (GWRK), was used in this study to examine the relationships between environmental variables and SOC stock for the state of Pennsylvania, USA. The specific objectives were to (i) estimate the SOC stock (kg C m⁻²) to 1.0-m depth, and (ii) compare the GWRK results with those obtained from regression kriging (RK). Data for 878 georeferenced soil profiles, extracted from National Soil Survey Center database, were divided into calibration (n = 702) and validation (n = 176) datasets. Environmental variables including temperature, precipitation, elevation, slope, geology, land use, and normalized difference vegetation index were explored and included as independent variables to establish the model for estimating the SOC stock. Results using Pennsylvania as a case study conclude that GWRK was the least biased and more accurate compared to RK for estimating the SOC stock based on the lowest root mean square error (2.61 vs. 4.61 kg m⁻²), and high R² (0.36 vs. 0.23) values. Higher stock was consistent with higher precipitation and cooler temperature of the region. Total SOC stock ranged from 1.12 to 1.18 Pg for the soils of Pennsylvania. Forests store the highest SOC stock (64% of the total), followed by croplands (22%), wetlands (2.3%), and shrubs (2%). Results show that GWRK enhances the precision for estimating the SOC stock compared to the RK since the former takes into account the spatial non-stationarity coupled with spatial autocorrelation of the residuals.
[67] Lagacherie P, McBratney A B. 2006. Chapter 1 Spatial soil information systems and spatial soil inference systems: Perspectives for digital soil mapping[J]. Developments in Soil Science, 31: 3-22.

https://doi.org/10.1016/S0166-2481(06)31001-X

[68] Li J, Heap A D. 2011. A review of comparative studies of spatial interpolation methods in environmental sciences: Performance and impact factors[J]. Ecological Informatics, 6(3): 228-241.

https://doi.org/10.1016/j.ecoinf.2010.12.003      [Cited in this article: 1]      Abstract:

Spatial interpolation methods have been applied to many disciplines. Many factors affect the performance of the methods, but there are no consistent findings about their effects. In this study, we use comparative studies in environmental sciences to assess the performance and to quantify the impacts of data properties on the performance. Two new measures are proposed to compare the performance of the methods applied to variables with different units/scales. A total of 53 comparative studies were assessed and the performance of 72 methods/sub-methods compared is analysed. The impacts of sample density, data variation and sampling design on the estimations of 32 methods are quantified using data derived from their application to 80 variables. Inverse distance weighting (IDW), ordinary kriging (OK), and ordinary co-kriging (OCK) are the most frequently used methods. Data variation is a dominant impact factor and has significant effects on the performance of the methods. As the variation increases, the accuracy of all methods decreases and the magnitude of decrease is method dependent. Irregular-spaced sampling design might improve the accuracy of estimation. The effect of sampling density on the performance of the methods is found not to be significant. The implications of these findings are discussed.
[69] Li J, Heap A D. 2014. Spatial interpolation methods applied in the environmental sciences: A review[J]. Environmental Modelling & Software, 53: 173-189.

https://doi.org/10.1016/j.envsoft.2013.12.008      [Cited in this article: 2]      Abstract:

Comparison of commonly used spatial interpolation methods in environmental science. Analysis of factors affecting the performance of spatial interpolation methods. Classification of 25 methods to illustrate their relationship. Guidelines for selecting an appropriate method for a given dataset. A list of software packages for commonly used spatial interpolation methods.
[70] Li Y, Zhu A-X, Shi Z, et al. 2016. Supplemental sampling for digital soil mapping based on prediction uncertainty from both the feature domain and the spatial domain[J]. Geoderma, 284: 73-84.

https://doi.org/10.1016/j.geoderma.2016.08.013      [Cited in this article: 1]      Abstract:

This paper presents an uncertainty-directed sampling method that can be used to design additional samples for soil mapping. The method is based on uncertainty from both the feature domain (the domain of relationships with environmental covariates) and the spatial domain (the domain of spatial autocorrelation). Existing soil samples are also taken into account. The method comprises three steps: 1) the selection of a ranked list of additional sample locations based on uncertainty from the feature domain using individual predictive soil mapping (iPSM); 2) the selection of a ranked list of additional sample locations based on uncertainty from the spatial domain using ordinary kriging; 3) the determination of a final ranked list created by merging the ranked lists from steps 1) and 2) based on both uncertainties. To evaluate the method, the three lists were used to map soil organic matter (SOM) in a 299.14 km² study area near Fuyang city in the northwest region of Zhejiang Province, China. The mapping accuracy of each list was then calculated and used to assess the effectiveness of the method. Compared with the sampling scheme based on the uncertainty from either the feature domain or the spatial domain alone, the root-mean-squared error (RMSE), with the addition of the final list based on both uncertainties, was found to be the smallest, ranging from 0.829 to 1.126, and the agreement coefficient (AC) was the largest, ranging from 0.634 to 0.737. This confirms that sampling based on two uncertainties is better than sampling based on uncertainty from either the feature domain or the spatial domain alone. The results suggest that the proposed combined additional sampling method is more effective for sampling additional points in soil mapping.
[71] Loague K. 1992. Soil water content at R-5: Part 1. Spatial and temporal variability[J]. Journal of Hydrology, 139(1-4): 233-251.

https://doi.org/10.1016/0022-1694(92)90204-9      [Cited in this article: 1]      Abstract:

This paper, the first part in a two-part series, is concerned with the interpretation of spatial and temporal variations in soil water content across a small rangeland catchment; two data sets are examined. The first data set is comprised of 25 728 soil water content measurements made at 34 sites over an 8 year period. The second data set consists of individual soil water content measurements made at 247 sites over a 6 day period. Geostatistical methods are used to describe variations in soil water content; general characterizations are made. In the companion paper the impact of antecedent soil water conditions is investigated for a suite of R-5 event simulations with a quasi-physically based rainfall-runoff model.
[72] MacMillan R A, Pettapiece W W, Nolan S C, et al. 2000. A generic procedure for automatically segmenting landforms into landform elements using DEMs, heuristic rules and fuzzy logic[J]. Fuzzy Sets and Systems, 113(1): 81-109.

https://doi.org/10.1016/S0165-0114(99)00014-7      [Cited in this article: 1]      Abstract:

A robust new approach for describing and segmenting landforms which is directly applicable to precision farming has been developed in Alberta. The model uses derivatives computed from DEMs and a fuzzy rule base to identify up to 15 morphologically defined landform facets. The procedure adds several measures of relative landform position to the previous classification of Pennock et al. (Geoderma 40 (1987) 297–315; 64 (1994) 1–19). The original 15 facets can be grouped to reflect differences in complexity of the area or scale of application. Research testing suggests that a consolidation from 15 to 3 or 4 units provides practical, relevant separations at a farm field scale. These units are related to movement and accumulation of water in the landscape and are significantly different in terms of soil characteristics and crop yields. The units provide a base for benchmark soil testing, for applying biological models and for developing agronomic prescriptions and management options.
[73] Malone B P, de Gruijter J J, McBratney A B, et al. 2011. Using additional criteria for measuring the quality of predictions and their uncertainties in a digital soil mapping framework[J]. Soil Science Society of America Journal, 75(3): 1032-1043.

https://doi.org/10.2136/sssaj2010.0280      [Cited in this article: 1]      Abstract:

In this paper we introduce additional criteria to assess the quality of digital soil property maps. Soil map quality is estimated on the basis of validating both the accuracy of the predictions and their uncertainties (which are expressed as a prediction interval [PI]). The first criterion is an accuracy measure that is different in form to the usual mean square error (MSE) because it accounts also for the prediction uncertainties. This measure is the spatial average of the statistical expectation of the mean square error of a simulated random value (MSES). The second criterion addresses the quality of the uncertainties which is estimated as the total proportion of the study area where the (1 − α)–PI covers the true value. Ideally, this areal proportion equals the nominal value (1 − α). In the Lower Hunter Valley, NSW, Australia, we used both criteria to validate a soil pH map using additional units collected from a probability sample at five depth intervals: 0 to 5, 5 to 15, 15 to 30, 30 to 60, and 60 to 100 cm. For the first depth interval (0–5 cm) in 96% of the area, the 95% PI of pH covered the true value. The root mean squared simulation error (RMSES) at this depth was 1.0 pH units. Generally, the discrepancy between the nominal value and the areal proportion in addition to the RMSES increased with soil depth, indicating largely a growing imprecision of the map and underestimation of the uncertainty with increasing soil depth. In exploring this result, conventional map quality indicators emphasized a combination of bias and imprecision particularly with increasing soil depth. There is great value in coupling conventional map quality indicators with those which we propose in this study as they target the decision making process for improving the precision of maps and their uncertainties. For our study area we discuss options for improving on our results in addition to determining the possibility of extending a similar sampling approach for which multiple soil property maps can be validated concurrently.
[74] Matheron G. 1963. Principles of geostatistics[J]. Economic Geology, 58(8): 1246-1266.

https://doi.org/10.2113/gsecongeo.58.8.1246      [Cited in this article: 2]

[75] McBratney A B, Hart G A, McGarry D. 1991. The use of region partitioning to improve the representation of geostatistically mapped soil attributes[J]. Journal of Soil Science, 42(3): 513-532.

https://doi.org/10.1111/j.1365-2389.1991.tb00427.x      [Cited in this article: 1]      Abstract:

An attempt to improve the representation of a geostatistically mapped soil attribute, clay content of the surface soil, through partitioning of the study area into two new regions was made. A topographic boundary divided the study area into hill and plain regions. Possible global non-stationarity or non-stationarity within the two newly defined regions was dealt with through the use of intrinsic random functions (IRF) of order k. Cross-validation of generalized covariance functions suggested that ordinary kriging might also have been appropriate. Exponential variogram models were subsequently fitted to the experimental variograms for each region. IRF-k block kriging and ordinary block kriging were then used as the primary methods of estimation. Both IRF-k and ordinary kriging performed badly in the vicinity of the topographic boundary when global models were used. This discontinuity was removed, at the expense of the introduction of some additional edge effects, when the hill and plain regions were kriged using models appropriate to each region. Independent zero-order generalized covariance functions with nugget and linear terms and exponential variogram models produced similar representations of clay content within each region, when used with their respective estimators. Splitting the region resulted in a 6% reduction in mean absolute deviation and a 14% reduction in mean squared deviation of predicted clay contents compared with a global model.
[76] McBratney A B, Mendonça Santos M L, Minasny B. 2003. On digital soil mapping[J]. Geoderma, 117(1-2): 3-52.

https://doi.org/10.1016/S0016-7061(03)00223-4      [Cited in this article: 4]

[77] McBratney A B, Odeh I O A, Bishop T F A, et al. 2000. An overview of pedometric techniques for use in soil survey[J]. Geoderma, 97(3-4): 293-327.

https://doi.org/10.1016/S0016-7061(00)00043-4      [Cited in this article: 2]      Abstract:

Quantitative techniques for spatial prediction in soil survey are developing apace. They generally derive from geostatistics and modern statistics. The recent developments in geostatistics are reviewed particularly with respect to non-linear methods and the use of all types of ancillary information. Additionally analysis based on non-stationarity of a variable and the use of ancillary information are demonstrated as encompassing modern regression techniques, including generalised linear models (GLM), generalised additive models (GAM), classification and regression trees (RT) and neural networks (NN). Three resolutions of interest are discussed. Case studies are used to illustrate different pedometric techniques, and a variety of ancillary data. The case studies focus on predicting different soil properties and classifying soil in an area into soil classes defined a priori. Different techniques produced different error of interpolation. Hybrid methods such as CLORPT with geostatistics offer powerful spatial prediction methods, especially up to the catchment and regional extent. It is shown that the use of each pedometric technique depends on the purpose of the survey and the accuracy required of the final product.
[78] McBratney A B, Webster R. 1983. Optimal interpolation and isarithmic mapping of soil properties: V. Co-regionalization and multiple sampling strategy[J]. Journal of Soil Science, 34(1): 137-162.

https://doi.org/10.1111/ejs.1983.34.issue-1      [Cited in this article: 1]

[79] McKenzie N J, Ryan P J. 1999. Spatial prediction of soil properties using environmental correlation[J]. Geoderma, 89(1-2): 67-94.

https://doi.org/10.1016/S0016-7061(98)00137-2      [Cited in this article: 1]      Abstract:

Conventional survey methods have efficiencies in medium to low intensity survey because they use relationships between soil properties and more readily observable environmental features as a basis for mapping. However, the implicit predictive models are qualitative, complex and rarely communicated in a clear manner. The possibility of developing an explicit analogue of conventional survey practice suited to medium to low intensity surveys is considered. A key feature is the use of quantitative environmental variables from digital terrain analysis and airborne gamma radiometric remote sensing to predict the spatial distribution of soil properties. The use of these technologies for quantitative soil survey is illustrated using an example from the Bago and Maragle State Forests in southeastern Australia. A design-based, stratified, two-stage sampling scheme was adopted for the 50,000 ha area using digital geology, landform and climate as stratifying variables. The landform and climate variables were generated using a high resolution digital elevation model with a grid size of 25 m. Site and soil data were obtained from 165 sites. Regression trees and generalised linear models were then used to generate spatial predictions of soil properties using digital terrain and gamma radiometric survey data as explanatory variables. The resulting environmental correlation models generate spatial predictions with a fine grain unmatched by comparable conventional survey methods. Example models and spatial predictions are presented for soil profile depth, total phosphorus and total carbon. The models account for 42%, 78% and 54% of the variance present in the sample respectively. The role of spatial dependence, issues of scale and landscape complexity are discussed along with the capture of expert knowledge. It is suggested that environmental correlation models may form a useful trend model for various forms of kriging if spatial dependence is evident in the residuals of the model.
[80] McSweeney K, Slater B K, Hammer R D, et al. 1994. Towards a new framework for modeling the soil-landscape continuum[M]//Amundson R R, Harden J, Singer M. Factors of soil formation: A fiftieth anniversary retrospective. Madison, WI: Soil Science Society of America, 127-145.

[Cited in this article: 1]

[81] Miller B A, Koszinski S, Wehrhan M, et al. 2015. Impact of multi-scale predictor selection for modeling soil properties[J]. Geoderma, 239-240: 97-106.

https://doi.org/10.1016/j.geoderma.2014.09.018      [Cited in this article: 1]      Abstract:

Potentially useful predictors for digital soil mapping are often overlooked. Different analysis scales should be treated as unique predictor variables. The use of multi-scale predictor variables can greatly increase model performance. Experimentation with subsets of predictor pools for data mining tools can be productive.
[82] Minasny B, McBratney A B. 2006. Latin hypercube sampling as a tool for digital soil mapping[J]. Developments in Soil Science, 31: 153-165, 606.

https://doi.org/10.1016/S0166-2481(06)31012-4      [Cited in this article: 2]

[83] Mondal A, Khare D, Kundu S, et al. 2017. Spatial soil organic carbon (SOC) prediction by regression kriging using remote sensing data[J]. The Egyptian Journal of Remote Sensing and Space Science, 20(1): 61-70.

https://doi.org/10.1016/j.ejrs.2016.06.004      [Cited in this article: 1]      Abstract:

The present study has illustrated the estimation of the soil organic carbon (SOC) distribution from point survey data (prepared after laboratory test) by a hybrid interpolation method, viz. regression kriging (RK) in a part of the Narmada river basin in the central India. In this study, eight selected predictor variables are used such as, brightness index (BI), greenness index (GI), wetness index (WI), normalized difference vegetation index (NDVI), vegetation temperature condition index (VTCI), digital elevation model (DEM), and slope and compound topographic index (CTI). The RK method has given satisfactory results as observed from the level of accuracy. Finally, the amount of SOC content in varied slope, soil and landuse categories has been analysed. Concentration of SOC has been observed to be more in low elevated areas in clay soil with mainly agricultural and vegetated lands.
[84] Moore I D, Gessler P E, Nielsen G A, et al. 1993. Soil attribute prediction using terrain analysis[J]. Soil Science Society of America Journal, 57(2): 443-452.

https://doi.org/10.2136/sssaj1993.03615995005700020026x      [Cited in this article: 1]

[85] Mulder V L, de Bruin S, Schaepman M E. 2013. Representing major soil variability at regional scale by constrained Latin Hypercube Sampling of remote sensing data[J]. International Journal of Applied Earth Observation and Geoinformation, 21: 301-310.

https://doi.org/10.1016/j.jag.2012.07.004      [Cited in this article: 3]      Abstract:

This paper presents a sparse, remote sensing-based sampling approach making use of conditioned Latin Hypercube Sampling (cLHS) to assess variability in soil properties at regional scale. The method optimizes the sampling scheme for a defined spatial population based on selected covariates, which are assumed to represent the variability of the target variables. The optimization also accounts for specific constraints and costs expressing the field sampling effort. The approach is demonstrated using a case study in Morocco, where a small but representative sample record had to be collected over a 15,000 km² area within 2 weeks. The covariate space of the Latin Hypercube consisted of the first three principal components of ASTER imagery as well as elevation. Comparison of soil properties taken from the topsoil with the existing soil map, a geological map and lithological data showed that the sampling approach was successful in representing major soil variability. The cLHS sample failed to express spatial correlation; constraining the LHS by a distance criterion favoured large spatial variability within short distances, resulting in an overestimation of the variogram's nugget and short distance variability. However, the exhaustive covariate data appeared to be spatially correlated which supports our premise that once the relation between spatially explicit remote sensing data and soil properties has been modelled, the latter can be spatially predicted based on the densely sampled remotely sensed data. Therefore, the LHS approach is considered as time and cost efficient for regional scale surveys that rely on remote sensing-based prediction of soil properties.
[86] Myers D B, Kitchen N R, Sudduth K A, et al. 2010. Combining proximal and penetrating soil electrical conductivity sensors for high-resolution digital soil mapping[M]//Viscarra Rossel R A, McBratney A B, Minasny B. Proximal soil sensing. Dordrecht, Netherlands: Springer.

[Cited in this article: 1]

[87] Nemes A, Rawls W J, Pachepsky Y A. 2006. Use of the nonparametric nearest neighbor approach to estimate soil hydraulic properties[J]. Soil Science Society of America Journal, 70(2): 327-336.

https://doi.org/10.2136/sssaj2005.0128      [Cited in this article: 1]      Abstract:

Non-parametric approaches are being used in various fields to address classification type problems, as well as to estimate continuous variables. One type of the non-parametric lazy learning algorithms, a k-Nearest Neighbor (k-NN) algorithm has been applied to estimate water retention at –33 and –1500 kPa matric potentials. Performance of the algorithm has subsequently been tested against estimations made by a neural network (NNet) model, developed using the same data and input soil attributes. We used a hierarchical set of inputs using soil texture, bulk density and organic matter content to avoid possible bias towards one set of inputs, and varied the size of the data set used to develop the NNet models and to run the k-NN estimation algorithms. Different ‘design-parameter’ settings, analogous to model parameters have been optimized. The k-NN technique showed little sensitivity to potential sub-optimal settings in terms of how many nearest soils were selected and how those were weighed while formulating the output of the algorithm, as long as extremes were avoided. The optimal settings were, however, dependent on the size of the development/reference data set. The non-parametric k-NN technique performed mostly equally well with the NNet models, in terms of root-mean-squared residuals and mean residuals. Gradual reduction of the data set size from 1600 to 100 resulted in only a slight loss of accuracy for both the k-NN and NNet approaches. The k-NN technique is a competitive alternative to other techniques to develop PTFs, especially since no re-development of PTFs is needed as new data become available.
[88] Odeh I O A, McBratney A B, Chittleborough D J. 1994. Spatial prediction of soil properties from landform attributes derived from a digital elevation model[J]. Geoderma, 63(3-4): 197-214.

https://doi.org/10.1016/0016-7061(94)90063-9      [Cited in this article: 1]      Abstract:

Digital elevation models (DEMs) provide a good way of deriving landform attributes that may be used for soil prediction. The geostatistical techniques of kriging and cokriging are increasingly being applied to predicting soil properties. Whereas ordinary kriging (and universal kriging) utilise spatial correlation to determine the coefficients of the linear predictor, cokriging involves both inter-variable correlation and spatial covariation among variables. Multi-linear regression modelling also offers an alternative to predicting a soil variable by means of covariation. The performance of predicting four soil variables by these methods and two regression-kriging models are compared. The precision and bias of prediction of the six methods were dependent on the soil variable predicted. The mean error of prediction indicates reasonably small bias of prediction for all the soil variables by almost all of the methods. With the exception of topsoil gravel, for which multi-linear regression performed best, the root mean square error showed the two regression-kriging procedures to be best. Further analysis based on the mean ranks of performance by the methods confirmed this. All the kriging methods involving covariables (landform attributes) have a more smoothing effect on the predicted values, thus minimising the influence of outliers on prediction performance. Both the methods of regression-kriging show promise for predicting sparsely located soil properties from dense observations of landform attributes derived from the DEM. Histograms of subsoil clay residuals show outliers in the data set. These outliers are more evident in multi-linear regression, ordinary kriging and universal kriging than regression-kriging. There was a clear advantage in using the regression-kriging methods on those variables which had a small correlation with the landform attributes: root mean square errors for all the soil variables are much smaller than those resulting from any of the multi-linear regression, ordinary kriging, universal kriging or cokriging methods.
[89] Odeh I O A, McBratney A B, Chittleborough D J. 1995. Further results on prediction of soil properties from terrain attributes: heterotopic cokriging and regression-kriging[J]. Geoderma, 67(3-4): 215-226.

https://doi.org/10.1016/0016-7061(95)00007-B      [Cited in this article: 2]      Abstract:

Several methods involving spatial prediction of soil properties from landform attributes are compared using carefully designed validation procedures. The methods, tested against ordinary kriging and universal kriging of the target variables, include multi-linear regression, isotopic cokriging, heterotopic cokriging and regression-kriging models A, B and C. Prediction performance by ordinary kriging and universal kriging was comparatively poor as the methods do not use covariation of the predictor variable with terrain attributes. Heterotopic cokriging outperformed isotopic cokriging because the former utilised more of the local information from the covariables. The combined regression-kriging methods generally performed well. Both the regression-kriging model C and heterotopic cokriging performed well when soil variables were predicted into a relatively finer gridded digital elevation model (DEM) and when all the local information was utilised. Regression-kriging model C generally performed best and is, perhaps, more flexible than heterotopic cokriging. Potential for further research and developments rests in improving the regression part of model C.
[90] Park S J, Vlek P L G. 2002. Environmental correlation of three-dimensional soil spatial variability: A comparison of three adaptive techniques[J]. Geoderma, 109(1-2): 117-140.

https://doi.org/10.1016/S0016-7061(02)00146-5      [Cited in this article: 1]      Abstract:

An appropriate inclusion of spatial variation of soils is becoming increasingly important for spatially distributed ecological modelling approaches. Even though soils are anisotropic vertically and laterally, most soil spatial variability studies have focused on the lateral variation of soil attributes over the landscape. This study characterizes the complexity of three-dimensional variations of individual soil attributes and examines the possibility of predicting soil property distribution using three different regression approaches: artificial neural networks (ANN), regression trees (RT) and general linear models (GLM). Thirty-two physiochemical attributes of 502 soil samples were collected from 64 soil profiles on a slope at Bicknoller Combe, Somerset, UK. After a principal component analysis, five soil attributes were selected to test for environmental correlation, assuming they reflect dominant pedological processes at the hillslope. Vegetation occurrence, soil types, terrain parameters and soil sample depth were used as predictors. Prediction using environmental variables was most successful for soil attributes whose spatial distribution is strongly influenced by lateral hydrological and slope processes with relatively simple depth functions (e.g. total exchangeable bases, Mn oxides and soil pH). These soil attributes also showed a high mobility, which implies that their spatial distribution quickly reaches an equilibrium with current slope processes. Soil taxonomic information only marginally improved the performance of models constructed from surface information such as vegetation and terrain parameters. On the other hand, soil attributes whose vertical distribution is strongly governed by vertical pedogenesis or unknown factors were poorly modelled by environmental variables due to stronger nonlinearity in their vertical distribution. Soil taxonomic information becomes more important for predicting these soil attributes. 
As an empirical modelling tool, GLM with interaction terms outperformed the other two methods tested, ANN and RT, in terms of both the simplicity of the model structure and the performance of derived empirical functions.
[91] Qi F, Zhu A X. 2003. Knowledge discovery from soil maps using inductive learning[J]. International Journal of Geographical Information Science, 17(8): 771-795.

https://doi.org/10.1080/13658810310001596049      [Cited in this article: 2]      Abstract:

This paper develops a knowledge discovery procedure for extracting knowledge of soil-landscape models from a soil map. It has broad relevance to knowledge discovery from other natural resource maps. The procedure consists of four major steps: data preparation, data preprocessing, pattern extraction, and knowledge consolidation. In order to recover true expert knowledge from the error-prone soil maps, our study pays specific attention to the reduction of representation noise in soil maps. The data preprocessing step has exhibited an important role in obtaining greater accuracy. A specific method for sampling pixels based on modes of environmental histograms has proven to be effective in terms of reducing noise and constructing representative sample sets. Three inductive learning algorithms, the See5 decision tree algorithm, Naïve Bayes, and artificial neural network, are investigated for a comparison concerning learning accuracy and result comprehensibility. See5 proves to be an accurate method and produces the most comprehensible results, which are consistent with the rules (expert knowledge) used in producing the soil map. The incorporation of spatial information into the knowledge discovery process is found not only to improve the accuracy of the extracted knowledge, but also to add to the explicitness and extensiveness of the extracted soil-landscape model.
