Introduction

Overview

This Rmarkdown tutorial provides practical instructions, illustrated with a sample dataset, on how to generate and evaluate sampling plans using your own data. It focuses on preparing sampling designs for predictive mapping, analyzing and interpreting existing point data, and planning second- and third-round sampling based on initial models. A similar tutorial focusing on Spatial and spatiotemporal interpolation using Ensemble Machine Learning is also available.

We use several key R packages and existing tutorials including:

sp package,
clhs package,
mlr package,
ranger package,
forestError package.

Other packages of interest for producing spatial sampling:

SamplingBigData package,
sf package,
spatstat package(s).

For an introduction to Spatial Data Science and Machine Learning with R, we recommend first studying:

Baddeley, A., Rubak, E. and Turner, R.: “Spatial Point Patterns: Methodology and Applications with R”;
Becker, M. et al.: “mlr3 book”;
Irizarry, R.A.: “Introduction to Data Science: Data Analysis and Prediction Algorithms with R”;
Molnar, C.: “Interpretable Machine Learning: A Guide for Making Black Box Models Explainable”;
Lovelace, R., Nowosad, J. and Muenchow, J.: “Geocomputation with R”;
Pebesma, E. and Bivand, R.: “Spatial Data Science: with applications in R”.

If you are looking for a gentler introduction to spatial sampling methods in R, please refer to Bivand, Pebesma, & Rubio (2013), Baddeley, Rubak, & Turner (2015), D. J. Brus (2019) and D. J. Brus (2021). The “Spatial sampling with R” book by Dick Brus and its R code examples are available via https://github.com/DickBrus/SpatialSamplingwithR.

For an introduction to Predictive Soil Mapping using R, refer to https://soilmapper.org.

Machine Learning with resampling in Python can best be implemented via the scikit-learn library, whose functionality matches what is available via the mlr package in R.

To install the most recent landmap, ranger, forestError and clhs packages from GitHub, use:

# Requires devtools; if needed, run install.packages("devtools") first
library(devtools)
devtools::install_github("envirometrix/landmap")
devtools::install_github("imbs-hl/ranger")
devtools::install_github("benjilu/forestError")
devtools::install_github("pierreroudier/clhs")
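
After installation, it can be useful to confirm that the packages are actually available before running the rest of the tutorial. The following is a minimal sketch, not part of the original tutorial: the vector `pkgs` and the variable names are illustrative, and the package list simply repeats the names mentioned above.

```r
# Packages named in this tutorial (illustrative list; adjust as needed)
pkgs <- c("sp", "clhs", "mlr", "ranger", "forestError", "landmap")

# requireNamespace() returns TRUE/FALSE without attaching the package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that is still missing
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are available.")
}
```

Any package reported as missing can then be installed from CRAN or GitHub as described above.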

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Acknowledgements

This tutorial is based on the “R for Data Science” book by Hadley Wickham and contributors.

OpenLandMap is a collaborative effort and many people have contributed data, software, fixes and improvements via pull requests. OpenGeoHub is an independent not-for-profit research foundation promoting Open Source and Open Data solutions. EnvirometriX Ltd. is the commercial branch of the group, responsible for preparing soil sampling designs for the AgriCapture and similar soil monitoring projects.


AgriCaptureCO2 receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101004282.

1 Generating spatial sampling