Satellite remote sensing of phytoplankton biomarker pigments: a statistical learning approach
Phytoplankton are the foundation of marine food webs and oceanic carbon sequestration, yet many questions remain about their future in a changing climate and under expanding human uses of the seas. Because of their spatial and temporal coverage, satellite images are an important data source for phytoplankton monitoring and research at broad spatial scales. Over recent decades, researchers have proposed several algorithms to map different aspects of phytoplankton community composition from space. However, these algorithms have not been fully validated, and inter-comparisons between different algorithm types have been inconclusive. We therefore developed a series of increasingly complex algorithms that predict the global spatial distribution of pigments serving as biomarkers for various phytoplankton taxa, building on recent advances in statistical learning methods for spatial data. We tuned, tested, and selected the algorithms by means of double spatial block cross-validation, an approach for finding models that generalize well to new study areas and for estimating extrapolation errors in regions that have no in-situ training data. We present global maps of biomarker pigments based on the best-performing algorithms and discuss which types of input data (abundance, spectral, and environmental) are the best predictors for each pigment and its associated phytoplankton communities.
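As an illustration of the general idea behind double (nested) spatial block cross-validation, the following minimal Python sketch may help. It is not the authors' implementation: the one-predictor ridge regression is only a stand-in for the algorithms discussed here, and the function names, the 30-degree block size, the fold counts, and the synthetic data are all assumptions made for this example. The outer loop holds out whole spatial blocks to estimate error in unsampled regions, while the inner loop repeats the block split on the remaining data to choose a tuning parameter.

```python
import math
import random


def block_key(sample, block_size):
    """Return the (column, row) index of the lon/lat block containing a sample."""
    return (math.floor(sample["lon"] / block_size),
            math.floor(sample["lat"] / block_size))


def block_folds(samples, block_size, n_folds):
    """Assign each sample a fold so that no spatial block is split across folds."""
    blocks = sorted({block_key(s, block_size) for s in samples})
    fold_of_block = {b: i % n_folds for i, b in enumerate(blocks)}
    return [fold_of_block[block_key(s, block_size)] for s in samples]


def fit_ridge(train, lam):
    """Fit y = intercept + slope * x with ridge penalty lam on the slope."""
    n = len(train)
    x_mean = sum(s["x"] for s in train) / n
    y_mean = sum(s["y"] for s in train) / n
    sxx = sum((s["x"] - x_mean) ** 2 for s in train)
    sxy = sum((s["x"] - x_mean) * (s["y"] - y_mean) for s in train)
    slope = sxy / (sxx + lam)
    return slope, y_mean - slope * x_mean


def rmse(model, test):
    """Root-mean-square prediction error of a fitted model on held-out samples."""
    slope, intercept = model
    sq = sum((s["y"] - (intercept + slope * s["x"])) ** 2 for s in test)
    return math.sqrt(sq / len(test))


def block_cv_rmse(samples, lam, block_size, n_folds):
    """Inner loop: mean held-out RMSE of one candidate lam under block CV."""
    folds = block_folds(samples, block_size, n_folds)
    errors = []
    for k in range(n_folds):
        train = [s for s, f in zip(samples, folds) if f != k]
        test = [s for s, f in zip(samples, folds) if f == k]
        if train and test:
            errors.append(rmse(fit_ridge(train, lam), test))
    return sum(errors) / len(errors)


def double_block_cv(samples, lams, block_size=30.0, n_folds=4):
    """Outer loop: tune lam on the training blocks only, then test on held-out blocks."""
    outer = block_folds(samples, block_size, n_folds)
    results = []
    for k in range(n_folds):
        train = [s for s, f in zip(samples, outer) if f != k]
        test = [s for s, f in zip(samples, outer) if f == k]
        if not train or not test:
            continue
        best_lam = min(
            lams, key=lambda lam: block_cv_rmse(train, lam, block_size, n_folds - 1)
        )
        results.append((best_lam, rmse(fit_ridge(train, best_lam), test)))
    return results


# Synthetic stand-in data (hypothetical): one predictor x, one response y.
rng = random.Random(0)
samples = []
for _ in range(200):
    x = rng.uniform(0.0, 1.0)
    samples.append({"lon": rng.uniform(-180.0, 180.0),
                    "lat": rng.uniform(-90.0, 90.0),
                    "x": x,
                    "y": 2.0 * x + rng.gauss(0.0, 0.1)})

lams = [0.0, 0.1, 1.0, 10.0]
results = double_block_cv(samples, lams)
for lam, err in results:
    print(f"selected lam={lam}, held-out RMSE={err:.3f}")
```

Because the held-out blocks never contribute to tuning, the outer errors can be read as an estimate of how the selected model performs in regions without training data. A real application would likely need larger or differently shaped blocks, depending on the spatial autocorrelation of the data.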