October 8, 2024
Preface
Please note: this is a reproduction of a peer-reviewed article published by the Canadian Journal of Remote Sensing, 50(1) (Taylor & Francis online). This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The terms on which this article has been published allow the posting of the Accepted Manuscript in a repository by the author(s) or with their consent.
Citation: Donnini, J., Kross, A., & Alejo, C. (2024). Spectral Diversity as a Predictor of Tree Diversity: Exploring Challenges and Opportunities Across Forest Ecosystems. Canadian Journal of Remote Sensing, 50(1). https://doi.org/10.1080/07038992.2024.2403495
Authors
Jennifer Donnini, Angela Kross, Camilo Alejo
Publisher
Published by Informa UK Limited, trading as Taylor & Francis Group.
Abstract
Forests are crucial for ecosystem health, climate regulation, and biodiversity. However, many tree species face extinction threats, requiring active monitoring for conservation. The spectral variation hypothesis (SVH) suggests that spectral diversity can serve as a proxy for ground-measured biodiversity. Despite its promise, SVH’s application has shown inconsistent results, complicating its use in biodiversity monitoring. This study examines the relationship between tree diversity and Sentinel-2-derived spectral diversity across Quebec’s forests, analyzing 2531 inventory plots using a combination of spectral analysis, cluster analysis and random forest (RF) regressions. We evaluate four biodiversity indices: species richness, Shannon diversity, functional dispersion, and percent conifer. Our analysis reveals overlapping spectral signatures that make it challenging to differentiate between varying levels of species richness, Shannon diversity, and functional dispersion. However, percent conifer shows spectral separability and can be stratified using unsupervised k-means clustering. Using RF regression models, only the percent conifer models demonstrated strong performance (R² = 0.77), while models for the other biodiversity indices did not exceed an R² of 0.46. This study highlights the complex relationship between spectral diversity and tree diversity, and suggests that future research should aim to improve the understanding of the relationship, or lack thereof, between ground-measured biodiversity indices and relatable spectral metrics.
1. Introduction
1.1. Importance of forests and forest biodiversity
Tree species worldwide face threats due to human-induced and natural factors. Alarmingly, close to 30% of tree species are threatened with extinction (Rivers et al. 2023). Major threats to tree species include land clearing for agriculture, livestock farming, and urban development, logging, fire, energy production, and the invasion of non-native species (Ellison et al. 2005; Rivers et al. 2023). Forest loss is especially concerning given that forests provide shelter for roughly 80% of amphibian species, 75% of birds and 68% of mammals (Rivers et al. 2023). Forests are also a cornerstone for addressing societal challenges, such as climate change, food and water security, and biodiversity loss, through Nature-Based Solutions (NbS) (Chen et al. 2023; Mensah et al. 2023; Mori et al. 2021; Rivers et al. 2023). Additionally, they supply economic assets, including timber and medicinal resources, while also catering to cultural, spiritual, and recreational needs (Rivers et al. 2023). The diversity within forests is particularly important, as it underpins all these benefits. Forest diversity plays an important role in ecosystem functioning (Ellison et al. 2005), and positive links between tree species diversity and the provision of ecosystem services such as soil carbon storage and berry, game, and biomass production, among others, have been reported (Liang et al. 2016; Mori et al. 2021; Paquette & Messier 2011; Rivers et al. 2023). Boreal forests are characterized by low functional redundancy; they contain a limited variety of species performing similar ecological roles (Wirth 2005). Ecosystems with low functional diversity or redundancy have fewer safeguards to preserve essential processes and services in response to disturbances or shifts in climate (Ellison et al. 2005; Thom et al. 2021).
Therefore, monitoring the diversity of forests is crucial, especially the boreal forests, as their lower redundancy makes them less resilient to anthropogenic pressures, which could affect their capacity to provide essential ecosystem services and contribute to NbS.
1.2. Remote sensing of forest biodiversity
Biodiversity has commonly been categorized into three essential components: taxonomic diversity, functional diversity and structural diversity (Lausch et al. 2016; Bracy Knight et al. 2020). Taxonomic diversity refers to the number of distinct taxonomic entities, which can be measured at various levels such as species and families. Functional diversity refers to the variety of biological functions and processes that different entities perform. Structural diversity involves the physical structure and configuration. Methods used to monitor forest biodiversity typically include field surveys or forest inventories measuring variables such as species presence, wood volumes, age, height, and disturbance (Gillis et al. 2005). Although field surveys are recognized as the gold standard for biodiversity monitoring, they demand significant time and labor, making them challenging to implement on a large or global scale (Liu et al. 2023; Rocchini et al. 2016; Xi et al. 2023). Remote sensing can complement field surveys by offering the potential for efficient, consistent monitoring over time and space, and in places where field surveys would be difficult due to environmental or even political factors (Hernández-Stefanoni et al. 2012; Oldeland et al. 2010; Rocchini et al. 2016; Wang & Gamon 2019). In recent years, there has been growing interest in the use of satellite-derived spectral diversity as a predictor of plant biodiversity based on the spectral variation hypothesis (SVH) (Kacic and Kuenzer 2022; Turner 2014; Wang & Gamon 2019).
1.2.1. The spectral variation hypothesis
The SVH, originally outlined by Palmer et al. (2000, 2002), suggests that the spectral variability of a remotely sensed image is correlated with the variation in plant species richness on the ground. The rationale is that spectral diversity represents habitat diversity, and areas with greater habitat diversity harbor a greater number of species (Kacic and Kuenzer 2022; Rocchini et al. 2016; Torresani et al. 2019; Wang & Gamon 2019). Thus, rather than directly quantifying taxonomic, functional, or structural diversity, the SVH estimates it by using environmental heterogeneity as a proxy, based on the assumption that greater environmental variability corresponds to increased species diversity (Torresani et al. 2019).
1.2.2. Spectral diversity metrics
Spectral diversity metrics that attempt to capture environmental heterogeneity can be classified into three categories: continuous metrics based on vegetation indices, continuous metrics based on spectral information content and the ‘spectral species’ approach (Wang and Gamon 2019). Vegetation indices capture the distinct absorption and reflection patterns of the electromagnetic spectrum by plants. Plants primarily absorb light in the blue and red wavelengths for photosynthesis, their leaf and canopy structure affects the reflectance in the NIR range, and leaf and canopy water content absorb light in the SWIR range (Zeng et al. 2022). The Normalized Difference Vegetation Index (NDVI) stands out as the predominant index employed in forest spectral diversity research (Kacic and Kuenzer 2022). Nonetheless, other popular indices, such as the enhanced vegetation index (EVI) and the simple ratio index (SR) are also recognized for their ability to predict species richness (Kacic and Kuenzer 2022; Liu et al. 2023; Madonsela et al. 2017). Metrics based on spectral information content summarize the spectral data into statistical measures that highlight the variability or overall information content in the image (Wang and Gamon 2019). Examples include the coefficient of variation (Liu et al. 2023), image texture (Hernández-Stefanoni et al. 2012), or the distance from the spectral centroid (Rocchini et al. 2004). These measures represent the dimensionality of the data, which can be correlated with taxonomic, functional, or structural diversity in the study area (Wang and Gamon 2019). Among these measures, image texture has gained increasing attention as a tool for measuring image heterogeneity. Image texture is linked to spatial and structural variability, which can serve as a reflection of environmental heterogeneity and by extension biodiversity (Hernández-Stefanoni et al. 2012; Madonsela et al. 2017; Stein et al. 2014). 
Multiple studies have revealed connections between image texture and tree diversity (Hernández-Stefanoni et al. 2012; Liu et al. 2023; Madonsela et al. 2017; Peña-Lara et al. 2022), and vegetation structural diversity (Wood et al. 2012). The third category of spectral diversity is based on the ‘spectral species’ concept. This approach involves assigning each pixel in an image to a discrete class, i.e., a ‘spectral species’ instead of a taxonomic species, using unsupervised clustering (Féret and Asner 2014; Rocchini et al. 2022a, 2022b; Wang and Gamon 2019). Each spectral species is characterized by its unique set of spectral properties, which can serve as proxies for identifying and categorizing them into taxonomic, functional, or community groups depending on the pixel scale (Rocchini et al. 2022a). Studies have also used unsupervised clustering for broad ecosystem classifications (Coops et al. 2009; Powers et al. 2013). In line with the SVH, the observation of a large number of classes in a specific region would suggest considerable environmental diversity, which could reflect a large number of species in that area (Schmidtlein & Fassnacht 2017).
Many studies have tested the SVH using a variety of spectral diversity metrics and in various ecosystems including wetlands (Rocchini et al. 2004), savannahs (Madonsela et al. 2017, 2021; Oldeland et al. 2010), tropical forests (Féret and Asner 2014; Hernández-Stefanoni et al. 2012), coniferous forests (Torresani et al. 2019), Mediterranean mountains (Levin et al. 2007), and dry forests (Peña-Lara et al. 2022). These studies have incorporated multispectral (Hernández-Stefanoni et al. 2012; Levin et al. 2007; Liu et al. 2023; Madonsela et al. 2017, 2021; Rocchini et al. 2004; Torresani et al. 2019; Warren et al. 2014), hyperspectral (Oldeland et al. 2010; Féret & Asner 2014), and even leaf-level sensors (Frye et al. 2021). Despite the use of various spectral diversity metrics and sensors across different scales, the predictive efficacy of species diversity models based on the SVH remains inconsistent across and within different forest types.
1.3. Gaps and research objectives
In the research investigating the Spectral Variation Hypothesis (SVH), the prevailing methodology has largely been a “top-down” approach, emphasizing the inclusion of a wide array of predictors in models to determine their significance in predicting different measures of biodiversity. These studies focus primarily on the final R² of the models and the correlation between various spectral diversity predictors (indices, spectral information content, individual bands) and biodiversity indices. However, this approach, while comprehensive, often fails to relate the potentially relevant information offered by spectral properties or signatures of the study plots themselves. In this study, we define a “bottom-up” approach as one that investigates the relationship between spectral reflectance variability and ground-measured biodiversity indices, starting with the spectral signatures of plots with different biodiversity levels (e.g., high, low). Our study makes a unique contribution to the area by examining the spectral separability of different biodiversity levels on the ground. If plots with low and high levels of biodiversity are spectrally indistinguishable at this level, i.e., they produce identical or closely matching spectral signatures, any method relying on differentiating them with spectral data will face significant challenges. An exploration of the spectral variability related to the variability of ground data may offer a more targeted exploration with the opportunity to explain why certain biodiversity indices and predictors perform better than others across different regions and ecosystems, enhancing the interpretability of models based on this hypothesis.
The overall objective of this study is to integrate ground data, satellite imagery, and machine learning models to systematically investigate the relationship between spectral metrics and environmental variables with ground-measured biodiversity indices within and across Quebec’s deciduous, mixed, and boreal forest types. To explore the relationships, we adopted a parsimonious approach, starting with simple linear correlations and progressing to clustering methods and random forest models. The novelty of our study lies in the systematic investigation of the relationship between ground-measured biodiversity indices and spectral reflectance. Using a bottom-up approach, and data from over 2500 forest inventory plots, presents us with a unique opportunity to investigate and determine if spectral information or diversity can be used to predict ground-measured biodiversity. The use of such an extensive ground dataset is a significant aspect of our research, enhancing the robustness and reliability of our findings.
Overall, this research will allow us to examine the SVH within Quebec’s forests, contributing to the understanding of the challenges, limitations, and opportunities for effectively monitoring forest biodiversity using spectral diversity.
The sub-objectives of this study are then to:
- Assess biodiversity across forest inventory plots spanning three forest types, using four indices: species richness, Shannon diversity, functional dispersion, and percent conifer, which represent the number of species, evenness, function, and relative abundance, respectively.
- Evaluate the relationships between Sentinel-2-derived spectral signatures and the four biodiversity indices across Quebec, the three forest types, and at different phenological stages (start of season, peak of season and end of season)
- Investigate the relationship between unsupervised spectral clusters and the four indices to determine if the indices can be stratified using the base spectral information (bands) from Sentinel-2
- Using Sentinel-2 spectral data, and derived image heterogeneity metrics, alongside environmental and climatic variables, employ Random Forest (RF) variable selection to identify the most effective predictor variables for the four diversity indices. Variables for each region will then be categorized by type, with attention to the spectral data’s seasonal origin, to understand regional differences. Use the most effective predictor variables identified through variable selection in RF models to test their effectiveness in estimating the four diversity indices.
2. Methods
2.1. Study area: Quebec’s boreal, mixed, and deciduous forests
This study was conducted in Quebec, Canada, covering a significant expanse of the province’s southern region (Figure 1). Forests make up 900,000 km² of the province and comprise 2% of forest area globally (Boulanger et al. 2022; MRNF 2021 n.d.). A significant portion of the forests (92%) are publicly owned and under government management, playing a crucial role in the province’s economy by providing approximately 80,000 jobs (Boulanger et al. 2022, 2023). They also provide essential habitats for an impressive array of wildlife, including over 200 species of birds, 60 species of mammals, 40 species of amphibians and reptiles, and more than 100 fish species (Ministere des Forets and Faune et Parcs (MFFP) 2015). The forests in this region are vulnerable to wildfires, insect outbreaks, and climate change (Boulanger et al. 2022; Chavardès et al. 2023). Our study area encompasses three forest ecosystems: deciduous, mixed, and boreal forests. In southern Quebec, deciduous forests are common; they predominantly comprise broadleaf species such as sugar maple (Acer saccharum), red maple (Acer rubrum), American beech (Fagus grandifolia), and yellow birch (Betula alleghaniensis), along with temperate conifer species such as white pine (Pinus strobus) and hemlock (Tsuga.) (Morneau et al. 2021). The average annual temperature in the deciduous forest region is 4.0 °C and it has an average annual precipitation of approximately 999.3 mm (Hijmans et al. 2005). As one moves further north, the mixed forest zone forms the transition between the northern deciduous zone and the boreal. Here there are mixed stands of yellow birch, balsam fir (Abies balsamea), spruce and maple species (Morneau et al. 2021). The average annual temperature in the mixed forest region is 2.1 °C and it has an average annual precipitation of approximately 1001.7 mm (Hijmans et al. 2005).
Further north are the closed boreal forests, dominated by balsam fir, black spruce (Picea mariana), paper birch (Betula papyrifera) and trembling aspen (Populus tremuloides) (Morneau et al. 2021). The average annual temperature in the boreal forest region drops to −0.77 °C and it has an average annual precipitation of approximately 970.5 mm (Hijmans et al. 2005).
Figure 1. Quebec, Canada forest types and forest inventory plots measured in 2018, 2019, and 2020 (MRNF 2016; MRNF 2022).
2.2. Ground data: Quebec’s forest inventory
Tree diversity data were obtained from Quebec’s Ministère des Ressources Naturelles et des Forêts (MRNF), which conducts forest inventories every 10 years (MRNF 2022). Each forest survey records the number of trees, the species, tree characteristics such as diameter and quality, as well as ecological characteristics of the plot (details about the forest inventory can be found in the forest inventory document, https://mffp.gouv.qc.ca/nos-publications/norme-placettes-echantillons-permanentes-cinquieme-inventaire/). Plots have a radius of 11.28 m (area 400 m²), and only trees with a diameter at breast height greater than 90 mm are counted. In this study, we used the permanent plots last measured in 2018, 2019, and 2020 (n = 2668). We extracted data on the number of living tree species from each circular plot, for every measurement year, using Microsoft Access SQL queries. After extracting the number of species, we extracted the count of each species in each plot to calculate diversity indices in R representing different aspects of biodiversity such as richness, evenness, relative abundance, and functional diversity (R Core Team 2023). Species richness is the count of unique species in each plot. The Shannon diversity index evaluates both the count of species and their relative abundance (Appendix S2: Table S1). It has been suggested that Shannon diversity aligns more accurately with environmental heterogeneity metrics, as it better represents the dominant structure of the vegetation (Liu et al. 2023; Madonsela et al. 2017; Oldeland et al. 2010). Additionally, given that spectral signatures correspond closely to predominant vegetation types, we calculated a percent conifer index for each plot (Appendix S2: Table S1). In addition to species diversity metrics, we also calculated one measure of functional diversity, functional dispersion (FDis), using the FD package in R (Laliberté and Legendre 2010; R Core Team 2023) (Appendix S2: Table S1).
Functional diversity, unlike species richness, focuses on the variety of traits within a plant community and is directly related to ecosystem functioning. These traits encompass morphological (physical structure), physiological (functioning), and phenological (timing of life cycle events) aspects that influence a plant’s development, persistence, and reproductive success (Thom et al. 2021). FDis calculates the average distance of each species from a central point in a space defined by multiple traits, adjusting the central point’s position based on species abundances and scaling each species’ distance by its abundance (Laliberté and Legendre 2010). FDis is not impacted by the number of species present and has no upper limit (Laliberté and Legendre 2010). In this study, we calculated FDis using the species information recorded for each forest inventory plot and traits from The Enhanced Species-Level Trait Dataset (Díaz et al. 2022). This dataset provides mean functional trait values, including plant height, stem-specific density, leaf area, leaf mass per area, leaf nitrogen content per dry mass, and diaspore mass for over 46,000 different plant species (Díaz et al. 2022). In our analysis, we used the traits that contained the most data for the tree species in our study area. Specifically, we used plant height (m), leaf area (mm²), and leaf nitrogen content per dry mass (mg/g) to calculate functional dispersion. We had to exclude plots containing willow species (Salix spp.; n = 93) from further analysis. Because the forest inventory did not provide specific species names for this group, and a large number of willow species, including both shrubs and trees, are found in Quebec, we were unable to extract functional traits for these plots from the database.
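As an illustration of the four plot-level indices, the following is a minimal Python sketch rather than the R workflow used in the study. It assumes the commonly used natural-log form of the Shannon index, a stem-count definition of percent conifer, and trait values that have already been standardized across the species pool; the exact formulas used in the study are given in Appendix S2: Table S1, which is not reproduced here. All function names and the example plot are hypothetical.

```python
import math

def species_richness(counts):
    """Number of species with at least one living stem in the plot."""
    return sum(1 for n in counts.values() if n > 0)

def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over species proportions p_i."""
    total = sum(counts.values())
    props = [n / total for n in counts.values() if n > 0]
    return -sum(p * math.log(p) for p in props)

def percent_conifer(counts, conifers):
    """Share of stems (in %) that belong to species flagged as conifers."""
    total = sum(counts.values())
    return 100.0 * sum(n for sp, n in counts.items() if sp in conifers) / total

def functional_dispersion(counts, traits):
    """Abundance-weighted mean distance to the abundance-weighted centroid.

    `traits` maps species -> list of trait values, ASSUMED to be standardized
    already (mean 0, unit variance across the species pool).
    """
    species = [sp for sp, n in counts.items() if n > 0]
    total = sum(counts[sp] for sp in species)
    n_traits = len(traits[species[0]])
    centroid = [
        sum(counts[sp] * traits[sp][t] for sp in species) / total
        for t in range(n_traits)
    ]
    return sum(counts[sp] * math.dist(traits[sp], centroid) for sp in species) / total

# Hypothetical plot: stem counts per species (not data from the study).
plot = {"Abies balsamea": 12, "Betula papyrifera": 4, "Picea mariana": 4}
conifers = {"Abies balsamea", "Picea mariana"}
print(species_richness(plot), shannon_diversity(plot), percent_conifer(plot, conifers))
```

In this sketch a single-species plot has an FDis of zero, because the only species sits exactly on the centroid.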
2.3. Satellite, climatic, and environmental data
We obtained satellite imagery over the study area using Google Earth Engine (GEE). Specifically, we extracted Sentinel-2 Harmonized surface reflectance data (Appendix S2: Table S2) over three time periods corresponding to the start (May 15 – June 20), peak (July 15 – August 20) and end (September 15 – October 20) of the growing season. We used the Sentinel-2 Scene Classification layer to remove clouds and shadows from the study area. We also masked out all non-forested pixels using the ESA WorldCover 10 m v100 dataset (Zanaga et al. 2021). Given the large size of the study area, we calculated the median pixel values for each period over three years (2018, 2019, and 2020) to ensure image coverage of the entire area. Notably, the majority of the plots analyzed were surveyed in 2019. We imported the latitude and longitude coordinates of each plot into GEE and buffered them with a radius of 11.28 meters, corresponding to the size of the forest plot on the ground. Then we extracted the mean value of the Sentinel-2 bands within each plot for each time period.
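The compositing step can be pictured with a small Python sketch that works on one pixel and one band at a time. It is not the GEE code used in the study. The set of Scene Classification (SCL) classes that are masked is an assumption, since the text states only that clouds and shadows were removed, and the function name and example values are hypothetical.

```python
from statistics import median

# SCL classes treated as unusable in this sketch (an assumed selection):
# 3 = cloud shadow, 8 = cloud medium probability,
# 9 = cloud high probability, 10 = thin cirrus.
MASKED_SCL = {3, 8, 9, 10}

def median_composite(observations):
    """Median of the clear observations for one pixel and one band.

    `observations` is a list of (reflectance, scl_class) pairs covering every
    acquisition in one seasonal window across the three years. Returns None
    when every acquisition was masked.
    """
    clear = [refl for refl, scl in observations if scl not in MASKED_SCL]
    return median(clear) if clear else None

# Hypothetical peak-season observations for a single pixel (SCL 4 = vegetation).
pixel = [(0.31, 4), (0.80, 9), (0.29, 4), (0.33, 4), (0.12, 3)]
print(median_composite(pixel))
```

Pooling the same seasonal window across several years increases the chance that every pixel has at least one clear observation, which is the stated motivation for the three-year median.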
2.3.1. Vegetation indices
We calculated the NDVI, EVI, and SR, as these indices have shown relationships with species diversity in past studies (Hernández-Stefanoni et al. 2012; Kacic and Kuenzer 2022; Liu et al. 2023; Madonsela et al. 2017; Torresani et al. 2019). Additionally, we calculated two indices based on the visible bands (RGB), the Green Red Variation Index (GRVI) and the Visible Atmospherically Resistant Index (VARI), as we were interested in the potential benefits and applicability of RGB indices in estimating our biodiversity indices (Appendix S2: Table S3). Our interest in these indices stems from the potential cost-effectiveness and accessibility of RGB data, which can be derived from drone, airborne, and satellite imagery. We calculated each vegetation index for each time period, in addition to the standard deviation and coefficient of variation of certain bands for each pixel (Appendix S2: Table S3). Previous research has shown higher spectral separability for tree species in deciduous and coniferous forests using the red edge, near-infrared (NIR), and short-wave infrared (SWIR) portions of the spectrum (Jones et al. 2010; Persson et al. 2018; Liu et al. 2023). In cases where a vegetation index uses bands of different spatial resolution, GEE automatically resamples the lower-resolution (20 m) pixels to the size of the higher-resolution (10 m) pixels (Gorelick et al. 2017). We calculated each index over the entire image in GEE, then extracted the mean value of each index within each buffered plot for each time period.
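For readers unfamiliar with these indices, the sketch below writes them out in Python using their commonly published formulations. The exact definitions used in the study are listed in Appendix S2: Table S3, which is not reproduced here, so the formulas, the EVI coefficients, and the example reflectance values should be read as assumptions.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def simple_ratio(nir, red):
    """Simple ratio index (SR)."""
    return nir / red

def evi(nir, red, blue):
    """Enhanced Vegetation Index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def grvi(green, red):
    """Green-red index built from visible bands only."""
    return (green - red) / (green + red)

def vari(green, red, blue):
    """Visible Atmospherically Resistant Index."""
    return (green - red) / (green + red - blue)

# Hypothetical surface reflectance (0-1 scale) for one forest pixel,
# using Sentinel-2 bands B2 (blue), B3 (green), B4 (red) and B8 (NIR).
blue, green, red, nir = 0.03, 0.06, 0.04, 0.40
print(ndvi(nir, red), simple_ratio(nir, red), evi(nir, red, blue))
print(grvi(green, red), vari(green, red, blue))
```

GRVI and VARI need only the visible bands, which is why they can also be derived from ordinary RGB imagery.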
2.3.2. Spectral information content
2.3.2.1. GLCM texture
We calculated texture as a measure of image heterogeneity. Image texture is associated with spatial variability, which can indicate environmental heterogeneity, which in turn can be related to different measures of biodiversity. We used the Grey-Level Co-occurrence Matrix (GLCM) function, ee.Image.glcmTexture, in GEE. GLCM image texture metrics capture spatial variations in pixel brightness, often referred to as grey levels or digital numbers (Hall-Beyer 2017a). In 1973, Haralick et al. proposed 14 second-order texture measures based on GLCMs. These measures are derived from the normalized GLCM and calculated considering the angular relationship between a reference pixel and a neighboring pixel within a moving window. GLCM texture metrics can be broadly categorized into three classes: contrast, orderliness, and descriptive statistics (Hall-Beyer 2017b). Metrics within the contrast category evaluate the degree of variation between adjacent pixels in the GLCM and include measures such as contrast, dissimilarity, and homogeneity. The orderliness category focuses on the consistency of pixel changes within a moving window, and includes measures such as angular second moment, energy, and entropy. Descriptive statistics are the usual descriptive statistics such as mean, standard deviation, and correlation, but here they are calculated on the GLCM. In this analysis we calculated two descriptive statistics from the GLCM for each plot (variance and correlation) as well as one metric from the contrast category (dissimilarity) and one from the orderliness category (entropy) (Appendix S2: Table S4; for details on calculations see Haralick et al. 1973). Only one metric from each category, except descriptive statistics, was chosen due to the high correlation between metrics within the same category (Hall-Beyer 2017b). Larger entropy, dissimilarity, and variance values indicate greater variability in the image, while larger correlation values indicate less variability.
Texture was calculated for the bands B3, B4, B5, B6, B7, B8, B8A, B11, B12 and indices NDVI, GRVI, VARI using a moving window of 3 × 3 pixels in all directions and then averaged. This window size was chosen to keep the metrics relevant to the plot size and capture sufficient variation in and around the plots. Before proceeding with the texture analysis, the individual bands and vegetation index values were converted into discrete integer values ranging from 0 to 255 by linearly scaling the pixel values. This transformation is essential because the GLCM method requires discrete values. The original floating point reflectance values are unsuitable for GLCM, because it relies on counting the frequency of pixel pair occurrences at specific integer intensity levels. After calculating texture over the chosen bands for each time period’s image we extracted the mean value of each texture measure within each plot.
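To make the quantization and co-occurrence counting concrete, here is a minimal pure-Python sketch for a single 3 × 3 window and a single pixel offset. It is not a reimplementation of ee.Image.glcmTexture: the min-max scaling, the symmetric counting, and the textbook metric formulas are assumptions, and averaging over directions is left out for brevity. The function names and reflectance values are hypothetical.

```python
import math

def quantize(values, levels=256):
    """Linearly rescale floats to integer grey levels 0..levels-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [round((v - lo) / (hi - lo) * (levels - 1)) for v in values]

def glcm(window, offset=(0, 1)):
    """Symmetric, normalized co-occurrence probabilities for one offset.

    `window` is a 2D list of integer grey levels; returns {(i, j): prob}.
    """
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = {}
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                # Count each pair in both directions so the matrix is symmetric.
                counts[(i, j)] = counts.get((i, j), 0) + 1
                counts[(j, i)] = counts.get((j, i), 0) + 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def texture_metrics(p):
    """Variance, dissimilarity, entropy and correlation of a symmetric GLCM."""
    mean = sum(i * v for (i, _), v in p.items())
    variance = sum(v * (i - mean) ** 2 for (i, _), v in p.items())
    dissimilarity = sum(v * abs(i - j) for (i, j), v in p.items())
    entropy = -sum(v * math.log(v) for v in p.values())
    correlation = None  # undefined when the window has no grey-level variance
    if variance > 0:
        correlation = sum(
            v * (i - mean) * (j - mean) for (i, j), v in p.items()
        ) / variance
    return {"variance": variance, "dissimilarity": dissimilarity,
            "entropy": entropy, "correlation": correlation}

# Hypothetical 3 x 3 window of floating point reflectance values.
reflectance = [[0.031, 0.034, 0.040], [0.030, 0.052, 0.047], [0.029, 0.033, 0.041]]
flat = quantize([v for row in reflectance for v in row])
window = [flat[r * 3:(r + 1) * 3] for r in range(3)]
print(texture_metrics(glcm(window)))
```

A perfectly uniform window yields zero dissimilarity, entropy, and variance in this sketch, consistent with the interpretation that larger values of these metrics indicate greater variability.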
2.3.2.2. Coefficient of variation: moving window
In addition to the pixel-based coefficient of variation, we also calculated the coefficient of variation using a moving window of 3 × 3 pixels (Appendix S2: Table S5). We calculated both the single-dimensional coefficient of variation (SDCV) for each of the following bands: green (B3), red (B4), red edge (B5), and NIR (B8), and the multi-dimensional coefficient of variation (MDCV) using all of those bands (Liu et al. 2023; Wang et al. 2016). MDCV has shown modest relationships with species richness and Shannon diversity in previous studies (Madonsela et al. 2021; Wang et al. 2016).
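The two moving-window measures can be sketched in Python as follows. This assumes that SDCV is the standard deviation divided by the mean of the pixels in one window for one band, and that MDCV is the average of those per-band values; the study's exact definitions are in Appendix S2: Table S5, which is not reproduced here. The use of the population standard deviation is also an assumption, and the window values are hypothetical.

```python
from statistics import mean, pstdev

def window_cv(window):
    """Coefficient of variation (sd / mean) of the values in one window."""
    values = [v for row in window for v in row]
    return pstdev(values) / mean(values)

def mdcv(windows_by_band):
    """Average of the per-band window CVs (assumed MDCV definition)."""
    return mean(window_cv(w) for w in windows_by_band.values())

# Hypothetical 3 x 3 reflectance windows centred on the same pixel.
windows = {
    "B3": [[0.05, 0.06, 0.05], [0.06, 0.07, 0.06], [0.05, 0.06, 0.05]],
    "B4": [[0.03, 0.04, 0.03], [0.04, 0.05, 0.04], [0.03, 0.04, 0.03]],
    "B5": [[0.09, 0.10, 0.09], [0.10, 0.12, 0.10], [0.09, 0.10, 0.09]],
    "B8": [[0.35, 0.40, 0.36], [0.41, 0.45, 0.40], [0.36, 0.39, 0.35]],
}
for band, w in windows.items():
    print(band, window_cv(w))
print("MDCV", mdcv(windows))
```

Because each band's spread is divided by its own mean, bands with very different reflectance levels (for example red and NIR) contribute on a comparable scale to the multi-band average.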
2.3.3. Environmental and climate variables
In addition to the spectral variables, we also included environmental variables representing environmental diversity (Table 1). Established predictors of plant species richness such as climate, topography, and soil properties may improve the predictive power of models using only spectral data (Perrone et al. 2023; Schmidtlein and Fassnacht 2017). We included four environmental predictors from the Worldclim bioclimatic variable dataset: annual mean temperature (BIO 01), temperature seasonality (BIO 04), annual precipitation (BIO 12), and precipitation seasonality (BIO 15) (Hijmans et al. 2005). As one moves toward higher latitudes, lower temperatures progressively limit the variety of tree species that can successfully survive (Thom et al. 2021). We also included both the latitude and longitude of the plots as predictors. Additionally, we included elevation from the Canadian Digital Elevation Model, as elevation can influence local environmental and growing conditions (Liu et al. 2023; Thom et al. 2021). We also included six different simulated soil measures at six different depths from the SIIGSOL-100m dataset: proportion of organic matter, clay, sand, and silt, cation exchange capacity (meq/100g), and pH (SIIGSOL 2022). The diversity of soil characteristics, including chemical attributes such as pH and soil organic carbon, which indicate nutrient availability, along with soil texture (the proportions of clay, silt, and sand), which affects water retention, collectively influences the distribution of different tree species across a region (Mensah et al. 2023; Thom et al. 2021). The mean value of each environmental variable within each plot was extracted. Plots with missing values were excluded from further analysis; a total of 2531 of the initial 2668 plots were retained.
Table 1. All spectral variables calculated over all regions (Quebec, deciduous, mixed, and boreal) and all time periods (Start, Peak, End of the season) and environmental variables included in the Random Forest models.
2.4. Analysis
2.4.1. Data exploration
To understand the distribution of biodiversity across the forest inventory plots (sub-objective 1) we first created a heat plot of all the tree species found in each region in R (R Core Team 2023). Additionally, we mapped the distribution and hotspots of the four biodiversity indices across the study area by importing the values calculated in R into ArcGIS Pro and running the Optimized Hot Spot Analysis tool (Esri 2021; R Core Team 2023). This tool calculates the Getis-Ord Gi* statistic, which evaluates the value of each point (here, each forest inventory plot) relative to its neighboring points and the overall average value in the study area. The z-scores generated by the Getis-Ord Gi* statistic identify spatial clusters of high or low biodiversity index values. A high positive z-score for a point indicates a cluster of high biodiversity index values (‘hot spots’), while a high negative z-score indicates a cluster of low biodiversity index values (‘cold spots’). A z-score near zero indicates no significant spatial clustering. Statistically significant hot spots and cold spots consist of neighboring points with biodiversity index values significantly higher or lower, respectively, than the average value in the study area. To understand the relationship between the spectral signatures and the biodiversity indices (sub-objective 2), we plotted an average signature, across the Sentinel-2 bands, for groups of plots with low, medium, and high values for each diversity index. We grouped the plots into these categories using equal interval ranges (Figure 2). The spectral signatures provide our first insight into the potential of using Sentinel-2 data to distinguish between low and high values of our biodiversity indices.
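The hot spot z-score described above can be written out for a single plot as a short Python sketch. It follows the standard published form of the Getis-Ord Gi* statistic with user-supplied spatial weights; how the Optimized Hot Spot Analysis tool chooses its neighborhood and corrects for multiple testing is not reproduced here. The function name, values, and binary weights are hypothetical.

```python
import math

def getis_ord_gi_star(values, weights_row):
    """Gi* z-score for one location.

    `values` holds the index value at every plot; `weights_row[j]` is the
    spatial weight between the focal plot and plot j (focal plot included).
    """
    n = len(values)
    x_bar = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - x_bar ** 2)
    sum_w = sum(weights_row)
    sum_w2 = sum(w * w for w in weights_row)
    numerator = sum(w * v for w, v in zip(weights_row, values)) - x_bar * sum_w
    denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical species richness at eight plots; the first three are neighbors.
richness = [8, 9, 7, 2, 1, 2, 3, 2]
hot_neighborhood = [1, 1, 1, 0, 0, 0, 0, 0]   # binary weights, focal plot 0
cold_neighborhood = [0, 0, 0, 1, 1, 1, 0, 0]  # binary weights, focal plot 3
print(getis_ord_gi_star(richness, hot_neighborhood))
print(getis_ord_gi_star(richness, cold_neighborhood))
```

The first neighborhood groups the highest values and returns a positive z-score, while the second groups low values and returns a negative one.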
Figure 2. Distribution of the biodiversity indices across all three forest types; dotted lines represent the equal interval ranges. The ranges for functional dispersion are: 0–0.08 (low), > 0.08–0.16 (medium), > 0.16 (high). For Shannon diversity: 0–0.71 (low), > 0.71–1.42 (medium), > 1.42 (high). For species richness: 1–3 (low), 4–6 (medium), ≥ 7 (high). For percent conifer: 0–33% (low), > 33%–66% (medium), > 66% (high).
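The equal interval grouping used for the spectral signatures can be illustrated with a few lines of Python. In this sketch the maximum Shannon value of 2.13 is an assumption inferred from the class limits in the Figure 2 caption (three intervals of 0.71), not a value reported in the text, and the function names are hypothetical.

```python
def equal_interval_breaks(lo, hi, n_classes=3):
    """Upper limits of all classes except the last, for equal-width classes."""
    width = (hi - lo) / n_classes
    return [lo + width * k for k in range(1, n_classes)]

def classify(value, breaks, labels=("low", "medium", "high")):
    """Label a value by the first class whose upper limit it does not exceed."""
    for upper, label in zip(breaks, labels):
        if value <= upper:
            return label
    return labels[-1]

# Assumed range for Shannon diversity: 0 to 2.13 (inferred from Figure 2).
shannon_breaks = equal_interval_breaks(0.0, 2.13)
print([round(b, 2) for b in shannon_breaks])
print(classify(0.50, shannon_breaks), classify(1.00, shannon_breaks),
      classify(1.90, shannon_breaks))
```

Equal interval classes are simple to interpret, but they can hold very different numbers of plots when an index is skewed, which is worth keeping in mind when comparing average signatures between classes.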
2.4.2. Clusters
To further understand the relationship between the Sentinel-2 bands and the biodiversity indices, we used k-means clustering in R (sub-objective 3) to cluster the plots over all forest types (Quebec region) and for each time period (R Core Team 2023). We used only the Sentinel-2 spectral bands as input for clustering. Subsequently, we calculated the mean and standard deviation of the biodiversity indices for the plots within each cluster to determine whether the clusters were stratified by the biodiversity indices. To determine the number of clusters to use in our analysis, we applied the within-cluster sum of squares (WSS) elbow method, which showed only marginal decreases in WSS beyond four clusters (Thorndike 1953).
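The elbow method can be illustrated with a small, self-contained Python sketch. It is not the R workflow used in the study: it implements a basic Lloyd-style k-means with a few random restarts, and the toy two-dimensional points stand in for the Sentinel-2 band values. All names and data here are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wss(points, k, n_start=10, n_iter=50, seed=0):
    """Best within-cluster sum of squares over several random restarts."""
    best = float("inf")
    for start in range(n_start):
        rng = random.Random(seed + start)
        centroids = rng.sample(points, k)  # initialise from data points
        for _ in range(n_iter):
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                clusters[nearest].append(p)
            # Move each centroid to its cluster mean; keep it if the cluster is empty.
            updated = [
                tuple(sum(dim) / len(members) for dim in zip(*members))
                if members else centroids[c]
                for c, members in enumerate(clusters)
            ]
            if updated == centroids:
                break
            centroids = updated
        wss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, wss)
    return best

# Toy data: four tight groups of four points each.
toy_points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0, 0), (10, 0), (0, 10), (10, 10)]
    for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]
]
wss_by_k = {k: kmeans_wss(toy_points, k) for k in range(1, 7)}
for k, wss in wss_by_k.items():
    print(f"k={k}  WSS={wss:.2f}")
```

Plotting WSS against k for data like these shows a steep drop up to the true number of groups and a flattening afterwards; the elbow is the point where further clusters stop reducing WSS appreciably.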
2.4.3. Variable selection using VSURF
To test for linear correlations between the 295 predictors and the biodiversity indices, we calculated Pearson’s correlation coefficients for each predictor and biodiversity index in each region and over all regions, using the cor() function in base R. Starting with simple linear methods helps identify direct relationships between predictors and biodiversity indices, providing an initial estimation of the relative importance of predictor variables. However, to select the most useful variables for predicting the four biodiversity indices, we used the Variable Selection Using Random Forests (VSURF) package in R (sub-objective 4) (Genuer et al. 2010, 2015). RFs are versatile in their ability to capture both linear and non-linear relationships between predictor variables and the response variable. VSURF was run for each response variable, i.e., the four biodiversity indices, for each forest type and over the entire area using predictors from all time periods, for a total of 16 runs. VSURF uses a three-step process. Initially, VSURF assesses all variables’ importance by averaging their scores across 20 RF iterations. Variables with low importance scores are discarded. This step ensures that only variables with a significant impact on the prediction are considered for further analysis. VSURF then creates a series of nested RF models, each incorporating an increasing number of top-ranked variables, from the single most important variable up to the total number of variables deemed important. It evaluates these models based on their out-of-bag (OOB) error rates. The model with the smallest OOB error rate is selected. This model’s variables are considered optimal for interpreting the response variable. Then, with the optimal variables identified, VSURF refines the selection. Starting from the most important variable, it sequentially adds variables to the model, testing the impact of each addition. 
A variable is only retained if its inclusion results in a significant decrease in the OOB error rate, beyond a predetermined threshold. This threshold is calculated based on the average improvement in error rates when adding noisy variables, ensuring that each selected variable genuinely enhances the model’s predictive accuracy. The result of this process is an effective reduction in data dimensionality, providing a refined list of the best variables for prediction. After obtaining the best variables for predictive accuracy for each forest type and over the entire region, we categorized the variables by their type (environmental or spectral) and by season, to understand both the differences and similarities among the predictors for each forest category.
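The initial linear screening step, which precedes VSURF, amounts to computing Pearson’s r between every predictor and a biodiversity index and then ranking the predictors by absolute correlation. The Python sketch below shows that idea on toy data. It is not the base R `cor()` call used in the study, and the predictor names and values are hypothetical placeholders.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rank_predictors(predictors, response, top=5):
    """Return the `top` predictors ordered by absolute correlation."""
    scores = {name: pearson_r(vals, response) for name, vals in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)[:top]

# Hypothetical toy data: one response and three candidate predictors.
toy_response = [80, 60, 40, 20, 10]
toy_predictors = {
    "predictor_a": [1, 2, 3, 4, 5],
    "predictor_b": [2, 1, 4, 3, 5],
    "predictor_c": [5, 5, 4, 6, 5],
}
ranked = rank_predictors(toy_predictors, toy_response)
for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

Ranking by absolute value keeps strong negative relationships, such as the negative correlations reported for percent conifer, at the top of the list alongside strong positive ones.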
2.4.4. Random Forest regression models
Based on the optimal predictor variables from VSURF, we ran RF regression models in R using the randomForest package (sub-objective 5) (Liaw and Wiener 2002). We used the optimal predictors selected for each forest type and across all regions to estimate each diversity index. Additionally, we ran the models using all 295 predictor variables, for a total of 32 models (Table 2). For each RF model, we separated the dataset into training and test sets, allocating 70% for training and 30% for testing. We used the createDataPartition function from the caret package in R to draw a stratified random sample (Kuhn 2008). This approach selects samples such that the distribution of the response variable is consistent between the training and testing datasets. We ran each RF model with mtry set to the regression default (number of predictors/3) and ntree = 1000 (Boehmke & Greenwell 2019). The mtry parameter refers to the number of predictor variables randomly selected and considered for splitting at each node in the decision trees, while ntree is the number of decision trees in the Random Forest. For each model, we calculated the coefficient of determination (R2) along with the root-mean-square error (RMSE) expressed as a percentage, normalized by the range of the test data to allow comparison across models.
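The evaluation quantities described above can be written out as a short Python sketch. Two points are assumptions rather than details confirmed by the text: R2 is computed here as one minus the ratio of residual to total sum of squares (the squared correlation between observed and predicted values is another common convention), and the regression default for mtry is taken as the floor of the number of predictors divided by three, with a minimum of one. Function names and the toy observations are hypothetical.

```python
import math

def default_mtry_regression(n_predictors):
    """Assumed regression default: floor(p / 3), but never less than 1."""
    return max(n_predictors // 3, 1)

def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (an assumption)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def nrmse_percent(observed, predicted):
    """RMSE divided by the range of the observed test values, in percent."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100 * math.sqrt(mse) / (max(observed) - min(observed))

# Hypothetical test-set values for one model.
toy_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_predicted = [1.5, 2.5, 3.5, 4.5, 5.5]

print("mtry for 295 predictors:", default_mtry_regression(295))
print("R2:", r_squared(toy_observed, toy_predicted))
print("%RMSE:", nrmse_percent(toy_observed, toy_predicted))
```

Normalizing RMSE by the range of the test data puts indices with very different units, such as species counts and percentages, on a comparable scale.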
Table 2. Names of all 32 random forest models. Each region included a model with all variables and a model with the variables selected by VSURF.
3. Results
3.1. Data exploration
A total of 48 different species were found across all the forest inventory plots in the three regions. The deciduous forest plots contain the highest number of species (46), while the mixed forest plots contain 29 species and the boreal plots contain 18. The most abundant species in the study area is the balsam fir (Figure 3).
Figure 3. Heat map of the tree species found across the deciduous, mixed, and boreal forests in Quebec.
The spatial distribution of the biodiversity indices provides additional details on how each index varies across the three regions. Summary statistics for the plots in each region showed decreasing averages for each biodiversity index moving from the deciduous to the boreal region, except for percent conifer, which showed the opposite trend (Table 3). Across the entire study area, species richness, Shannon diversity, and functional dispersion hotspots are concentrated in the deciduous forest (Figure 4a, 4b, 4d). These biodiversity indices are also highly correlated (Table 4). The percent conifer index is less correlated with the other biodiversity indices and its hotspots are concentrated in the boreal forest (Figure 4c).
Figure 4. Spatial distribution, along with hotspots and cold spots for each biodiversity index (species richness (a), Shannon diversity (b), percent conifer (c), functional dispersion (d)) across the study area. Hot spots (shown in red) indicate areas where the index values are significantly higher than average, while cold spots (shown in blue) indicate areas where the index values are significantly lower than average. The % confidence indicates the statistical significance of the hot and cold spots. For instance, a 95% confidence level corresponds to a p-value less than 0.05.
Table 3. Summary statistics for each region and over all regions where n is the number of plots.
Table 4. Pearson’s correlation coefficients between the four biodiversity indices across the entire study area.
To understand how the biodiversity indices were related to spectral variability, we plotted spectral signatures for each biodiversity index and region, though we present only the results for the entire study area (Quebec region) here (Figure 5). The signatures including all bands and just the visible bands for each region (boreal, mixed, and deciduous) and each biodiversity index are detailed in Appendix S1 (Figures S1 – S7). In all regions, the spectral signatures for the low, medium, and high values (Figure 2) of each index showed considerable overlap (as measured by the mean and the standard deviation) during each season, except for percent conifer, which showed some separation between the high and low values. This separation is highest in the red-edge (bands 6 and 7), NIR (bands 8 and 8A), and SWIR (bands 11 and 12) portions of the spectrum across the seasons and in all regions, though in general more bands are separable during the peak of the season. In the visible spectrum, the red band (band 4) shows high separability between the high and low values of percent conifer at the end of the season and across all regions. Overall, the spectral signatures indicate that percent conifer has the greatest spectral separability of the biodiversity indices tested.
Figure 5. Average spectral signatures with +/- one standard deviation for each index, time period, and class (high, medium, low) over the entire study region.
3.2. Clusters
To further understand what the base spectral data (Sentinel-2 bands) were reflecting, k-means clustering was applied with four clusters at the start, peak, and end of season. The clustering results were consistent across these different periods. Therefore, we focus on presenting the results obtained at the peak of the season for clarity and brevity (Figure 6.a). Cluster one contained 627 plots, cluster two contained 635 plots, cluster three contained 723 plots, and cluster four contained 546 plots. Each cluster was spectrally separable based on the average spectral signatures obtained from the plots within each cluster (Figure 6.b). Clusters 4 and 2 had the highest reflectance, and broadleaf trees were the most common among the top three species within those clusters. Clusters 3 and 1 had lower reflectance and a higher percentage of coniferous species (Figure 6.c). Thus, the cluster with the highest reflectance corresponds to the areas with the lowest percentage of conifers. Conversely, the cluster with the lowest reflectance corresponds to areas with the highest percentage of conifers. Clusters with intermediate reflectance values align proportionally with intermediate percentages of conifers. The average values of the plots within each cluster did not show stratification based on species richness, Shannon diversity, or functional diversity. However, there was greater stratification by percent conifer, suggesting that reflectance is related more to composition than to diversity (Figure 7).
Figure 6. a: Map of the forest inventory plots and their respective cluster assignments. b: Average spectral signature and standard deviation of each band within each cluster. c: Top three species found within each cluster and their percentages and type (coniferous or broadleaf).
Figure 7. Mean and standard deviation of species richness (a), Shannon diversity (b), functional dispersion (c) and percent conifer (d), within the forest plots in each cluster.
3.3. Linear correlations and variable selection using VSURF
We began by using simple linear methods to identify direct relationships between predictors and biodiversity indices, which provided an initial estimation of which variables are important for each region and index. Among the biodiversity indices tested, percent conifer showed the largest linear correlations with our predictor variables across the study region, reaching a peak of −0.81 with the standard deviation of the NIR (band 8), red (band 4), red-edge (band 5), and green (band 3) bands at the peak of the season in the Quebec region (Table 5). For species richness and Shannon diversity, the top correlation was with latitude in the Quebec region (−0.41 and −0.40, respectively). For functional dispersion, the top correlation (0.39) was with the coefficient of variation of the NIR (band 8), red (band 4), red-edge (band 5), and green (band 3) bands at the peak of the season in the Quebec region. The majority of the top five correlated variables were from the peak of the season. In the deciduous forest, there were more highly correlated variables from the start and end of the season for species richness and Shannon diversity. The top five correlated variables are presented here (Table 5) for each region and index; for a list of all correlations for each region and index, see Appendix B (Tables S6-S21). Variable codes and descriptions can be found in Appendix S2 (Table S22). Overall, linear correlations between the predictor variables and the percent conifer index were higher than for species richness, Shannon diversity, and functional dispersion. The Sentinel-2-derived spectral variability metrics (measured by CV and SD) consistently exhibited significant correlations with all biodiversity indices and regions, showing the strongest correlations with percent conifer and functional dispersion. Spectral variability demonstrated the weakest correlation with species richness. 
Specific spectral bands (green, red-edge, and NIR) and the texture metrics derived from these bands showed significant correlations with biodiversity indices, but mostly with species richness and Shannon diversity.
Table 5. Top five Pearson’s correlation coefficients for each region and index.
We used the VSURF algorithm in R to reduce the number of variables and identify the most relevant ones for prediction in our RF models. The variables selected by VSURF were not necessarily those that had the highest linear correlations, as RF, the underlying algorithm of VSURF, also captures non-linear relationships. Of the 295 predictor variables, VSURF selected a total of 45 unique variables for species richness, 41 for Shannon diversity, 46 for functional dispersion, and 38 for percent conifer across all regions. The VSURF variable figure for percent conifer is presented in the main text (Figure 8); the figures for species richness, Shannon diversity, and functional dispersion can be found in Appendix S1 (Figures S8 – S10). The variables selected by VSURF differed for each biodiversity index and each region, though a few variables were selected in every region for some indices. For example, reflectance of the red-edge band at the start of the season (B5_start) and the proportion of organic matter at depths of 5–15 cm (carbon515) were selected as important across all regions and for estimating all biodiversity indices except percent conifer. For species richness, the GRVI index at the peak of the season (GRVI_peak) was selected across all regions as an important predictor variable. For Shannon diversity, band 12 (SWIR) at the end of the season (B12_end) was important across all regions. For percent conifer, the red band at the end of the season (B4_end), the coefficient of variation between band 11 (SWIR), band 12 (SWIR), and band 4 (red) at the peak of the season (CV_B11B12red_peak), and the standard deviation of band 11 (SWIR), band 12 (SWIR), and band 4 (red) at the start of the season (SD_B11B12red_start) were important predictors across each region.
Figure 8. Top predictor variables selected by VSURF for the estimation of percent conifer. Yellow outlines show the top predictor variables used in the subsequent Random Forest models. Variables starting with B refer to individual Sentinel-2 bands. Variables beginning with SD and CV refer to the pixel-based standard deviation and coefficient of variation, respectively. There are several vegetation indices, including the green-red vegetation index (GRVI), the normalized difference vegetation index (NDVI), the simple ratio index (SR), and the visible atmospherically resistant index (VARI). Correlation, entropy, and dissimilarity refer to grey-level co-occurrence matrix (GLCM) textures. WinCV refers to the window-based coefficient of variation.
Examining broader trends in the variables selected by VSURF, we observe some key patterns.
Seasonal differences: No region–index combination relied solely on variables derived from a specific season; variables were selected from the start, peak, and end of the growing season.
Individual bands: Most of the individual bands identified by VSURF for each biodiversity index were found within the visible, red-edge, and SWIR regions of the spectrum, while NIR bands were chosen only four times in total for species richness, Shannon diversity, and functional dispersion. Notably, NIR bands by themselves were not selected as predictors for percent conifer.
Vegetation indices: For each biodiversity index and region, the most commonly selected vegetation indices were pixel-based coefficients of variation and standard deviations. For each combination of biodiversity index and region, vegetation indices based on both the visible spectrum and the NIR, red-edge, and SWIR portions were identified as important predictors.
Spectral information content: Among the spectral information content metrics selected by VSURF, GLCM texture variables were selected more often than the single and multi-dimensional coefficient of variation. No single texture metric was universally important across the four regions for any of the biodiversity indices. Across the four biodiversity indices, Shannon diversity predictors contained the most texture metrics, while percent conifer predictors contained the fewest. Across the four biodiversity indices and regions, textures calculated using the NIR band were selected more often than those based on other bands or vegetation indices. GLCM dissimilarity and entropy textures were selected more often than GLCM variance and correlation textures across all regions and biodiversity indices. In terms of regional differences, the boreal region used the most texture metrics across the four biodiversity indices, the majority being entropy textures.
Environmental variables: Environmental variables from each category (climatic, geographic, and soil) were also selected for each biodiversity index. However, the soil variables selected as important consisted only of carbon and pH. The mixed and boreal forest regions did not use any climatic variables for any biodiversity index. Across all biodiversity indices, the boreal region used only soil variables from the environmental group. Overall, each biodiversity index has distinct relationships with the predictor variables, and these relationships tend to vary between regions, except for the few consistently selected variables discussed earlier in this section.
3.4. Random Forest models
The VSURF algorithm identified a diverse set of optimal predictor variables across different seasons, spectral bands, and environmental variables for each study region. On average, VSURF reduced the number of variables from 295 to 18. The R2 values for the RF models using the VSURF variables ranged from 0.25 to 0.77, while those for the models using all variables ranged from 0.16 to 0.77, indicating a wide variation in model performance (Figure 9). We found that the models predicting percent conifer performed better than those for the other biodiversity indices. After percent conifer, the Shannon diversity VSURF model had the highest performance in the Quebec and deciduous regions. In contrast, the functional dispersion VSURF model achieved the best results in the mixed region after percent conifer, while the species richness VSURF model was most effective in the boreal region after percent conifer. We also observed that the models for deciduous and mixed forests had lower R2 values compared to those for the boreal forest. When we ran the models with the VSURF variables selected for each region and index, the models performed marginally better than those using all variables, with an average R2 increase of 0.03 and an average %RMSE decrease of 0.58%. These results demonstrate that VSURF can identify the most relevant variables without compromising model accuracy and that, overall, percent conifer models outperform those for the other biodiversity indices in all study regions. Variable importance figures for each RF model can be found in Appendix S1 (Figures S11 – S42).
Figure 9. Results for the Random Forest regression models for each region and each biodiversity index. Models for each region were run using all variables (ALL) and with their respective VSURF variables (VSURF).
4. Discussion
In this study, we integrated ground data, satellite imagery, and machine learning models to explore the relationship between spectral metrics, environmental variables, and biodiversity indices in Quebec’s forests. Our focus was on various aspects of biodiversity such as richness, evenness, function, and relative abundance within and across deciduous, mixed, and boreal forests. Deciduous forests showed high species richness, Shannon diversity, and functional diversity, while boreal forests had a higher abundance of coniferous species, aligning with the transition from deciduous to coniferous vegetation with latitude. Through the analysis of spectral signatures associated with each biodiversity index, we found that percent conifer showed reasonable spectral separability and had high linear correlations with our spectral variables. We also assessed unsupervised clustering and RF regression models for their effectiveness in identifying these biodiversity indices. Clustering revealed stratification by conifer percentage but no strong patterns for the other indices. Models that used VSURF variables performed slightly better than models including all 295 variables, indicating that models with fewer variables can match or improve on the predictive power of models including hundreds of predictors. While environmental predictors of plant species diversity can complement spectral diversity predictors, our models demonstrated limited predictive capability even with these measures included. The differences in spectral signatures across forest types and across low, medium, and high biodiversity index values help explain the varying performance of our models.
4.1. Spectral signature analysis
The spectral signatures (average reflectance ± standard deviation) for each region show large overlap between the low, medium, and high classes of the biodiversity indices, especially for species richness, Shannon diversity, and functional dispersion. This spectral similarity contributes to the lack of distinct clusters when classifying regions using only Sentinel-2 spectral bands and helps explain the suboptimal performance of the RF models in predicting these biodiversity indices, with none achieving an R2 value above 0.46. Conversely, for the percent conifer index, distinct spectral differences are observed between high and low classes across all study regions in the NIR, SWIR, and visible portions of the spectrum, contributing to higher linear correlations with the spectral variables. This spectral separation likely contributes to more accurate predictions by RF models (R2 = 0.77, Quebec region) and clearer stratification in unsupervised clustering. Differences in leaf inclination angles, forms, and aggregation between conifers and broadleaf trees are amplified at the canopy scale, which influences the spectral reflectance of the forest canopy (Yang et al. 2021). Conifers generally exhibit lower reflectance due to their lower transmittance and higher absorption of light (Rautiainen et al. 2018). As a result, differences in reflectance can be observed between coniferous and broadleaf trees.
Species-specific indices such as percent conifer can be identified by remote sensing due to the distinct reflectance of coniferous vegetation compared to broadleaf and other vegetation types. In contrast, indices such as species richness and Shannon diversity, which quantify the presence and evenness of different species, do not display unique spectral signatures that can be readily distinguished by remote sensors at this pixel, spectral, and geographical scale. This lack of distinctiveness arises because areas with identical species counts may have vastly different species compositions. Considering the diversity of species within our study area, the potential species combinations for a given level of species richness are substantial. For instance, in the boreal forest, which has 18 different species, a forest inventory plot with five species can theoretically present as many as 8568 possible species combinations. In the deciduous forest, where the total number of species increases to 46, the complexity increases, resulting in up to 1,370,754 different combinations for plots containing five species. The sheer number of different combinations makes spectral identification of distinct classes of species diversity challenging, whether for species richness or for another index based on species counts. In addition, even if the combination of species is the same, the spectral reflectance and diversity can also be affected by the age, health, and phenology of the individual species (Rocchini et al. 2022a). It is likely that this complexity contributes to the lower R2 values of the RF models in the mixed and deciduous forests. However, spectral diversity could potentially be more effective in lower-diversity ecosystems, which have fewer species and consequently fewer possible species compositions per plot. Functional dispersion, despite being grounded in species-specific traits, encounters similar challenges in model performance. 
The spectral and spatial resolution of Sentinel-2 may not be sensitive enough to detect the specific trait variations that we used to define functional dispersion. Together, these results indicate that at the examined scale, variations in reflectance are more indicative of species composition than the mere number of species present, Shannon diversity, or functional diversity (Fassnacht et al. 2022).
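The combination counts quoted above for five-species plots follow from the binomial coefficient, the number of ways to choose five species from a regional pool without regard to order. A minimal Python check is given below; the pool sizes are the regional species totals reported in the Results, and the function name is a hypothetical helper.

```python
from math import comb

def possible_compositions(pool_size, richness):
    """Number of distinct species sets of size `richness` from `pool_size` species."""
    return comb(pool_size, richness)

# Regional species pools reported in the data exploration results.
species_pools = {"boreal": 18, "mixed": 29, "deciduous": 46}

for region, pool in species_pools.items():
    print(f"{region}: {possible_compositions(pool, 5):,} possible five-species plots")

boreal_count = possible_compositions(18, 5)      # 8568
deciduous_count = possible_compositions(46, 5)   # 1370754
```

These counts treat every species set as equally possible and ignore relative abundances, so they are an upper bound on the number of distinct compositions rather than an estimate of what occurs in the field.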
4.2. Model performance
The performance of our Random Forest (RF) models in predicting species richness and Shannon diversity aligns with previous studies investigating the SVH, such as those conducted by Oldeland et al. (2010), Madonsela et al. (2017), and Peña-Lara et al. (2022). However, a few studies reported higher model performances, notably Xi et al. (2023), who achieved an R2 of 0.81, Torresani et al. (2019), with an R2 of 0.70, and Liu et al. (2023), with an R2 of 0.63 for Shannon diversity. Additionally, our model’s predictions of functional diversity were similar to the results (R2 = 0.50) reported by Peña-Lara et al. (2022). These findings highlight a broader issue: the generally low predictive power when modeling species richness, Shannon diversity, and functional diversity based on the SVH. Additionally, the relationship between spectral variability and species diversity varies greatly between studies within similar and different ecosystems. This could be due to differing pixel sizes, the spectral resolution of the sensor used, the size of the study area, the season in which the study takes place, the spectral diversity metrics chosen as predictors, the number of different species, the biodiversity indices, and the statistical method (Fassnacht et al. 2022; Hauser et al. 2021a; Perrone et al. 2023; Schmidtlein & Fassnacht 2017; Torresani et al. 2019). In contrast, the model using the percentage of conifer species as a response achieved a higher R2 value of 0.77, suggesting that this metric may be a valuable biodiversity index that is detectable by remote sensing in our study area. The proportion of conifer species within a forest has been associated with various ecological dynamics, including wildfire behavior, risks of insect outbreaks, conifer encroachment or invasiveness, and shifts in forest composition driven by climate change (Engber et al. 2011; Filippelli et al. 2020; Ma et al. 2023; Nie et al. 2018; Richardson and Rejmánek 2004). 
Earlier studies have applied different methods to estimate this index: Yang et al. (2021) combined reflectance modeling with satellite data and semi-supervised k-means clustering, while Filippelli et al. (2020) used Sentinel-2 and Landsat data alongside LiDAR and RF. Both reported comparable R2 values above 0.80. In our study, in addition to our RF models, we showed that unsupervised k-means clustering effectively differentiates forest plots based on conifer percentage. This method allowed us to estimate the proportion of conifers within each spectral cluster. We found a clear trend in our study area: the cluster exhibiting the highest reflectance corresponded to the lowest proportion of conifers, whereas the cluster with the lowest reflectance corresponded to the highest proportion of conifers. Clusters with intermediate reflectance levels followed this same pattern, confirming the trend across the spectrum. This approach provides a simple, straightforward technique for estimating this index in addition to the more complex RF models.
4.3. VSURF predictor variables
4.3.1. Environmental variables
Incorporating environmental variables, as recommended by prior research (Perrone et al. 2023; Schmidtlein and Fassnacht 2017), did not yield higher R2 values for species richness, Shannon diversity, and functional diversity compared to the existing literature. Yet, including these variables helped identify key environmental predictors and their differences and similarities among the different forest types. Across the biodiversity indices, environmental variables selected by VSURF in the Quebec region and deciduous forest included climatic, geographic, and soil variables. In the Quebec region, which spans each forest type, a combination of climatic, geographic, and soil variables may help capture the environmental constraints specific to each forest type. Additionally, the inclusion of geographical variables such as latitude in the Quebec region likely captures the trend of decreasing species richness with increasing latitude across this large area (Liang et al. 2022). In the deciduous forest, climatic variables are particularly important because temperature is a limiting factor, as trees in these regions begin to suffer cell damage when temperatures drop below −40 °C (Goldblum and Rigg 2010). In the mixed forest, only geographical (latitude, longitude) and soil variables were selected. In the boreal region, no climatic or geographical variables were chosen; instead, the selection focused exclusively on soil variables, namely the proportion of organic matter and pH. Soil pH is particularly relevant in our study area. Acidic soils restrict the availability of essential nutrients such as nitrogen and phosphorus, which limits tree growth and reduces the diversity of tree species capable of surviving under these conditions (Ma et al. 2023). Coniferous trees, however, are more prevalent in regions with acidic soils due to their ability to adapt to low nutrient conditions (Ma et al. 2023). 
Positive correlations have been identified between soil carbon content, the primary component of soil organic matter, and plant diversity, which may explain why the proportion of organic matter (carbon515) was selected as an important predictor of species richness, Shannon diversity, and functional diversity across all regions (Feeney et al. 2022; Spohn et al. 2023). Future research should consider the inclusion of climatic variables, especially over large geographical areas, and soil variables, particularly where soil composition varies across the region.
4.3.2. Spectral variables
Spectral variables were selected across the spectrum for each index and region. In terms of seasonal differences, VSURF selected variables from each season for every region and biodiversity index. For individual bands selected by VSURF, the majority were concentrated in the visible, red-edge, and SWIR portions of the spectrum. Both red-edge bands and SWIR bands have been highlighted in past studies for their usefulness in tree species diversity estimation (Ming et al. 2024; Xi et al. 2023). Using hyperspectral data, Jones et al. (2010) showed that, for both coniferous and broad-leaved trees, the NIR and SWIR regions of the spectrum offered the highest spectral separation for different species. SWIR bands are sensitive to water content and reflect differences in plant composition and structure (Xi et al. 2023). This sensitivity may explain why these bands were selected as important for estimating our biodiversity indices. Liu et al. (2023) also found that the most relevant Sentinel-2 bands for estimating tree species diversity in a temperate forest were red-edge and NIR; however, they did not find the SWIR bands to be important. In our study area, NIR by itself was not selected as an important predictor variable as often as variables from other portions of the spectrum, but vegetation indices and texture measures that incorporated it were important.
By examining the relationship between various vegetation indices, derived from a broad range of spectral bands, and our biodiversity indices, we identified a few trends. Vegetation indices based on the pixel-based coefficient of variation and standard deviation across the following bands: NIR, red, red-edge, and combinations including green, showed high linear correlations with our biodiversity indices, as did those using the SWIR bands. These vegetation indices were also consistently identified as important predictors by VSURF across the study regions and for all biodiversity indices. Moreover, indices based on visible light, specifically the GRVI at the peak of the season, were important for assessing species richness and functional dispersion across all study regions. Notably, the GRVI ranked within the top five linear correlations for all biodiversity indices, within one or more regions, with the exception of percent conifer. In contrast, Arekhi et al. (2017) reported that visible light indices from Landsat imagery, such as the ratio of green to red bands, showed lower correlation with tree diversity, and identified NDVI as the most correlated instead. Despite this, in our study area, vegetation indices derived from the visible spectrum show potential, owing to their strong correlations and importance in our models. These indices are particularly relevant for RGB drone and aerial imagery applications, which can offer a cost-effective alternative to high-resolution satellite data.
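To make the plot-level metrics discussed above concrete, the following minimal sketch computes the GRVI for each pixel of a plot and then summarizes it with the standard deviation and coefficient of variation. The GRVI is taken in its usual form, (green - red) / (green + red). The reflectance values, the function names, and the use of the population standard deviation are assumptions made for illustration; this is not the Google Earth Engine workflow used in the study.

```python
from statistics import mean, pstdev


def grvi(green, red):
    """Green-Red Vegetation Index for one pixel: (green - red) / (green + red)."""
    return (green - red) / (green + red)


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (undefined if the mean is 0)."""
    return pstdev(values) / mean(values)


# Hypothetical surface reflectance values for the pixels covering one plot.
plot_pixels = [
    {"green": 0.08, "red": 0.04},
    {"green": 0.07, "red": 0.05},
    {"green": 0.09, "red": 0.04},
    {"green": 0.06, "red": 0.05},
]

# Per-pixel index values, then plot-level summaries of their spread.
grvi_values = [grvi(p["green"], p["red"]) for p in plot_pixels]
plot_metrics = {
    "grvi_mean": mean(grvi_values),
    "grvi_sd": pstdev(grvi_values),
    "grvi_cv": coefficient_of_variation(grvi_values),
}
print(plot_metrics)
```

The coefficient of variation becomes unstable when the mean index value is close to zero, which is one reason the standard deviation may be reported alongside it.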
Spectral information content metrics were selected less often as important predictors by VSURF and had lower correlations with our biodiversity indices compared to individual bands and vegetation indices. Nevertheless, GLCM textures were still selected as important predictors across the four study regions and biodiversity indices. However, their effectiveness has received mixed evaluations in the literature. Studies incorporating GLCM textures have both supported their usefulness (Liu et al. 2023; Peña-Lara et al. 2022) and found that they were not significant predictors of tree species diversity (Madonsela et al. 2017). Just as we encounter obstacles with spectral signatures in distinguishing species diversity in our study area and at the Sentinel-2 scale, texture information presents similar complications. For instance, a plot with a species richness of two, comprising a mix of coniferous and broadleaf species, might exhibit higher textural dissimilarity values than a plot with a species richness of five that is populated only by broadleaf species. The greater heterogeneity of the first plot, indicated by high dissimilarity values, does not straightforwardly translate to a higher tree species diversity but is more indicative of differences in vegetation structure such as canopy density, leaf size, and branching patterns. Variation in these structural variables may be a better indicator of other aspects of biodiversity such as bird or insect biodiversity (Farwell et al. 2021; Hoffmann et al. 2022; Storch et al. 2018; Wood et al. 2013).
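The two-plot example above can be illustrated with a small sketch of the GLCM dissimilarity measure, defined as the co-occurrence-weighted mean of the absolute grey-level difference between neighbouring pixels. The grey levels below are hypothetical quantized reflectance values chosen so that conifers appear dark and broadleaf species bright; the function name and the single horizontal offset are also assumptions, and the code is not the texture implementation used in the study.

```python
from collections import Counter


def glcm_dissimilarity(image, dx=1, dy=0):
    """Mean absolute grey-level difference over pixel pairs at offset (dx, dy).

    Equivalent to sum(P[i, j] * |i - j|) for a symmetric, normalized GLCM.
    """
    rows, cols = len(image), len(image[0])
    pairs = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                pairs[(a, b)] += 1
                pairs[(b, a)] += 1  # count both directions (symmetric GLCM)
    total = sum(pairs.values())
    return sum(n * abs(a - b) for (a, b), n in pairs.items()) / total


# Hypothetical plot with richness 2: dark conifer crowns (1) next to bright broadleaf crowns (6).
mixed_plot = [
    [1, 6, 1, 6],
    [6, 1, 6, 1],
    [1, 6, 1, 6],
    [6, 1, 6, 1],
]

# Hypothetical plot with richness 5, all broadleaf: grey levels stay within a narrow range.
broadleaf_plot = [
    [5, 6, 6, 5],
    [6, 5, 5, 6],
    [5, 5, 6, 6],
    [6, 6, 5, 5],
]

print(glcm_dissimilarity(mixed_plot), glcm_dissimilarity(broadleaf_plot))
```

Under these assumed values the less species-rich plot yields the higher dissimilarity, which mirrors the argument that texture responds to structural contrast rather than to the number of species.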
4.4. Moving forward: challenges, recommendations, and limitations
In this study, we identified several challenges in distinguishing spectral signatures and metrics corresponding to varying levels of tree diversity. These challenges stem from the varied influences of multiple factors that determine these spectral signatures. Despite these challenges, our findings also highlight the potential of biodiversity indices based on broad vegetation classes. Central to our recommendations for future research moving forward is the use of ground-measured biodiversity indices that are fundamentally relatable to the remote sensing-derived spectral metrics, comprising the three components of biodiversity: taxonomic diversity, structural biodiversity and functional diversity.
Taxonomic diversity: Previous studies have mostly characterized taxonomic diversity by biodiversity indices that rely on the number of species in a plot, such as species richness and Shannon diversity. The challenge with these approaches is that plots with the same species counts can have vastly different species compositions, leading to highly variable spectral signatures. This variation complicates the spectral identification of distinct classes of tree diversity indices based solely on species counts. Furthermore, even with identical species combinations, the spectral reflectance and diversity can be influenced by factors such as the age, health, and phenology of the individual plants (Rocchini et al. 2022a). Our analysis of spectral signatures revealed large overlaps across various levels of species richness, Shannon diversity, and functional dispersion, posing challenges for the differentiation of groups of high versus low values of these biodiversity indices. Percent conifer, however, did show spectral separability, which likely explains why the unsupervised clustering and RF models were more effective for identifying percent conifer, compared to the other biodiversity indices. Our findings echo the challenges in applying the Spectral Variation Hypothesis at this scale, and potentially more broadly, especially for accurately detecting biodiversity indices such as species richness and Shannon diversity that rely only on species counts (Fassnacht et al. 2022; Perrone et al. 2023; Schmidtlein and Fassnacht 2017). Yet, our results do demonstrate that the Sentinel-2-derived spectral signatures allow for the identification of different levels of taxonomic composition (percent conifer). Future research should further investigate the use of biodiversity indices related to the relative abundance or composition of specific taxonomic classes rather than the number of species. 
The spectral signature is determined by the species composition rather than by the number of species, as demonstrated by the strong performance of the percent conifer index, which represents the relative abundance at a high taxonomic level (division). Our findings are consistent with those reported by Olmo et al. (2024), confirming that spectral diversity can serve as an effective tool for distinguishing among broad vegetation classes or ‘spectral communities’ (Rocchini et al. 2021), and with those of Fassnacht et al. (2022), who state that differences in reflectance serve as a more reliable indicator of species composition than simply the number or richness of species. Future research should also explore the potential of relative abundance indices at lower taxonomic levels such as genus and family, and biodiversity indices related to functional group compositions (e.g., grass, trees, shrubs, C3/C4 plants).
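The contrast between count-based and composition-based indices can be shown with a short sketch. Two hypothetical plots with the same species richness and the same Shannon diversity differ completely in percent conifer. The species lists, the stem counts, the conifer lookup set, and the choice to compute percent conifer from stem counts are all assumptions for illustration; the study's own definition may use a different abundance measure.

```python
import math

# Hypothetical lookup of which species are conifers.
CONIFERS = {"Picea mariana", "Abies balsamea"}


def species_richness(counts):
    """Number of species with at least one stem in the plot."""
    return sum(1 for n in counts.values() if n > 0)


def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over species proportions p_i."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def percent_conifer(counts, conifers=CONIFERS):
    """Share of stems belonging to conifer species, in percent."""
    total = sum(counts.values())
    return 100.0 * sum(n for sp, n in counts.items() if sp in conifers) / total


# Two hypothetical plots: identical counts per species, different composition.
plot_a = {"Picea mariana": 10, "Abies balsamea": 10}
plot_b = {"Acer saccharum": 10, "Betula alleghaniensis": 10}

for name, plot in (("A", plot_a), ("B", plot_b)):
    print(name, species_richness(plot), round(shannon_diversity(plot), 3), percent_conifer(plot))
```

Richness and Shannon diversity cannot separate these two plots, whereas percent conifer places them at opposite ends of its range, which is the kind of compositional difference a spectral signature is more likely to register.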
Structural diversity: Structural diversity can be characterized by indices related to the spatial arrangement of the trees such as leaf area index, foliage density, branch architecture, and diameter at breast height variance. As a measure of the spatial arrangement of pixel intensities, the spectral information content metrics (e.g., texture) may be better suited to estimate structural diversity parameters, rather than tree taxonomic diversity indices (Ozdemir and Karnieli 2011; Zhou et al. 2017). Yet, most studies have explored the relationship between texture measures and taxonomic diversity indices. In our analysis, the texture measures were part of the final VSURF variables, but they were chosen less often than the vegetation indices and individual bands. In the absence of ground measurements of structural diversity, we related the texture metrics to taxonomic diversity. Future research should explore the relationships between texture metrics and structural diversity indices.
Functional diversity: Functional diversity represents the variety of biological functions performed by plants, such as photosynthesis and CO2 sequestration (Thom et al. 2021). It can be measured using indices that reflect the diversity of a plant’s functional traits, that is, traits that affect its growth, survival, and reproduction. These traits include biochemical aspects (e.g., nitrogen and chlorophyll content), morphological features such as leaf area, plant height, and biomass, physiological properties including photosynthetic rates, and phenological characteristics such as timing of flowering, fruiting, and leaf unfolding. Earlier studies have evaluated relationships between functional diversity indices (e.g., functional dispersion, richness, and evenness) and spectral reflectance and vegetation indices; however, most reported relationships were weak to moderate (Crofts et al. 2024; Peña-Lara et al. 2022). While functional dispersion is a species-specific index, our analysis of spectral signatures also revealed large overlaps across the different levels of FD, weak correlations with spectral variables, and low predictive power of the RF models. It is possible that the spatial and spectral resolutions of the Sentinel data are too coarse to capture the diversity of the traits that determine the functional diversity. Future research should further investigate the impact of image spectral and spatial resolution on estimating functional diversity. However, the observed weak relationships may also reflect the inherent complexities of functional diversity, indicating a need for further research into identifying the most suitable functional diversity indices that correlate with spectral metrics. Additionally, employing radiative transfer models (RTMs) could enhance the estimation and determination of relevant plant traits.
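As a reminder of what the functional dispersion index measures, the sketch below follows the general definition of Laliberté and Legendre (2010): the abundance-weighted mean distance of the species in a plot to their abundance-weighted centroid in trait space. It assumes that traits are quantitative and already standardized, and it uses plain Euclidean distance; the trait values, species names, and function names are hypothetical, and this simplified version is not the implementation used in the study.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def functional_dispersion(traits, abundances):
    """Abundance-weighted mean distance of species to the weighted trait centroid.

    traits: species -> tuple of standardized trait values (assumed quantitative).
    abundances: species -> abundance in the plot (e.g., stem count).
    """
    species = [s for s, a in abundances.items() if a > 0]
    total = sum(abundances[s] for s in species)
    n_traits = len(traits[species[0]])
    centroid = [
        sum(abundances[s] * traits[s][k] for s in species) / total
        for k in range(n_traits)
    ]
    return sum(abundances[s] * euclidean(traits[s], centroid) for s in species) / total


# Hypothetical standardized traits: (plant height, leaf area, leaf nitrogen per dry mass).
traits = {
    "species_1": (-1.0, -0.5, -0.8),
    "species_2": (1.0, 0.5, 0.8),
    "species_3": (0.9, 0.6, 0.7),
}

# Two plots with richness 2: functionally different species versus similar species.
plot_contrasting = {"species_1": 10, "species_2": 10}
plot_similar = {"species_2": 10, "species_3": 10}

print(functional_dispersion(traits, plot_contrasting))
print(functional_dispersion(traits, plot_similar))
```

Because the index depends on where species sit in trait space rather than on how many species are present, two plots with equal richness can differ strongly in functional dispersion.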
In our study, in the absence of ground data, we used three traits from the Enhanced Species-Level Trait Dataset: plant height, leaf area, and leaf nitrogen content per dry mass. Considering that these were all leaf-specific traits, the inclusion of canopy-level traits in future studies could be beneficial. Hauser et al. (2021b), for example, showed the importance of using a combination of canopy traits (leaf area index) and leaf traits (chlorophyll content and leaf mass per area). These traits can be estimated through RTMs, but also through a combination of vegetation indices that can characterize these specific traits (Helfenstein et al. 2022). Furthermore, RTMs, when inverted, offer a method to assess how effectively satellite imagery can infer functional diversity. Using RTMs, Pacheco-Labrador et al. (2022) showed that the type of functional diversity index and the spatial resolution matter. They highlighted Rao’s Q index, functional dispersion, and functional richness as the best-performing indices. Remote sensing-based phenology is another approach that can be explored for estimating functional diversity. Phenology metrics such as the length of the growing season, the peak of the growing season, and the cumulative area under the time series curve have been used to estimate ecosystem functions such as photosynthesis and carbon uptake. Phenological diversity has been quantified (Sánchez-Ochoa et al. 2022) and linked to functional diversity (Filippa et al. 2018).
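The phenology metrics named above can be derived from a vegetation index time series in a few lines. The sketch below is one simple way to do so, under stated assumptions: the growing season is delimited by the first and last observations at or above a fraction of the seasonal amplitude (50% here), with no interpolation between dates, and the cumulative area is obtained with the trapezoidal rule. The time series values, the threshold rule, and the function name are hypothetical and do not represent a method used in the study.

```python
def phenology_metrics(doy, vi, threshold_fraction=0.5):
    """Peak, season length, and cumulative area of a vegetation index time series.

    doy: observation dates as day of year, in increasing order.
    vi: vegetation index value at each date.
    threshold_fraction: share of the seasonal amplitude that marks the growing
        season (an assumed rule; other definitions exist).
    """
    peak_index = max(range(len(vi)), key=lambda i: vi[i])
    base = min(vi)
    threshold = base + threshold_fraction * (vi[peak_index] - base)

    # Dates on which the index is at or above the threshold.
    in_season = [d for d, v in zip(doy, vi) if v >= threshold]
    season_length = in_season[-1] - in_season[0]

    # Trapezoidal integration of the curve over the whole series.
    area = sum(
        (doy[i + 1] - doy[i]) * (vi[i] + vi[i + 1]) / 2.0
        for i in range(len(vi) - 1)
    )
    return {
        "peak_value": vi[peak_index],
        "peak_doy": doy[peak_index],
        "season_length_days": season_length,
        "cumulative_area": area,
    }


# Hypothetical NDVI observations at 30-day intervals.
doy = [100, 130, 160, 190, 220, 250, 280]
ndvi = [0.2, 0.4, 0.7, 0.8, 0.7, 0.4, 0.2]

metrics = phenology_metrics(doy, ndvi)
print(metrics)
```

With coarse revisit intervals the season length is only resolved to the spacing of the observations, so denser or interpolated time series would be needed for finer estimates.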
Overall, the primary focus going forward should be to understand the relationship, or lack thereof, between ground-measured biodiversity indices (whether taxonomic, structural, or functional) and remote sensing-derived spectral metrics. This includes, but is not limited to, examining taxonomic indices through relative abundance, structural indices through texture analysis, and functional indices through RTMs and phenology models (Figure 10).
Figure 10. Illustration of the integration of plant, canopy, and environmental properties with remote sensing spectral metrics to derive tree diversity ground indices. The left panel highlights key properties, including leaf, canopy, and environmental variables. The middle panel describes the use of spectral diversity metrics derived from remote sensing, such as spectral indices, texture analysis, and spectral clustering techniques. The right panel illustrates diversity indices from the three main components of diversity that may be detectable using these spectral metrics: species diversity (species abundance, functional group composition), structural diversity (leaf area index variability, diameter variability), and functional diversity (functional dispersion, phenology). Colors of the shapes in the right panel represent different species while different shapes represent structure. Various models and approaches, including regression, classification, machine learning, phenology, and radiative transfer models, can be used to enhance the understanding of tree diversity from remote sensing data.
4.5. Limitations
The reflectance of a plant is affected by multiple factors including structure, biochemistry and physiology, health, and phenology, which can also change over the course of the plant’s life (Lausch et al. 2016; Ustin and Gamon 2010). To effectively map species diversity, there must be biochemical or structural differences between species or species groups that lead to distinguishable plant signals or ‘optical types’ (Ustin and Gamon 2010; Rocchini et al. 2022a, 2022). The spectral diversity will be further affected by the spatial and spectral resolutions of the sensor, the plot’s understory vegetation, and the surrounding environment (soil, topography). It is also recognized that the correlation between spectral diversity and biodiversity tends to strengthen as plot sizes increase (Oldeland et al. 2010; Rocchini et al. 2004; Torresani et al. 2019). Given the relatively modest size of our plots, which are around 400 square meters, the observed relationship between our spectral diversity measures and the ground data might be less pronounced. In addition, there was no information about the understory vegetation in the plots, which can affect the plot reflectance. Moreover, combining data from different seasons has been shown to enhance the predictive power of models that estimate tree species diversity (Liu et al. 2023; Xi et al. 2023). However, the extent to which the inclusion of multitemporal variables enhances model performance was not specifically evaluated in this study. Future research in this region should investigate the impact of plot sizes and the understory vegetation, seasonal models, and the sensor spatial and spectral resolutions to identify the most effective scale for analysis, as higher resolutions may better allow the detection of subtle differences between the various tree species.
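The plot-size limitation can be put in numbers with a short calculation. Sentinel-2 acquires its visible and NIR bands at 10 m and its red-edge and SWIR bands at 20 m native resolution, so a plot of about 400 square meters corresponds to only a handful of pixel footprints. The sketch below assumes native resolutions and ignores how pixels align with plot boundaries; whether the bands were resampled in the study is not stated in this section, and the function name is hypothetical.

```python
def pixel_equivalents(plot_area_m2, pixel_size_m):
    """Number of pixel footprints whose combined area equals the plot area."""
    return plot_area_m2 / (pixel_size_m ** 2)


PLOT_AREA_M2 = 400.0  # approximate plot size stated in the text

pixels_at_10m = pixel_equivalents(PLOT_AREA_M2, 10.0)  # visible and NIR bands
pixels_at_20m = pixel_equivalents(PLOT_AREA_M2, 20.0)  # red-edge and SWIR bands

print(pixels_at_10m, pixels_at_20m)
```

A standard deviation or coefficient of variation computed from so few values is a coarse estimate of within-plot spectral variability, which is consistent with the expectation that larger plots or finer pixels would strengthen the observed relationships.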
5. Conclusion
This research aimed to improve our understanding of the challenges and opportunities for effectively monitoring forest biodiversity using spectral diversity metrics. Based on Sentinel-2 data and the large number of inventory plots, we were able to identify key challenges and recommendations for research moving forward. Altogether, detecting species richness, evenness, and functional diversity using spectral variability is challenging due to factors such as spatial and spectral resolution from the remote sensing perspective, and interspecific and intraspecific variations and similarities among species from the vegetation perspective. Looking ahead, the exploration of continuous species-specific (relative abundance) indices, such as percent conifer, tailored to specific study areas is promising. This approach aligns with one of the core objectives of the original SVH, challenging the rigid boundaries established by traditional classification methods. As critiqued in the foundational work by Palmer et al. (2000), these conventional approaches often overlook the continuous gradient that exists between classes, failing to capture the natural transitions in vegetation. Changes in the composition of broader vegetation classes can help us understand various ecological processes such as responses to disturbance or adaptations to environmental changes. By identifying areas experiencing rapid shifts or those remaining static, we can enhance our understanding of these dynamics, which is essential for developing strategies aimed at combating biodiversity loss and addressing the effects of climate change.
Authorship contribution statement
Jennifer Donnini: Conceptualization, Methodology, Data curation, Formal analysis, Visualization, Writing – original draft. Angela Kross: Conceptualization, Methodology, Writing – Review & Editing, Supervision. Camilo Alejo: Conceptualization, Writing – Review & Editing.
Open research statement
Data availability statement: The ground data used in the study are from the Ministère des Forêts, de la Faune et des Parcs, Secteur des forêts, Direction des inventaires forestiers (inventaires.forestiers@mffp.gouv.qc.ca). The cleaned, extracted data will be made available through Concordia’s Dataverse (Concordia University Dataverse, borealisdata.ca, doi:10.5683/SP3/U8TI1E). All R code scripts required to replicate the analysis will be made available on request. The satellite-derived metrics can be reproduced in Google Earth Engine (GEE); the GEE script will be made available through a shared link: https://code.earthengine.google.com/9e5e0ac55756b8d88f7d-63c4f559e888?noload=true.
Supplemental material
View supplemental material on the original article web page: https://www.tandfonline.com/doi/full/10.1080/07038992.2024.2403495?scroll=top&needAccess=true#supplemental-material-section
View and download PDF version of this article: https://www.tandfonline.com/doi/pdf/10.1080/07038992.2024.2403495
Acknowledgement
This work has been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Alliance grant (ALLPR 571271‐21). Jennifer Donnini was supported by the Fonds de recherche du Québec master’s training scholarship. Angela Kross was supported by the NSERC Alliance grant, and Camilo Alejo was supported by the Horizon Postdoctoral fellowship from Concordia University.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
- Arekhi, M., Yılmaz, O.Y., Yılmaz, H., and Akyüz, Y.F. 2017. “Can tree species diversity be assessed with Landsat data in a temperate forest?” Environmental Monitoring and Assessment, Vol. 189 (No. 11):pp. 586. doi:10.1007/s10661-017-6295-6.
- Boehmke, B., and Greenwell, B. 2019. Hands-On Machine Learning with R. Boca Raton, FL, USA: Chapman and Hall/CRC. doi:10.1201/9780367816377.
- Boulanger, Y., Pascual, J., Bouchard, M., D’Orangeville, L., Périé, C., and Girardin, M.P. 2022. “Multi-model projections of tree species performance in Quebec, Canada under future climate change.” Global Change Biology, Vol. 28 (No. 5):pp. 1884–1902. doi:10.1111/gcb.16014.
- Boulanger, Y., Puigdevall, J.P., Bélisle, A.C., Bergeron, Y., Brice, M.-H., Cyr, D., De Grandpré, L., et al. 2023. “A regional integrated assessment of the impacts of climate change and of the potential adaptation avenues for Quebec’s forests.” Canadian Journal of Forest Research, Vol. 53 (No. 8):pp. 556–578. doi:10.1139/cjfr-2022-0282.
- Bracy Knight, K., Seddon, E.S., and Toombs, T.P. 2020. “A framework for evaluating biodiversity mitigation metrics.” Ambio, Vol. 49 (No. 6):pp. 1232–1240. doi:10.1007/s13280-019-01266-y.
- Chavardès, R.D., Balducci, L., Bergeron, Y., Grondin, P., Poirier, V., Morin, H., and Gennaretti, F. 2023. “Greater tree species diversity and lower intraspecific competition attenuate impacts from temperature increases and insect epidemics in boreal forests of western Quebec, Canada.” Canadian Journal of Forest Research, Vol. 53 (No. 1):pp. 48–59. doi:10.1139/cjfr-2022-0114.
- Chen, X., Taylor, A.R., Reich, P.B., Hisano, M., Chen, H.Y.H., and Chang, S.X. 2023. “Tree diversity increases decadal forest soil carbon and nitrogen accrual.” Nature, Vol. 618 (No. 7963):pp. 94–101. doi:10.1038/s41586-023-05941-9.
- Coops, N.C., Wulder, M.A., and Iwanicka, D. 2009. “An environmental domain classification of Canada using earth observation data for biodiversity assessment.” Ecological Informatics, Vol. 4 (No. 1):pp. 8–22. doi:10.1016/j.ecoinf.2008.09.005.
- Crofts, A.L., Wallis, C.I., St‐Jean, S., Demers‐Thibeault, S., Inamdar, D., Arroyo‐Mora, J.P., Kalacska, M., Laliberté, E., and Vellend, M. 2024. “Linking aerial hyperspectral data to canopy tree biodiversity: An examination of the spectral variation hypothesis.” Ecological Monographs, Vol. 94 (No. 3):pp. 1–23. doi:10.1002/ecm.1605.
- Díaz, S., Kattge, J., Cornelissen, J.H.C., Wright, I.J., Lavorel, S., Dray, S., Reu, B., et al. 2022. “The global spectrum of plant form and function: Enhanced species-level trait dataset.” Scientific Data, Vol. 9 (No. 1):pp. 755. doi:10.1038/s41597-022-01774-9.
- Ellison, A.M., Bank, M.S., Clinton, B.D., Colburn, E.A., Elliott, K., Ford, C.R., Foster, D.R., et al. 2005. “Loss of foundation species: Consequences for the structure and dynamics of forested ecosystems.” Frontiers in Ecology and the Environment, Vol. 3 (No. 9):pp. 479–486. doi:10.1890/1540-9295(2005)003[0479:LOFSCF]2.0.CO;2.
- Engber, E.A., Varner, J.M., Arguello, L.A., and Sugihara, N.G. 2011. “The effects of conifer encroachment and overstory structure on fuels and fire in an oak woodland landscape.” Fire Ecology, Vol. 7 (No. 2):pp. 32–50. doi:10.4996/fireecology.0702032.
- Esri 2021. ArcGIS Pro (Version 2.8.8) [Software]. https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview
- Farwell, L.S., Gudex-Cross, D., Anise, I.E., Bosch, M.J., Olah, A.M., Radeloff, V.C., Razenkova, E., et al. 2021. “Satellite image texture captures vegetation heterogeneity and explains patterns of bird richness.” Remote Sensing of Environment, Vol. 253 pp. 112175. doi:10.1016/j.rse.2020.112175.
- Fassnacht, F.E., Müllerová, J., Conti, L., Malavasi, M., and Schmidtlein, S. 2022. “About the link between biodiversity and spectral variation.” Applied Vegetation Science, Vol. 25 (No. 1):pp. e12643. doi:10.1111/avsc.12643.
- Feeney, C.J., Cosby, B.J., Robinson, D.A., Thomas, A., Emmett, B.A., and Henrys, P. 2022. “Multiple soil map comparison highlights challenges for predicting topsoil organic carbon concentration at national scale.” Scientific Reports, Vol. 12 (No. 1):pp. 1379. doi:10.1038/s41598-022-05476-5.
- Féret, J.-B., and Asner, G.P. 2014. “Mapping tropical forest canopy diversity using high-fidelity imaging spectroscopy.” Ecological Applications: a Publication of the Ecological Society of America, Vol. 24 (No. 6):pp. 1289–1296. doi:10.1890/13-1824.1.
- Filippelli, S.K., Vogeler, J.C., Falkowski, M.J., and Meneguzzo, D.M. 2020. “Monitoring conifer cover: Leaf-off lidar and image-based tracking of eastern redcedar encroachment in central Nebraska.” Remote Sensing of Environment, Vol. 248 pp. 111961. doi:10.1016/j.rse.2020.111961.
- Filippa, G., Cremonese, E., Galvagno, M., Migliavacca, M., Römermann, C., Bucher, S., and Musavi, T. 2018. Phenological diversity is linked to the diversity of functional traits in alpine grasslands. In: F.-S.-U. Jena (Eds.), ICEI 2018: 10th International Conference on Ecological Informatics-Translating Ecological Data into Knowledge and Decisions in a Rapidly Changing World, 24-28 September, 2018, Jena, Germany. doi:10.22032/dbt.37985.
- Frye, H.A., Aiello-Lammens, M.E., Euston-Brown, D., Jones, C.S., Kilroy Mollmann, H., Merow, C., Slingsby, J.A., van der Merwe, H., Wilson, A.M., and Silander Jr, J.A. 2021. “Plant spectral diversity as a surrogate for species, functional and phylogenetic diversity across a hyper-diverse biogeographic region.” Global Ecology and Biogeography, Vol. 30 (No. 7):pp. 1403–1417. doi:10.1111/geb.13306.
- Genuer, R., Poggi, J.-M., and Tuleau-Malot, C. 2010. “Variable selection using random forests.” Pattern Recognition Letters, Vol. 31 (No. 14):pp. 2225–2236. doi:10.1016/j.patrec.2010.03.014.
- Genuer, R., Poggi, J.-M., and Tuleau-Malot, C. 2015. “VSURF: an R Package for variable selection using random forests.” The R Journal, Vol. 7 (No. 2):pp. 19–33. doi:10.32614/RJ-2015-018.
- Gillis, M.D., Omule, A.Y., and Brierley, T. 2005. “Monitoring Canada’s forests: The National forest inventory.” The Forestry Chronicle, Vol. 81 (No. 2):pp. 214–221. doi:10.5558/tfc81214-2.
- Goldblum, D., and Rigg, L.S. 2010. “The deciduous forest – boreal forest ecotone.” Geography Compass, Vol. 4 (No. 7):pp. 701–717. doi:10.1111/j.1749-8198.2010.00342.x.
- Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R. 2017. “Google Earth Engine: Planetary-scale geospatial analysis for everyone.” Remote Sensing of Environment, Vol. 202 pp. 18–27. doi:10.1016/j.rse.2017.06.031.
- Hall-Beyer, M. 2017a. “Practical guidelines for choosing GLCM textures to use in landscape classification tasks over a range of moderate spatial scales.” International Journal of Remote Sensing, Vol. 38 (No. 5):pp. 1312–1338. doi:10.1080/01431161.2016.1278314.
- Hall-Beyer, M. 2017b. GLCM Texture: A Tutorial v. 1.0 through 2.7. https://prism.ucalgary.ca/server/api/core/bitstreams/8f9de234-cc94-401d-b701-f08ceee6cfdf/content
- Haralick, R.M., Shanmugam, K., and Dinstein, I. 1973. “Textural features for image classification.” IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-3 (No. 6):pp. 610–621. doi:10.1109/TSMC.1973.4309314.
- Hauser, L.T., Timmermans, J., van der Windt, N., Sil, Â.F., César de Sá, N., Soudzilovskaia, N.A., and van Bodegom, P.M. 2021a. “Explaining discrepancies between spectral and in-situ plant diversity in multispectral satellite earth observation.” Remote Sensing of Environment, Vol. 265 pp. 112684. doi:10.1016/j.rse.2021.112684.
- Hauser, L.T., Féret, J.B., Binh, N.A., van Der Windt, N., Sil, Â.F., Timmermans, J., Soudzilovskaia, N.A., and van Bodegom, P.M. 2021b. “Towards scalable estimation of plant functional diversity from Sentinel-2: In-situ validation in a heterogeneous (semi-) natural landscape.” Remote Sensing of Environment, Vol. 262 pp. 112505. doi:10.1016/j.rse.2021.112505.
- Helfenstein, I.S., Schneider, F.D., Schaepman, M.E., and Morsdorf, F. 2022. “Assessing biodiversity from space: Impact of spatial and spectral resolution on trait-based functional diversity.” Remote Sensing of Environment, Vol. 275 pp. 113024. doi:10.1016/j.rse.2022.113024.
- Hernández-Stefanoni, J.L., Gallardo-Cruz, J.A., Meave, J.A., Rocchini, D., Bello-Pineda, J., and López-Martínez, J.O. 2012. “Modeling α- and β-diversity in a tropical forest from remotely sensed and spatial data.” International Journal of Applied Earth Observation and Geoinformation, Vol. 19 pp. 359–368. doi:10.1016/j.jag.2012.04.002.
- Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G., and Jarvis, A. 2005. “WorldClim V1 BIO, Very high resolution interpolated climate surfaces for Global Land Areas.” International Journal of Climatology, Vol. 25 (No. 15):pp. 1965–1978. doi:10.1002/joc.1276.
- Hoffmann, J., Muro, J., and Dubovyk, O. 2022. “Predicting Species and Structural Diversity of Temperate Forests with Satellite Remote Sensing and Deep Learning.” Remote Sensing, Vol. 14 (No. 7):pp. 1631. doi:10.3390/rs14071631.
- Jones, T.G., Coops, N.C., and Sharma, T. 2010. “Assessing the utility of airborne hyperspectral and LiDAR data for species distribution mapping in the coastal Pacific Northwest, Canada.” Remote Sensing of Environment, Vol. 114 (No. 12):pp. 2841–2852. doi:10.1016/j.rse.2010.07.002.
- Kacic, P., and Kuenzer, C. 2022. “Forest Biodiversity Monitoring Based on Remotely Sensed Spectral Diversity—A Review.” Remote Sensing, Vol. 14 (No. 21):pp. 5363. doi:10.3390/rs14215363.
- Kuhn, M. 2008. “Building Predictive Models in R Using the caret Package.” Journal of Statistical Software, Vol. 28 (No. 5):pp. 1–26. doi:10.18637/jss.v028.i05.
- Laliberté, E., and Legendre, P. 2010. “A distance-based framework for measuring functional diversity from multiple traits.” Ecology, Vol. 91 (No. 1):pp. 299–305. doi:10.1890/08-2244.1.
- Lausch, A., Bannehr, L., Beckmann, M., Boehm, C., Feilhauer, H., Hacker, J.M., Heurich, M., et al. 2016. “Linking Earth Observation and taxonomic, structural and functional biodiversity: Local to ecosystem perspectives.” Ecological Indicators, Vol. 70 pp. 317–339. doi:10.1016/j.ecolind.2016.06.022.
- Levin, N., Shmida, A., Levanoni, O., Tamari, H., and Kark, S. 2007. “Predicting mountain plant richness and rarity from space using satellite-derived vegetation indices.” Diversity and Distributions, Vol. 13 (No. 6):pp. 692–703. doi:10.1111/j.1472-4642.2007.00372.x.
- Liang, J., Crowther, T.W., Picard, N., Wiser, S., Zhou, M., Alberti, G., Schulze, E.-D., et al. 2016. “Positive biodiversity-productivity relationship predominant in global forests.” Science (New York, N.Y.), Vol. 354 (No. 6309):pp. aaf8957. doi:10.1126/science.aaf8957.
- Liang, J., Gamarra, J.G.P., Picard, N., Zhou, M., Pijanowski, B., Jacobs, D.F., Reich, P.B., et al. 2022. “Co-limitation towards lower latitudes shapes global forest diversity gradients.” Nature Ecology & Evolution, Vol. 6 (No. 10):pp. 1423–1437. doi:10.1038/s41559-022-01831-x.
- Liaw, A., and Wiener, M. 2002. “Classification and Regression by randomForest.” R News, Vol. 2 (No. 3):pp. 18–22.
- Liu, X., Frey, J., Munteanu, C., Still, N., and Koch, B. 2023. “Mapping tree species diversity in temperate montane forests using Sentinel-1 and Sentinel-2 imagery and topography data.” Remote Sensing of Environment, Vol. 292 pp. 113576. doi:10.1016/j.rse.2023.113576.
- Ma, H., Crowther, T.W., Mo, L., Maynard, D.S., Renner, S.S., van den Hoogen, J., Zou, Y., et al. 2023. “The global biogeography of tree leaf form and habit.” Nature Plants, Vol. 9 (No. 11):pp. 1795–1809. doi:10.1038/s41477-023-01543-5.
- Madonsela, S., Cho, M.A., Ramoelo, A., and Mutanga, O. 2017. “Remote sensing of species diversity using Landsat 8 spectral variables.” ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 133 pp. 116–127. doi:10.1016/j.isprsjprs.2017.10.008.
- Madonsela, S., Cho, M.A., Ramoelo, A., and Mutanga, O. 2021. “Investigating the relationship between tree species diversity and Landsat-8 spectral heterogeneity across multiple phenological stages.” Remote Sensing, Vol. 13 (No. 13):pp. 2467. doi:10.3390/rs13132467.
- Mensah, S., Noulèkoun, F., Dimobe, K., Seifert, T., and Glèlè Kakaï, R. 2023. “Climate and soil effects on tree species diversity and aboveground carbon patterns in semi-arid tree savannas.” Scientific Reports, Vol. 13 (No. 1):pp. 11509. doi:10.1038/s41598-023-38225-3.
- Ming, L., Liu, J., Quan, Y., Li, M., Wang, B., and Wei, G. 2024. “Mapping tree species diversity in a typical natural secondary forest by combining multispectral and LiDAR data.” Ecological Indicators, Vol. 159 pp. 111711. doi:10.1016/j.ecolind.2024.111711.
- Ministere des Forets, Faune et Parcs (MFFP) 2015. Sustainable forest management strategy. 54 pages. ISBN: 978-2-550-71493-4. Available at: https://mffp.gouv.qc.ca/english/publications/forest/sustainable-forest-management-strategy.pdf (Last accessed February, 2024)
- Ministère des Ressources naturelles et des Forêts (MRNF). n.d. Our forests. Retrieved from https://mrnf.gouv.qc.ca/en/forests/#:~:text=Our%20forests%20are%20ubiquitous%20in,nearly%20half%20of%20the%20province.
- Ministère des Ressources naturelles et des Forêts, Classification écologique du territoire québécois, [Jeu de données], dans Données Québec 2016., mis à jour le 21 novembre 2023 [https://www.donneesquebec.ca/recherche/dataset/mrn]
- Ministère des Ressources naturelles et des Forêts (MRNF) 2022. Placette-échantillon permanente, [Jeu de données], dans Données Québec, mis à jour le 21 novembre 2023 [https://www.donneesquebec.ca/recherche/dataset/mrn]
- Mori, A.S., Dee, L.E., Gonzalez, A., Ohashi, H., Cowles, J., Wright, A.J., Loreau, M., et al. 2021. “Biodiversity–productivity relationships are key to nature-based climate solutions.” Nature Climate Change, Vol. 11 (No. 6):pp. 543–550. doi:10.1038/s41558-021-01062-1.
- Morneau, C., Couillard, P.-L., Laflamme, J., Major, M., Roy, V., and Veilleux, A. 2021. Classification écologique du territoire québécois 1–16 Quebec, Quebec: MFFP.
- Nie, Z., MacLean, D.A., and Taylor, A.R. 2018. “Forest overstory composition and seedling height influence defoliation of understory regeneration by spruce budworm.” Forest Ecology and Management, Vol. 409 pp. 353–360. doi:10.1016/j.foreco.2017.11.033.
- Oldeland, J., Wesuls, D., Rocchini, D., Schmidt, M., and Jürgens, N. 2010. “Does using species abundance data improve estimates of species diversity from remotely sensed spectral heterogeneity?” Ecological Indicators, Vol. 10 (No. 2):pp. 390–396. doi:10.1016/j.ecolind.2009.07.012.
- Olmo, V., Bacaro, G., Sigura, M., Alberti, G., Castello, M., Rocchini, D., and Petruzzellis, F. 2024. “Mind the gaps: Horizontal canopy structure affects the relationship between taxonomic and spectral diversity.” International Journal of Remote Sensing, Vol. 45 (No. 9):pp. 2833–2864. doi:10.1080/01431161.2024.2334776.
- Ozdemir, I., and Karnieli, A. 2011. “Predicting forest structural parameters using the image texture derived from WorldView-2 multispectral imagery in a dryland forest, Israel.” International Journal of Applied Earth Observation and Geoinformation, Vol. 13 (No. 5):pp. 701–710. doi:10.1016/j.jag.2011.05.006.
- Pacheco-Labrador, J., Migliavacca, M., Ma, X., Mahecha, M.D., Carvalhais, N., Weber, U., Benavides, R., et al. 2022. “Challenging the link between functional and spectral diversity with radiative transfer modeling and data.” Remote Sensing of Environment, Vol. 280 pp. 113170. doi:10.1016/j.rse.2022.113170.
- Palmer, M.W., Earls, P.G., Hoagland, B.W., White, P.S., and Wohlgemuth, T. 2002. “Quantitative tools for perfecting species lists.” Environmetrics, Vol. 13 (No. 2):pp. 121–137. doi:10.1002/env.516.
- Palmer, M.W., Wohlgemuth, T., Earls, P., Arévalo, J.R., and Thompson, S. 2000. Opportunities for Long-Term Ecological Research at the Tallgrass Prairie Preserve, Oklahoma. ILTER Regional Workshop, 123–128.
- Paquette, A., and Messier, C. 2011. “The effect of biodiversity on tree productivity: From temperate to boreal forests.” Global Ecology and Biogeography, Vol. 20 (No. 1):pp. 170–180. doi:10.1111/j.1466-8238.2010.00592.x.
- Peña-Lara, V.A., Dupuy, J.M., Reyes-Garcia, C., Sanaphre-Villanueva, L., Portillo-Quintero, C.A., and Hernández-Stefanoni, J.L. 2022. “Modelling species richness and functional diversity in tropical dry forests using multispectral remotely sensed and topographic data.” Remote Sensing, Vol. 14 (No. 23):pp. 5919. Article 23 doi:10.3390/rs14235919.
- Perrone, M., Di Febbraro, M., Conti, L., Divíšek, J., Chytrý, M., Keil, P., Carranza, M.L., et al. 2023. “The relationship between spectral and plant diversity: disentangling the influence of metrics and habitat types at the landscape scale.” Remote Sensing of Environment, Vol. 293 pp. 113591. doi:10.1016/j.rse.2023.113591.
- Persson, M., Lindberg, E., and Reese, H. 2018. “Tree Species Classification with Multi-Temporal Sentinel-2 Data.” Remote Sensing, Vol. 10 (No. 11):pp. 1794. Article 11 doi:10.3390/rs10111794.
- Powers, R.P., Coops, N.C., Morgan, J.L., Wulder, M.A., Nelson, T.A., Drever, C.R., and Cumming, S.G. 2013. “A remote sensing approach to biodiversity assessment and regionalization of the Canadian boreal forest.” Progress in Physical Geography: Earth and Environment, Vol. 37 (No. 1):pp. 36–62. doi:10.1177/0309133312457405.
- R Core Team 2023. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>.
- Rautiainen, M., Lukeš, P., Homolová, L., Hovi, A., Pisek, J., and Mõttus, M. 2018. “Spectral Properties of Coniferous Forests: A Review of In Situ and Laboratory Measurements.” Remote Sensing, Vol. 10 (No. 2):pp. 207. Article 2 doi:10.3390/rs10020207.
- Richardson, D.M., and Rejmánek, M. 2004. “Conifers as invasive aliens: A global survey and predictive framework.” Diversity and Distributions, Vol. 10 (No. 5–6):pp. 321–331. doi:10.1111/j.1366-9516.2004.00096.x.
- Rivers, M., Newton, A.C., Oldfield, S., and Global Tree Assessment Contributors 2023. “Scientists’ warning to humanity on tree extinctions.” Plants, People, Planet, Vol. 5 (No. 4):pp. 466–482. doi:10.1002/ppp3.10314.
- Rocchini, D., Boyd, D.S., Féret, J.-B., Foody, G.M., He, K.S., Lausch, A., Nagendra, H., Wegmann, M., and Pettorelli, N. 2016. “Satellite remote sensing to monitor species diversity: Potential and pitfalls.” Remote Sensing in Ecology and Conservation, Vol. 2 (No. 1):pp. 25–36. doi:10.1002/rse2.9.
- Rocchini, D., Chiarucci, A., and Loiselle, S.A. 2004. “Testing the spectral variation hypothesis by using satellite multispectral images.” Acta Oecologica, Vol. 26 (No. 2):pp. 117–120. doi:10.1016/j.actao.2004.03.008.
- Rocchini, D., Salvatori, N., Beierkuhnlein, C., Chiarucci, A., de Boissieu, F., Förster, M., Garzon-Lopez, C.X., et al. 2021. “From local spectral species to global spectral communities: A benchmark for ecosystem diversity estimate by remote sensing.” Ecological Informatics, Vol. 61 pp. 101195. doi:10.1016/j.ecoinf.2020.101195.
- Rocchini, D., Santos, M.J., Ustin, S.L., Féret, J.-B., Asner, G.P., Beierkuhnlein, C., Dalponte, M., et al. 2022a. “The spectral species concept in living color.” Journal of Geophysical Research: Biogeosciences, Vol. 127 (No. 9):pp. e2022JG007026. doi:10.1029/2022JG007026.
- Rocchini, D., Torresani, M., Beierkuhnlein, C., Feoli, E., Foody, G.M., Lenoir, J., Malavasi, M., Moudrý, V., Šímová, P., and Ricotta, C. 2022b. “Double down on remote sensing for biodiversity estimation: A biological mindset.” Community Ecology, Vol. 23 (No. 3):pp. 267–276. doi:10.1007/s42974-022-00113-7.
- Sánchez-Ochoa, D., González, E.J., del Coro Arizmendi, M., Koleff, P., Martell-Dubois, R., Meave, J.A., and Pérez-Mendoza, H.A. 2022. “Quantifying phenological diversity: a framework based on Hill numbers theory.” PeerJ, Vol. 10 pp. e13412. doi:10.7717/peerj.13412.
- Schmidtlein, S., and Fassnacht, F.E. 2017. “The spectral variability hypothesis does not hold across landscapes.” Remote Sensing of Environment, Vol. 192 pp. 114–125. doi:10.1016/j.rse.2017.01.036.
- SIIGSOL 2022. SIIGSOL-100m: Map of soil properties. Ministry of Natural Resources and Forests, [Dataset], in Data Quebec, updated June 26, 2023 [https://www.donneesquebec.ca/recherche/dataset/mrn] (accessed November 20, 2023)
- Spohn, M., Bagchi, S., Biederman, L.A., Borer, E.T., Bråthen, K.A., Bugalho, M.N., Caldeira, M.C., et al. 2023. “The positive effect of plant diversity on soil carbon depends on climate.” Nature Communications, Vol. 14 (No. 1):pp. 6624. doi:10.1038/s41467-023-42340-0.
- Stein, A., Gerstner, K., and Kreft, H. 2014. “Environmental heterogeneity as a universal driver of species richness across taxa, biomes and spatial scales.” Ecology Letters, Vol. 17 (No. 7):pp. 866–880. doi:10.1111/ele.12277.
- Storch, F., Dormann, C.F., and Bauhus, J. 2018. “Quantifying forest structural diversity based on large-scale inventory data: A new approach to support biodiversity monitoring.” Forest Ecosystems, Vol. 5 (No. 1):pp. 34. doi:10.1186/s40663-018-0151-1.
- Thom, D., Taylor, A.R., Seidl, R., Thuiller, W., Wang, J., Robideau, M., and Keeton, W.S. 2021. “Forest structure, not climate, is the primary driver of functional diversity in northeastern North America.” The Science of the Total Environment, Vol. 762 pp. 143070. doi:10.1016/j.scitotenv.2020.143070.
- Thorndike, R.L. 1953. “Who belongs in the family?” Psychometrika, Vol. 18 (No. 4):pp. 267–276. doi:10.1007/BF02289263.
- Torresani, M., Rocchini, D., Sonnenschein, R., Zebisch, M., Marcantonio, M., Ricotta, C., and Tonon, G. 2019. “Estimating tree species diversity from space in an alpine conifer forest: The Rao’s Q diversity index meets the spectral variation hypothesis.” Ecological Informatics, Vol. 52 pp. 26–34. doi:10.1016/j.ecoinf.2019.04.001.
- Turner, W. 2014. “Sensing biodiversity.” Science (New York, N.Y.), Vol. 346 (No. 6207):pp. 301–302. doi:10.1126/science.1256014.
- Ustin, S.L., and Gamon, J.A. 2010. “Remote sensing of plant functional types.” The New Phytologist, Vol. 186 (No. 4):pp. 795–816. doi:10.1111/j.1469-8137.2010.03284.x.
- Wang, R., and Gamon, J.A. 2019. “Remote sensing of terrestrial plant biodiversity.” Remote Sensing of Environment, Vol. 231 pp. 111218. doi:10.1016/j.rse.2019.111218.
- Wang, R., Gamon, J.A., Montgomery, R.A., Townsend, P.A., Zygielbaum, A.I., Bitan, K., Tilman, D., and Cavender-Bares, J. 2016. “Seasonal variation in the NDVI–species richness relationship in a prairie grassland experiment (Cedar Creek).” Remote Sensing, Vol. 8 (No. 2):pp. 128. doi:10.3390/rs8020128.
- Warren, S.D., Alt, M., Olson, K.D., Irl, S.D.H., Steinbauer, M.J., and Jentsch, A. 2014. “The relationship between the spectral diversity of satellite imagery, habitat heterogeneity, and plant species richness.” Ecological Informatics, Vol. 24 pp. 160–168. doi:10.1016/j.ecoinf.2014.08.006.
- Wirth, C. 2005. Fire Regime and Tree Diversity in Boreal Forests: Implications for the Carbon Cycle. In M. Scherer-Lorenzen, C. Körner, & E.-D. Schulze (Eds.), Forest Diversity and Function: Temperate and Boreal Systems (Vol. 176; pp. 309–344). Berlin, Heidelberg: Springer. doi:10.1007/3-540-26599-6_15.
- Wood, E.M., Pidgeon, A.M., Radeloff, V.C., and Keuler, N.S. 2013. “Image Texture Predicts Avian Density and Species Richness.” PLOS One, Vol. 8 (No. 5):pp. e63211. doi:10.1371/journal.pone.0063211.
- Wood, E.M., Pidgeon, A.M., Radeloff, V.C., and Keuler, N.S. 2012. “Image texture as a remotely sensed measure of vegetation structure.” Remote Sensing of Environment, Vol. 121 pp. 516–526. doi:10.1016/j.rse.2012.01.003.
- Xi, Y., Zhang, W., Brandt, M., Tian, Q., and Fensholt, R. 2023. “Mapping tree species diversity of temperate forests using multi-temporal Sentinel-1 and -2 imagery.” Science of Remote Sensing, Vol. 8 pp. 100094. doi:10.1016/j.srs.2023.100094.
- Yang, R., Wang, L., Tian, Q., Xu, N., and Yang, Y. 2021. “Estimation of the Conifer-Broadleaf Ratio in Mixed Forests Based on Time-Series Data.” Remote Sensing, Vol. 13 (No. 21):pp. 4426. doi:10.3390/rs13214426.
- Zanaga, D., Van De Kerchove, R., De Keersmaecker, W., Souverijns, N., Brockmann, C., Quast, R., Wevers, J., et al. 2021. ESA WorldCover 10 m 2020 v100. doi:10.5281/zenodo.5571936.
- Zeng, Y., Hao, D., Huete, A., Dechant, B., Berry, J., Chen, J.M., Joiner, J., et al. 2022. “Optical vegetation indices for monitoring terrestrial ecosystems globally.” Nature Reviews Earth & Environment, Vol. 3 (No. 7):pp. 477–493. doi:10.1038/s43017-022-00298-5.
- Zhou, J., Yan Guo, R., Sun, M., Di, T.T., Wang, S., Zhai, J., and Zhao, Z. 2017. “The Effects of GLCM parameters on LAI estimation using texture values from Quickbird Satellite Imagery.” Scientific Reports, Vol. 7 (No. 1):pp. 7366. doi:10.1038/s41598-017-07951-w.