A Comparative Analysis of Kriging and Machine Learning-based Spatial Interpolation Models for Chlorophyll-a Estimation in the Chesapeake Bay

Jacob Aronow; Rakshita Chidananda; Chaowei Yang

doi:10.13021/jssr2025.5354

Authors

Jacob Aronow, NSF Spatiotemporal Innovation Center, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
Rakshita Chidananda, NSF Spatiotemporal Innovation Center, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
Chaowei Yang, NSF Spatiotemporal Innovation Center, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA

DOI:

https://doi.org/10.13021/jssr2025.5354

Abstract

Chlorophyll-a (chl-a) is a key indicator of water quality in coastal and estuarine ecosystems. In regions like the Chesapeake Bay, elevated chl-a levels often signal harmful algal blooms and biomass accumulation. However, satellite-based chl-a observations are frequently obscured by cloud cover, which limits their use in continuous coastal monitoring. Spatial interpolation models, which estimate values at unsampled locations from existing data, offer one way to fill these gaps. Existing research has examined the effectiveness of different interpolation models for the bay’s salinity and temperature, but few studies have investigated their application to chl-a. In this study, we evaluate six models for interpolating chl-a concentrations: three kriging-based models, namely universal kriging (UK), ordinary kriging (OK), and empirical Bayesian kriging (EBK), and three machine learning-based models, namely k-Nearest Neighbor (KNN), Extra Trees (ET), and XGBoost. Using over 5 million remotely sensed observations across nine days in early 2025, we find that EBK performs best among the kriging-based models, while ET performs best among the machine learning models. Our results also show that kriging-based models outperform machine learning models in data-rich conditions, whereas machine learning models are more adaptable and accurate in data-sparse conditions. Because data availability varies significantly from day to day with cloud cover, these findings suggest that no single model is universally optimal. Integrating both approaches in a hybrid framework may improve the continuity and reliability of chl-a monitoring in coastal regions.
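To make the idea of spatial interpolation under data-rich and data-sparse conditions concrete, the following is a minimal sketch of an unweighted k-nearest-neighbor interpolator evaluated at two sampling densities. It is not the authors' implementation: the function names, the value of k, the sampling sizes, and the smooth synthetic surface standing in for chl-a are all assumptions made for illustration, and no real observations are used.

```python
import math
import random


def knn_interpolate(samples, query, k=5):
    """Estimate the value at `query` as the mean of its k nearest samples.

    samples: list of ((x, y), value) tuples; query: (x, y) tuple.
    """
    if not samples:
        raise ValueError("need at least one sample")
    nearest = sorted(samples, key=lambda s: math.dist(s[0], query))[:k]
    return sum(value for _, value in nearest) / len(nearest)


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def synthetic_field(x, y):
    # Hypothetical smooth surface standing in for chl-a; not real data.
    return 5.0 + 3.0 * math.sin(x) + 2.0 * math.cos(y)


def evaluate(n_train, n_test=200, k=5, seed=0):
    """Fit on n_train random points and score on n_test held-out points."""
    rng = random.Random(seed)

    def draw(n):
        points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
        return [(p, synthetic_field(*p)) for p in points]

    train, test = draw(n_train), draw(n_test)
    predictions = [knn_interpolate(train, point, k) for point, _ in test]
    return rmse(predictions, [value for _, value in test])


# Assumed sample sizes: many valid pixels (clear day) vs. few (cloudy day).
dense_rmse = evaluate(n_train=2000)
sparse_rmse = evaluate(n_train=50)
print(f"dense RMSE: {dense_rmse:.3f}, sparse RMSE: {sparse_rmse:.3f}")
```

Scoring each model on held-out points at several sampling densities, as sketched here for a single model, is one way a comparison between data-rich and data-sparse days could be set up. The abstract does not state the study's actual validation scheme or error metric.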
