A Spatiotemporal Transformer Architecture for Long-Term Post-Wildfire Vegetation Recovery Forecasting

Authors

  • Svaran Medavarapu, Center for Spatial Information Science and Systems, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • Ziheng Sun, Center for Spatial Information Science and Systems, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA

DOI:

https://doi.org/10.13021/jssr2025.5329

Abstract

The increasing frequency and scale of wildfires in regions such as California present a critical challenge for ecological management. Predicting post-fire vegetation recovery is essential for restoration efforts, yet it remains difficult because recovery depends on the interplay of static topographical features and dynamic climatic conditions. Traditional remote sensing models, often based solely on statistical analysis or Convolutional Neural Networks (CNNs), struggle to capture these long-range spatiotemporal dependencies. To address this, we propose a novel spatiotemporal transformer architecture for forecasting vegetation regrowth. The model uses a Vision Transformer (ViT) backbone to extract spatial features from multi-modal data, including Sentinel-2 imagery, burn severity, and topography. A temporal transformer encoder-decoder then processes a 12-month sequence of these features alongside climatic variables to predict the Normalized Difference Vegetation Index (NDVI) for the following 24 months. The decoder builds its queries by combining a spatial hint (the last known NDVI map) with learnable temporal embeddings, so that each future monthly prediction is generated from its own query. The model was trained on a dataset spanning 41 historical wildfires across California. It achieved a preliminary Structural Similarity Index Measure (SSIM) of 0.58 and a Mean Squared Error (MSE) of 0.15 on the validation set. Although initial visualizations show that the model is only beginning to learn temporal dynamics, these results indicate a foundational capacity for capturing complex spatiotemporal patterns. Further training is expected to improve its ability to forecast dynamic changes. By integrating a ViT with a temporal transformer, our approach aims to give land managers more accurate insights into ecosystem recovery trajectories and to support more effective post-fire environmental planning.
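
The decoder querying mechanism described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify how the spatial hint and the temporal embeddings are combined, so this sketch assumes a linear patch projection of the last NDVI map followed by broadcast addition of one embedding per forecast month. The function name `build_decoder_queries`, the projection `w_hint`, the patch size, and the embedding width are all hypothetical; only the 24-month forecast horizon comes from the abstract.

```python
import numpy as np

def build_decoder_queries(last_ndvi, temporal_embed, w_hint, patch=16):
    """Form one set of query tokens per future month (illustrative only).

    last_ndvi:      (H, W) last observed NDVI map, the "spatial hint"
    temporal_embed: (T_out, D) per-month embeddings (learned in a real model)
    w_hint:         (patch*patch, D) hypothetical linear projection of NDVI patches
    returns:        (T_out, N, D) with N = (H // patch) * (W // patch)
    """
    h, w = last_ndvi.shape
    gh, gw = h // patch, w // patch
    # Split the map into non-overlapping patch x patch tiles and flatten each tile.
    tiles = last_ndvi[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
    tiles = tiles.transpose(0, 2, 1, 3).reshape(gh * gw, patch * patch)
    hint = tiles @ w_hint  # (N, D) spatial hint tokens
    # Assumed combination: add each month's embedding to every spatial token.
    return hint[None, :, :] + temporal_embed[:, None, :]

rng = np.random.default_rng(0)
T_OUT, D, PATCH = 24, 32, 16  # 24 forecast months (abstract); D and PATCH are assumed
last_ndvi = rng.uniform(-1.0, 1.0, size=(64, 64))       # synthetic NDVI in [-1, 1]
temporal_embed = rng.normal(size=(T_OUT, D))            # stand-in for learned values
w_hint = rng.normal(size=(PATCH * PATCH, D)) * 0.02     # stand-in for learned weights
queries = build_decoder_queries(last_ndvi, temporal_embed, w_hint, PATCH)
print(queries.shape)  # (24, 16, 32)
```

Under this assumption, all 24 query sets share the same spatial content and differ only by their month embedding, which is one way a decoder could produce a distinct prediction for each future month from a single NDVI map.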

Published

2025-09-25

Section

College of Science: Department of Geography and Geoinformation Science