A cautionary tale about deep learning-based climate emulators (Papers Track)
Björn Lütjens (Massachusetts Institute of Technology); Raffaele Ferrari (Massachusetts Institute of Technology); Paolo Giani (Massachusetts Institute of Technology); Dava Newman (MIT); Andre Souza (Massachusetts Institute of Technology); Duncan Watson-Parris (University of California San Diego); Noelle Selin (Massachusetts Institute of Technology)
Abstract
Climate models are too computationally expensive for many tasks, such as rapidly exploring the future impacts of climate policies. Scientists have therefore been developing lightweight approximations, or emulators, of climate models since the 1980s. Recently, deep learning has been proposed for this task and has most commonly been evaluated on the benchmark ClimateBenchv1.0. We implemented linear pattern scaling, a linear regression-based model from the 1990s with 30K parameters, and it is now the 'best' model on ClimateBenchv1.0, outperforming the incumbent 100M-parameter foundation model, ClimaX, on the spatial error of 3 out of the 4 variables. Nevertheless, climate emulation might benefit from innovations in machine learning, and we analyse two aspects that need to be addressed in future emulators. First, the data complexity depends strongly on the climate variable of interest and the chosen spatiotemporal resolution. Second, current benchmarks do not sufficiently address the large impact of interannual variability in the climate system. We have published our analysis as an interactive tutorial at github.com/ygaxolotl/tags-linear-pattern-scaling.
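To make the idea of linear pattern scaling concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common formulation in which each grid cell gets its own least-squares line (one slope and one intercept) mapping global mean temperature to the local value of a climate variable. The function names `fit_pattern_scaling` and `predict_pattern_scaling`, the 4 x 8 grid, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_pattern_scaling(global_mean_temp, local_field):
    """Fit one least-squares line per grid cell (hypothetical helper).

    global_mean_temp: array of shape (n_times,)
    local_field:      array of shape (n_times, n_lat, n_lon)
    Returns (slope, intercept), each of shape (n_lat, n_lon).
    """
    n_times = global_mean_temp.shape[0]
    # Design matrix with a column for the predictor and a column of ones.
    design = np.column_stack([global_mean_temp, np.ones(n_times)])
    # Flatten the grid so every cell becomes one regression target column.
    targets = local_field.reshape(n_times, -1)
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    grid_shape = local_field.shape[1:]
    return coef[0].reshape(grid_shape), coef[1].reshape(grid_shape)


def predict_pattern_scaling(global_mean_temp, slope, intercept):
    """Apply the per-cell lines to new global mean temperatures."""
    return global_mean_temp[:, None, None] * slope + intercept


# Synthetic example (assumed data, not taken from ClimateBench).
rng = np.random.default_rng(0)
temps = np.linspace(0.0, 4.0, 50)                # global mean warming
true_slope = rng.uniform(0.5, 2.0, size=(4, 8))  # local response pattern
true_intercept = rng.normal(size=(4, 8))
noise = 0.01 * rng.normal(size=(50, 4, 8))
field = temps[:, None, None] * true_slope + true_intercept + noise

slope, intercept = fit_pattern_scaling(temps, field)
prediction = predict_pattern_scaling(temps, slope, intercept)
```

Under this assumed formulation the parameter count is two per grid cell, so the model stays small even on a global grid. The sketch has no way to represent interannual variability: a fitted line can only return the mean local response at a given level of global warming.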