Browsing by Author "Chattopadhyay, Ashesh"
Item: Data Imbalance, Uncertainty Quantification, and Transfer Learning in Data-Driven Parameterizations: Lessons From the Emulation of Gravity Wave Momentum Transport in WACCM (Wiley, 2024)
Authors: Sun, Y. Qiang; Pahlavan, Hamid A.; Chattopadhyay, Ashesh; Hassanzadeh, Pedram; Lubis, Sandro W.; Alexander, M. Joan; Gerber, Edwin P.; Sheshadri, Aditi; Guan, Yifei
Abstract: Neural networks (NNs) are increasingly used for data-driven subgrid-scale parameterizations in weather and climate models. While NNs are powerful tools for learning complex non-linear relationships from data, several challenges arise in using them for parameterizations. Three of these challenges are (a) data imbalance related to learning rare, often large-amplitude, samples; (b) uncertainty quantification (UQ) of the predictions to provide an accuracy indicator; and (c) generalization to other climates, for example, those with different radiative forcings. Here, we examine the performance of methods for addressing these challenges using NN-based emulators of the Whole Atmosphere Community Climate Model (WACCM) physics-based gravity wave (GW) parameterizations as a test case. WACCM has complex, state-of-the-art parameterizations for orography-, convection-, and front-driven GWs. Convection- and orography-driven GWs have significant data imbalance due to the absence of convection or orography at most grid points. We address data imbalance using resampling and/or weighted loss functions, enabling the successful emulation of parameterizations for all three sources. We demonstrate that three UQ methods (Bayesian NNs, variational auto-encoders, and dropouts) provide ensemble spreads that correspond to accuracy during testing, offering criteria for identifying when an NN gives inaccurate predictions. Finally, we show that the accuracy of these NNs decreases for a warmer climate (4 × CO2). However, their performance is significantly improved by applying transfer learning, for example, re-training only one layer using ∼1% of new data from the warmer climate. The findings of this study offer insights for developing reliable and generalizable data-driven parameterizations for various processes, including (but not limited to) GWs.
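To make the imbalance and transfer-learning ideas above concrete, here is a minimal PyTorch sketch of a sample-weighted loss and of re-training only the last layer of an emulator on a small amount of warm-climate data. The network, layer sizes, and hyperparameters are illustrative assumptions, not the configuration used for the WACCM emulators.

```python
import torch
import torch.nn as nn

class GWEmulator(nn.Module):
    """Toy fully connected emulator: column state in, GW momentum forcing out.
    Architecture and sizes are hypothetical, not the paper's emulator."""
    def __init__(self, n_in=128, n_hidden=256, n_out=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_out),
        )

    def forward(self, x):
        return self.net(x)

def weighted_mse(pred, target, weights):
    # Up-weight rare, large-amplitude samples so they are not drowned out by
    # the many near-zero grid points (no convection/orography present).
    return (weights * (pred - target) ** 2).mean()

model = GWEmulator()
# ... train on control-climate data with weighted_mse (or a resampled set) ...

# Transfer learning: freeze all layers, then re-train only the last one
# on a small (~1%) sample of data from the warmer (4xCO2) climate.
for p in model.parameters():
    p.requires_grad = False
for p in model.net[-1].parameters():
    p.requires_grad = True
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```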
Item: Data-driven predictions of a multiscale Lorenz 96 chaotic system using machine-learning methods: reservoir computing, artificial neural network, and long short-term memory network (Copernicus Publications, 2020)
Authors: Chattopadhyay, Ashesh; Hassanzadeh, Pedram; Subramanian, Devika
Abstract: In this paper, the performance of three machine-learning methods for predicting the short-term evolution and reproducing the long-term statistics of a multiscale spatiotemporal Lorenz 96 system is examined. The methods are an echo state network (ESN, a type of reservoir computing; hereafter RC–ESN), a deep feed-forward artificial neural network (ANN), and a recurrent neural network (RNN) with long short-term memory (LSTM; hereafter RNN–LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting variables representing slow/large-scale (X), intermediate (Y), and fast/small-scale (Z) processes. For training or testing, only X is available; Y and Z are never known or used. We show that RC–ESN substantially outperforms ANN and RNN–LSTM for short-term predictions, e.g., accurately forecasting the chaotic trajectories for hundreds of the numerical solver's time steps, equivalent to several Lyapunov timescales. RNN–LSTM outperforms ANN, and both methods show some prediction skill as well. Furthermore, even after losing the trajectory, data predicted by RC–ESN and RNN–LSTM have probability density functions (pdfs) that closely match the true pdf, even at the tails. The pdf of the data predicted using ANN, however, deviates from the true pdf. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling of complex nonlinear dynamical systems, such as weather and climate, are discussed. (A minimal RC–ESN sketch appears after the next entry.)

Item: Data-Driven Super-Parameterization Using Deep Learning: Experimentation With Multiscale Lorenz 96 Systems and Transfer Learning (Wiley, 2020)
Authors: Chattopadhyay, Ashesh; Subel, Adam; Hassanzadeh, Pedram
Abstract: To make weather and climate models computationally affordable, small-scale processes are usually represented in terms of the large-scale, explicitly resolved processes using physics-based/semi-empirical parameterization schemes. Another approach, computationally more demanding but often more accurate, is super-parameterization (SP). SP involves integrating the equations of small-scale processes on high-resolution grids embedded within the low-resolution grid of large-scale processes. Recently, studies have used machine learning (ML) to develop data-driven parameterization (DD-P) schemes. Here, we propose a new approach, data-driven SP (DD-SP), in which the equations of the small-scale processes are integrated in a data-driven (and thus inexpensive) manner using ML methods such as recurrent neural networks. Employing multiscale Lorenz 96 systems as the testbed, we compare the cost and accuracy (in terms of both short-term prediction and long-term statistics) of parameterized low-resolution (PLR), SP, DD-P, and DD-SP models. We show that, at the same computational cost, DD-SP substantially outperforms PLR and is more accurate than DD-P, particularly when scale separation is lacking. DD-SP is much cheaper than SP, yet its accuracy is the same in reproducing long-term statistics (climate prediction) and often comparable in short-term forecasting (weather prediction). We also investigate generalization: when models trained on data from one system are applied to a more chaotic system, we find that models often do not generalize, particularly when short-term prediction accuracies are examined. However, we show that transfer learning, which involves re-training the data-driven model with a small amount of data from the new system, significantly improves generalization. Potential applications of DD-SP and transfer learning in climate/weather modeling are discussed.
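The two Lorenz 96 entries above rely on reservoir computing and RNNs; the following is a minimal NumPy sketch of an echo state network (RC–ESN), trained on the single-scale Lorenz 96 system as a stand-in. All sizes and hyperparameters are illustrative assumptions, not the papers' multiscale setup or tuning.

```python
import numpy as np

rng = np.random.default_rng(0)

def l96_rhs(x, F=8.0):
    # Lorenz 96: dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def integrate(x, n_steps, dt=0.01):
    traj = np.empty((n_steps, x.size))
    for t in range(n_steps):  # classical RK4 time stepping
        k1 = l96_rhs(x)
        k2 = l96_rhs(x + 0.5 * dt * k1)
        k3 = l96_rhs(x + 0.5 * dt * k2)
        k4 = l96_rhs(x + dt * k3)
        x = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        traj[t] = x
    return traj

d = 8                                                # state dimension
X = integrate(rng.normal(size=d), 22_000)[2_000:]    # drop the transient

# Fixed random reservoir + trained linear readout.
N = 1000                                             # reservoir size
W_in = rng.uniform(-0.1, 0.1, (N, d))
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))      # set spectral radius to 0.9

def reservoir_states(X):
    r, states = np.zeros(N), np.empty((len(X), N))
    for t, x in enumerate(X):
        r = np.tanh(W @ r + W_in @ x)
        states[t] = r
    return states

R = reservoir_states(X[:-1])
beta = 1e-6                                          # ridge regularization
W_out = np.linalg.solve(R.T @ R + beta * np.eye(N), R.T @ X[1:]).T

def forecast(r, x, n_steps):
    # Autonomous rollout: the readout's prediction is fed back as the input.
    traj = np.empty((n_steps, d))
    for t in range(n_steps):
        r = np.tanh(W @ r + W_in @ x)
        x = W_out @ r
        traj[t] = x
    return traj

# Start from the reservoir state synchronized to the end of the training data.
prediction = forecast(R[-1], X[-1], 500)
```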
Item: Interpretable Structural Model Error Discovery From Sparse Assimilation Increments Using Spectral Bias-Reduced Neural Networks: A Quasi-Geostrophic Turbulence Test Case (Wiley, 2024)
Authors: Mojgani, Rambod; Chattopadhyay, Ashesh; Hassanzadeh, Pedram
Abstract: Earth system models suffer from various structural and parametric errors in their representation of nonlinear, multi-scale processes, leading to uncertainties in their long-term projections. The effects of many of these errors (particularly those due to fast physics) can be quantified in short-term simulations, for example, as differences between the predicted and observed states (analysis increments). With the increase in the availability of high-quality observations and simulations, learning corrections to model errors (nudging) from these increments has become an active research area. However, most studies focus on using neural networks (NNs), which, while powerful, are hard to interpret, are data-hungry, and generalize poorly out of distribution. Here, we show the capabilities of Model Error Discovery with Interpretability and Data Assimilation (MEDIDA), a general, data-efficient framework that uses sparsity-promoting equation-discovery techniques to learn model errors from analysis increments. Using two-layer quasi-geostrophic turbulence as the test case, MEDIDA is shown to successfully discover various linear and nonlinear structural/parametric errors when full observations are available. Discovery from spatially sparse observations is found to require highly accurate interpolation schemes. While NNs have shown success as interpolators in recent studies, here they are found inadequate due to their inability to accurately represent small scales, a phenomenon known as spectral bias. We show that a general remedy, adding a random Fourier feature layer to the NN, resolves this issue, enabling MEDIDA to successfully discover model errors from sparse observations. These promising results suggest that, with further development, MEDIDA could be scaled up to models of the Earth system and real observations.
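The spectral-bias remedy named in the entry above can be sketched as a random Fourier feature layer placed in front of an NN interpolator. The layer sizes and frequency scale below are illustrative assumptions, not MEDIDA's settings.

```python
import torch
import torch.nn as nn

class FourierFeatures(nn.Module):
    """Map coordinates x to [sin(2*pi*Bx), cos(2*pi*Bx)] with a fixed random B.

    A plain coordinate MLP fits low frequencies first (spectral bias); sampling
    B with a large enough scale hands the MLP high-frequency basis functions,
    letting it represent small scales when interpolating sparse observations."""
    def __init__(self, in_dim=2, n_features=128, scale=10.0):
        super().__init__()
        # register_buffer keeps B fixed (untrained) but moves it with the model.
        self.register_buffer("B", scale * torch.randn(n_features, in_dim))

    def forward(self, x):
        proj = 2.0 * torch.pi * x @ self.B.T
        return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)

interpolator = nn.Sequential(   # (x, y) coordinates -> interpolated field value
    FourierFeatures(in_dim=2, n_features=128, scale=10.0),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)
```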
Item: Theoretical and Applied Deep Learning for Turbulence (2022-12-02)
Authors: Chattopadhyay, Ashesh; Hassanzadeh, Pedram
Abstract: While turbulence remains the oldest unsolved mystery in physics, recent efforts in building high-resolution physics-based simulation models and the availability of high-quality observational data, coupled with advances in data-driven scientific computing, offer significant hope for modeling and predicting turbulent flow. On the other hand, data-driven methods, primarily fueled by the unprecedented success of deep learning, often suffer from a lack of interpretability and of rigorous theoretical understanding of their inner workings. In this thesis, I propose a principled approach to constructing data-driven deep learning algorithms that leverages rigorous theories of deep learning, motivated by established theories of numerical analysis and fluid physics, to build prediction models for complex turbulent flows in both engineering applications and geophysical fluid dynamics. I discuss how these data-driven models can be interpreted through the lens of physics, how they can fail under certain circumstances, and how these failure modes can be mitigated once their inner workings are understood. In conclusion, I discuss how deep learning-based algorithms can be reliably used for scientific applications, focusing specifically on highly chaotic, nonlinear geophysical turbulent flows in systems of increasing complexity, from canonical systems and fully coupled climate models to actual atmospheric observations.

Item: Towards physics-inspired data-driven weather forecasting: integrating data assimilation with a deep spatial-transformer-based U-NET in a case study with ERA5 (Copernicus Publications, 2022)
Authors: Chattopadhyay, Ashesh; Mustafa, Mustafa; Hassanzadeh, Pedram; Bach, Eviatar; Kashinath, Karthik
Abstract: There is growing interest in data-driven weather prediction (DDWP), e.g., using convolutional neural networks such as U-NET that are trained on data from models or reanalysis. Here, we propose three components, inspired by physics, to integrate with commonly used DDWP models in order to improve their forecast accuracy. These components are (1) a deep spatial transformer added to the latent space of U-NET to capture rotation and scaling transformations of the spatiotemporal data in the latent space, (2) a data-assimilation (DA) algorithm to ingest noisy observations and improve the initial conditions for subsequent forecasts, and (3) a multi-time-step algorithm, which combines forecasts from DDWP models with different time steps through DA, improving the accuracy of forecasts at short intervals. To show the benefit and feasibility of each component, we use geopotential height at 500 hPa (Z500) from ERA5 reanalysis and examine the short-term forecast accuracy of specific setups of the DDWP framework. Results show that the spatial-transformer-based U-NET (U-STN) clearly outperforms the U-NET, e.g., improving the forecast skill by 45 %. Using a sigma-point ensemble Kalman (SPEnKF) algorithm for DA and U-STN as the forward model, we show that stable, accurate DA cycles are achieved even with high observation noise. This DDWP+DA framework substantially benefits from large (O(1000)) ensembles that are inexpensively generated with the data-driven forward model in each DA cycle. The multi-time-step DDWP+DA framework also shows promise; for example, it reduces the average error by factors of 2–3. These results show the benefits and feasibility of these three components, which are flexible and can be used in a variety of DDWP setups. Furthermore, while here we focus on weather forecasting, the three components can be readily adopted for other parts of the Earth system, such as the ocean and land, for which data are growing rapidly and there is a need for forecasting and assimilation.
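To illustrate the DA step of the DDWP+DA cycle described above, here is a minimal ensemble Kalman analysis step in NumPy. The paper uses a sigma-point EnKF (SPEnKF); this simpler perturbed-observations EnKF is a stand-in that only shows the general structure, and all shapes and values below are assumptions.

```python
import numpy as np

def enkf_analysis(ensemble, y_obs, H, obs_var, rng):
    """One stochastic EnKF analysis step.

    ensemble: (n_ens, n_state) forecasts from the data-driven forward model
    y_obs:    (n_obs,) noisy observations
    H:        (n_obs, n_state) linear observation operator
    """
    n_ens = ensemble.shape[0]
    Xf = ensemble - ensemble.mean(axis=0)              # forecast anomalies
    Yf = Xf @ H.T                                      # anomalies in obs space
    Pyy = Yf.T @ Yf / (n_ens - 1) + obs_var * np.eye(len(y_obs))
    Pxy = Xf.T @ Yf / (n_ens - 1)                      # state-obs covariance
    K = Pxy @ np.linalg.inv(Pyy)                       # Kalman gain
    # Perturbed observations keep the analysis spread statistically consistent.
    y_pert = y_obs + rng.normal(0.0, np.sqrt(obs_var), (n_ens, len(y_obs)))
    return ensemble + (y_pert - ensemble @ H.T) @ K.T

rng = np.random.default_rng(0)
n_ens, n_state, n_obs = 1000, 64, 16                   # large ensembles are cheap
ensemble = rng.normal(size=(n_ens, n_state))           # stand-in model forecasts
H = np.eye(n_state)[::n_state // n_obs]                # observe every 4th point
y_obs = rng.normal(size=n_obs)
analysis = enkf_analysis(ensemble, y_obs, H, obs_var=0.1, rng=rng)
```

With a data-driven forward model, generating the O(1000)-member forecast ensemble in each cycle is inexpensive, which is the practical advantage the entry above highlights.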