Poster

Approximation and Optimization of Environmental Simulations in High Spatio-Temporal Resolution through Machine Learning Methods

Azmi, Elnaz¹; Meyer, J.¹; Strobl, M.¹; Streit, A.¹
  1. Karlsruhe Institute of Technology

Environmental simulations at high spatio-temporal resolution involve large-scale dynamic systems and are compute-intensive, so they usually require parallelization and high-performance computing (HPC) resources. Parallelizing existing sequential simulations can, however, incur a large configuration overhead and demands advanced programming expertise from domain scientists. Moreover, even though powerful modern computing technologies are available, and with a view to saving energy, the complexity and scale of large-scale system simulations still need to be reduced. To tackle these issues, we propose two approaches: (1) approximation of simulations by model order reduction and unsupervised machine learning methods, and (2) approximation of simulations by supervised machine learning methods.

In the first approach, we approximate large-scale, high-resolution environmental simulations and reduce their computational complexity by employing model order reduction techniques and unsupervised machine learning algorithms. Specifically, we cluster functionally similar model units to reduce redundancy in the model, that is, units that behave alike, and thereby the computational complexity. The underlying principle is that the simulation dynamics depend on the model units' static properties, current state, and external forcing. Based on this, we assume that similar settings of model units lead to similar simulation dynamics. Applying this principle to the hydrological simulation CAOS [1], we clustered the model units, ran the simulation model on a small representative subset of each cluster, and scaled the simulation output of the cluster representatives to the remaining cluster members. In experiments, this approach balanced simulation uncertainty against computational effort. To evaluate the quality of the approach, we measured how close the output of the test simulation is to that of the original simulation; to quantify the computational gain, we measured the speedup in run time of the test simulation relative to the original simulation. Applied to the CAOS use case, the approach yields a Root Mean Square Error (RMSE) of 0.0049 and a 1.8x speedup compared to the original simulation.
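The cluster-and-represent idea can be illustrated with a minimal Python sketch. It is not the CAOS workflow: the two unit features, the `toy_simulation` function, the plain k-means clustering, the choice of one representative per cluster, and the direct copying of the representative's output to all members are assumptions made for illustration only. The count of simulation runs is used as a rough proxy for computational effort, not as a measured speedup.

```python
import math
import random


def toy_simulation(unit):
    """Hypothetical stand-in for an expensive per-unit simulation."""
    slope, storage = unit
    return 2.0 * slope + 0.5 * storage


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns one cluster label per point and the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


def approximate(units, k):
    """Simulate one representative per cluster and reuse its output."""
    labels, centroids = kmeans(units, k)
    approx = [0.0] * len(units)
    runs = 0
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue  # skip clusters that ended up empty
        # Representative: the member closest to the cluster centroid.
        rep = min(idx, key=lambda i: math.dist(units[i], centroids[c]))
        out = toy_simulation(units[rep])
        runs += 1
        for i in idx:
            approx[i] = out  # assumption: output copied unchanged
    return approx, runs


def rmse(a, b):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical model units described by two static properties each.
rng = random.Random(42)
units = [(rng.random(), rng.random()) for _ in range(200)]

reference = [toy_simulation(u) for u in units]  # full simulation
approx, runs = approximate(units, k=10)         # reduced simulation
error = rmse(reference, approx)
run_reduction = len(units) / runs               # proxy for saved effort
```

Increasing `k` in this sketch lowers `error` at the cost of more simulation runs, which mirrors the trade-off between uncertainty and computational effort described above.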

In the second approach, we approximate simulations with supervised machine learning methods, focusing on deep neural networks. In this ongoing work, we feed multidimensional time series data into a Long Short-Term Memory (LSTM) network. The LSTM learns long-term dependencies and retains information from previously seen data in order to predict future data. In our use case, the ICON-ART simulation [2], the atmosphere is divided into cells, each with several input variables, in which the concentration of trace gases is simulated. The simulation is based on coupled differential equations. The goal is to replace the compute-intensive chemistry simulation of about two million atmospheric cells with a trained neural network that predicts the trace gas concentrations in each cell, thereby reducing the computational complexity of the simulation.
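How an LSTM carries information from earlier time steps forward can be seen in a minimal Python sketch of the standard LSTM cell equations. It is not the network used for ICON-ART: the dimensions, the random untrained weights, and the example input sequence are all hypothetical, and training as well as the output layer that would map the hidden state to trace gas concentrations are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, b):
    """Return W @ x + b for a matrix W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + bi
            for row, bi in zip(W, b)]


def lstm_step(x, h, c, params):
    """One step of a standard LSTM cell."""
    z = x + h  # concatenate current input and previous hidden state
    i = [sigmoid(v) for v in affine(params["Wi"], z, params["bi"])]    # input gate
    f = [sigmoid(v) for v in affine(params["Wf"], z, params["bf"])]    # forget gate
    o = [sigmoid(v) for v in affine(params["Wo"], z, params["bo"])]    # output gate
    g = [math.tanh(v) for v in affine(params["Wg"], z, params["bg"])]  # candidate
    # The cell state keeps part of the old memory and adds new information.
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def init_params(n_in, n_hidden, seed=0):
    """Random, untrained weights (illustration only)."""
    rng = random.Random(seed)
    params = {}
    for gate in "ifog":
        params["W" + gate] = [[rng.uniform(-0.1, 0.1)
                               for _ in range(n_in + n_hidden)]
                              for _ in range(n_hidden)]
        params["b" + gate] = [0.0] * n_hidden
    return params


def run_lstm(sequence, params, n_hidden):
    """Process a multidimensional time series; return the last hidden state."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h


# Hypothetical history of one cell: 5 time steps, 3 input variables.
sequence = [[0.1 * t, 0.5, -0.2 * t] for t in range(5)]
params = init_params(n_in=3, n_hidden=4)
h_final = run_lstm(sequence, params, n_hidden=4)
```

The final hidden state depends on the whole input history through the cell state `c`, which is what allows such a network to use earlier values of a cell's variables when predicting later ones.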

[1] E. Zehe, et al. 2014. HESS Opinions: From response units to functional units: a thermodynamic reinterpretation of the HRU concept to link spatial organization and functioning of intermediate scale catchments. HESS 18: 4635–4655. doi: 10.5194/hess-18-4635-2014

[2] D. Rieger, et al. 2015. ICON–ART 1.0 – a new online-coupled model system from the global to regional scale. GMD 8: 1659–1676. doi: 10.5194/gmd-8-1659-2015