Challenge 23 - FloodMule: a machine learning emulator of the LISFLOOD hydrological model #7

@EsperanzaCuartero

Stream 2 - Machine Learning for Earth Science

Goal

Emulate LISFLOOD to significantly reduce the running time of the model for a given configuration

Mentors and skills

  • Mentors: Corentin Carton, Cinzia Mazzetti, Matthew Chantry, Juan Pereira Colonese, Francesca Moschini, Eleanor Hansford
  • Skills required:
    • Good knowledge of machine learning approaches and libraries
    • Good knowledge of Python
    • Knowledge of hydrological modelling is not required but would be an advantage

Note: Only nationals or residents from the ECMWF Member States and Co-operating States are eligible to participate (see Terms and Conditions).


Challenge description

LISFLOOD is a spatially distributed (gridded) hydrological rainfall-runoff model that can simulate the main hydrological processes occurring in a catchment. LISFLOOD explicitly considers the spatial distribution of physical properties across the catchments to provide estimates of river discharge and other hydrological variables such as snow accumulation, soil moisture, etc. Driven by meteorological forcing data (precipitation, temperature and evaporation), it calculates a complete water balance for every grid cell of the computational domain.

Running the LISFLOOD hydrological model at high resolution and at global (or pan-European) scale, as will be done in the next versions of EFAS and GloFAS, is challenging because the running time of the model becomes too long for an operational context. Optimising the current model would only give incremental improvement, whereas emulating the hydrological model using machine learning could give a speedup of orders of magnitude, hopefully with limited or no degradation of results.

The emulator would mimic the hydrological model for a given configuration, meaning:

  • A frozen version of LISFLOOD
  • A fixed domain and resolution
  • A fixed set of static maps (gridded) that describe the hydro-morphological characteristics of the river basins, including the parameter maps obtained through the model calibration process
  • A single temporal step, removing the temporal complexity of the problem

This would result in a simple workflow for the emulator with the following inputs:

  • Initial conditions given through LISFLOOD state maps
  • LISFLOOD forcing maps for one step, such as temperature, precipitation, etc.

Like the hydrological model, the emulator would provide the following outputs:

  • State maps representing different variables of the hydrological model, which could potentially be used as the initial condition for a next step
  • Some additional maps for variables such as discharge, snow melt, etc.
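The single-step input/output contract described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_step`, `rollout`, the map names and the bucket-style update are hypothetical and do not represent LISFLOOD physics or any project interface, and plain lists stand in for gridded maps to keep the sketch dependency-free.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[float]       # flattened map: one value per grid cell
Maps = Dict[str, Grid]   # named maps, e.g. {"storage": [...]}
StepFn = Callable[[Maps, Maps], Tuple[Maps, Maps]]


def toy_step(state: Maps, forcing: Maps) -> Tuple[Maps, Maps]:
    """Hypothetical stand-in for one emulator step (NOT LISFLOOD physics).

    Each cell is a bucket: precipitation fills it, a fixed fraction drains.
    """
    drain = 0.1  # assumed drainage fraction, purely illustrative
    filled = [s + p for s, p in zip(state["storage"], forcing["precipitation"])]
    runoff = [drain * f for f in filled]
    new_state = {"storage": [f - r for f, r in zip(filled, runoff)]}
    diagnostics = {"runoff": runoff}  # "additional maps" produced by the step
    return new_state, diagnostics


def rollout(step: StepFn, initial_state: Maps,
            forcings: List[Maps]) -> Tuple[Maps, List[Maps]]:
    """Chain single steps: each output state is the next initial condition."""
    state = initial_state
    history = []
    for forcing in forcings:
        state, diagnostics = step(state, forcing)
        history.append(diagnostics)
    return state, history


# Two grid cells, two time steps
state0 = {"storage": [0.0, 10.0]}
forcings = [{"precipitation": [10.0, 0.0]}, {"precipitation": [0.0, 0.0]}]
final_state, history = rollout(toy_step, state0, forcings)
```

Because the emulator only ever sees one step, any trained model exposing the same `(state, forcing) -> (new_state, diagnostics)` signature could be dropped into `rollout`. Errors may accumulate when outputs are fed back as inputs, so a multi-step rollout like this would be a natural part of evaluation.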

This well-defined problem offers many avenues for training the model: a training dataset can be built by feeding any set of initial conditions and forcings into the hydrological model and using its outputs as targets for the emulator. For instance, the ML training could be based on one of the following approaches:

  • Using existing datasets of forcing and state files from reanalysis, forecasts, reforecasts, etc.
  • Creating a stochastic dataset around climatological data obtained through the reanalysis

These two approaches would already give us thousands of data points (i.e. time slices) to train the model, even millions if the stochastic approach is successful.
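As a rough illustration of the stochastic approach, the sketch below perturbs a climatological forcing and records the simulator's response as the training target. Everything here is an assumption made for the example: the function names, the independent Gaussian noise per cell, the map names, and `placeholder_simulator`, which merely stands in for the LISFLOOD interfacing utility and is not its real interface.

```python
import random
from typing import Callable, Dict, List, Tuple

Grid = List[float]                # flattened map, one value per grid cell
Maps = Dict[str, Grid]            # named maps
Sample = Tuple[Maps, Maps, Maps]  # (state, forcing, target)


def sample_forcing(mean: Maps, std: Maps, rng: random.Random) -> Maps:
    """Draw one forcing around climatology, cell by cell (Gaussian: an assumption)."""
    sample = {}
    for name in mean:
        values = [rng.gauss(m, s) for m, s in zip(mean[name], std[name])]
        if name == "precipitation":
            values = [max(0.0, v) for v in values]  # precipitation cannot be negative
        sample[name] = values
    return sample


def placeholder_simulator(state: Maps, forcing: Maps) -> Maps:
    """Hypothetical stand-in for the hydrological model; returns the new state."""
    return {"storage": [s + p for s, p in
                        zip(state["storage"], forcing["precipitation"])]}


def build_dataset(simulator: Callable[[Maps, Maps], Maps], state: Maps,
                  mean: Maps, std: Maps, n_samples: int, seed: int = 0) -> List[Sample]:
    """Generate (state, forcing, target) triples by running the simulator."""
    rng = random.Random(seed)  # seeded for reproducibility
    dataset = []
    for _ in range(n_samples):
        forcing = sample_forcing(mean, std, rng)
        dataset.append((state, forcing, simulator(state, forcing)))
    return dataset


climatology_mean = {"precipitation": [2.0, 0.5, 0.0]}
climatology_std = {"precipitation": [1.0, 0.5, 0.2]}
state = {"storage": [5.0, 5.0, 5.0]}
dataset = build_dataset(placeholder_simulator, state,
                        climatology_mean, climatology_std, n_samples=100)
```

Independent noise per cell ignores the spatial correlation of real weather fields, so a realistic generator would likely need spatially coherent perturbations. The initial state could be sampled in the same way.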

As a continental domain is composed of thousands of hydrological catchments, the approach could first be tested on small basins, then scaled up to larger basins and finally to the full EFAS or GloFAS computational domain.
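One simple way to start with small basins is to restrict every map to the cells of a single catchment. The sketch below assumes a hypothetical catchment ID map aligned with the other maps; the function name and the data layout are illustrative only.

```python
from typing import Dict, List

Grid = List[float]       # flattened map, one value per grid cell
Maps = Dict[str, Grid]   # named maps


def subset_by_catchment(maps: Maps, catchment_ids: List[int], target_id: int) -> Maps:
    """Keep only the cells whose catchment ID equals target_id."""
    keep = [i for i, cid in enumerate(catchment_ids) if cid == target_id]
    return {name: [grid[i] for i in keep] for name, grid in maps.items()}


# Five cells: the first two belong to catchment 1, the rest to catchment 2
catchment_ids = [1, 1, 2, 2, 2]
maps = {"precipitation": [0.0, 1.0, 2.0, 3.0, 4.0]}
small_basin = subset_by_catchment(maps, catchment_ids, target_id=1)
# small_basin == {"precipitation": [0.0, 1.0]}
```

Since a river basin drains to a single outlet, subsetting by whole catchments should keep the river routing self-contained, which is not the case for an arbitrary rectangular crop of the domain.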

The details of the implementation, such as the data flow or the ML approach and libraries, will be discussed during the project. The candidates will be provided with a utility, interfacing with the LISFLOOD hydrological model, that will generate training datasets for the ML kernels.

Training/evaluation workflow:
[Figure: pic_FloodMule]
