# Challenge 23 - FloodMule: a machine learning emulator of the LISFLOOD hydrological model

> **Stream 2** - Machine Learning for Earth Science


### Goal
Emulate LISFLOOD to significantly reduce the model's running time for a given configuration.

### Mentors and skills
* **Mentors:** Corentin Carton, Cinzia Mazzetti, Matthew Chantry, Juan Pereira Colonese, Francesca Moschini, Eleanor Hansford
* **Skills required:**
  * Good knowledge of machine learning approaches and libraries
  * Good knowledge of Python
  * Knowledge of hydrological modelling is not required but would be an advantage

<br>

> <b> *Note: Only nationals or residents from the [ECMWF Member States and Co-operating States](https://www.ecmwf.int/en/about/who-we-are/member-states) are eligible to participate (see [Terms and Conditions](https://codeforearth.ecmwf.int/terms-and-conditions)).* </b>

<hr>


### Challenge description
[LISFLOOD](https://ec-jrc.github.io/lisflood-model/) is a spatially distributed (gridded) hydrological rainfall-runoff model that can simulate the main hydrological processes occurring in a catchment. LISFLOOD explicitly considers the spatial distribution of physical properties across the catchments to provide estimates of river discharge and other hydrological variables such as snow accumulation, soil moisture, etc. Driven by meteorological forcing data (precipitation, temperature and evaporation), it calculates a complete water balance for every grid cell of the computational domain. 

Running the LISFLOOD hydrological model at high resolution and at global (or pan-European) scale, as planned for the next versions of EFAS and GloFAS, is a challenge because the running time of the model becomes too long for an operational context. Optimising the current model would only give incremental improvements. Emulating the hydrological model with machine learning could instead give orders-of-magnitude speedups, ideally with limited or no degradation of the results.

The emulator would mimic the hydrological model for a given configuration, meaning:
* A frozen version of LISFLOOD
* A fixed domain and resolution
* A fixed set of static maps (gridded) that describe the hydro-morphological characteristics of the river basins, including the parameter maps obtained through the model calibration process
* A single temporal step, removing the temporal complexity of the problem

This would result in a simple workflow for the emulator with the following inputs:
* Initial conditions given through LISFLOOD state maps
* LISFLOOD forcing maps for one step, such as temperature, precipitation, etc.

The emulator, as the hydrological model, would provide the following outputs:
* State maps representing different variables of the hydrological model, which could potentially be used as the initial conditions for the next step
* Some additional maps for variables such as discharge, snow melt, etc.
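
The single-step interface described above can be sketched as a function from (state maps, forcing maps, static maps) to (next state maps, additional output maps). This is only a minimal illustration of the input/output structure, not the project's design: the variable names (`soil_moisture`, `snow_cover`, `precipitation`, `temperature`, `runoff_fraction`, `discharge`), the `EmulatorStep` type and the update rule in `toy_emulator` are all hypothetical stand-ins for a trained model and do not reflect LISFLOOD physics.

```python
from typing import Callable, Dict, List, Tuple

import numpy as np

Maps = Dict[str, np.ndarray]  # variable name -> 2D gridded map
# One emulator step: (state, forcing, static) -> (next_state, diagnostics)
EmulatorStep = Callable[[Maps, Maps, Maps], Tuple[Maps, Maps]]


def toy_emulator(state: Maps, forcing: Maps, static: Maps) -> Tuple[Maps, Maps]:
    """Hypothetical stand-in for a trained emulator (NOT LISFLOOD physics)."""
    # Toy rule: precipitation falls as snow where temperature is below zero.
    snowfall = np.where(forcing["temperature"] < 0.0, forcing["precipitation"], 0.0)
    rainfall = forcing["precipitation"] - snowfall
    # Toy rule: a fixed per-cell fraction of soil moisture leaves as runoff.
    runoff = static["runoff_fraction"] * state["soil_moisture"]
    next_state = {
        "soil_moisture": state["soil_moisture"] + rainfall - runoff,
        "snow_cover": state["snow_cover"] + snowfall,
    }
    diagnostics = {"discharge": runoff}
    return next_state, diagnostics


def rollout(
    step: EmulatorStep, state: Maps, forcings: List[Maps], static: Maps
) -> Tuple[Maps, List[Maps]]:
    """Chain single steps: each output state is the next initial condition."""
    diagnostics_per_step = []
    for forcing in forcings:
        state, diagnostics = step(state, forcing, static)
        diagnostics_per_step.append(diagnostics)
    return state, diagnostics_per_step
```

Because each step only needs the previous state and one set of forcing maps, a multi-step simulation reduces to calling `rollout` with a list of forcings. Errors can accumulate across chained steps, so multi-step behaviour would need to be evaluated separately from single-step accuracy.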

This well-defined problem offers many avenues for training the model, as we could build a training dataset by feeding the hydrological model any set of initial conditions and forcings and using its outputs to train the emulator. For instance, the ML training could be based on one of the following approaches:
* Using existing datasets of forcing and state files from reanalysis, forecasts, reforecasts, etc.
* Creating a stochastic dataset around climatological data obtained from the reanalysis

These two approaches would already give us thousands of data points (i.e. time slices) to train the model, even millions if the stochastic approach is successful.
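
As an illustration of the stochastic approach, the sketch below perturbs a climatological forcing map with random noise, runs the hydrological model on each sample, and stores the (input, output) pairs. It rests on several assumptions that are not part of the challenge description: `run_model_step` is a hypothetical placeholder for a call into the hydrological model, a single state map and a single forcing map are used for brevity, and multiplicative lognormal noise is just one convenient choice for keeping precipitation non-negative.

```python
import numpy as np


def sample_forcing(climatology, rng, rel_std=0.3):
    """Draw one stochastic forcing map around a climatological mean.

    Multiplicative lognormal noise is an assumption made for this sketch;
    it keeps non-negative fields such as precipitation non-negative.
    """
    noise = rng.lognormal(mean=0.0, sigma=rel_std, size=climatology.shape)
    return climatology * noise


def build_dataset(run_model_step, init_state, climatology, n_samples, seed=0):
    """Generate (input, target) pairs by running the model on sampled forcings.

    `run_model_step(state, forcing)` is a hypothetical callable standing in
    for one step of the hydrological model.
    """
    rng = np.random.default_rng(seed)
    inputs, targets = [], []
    for _ in range(n_samples):
        forcing = sample_forcing(climatology, rng)
        target = run_model_step(init_state, forcing)
        # Stack state and forcing as channels: shape (2, ny, nx).
        inputs.append(np.stack([init_state, forcing]))
        targets.append(target)
    return np.stack(inputs), np.stack(targets)
```

Since every sample requires one run of the hydrological model, the cost of generating the dataset scales linearly with `n_samples`. Fixing the seed makes the generated dataset reproducible.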

As a continental domain is composed of thousands of hydrological catchments, the approach could first be tested on small basins, then scaled up to larger basins and finally to the full EFAS or GloFAS computational domain.
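
Starting with small basins implies cutting a sub-domain out of the full grid. A minimal sketch of one way to do this is shown below, assuming a hypothetical integer map `basin_ids` that labels every grid cell with the identifier of its catchment; neither that map nor the `crop_to_basin` helper comes from the challenge description.

```python
import numpy as np


def crop_to_basin(field, basin_ids, basin, fill=np.nan):
    """Return the bounding-box window of `field` covering one basin.

    Cells inside the window that belong to other basins are set to `fill`.
    `basin_ids` is a hypothetical map of per-cell catchment identifiers.
    """
    mask = basin_ids == basin
    if not mask.any():
        raise ValueError(f"basin {basin} not found in basin_ids")
    # Rows and columns that contain at least one cell of the basin.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    r0, r1 = rows[0], rows[-1] + 1
    c0, c1 = cols[0], cols[-1] + 1
    # astype(float) returns a copy, so the original field is left untouched.
    window = field[r0:r1, c0:c1].astype(float)
    window[~mask[r0:r1, c0:c1]] = fill
    return window
```

The same window would be applied to the state, forcing and static maps so that all inputs to the emulator stay aligned on the cropped grid.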

The details of the implementation, such as the data flow or the ML approach and libraries, will be discussed during the project. The candidates will be provided with a utility, interfacing with the LISFLOOD hydrological model, that will generate training datasets for the ML kernels.


Training/evaluation workflow:
![pic_FloodMule](https://user-images.githubusercontent.com/46716581/221345586-35a1628e-9e0e-4dda-bd90-f55655b0721e.png)

