
Larger-than-memory datasets with iris-esmf-regrid and dask #310

@dennissergeev

Description


📰 Custom Issue

I was wondering if you have any recommendations on what dask settings I should use if I want to regrid a larger-than-memory dataset using iris-esmf-regrid.

The dataset is from an LFRic C24 run, containing about ten 2D or 3D variables with 1000 time slices loaded from 100 files (i.e. 10 time slices per file). The data are chunked accordingly: 100 chunks per variable. The total size is about 16 GB on disk.

I can obviously process this in a file-by-file loop, but I hope to load the whole dataset and apply the regridding in one go. Currently my script halts because the regridding step consumes all available RAM. I understand this might be asking for too much specialised help, but any advice would be highly appreciated!
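For concreteness, here is a minimal sketch of the chunk-wise pattern I am hoping the regridding step would follow. The shapes, the dense weight matrix, and the `map_blocks` call are all illustrative toys, not the real iris-esmf-regrid API (which uses precomputed sparse ESMF weights internally); the point is only that applying weights per time-chunk should keep peak memory at one chunk, not the whole dataset:

```python
# Toy stand-in for chunk-wise regridding with dask: treat regridding as
# applying a (target x source) weight matrix to each time chunk.
# Illustrative only -- not the iris-esmf-regrid API.
import numpy as np
import dask.array as da

n_time, n_src, n_tgt = 1000, 64, 32

# Hypothetical regridding weights (dense here for simplicity; real
# regridders store these as a sparse matrix).
weights = np.ones((n_tgt, n_src)) / n_src  # simple mean over source cells

# Larger-than-memory source data: 1000 time slices, chunked 10 per chunk,
# matching the "10 time slices per file" layout above.
src = da.ones((n_time, n_src), chunks=(10, n_src))

# Apply the weights chunk by chunk; only one 10-slice chunk needs to be
# materialised at a time, so peak memory stays bounded by the chunk size.
regridded = da.map_blocks(
    lambda block: block @ weights.T,
    src,
    chunks=(10, n_tgt),
    dtype=src.dtype,
)

result = regridded.compute()
print(result.shape)  # (1000, 32)
```

If the library's regridder is applied lazily in this fashion, I would expect the scheduler to stream chunks through; my question is essentially which dask settings (scheduler, number of workers, chunk sizes) make that happen in practice rather than loading everything at once.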

Machine specs
                 OS : Linux
             CPU(s) : 8
            Machine : x86_64
       Architecture : 64bit
                RAM : 31.2 GiB
        Environment : Jupyter
        File system : ext4
         GPU Vendor : Intel
       GPU Renderer : Mesa Intel(R) UHD Graphics 620 (KBL GT2)
        GPU Version : 4.6 (Core Profile) Mesa 23.0.4-0ubuntu1~22.04.1

  Python 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:34:09) [GCC 12.3.0]


Labels: New: Issue (Highlight a new community raised "generic" issue)
