|
55 | 55 | "cell_type": "markdown", |
56 | 56 | "metadata": {}, |
57 | 57 | "source": [ |
58 | | - "- Tree of arbitrary groups\n", |
59 | | - "- Each holds arbitrary data in the form of arrays + metadata\n", |
60 | | - "- No relationship enforced between groups\n", |
61 | | - "- No relationship enforced between arrays within a group\n", |
62 | | - "- No concept of \"coordinates\" vs \"data\"\n", |
63 | | - "- No references from one group to another" |
| 58 | + "* **Tree of groups** – A zarr store is a tree of arbitrary groups.\n", |
| 59 | + "\n", |
| 60 | + "* **Separate groups** – No relationship enforced between groups, and no references from one group to another.\n", |
| 61 | + "\n", |
| 62 | + "* **Separate arrays** – No relationship enforced between arrays within a group.\n", |
| 63 | + "\n", |
| 64 | + "* **Arbitrary JSON metadata** – Each group holds arbitrary data in the form of arrays plus metadata." |
64 | 65 | ] |
65 | 66 | }, |
66 | 67 | { |
|
79 | 80 | "cell_type": "markdown", |
80 | 81 | "metadata": {}, |
81 | 82 | "source": [ |
82 | | - "How does zarr relate to `xarray`?\n", |
| 83 | + "### How does zarr relate to `xarray`?\n", |
| 84 | + "\n", |
| 85 | + "* **Arrays <-> `Variables`** - zarr arrays map well to `xarray.Variable` objects\n", |
| 86 | + " - Especially as zarr v3 includes (optional) `dimension_names`\n", |
| 87 | + "\n", |
| 88 | + "* **Groups <-> `Datasets`** - zarr groups map reasonably well to `xarray.Dataset` objects\n", |
| 89 | + " - Open a single zarr group in xarray via `xr.open_dataset(store, group='/path', engine='zarr')`\n", |
83 | 90 | "\n", |
84 | | - "- zarr arrays map well to `xarray.Variables`\n", |
85 | | - " - especially because zarr v3 includes (optional) `dimension_names`\n", |
86 | | - "- zarr groups map reasonably well to `xarray.Dataset` objects\n", |
87 | | - " - `xr.open_dataset(store, group='/path', engine='zarr')`\n", |
88 | | - " - but `xarray.Dataset`s require that all arrays in the Dataset have aligned dimensions\n", |
89 | | - " - so it is possible to create a zarr group that is not a valid `xarray.Dataset`, if the group contains arrays with non-aligning dimensions\n", |
90 | | - " - Also zarr has no concept of \"coordinate\" vs \"data\" variables\n", |
91 | | - " - so xarray has to save this piece of information as an additional piece of metadata \n", |
92 | | - "- zarr store has a tree of groups\n", |
| 91 | + "* **Groups must be alignable** - `xarray.Dataset`s require that all arrays in the Dataset have aligned dimensions\n", |
| 92 | + " - so a zarr group that contains arrays with non-aligning dimensions is not a valid `xarray.Dataset`\n", |
| 93 | + "\n", |
| 94 | + "* **No \"coordinates\"** – No arrays are special, so Zarr has no intrinsic concept of \"coordinate\" vs \"data\" variables.\n", |
| 95 | + " - So xarray has to record which arrays are coordinates as an additional piece of zarr metadata.\n", |
| 96 | + "\n", |
| 97 | + "* **Tree of groups <-> `DataTree`** - zarr store has a tree of groups\n", |
93 | 98 | " - maps to either a set of independent `xarray.Datasets`\n", |
94 | 99 | " - `xr.open_groups(store)`\n", |
95 | 100 | " - or to a single `xarray.DataTree`\n", |
|
192 | 197 | "cell_type": "markdown", |
193 | 198 | "metadata": {}, |
194 | 199 | "source": [ |
195 | | - "TIFF (Tag Image File Format) is a *flexible* raster container widely used in biosciences, remote sensing and GIS. \n", |
| 200 | + "TIFF (Tag Image File Format) is a raster container widely used in biosciences, remote sensing and GIS. \n", |
196 | 201 | "\n", |
197 | 202 | "A **GeoTIFF** is simply a TIFF that stores additional georeferencing information in tags (CRS, affine transform, etc.) so geospatial software knows where each pixel sits on Earth. \n", |
198 | 203 | "\n", |
|
206 | 211 | "\n", |
207 | 212 | "* **Compression / tiling** – DEFLATE, LZW, etc. Tiling lets software fetch small windows efficiently.\n", |
208 | 213 | "\n", |
209 | | - "### Practical notes for xarray users\n", |
210 | | - "\n", |
211 | | - "* **Read** – use `rioxarray.open_rasterio()` (wraps rasterio) to get an immediate, Dask-chunked DataArray.\n", |
212 | | - "\n", |
213 | | - "* **Write** – `DataArray.rio.to_raster(\"out.tif\")`; choose compression + tiling via driver_kwargs.\n", |
214 | | - "\n", |
215 | | - "* **Dimensionality** – TIFF is inherently 2-D per band; no native time or vertical axis. If you need 4-D data, NetCDF or Zarr is usually a better fit.\n", |
216 | | - "\n", |
217 | | - "* **Metadata depth** – single-level tags only (no nested groups). For rich hierarchies, stick to HDF5 / NetCDF-4.\n", |
218 | | - "\n", |
219 | | - "* **Cloud-optimized GeoTIFF (COG)** – same format, arranged so HTTP range requests can stream windows efficiently; xarray handles it transparently when rasterio is compiled with libcurl.\n" |
| 214 | + "* **Cloud-optimized GeoTIFF (COG)** – same format, arranged so HTTP range requests can stream windows efficiently; rasterio (and therefore `rioxarray`) reads it transparently when the underlying GDAL is built with libcurl." |
220 | 215 | ] |
221 | 216 | }, |
222 | 217 | { |
223 | 218 | "cell_type": "markdown", |
224 | 219 | "metadata": {}, |
225 | 220 | "source": [ |
226 | 221 | "### How does TIFF relate to xarray?\n", |
227 | | - "\n" |
| 222 | + "\n", |
| 223 | + "* **Dimensionality** – Each raster image maps well to a single `xarray.Variable`, but TIFF is inherently 2-D per band; no native time or vertical axis. If you need 4-D data, NetCDF or Zarr is usually a better fit.\n", |
| 224 | + "\n", |
| 225 | + "* **No named dimensions** - TIFFs don't have named dimensions for the two axes of the raster.\n", |
| 226 | + "\n", |
| 227 | + "* **IFDs as groups** - IFDs can be mapped to groups, which may be useful for multi-resolution TIFFs (also known as \"overviews\") and multi-page TIFFs.\n", |
| 228 | + "\n", |
| 229 | + "* **Metadata depth** – single-level tags only (no nested groups). For rich hierarchies, stick to HDF5 / NetCDF-4.\n", |
| 230 | + "\n", |
| 231 | + "* **Read** – use `rioxarray.open_rasterio()` (wraps rasterio) to get a lazily loaded DataArray, Dask-chunked if you pass `chunks`. However, `rioxarray` is for interacting with GeoTIFFs, not general TIFFs.\n", |
| 232 | + "\n", |
| 233 | + "* **Write** – `DataArray.rio.to_raster(\"out.tif\")`; choose compression and tiling by passing creation options as keyword arguments (e.g. `compress=\"deflate\"`, `tiled=True`)." |
228 | 234 | ] |
229 | 235 | }, |
230 | 236 | { |
|