AD4GD
diff --git a/readme.md b/readme.md
Lines changed: 5 additions & 0 deletions
diff --git a/src/1_pas.ipynb b/src/1_pas.ipynb
Lines changed: 71 additions & 34 deletions
@@ -35,6 +35,11 @@ To execute Jupyter Notebook users need to either:
 - navigate to http:localhost:9999 or
 - copy URL with token from the console and paste it to browser or another tool used to run Jupyter Notebook (for example, *http://localhost:9999/?token=abcd123400000000000000000000000*)
 
+If you experience issues when building the Docker image, try replacing the following line in the `Dockerfile`: \
+`FROM ghcr.io/osgeo/gdal:ubuntu-small-3.9.2 AS base` with \
+`FROM ghcr.io/osgeo/gdal:ubuntu-small-latest AS base` or \
+`FROM ghcr.io/osgeo/gdal:ubuntu-full-latest AS base`
+
 Now, everything is prepared to execute the Jupyter Notebooks within a Docker environment.
 
 #### Impact
 
@@ -110,6 +110,17 @@
     "- [geopandas built-in dataset from the Natural Earth](https://www.naturalearthdata.com/downloads/50m-cultural-vectors/50m-admin-0-countries-2/), but the dataset with the boundaries of countries is not curently available there."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The class below preprocesses the input data for the subsequent ingestion of protected areas:\n",
+    "- extracting the file names of all input land-use/land-cover (LULC) datasets and the years they represent\n",
+    "- sending a request to the ohsome API [endpoint](https://api.ohsome.org/v1/elements/geometry)\n",
+    "- fetching the country code(s) (ISO 3166-1 alpha-3 standard) from the bounding box of the input raster\n",
+    "\n"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -203,7 +214,6 @@
     "        # update lulc_series with files that exist\n",
     "        return existing_lulc_series\n",
     "        \n",
-    "    #NOTE Ohsome API is using openstreetmap data, which may not be the best source to fetch country codes from bounding box with. The GAUL dataset provided by FAO (UN) is a better source for this but it is not available through API.\n",
     "    def get_country_code_from_bbox(self, bbox:str, output_path:str) -> set:\n",
     "        \"\"\"\n",
     "        This function sends a request to the ohsome API to get the country code from a given bounding box\n",
@@ -242,9 +252,7 @@
     "            return unique_country_names\n",
     "        else:\n",
     "            raise Exception(f\"Error: {response.status_code}\")\n",
-    "\n",
-    "        \n",
-    "        \n",
+    "    \n",
     "    def fetch_lulc_country_codes(self, output_path:str) -> dict[set]:\n",
     "        \"\"\"\n",
     "        Fetch the country codes for the LULC rasters\n",
@@ -264,6 +272,13 @@
     "        return lulc_country_codes"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, let's run this class:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -283,7 +298,7 @@
    "source": [
     "#### Looping over countries from the bounding box\n",
     "\n",
-    "Now, we can loop over the countries of the bounding box of input raster dataset, fetch json response and convert them into GeoJSON format."
+    "To get the data from the World Database on Protected Areas, we define functions that convert the JSON responses from the Protected Planet API into a single merged GeoJSON for each country of interest."
    ]
   },
   {
@@ -413,6 +428,16 @@
     "        return geojson_filepath"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The following class is used to fetch data from the World Database on Protected Areas:\n",
+    "- defining the API token and the configuration parameter for marine areas (by default, marine areas are not considered), and accessing the list of country codes returned by the previous function\n",
+    "- combining all protected areas for each country, concatenating the paginated responses, and processing them as a single GeoJSON\n",
+    "- saving the GeoJSONs to the output directory and exporting them to GeoPackages"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -542,7 +567,6 @@
     "# define country codes from the previous block\n",
     "countries = unique_country_names\n",
     "\n",
-    "# TODO remove API response in final versions\n",
     "# directory to save GeoJSON files\n",
     "protected_areas_data_output_dir = os.path.join(pa_output_dir, \"pa_data\")\n",
     "os.makedirs(protected_areas_data_output_dir, exist_ok=True)\n",
@@ -574,6 +598,15 @@
     "4. Compress protected areas."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The code below transforms the fetched data on protected areas:\n",
+    "- reprojects the protected areas to the same CRS as the LULC raster dataset and filters them by year of establishment into separate GeoPackage files\n",
+    "- rasterises the vector data on protected areas, using the spatial extent, spatial resolution and CRS of the input LULC raster dataset"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -702,8 +735,6 @@
     "        except subprocess.CalledProcessError as e:\n",
     "            print(f\"Error rasterizing protected areas: {e}\")\n",
     "        \n",
-    "        \n",
-    "\n",
     "    def rasterize_pa_geopackage(self, lulc_metadata:RasterMetadata, pa_to_yearly_rasters:bool=True , keep_intermediate_gpkg:bool=False) -> None:\n",
     "        \"\"\"\n",
     "        Rasterizes the protected areas to the same extent and resolution as the LULC raster dataset.\n",
@@ -744,30 +775,12 @@
    "metadata": {},
    "source": [
     "It is important to extract year stamps from the filenames. \\\n",
-    "**WARNING:** The name of your input dataset must always end up with four-digit year before the file extension!"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
+    "**WARNING:** The name of your input dataset must always end with a four-digit year before the file extension! \\\n",
+    "Protected areas are filtered by year stamp according to their year of establishment.\n",
+    "\n",
     "Then, extent of LULC files (minimum and maximum coordinates) is extracted."
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Protected areas should be filtered by year stamp according to the PA's establishment year."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Rasterization function based on yearstamps of protected areas is launched."
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -781,6 +794,13 @@
     "os.makedirs(raster_output_dir, exist_ok=True)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The rasterization function, based on the year stamps of protected areas, is launched."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -806,6 +826,15 @@
     "4. Compression and assignment of null values."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The code below is responsible for:\n",
+    "- converting the shell script to Unix format in case it contains Windows-inherited symbols\n",
+    "- running the raster calculation shell command using `subprocess`"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -878,11 +907,22 @@
    "metadata": {},
    "source": [
     "#### 4. Updating landscape impedance\n",
-    "Impedance is reclassified by [CSV table](data/input/impedance/lulc_descr_esa.csv) and compressed (through LZW compression, not Cloud Optimised Geotiff standard to avoid any further issues in processing). Landscape impedance is required by Miramon ICT and Graphab tools both.\n",
+    "Impedance is reclassified using a [CSV table](data/input/impedance/lulc_descr_esa.csv) and compressed (with LZW compression rather than the Cloud Optimized GeoTIFF standard, to avoid issues in further processing). Landscape impedance (also known as landscape resistance or movement cost) is required by many software tools used in biodiversity studies, including Graphab and the MiraMon ICT plugin.\n",
     "\n",
     "Let's import another set of libraries needed and define the class to update the impedance."
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, we will update the landscape impedance dataset with the fetched data on protected areas:\n",
+    "- updating the impedance dataset based on either the reclassification table or the multiplier effect of protected areas. To change the update method, modify `lulc_reclass_table: false` in [the configuration file](config/config.yaml). By default, the multiplier effect of protected areas is used: the same LULC type with protected status is less difficult for species to pass through. For example, if the impedance of a non-protected LULC type is 30 and the multiplier is 0.3, the protected LULC type is assigned an impedance of 30 × 0.3 = 9.\n",
+    "- multiplying the raster by the effect of protected areas (if this method is chosen)\n",
+    "- generating a reclassification dictionary for impedance from the reclassification table, depending on the data type (if this method is chosen)\n",
+    "\n"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -942,7 +982,6 @@
     "        self.tiff_files = [f for f in os.listdir(self.input_folder) if f.endswith('_pa.tif')] # ADDED SUFFIX (UPDATED LULC)\n",
     "        self.impedance_files = [f for f in os.listdir(self.output_folder) if f.endswith('.tif')] # IMPEDANCE DATASET\n",
     "\n",
-    "\n",
     "    def update_impedance(self, impedance_dir:str) -> None:\n",
     "        \"\"\"\n",
     "        Updates the impedance dataset based on the reclassification table or the multiplier effect of protected areas.\n",
@@ -1012,7 +1051,6 @@
     "                \n",
     "                print(\"Multiplication complete for:\", impedance_in_path + \"\\n------------------------------------\")\n",
     "        \n",
-    "    \n",
     "    def apply_multiplier(self, impedance_in_path:str, impedance_out_path:str, lulc_path:str, reclass_table:str, pa_effect:float) -> str:\n",
     "        \"\"\"\n",
     "        Multiplies a raster based on the effect of protected areas.\n",
@@ -1110,7 +1148,6 @@
     "            \n",
     "        return reclass_dict , has_decimal , data_type\n",
     "\n",
-    "\n",
     "    def reclassify_raster(self, input_raster:str, output_raster:str, reclass_table:str) -> str:\n",
     "        \"\"\"\n",
     "        Reclassifies a raster based on a reclassification table.\n",
@@ -1207,7 +1244,7 @@
    "metadata": {},
    "source": [
     "#### 5. Updating landscape affinity \n",
-    "Landscape affinity is computed and compressed based on the math expression processing landscape impedance. By now, landscape affinity is computed as a reversed value of landscape impedance but it is planned to develop it as a more flexible input to compute connectivity further. This output is required by Miramon ICT software, not Graphab."
+    "Landscape affinity is computed from landscape impedance using a math expression and is then compressed. For now, landscape affinity is computed as a reversed value of landscape impedance, but we plan to develop it into a more flexible input for further connectivity computations. This output is required by the MiraMon ICT software, but not by Graphab."
    ]
   },
   {