92 changes: 92 additions & 0 deletions etc/scripts/d2d/README.rst
@@ -0,0 +1,92 @@
Run ScanCode.io Mapping Script
================================

This script executes the ``map_deploy_to_develop`` mapping workflow from
Review comment (Member): Insert an RST link to https://github.com/aboutcode-org/scancode.io/blob/main/docs/built-in-pipelines.rst?plain=1#L188; cross-links between related documentation pages are always useful.

Review comment (Member): Maybe a short description of d2d (similar to the pipeline docstring) would also be useful?

ScanCode.io inside a Docker container. It optionally spins up a temporary
PostgreSQL instance when needed. The script copies the specified input files to
a working directory, runs the mapping, writes the output to a file, and cleans
up afterward.

Usage
-----

.. code-block:: bash

    ./map-deploy-to-develop.sh <from-path> <to-path> <output-file> [--options <d2d-options>] [--spin-db] [--port <db-port>]

Arguments
---------

+-----------------+-------------------------------------------------------------+
| Argument        | Description                                                 |
+=================+=============================================================+
| ``from-path``   | Path to the base deployment/scan file                       |
Review comment (Member): The ``from`` and ``to`` sides here should be more closely aligned with the explanation at https://github.com/aboutcode-org/scancode.io/blob/main/scanpipe/pipelines/deploy_to_develop.py#L42, so it is clear that one is the source side and the other is the deployed side. The base/target names used here are new and a bit confusing.

+-----------------+-------------------------------------------------------------+
| ``to-path``     | Path to the target deployment/scan file                     |
+-----------------+-------------------------------------------------------------+
| ``--options``   | Optional D2D pipeline parameters (can be empty ``""``)      |
Review comment (Member): The ``options`` description should also be more descriptive, to communicate that these are ecosystem-specific optional steps. Instead of "can be empty", we can probably mention in some way which parameters are optional and which are required.

We should probably also have a reference page on all the ecosystems supported in d2d and the capabilities supported for each, and link to that page. I opened a separate issue for this: #1922

+-----------------+-------------------------------------------------------------+
| ``output-file`` | File where ScanCode.io output will be written               |
+-----------------+-------------------------------------------------------------+
| ``--spin-db``   | Start a temporary PostgreSQL container (off by default)     |
+-----------------+-------------------------------------------------------------+
| ``--port``      | Host port to bind Postgres (default: ``5432``)              |
+-----------------+-------------------------------------------------------------+


Example
-------

Run mapping without database:

.. code-block:: bash

    ./map-deploy-to-develop.sh ./from.tar.gz ./to.whl results.json

Run mapping with database on a custom port:

.. code-block:: bash

    ./map-deploy-to-develop.sh ./from.tar.gz ./to.whl output.json --options "Python,Java" --spin-db --port 5433

Script Actions
--------------

1. Validates required arguments
2. Starts PostgreSQL in Docker (if ``--spin-db`` is passed)
3. Creates a temporary working directory: ``./d2d``
4. Copies input files into working directory
5. Runs ScanCode.io mapping step:

   .. code-block:: text

       run map_deploy_to_develop:<D2D_OPTIONS> \
           "/code/<from-file>:from,/code/<to-file>:to"

6. Writes mapping output into ``output-file``
7. Cleans up temp directory
8. Stops DB container if it was started
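
Because the script runs with ``set -e``, the two cleanup steps at the end of the list are reached only when every earlier step succeeds. A minimal sketch of one way to make cleanup unconditional is an ``EXIT`` trap. The ``cleanup`` function name and the use of ``mktemp -d`` are assumptions for illustration; the script itself uses a fixed ``./d2d`` directory and has no trap.

```shell
# Sketch only: run cleanup on every exit path, including failures under "set -e".
set -e

# Assumption: a unique temp directory; the actual script uses ./d2d instead.
WORKDIR="$(mktemp -d)"
DB_STARTED=false

# Hypothetical helper, not part of the script in this change.
cleanup() {
    # Remove the temporary working directory (no error if already gone).
    rm -rf "$WORKDIR"
    # Remove the database container only if this run started it.
    if [ "$DB_STARTED" = true ]; then
        docker rm -f scancodeio-run-db >/dev/null 2>&1 || true
    fi
}

# The trap fires on normal exit and when "set -e" aborts the script.
trap cleanup EXIT
```

With this in place, the explicit ``rm -rf`` and ``docker rm -f`` calls at the end of the script would become redundant rather than harmful.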

Dependencies
------------

* Bash
* Docker
* Local filesystem permissions for creating ``./d2d`` and writing output
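
The dependencies above could be checked before any work starts. The following is a hedged sketch, not part of the script: ``missing_commands`` is a hypothetical helper that prints every command from its argument list that is not found on ``PATH``.

```shell
# Sketch only: print each required command that is not available on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        # "command -v" succeeds when the command can be resolved.
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Possible use at the top of a script:
#   missing="$(missing_commands docker)"
#   [ -z "$missing" ] || { echo "Missing dependencies: $missing" >&2; exit 1; }
```

Failing early this way gives a clearer message than a ``docker: command not found`` error midway through the run.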


Before Running the Script
-------------------------

Ensure the script has execute permissions:

.. code-block:: bash

    chmod +x map-deploy-to-develop.sh

Then execute:

.. code-block:: bash

    ./map-deploy-to-develop.sh ...
108 changes: 108 additions & 0 deletions etc/scripts/d2d/map-deploy-to-develop.sh
@@ -0,0 +1,108 @@
#!/bin/bash
set -e

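# Validate the positional arguments before shifting them away: under "set -e",
# "shift 3" with fewer than three arguments would exit without a usage message.
if [ "$#" -lt 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "Missing required arguments!"
    echo "Usage: $0 <from-path> <to-path> <output-file> [--options <d2d-options>] [--spin-db] [--port <db-port>]"
    exit 1
fi

FROM_PATH="$1"
TO_PATH="$2"
OUTPUT_FILE="$3"

D2D_OPTIONS=""
SPIN_DB=false
DB_PORT=5432

# Parse the optional flags that follow the three positional arguments.
shift 3
while [[ "$#" -gt 0 ]]; do
    case "$1" in
        --options)
            D2D_OPTIONS="$2"
            shift 2
            ;;
        --spin-db)
            SPIN_DB=true
            shift 1
            ;;
        --port)
            DB_PORT="$2"
            shift 2
            ;;
        *)
            echo "Unknown parameter: $1"
            exit 1
            ;;
    esac
done

# Fall back to the default port if an empty value was passed to --port.
if [ -z "$DB_PORT" ]; then
    DB_PORT=5432
fi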

echo "Arguments:"
echo "FROM_PATH: $FROM_PATH"
echo "TO_PATH: $TO_PATH"
echo "D2D_OPTIONS: $D2D_OPTIONS"
echo "OUTPUT_FILE: $OUTPUT_FILE"
echo "SPIN_DB: $SPIN_DB"
echo "DB_PORT: $DB_PORT"

DB_STARTED=false

if [ "$SPIN_DB" = true ]; then
    echo "Starting Postgres container on port $DB_PORT..."

    docker run -d \
        --name scancodeio-run-db \
        -e POSTGRES_DB=scancodeio \
        -e POSTGRES_USER=scancodeio \
        -e POSTGRES_PASSWORD=scancodeio \
        -e POSTGRES_INITDB_ARGS="--encoding=UTF-8 --lc-collate=en_US.UTF-8 --lc-ctype=en_US.UTF-8" \
        -v scancodeio_pgdata:/var/lib/postgresql/data \
        -p "${DB_PORT}:5432" \
        postgres:17 || {
            echo "Failed to start DB container. Cleaning up…"
            docker rm -f scancodeio-run-db >/dev/null 2>&1 || true
            exit 1
        }

    DB_STARTED=true
    echo "DB container started"
fi

WORKDIR="d2d"
mkdir -p "$WORKDIR"

cp "$FROM_PATH" "$WORKDIR/"
cp "$TO_PATH" "$WORKDIR/"

FROM_FILENAME=$(basename "$FROM_PATH")
TO_FILENAME=$(basename "$TO_PATH")

echo "Running ScanCode.io mapping..."

docker run --rm \
    -v "$(pwd)/$WORKDIR":/code \
    --network host \
    -e SCANCODEIO_NO_AUTO_DB=1 \
    ghcr.io/aboutcode-org/scancode.io:latest \
    run map_deploy_to_develop:"$D2D_OPTIONS" \
    "/code/${FROM_FILENAME}:from,/code/${TO_FILENAME}:to" \
    > "$OUTPUT_FILE"

echo "Output saved to $OUTPUT_FILE"


rm -rf "$WORKDIR"
echo "Temporary directory cleaned up"

if [ "$DB_STARTED" = true ]; then
    echo "Stopping DB container..."
    docker rm -f scancodeio-run-db >/dev/null 2>&1 || true
    echo "DB container removed"
fi

echo "Done!"
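
One thing the script does not do is wait for PostgreSQL to accept connections: it proceeds to the mapping step as soon as `docker run -d` returns, so the mapping may start before the database is ready. A minimal sketch of a retry helper follows. `wait_for` is a hypothetical name, and the usage line assumes the `scancodeio-run-db` container name from the script and the `pg_isready` utility shipped in the `postgres` image.

```shell
# Sketch only: retry a command once per second until it succeeds
# or the given number of attempts is used up.
wait_for() {
    local attempts="$1"
    shift
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        # Run the remaining arguments as a command, discarding its output.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Possible use right after the DB container is started:
#   wait_for 30 docker exec scancodeio-run-db pg_isready -U scancodeio
```

The helper returns non-zero when the command never succeeds, so under `set -e` the script would stop instead of running the mapping against a database that is not ready.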