
Metadata storage for jobs/workflows #240

@anjensan

Description

Metadata storage for bigflow jobs/workflows

There are several use cases for a simple document/key-value storage:

  1. Save (append) information about executed workflows/jobs:
    ID, run time, Docker image hash, execution time, cost estimate, result, etc.
    Essentially structured logs, which can be used to inspect execution
    history and do some (manual) cost estimation. (A minimal sketch of
    such a record follows this list.)

  2. Query for running workflows/jobs and their status (history and/or currently running workflows):

    bigflow history -w workflow_id

    Such a CLI API might be a first step towards an "airflow-free" solution
    (i.e. the ability to replace Airflow with a custom cron-like service).

  3. Communicate between tasks/workflows.
    In some rare cases one workflow might want to check the status of another.
    A workflow might also check whether another instance of itself is currently running.
    This is especially important for dev-like environments, where
    workflows are executed locally (via bigflow run).

  4. Persist some information between tasks/jobs,
    like 'last-processed-id' (for incremental processing),
    last time-per-batch (to auto-adjust batch size), etc.
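
To make the use cases concrete, here is a minimal sketch of such a store. Everything in it (RunRecord, MetadataStore, all field and method names) is hypothetical, since the client-visible API is still TBD; an in-memory dict stands in for the real backend.

    import datetime
    from dataclasses import dataclass
    from typing import Dict, List, Optional


    @dataclass
    class RunRecord:
        """One append-only log entry per executed workflow/job (use case 1)."""
        workflow_id: str
        run_id: str
        docker_image_hash: str
        started_at: datetime.datetime
        execution_time_s: float
        cost_estimate_usd: Optional[float] = None
        result: str = "unknown"  # e.g. "success" / "failed" / "running"


    class MetadataStore:
        """Hypothetical client-visible API; in-memory stand-in for a real DB."""

        def __init__(self) -> None:
            self._runs: List[RunRecord] = []
            self._kv: Dict[str, str] = {}

        def append_run(self, record: RunRecord) -> None:
            # Use case 1: append structured execution logs.
            self._runs.append(record)

        def history(self, workflow_id: str) -> List[RunRecord]:
            # Use case 2: would back `bigflow history -w workflow_id`.
            return [r for r in self._runs if r.workflow_id == workflow_id]

        def is_running(self, workflow_id: str) -> bool:
            # Use case 3: check whether another instance is currently running.
            return any(r.result == "running" for r in self.history(workflow_id))

        def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
            # Use case 4: persist small state like 'last-processed-id'.
            return self._kv.get(key, default)

        def set(self, key: str, value: str) -> None:
            self._kv[key] = value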

Database: anything will do for use case 1; BigQuery or any SQL-like DB would cover use cases 1-4.
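
For the BigQuery option, appending a run record can be a plain streaming insert; a sketch assuming the standard google-cloud-bigquery client and a pre-created table (the table name and row shape are assumptions, nothing is decided):

    from google.cloud import bigquery

    # Hypothetical project/dataset/table for the run log.
    TABLE_ID = "my-project.bigflow_metadata.job_runs"


    def append_run_row(client: bigquery.Client, row: dict) -> None:
        # Use case 1: stream one execution-log row into BigQuery.
        errors = client.insert_rows_json(TABLE_ID, [row])
        if errors:
            raise RuntimeError(f"Failed to append run record: {errors}")

Querying history (use case 2) then becomes an ordinary SELECT over this table.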

Client-visible API: TBD.
