Derive default S3 creds from env vars via new commands #10

@mogul

Description

To make this connector easier to operate and to improve its abstraction, we should change how credentials are supplied and used.

Here's a spec. Ask questions to resolve unclear points and present an implementation summary for approval before you actually generate the PR.


Specification

1. Goals:

  • Improve the abstraction of the SpiffWorkflow connector.
  • Simplify the process modeler's experience.
  • Enhance security by no longer requiring credentials to be passed as command parameters.

2. New Commands:

  • artifacts/GenerateArtifact:

    • Description: Generates an artifact based on the provided input and stores it (in the configured S3 bucket).
    • Input Parameters:
      • id: (Required) A string value that corresponds to the artifact in storage. This will be used to name the object in the S3 bucket (e.g., "case1234/report.pdf").
      • template: (Required) The name of an HTML template to use. Valid names come from the set of templates deployed in the connector in the templates subdirectory.
      • data: (Required) The data used to populate the template.
      • generate_links: (Optional, Boolean, Default: false) If true, a signed link to the artifact will be generated and returned.
      • storage: (Optional, s3:// URL) If provided, the connector will store the artifact in the given S3 bucket instead of the internally configured one.
    • Output:
      • private_link: A link to the artifact in the S3 bucket. This link only works with appropriate AWS IAM authentication.
      • presigned_link: (Conditional) A time-limited, authenticated link to access the artifact directly, if generate_links was set to true.
  • artifacts/GetLinkToArtifact:

    • Description: Generates links to an existing artifact in the S3 bucket.
    • Input Parameters:
      • id: (Required) A string value that corresponds to the artifact in storage. This is the name of the existing object in the S3 bucket (e.g., "case1234/report.pdf").
      • storage: (Optional, s3:// URL) If provided, the connector will look for the artifact in the given S3 bucket instead of the internally configured one.
    • Output:
      • private_link: A link to the artifact in the S3 bucket. This link only works with appropriate AWS IAM authentication.
      • presigned_link: A time-limited, authenticated link to access the artifact directly.
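
The id and storage parameters above could be resolved to a bucket and object key along these lines. This is a minimal sketch, not the connector's implementation: the helper names resolve_location and private_link are hypothetical, and treating a path in the s3:// URL as a key prefix is an assumption, since the spec does not say how a path should be handled.

```python
from urllib.parse import urlparse


def resolve_location(artifact_id, default_bucket, storage=None):
    """Return (bucket, key) for an artifact, honoring an optional s3:// override."""
    if storage is None:
        # No override: use the internally configured bucket.
        return default_bucket, artifact_id
    parsed = urlparse(storage)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"storage must be an s3:// URL, got {storage!r}")
    # Assumption: any path in the URL is treated as a key prefix.
    prefix = parsed.path.strip("/")
    key = f"{prefix}/{artifact_id}" if prefix else artifact_id
    return parsed.netloc, key


def private_link(bucket, key):
    """Build the s3:// link returned as private_link."""
    return f"s3://{bucket}/{key}"
```

Both commands could share these helpers, which fits the DRY guidance in the spec.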

3. Configuration:

The connector should be configurable via environment variables or a configuration file. Key configuration parameters include:

  • S3_BUCKET: (Required) The name of the S3 bucket to use.
  • S3_REGION: (Required) The AWS region of the S3 bucket.
  • AWS_ACCESS_KEY_ID: (Required) The AWS access key ID.
  • AWS_SECRET_ACCESS_KEY: (Required) The AWS secret access key.
  • SIGNED_LINK_EXPIRATION: (Optional, Default: 3600 seconds) The expiration time for signed links.

If S3_BUCKET, S3_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY are not provided, look for a Cloud Foundry-style VCAP_SERVICES environment variable, which contains JSON. If it lists an S3 service instance named "artifacts", use the credentials provided in that entry.
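
The lookup order described above might be sketched as follows. Several details are assumptions rather than part of the spec: the function name load_s3_config is hypothetical, the credential key names (bucket, region, access_key_id, secret_access_key) vary between service brokers, the sketch matches on the instance name under any service label, and it falls back to VCAP_SERVICES whenever any of the four explicit variables is missing.

```python
import json

ENV_KEYS = ("S3_BUCKET", "S3_REGION", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def load_s3_config(environ):
    """Resolve S3 settings from explicit variables, else from VCAP_SERVICES."""
    expiration = int(environ.get("SIGNED_LINK_EXPIRATION", "3600"))
    if all(environ.get(key) for key in ENV_KEYS):
        return {
            "bucket": environ["S3_BUCKET"],
            "region": environ["S3_REGION"],
            "access_key_id": environ["AWS_ACCESS_KEY_ID"],
            "secret_access_key": environ["AWS_SECRET_ACCESS_KEY"],
            "expiration": expiration,
        }
    # VCAP_SERVICES maps each service label to a list of bound instances.
    vcap = json.loads(environ.get("VCAP_SERVICES", "{}"))
    for instances in vcap.values():
        for instance in instances:
            if instance.get("name") == "artifacts":
                creds = instance["credentials"]
                # Assumption: these key names depend on the service broker.
                return {
                    "bucket": creds["bucket"],
                    "region": creds["region"],
                    "access_key_id": creds["access_key_id"],
                    "secret_access_key": creds["secret_access_key"],
                    "expiration": expiration,
                }
    raise RuntimeError("No S3 configuration found in environment or VCAP_SERVICES")
```

In the connector this would be called as load_s3_config(os.environ). Taking the mapping as an argument keeps the function easy to test.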

4. Security Considerations:

  • The connector should be configured to use a private S3 bucket; it's the responsibility of the operator to ensure this is the case.
  • Access to artifacts should be controlled via signed links.
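
A signed link of the kind described above could be produced with boto3's generate_presigned_url client method, roughly as sketched here. The wrapper name presigned_link is hypothetical. The client is passed in rather than created inside the function, so the sketch does not assume how the connector builds or caches its boto3 client.

```python
def presigned_link(s3_client, bucket, key, expires_in=3600):
    """Return a time-limited GET URL for a private S3 object.

    s3_client is expected to be a boto3 S3 client, for example
    boto3.client("s3", region_name=...). expires_in is in seconds and
    would come from SIGNED_LINK_EXPIRATION.
    """
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

Because the URL is signed with the connector's own credentials, the process modeler never handles AWS keys and the bucket can stay private.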

5. Backward Compatibility:

  • The old pdf/pdf_to_s3 command should be deprecated but remain functional for a period of time so that existing workflows can be updated. A warning message should be logged whenever the old command is used. The new artifacts/GenerateArtifact command should closely follow the implementation of pdf/pdf_to_s3, while using the new command name and parameter abstraction.
  • It's fine to use features of the boto3 library when that will simplify the new and existing code. Use DRY principles where the complexity of doing so is low.
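
One way to keep the old command working while logging a warning is to make it a thin wrapper around the new code path. Everything in this sketch is illustrative: the function names, the logger name, the old command's parameter names, and the placeholder body of generate_artifact are hypothetical and do not reflect the connector's actual command interface.

```python
import logging

logger = logging.getLogger("connector.pdf")


def generate_artifact(artifact_id, template, data, generate_links=False, storage=None):
    """Placeholder for the shared implementation behind artifacts/GenerateArtifact."""
    # A real implementation would render the template and upload the result.
    return {"private_link": f"s3://your-bucket/{artifact_id}"}


def pdf_to_s3(object_name, template, data):
    """Deprecated entry point kept so that existing workflows continue to run."""
    logger.warning("pdf/pdf_to_s3 is deprecated; use artifacts/GenerateArtifact instead")
    # Delegate to the new command so both share one implementation (DRY).
    return generate_artifact(artifact_id=object_name, template=template, data=data)
```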

6. Example Usage:

  • Generating an artifact and getting a signed link:

    {
      "command": "artifacts/GenerateArtifact",
      "parameters": {
        "id": "processID/report.pdf",
        "template": "ce.html",
        "data": { "name": "World", "value": 42 },
        "generate_links": true
      }
    }

    Output:

    {
      "private_link": "s3://your-bucket/processID/report.pdf",
      "presigned_link": "https://your-bucket.s3.amazonaws.com/processID/report.pdf?AWSAccessKeyId=..."
    }

  • Getting a signed link to an existing artifact:

    {
      "command": "artifacts/GetLinkToArtifact",
      "parameters": {
        "id": "processID/report.pdf"
      }
    }

    Output:

    {
      "private_link": "s3://your-bucket/processID/report.pdf",
      "presigned_link": "https://your-bucket.s3.amazonaws.com/processID/report.pdf?AWSAccessKeyId=..."
    }

7. Future Considerations:

  • More granular control over signed link restrictions (e.g., IP address restrictions).
  • Consider adding functionality to delete artifacts.
  • Consider adding metadata to the artifact in S3.
