feat(ingestion/snaplogic): Add snaplogic as a source for metadata ingestion #14231

Open
wants to merge 1 commit into
base: master
4 changes: 4 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/constants.ts
@@ -36,6 +36,7 @@ import qlikLogo from '@images/qliklogo.png';
import redshiftLogo from '@images/redshiftlogo.png';
import sacLogo from '@images/saclogo.svg';
import sigmaLogo from '@images/sigmalogo.png';
import snaplogic from '@images/snaplogic.png';
import snowflakeLogo from '@images/snowflakelogo.png';
import supersetLogo from '@images/supersetlogo.png';
import tableauLogo from '@images/tableaulogo.png';
@@ -149,6 +150,8 @@ export const NEO4J = 'neo4j';
export const NEO4J_URN = `urn:li:dataPlatform:${NEO4J}`;
export const VERTEX_AI = 'vertexai';
export const VERTEXAI_URN = `urn:li:dataPlatform:${VERTEX_AI}`;
export const SNAPLOGIC = 'snaplogic';
export const SNAPLOGIC_URN = `urn:li:dataPlatform:${SNAPLOGIC}`;

export const PLATFORM_URN_TO_LOGO = {
[ATHENA_URN]: athenaLogo,
@@ -196,6 +199,7 @@ export const PLATFORM_URN_TO_LOGO = {
[DATAHUB_URN]: datahubLogo,
[NEO4J_URN]: neo4j,
[VERTEXAI_URN]: vertexAI,
[SNAPLOGIC_URN]: snaplogic,
};

export const SOURCE_TO_PLATFORM_URN = {
8 changes: 8 additions & 0 deletions datahub-web-react/src/app/ingest/source/builder/sources.json
@@ -15,6 +15,14 @@
"docsUrl": "https://docs.datahub.com/docs/quick-ingestion-guides/redshift/overview",
"recipe": "source: \n type: redshift\n config:\n # Coordinates\n host_port: # Your Redshift host and post, e.g. example.something.us-west-2.redshift.amazonaws.com:5439\n database: # Your Redshift database, e.g. SampleDatabase\n\n # Credentials\n # Add secret in Secrets Tab with relevant names for each variable\n username: null # Your Redshift username, e.g. admin\n\n table_lineage_mode: stl_scan_based\n include_table_lineage: true\n include_tables: true\n include_views: true\n profiling:\n enabled: true\n profile_table_level_only: true\n stateful_ingestion:\n enabled: true"
},
{
"urn": "urn:li:dataPlatform:snaplogic",
"name": "snaplogic",
"displayName": "Snaplogic",
"description": "Import lineage from Snaplogic.",
"docsUrl": "https://docs.datahub.com/docs/quick-ingestion-guides/snaplogic/overview",
"recipe": "source:\n type: snaplogic\n config:\n username: # username\n password: # password\n base_url: https://elastic.snaplogic.com\n org_name: # Organization name from Snaplogic instance\n stateful_ingestion:\n enabled: True\n remove_stale_metadata: False\n"
},
{
"urn": "urn:li:dataPlatform:snowflake",
"name": "snowflake",
Binary file added datahub-web-react/src/images/snaplogic.png
74 changes: 74 additions & 0 deletions datahub-web-react/src/images/snaplogic.svg
72 changes: 72 additions & 0 deletions metadata-ingestion/docs/sources/snaplogic/snaplogic_pre.md
@@ -0,0 +1,72 @@
## Integration Details

<!-- Plain-language description of what this integration is meant to do. -->
<!-- Include details about where metadata is extracted from (ie. logs, source API, manifest, etc.) -->

This integration extracts data lineage information from the public SnapLogic Lineage API and ingests it into DataHub. It enables visibility into how data flows through SnapLogic pipelines by capturing metadata directly from the source API. This allows users to track data transformations and dependencies across their data ecosystem, enhancing observability, governance, and impact analysis within DataHub.

### Concept Mapping

<!-- This should be a manual mapping of concepts from the source to the DataHub Metadata Model -->
<!-- Authors should provide as much context as possible about how this mapping was generated, including assumptions made, known shortcuts, & any other caveats -->

This ingestion source maps the following Source System Concepts to DataHub Concepts:

<!-- Remove all unnecessary/irrelevant DataHub Concepts -->

| Source Concept | DataHub Concept | Notes |
| -------------- | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------- |
| Snap-pack | [Data Platform](docs/generated/metamodel/entities/dataPlatform.md) | Snap-packs are mapped to Data Platforms, either directly (e.g., Snowflake) or dynamically based on connection details (e.g., JDBC URL). |
| Table/Dataset | [Dataset](docs/generated/metamodel/entities/dataset.md) | Varies by Snap type: a table for SQL databases, a topic for Kafka, and so on. |
| Snap | [Data Job](docs/generated/metamodel/entities/dataJob.md) | |
| Pipeline | [Data Flow](docs/generated/metamodel/entities/dataFlow.md) | |
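
The mapping above can be pictured in terms of DataHub URNs. The following is a minimal sketch, assuming the standard DataHub URN layouts for datasets, data flows, and data jobs. The helper names and the sample identifiers (`orders_pipeline`, `snowflake_insert_1`, `db.schema.orders`) are hypothetical, and the actual source may derive its identifiers differently.

```python
# Illustrative helpers only: they are not part of the SnapLogic source,
# and the identifiers used below are made up for the example.


def platform_urn(platform: str) -> str:
    """Snap-pack -> Data Platform."""
    return f"urn:li:dataPlatform:{platform}"


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Table/topic touched by a Snap -> Dataset."""
    return f"urn:li:dataset:({platform_urn(platform)},{name},{env})"


def data_flow_urn(pipeline_id: str, cluster: str = "PROD") -> str:
    """SnapLogic pipeline -> Data Flow (orchestrator assumed to be 'snaplogic')."""
    return f"urn:li:dataFlow:(snaplogic,{pipeline_id},{cluster})"


def data_job_urn(pipeline_id: str, snap_id: str, cluster: str = "PROD") -> str:
    """Snap inside a pipeline -> Data Job nested under its Data Flow."""
    return f"urn:li:dataJob:({data_flow_urn(pipeline_id, cluster)},{snap_id})"


if __name__ == "__main__":
    print(platform_urn("snowflake"))
    print(dataset_urn("snowflake", "db.schema.orders"))
    print(data_flow_urn("orders_pipeline"))
    print(data_job_urn("orders_pipeline", "snowflake_insert_1"))
```
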

## Metadata Ingestion Quickstart

### Prerequisites

To ingest lineage from SnapLogic, you need valid SnapLogic credentials with access to the SnapLogic Lineage API.

### Install the Plugin(s)

Run the following command to install the plugin:

`pip install 'acryl-datahub[snaplogic]'`

### Configure the Ingestion Recipe(s)

Use the following recipe(s) to get started with ingestion.

#### `'acryl-datahub[snaplogic]'`

```yml
pipeline_name: <action-pipeline-name>
source:
  type: snaplogic
  config:
    username: <snaplogic-username>
    password: <snaplogic-password>
    base_url: https://elastic.snaplogic.com
    org_name: <snaplogic-org-name>
    stateful_ingestion:
      enabled: True
      remove_stale_metadata: False
```
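
The same recipe can also be assembled in Python, which keeps the password out of a checked-in file. This is a minimal sketch: the environment variable names `SNAPLOGIC_USERNAME` and `SNAPLOGIC_PASSWORD` and the `build_recipe` helper are assumptions made for illustration, not something the plugin defines.

```python
import os
from typing import Any, Dict, Mapping

DEFAULT_BASE_URL = "https://elastic.snaplogic.com"


def build_recipe(
    pipeline_name: str,
    org_name: str,
    env: Mapping[str, str] = os.environ,
    base_url: str = DEFAULT_BASE_URL,
) -> Dict[str, Any]:
    """Build a recipe dict mirroring the YAML above.

    Credentials come from hypothetical environment variables
    (SNAPLOGIC_USERNAME / SNAPLOGIC_PASSWORD) rather than literals.
    """
    missing = [
        key
        for key in ("SNAPLOGIC_USERNAME", "SNAPLOGIC_PASSWORD")
        if not env.get(key)
    ]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")

    return {
        "pipeline_name": pipeline_name,
        "source": {
            "type": "snaplogic",
            "config": {
                "username": env["SNAPLOGIC_USERNAME"],
                "password": env["SNAPLOGIC_PASSWORD"],
                "base_url": base_url,
                "org_name": org_name,
                "stateful_ingestion": {
                    "enabled": True,
                    "remove_stale_metadata": False,
                },
            },
        },
    }
```

With `acryl-datahub[snaplogic]` installed, a dict of this shape could be handed to `datahub.ingestion.run.pipeline.Pipeline.create`. Depending on your setup, a `sink` section may also be needed.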

<details>
<summary>View All Recipe Configuration Options</summary>

| Field | Required | Default | Description |
| ----------------------------- | :------: | :---------------------------: | --------------------------------------------------------------- |
| `username` | ✅ | | SnapLogic account login. |
| `password` | ✅ | | SnapLogic account password. |
| `base_url` | ✅ | https://elastic.snaplogic.com | SnapLogic base URL. |
| `org_name` | ✅ | | Organization name in the SnapLogic platform. |
| `namespace_mapping` | ❌ | | Maps namespaces to platform instances. |
| `case_insensitive_namespaces` | ❌ | | List of namespaces to treat as case-insensitive. |

</details>
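
The two optional fields can be read as follows. This is only one plausible interpretation, sketched under stated assumptions: that `namespace_mapping` is a plain lookup from a lineage namespace to a platform instance, and that names in a case-insensitive namespace are lowercased so they resolve to the same dataset. Both helper functions are hypothetical and are not taken from the source implementation.

```python
from typing import Iterable, Mapping, Optional


def resolve_platform_instance(
    namespace: str, namespace_mapping: Mapping[str, str]
) -> Optional[str]:
    """Assumed behavior: exact-match lookup; unmapped namespaces get no instance."""
    return namespace_mapping.get(namespace)


def normalize_dataset_name(
    namespace: str, name: str, case_insensitive_namespaces: Iterable[str]
) -> str:
    """Assumed behavior: lowercase names only for namespaces listed as case-insensitive."""
    if namespace in set(case_insensitive_namespaces):
        return name.lower()
    return name


if __name__ == "__main__":
    mapping = {"snowflake://snaplogic": "snaplogic"}
    insensitive = ["snowflake://snaplogic"]
    print(resolve_platform_instance("snowflake://snaplogic", mapping))
    print(normalize_dataset_name("snowflake://snaplogic", "DB.SCHEMA.ORDERS", insensitive))
```
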

## Troubleshooting

### [Common Issue]
15 changes: 15 additions & 0 deletions metadata-ingestion/docs/sources/snaplogic/snaplogic_recipe.yml
@@ -0,0 +1,15 @@
pipeline_name: "snaplogic_incremental_ingestion"
source:
  type: snaplogic
  config:
    username: example@snaplogic.com
    password: password
    base_url: https://elastic.snaplogic.com
    org_name: "ExampleOrg"
    namespace_mapping:
      snowflake://snaplogic: snaplogic
    case_insensitive_namespaces:
      - snowflake://snaplogic
    stateful_ingestion:
      enabled: True
      remove_stale_metadata: False
3 changes: 3 additions & 0 deletions metadata-ingestion/setup.py
@@ -569,6 +569,7 @@
# databricks is alias for unity-catalog and needs to be kept in sync
"databricks": databricks | sql_common,
"fivetran": snowflake_common | bigquery_common | sqlalchemy_lib | sqlglot_lib,
"snaplogic": set(),
"qlik-sense": sqlglot_lib | {"requests", "websocket-client"},
"sigma": sqlglot_lib | {"requests"},
"sac": sac,
@@ -700,6 +701,7 @@
"redshift",
"s3",
"snowflake",
"snaplogic",
"slack",
"tableau",
"teradata",
@@ -842,6 +844,7 @@
"gcs = datahub.ingestion.source.gcs.gcs_source:GCSSource",
"sql-queries = datahub.ingestion.source.sql_queries:SqlQueriesSource",
"fivetran = datahub.ingestion.source.fivetran.fivetran:FivetranSource",
"snaplogic = datahub.ingestion.source.snaplogic.snaplogic:SnaplogicSource",
"qlik-sense = datahub.ingestion.source.qlik_sense.qlik_sense:QlikSenseSource",
"sigma = datahub.ingestion.source.sigma.sigma:SigmaSource",
"sac = datahub.ingestion.source.sac.sac:SACSource",
@@ -2922,6 +2922,38 @@
"platform_name": "Slack",
"support_status": "TESTING"
},
"snaplogic": {
"capabilities": [
{
"capability": "LINEAGE_FINE",
"description": "Enabled by default",
"subtype_modifier": null,
"supported": true
},
{
"capability": "DELETION_DETECTION",
"description": "Not supported yet",
"subtype_modifier": null,
"supported": false
},
{
"capability": "PLATFORM_INSTANCE",
"description": "Snaplogic does not support platform instances",
"subtype_modifier": null,
"supported": false
},
{
"capability": "LINEAGE_COARSE",
"description": "Enabled by default",
"subtype_modifier": null,
"supported": true
}
],
"classname": "datahub.ingestion.source.snaplogic.snaplogic.SnaplogicSource",
"platform_id": "snaplogic",
"platform_name": "Snaplogic",
"support_status": "TESTING"
},
"snowflake": {
"capabilities": [
{
@@ -3565,4 +3597,4 @@
"support_status": "CERTIFIED"
}
}
}
}
Empty file.