Merged
Changes from 7 commits
19 commits
@@ -14,7 +14,6 @@ For your convenience, here is an overview of the contents of this document:
| [Setting up {{< product-c8y-iot >}} DataHub](/datahub/setting-up-datahub) | Set up {{< product-c8y-iot >}} DataHub and its components |
| [Working with {{< product-c8y-iot >}} DataHub](/datahub/working-with-datahub) | Manage offloading pipelines and query the offloaded results |
| [Operating {{< product-c8y-iot >}} DataHub](/datahub/operating-datahub) | Run administrative tasks |
| [Running {{< product-c8y-iot >}} DataHub on {{< product-c8y-iot >}} Edge](/datahub/running-datahub-on-the-edge) | Run the Edge edition of {{< product-c8y-iot >}} DataHub |
| [Integrating {{< product-c8y-iot >}} DataHub with other products](/datahub/integrating-datahub-with-other-products) | Learn how to integrate {{< product-c8y-iot >}} DataHub with other products |

The [change log](/change-logs/?component=.component-datahub) provides an overview of features, changes, and other relevant information.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

9 changes: 0 additions & 9 deletions content/datahub/running-datahub-on-the-edge.md

This file was deleted.

@@ -98,11 +98,7 @@ Value: Your key name, for example, `arn:aws:kms:eu-west-2:123456789012:key/071a8
**SSE-C**: The client specifies a base64-encoded AES-256 key to be used to encrypt and decrypt the data. **{{< product-c8y-iot >}} DataHub does not support this option.**

##### NAS {#nas}
**NAS** is a storage system mounted (NFS, SMB) directly into the Dremio cluster. It is only available for {{< product-c8y-iot >}} Edge installations. The following settings must be defined for this data lake:

|Settings|Description|
|:---|:---|
|Mount path|The mount path refers to a path in the local Linux file system on both the coordinator and executor containers. By default, the file system of {{< product-c8y-iot >}} Edge is mounted into /datalake inside the containers. To use some other folder, you must map the folder into both containers, for example, to /datalake inside the containers.|
**NAS** is a storage system mounted (NFS, SMB) directly into the Dremio cluster. It is only available on {{< product-c8y-iot >}} Edge installations.

#### Saving settings {#saving-settings}
Once all settings are defined, click **Save** in the action bar to the right. During the save process, the following steps are automatically conducted:
31 changes: 31 additions & 0 deletions content/edge-kubernetes/datahub.md
@@ -0,0 +1,31 @@
---
weight: 80
title: DataHub
layout: bundle
sector:
- edge_server
---

{{< product-c8y-iot >}} DataHub on Edge is an optional component of Edge that offers the same functionality as a cloud installation of {{< product-c8y-iot >}} DataHub. The significant difference is that processes and data are entirely local to your network rather than in the cloud. You can define offloading pipelines, which regularly move data from the Operational Store of {{< product-c8y-iot >}} into a data lake. In the Edge setup, a NAS or local disk is used as the data lake. Dremio, the internal engine of {{< product-c8y-iot >}} DataHub, can access the data lake and run analytical queries against its contents, using SQL as the query interface.

To learn more about DataHub in general, see [DataHub overview](/datahub/datahub-overview). To an end user, DataHub on Edge appears and behaves much the same as DataHub in a cloud installation, subject to the limitations in the comparison table later in this section.

### Installing and using DataHub

DataHub is an optional component of Edge and can be enabled by updating the `spec.dataHub` field in the Edge custom resource (CR). For more details on the `spec.dataHub` field, refer to [Edge custom resource - DataHub](/edge-kubernetes/edge-custom-resource-definition/#k8-edge-datahub). For general guidance on configuring Edge, see the [Install Edge](/edge-kubernetes/installing-edge-on-k8/) and [Modify Edge](/edge-kubernetes/manage-edge/#modify-edge) sections in the Edge documentation.

The data lake is always written to the host file system under the path `/datahub/datalake`, regardless of what is mounted there. You are expected to have NFS or some other form of NAS file system mounted at that path _on all nodes of the Kubernetes cluster that Edge is running on_. This ensures the resilience of your data lake contents.
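
Because the data lake is written to `/datahub/datalake` whether or not anything is mounted there, it can be useful to check the mount on each node before enabling DataHub. The following is a minimal sketch, not part of the product: the helper name `is_mounted` is hypothetical, and the script is assumed to run directly on a cluster node with Python 3 available.

```python
import os

# Path named in the text above; everything else in this sketch is illustrative.
DATALAKE_PATH = "/datahub/datalake"


def is_mounted(path: str = DATALAKE_PATH) -> bool:
    """Return True if `path` exists and is a mount point on this host.

    os.path.ismount compares the device of `path` with that of its parent,
    so it may not detect a bind mount within the same file system.
    """
    return os.path.ismount(path)


if __name__ == "__main__":
    if is_mounted():
        print(f"{DATALAKE_PATH} is a mount point on this node")
    else:
        print(f"{DATALAKE_PATH} is NOT a mount point on this node")
```

Run the check on every node of the cluster. A node that reports no mount point would write data lake contents to its local disk instead of the shared storage.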

To access Dremio, you must also make the domain `datahub-<domain_name>` resolvable, just as the configured domain name and `management-<domain_name>` were made resolvable in [Accessing Edge](/edge-kubernetes/installing-edge-on-k8/#accessing-edge).
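
As an illustration of the three names involved, the following sketch derives them from a domain name, builds a hosts-file style line, and reports which names do not yet resolve. It assumes that a hosts-file entry is an acceptable way to make the names resolvable in your environment. The function names, the example domain `myown.iot.com`, and the IP address `192.0.2.10` are hypothetical placeholders.

```python
import socket


def edge_hostnames(domain_name: str) -> list[str]:
    """Return the domain, management, and DataHub host names for Edge."""
    return [
        domain_name,
        f"management-{domain_name}",
        f"datahub-{domain_name}",
    ]


def hosts_line(ip_address: str, domain_name: str) -> str:
    """Build a single hosts-file line that maps all three names to one IP."""
    return f"{ip_address} " + " ".join(edge_hostnames(domain_name))


def unresolved(domain_name: str) -> list[str]:
    """Return the host names that cannot currently be resolved."""
    missing = []
    for host in edge_hostnames(domain_name):
        try:
            socket.getaddrinfo(host, None)
        except socket.gaierror:
            missing.append(host)
    return missing


if __name__ == "__main__":
    # Placeholder values; replace with your Edge IP address and domain name.
    print(hosts_line("192.0.2.10", "myown.iot.com"))
    print("Unresolved:", unresolved("myown.iot.com"))
```

If `datahub-<domain_name>` appears in the unresolved list, add a DNS record or hosts entry for it in the same way as for the other two names.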

### Comparison between DataHub Edge and DataHub Cloud

| Area | {{< product-c8y-iot >}} DataHub Edge | {{< product-c8y-iot >}} DataHub Cloud |
| ----- | ----- | ----- |
| High availability | Depends on the underlying virtualization technology | Depends on the cloud deployment setup |
| Vertical scalability | Yes | Yes |
| Horizontal scalability | No | Yes |
| Upgrades with no downtime | No | No |
| Installation | Offline & Online | Online |
| Dremio cluster setup | 1 master, 1 executor | Minimum 1 master, 1 executor |
| Data lakes | NAS or local disk | Azure Storage, S3, (NAS) |