docs: 1701 document single node k8s deployment #1877
Conversation
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
@@ -0,0 +1,515 @@
#!/usr/bin/env bash
Exploring whether to break this script up into multiple scripts, as inspired by Keifer. This would make it easier to follow and would also allow for lift and shift where a section may not be applicable to a node operator. A rough sketch of what that could look like follows.
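A minimal sketch of the split, assuming hypothetical sub-script names under a `provision.d/` directory:

```bash
#!/usr/bin/env bash
# linux-bare-metal-provisioner.sh as a thin orchestrator (hypothetical layout).
# Each concern lives in its own script so a node operator can lift and shift
# only the sections that apply to their environment.
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# Hypothetical step scripts; operators can skip any that do not apply.
for step in 01-install-k8s.sh 02-install-helm.sh 03-install-grpcurl.sh; do
  echo "==> Running ${step}"
  "${SCRIPT_DIR}/provision.d/${step}"
done
```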
@@ -0,0 +1,159 @@
# Single Node Kubernetes Deployment
Add a reference here from the charts README after the current PR is merged.
AlfredoG87 left a comment
Looking good. I did a first pass on this; I might need to do another pass and have not tested it out yet. Let me know if you need me to test the process on the second pass. I might need a new server, but I can arrange that locally on a VM.
- sudo tar -xzf grpcurl.tar.gz -C /usr/local/bin grpcurl
- rm grpcurl.tar.gz

scale-bn:
Why do we call this job `scale-bn`? It's more of a `restart-bn`.
- kubectl -n ${NAMESPACE} exec ${POD} -- sh -c 'rm -rf /opt/hiero/block-node/data/live/* /opt/hiero/block-node/data/historic/*'
- kubectl -n ${NAMESPACE} delete pod $POD

reset-upgrade:
We talk about reset as a storage clean, but for some, reset might mean "restart". Should we be more explicit? I believe we can make comments in this task file; maybe a comment is enough.
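For reference, a minimal sketch of what such a clarification could look like; both YAML comments and the `desc` field are supported by go-task, and the wording here is an assumption:

```yaml
tasks:
  # "Reset" here means wiping the block store, not merely restarting the pod.
  reset-file-store:
    desc: Destructive - deletes all live and historic block data, then restarts the pod.
    cmds:
      - kubectl -n ${NAMESPACE} exec ${POD} -- sh -c 'rm -rf /opt/hiero/block-node/data/live/* /opt/hiero/block-node/data/historic/*'
      - kubectl -n ${NAMESPACE} delete pod $POD
```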
- helm upgrade $RELEASE oci://ghcr.io/hiero-ledger/hiero-block-node/block-node-server --version $VERSION -n $NAMESPACE --install --values values-override/bare-metal-values.yaml
- sleep 90
- kubectl get all -n $NAMESPACE
Should we reuse the helm-upgrade job here instead of re-doing the same commands? (See the sketch after the next comment.)
reset-upgrade:
  cmds:
    - kubectl -n ${NAMESPACE} exec ${POD} -- sh -c 'rm -rf /opt/hiero/block-node/data/live/* /opt/hiero/block-node/data/historic/*'
Should we re-use the reset-file-store job here?
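A minimal sketch of the reuse both of these comments suggest; go-task lets one task invoke others via `task:` entries, so `reset-upgrade` could be composed from the two existing jobs:

```yaml
tasks:
  reset-upgrade:
    desc: Wipe the block store, then upgrade or reinstall the release.
    cmds:
      - task: reset-file-store
      - task: helm-upgrade
```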
The single requirement is a server with a supported operating system and sufficient resources to run Kubernetes and the
Block Node Server.

Suggested minimum specifications:

1. **Local Full History (LFH)**: All block history is stored locally on the server.
   - CPU: 24 cores, 48 threads (2024 or newer CPU) (PCIe 4+)
   - RAM: 256 GB
   - Disk:
     - 8 TB NVMe SSD
     - 500 TB
   - 2 x 10 Gbps Network Interface Cards (NICs)
2. **Remote Full History (RFH)**: Block history is stored remotely.
   - CPU: 24 cores, 48 threads (2024 or newer CPU) (PCIe 4+)
   - RAM: 256 GB
   - Disk: 8 TB NVMe SSD
   - 2 x 10 Gbps Network Interface Cards (NICs)
These specs are for mainnet, I assume; should we explicitly say so?
This is just a suggestion for the future:
Maybe we could have, in another place, a table of small, medium, large, and x-large deployments and an example of the TPS each deployment is meant to support.
- 2 x 10 Gbps Network Interface Cards (NICs)

Recommendations:
- In both configurations a Linux-based operating system is recommended, such as Ubuntu 22.04 LTS or Debian 11 LTS.
I wonder if we should start provisioning on Ubuntu 24.04 LTS.
Once a server has been acquired, it needs to be provisioned with the necessary software components to run Kubernetes
and the Block Node Server.

Assuming a Linux based environment, [./scripts/linux-bare-metal-provisioner.sh](scripts/linux-bare-metal-provisioner.sh) serves as a provisioner script to automate the
Have we validated this script on both Ubuntu 22.04 LTS and Debian 11 LTS?
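If only those two distributions have been validated, the script could say so up front. A minimal sketch, assuming Ubuntu 22.04 and Debian 11 are the validated targets:

```bash
#!/usr/bin/env bash
# Hypothetical guard for the provisioner: warn when running on a
# distribution the script has not been validated against.
set -euo pipefail

# /etc/os-release defines ID, VERSION_ID, and PRETTY_NAME on most distros.
source /etc/os-release

case "${ID:-unknown}-${VERSION_ID:-unknown}" in
  ubuntu-22.04|debian-11) ;;  # validated targets (assumption from this thread)
  *)
    echo "WARNING: ${PRETTY_NAME:-this OS} has not been validated; proceed with care." >&2
    ;;
esac
```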
curl -L https://github.com/fullstorydev/grpcurl/releases/download/v1.8.7/grpcurl_1.8.7_linux_x86_64.tar.gz -o grpcurl.tar.gz
sudo tar -xzf grpcurl.tar.gz -C /usr/local/bin grpcurl
rm grpcurl.tar.gz
Should we use the setup-grpcurl job/task in the Taskfile.yml instead of repeating the same snippet?
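If the docs can assume go-task is installed at this point, the snippet could collapse to a single invocation of the existing job (task name as referenced in this thread):

```bash
# Reuse the Taskfile job instead of repeating the raw curl/tar commands.
task setup-grpcurl
```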
VERSION="latest stable version here"
BASE_URL="https://github.com/hiero-ledger/hiero-block-node/releases/download/v${VERSION}"
ARCHIVE="block-node-protobuf-${VERSION}.tgz"
TGZ="block-node-protobuf-${VERSION}.tgz"
DEST_DIR="${HOME}/block-node-protobuf-${VERSION}"

curl -L "${BASE_URL}/${ARCHIVE}" -o "${ARCHIVE}"

# Ensure unzip is available
command -v unzip >/dev/null 2>&1 || sudo apt-get install -y unzip

mkdir -p "${DEST_DIR}"
tar -xzf "${ARCHIVE}" -C "${DEST_DIR}"
# This is needed as tar doesn't support overwriting an existing archive in place
tar -xzf "${DEST_DIR}"/block-node-protobuf-${VERSION}.tgz -C "${DEST_DIR}"
rm "${ARCHIVE}" "${DEST_DIR}/${TGZ}"
Similar to the above comment, but I see that we might want to improve setup-bn-proto to install unzip if it is missing.
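A minimal sketch of that improvement, assuming the task is named setup-bn-proto and VERSION is supplied by the caller as in the diff above:

```yaml
tasks:
  setup-bn-proto:
    desc: Download and unpack the block-node protobuf release archive.
    vars:
      ARCHIVE: "block-node-protobuf-{{.VERSION}}.tgz"
      DEST_DIR: "${HOME}/block-node-protobuf-{{.VERSION}}"
    cmds:
      # Install unzip only when it is missing, instead of assuming it exists.
      - command -v unzip >/dev/null 2>&1 || sudo apt-get install -y unzip
      - curl -L "https://github.com/hiero-ledger/hiero-block-node/releases/download/v{{.VERSION}}/{{.ARCHIVE}}" -o "{{.ARCHIVE}}"
      - mkdir -p "{{.DEST_DIR}}"
      - tar -xzf "{{.ARCHIVE}}" -C "{{.DEST_DIR}}"
      # The release archive contains a nested tgz; extract it, then clean up.
      - tar -xzf "{{.DEST_DIR}}/{{.ARCHIVE}}" -C "{{.DEST_DIR}}"
      - rm "{{.ARCHIVE}}" "{{.DEST_DIR}}/{{.ARCHIVE}}"
```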
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@             Coverage Diff              @@
##               main    #1877      +/-   ##
============================================
- Coverage     80.64%   80.58%   -0.06%
- Complexity     1174     1179       +5
============================================
  Files           127      127
  Lines          5553     5553
  Branches        591      591
============================================
- Hits           4478     4475       -3
- Misses          802      806       +4
+ Partials        273      272       -1

see 1 file with indirect coverage changes
Reviewer Notes
For production purposes, Block Nodes are recommended to be installed on a single-node Kubernetes cluster.
It's important to make this process clear and easy to reference for node operators of all types.
With this, operators can customize the deployment and layer on their own business needs.
Related Issue(s)
Fixes #1701
Fixes #527