-
Notifications
You must be signed in to change notification settings - Fork 559
Add Alibaba Cloud storage artifact-store #4289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
htahir1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too much vibe!
Documentation Link Check Results❌ Absolute links check failed |
|
@claude /full-review |
|
Claude finished @htahir1's task —— View job Full Review Complete ✅I've completed a comprehensive review of the PR adding Alibaba Cloud OSS and MinIO artifact store documentation. Here are my findings: SummaryThis PR adds documentation for using Alibaba Cloud OSS and MinIO with ZenML's S3 artifact store integration. While the approach is sound and the documentation structure follows ZenML patterns, there are several critical issues that need to be addressed, particularly around incorrect authentication information and content organization. Critical Issues1. Incorrect Authentication Method in Alibaba OSS Documentation
|
| zenml secret create alibaba_secret \ | ||
| --aws_access_key_id='<YOUR_ALIBABA_ACCESS_KEY_ID>' \ | ||
| --aws_secret_access_key='<YOUR_ALIBABA_SECRET_ACCESS_KEY>' | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still dont get why this is called "aws" when its alibaba....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it uses s3 integration and that what the integration looks for as a key and secret for the given authentication_secret
| ```shell | ||
| zenml secret create alibaba_secret \ | ||
| --aws_access_key_id='<YOUR_ALIBABA_ACCESS_KEY_ID>' \ | ||
| --aws_secret_access_key='<YOUR_ALIBABA_SECRET_ACCESS_KEY>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarification needed: While the parameter names contain aws_ (for S3 API compatibility), users should provide their Alibaba Cloud OSS access credentials here, not AWS credentials. Consider adding a note like:
> **Note:** The parameter names use `aws_` prefixes because Alibaba Cloud OSS uses an S3-compatible API. However, you should provide your Alibaba Cloud OSS Access Key ID and Secret Access Key, which you can obtain from the Alibaba Cloud console.|
|
||
| To use Alibaba Cloud OSS with ZenML, you need to configure the S3 Artifact Store with specific settings for OSS compatibility: | ||
|
|
||
| {% tabs %} | ||
| {% tab title="Using a ZenML Secret (recommended)" %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing authentication context: Unlike AWS S3, Alibaba Cloud OSS does not support ZenML Service Connectors. Consider adding a note before the tabs section:
{% hint style="info" %}
Alibaba Cloud OSS does not support ZenML Service Connectors. Use ZenML Secrets to securely store and reference your Alibaba Cloud credentials.
{% endhint %}| {% hint style="warning" %} | ||
| **Important:** When using Alibaba Cloud OSS, you must set the following `config_kwargs`: | ||
|
|
||
| ```json | ||
| {"signature_version": "s3", "s3": {"addressing_style": "virtual"}} | ||
| ``` | ||
|
|
||
| This is required for proper compatibility with Alibaba Cloud OSS's S3 API implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enhance warning: Consider updating this warning to explicitly reference that the configuration example below shows how to set these parameters:
{% hint style="warning" %}
**Important:** When using Alibaba Cloud OSS, you must set the following `config_kwargs`:
\`\`\`json
{"signature_version": "s3", "s3": {"addressing_style": "virtual"}}
\`\`\`
This is required for proper compatibility with Alibaba Cloud OSS's S3 API implementation. You'll see how to include these parameters in the configuration example below.
{% endhint %}| zenml secret create minio_secret \ | ||
| --aws_access_key_id='<YOUR_MINIO_ACCESS_KEY>' \ | ||
| --aws_secret_access_key='<YOUR_MINIO_SECRET_KEY>' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarification needed: Similar to Alibaba OSS, the parameter names contain aws_ (for S3 API compatibility), but users should provide their MinIO access credentials. Add a note:
> **Note:** The parameter names use `aws_` prefixes because MinIO implements the S3-compatible API. Provide your MinIO Access Key and Secret Key here (default is `minioadmin`/`minioadmin` for local development, but should be changed for production).| You'll also need a running MinIO instance. MinIO can be deployed in various ways: | ||
|
|
||
| * **Docker**: `docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"` | ||
| * **Kubernetes**: Using the [MinIO Operator](https://min.io/docs/minio/kubernetes/upstream/) | ||
| * **Binary**: Download from [MinIO's website](https://min.io/download) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enhancement suggestion: Consider adding security considerations for production deployments:
You'll also need a running MinIO instance. MinIO can be deployed in various ways:
* **Docker**: `docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"`
* **Kubernetes**: Using the [MinIO Operator](https://min.io/docs/minio/kubernetes/upstream/)
* **Binary**: Download from [MinIO's website](https://min.io/download)
{% hint style="warning" %}
**Security Note:** The default MinIO credentials are `minioadmin`/`minioadmin`. Always change these credentials in production environments and ensure MinIO is not exposed to the public internet without proper authentication and TLS.
{% endhint %}|
@claude /full-review |
|
@safoinme didnt u want to add backwards comaptibility here on the args as we discussed? |
|
Claude finished @htahir1's task —— View job Full PR Review Complete ✅I've completed a comprehensive review of the PR adding Alibaba Cloud OSS and MinIO artifact store documentation. SummaryThis PR adds documentation for using Alibaba Cloud OSS and MinIO with ZenML's S3 artifact store integration. The documentation follows ZenML's structure and patterns well, but needs refinement in several areas to address @htahir1's "Too much vibe!" feedback and technical accuracy concerns. Overall Assessment: The fundamental approach is sound and the integration into the documentation structure is correct. However, changes are requested to address critical clarity issues around authentication and tone. Critical Issues (Must Fix Before Merge)1. Misleading Authentication Parameter Names
|
| Aspect | S3 Docs | GCP Docs | Alibaba Docs | MinIO Docs | Assessment |
|---|---|---|---|---|---|
| Structure | Complete | Complete | ✅ Good | ✅ Good | Follows pattern |
| Tone | Technical | Technical | Needs adjustment | ||
| Authentication | 3 methods | 3 methods | 1 method | 1 method | Appropriate |
| Security notes | ✅ Extensive | ✅ Good | ✅ Basic | ❌ Missing | MinIO needs more |
| Advanced config | ✅ Detailed | ✅ Referenced | Could enhance |
Files Reviewed:
- ✅
docs/book/component-guide/artifact-stores/alibaba-oss.md(78 lines) - ✅
docs/book/component-guide/artifact-stores/minio.md(76 lines) - ✅
docs/book/component-guide/artifact-stores/README.md(+2 lines) - ✅
docs/book/component-guide/artifact-stores/toc.md(+2 lines)
Action Items for Author
Must fix (blockers):
- ✅ Add clarification notes about
aws_parameter naming in both docs (inline comments posted) - ✅ Make tone more technical in "When to use it?" sections (inline comments posted)
- ✅ Add note about why Service Connectors aren't supported (inline comment posted)
- ✅ Add security warnings to MinIO deployment section (inline comment posted)
⚠️ Fix broken absolute links causing CI failure (check CI logs)
Should fix:
6. Enhance config_kwargs warning to reference configuration example
7. Add "Advanced Configuration" section or pointer
8. Improve MinIO SSL/TLS hint with specific example
Conclusion
This PR adds valuable documentation that will help users integrate Alibaba Cloud OSS and MinIO with ZenML. The structure is solid and the technical approach is correct (especially the decision not to document non-existent Service Connector support).
However, the documentation needs refinement primarily around:
- Authentication clarity - Users must understand the
aws_parameter naming - Technical tone - Needs to be more precise and less conversational
- Security guidance - MinIO needs production deployment warnings
I've posted 6 inline comments with specific, actionable suggestions that include the exact text to add. These address the critical issues identified by @htahir1's review.
Once these changes are made, this will be a solid addition to ZenML's artifact store documentation.
| zenml secret create alibaba_secret \ | ||
| --aws_access_key_id='<YOUR_ALIBABA_ACCESS_KEY_ID>' \ | ||
| --aws_secret_access_key='<YOUR_ALIBABA_SECRET_ACCESS_KEY>' | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Add clarification about parameter naming before the secret creation. Users will be confused why they're using aws_ parameters for Alibaba Cloud credentials.
Suggested addition before line 47:
{% hint style="info" %}
**About the parameter names:** The secret parameters use `aws_` prefixes (`aws_access_key_id` and `aws_secret_access_key`) because Alibaba Cloud OSS implements the S3-compatible API. However, you should provide your **Alibaba Cloud OSS access credentials** here, not AWS credentials. You can obtain these from the Alibaba Cloud console under AccessKey Management.
{% endhint %}| To use Alibaba Cloud OSS with ZenML, you need to configure the S3 Artifact Store with specific settings for OSS compatibility: | ||
|
|
||
| {% tabs %} | ||
| {% tab title="Using a ZenML Secret (recommended)" %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add context: Explain why there's no "AWS Service Connector" tab (unlike the S3 docs). Add this note before the tabs:
{% hint style="warning" %}
**Note:** Alibaba Cloud OSS does not support ZenML Service Connectors. Unlike AWS S3, which can be authenticated via AWS Service Connectors, Alibaba Cloud requires using ZenML Secrets to store and reference your credentials.
{% endhint %}|
|
||
| * You're already using Alibaba Cloud services | ||
| * You need to store artifacts in specific geographic regions served by Alibaba Cloud | ||
| * You want to leverage Alibaba Cloud's pricing and performance characteristics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tone issue ("too much vibe"): This section is too casual. Make it more technical and specific:
You should use the Alibaba Cloud OSS Artifact Store when:
* Your infrastructure is already deployed on Alibaba Cloud and you want to maintain data locality
* You require artifact storage in specific geographic regions served by Alibaba Cloud (China, Asia-Pacific, Europe, Middle East)
* You need S3-compatible object storage with Alibaba Cloud's pricing model and service level agreements
* Compliance requirements mandate data residency in Alibaba Cloud regions| zenml secret create minio_secret \ | ||
| --aws_access_key_id='<YOUR_MINIO_ACCESS_KEY>' \ | ||
| --aws_secret_access_key='<YOUR_MINIO_SECRET_KEY>' | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRITICAL: Add clarification about parameter naming. Same issue as Alibaba OSS - users need to understand these are MinIO credentials, not AWS credentials.
Add before line 42:
{% hint style="info" %}
**About the parameter names:** The secret parameters use `aws_` prefixes because MinIO implements the S3-compatible API. Provide your **MinIO Access Key and Secret Key** here. For local development, the default credentials are `minioadmin`/`minioadmin`, but these must be changed for production deployments.
{% endhint %}|
|
||
| * **Docker**: `docker run -p 9000:9000 -p 9001:9001 minio/minio server /data --console-address ":9001"` | ||
| * **Kubernetes**: Using the [MinIO Operator](https://min.io/docs/minio/kubernetes/upstream/) | ||
| * **Binary**: Download from [MinIO's website](https://min.io/download) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HIGH PRIORITY: Add security warnings for production deployments. Add after this line:
{% hint style="warning" %}
**Security Considerations:**
- The default MinIO credentials are `minioadmin`/`minioadmin` - **always change these in production**
- Use TLS/HTTPS for production deployments to encrypt data in transit
- Configure MinIO access policies to follow the principle of least privilege
- Do not expose MinIO directly to the public internet without proper authentication and network security
- For production use, refer to [MinIO's security best practices](https://min.io/docs/minio/linux/operations/concepts/security-best-practices.html)
{% endhint %}| * You want to run a self-hosted object storage solution | ||
| * You need S3-compatible storage in your own data center or private cloud | ||
| * You're running ZenML in a Kubernetes environment and want co-located storage | ||
| * You want to avoid cloud vendor lock-in while maintaining S3 API compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tone issue ("too much vibe"): Make this more technical and specific:
You should use the MinIO Artifact Store when:
* You require self-hosted object storage for data sovereignty or compliance requirements
* Your MLOps infrastructure runs on-premises or in a private cloud environment
* You need S3-compatible storage co-located with your Kubernetes-based ZenML deployment
* You want to eliminate cloud vendor dependencies while maintaining S3 API compatibility
* You're developing locally and need a lightweight S3-compatible storage backend for testing
Describe changes
I implemented/fixed _ to achieve _.
Pre-requisites
Please ensure you have done the following:
developand the open PR is targetingdevelop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes