Conversation

gustavolira
Member

includes AI-generated content

  • Added new refactored scripts and modules for improved CI/CD processes.
  • Introduced environment configuration files and example scripts for local testing.
  • Created documentation for the refactored architecture and usage of cursor rules.
  • Updated .gitignore to include new environment override files.
  • Added various Kubernetes resource configurations for deployment and service accounts.

This refactor enhances maintainability and simplifies the deployment process.

Which issue(s) does this PR fix?

PR acceptance criteria

Please make sure that the following steps are complete:

  • GitHub Actions are completed and successful
  • Unit Tests are updated and passing
  • E2E Tests are updated and passing
  • Documentation is updated if necessary (requirement for new features)
  • Add a screenshot if the change is UX/UI related

How to test changes / Special notes to the reviewer

openshift-ci bot requested review from psrna and subhashkhileri October 9, 2025 21:59

openshift-ci bot commented Oct 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign pataknight for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

Copilot AI left a comment


Pull Request Overview

This PR introduces a comprehensive refactored architecture for the Red Hat Developer Hub (RHDH) CI/CD system, transitioning from a monolithic script approach to a modular, maintainable structure. The refactor significantly improves code organization, testing capabilities, and deployment flexibility.

  • Modular architecture with separate modules for platform detection, deployment strategies, and testing
  • Enhanced environment configuration system with local testing support and validation
  • Comprehensive Kubernetes resource management for various deployment scenarios

Reviewed Changes

Copilot reviewed 94 out of 96 changed files in this pull request and generated 6 comments.

File: Description

• .ibm/refactored/openshift-ci-tests.sh: Main entry point with job routing and execution framework
• .ibm/refactored/modules/: Core modules for logging, k8s operations, deployment strategies, and platform detection
• .ibm/refactored/value_files/: Helm values configurations for different deployment scenarios (showcase, RBAC, cloud providers)
• .ibm/refactored/resources/: Kubernetes resource definitions for services, operators, and testing infrastructure

Comments suppressed due to low confidence (1)

.ibm/refactored/modules/config-validation.sh:1

  • The base64 regex pattern is too permissive and could match regular strings. Consider adding a minimum length check or more specific validation to avoid false positives with short strings that happen to match the base64 character set.
#!/usr/bin/env bash
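A stricter check along the lines the comment suggests might look like the following. This is an illustrative sketch, not the PR's code; the function name, minimum length, and thresholds are assumptions.

```shell
# Hypothetical stricter base64 detection: require a minimum length,
# a padding-consistent length, the base64 charset, and a successful
# decode, so short plain strings like "test" no longer match.
is_likely_base64() {
    local value="$1"
    local min_length="${2:-16}"

    # Short strings are more likely plain text than encoded secrets
    (( ${#value} >= min_length )) || return 1

    # Padded base64 is always a multiple of 4 characters long
    (( ${#value} % 4 == 0 )) || return 1

    # Charset check: base64 alphabet with up to two trailing '=' pads
    [[ "$value" =~ ^[A-Za-z0-9+/]+={0,2}$ ]] || return 1

    # Final confirmation: the value must actually decode
    printf '%s' "$value" | base64 -d >/dev/null 2>&1
}
```

This remains a heuristic (a long all-lowercase word still matches), but the length and decode checks remove the short-string false positives the review flagged.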

Comment on lines +63 to +69
- apiVersion: "v1beta1"
group: "tekton.dev"
plural: "pipelines"
- apiVersion: v1beta1
group: tekton.dev
plural: pipelineruns
- apiVersion: v1beta1

Copilot AI Oct 9, 2025


Using deprecated v1beta1 API version for Tekton resources. Consider upgrading to v1 API version for better long-term compatibility, as v1beta1 may be removed in future Tekton versions.

Suggested change
- apiVersion: "v1beta1"
group: "tekton.dev"
plural: "pipelines"
- apiVersion: v1beta1
group: tekton.dev
plural: pipelineruns
- apiVersion: v1beta1
- apiVersion: "v1"
group: "tekton.dev"
plural: "pipelines"
- apiVersion: v1
group: tekton.dev
plural: pipelineruns
- apiVersion: v1


Comment on lines +201 to +220
local namespace="$1"
local junit_file="$2"

[[ "${OPENSHIFT_CI}" != "true" ]] && return 0

local artifacts_url=$(get_artifacts_url "${namespace}")

# Replace attachments with links to OpenShift CI storage
sed -i.bak "s#\[\[ATTACHMENT|\(.*\)\]\]#${artifacts_url}/\1#g" "${junit_file}"

# Fix XML property tags format for Data Router compatibility
# Convert to self-closing format
sed -i.bak 's#</property>##g' "${junit_file}"
sed -i.bak 's#<property name="\([^"]*\)" value="\([^"]*\)">#<property name="\1" value="\2"/>#g' "${junit_file}"

# Copy to shared directory for CI
cp "${junit_file}" "${SHARED_DIR}/junit-results-${namespace}.xml" 2>/dev/null || true

log_info "JUnit results adapted for Data Router and saved"
}

Copilot AI Oct 9, 2025


Multiple sed operations modify the same file sequentially, creating .bak files that aren't cleaned up. Consider combining operations or explicitly cleaning up backup files to avoid accumulation of temporary files.
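One way to address this, sketched under the assumption that the surrounding function resembles the excerpt above, is to run all three transformations in a single sed invocation and then remove the one backup file it leaves behind:

```shell
# Illustrative rework (not the PR's code): one sed pass applies all three
# edits in order, then the single .bak backup is deleted explicitly.
adapt_junit_results() {
    local junit_file="$1"
    local artifacts_url="$2"

    sed -i.bak \
        -e "s#\[\[ATTACHMENT|\(.*\)\]\]#${artifacts_url}/\1#g" \
        -e 's#</property>##g' \
        -e 's#<property name="\([^"]*\)" value="\([^"]*\)">#<property name="\1" value="\2"/>#g' \
        "${junit_file}"

    rm -f "${junit_file}.bak"
}
```

The `-e` scripts run in sequence on each line, so the `</property>` removal still happens before the self-closing rewrite, matching the original three-pass behavior.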


Contributor

github-actions bot commented Oct 9, 2025

@zdrapela
Member

/review

@zdrapela
Member

/describe


PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis 🔶

RHIDP-9038 - Partially compliant

Compliant requirements:

  • Deduplicate CI bash scripts and modularize utilities (break utils.sh into smaller files)
  • Use a unified method for creating Secrets/ConfigMaps (e.g., envsubst) across platforms
  • Extend logging for easier debugging while masking sensitive info
  • Unify configuration across OCP/K8s and Helm/Operator (centralized values/env)
  • Refactor CI jobs by platform/install method and enable easy extensibility
  • Enable test reporting with JUnit/ReportPortal (e.g., bats) and generate artifacts

Non-compliant requirements:

  • Prefer local variables in bash functions and validate required/global vars
  • Add checks for population of global variables; unify namespace/project usage

Requires further human verification:

  • Verify end-to-end CI jobs run successfully across OCP/AKS/EKS/GKE with new structure
  • Confirm sensitive data is consistently masked in all logs across all modules
  • Validate JUnit integration with ReportPortal/Data Router in CI environments

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪

🔒 Security concerns

Sensitive information exposure:
Some logs echo environment-derived values and endpoints. While masking helpers exist (e.g., mask_value in EKS module), ensure tokens like K8S_CLUSTER_TOKEN, OCM tokens, AWS keys, and secrets substituted via envsubst are never logged. Verify that envsubst of value files does not inadvertently print sensitive values in debug logs (e.g., helm.sh logs OCM vars presence).

⚡ Recommended focus areas for review

Possible Issue

In attempt_deployment_recovery, the variable DIR is used when reapplying configs but not defined in this scope or module, which may cause reapply to fail. Consider passing the directory as a parameter or using a known path.

# Check if it's a config issue
if kubectl logs "${pod_name}" -n "${namespace}" 2>/dev/null | grep -q "config.*not found\|missing.*config"; then
    log_info "Detected configuration issue, reapplying configs..."
    apply_yaml_files "${DIR}" "${namespace}" ""
    kubectl rollout restart deployment "${deployment}" -n "${namespace}" 2>/dev/null || true
    return 0
fi
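One possible fix is to take the directory as an explicit parameter instead of relying on the global. The sketch below reuses the helper names from the excerpt (apply_yaml_files, log_info, kubectl) and assumes the pod name is also passed in; it is not the PR's actual code.

```shell
# Sketch: pass the config directory explicitly instead of relying on a
# global DIR defined in another module. Helper names from the excerpt.
attempt_deployment_recovery() {
    local deployment="$1"
    local namespace="$2"
    local config_dir="$3"   # replaces the implicit global ${DIR}
    local pod_name="$4"

    # Check if it's a config issue
    if kubectl logs "${pod_name}" -n "${namespace}" 2>/dev/null \
            | grep -q "config.*not found\|missing.*config"; then
        log_info "Detected configuration issue, reapplying configs..."
        apply_yaml_files "${config_dir}" "${namespace}" ""
        kubectl rollout restart deployment "${deployment}" -n "${namespace}" 2>/dev/null || true
        return 0
    fi
    return 1
}
```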
Robustness

get_chart_version relies on jq and curl without explicit checks; failures default silently to a hardcoded version. Add command existence checks and clearer error handling to avoid deploying unintended chart versions.

    # Get latest chart version using Quay.io API
    local version
    version=$(curl -sSX GET "https://quay.io/api/v1/repository/rhdh/chart/tag/?onlyActiveTags=true&filter_tag_name=like:${major_version}-" \
        -H "Content-Type: application/json" 2>/dev/null | \
        jq -r '.tags[0].name' 2>/dev/null | \
        grep -oE '[0-9]+\.[0-9]+-[0-9]+-CI' || echo "")

    # Fallback if API fails
    if [[ -z "${version}" ]]; then
        log_warning "Could not fetch chart version from API, using default" >&2
        version="1.7-156-CI"
    fi

    echo "${version}"
}
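A more defensive variant could verify the tooling up front and fail loudly rather than silently substituting a pinned version. This is a sketch; the URL and grep pattern are copied from the excerpt, and the helper name is an assumption.

```shell
# Sketch: check required commands exist before use, and return an error
# instead of silently falling back to a hardcoded chart version.
require_commands() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "${cmd}" >/dev/null 2>&1; then
            echo "ERROR: required command '${cmd}' not found" >&2
            return 1
        fi
    done
}

get_chart_version() {
    local major_version="$1"
    require_commands curl jq grep || return 1

    local version
    version=$(curl -sSf "https://quay.io/api/v1/repository/rhdh/chart/tag/?onlyActiveTags=true&filter_tag_name=like:${major_version}-" \
        -H "Content-Type: application/json" 2>/dev/null \
        | jq -r '.tags[0].name' 2>/dev/null \
        | grep -oE '[0-9]+\.[0-9]+-[0-9]+-CI')

    if [[ -z "${version}" ]]; then
        echo "ERROR: could not determine chart version for ${major_version}" >&2
        return 1
    fi
    echo "${version}"
}
```

With `curl -f` the pipeline yields no match on HTTP errors, so the function returns non-zero instead of quietly deploying a default chart.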
Portability Concern

sed -i usage for cluster role bindings mixes GNU/BSD flags; current logic may still fail on some environments. Consider a portable approach (tmp file) to ensure namespace substitution works reliably.

    # Update namespace in the file
    sed -i.bak "s/namespace:.*/namespace: ${namespace}/g" "$file" 2>/dev/null || \
        sed -i '' "s/namespace:.*/namespace: ${namespace}/g" "$file" 2>/dev/null || true
    log_debug "Applying cluster role binding: $(basename "$file")"
    kubectl apply -f "$file"
done
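A portable alternative, as the review suggests, writes to a temporary file and moves it back, which behaves identically with GNU and BSD sed. An illustrative sketch with an assumed function name:

```shell
# Sketch: sidestep sed -i flag differences between GNU and BSD entirely
# by writing to a temp file and replacing the original on success.
replace_namespace_in_file() {
    local file="$1"
    local namespace="$2"
    local tmp

    tmp=$(mktemp) || return 1
    if sed "s/namespace:.*/namespace: ${namespace}/g" "${file}" > "${tmp}"; then
        mv "${tmp}" "${file}"
    else
        rm -f "${tmp}"
        return 1
    fi
}
```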


PR Type

Enhancement


Description

  • Major architectural refactor: Complete restructuring of CI/CD scripts into modular, reusable components with improved maintainability
  • Multi-cloud support: Added comprehensive cloud provider modules for AWS EKS, Azure AKS, and Google GKE with authentication, cluster management, and ingress configuration
  • Kubernetes operations: Implemented robust K8s/OpenShift operations with deployment recovery, resource management, and retry mechanisms
  • Enhanced deployment workflows: Created specialized job handlers for operator deployments, Helm deployments, upgrade testing, and authentication provider testing across all cloud platforms
  • Database integration: Added PostgreSQL module with Crunchy operator support and external database configuration with TLS
  • Testing framework: Comprehensive Backstage testing module with health checks, E2E test support, and JUnit results processing
  • Reporting and monitoring: Integrated Slack notifications, test result tracking, and artifact management with Data Router integration
  • Infrastructure modules: Added Tekton pipelines, orchestrator workflows, RBAC deployment, and platform detection capabilities
  • Configuration management: Centralized environment variable handling, configuration validation, and Helm chart operations with value file merging
  • Documentation: Added architecture documentation, development guides, and cursor rules setup for the refactored system


File Walkthrough

Relevant files
Enhancement
29 files
k8s-operations.sh
Kubernetes Operations Module Implementation                           

.ibm/refactored/modules/k8s-operations.sh

• Added comprehensive Kubernetes/OpenShift operations module with functions for login, namespace management, and deployment operations
• Implemented deployment recovery mechanisms and resource management utilities
• Integrated Sealight and Tekton/Topology modules for enhanced deployment capabilities
• Added retry logic and error handling for robust cluster operations

+606/-0 
eks.sh
AWS EKS Cloud Provider Module                                                       

.ibm/refactored/modules/cloud/eks.sh

• Created AWS EKS cloud helper module with authentication and cluster
operations
• Implemented Route53 DNS management and certificate
handling for EKS deployments
• Added ingress configuration and load
balancer hostname retrieval functions
• Included cleanup utilities for
DNS records and cloud resources

+561/-0 
reporting.sh
Test Reporting and Status Tracking Module                               

.ibm/refactored/modules/reporting.sh

• Developed comprehensive reporting module for test results and status
tracking
• Added JUnit results processing and Data Router integration
for OpenShift CI
• Implemented Slack notifications and summary report
generation
• Created log collection utilities and artifact management
functions

+465/-0 
helm.sh
Helm Chart Operations and Management Module                           

.ibm/refactored/modules/helm.sh

• Created Helm operations module with chart version management and
validation
• Implemented value file merging using yq with plugin
deduplication
• Added preflight validation and installation functions
for RHDH deployments
• Integrated Sealight parameters and environment
variable substitution

+413/-0 
common.sh
Common Utilities and Helper Functions Module                         

.ibm/refactored/modules/common.sh

• Developed common utilities module with preflight checks and cleanup
operations
• Added cluster resource verification and namespace
management functions
• Implemented image tagging utilities and
configuration map creation
• Created platform-agnostic helper
functions for deployment operations

+406/-0 
bootstrap.sh
Cloud Provider Bootstrap and Management Module                     

.ibm/refactored/modules/cloud/bootstrap.sh

• Created cloud bootstrap module for unified cloud provider detection
and loading
• Implemented authentication wrappers for AWS, Azure, and
GCP
• Added cluster credentials management and ingress configuration
functions
• Provided cloud-specific cleanup utilities and resource
management

+257/-0 
base.sh
Base deployment module with monitoring and Redis support 

.ibm/refactored/modules/deployment/base.sh

• Added comprehensive base deployment module with functions for
monitoring deployment status, deploying RHDH, Redis cache, and test
applications
• Implemented deployment monitoring with health checks
for Helm releases, deployments, services, and routes/ingress
• Added
Redis deployment with retry logic, health checks, and proper resource
management
• Included OpenShift-specific test application deployment
for test-backstage-customization-provider

+351/-0 
gke.sh
GKE cloud helper module with authentication and ingress   

.ibm/refactored/modules/cloud/gke.sh

• Added comprehensive GKE cloud helper module with authentication, cluster operations, and SSL certificate management
• Implemented GCP service account authentication and GKE cluster credential management
• Added SSL certificate creation, ingress configuration, and Cloud DNS record management
• Included Workload Identity setup and cleanup functions for GKE-specific operations

+365/-0 
k8s-utils.sh
Generic Kubernetes utilities for service accounts and operations

.ibm/refactored/modules/cloud/k8s-utils.sh

• Added generic Kubernetes utilities module with service account
operations, resource patching, and wait functions
• Implemented
service account token creation and management with proper RBAC
bindings
• Added patch and restart functionality for deployments with
graceful pod termination
• Included ingress operations, namespace
management, and cluster platform detection utilities

+356/-0 
tekton-topology.sh
Tekton and Topology plugin support with cloud patches       

.ibm/refactored/modules/tekton-topology.sh

• Added Tekton and Topology plugin support module with installation
and configuration functions
• Implemented Tekton Pipelines
installation, test resource deployment, and topology application setup

• Added cloud provider-specific patches for AKS, EKS, and GKE
platforms
• Included verification and cleanup functions for Tekton and
Topology integration testing

+385/-0 
postgres.sh
PostgreSQL database module with operator and TLS support 

.ibm/refactored/modules/database/postgres.sh

• Added PostgreSQL database configuration module with Crunchy operator
support
• Implemented external PostgreSQL database setup with TLS
certificate management
• Added credential configuration and database
readiness checks with timeout handling
• Included cleanup functions
and proper error handling for PostgreSQL operations

+301/-0 
gke-operator.sh
GKE operator deployment job with RBAC and ingress support

.ibm/refactored/jobs/gke-operator.sh

• Added GKE operator deployment job with cluster setup and Workload
Identity support
• Implemented standard and RBAC-enabled RHDH
deployments using operator pattern
• Added GKE-specific ingress
configuration and preemptible node patching
• Included comprehensive
cleanup and error handling for GKE operator deployments

+306/-0 
detection.sh
Platform detection module with router base and hostname calculation

.ibm/refactored/modules/platform/detection.sh

• Added platform detection module for OpenShift, Kubernetes distributions, and container platforms
• Implemented cluster router base detection for different cloud providers (AKS, EKS, GKE)
• Added hostname calculation functions for Route/Ingress configuration
• Included base URL calculation and export functions for CORS and secret configuration

+246/-0 
eks-operator.sh
EKS operator deployment job with ALB and RBAC support       

.ibm/refactored/jobs/eks-operator.sh

• Added EKS operator deployment job with AWS Load Balancer Controller
integration
• Implemented standard and RBAC-enabled deployments with
EKS-specific configurations
• Added EKS ingress setup with ALB
annotations and resource patching
• Included cleanup functionality and
proper error handling for EKS operator deployments

+299/-0 
aks-operator.sh
AKS Operator deployment job implementation                             

.ibm/refactored/jobs/aks-operator.sh

• Added new AKS Operator deployment job script with comprehensive
deployment functions
• Implemented setup, deployment, patching, and
cleanup functions for AKS operator workflows
• Includes support for
both standard and RBAC-enabled deployments with spot instance patching

• Added ingress configuration and main execution flow for complete AKS
operator deployment

+291/-0 
aks.sh
AKS cloud helper module implementation                                     

.ibm/refactored/modules/cloud/aks.sh

• Added comprehensive AKS cloud helper module with Azure
authentication functions
• Implemented cluster management operations
(start, stop, get credentials)
• Added app routing enablement and
cluster information retrieval functions
• Included ingress
configuration and cleanup utilities for AKS deployments

+280/-0 
openshift-ci-tests.sh
Modular OpenShift CI tests main entry point                           

.ibm/refactored/openshift-ci-tests.sh

• Created main entry point script for modular OpenShift CI tests
architecture
• Implemented job routing system to execute different job
types (pull, operator, nightly, cloud deployments)
• Added built-in
job handlers and comprehensive usage documentation
• Includes error
handling, reporting, and Slack notification integration

+288/-0 
gke-helm.sh
GKE Helm deployment job implementation                                     

.ibm/refactored/jobs/gke-helm.sh

• Added GKE Helm deployment job for deploying RHDH to Google
Kubernetes Engine
• Implemented cluster setup, deployment, and cleanup
functions for GKE
• Includes support for Workload Identity, SSL
certificates, and ingress configuration
• Added both standard and RBAC
deployment workflows with proper error handling

+240/-0 
orchestrator.sh
Orchestrator module for workflow management                           

.ibm/refactored/modules/orchestrator.sh

• Added orchestrator module for SonataFlow workflow management
• Implemented infrastructure installation and workflow deployment functions
• Added verification and status checking for orchestrator components
• Includes database configuration for SonataFlow persistence

+217/-0 
tekton.sh
Tekton module for pipeline management                                       

.ibm/refactored/modules/tekton.sh

• Added Tekton module for OpenShift Pipelines integration
• Implemented operator installation and pipeline deployment functions
• Added verification, pipeline operations, and cleanup utilities
• Includes trigger setup and topology test deployment functions

+267/-0 
eks-helm.sh
EKS Helm deployment job implementation                                     

.ibm/refactored/jobs/eks-helm.sh

• Added EKS Helm deployment job for Amazon Elastic Kubernetes Service

• Implemented cluster setup, deployment, and DNS configuration
functions
• Includes support for spot instances, load balancer
hostname resolution
• Added certificate management and comprehensive
cleanup procedures

+226/-0 
sealight.sh
Sealight integration module for code coverage                       

.ibm/refactored/modules/sealight.sh

• Added Sealight integration module for code coverage and quality
analysis
• Implemented configuration functions for Playwright tests
and environment setup
• Added Helm parameter generation and test
reporting functions
• Includes coverage report generation and test
session management

+229/-0 
retry.sh
Retry library with exponential backoff mechanisms               

.ibm/refactored/modules/retry.sh

• Added comprehensive retry library with exponential backoff
mechanisms
• Implemented generic retry function and Kubernetes
resource retry utilities
• Added health check retry functions with
proper error handling
• Includes timeout management and diagnostic
information collection

+202/-0 
cluster-setup.sh
Cluster setup module for operator installations                   

.ibm/refactored/modules/operators/cluster-setup.sh

• Added cluster setup module for installing required operators and
infrastructure
• Implemented setup functions for OpenShift and
Kubernetes deployments
• Added operator installation functions (ACM,
RHDH, Pipelines, NGINX ingress)
• Includes MultiClusterHub
configuration and readiness checks

+211/-0 
aks-helm.sh
AKS Helm deployment job implementation                                     

.ibm/refactored/jobs/aks-helm.sh

• Added AKS Helm deployment job for Azure Kubernetes Service
• Implemented cluster setup, deployment, and cleanup functions
• Includes app routing configuration and spot instance patching
• Added support for both standard and RBAC deployments with proper error handling

+193/-0 
config-validation.sh
Configuration validation and normalization module               

.ibm/refactored/modules/config-validation.sh

• Added configuration validation module for base64 decoding and config
normalization
• Implemented functions to detect and decode base64
encoded values
• Added GitLab integration and tech-radar configuration
functions
• Includes comprehensive configuration fixes and validation
utilities

+166/-0 
ocp-nightly.sh
OpenShift nightly job handler implementation                         

.ibm/refactored/jobs/ocp-nightly.sh

• Added OpenShift nightly job handler with comprehensive testing setup
• Implemented orchestrator and ACM enablement for nightly tests
• Added E2E test execution and proper namespace configuration
• Includes chart version validation and cluster setup procedures

+103/-0 
rbac.sh
RBAC deployment module with external PostgreSQL                   

.ibm/refactored/modules/deployment/rbac.sh

• Added RBAC deployment module for RHDH with external PostgreSQL
• Implemented namespace configuration and external database setup
• Added orchestrator workflow deployment and SonataFlow configuration
• Includes preflight validation and comprehensive error handling

+77/-0   
deploy-rbac.sh
Deploy RBAC job script implementation                                       

.ibm/refactored/jobs/deploy-rbac.sh

• Added deploy RBAC job script for RHDH with RBAC and external
PostgreSQL
• Implemented platform detection, cluster setup, and
deployment workflow
• Added test execution and reporting functionality

• Includes comprehensive error handling and result tracking

+84/-0   
Configuration changes
4 files
env_variables.sh
Environment Variables Configuration Setup                               

.ibm/refactored/env_variables.sh

• Established centralized environment variable configuration with
fallback system
• Added OpenShift credentials detection from active
sessions
• Configured cloud provider variables for AWS, GCP, and Azure

• Set up authentication provider variables and secret management

+260/-0 
cluster-role-binding-ocm.yaml
ClusterRoleBinding for RHDH Kubernetes plugin OCM support

.ibm/refactored/resources/cluster_role_binding/cluster-role-binding-ocm.yaml

• Added ClusterRoleBinding configuration for RHDH Kubernetes plugin
with OCM support
• Configured RBAC binding for rhdh-k8s-plugin-ocm
ClusterRole to service accounts
• Set up multiple service account
bindings in showcase namespace for plugin access

+18/-0   
exporters.sh
Environment exporters for provider configurations               

.ibm/refactored/modules/env/exporters.sh

• Added environment exporters module for centralizing provider
environment exports
• Implemented OCM, Keycloak, and GitHub variable
export functions
• Added plain and encoded variable handling for
ConfigMaps and Secrets
• Includes comprehensive provider environment
variable management

+72/-0   
diff-values_showcase_upgrade.yaml
Orchestrator disable configuration for upgrades                   

.ibm/refactored/value_files/diff-values_showcase_upgrade.yaml

• Added value file configuration to disable orchestrator for upgrade
scenarios
• Simple YAML configuration setting orchestrator to null

+1/-0     
Tests
3 files
auth-providers.sh
Authentication providers test job with multiple provider support

.ibm/refactored/jobs/auth-providers.sh

• Added authentication providers test job supporting Azure, GitHub,
and Keycloak/RHSSO
• Implemented secrets setup for multiple auth
providers and RBAC policy deployment
• Added E2E test execution
framework with basic health checks as fallback
• Included cleanup
functionality and proper error handling for auth provider testing

+313/-0 
upgrade.sh
RHDH upgrade test job with version management and verification

.ibm/refactored/jobs/upgrade.sh

• Added upgrade test job for testing RHDH upgrades from previous to
current releases
• Implemented base version deployment, upgrade
execution, and verification processes
• Added platform-specific setup
for OpenShift and cloud providers with PostgreSQL support
• Included
data persistence verification and comprehensive cleanup functionality

+309/-0 
backstage.sh
Backstage testing module with comprehensive test functions

.ibm/refactored/modules/testing/backstage.sh

• Added comprehensive Backstage testing module with health checks and
test execution
• Implemented API and UI testing functions with retry
mechanisms
• Added E2E test support and diagnostic information
collection
• Includes test result tracking and JUnit results
processing

+238/-0 
Additional files
57 files
CURSOR_RULES_SETUP.md +456/-0 
Makefile +356/-0 
README.md +396/-0 
secrets-rhdh-secrets.yaml +47/-0   
service-account-rhdh-secret.yaml +7/-0     
README.md +155/-0 
architecture.md +564/-0 
development-guide.md +1608/-0
env_override.local.sh.example +28/-0   
deploy-base.sh +80/-0   
ocp-operator.sh +70/-0   
ocp-pull.sh +75/-0   
bootstrap.sh +32/-0   
constants.sh +95/-0   
logging.sh +42/-0   
operator.sh +53/-0   
validation.sh +37/-0   
cluster-role-k8s.yaml +86/-0   
cluster-role-ocm.yaml +22/-0   
cluster-role-binding-k8s.yaml +12/-0   
app-config-rhdh-rbac.yaml +143/-0 
app-config-rhdh.yaml +238/-0 
dynamic-global-floating-action-button-config.yaml +44/-0   
dynamic-global-header-config.yaml +84/-0   
dynamic-plugins-config.yaml +261/-0 
hello-world-pipeline-run.yaml +10/-0   
hello-world-pipeline.yaml +25/-0   
pipelines-operator.yaml +10/-0   
dynamic-plugins-root-PVC.yaml +10/-0   
postgres-cred.yaml +12/-0   
postgres-crt-rds.yaml +2535/-0
postgres.yaml +74/-0   
rds-app-config.yaml +24/-0   
values-showcase-postgres.yaml +110/-0 
redis-deployment.yaml +64/-0   
redis-secret.yaml +8/-0     
rhdh-start-rbac.yaml +26/-0   
rhdh-start-rbac_K8s.yaml +30/-0   
rhdh-start-runtime.yaml +23/-0   
rhdh-start.yaml +26/-0   
rhdh-start_K8s.yaml +24/-0   
service-account-rhdh.yaml +5/-0     
topology-test-ingress.yaml +19/-0   
topology-test-route.yaml +14/-0   
topology-test.yaml +72/-0   
diff-values_showcase-rbac_AKS.yaml +147/-0 
diff-values_showcase-rbac_EKS.yaml +139/-0 
diff-values_showcase-rbac_GKE.yaml +138/-0 
diff-values_showcase-sanity-plugins.yaml +213/-0 
diff-values_showcase_AKS.yaml +47/-0   
diff-values_showcase_EKS.yaml +41/-0   
diff-values_showcase_GKE.yaml +37/-0   
values_showcase-auth-providers.yaml +257/-0 
values_showcase-rbac.yaml +370/-0 
values_showcase-rbac_nightly.yaml +379/-0 
values_showcase.yaml +355/-0 
values_showcase_nightly.yaml +347/-0 

Member

zdrapela left a comment


Hi @gustavolira

I like many ideas in this PR overall. But making so many changes in one PR is making it impossible to review properly and discuss all the changes.

  1. I suggest we start by setting up an OpenShift CI nightly job(s) and optional PR check(s), that would use the refactoring folder as an entry point. This will give everyone access to logs and allow us to verify how this script behaves in CI.
  2. Then you can work on the base refactoring. Start with OCP jobs, maybe just Helm chart first, then add the operator. Review this PR and merge it.
  3. When this is working, we can add reporting, artifact gathering, etc.
  4. Only after that, I'd add other cloud providers like AKS, GKE, EKS, and special tests, like auth-providers and upgrade tests. And again, I'd first verify that the scripts work well with all of them before we merge it. It can also be one PR for each provider, if we need it.

On the architectural level:

  1. Do we need to keep openshift-ci-tests.sh and all the decision making based on JOB_NAME? We could instead configure each job to use a different starting point. This is the entry point for OCP Helm PR checks, e.g.: https://github.com/openshift/release/blob/6a5999d35c9bedca66a608cf5a9a2ad6bff49712/ci-operator/step-registry/redhat-developer/rhdh/ocp/helm/redhat-developer-rhdh-ocp-helm-commands.sh#L147
  2. Even the Makefile relies on openshift-ci-tests.sh and JOB_NAME; I don't think that's needed, and I see it as a weak point.
  3. There are many ShellCheck errors and warnings in the code, and the sourcing is not working with ShellCheck and BashIDE. I think ShellCheck is a great tool, and using it can help you and Cursor pick up errors and warnings earlier and fix them during code generation.
  4. I'd keep the linting and formatting on yarn with the configuration in package.json, and avoid duplicating it in the Makefile.
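Point 1 above — replacing JOB_NAME-based decision making with dedicated per-job starting points — can be sketched roughly as follows. This is a minimal illustration only; the function name, handler labels, and job-name patterns are hypothetical and not taken from this PR:

```shell
#!/usr/bin/env bash
# Sketch of the dispatcher pattern the review questions: a single entry
# point inspects the job name and branches to a handler.
set -euo pipefail

job_dispatch() {
  case "$1" in
    *ocp-helm*)     echo "helm-handler"     ;;  # would run a Helm deploy + tests
    *ocp-operator*) echo "operator-handler" ;;  # would run an operator deploy + tests
    *)              echo "unknown"          ;;
  esac
}

handler=$(job_dispatch "pull-ci-redhat-developer-rhdh-ocp-helm")
echo "dispatched to: ${handler}"

# The suggested alternative needs no dispatch at all: each OpenShift CI job
# is configured to invoke its own script directly (e.g. handlers/ocp-helm.sh),
# so JOB_NAME never has to be parsed inside the repository.
```

The alternative keeps each job's behavior visible in the job definition itself, which is what the linked openshift/release step-registry command script does.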

gustavolira (Member Author) commented Oct 10, 2025

Hi @gustavolira

I like many ideas in this PR overall. But making so many changes in one PR is making it impossible to review properly and discuss all the changes.

  1. I suggest we start by setting up an OpenShift CI nightly job(s) and optional PR check(s), that would use the refactoring folder as an entry point. This will give everyone access to logs and allow us to verify how this script behaves in CI.
  2. Then you can work on the base refactoring. Start with OCP jobs, maybe just Helm chart first, then add the operator. Review this PR and merge it.
  3. When this is working, we can add reporting, artifact gathering, etc.
  4. Only after that, I'd add other cloud providers like AKS, GKE, EKS, and special tests, like auth-providers and upgrade tests. And again, I'd first verify that the scripts work well with all of them before we merge it. It can also be one PR for each provider, if we need it.

On the architectural level:

  1. Do we need to keep openshift-ci-tests.sh and all the decision making based on JOB_NAME? We could instead configure each job to use a different starting point. This is the entry point for OCP Helm PR checks, e.g.: https://github.com/openshift/release/blob/6a5999d35c9bedca66a608cf5a9a2ad6bff49712/ci-operator/step-registry/redhat-developer/rhdh/ocp/helm/redhat-developer-rhdh-ocp-helm-commands.sh#L147
  2. Even the Makefile relies on openshift-ci-tests.sh and JOB_NAME; I don't think that's needed, and I see it as a weak point.
  3. There are many ShellCheck errors and warnings in the code, and the sourcing is not working with ShellCheck and BashIDE. I think ShellCheck is a great tool, and using it can help you and Cursor pick up errors and warnings earlier and fix them during code generation.
  4. I'd keep the linting and formatting on yarn with the configuration in package.json, and avoid duplicating it in the Makefile.

Thanks a lot for the detailed feedback!
I understand your point about splitting the refactor into smaller PRs, but unfortunately that would be quite tricky in this case: many of these scripts depend on each other, and separating them now would likely cause additional rework and inconsistencies.

Another reason is that most of this refactor was generated and structured with AI assistance, which means the code was produced as a complete, consistent set.
Breaking it apart at this stage would require manually undoing and re-organizing pieces that were already designed to work together, effectively creating more work than benefit.

I agree that the auth-providers part can be postponed; that one is easier to migrate later.
However, the rest of the refactor is already complete and consistent, and partitioning it now would probably lead to redundant effort.

My proposal would be to keep the refactored/ folder temporarily alongside the current code, and create new OpenShift CI jobs pointing to the refactored scripts.
This would let us test the new flow without removing the existing, working one.

Once we’re confident that everything runs smoothly in CI, we can then fully switch over, deleting the old scripts and keeping only the refactored version.

This approach should give us a safer and more gradual migration while keeping the current pipeline stable.

@gustavolira
Member Author

About the other points, like openshift-ci-tests.sh and ShellCheck, I also agree with you.

… setup

- Added new refactored scripts and modules for improved CI/CD processes.
- Introduced environment configuration files and example scripts for local testing.
- Created documentation for the refactored architecture and usage of cursor rules.
- Updated .gitignore to include new environment override files.
- Added various Kubernetes resource configurations for deployment and service accounts.

This refactor enhances maintainability and simplifies the deployment process.
…provider support

- Introduced a new ShellCheck configuration for improved script linting.
- Updated the major chart version from 1.7 to 1.8 across various scripts and configurations.
- Added upgrade testing functionality to validate upgrades from previous releases.
- Implemented new entry points for auth providers, cleanup, and deployment jobs.
- Enhanced documentation to include upgrade flow and cloud provider deployment details.
- Refactored Makefile to streamline CI/CD targets and improve usability.

This update significantly improves the deployment process and testing capabilities for RHDH.
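The ShellCheck configuration mentioned above typically needs two pieces for sourcing to resolve: repository-level options and per-script directives. A minimal sketch, with hypothetical file paths (the actual layout in this PR may differ):

```shell
# .shellcheckrc at the repository root -- hypothetical minimal configuration.
#
#   external-sources=true    # allow ShellCheck to follow `source`d files
#   source-path=SCRIPTDIR    # resolve relative source targets against the script's dir
#
# Inside a script, a directive can also pin the sourced file explicitly so
# ShellCheck (and Bash IDE) can follow it; the module path is illustrative.

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=modules/logging.sh
source "${SCRIPT_DIR}/modules/logging.sh"
```

With this in place, the sourcing warnings (SC1091 and friends) raised in the review should be resolvable during code generation rather than after the fact.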
@gustavolira
Member Author

@zdrapela I made another improvement based on your suggestion. I think a good next step would be to create some jobs on OpenShift CI that run the handlers from the refactored folder, so we can see the results.
I will also move some tests, like Orchestrator and OCM, from the PR checks to nightly.


@zdrapela
Member

Thanks a lot for the detailed feedback! I understand your point about splitting the refactor into smaller PRs, but unfortunately that would be quite tricky in this case: many of these scripts depend on each other, and separating them now would likely cause additional rework and inconsistencies.

Another reason is that most of this refactor was generated and structured with AI assistance, which means the code was produced as a complete, consistent set. Breaking it apart at this stage would require manually undoing and re-organizing pieces that were already designed to work together, effectively creating more work than benefit.

I agree that the auth-providers part can be postponed; that one is easier to migrate later. However, the rest of the refactor is already complete and consistent, and partitioning it now would probably lead to redundant effort.

My proposal would be to keep the refactored/ folder temporarily alongside the current code, and create new OpenShift CI jobs pointing to the refactored scripts. This would let us test the new flow without removing the existing, working one.

Once we’re confident that everything runs smoothly in CI, we can then fully switch over, deleting the old scripts and keeping only the refactored version.

This approach should give us a safer and more gradual migration while keeping the current pipeline stable.

There are 11 jobs in scope, if I'm counting correctly. How many of them have you tested to confirm they work correctly? Doing an AI refactor is one thing; ensuring the code for all the jobs does exactly what it's supposed to is another.

If we do the refactor properly, it should be low effort to split it into pieces. I wouldn't merge the code into the repository if we don't know if it works. The proper way is to split it into pieces, review them, see if they work, and then merge them.

And most importantly, this PR is adding 20,000 lines. If any PR is impossible to review, it's this one.

@gustavolira
Member Author

Thanks a lot for the detailed feedback! I understand your point about splitting the refactor into smaller PRs, but unfortunately that would be quite tricky in this case: many of these scripts depend on each other, and separating them now would likely cause additional rework and inconsistencies.
Another reason is that most of this refactor was generated and structured with AI assistance, which means the code was produced as a complete, consistent set. Breaking it apart at this stage would require manually undoing and re-organizing pieces that were already designed to work together, effectively creating more work than benefit.
I agree that the auth-providers part can be postponed; that one is easier to migrate later. However, the rest of the refactor is already complete and consistent, and partitioning it now would probably lead to redundant effort.
My proposal would be to keep the refactored/ folder temporarily alongside the current code, and create new OpenShift CI jobs pointing to the refactored scripts. This would let us test the new flow without removing the existing, working one.
Once we’re confident that everything runs smoothly in CI, we can then fully switch over, deleting the old scripts and keeping only the refactored version.
This approach should give us a safer and more gradual migration while keeping the current pipeline stable.

There are 11 jobs in scope, if I'm counting correctly. How many of them have you tested to confirm they work correctly? Doing an AI refactor is one thing; ensuring the code for all the jobs does exactly what it's supposed to is another.

If we do the refactor properly, it should be low effort to split it into pieces. I wouldn't merge the code into the repository if we don't know if it works. The proper way is to split it into pieces, review them, see if they work, and then merge them.

And most importantly, this PR is adding 20,000 lines. If any PR is impossible to review, it's this one.

I agree with you. To start testing, I was thinking of creating all those new jobs in OpenShift CI so that we can test each one individually.
