
Conversation

rabi (Contributor) commented Dec 25, 2025

Previously, the SSH key mount path differed between global services (DeployOnAllNodeSets=true) and non-global services:

  • Global services: /runner/env/ssh_key/ssh_key_<nodesetname>
  • Non-global services: /runner/env/ssh_key

The non-global services path happened to work because ansible-runner has a built-in mechanism that looks for an SSH key at /runner/env/ssh_key and automatically loads it into ssh-agent. However, this relied on ansible-runner's implicit behavior rather than the explicit ansible_ssh_private_key_file variable set in the inventory.

The inventory always sets ansible_ssh_private_key_file to /runner/env/ssh_key/ssh_key_<nodesetname> regardless of service type (see inventory.go line 178). This inconsistency meant non-global services were mounting the SSH key at a different path than the one Ansible expected from the inventory variable, relying on ansible-runner's fallback behavior. It also produced confusing errors in the Ansible logs, since no file existed at /runner/env/ssh_key/ssh_key_<nodesetname>.
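
For context, a minimal sketch of the inventory side of that contract (a hedged illustration: the helper name and the example nodeset name are hypothetical, and only the path format comes from the description above):

```go
// Sketch only: the inventory pins the key path per nodeset in the same
// format for every service type (cf. inventory.go line 178).
package main

import "fmt"

// sshPrivateKeyFile is a hypothetical helper illustrating the path format.
func sshPrivateKeyFile(nodeSetName string) string {
	return fmt.Sprintf("/runner/env/ssh_key/ssh_key_%s", nodeSetName)
}

func main() {
	// "edpm-compute" is a hypothetical nodeset name.
	fmt.Println("ansible_ssh_private_key_file:", sshPrivateKeyFile("edpm-compute"))
}
```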

This change unifies the SSH key mount path to always use the format:
/runner/env/ssh_key/ssh_key_<nodesetname>

This ensures:

  1. The mount path matches the ansible_ssh_private_key_file variable set in the inventory for all service types
  2. Explicit and consistent SSH key configuration rather than relying on ansible-runner's implicit ssh-agent loading
  3. Simplified code by removing the conditional branching
  4. Consistent behavior between global and non-global services

For global services, multiple SSH keys are mounted (one per nodeset) in the ssh_key folder. For non-global services, only the matching nodeset's key is mounted, but at the same path format.
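
For illustration, a minimal sketch of what the unified per-nodeset mount could look like (the volume name, the SubPath-based single-file mount, and the nodeset name below are assumptions for the sketch, not the operator's actual wiring):

```go
// Sketch only: one VolumeMount per nodeset key, using the same path
// format for global and non-global services.
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func sshKeyMount(nodeSetName string) corev1.VolumeMount {
	return corev1.VolumeMount{
		Name: "ssh-key-" + nodeSetName, // assumed volume name
		// Always the unified format, matching ansible_ssh_private_key_file
		// in the generated inventory.
		MountPath: "/runner/env/ssh_key/ssh_key_" + nodeSetName,
		SubPath:   "ssh_key_" + nodeSetName, // assumed key within the secret
		ReadOnly:  true,
	}
}

func main() {
	// "edpm-compute" is a hypothetical nodeset name.
	fmt.Println(sshKeyMount("edpm-compute").MountPath)
	// Output: /runner/env/ssh_key/ssh_key_edpm-compute
}
```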

Assisted-by: Claude-4.5-opus

Signed-off-by: rabi <ramishra@redhat.com>
openshift-ci bot requested review from slagle and stuggi December 25, 2025 04:06
openshift-ci bot commented Dec 25, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rabi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…itions

1. 'Failed deployment followed by completed deployment' test:
   - The test was creating both deployments simultaneously, causing a race
     where the test would try to update the status of a job that hadn't
     been created yet
   - Fixed by ensuring sequential execution: wait for the first deployment's
     job to fail before creating the second deployment

2. 'should not reconcile after failure' test:
   - The test was manually setting deployment status to 'backoff limit
     exceeded' without waiting for the controller to finish its initial
     reconciliation
   - Fixed by waiting for ObservedGeneration to match Generation before
     manually setting the failure status, ensuring the controller has
     completed processing (both fixes are sketched after this list)
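
A hedged sketch of both synchronization patterns, in the Ginkgo/Gomega style the functional tests use; every helper and variable below (SimulateJobFailure, GetJob, GetDeployment, CreateDeployment, SetDeploymentFailed, the name variables, timeout, interval) is a hypothetical stand-in for the suite's real utilities:

```go
// Sketch only: ordering and reconcile-barrier patterns for the two tests.
package functional_test

import (
	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

var _ = It("fails the first deployment before creating the second", func() {
	// Fix 1: drive the first deployment's job to failure, wait until the
	// failure is visible, and only then create the second deployment,
	// instead of creating both at once and racing the controller.
	SimulateJobFailure(firstJobName)
	Eventually(func(g Gomega) {
		job := GetJob(firstJobName)
		g.Expect(job.Status.Failed).To(BeNumerically(">", 0))
	}, timeout, interval).Should(Succeed())
	CreateDeployment(secondDeploymentName)
})

var _ = It("does not reconcile after a backoff-limit failure", func() {
	// Fix 2: wait for the controller's initial reconcile to complete
	// (ObservedGeneration catches up to Generation) before manually
	// setting the 'backoff limit exceeded' status, so a late reconcile
	// cannot overwrite the injected failure.
	Eventually(func(g Gomega) {
		d := GetDeployment(deploymentName)
		g.Expect(d.Status.ObservedGeneration).To(Equal(d.Generation))
	}, timeout, interval).Should(Succeed())
	SetDeploymentFailed(deploymentName) // injects the failure status
})
```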

Assisted-by: Claude-4.5-opus
Signed-off-by: rabi <ramishra@redhat.com>
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d0734dccfadd4586a253cb68910e1637

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 53m 54s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 25m 59s
❌ cifmw-crc-podified-edpm-baremetal FAILURE in 27m 55s
✔️ openstack-operator-tempest-multinode SUCCESS in 1h 39m 19s

rabi (Contributor, Author) commented Dec 25, 2025

recheck
