Implement exponential backoff and do not update MCD, MS on same error #1030

@elankath

How to categorize this issue?

/area control-plane
/kind bug
/priority 1

What happened:

We had a live scalability issue in which invalid credentials caused the etcd database to fill up.
The machine-controller-manager kept updating MachineDeployments and MachineSets even though the error had not changed. The MachineDeployment status contains an entry for each Machine and its lastError.

(issues-canary/issues/7190 internally)

What you expected to happen:

  • machine-controller-manager should adhere to controller best practices such as exponential backoff on repeated errors and skipping (status) updates when the status has not changed.
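
To make the expected behaviour concrete, here is a minimal, self-contained sketch of both ideas: a capped exponential backoff and a guard that skips status writes when nothing changed. It is not MCM code. `MachineSummary`, `DeploymentStatus`, `reconciler`, `backoffDelay` and `statusChanged` are hypothetical names introduced for illustration, the types are simplified stand-ins for the real API objects, and the 1s base and 5min cap are arbitrary assumptions.

```go
package main

import (
	"fmt"
	"reflect"
	"time"
)

// MachineSummary and DeploymentStatus are hypothetical, simplified stand-ins
// for the real API types; only the fields needed for the sketch are modelled.
type MachineSummary struct {
	Name      string
	LastError string
}

type DeploymentStatus struct {
	FailedMachines []MachineSummary
}

// backoffDelay returns base doubled once per previous failure, capped at maxDelay.
func backoffDelay(base, maxDelay time.Duration, failures int) time.Duration {
	delay := base
	for i := 0; i < failures; i++ {
		if delay > maxDelay/2 {
			return maxDelay // doubling again would exceed the cap
		}
		delay *= 2
	}
	if delay > maxDelay {
		return maxDelay
	}
	return delay
}

// statusChanged reports whether a write is needed at all.
func statusChanged(prev, next DeploymentStatus) bool {
	return !reflect.DeepEqual(prev, next)
}

// reconciler tracks the last written status and the consecutive-failure
// count for a single object.
type reconciler struct {
	lastWritten DeploymentStatus
	failures    int
	writes      int // number of simulated API-server writes
}

// observe handles the outcome of one reconcile attempt. It writes the status
// only when it differs from the last written one and returns how long to
// wait before the next retry (zero when no machine is failing).
func (r *reconciler) observe(status DeploymentStatus) time.Duration {
	if statusChanged(r.lastWritten, status) {
		r.lastWritten = status
		r.writes++ // a real controller would call the API server here
	}
	if len(status.FailedMachines) == 0 {
		r.failures = 0
		return 0
	}
	delay := backoffDelay(time.Second, 5*time.Minute, r.failures)
	r.failures++
	return delay
}

func main() {
	r := &reconciler{}
	failing := DeploymentStatus{FailedMachines: []MachineSummary{
		{Name: "machine-0", LastError: "invalid credentials"},
	}}
	for i := 0; i < 5; i++ {
		delay := r.observe(failing)
		fmt.Printf("attempt %d: retry in %v, writes so far: %d\n", i+1, delay, r.writes)
	}
}
```

With the same error reported five times in a row, the sketch performs a single status write and the retry delay doubles from 1s to 16s. In an actual controller, the rate limiters in client-go's `workqueue` package and the semantic equality helpers in `k8s.io/apimachinery` would likely be a better fit than hand-rolled versions; whether and how MCM already uses them is not covered here.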

How to reproduce it (as minimally and precisely as possible):

  • Use the virtual mcm provider with a local api-server and etcd to simulate credential failure for a large number of machines (> 1000).
  • Check the size of the etcd db.

Anything else we need to know?:
