-
Notifications
You must be signed in to change notification settings - Fork 101
Rework Plan CRD generation to add docs and kubectl explain
support
#369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6ae0eb5
to
14aaee0
Compare
dc070ff
to
f471b95
Compare
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Fixes `kubectl.kubernetes.io/last-applied-configuration` annotation being passed through from Plan to Job Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Also clean up some of the api spec for better documentation, and fix tests. Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
f471b95
to
6815127
Compare
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
6815127
to
fa16395
Compare
kubectl explain
support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit: I have fat fingers
| Field | Description | Default | Validation | | ||
| --- | --- | --- | --- | | ||
| `concurrency` _integer_ | The maximum number of concurrent nodes to apply this update on. | | | | ||
| `jobActiveDeadlineSecs` _integer_ | Sets ActiveDeadlineSeconds on Jobs generated to apply this Plan.<br />If the Job does not complete within this time, the Plan will stop processing until it is updated to trigger a redeploy.<br />If set to 0, Jobs have no deadline. If not set, the controller default value is used. | | | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not set, the controller default value is used
Maybe it is worth adding that is the default value (900)? Or pointing where to find it SYSTEM_UPGRADE_JOB_ACTIVE_DEADLINE_SECONDS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could cover that elsewhere, yeah. We don't really have any docs on controller CLI flags at the moment. This text is generated automatically from the field docstring and isn't the best place to cover controller config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits, approving.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
Ref: SURE-10336
With the current deployment defaults, upgrade jobs fail if not complete after 15 minutes, and are then cleaned up (deleted) 15 minutes later. The Plan does not trigger a recreate of the failed job, and the Plan must be modified in order for the Job to be retried. This is the expected behavior, as the default SUC deployment creates Jobs with activeDeadlineSeconds: 900 and ttlSecondsAfterFinished: 900.
Prior releases of the SUC (I tested with 0.13.4) had a bug where failed Jobs would be infinitely deleted and recreated, causing the Job to effectively ignore the configured deadline. There were no specific changes made to SUC that fixed this, but I suspect it was resolved by the bump from Wrangler v1 to v3, which contained many changes.
Apparently some users were relying on the unexpected behavior of Plans continuously recreating failed jobs, and their workflow was broken by this change in behavior (see https://xkcd.com/1172/). The correct fix is NOT for the SUC to continuously recreate failed jobs, but rather for the Plan's jobActiveDeadlineSecs to be set to an appropriately high value, or 0 for no deadline.
This change to align CRD building with rancher improves the docs by not only publishing the annotated CRD spec as markdown, but also makes
kubectl explain
work properly for Plan fields. Documentation for thejobActiveDeadlineSecs
field now covers expected behavior:Example of docs available via
kubectl explain
: