-
Notifications
You must be signed in to change notification settings - Fork 36
fix(linkerd-cni): prevent parent from outpacing forked monitor_cni_config() process
#585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
sdickhoven
wants to merge
1
commit into
linkerd:main
Choose a base branch
from
sdickhoven:BUG-fix-race-condition
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -344,6 +344,21 @@ install_cni_bin | |
| # CNI config files. | ||
| CNI_CONF_SHA='{}' | ||
| monitor_cni_config & | ||
| monitor_pid=$! | ||
|
|
||
| # The following logic waits (indefinitely if need be) for `inotifywait` in the | ||
| # forked `monitor_cni_config()` function to enter "interruptible sleep" state. | ||
| # This state indicates that `inotifywait` is fully up and ready to respond to | ||
| # filesystem events. | ||
| log "Wait for CNI config monitor to become ready" | ||
| while true; do | ||
| monitor_state=$( | ||
| (ps --ppid=$monitor_pid -o comm=,state= || true) | | ||
| awk '$1 == "inotifywait" && $2 == "S" {print "ok"}' | ||
| ) | ||
| [ -z "$monitor_state" ] || break | ||
| sleep .1 # 100ms | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. given that i saw a ~140ms race in production, i figured that waiting in increments of 100ms is reasonable. |
||
| done | ||
|
|
||
| # Append our config to any existing config file (*.conflist or *.conf) | ||
| config_files=$(find "${HOST_CNI_NET}" -maxdepth 1 -type f ! -name '*linkerd*' \( -iname '*conflist' -o -iname '*conf' \)) | ||
|
|
||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the
monitor_statevariable will either be an empty string or it will contain (one or more) "ok"....it should only be one "ok" because there is only one
inotifywaitprocess that gets spawned by themonitor_cni_config()function.however, this logic will continue to function even if multiple
inotifywaitprocesses are spawned bymonitor_cni_config()because i wanted to make it very difficult for future changes to break this logic.the subsequent check that is responsible for breaking the infinite loop only cares about whether the
monitor_statevariable is an empty string or not. so anything besides "ok" (including multiple "ok") will break the loop just fine.i also experimented with only using
pshere:however it appears as though
psdoes not like multiple filters (1:--ppid=$monitor_pid, 2:-C inotifywait) and therefore ignores one of the two.so i opted to have
psemit all processes that are spawned bymonitor_cni_config()(which should be onlyinotifywait) and then haveawkcheck if one of those processes isinotifywaitand whether that process is in the stateS(="interruptible sleep") which should indicate that it is not in the process of actively starting up but is instead waiting for filesystem events to come in.https://www.man7.org/linux/man-pages/man1/ps.1.html#PROCESS_STATE_CODES
the
=after the output column name (comm=,state=) tellspsto omit the header for that column. this is not strictly necessary for the logic to work but we don't need the headers so i added it.vs
the
|| trueis there because ifpsdoesn't find any processes (which is the whole reason for this logic to exist in the first place), it exits with a non-zero exit code (which would then abort the entire script... which we don't want).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could get much more elaborate here and use the filesystem for ipc between the main process and the forked
monitor_cni_config()process but that would require a lot more logic.so i opted to keep things relatively straight-forward.