fix(container): Propagate engine initialization errors to the caller #1020
base: main
e56b8e9 to
c7c0968
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rabbitstack.
If a specific container engine worker fails during initialization, the error is silently skipped, making it hard to troubleshoot the real problem. Instead, accumulate and bubble up all the errors to the async handler. Signed-off-by: rabbitstack <nedim.sabic@sysdig.com>
46c632c to
76943d2
Compare
errmsg := C.CString("")
ptr := StartWorker((*[0]byte)(C.echo_cb), cstr, &enabledSocks, &errmsg)
if ptr == nil {
	fmt.Println("Failed to start worker; nothing configured?")
	fmt.Println(fmt.Sprintf("Failed to start worker; nothing configured? %s", C.GoString(errmsg)))
	os.Exit(1)
}
This snippet introduces a memory leak, because C.CString("") allocates a C string on the heap, and the Go runtime will not garbage-collect it. We should add a call to defer C.free(...), but to make sure it runs right after the StartWorker(...) invocation, regardless of whether that function panics, I would add a new wrapping function like the following:
func startWorker(...) unsafe.Pointer {
	errmsg := C.CString("")
	defer C.free(unsafe.Pointer(errmsg))
	return StartWorker((*[0]byte)(C.echo_cb), cstr, &enabledSocks, &errmsg)
}
The string buffer is deallocated on the C/C++ side of the plugin: https://github.com/falcosecurity/plugins/pull/1020/files#diff-66d6df581476d08b6c98945b62b26d631c6a61a3393e39f23334a2476eac0312R59. Pretty much in line with the preexisting code to free the enabled_engines string.
I have some concerns here. First of all, we have two callers to StartWorker():
- the first is the main() function in main_exe.go in Go code
- the second is the my_plugin::start_async_events() method in C++ code
In main_exe.go:main(), we are indeed allocating a C string with errmsg := C.CString(""). This is our first allocation. We then pass its address to StartWorker() that, in some cases, replaces the pointed value:
if errs.Len() > 0 {
	*errmsg = C.CString(errs.String())
}
The memory backing the first allocation is leaked after this instruction. Moreover, after StartWorker() returns, nobody deallocates the second string (the one containing errs.String() we just saw): this is a second leak.
Now let's analyze the second code path: the one starting from my_plugin::start_async_events(). As err is initialized with nullptr, a call to StartWorker() will work fine as long as it doesn't panic: if it panics, who is gonna release its allocated string? Notice that panics are also problematic for the main_exe.go:main() path.
To simplify handling and fix all these issues, I would suggest designing a wrapper around StartWorker() that allocates the C string and (this is important) ensures this string is deallocated, whether or not the call to StartWorker() panicked. This is also easier to maintain, as it puts ownership handling in a single place.
Finally, I agree with you that enabled_engines should be better handled. Specifically, who ensures it is initialized when free((void *)enabled_engines); is executed?
WDYT? @rabbitstack
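The single-owner wrapper proposed above can be modeled without cgo. The sketch below is an assumption-laden illustration, not the plugin's code: `allocString` and `freeString` stand in for `C.CString` and `C.free`, a counter stands in for the C heap, and `workerFn` and `startWorkerOwned` are hypothetical names. It starts from a nil out-parameter so there is no first allocation to overwrite, and it frees whatever the worker allocated on every exit path, including a panic.

```go
package main

// liveAllocs counts outstanding allocations. It stands in for the C heap
// so that this sketch runs without cgo.
var liveAllocs int

// allocString models C.CString: each call creates one allocation that
// must later be released with freeString.
func allocString(s string) *string {
	liveAllocs++
	return &s
}

// freeString models C.free: releasing nil is a no-op, like free(NULL).
func freeString(p *string) {
	if p != nil {
		liveAllocs--
	}
}

// workerFn is a stand-in for StartWorker: it may store a newly allocated
// error message in *errmsg, and it may panic.
type workerFn func(errmsg **string) bool

// startWorkerOwned is the only place that owns the error string. It copies
// the message into a Go string for the caller and releases the original.
func startWorkerOwned(start workerFn) (ok bool, msg string) {
	var errmsg *string
	// The closure reads errmsg when it runs, so it frees the pointer the
	// worker stored. Writing `defer freeString(errmsg)` would capture the
	// nil value at this point and free nothing.
	defer func() { freeString(errmsg) }()

	ok = start(&errmsg)
	if errmsg != nil {
		msg = *errmsg
	}
	return ok, msg
}
```

If the worker panics, the deferred function still runs and releases the string, and the panic then continues to propagate to the caller. How a panic should be handled when the call originates from C++ is a separate question that this sketch does not address.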
@ekoops I agree with your analysis and the sentiment about generally improving the error handling. I didn't want to be too disruptive and adopted the same approach as with the enabled_engines string :). That said, main_exe.go source can be ignored as it is solely used to verify the go-worker inner-workings during development without the need to spin up a full-fledged plugin running inside Falco.
Also, IMO, the leaks you identified are not critical, since StartWorker is invoked once during plugin lifetime. However, it is definitely worth the improvements you highlighted above.
Hey @rabbitstack, any update on this?
What type of PR is this?
/kind bug
Any specific area of the project related to this PR?
/area plugins
What this PR does / why we need it:
Which issue(s) this PR fixes:
If the specific container engine worker fails during initialization, the error is silently skipped, making it hard to troubleshoot the real problem. Instead, accumulate and bubble up all the errors to the async handler.
Special notes for your reviewer: