libs: Handling plugin extraction errors
Motivation Currently, errors are ignored during field extraction in the plugin framework. In theory, a plugin might fail extracting a field for two main reasons:
- The field is not present in the given event, for which the
ss_plugin_extract_field.field_presentflag is set to false - The
extract_fieldsexported plugin function encounters some error and returns a code different thanSS_PLUGIN_SUCCESS. In the current implementation, in both cases the filtercheck returns a NULL pointer, which is interpreted as a not-available field. This is visible here 👇🏼 https://github.com/falcosecurity/libs/blob/83f460cc5ba39cd09f924b4d05a1ffd13eec07de/userspace/libsinsp/plugin.cpp#L630 https://github.com/falcosecurity/libs/blob/83f460cc5ba39cd09f924b4d05a1ffd13eec07de/userspace/libsinsp/plugin.cpp#L304
Although this is semantically correct, the two failure paths have a quite different meaning. In the second case, the plugin returns a failure code and the framework silently ignores it to maintain a non-blocking extraction flow. This is makes error handling efforts useless for plugin developers, and generally makes it harder to debug plugins at runtime.
Feature I propose to catch the error and make it visible somehow.
I agree that maintaining field extraction non-blocking might be a priority here, so maybe throwing an exception might not be a viable option. We can consider some weaker error propagation methods, or maybe logging to stderr. To the bare minimum, we might log the error if a debug mode is enabled.
Alternatives
Keep things as they are, and just ignore plugin failures for extract_fields.
About this issue
- Original URL
- State: open
- Created 3 years ago
- Reactions: 1
- Comments: 22 (22 by maintainers)
I believe this is still relevant. @jasondellaluce to confirm.
However, I think it would be best to extend this discussion to the plugin domain and all field extraction mechanisms, including those built-in sinsp. I know Jason has some thoughts in this regard, so I’m eager to hear from him.
That being said, I don’t think this is a top priority. However, it is still an improvement that is worth tackling.
Just my 2 cents
cc @Andreagit97 @FedeDP
/remove-lifecycle stale