Diffstat (limited to 'src/content/blog/2021/02/17/fallible.adoc')
-rw-r--r-- | src/content/blog/2021/02/17/fallible.adoc | 285 |
1 files changed, 0 insertions, 285 deletions
diff --git a/src/content/blog/2021/02/17/fallible.adoc b/src/content/blog/2021/02/17/fallible.adoc
deleted file mode 100644
index 1f2f641..0000000
--- a/src/content/blog/2021/02/17/fallible.adoc
+++ /dev/null
@@ -1,285 +0,0 @@
-= ANN: fallible - Fault injection library for stress-testing failure scenarios
-:updatedat: 2022-03-06
-
-:fallible: https://euandreh.xyz/fallible/
-
-Yesterday I pushed v0.1.0 of {fallible}[fallible], a minuscule library for
-fault-injection and stress-testing C programs.
-
-== _EDIT_
-
-:changelog: https://euandreh.xyz/fallible/CHANGELOG.html
-:tarball: https://euandre.org/static/attachments/fallible.tar.gz
-
-2021-06-12: As of {changelog}[0.3.0] (and beyond), the macro interface improved
-and is a bit different from what is presented in this article. If you're
-interested, I encourage you to take a look at it.
-
-2022-03-06: I've {tarball}[archived] the project for now. It still needs some
-maturing before being usable.
-
-== Existing solutions
-
-:gnu-std: https://www.gnu.org/prep/standards/standards.html#Semantics
-:valgrind: https://www.valgrind.org/
-:so-alloc: https://stackoverflow.com/questions/1711170/unit-testing-for-failed-malloc
-
-Writing robust code can be challenging, and tools like static analyzers, fuzzers
-and friends can help you get there with more certainty. When I tried to
-improve some of my C code and make it more robust, in order to handle system
-crashes, filled disks, out-of-memory and similar scenarios, I didn't find the
-tooling I expected. In particular, I couldn't find existing tools to help me
-explicitly stress-test those failure scenarios.
-
-Take the "{gnu-std}[Writing Robust Programs]" section of the GNU Coding
-Standards:
-
-____
-Check every system call for an error return, unless you know you wish to ignore
-errors. (...) Check every call to malloc or realloc to see if it returned NULL.
-____
-
-From a robustness standpoint, this is a reasonable stance: if you want to have a
-robust program that knows how to fail when you're out of memory and `malloc`
-returns `NULL`, then you ought to check every call to `malloc`.
-
-Take a sample code snippet for clarity:
-
-[source,c]
-----
-void a_function() {
-    char *s1 = malloc(A_NUMBER);
-    strcpy(s1, "some string");
-
-    char *s2 = malloc(A_NUMBER);
-    strcpy(s2, "another string");
-}
-----
-
-At first glance, this code is unsafe: if any of the calls to `malloc` returns
-`NULL`, `strcpy` will be given a `NULL` pointer.
-
-My first instinct was to change this code to something like this:
-
-[source,diff]
-----
-@@ -1,7 +1,15 @@
- void a_function() {
-     char *s1 = malloc(A_NUMBER);
-+    if (!s1) {
-+        fprintf(stderr, "out of memory, exiting\n");
-+        exit(1);
-+    }
-     strcpy(s1, "some string");
- 
-     char *s2 = malloc(A_NUMBER);
-+    if (!s2) {
-+        fprintf(stderr, "out of memory, exiting\n");
-+        exit(1);
-+    }
-     strcpy(s2, "another string");
- }
-----
-
-As I later found out, there are at least 2 problems with this approach:
-
-. *it doesn't compose*: this could arguably work if `a_function` was `main`.
-  But if `a_function` lives inside a library, an `exit(1);` is an inelegant way
-  of handling failures, and will catch the top-level `main` consuming the
-  library by surprise;
-. *it gives up instead of handling failures*: the actual handling goes a bit
-  beyond stopping. What about open file handles, in-memory caches, unflushed
-  bytes, etc.?
-
-If you could force only the second call to `malloc` to fail,
-{valgrind}[Valgrind] would correctly complain that the program exited with
-unfreed memory.
-
-So the last change, to get the best version of the above code, is:
-
-[source,diff]
-----
-@@ -1,15 +1,14 @@
--void a_function() {
-+bool a_function() {
-     char *s1 = malloc(A_NUMBER);
-     if (!s1) {
--        fprintf(stderr, "out of memory, exiting\n");
--        exit(1);
-+        return false;
-     }
-     strcpy(s1, "some string");
- 
-     char *s2 = malloc(A_NUMBER);
-     if (!s2) {
--        fprintf(stderr, "out of memory, exiting\n");
--        exit(1);
-+        free(s1);
-+        return false;
-     }
-     strcpy(s2, "another string");
- }
-----
-
-Instead of returning `void`, `a_function` now returns `bool` to indicate whether
-an error occurred during its execution. If `a_function` returned a pointer to
-something, the error value could be `NULL`; an `int` that represents an error
-code is another option.
-
-The code is now a) safe and b) failing gracefully, returning the control to the
-caller to properly handle the error case.
-
-After seeing similar patterns on well-designed APIs, I adopted this practice for
-my own code, but was still left with manually verifying the correctness and
-robustness of it.
-
-How could I add assertions around my code that would help me make sure the
-`free(s1);` exists, before getting an error report? How do other people and
-projects solve this?
-
-From what I could see, either people a) hope for the best, b) write safe code
-but don't stress-test it or c) write ad-hoc code to stress it.
-
-The most prominent case of c) is SQLite: it has a few wrappers around the
-familiar `malloc` to do fault injection, check for memory limits, add warnings,
-create shim layers for other environments, etc. All of that, however, is
-tightly coupled with SQLite itself, and couldn't be easily pulled out for use
-somewhere else.
-
-When searching for it online, an {so-alloc}[interesting thread] caught my
-attention: fail the call to `malloc` for each time it is called, and when the
-same stacktrace appears again, allow it to proceed.
-
-== Implementation
-
-:mallocfail: https://github.com/ralight/mallocfail
-:should-fail-fn: https://euandre.org/git/fallible/tree/src/fallible.c?id=v0.1.0#n16
-
-A working implementation of that already exists: {mallocfail}[mallocfail]. It
-uses `LD_PRELOAD` to replace `malloc` at run-time, computes the SHA of the
-stacktrace and fails once for each SHA.
-
-I initially envisioned and started implementing something very similar to
-mallocfail. However, I wanted it to go beyond out-of-memory scenarios, and using
-`LD_PRELOAD` for every possible corner that could fail wasn't a good idea in the
-long run.
-
-Also, mallocfail won't work together with tools such as Valgrind, which want to
-do their own override of `malloc` with `LD_PRELOAD`.
-
-I instead went with something less automatic: starting with a
-`fallible_should_fail(char *filename, int lineno)` function that fails once for
-each `filename`+`lineno` combination, I created macro wrappers around common
-functions such as `malloc`:
-
-[source,c]
-----
-void *fallible_malloc(size_t size, const char *const filename, int lineno) {
-#ifdef FALLIBLE
-    if (fallible_should_fail(filename, lineno)) {
-        return NULL;
-    }
-#else
-    (void)filename;
-    (void)lineno;
-#endif
-    return malloc(size);
-}
-
-#define MALLOC(size) fallible_malloc(size, __FILE__, __LINE__)
-----
-
-With this definition, I could replace the calls to `malloc` with `MALLOC` (or
-any other name that you want to `#define`):
-
-[source,diff]
-----
---- 3.c 2021-02-17 00:15:38.019706074 -0300
-+++ 4.c 2021-02-17 00:44:32.306885590 -0300
-@@ -1,11 +1,11 @@
- bool a_function() {
--    char *s1 = malloc(A_NUMBER);
-+    char *s1 = MALLOC(A_NUMBER);
-     if (!s1) {
-         return false;
-     }
-     strcpy(s1, "some string");
- 
--    char *s2 = malloc(A_NUMBER);
-+    char *s2 = MALLOC(A_NUMBER);
-     if (!s2) {
-         free(s1);
-         return false;
-----
-
-With this change, if the program gets compiled with the `-DFALLIBLE` flag the
-fault-injection mechanism will run, and `MALLOC` will fail once for each
-`filename`+`lineno` combination. When the flag is missing, `MALLOC` is a very
-thin wrapper around `malloc`, which compilers could remove entirely, and the
-`-lfallible` flag can be omitted.
-
-This applies not only to `malloc` or other `stdlib.h` functions. If
-`a_function` is important or relevant, I could add a wrapper around it too, that
-checks `fallible_should_fail` to exercise whether its callers are also doing the
-proper clean-up.
-
-The actual code is just this single function,
-{should-fail-fn}[`fallible_should_fail`], which ended up taking only ~40 lines.
-In fact, the Makefile (111), the README.md (82) and the troff (306) each have
-more lines than that in this first version.
-
-The price for such fine-grained control is that this approach requires more
-manual work.
-
-== Usage examples
-
-=== `MALLOC` from the `README.md`
-
-:fallible-check: https://euandreh.xyz/fallible/fallible-check.1.html
-
-[source,c]
-----
-// leaky.c
-#include <string.h>
-#include <fallible_alloc.h>
-
-int main() {
-    char *aaa = MALLOC(100);
-    if (!aaa) {
-        return 1;
-    }
-    strcpy(aaa, "a safe use of strcpy");
-
-    char *bbb = MALLOC(100);
-    if (!bbb) {
-        // free(aaa);
-        return 1;
-    }
-    strcpy(bbb, "not unsafe, but aaa is leaking");
-
-    free(bbb);
-    free(aaa);
-    return 0;
-}
-----
-
-Compile with `-DFALLIBLE` and run {fallible-check}[`fallible-check.1`]:
-
-[source,sh]
-----
-$ c99 -DFALLIBLE -o leaky leaky.c -lfallible
-$ fallible-check ./leaky
-Valgrind failed when we did not expect it to:
-(...suppressed output...)
-# exit status is 1
-----
-
-== Conclusion
-
-:package: https://euandre.org/git/package-repository/
-
-For my personal use, I'll {package}[package] it for GNU Guix and Nix.
-Packaging it for any other distribution should be trivial; otherwise, just
-download the tarball and run `[sudo] make install`.
-
-Patches welcome!
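
The removed post describes `fallible_should_fail` only by its behaviour: it fails once for each `filename`+`lineno` combination. As a rough illustration of that idea, and not the library's actual code, here is a minimal sketch. The names `sketch_should_fail`, `sketch_malloc`, `SKETCH_MALLOC` and `MAX_SITES` are hypothetical, and the in-process fixed-size table is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SITES 128 /* hypothetical capacity, chosen only for this sketch */

/* One call site that has already been made to fail. */
struct site {
    const char *filename; /* points at a __FILE__ literal, so it stays valid */
    int lineno;
};

static struct site seen[MAX_SITES];
static size_t seen_count = 0;

/* Hypothetical stand-in for fallible_should_fail: returns true the first
 * time a filename+lineno pair is seen, and false on every later call. */
bool sketch_should_fail(const char *const filename, int lineno) {
    assert(filename != NULL);
    for (size_t i = 0; i < seen_count; i++) {
        if (seen[i].lineno == lineno &&
            strcmp(seen[i].filename, filename) == 0) {
            return false; /* this site already failed once */
        }
    }
    if (seen_count == MAX_SITES) {
        return false; /* table full: stop injecting failures */
    }
    seen[seen_count].filename = filename;
    seen[seen_count].lineno = lineno;
    seen_count++;
    return true;
}

/* Same shape as the fallible_malloc wrapper quoted in the post, but with
 * fault injection always enabled. */
void *sketch_malloc(size_t size, const char *const filename, int lineno) {
    if (sketch_should_fail(filename, lineno)) {
        return NULL;
    }
    return malloc(size);
}

#define SKETCH_MALLOC(size) sketch_malloc(size, __FILE__, __LINE__)

/* The post's a_function, with the clean-up path the injection exercises. */
bool a_function(void) {
    char *s1 = SKETCH_MALLOC(32);
    if (!s1) {
        return false;
    }
    strcpy(s1, "some string");

    char *s2 = SKETCH_MALLOC(32);
    if (!s2) {
        free(s1); /* the line the stress test is meant to verify */
        return false;
    }
    strcpy(s2, "another string");

    free(s2);
    free(s1);
    return true;
}
```

Under these assumptions, calling `a_function()` three times in one process fails at the first `SKETCH_MALLOC`, then at the second (running the `free(s1)` path), and then succeeds. Running that sequence under a leak checker is one way to notice a missing `free(s1)`.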