Diffstat (limited to '')
-rw-r--r-- | _tils/2020-12-15-awk-snippet-shellcheck-all-scripts-in-a-repository.md | 154 |
1 files changed, 154 insertions, 0 deletions
diff --git a/_tils/2020-12-15-awk-snippet-shellcheck-all-scripts-in-a-repository.md b/_tils/2020-12-15-awk-snippet-shellcheck-all-scripts-in-a-repository.md
new file mode 100644
index 0000000..91ab22e
--- /dev/null
+++ b/_tils/2020-12-15-awk-snippet-shellcheck-all-scripts-in-a-repository.md
@@ -0,0 +1,154 @@
+---
+
+title: 'Awk snippet: ShellCheck all scripts in a repository'
+
+date: 2020-12-15
+
+layout: post
+
+lang: en
+
+ref: awk-snippet-shellcheck-all-scripts-in-a-repository
+
+---
+
+Inspired by Fred Hebert's "[Awk in 20 Minutes][awk-20min]", here's a problem I
+just solved with a line of Awk: run ShellCheck on all scripts in a repository.
+
+In my repositories I usually have Bash and POSIX scripts, which I want to keep
+tidy with [ShellCheck][shellcheck]. Here's the first version of
+`assert-shellcheck.sh`:
+
+```shell
+#!/bin/sh
+set -eu
+
+find . -type f -name '*.sh' -print0 | xargs -0 shellcheck
+```
+
+This is the type of script that I copy around to all repositories, and I want it
+to be capable of working on any repository, without requiring a list of files to
+run ShellCheck on.
+
+This first version worked fine, as all my scripts had the '.sh' ending. But I
+recently added some scripts without any extension, so `assert-shellcheck.sh`
+called for a second version. The first attempt was to try grepping the shebang
+line:
+
+```shell
+$ grep '^#!/' assert-shellcheck.sh
+#!/bin/sh
+```
+
+Good, we have a grep pattern on the first try. Let's try to find all the
+matching files:
+
+```shell
+$ find . -type f | xargs grep -l '^#!/'
+./TODOs.org
+./.git/hooks/pre-commit.sample
+./.git/hooks/pre-push.sample
+./.git/hooks/pre-merge-commit.sample
+./.git/hooks/fsmonitor-watchman.sample
+./.git/hooks/pre-applypatch.sample
+./.git/hooks/pre-push
+./.git/hooks/prepare-commit-msg.sample
+./.git/hooks/commit-msg.sample
+./.git/hooks/post-update.sample
+./.git/hooks/pre-receive.sample
+./.git/hooks/applypatch-msg.sample
+./.git/hooks/pre-rebase.sample
+./.git/hooks/update.sample
+./build-aux/with-guile-env.in
+./build-aux/test-driver
+./build-aux/missing
+./build-aux/install-sh
+./build-aux/install-sh~
+./bootstrap
+./scripts/assert-todos.sh
+./scripts/songbooks
+./scripts/compile-readme.sh
+./scripts/ci-build.sh
+./scripts/generate-tasks-and-bugs.sh
+./scripts/songbooks.in
+./scripts/with-container.sh
+./scripts/assert-shellcheck.sh
+```
+
+This approach has a problem, though: it includes files ignored by Git, such as
+`build-aux/install-sh~`, and even goes into the `.git/` directory and finds
+sample hooks in `.git/hooks/*`.
+
+To list the files that Git is tracking, we'll try `git ls-files`:
+
+```shell
+$ git ls-files | xargs grep -l '^#!/'
+TODOs.org
+bootstrap
+build-aux/with-guile-env.in
+old/scripts/assert-docs-spelling.sh
+old/scripts/build-site.sh
+old/scripts/builder.bats.sh
+scripts/assert-shellcheck.sh
+scripts/assert-todos.sh
+scripts/ci-build.sh
+scripts/compile-readme.sh
+scripts/generate-tasks-and-bugs.sh
+scripts/songbooks.in
+scripts/with-container.sh
+```
+
+It looks to be almost there, but the `TODOs.org` entry shows a flaw in it: grep
+is looking for the `'^#!/'` pattern on any line of the file. In my case,
+`TODOs.org` had a snippet in the middle of the file where a line started with
+`#!/bin/sh`.
+
+So what we actually want is to match the **first** line against the pattern. We
+could loop through each file, get the first line with `head -n 1` and grep
+against that, but this is starting to look messy. I bet there is another way of
+doing it concisely...
+
+Let's try Awk. I need a way to select lines by number to replace `head -n 1`,
+and to stop processing the file if the pattern matches. A quick search points me
+to using `FNR` for the former, and `{ nextfile }` for the latter. Let's try it:
+
+```shell
+$ git ls-files | xargs awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
+bootstrap
+build-aux/with-guile-env.in
+old/scripts/assert-docs-spelling.sh
+old/scripts/build-site.sh
+old/scripts/builder.bats.sh
+scripts/assert-shellcheck.sh
+scripts/assert-todos.sh
+scripts/ci-build.sh
+scripts/compile-readme.sh
+scripts/generate-tasks-and-bugs.sh
+scripts/songbooks.in
+scripts/with-container.sh
+```
+
+Great! Only `TODOs.org` is gone from the list, and the script is much better:
+instead of matching any line of the file that looks like a shebang, we only look
+at the first line. Let's put it back into the `assert-shellcheck.sh` file and
+pass file names to Awk separated by NUL to accommodate spaces in the names:
+
+```shell
+#!/bin/sh
+set -eu
+
+git ls-files -z | \
+  xargs -0 awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }' | \
+  xargs shellcheck
+```
+
+This is where I've stopped, but I imagine a likely improvement: match against
+only `#!/bin/sh` and `#!/usr/bin/env bash` shebangs (the ones I use most), to
+avoid running ShellCheck on Perl files, or other shebangs.
+
+Also when reviewing the text of this article, I found that `{ nextfile }` is a
+GNU Awk extension. It would be an improvement if `assert-shellcheck.sh` relied
+on the POSIX subset of Awk for working correctly.
+
+[awk-20min]: https://ferd.ca/awk-in-20-minutes.html
+[shellcheck]: https://www.shellcheck.net/
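The two improvements the post leaves open at the end could be combined roughly as follows. This is a sketch under assumptions, not the author's script: the function name `shell_scripts` is hypothetical, and the shebang list is taken from the post's wording ("`#!/bin/sh` and `#!/usr/bin/env bash`"). Testing `FNR == 1` inside the pattern avoids `nextfile` altogether.

```shell
# Hypothetical helper, not part of the original post: print the names of the
# given files whose FIRST line is a sh or bash shebang.
# `FNR == 1` restricts the match to the first line of each file, so no
# `nextfile` is needed. The trailing `( |$)` keeps `#!/bin/shfoo` from matching
# while still allowing interpreter arguments such as `#!/bin/sh -e`.
shell_scripts() {
  awk 'FNR == 1 && /^#!(\/bin\/sh|\/usr\/bin\/env bash)( |$)/ { print FILENAME }' "$@"
}
```

The Awk program inside the function could replace the one in the `xargs -0 awk '...'` stage of `assert-shellcheck.sh`. The trade-off is that, without `nextfile`, Awk reads every file to the end instead of stopping after the first line, which may be noticeably slower on repositories with large files.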