Diffstat (limited to 'src/content/tils/2020/12/15/shellcheck-repo.adoc')
-rw-r--r-- | src/content/tils/2020/12/15/shellcheck-repo.adoc | 171 |
1 files changed, 171 insertions, 0 deletions
diff --git a/src/content/tils/2020/12/15/shellcheck-repo.adoc b/src/content/tils/2020/12/15/shellcheck-repo.adoc
new file mode 100644
index 0000000..71d10a3
--- /dev/null
+++ b/src/content/tils/2020/12/15/shellcheck-repo.adoc
@@ -0,0 +1,171 @@
+---
+
+title: 'Awk snippet: ShellCheck all scripts in a repository'
+
+date: 2020-12-15
+
+updated_at: 2020-12-16
+
+layout: post
+
+lang: en
+
+ref: awk-snippet-shellcheck-all-scripts-in-a-repository
+
+eu_categories: shell
+
+---
+
+Inspired by Fred Hebert's "[Awk in 20 Minutes][awk-20min]", here's a problem I
+just solved with a line of Awk: run ShellCheck on all scripts of a repository.
+
+In my repositories I usually have Bash and POSIX scripts, which I want to keep
+tidy with [ShellCheck][shellcheck]. Here's the first version of
+`assert-shellcheck.sh`:
+
+```shell
+#!/bin/sh -eux
+
+find . -type f -name '*.sh' -print0 | xargs -0 shellcheck
+```
+
+This is the type of script that I copy around to all repositories, and I want it
+to be capable of working on any repository, without requiring a list of files to
+run ShellCheck on.
+
+This first version worked fine, as all my scripts had the '.sh' ending. But I
+recently added some scripts without any extension, so `assert-shellcheck.sh`
+called for a second version. The first attempt was to try grepping the shebang
+line:
+
+```shell
+$ grep '^#!/' assert-shellcheck.sh
+#!/bin/sh -eux
+```
+
+Good, we have a grep pattern on the first try. Let's try to find all the
+matching files:
+
+```shell
+$ find . -type f | xargs grep -l '^#!/'
+./TODOs.org
+./.git/hooks/pre-commit.sample
+./.git/hooks/pre-push.sample
+./.git/hooks/pre-merge-commit.sample
+./.git/hooks/fsmonitor-watchman.sample
+./.git/hooks/pre-applypatch.sample
+./.git/hooks/pre-push
+./.git/hooks/prepare-commit-msg.sample
+./.git/hooks/commit-msg.sample
+./.git/hooks/post-update.sample
+./.git/hooks/pre-receive.sample
+./.git/hooks/applypatch-msg.sample
+./.git/hooks/pre-rebase.sample
+./.git/hooks/update.sample
+./build-aux/with-guile-env.in
+./build-aux/test-driver
+./build-aux/missing
+./build-aux/install-sh
+./build-aux/install-sh~
+./bootstrap
+./scripts/assert-todos.sh
+./scripts/songbooks
+./scripts/compile-readme.sh
+./scripts/ci-build.sh
+./scripts/generate-tasks-and-bugs.sh
+./scripts/songbooks.in
+./scripts/with-container.sh
+./scripts/assert-shellcheck.sh
+```
+
+This approach has a problem, though: it includes files ignored by Git, such as
+`build-aux/install-sh~`, and even goes into the `.git/` directory and finds
+sample hooks in `.git/hooks/*`.
+
+To list the files that Git is tracking we'll try `git ls-files`:
+
+```shell
+$ git ls-files | xargs grep -l '^#!/'
+TODOs.org
+bootstrap
+build-aux/with-guile-env.in
+old/scripts/assert-docs-spelling.sh
+old/scripts/build-site.sh
+old/scripts/builder.bats.sh
+scripts/assert-shellcheck.sh
+scripts/assert-todos.sh
+scripts/ci-build.sh
+scripts/compile-readme.sh
+scripts/generate-tasks-and-bugs.sh
+scripts/songbooks.in
+scripts/with-container.sh
+```
+
+It's almost there, but the `TODOs.org` entry shows a flaw: grep matches the
+`'^#!/'` pattern on any line of the file. In my case,
+`TODOs.org` had a snippet in the middle of the file where a line started with
+`#!/bin/sh`.
+
+So what we actually want is to match the **first** line against the pattern. We
+could loop through each file, get the first line with `head -n 1` and grep
+against that, but this is starting to look messy. I bet there is another way of
+doing it concisely...
+
+Let's try Awk. I need a way to select lines by number, to replace `head -n 1`,
+and to stop processing the file if the pattern matches. A quick search points me
+to using `FNR` for the former, and `{ nextfile }` for the latter. Let's try it:
+
+```shell
+$ git ls-files | xargs awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
+bootstrap
+build-aux/with-guile-env.in
+old/scripts/assert-docs-spelling.sh
+old/scripts/build-site.sh
+old/scripts/builder.bats.sh
+scripts/assert-shellcheck.sh
+scripts/assert-todos.sh
+scripts/ci-build.sh
+scripts/compile-readme.sh
+scripts/generate-tasks-and-bugs.sh
+scripts/songbooks.in
+scripts/with-container.sh
+```
+
+Great! `TODOs.org` is gone from the list, and the script is much better: instead
+of matching against any part of the file that may have a shebang-like line, we
+only look at the first. Let's put it back into the `assert-shellcheck.sh` file
+and use `NUL` separators to accommodate files with spaces in the name:
+
+```shell
+#!/bin/sh -eux
+
+git ls-files -z | \
+  xargs -0 awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }' | \
+  xargs shellcheck
+```
+
+This is where I've stopped, but I imagine a likely improvement: match against
+only `#!/bin/sh` and `#!/usr/bin/env bash` shebangs (the ones I use most), to
+avoid running ShellCheck on Perl files, or other shebangs.
+
+Also when reviewing the text of this article, I found that `{ nextfile }` is a
+GNU Awk extension. It would be an improvement if `assert-shellcheck.sh` relied
+on the POSIX subset of Awk for working correctly.
+
+## *Update*
+
+After publishing, I could remove `{ nextfile }` and even make the script
+simpler:
+
+```shell
+#!/bin/sh -eux
+
+git ls-files -z | \
+  xargs -0 awk 'FNR==1 && /^#!\// { print FILENAME }' | \
+  xargs shellcheck
+```
+
+Now both the shell and Awk usage are POSIX compatible.
+
+[awk-20min]: https://ferd.ca/awk-in-20-minutes.html
+[shellcheck]: https://www.shellcheck.net/
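The post leaves one improvement open: matching only `#!/bin/sh` and `#!/usr/bin/env bash` shebangs. A minimal sketch of that idea follows. It is not part of the commit, and the function name `shell_scripts` is hypothetical. It assumes a shebang is followed either by a space (as in `#!/bin/sh -eux`) or by the end of the line, and it uses only `FNR`, `FILENAME`, and regular expression patterns.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): print the names of
# the files given as arguments whose first line is one of the two shebangs
# mentioned in the post.
shell_scripts() {
  awk '
    FNR == 1 && (/^#!\/bin\/sh( |$)/ || /^#!\/usr\/bin\/env bash( |$)/) {
      print FILENAME
    }
  ' "$@"
}

# Possible use inside assert-shellcheck.sh (assumes git and shellcheck are
# installed; xargs cannot call a shell function, so the Awk program would be
# passed inline, as in the post):
#   git ls-files -z | xargs -0 awk '...same program...' | xargs shellcheck
```

With this filter, a tracked Perl script starting with `#!/usr/bin/perl` would no longer be handed to ShellCheck, while extensionless sh and Bash scripts would still be picked up.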