---
title: 'Awk snippet: ShellCheck all scripts in a repository'
date: 2020-12-15
updated_at: 2020-12-16
layout: post
lang: en
ref: awk-snippet-shellcheck-all-scripts-in-a-repository
eu_categories: shell
---

Inspired by Fred Hebert's "[Awk in 20 Minutes][awk-20min]", here's a problem I just solved with a line of Awk: run ShellCheck on all scripts of a repository.

In my repositories I usually have Bash and POSIX scripts, which I want to keep tidy with [ShellCheck][shellcheck]. Here's the first version of `assert-shellcheck.sh`:

```shell
#!/bin/sh -eux
find . -type f -name '*.sh' -print0 | xargs -0 shellcheck
```

This is the type of script that I copy around to all repositories, and I want it to work on any repository without requiring a list of files to run ShellCheck on.

This first version worked fine, as all my scripts had the `.sh` extension. But I recently added some scripts without any extension, so `assert-shellcheck.sh` called for a second version. The first attempt was to try grepping the shebang line:

```shell
$ grep '^#!/' assert-shellcheck.sh
#!/bin/sh -eux
```

Good, we have a grep pattern on the first try. Let's try to find all the matching files:

```shell
$ find . \
  -type f | xargs grep -l '^#!/'
./TODOs.org
./.git/hooks/pre-commit.sample
./.git/hooks/pre-push.sample
./.git/hooks/pre-merge-commit.sample
./.git/hooks/fsmonitor-watchman.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/pre-push
./.git/hooks/prepare-commit-msg.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
./.git/hooks/pre-receive.sample
./.git/hooks/applypatch-msg.sample
./.git/hooks/pre-rebase.sample
./.git/hooks/update.sample
./build-aux/with-guile-env.in
./build-aux/test-driver
./build-aux/missing
./build-aux/install-sh
./build-aux/install-sh~
./bootstrap
./scripts/assert-todos.sh
./scripts/songbooks
./scripts/compile-readme.sh
./scripts/ci-build.sh
./scripts/generate-tasks-and-bugs.sh
./scripts/songbooks.in
./scripts/with-container.sh
./scripts/assert-shellcheck.sh
```

This approach has a problem, though: it includes files ignored by Git, such as `build-aux/install-sh~`, and even goes into the `.git/` directory and finds sample hooks in `.git/hooks/*`.

To list the files that Git is tracking we'll try `git ls-files`:

```shell
$ git ls-files | xargs grep -l '^#!/'
TODOs.org
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh
```

It looks to be almost there, but the `TODOs.org` entry shows a flaw in it: grep looks for the `'^#!/'` pattern on any line of the file. In my case, `TODOs.org` had a snippet in the middle of the file where a line started with `#!/bin/sh`.

So what we actually want is to match the **first** line against the pattern. We could loop through each file, get the first line with `head -n 1` and grep against that, but this is starting to look messy. I bet there is another way of doing it concisely...

Let's try Awk.
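
For contrast, the `head -n 1` loop dismissed above might look roughly like the sketch below. The function name `shebang_files` is hypothetical, and the sketch assumes file names contain no newlines. It works, but it starts a `head` and a `grep` process for every file, which is part of why a single Awk invocation is appealing:

```shell
# Hypothetical helper: reads file names from stdin, one per line, and
# prints those whose first line starts with "#!/".
# Assumption: file names contain no newlines.
shebang_files() {
  while IFS= read -r file; do
    # head -n 1 keeps only the first line, so a "#!/bin/sh" snippet in the
    # middle of a file (as in TODOs.org) is not matched.
    if head -n 1 -- "$file" | grep -q '^#!/'; then
      printf '%s\n' "$file"
    fi
  done
}

# Usage, inside a Git repository:
#   git ls-files | shebang_files
```
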
I need a way to select the line numbers to replace `head -n 1`, and to stop processing the file if the pattern matches. A quick search points me to using `FNR` for the former, and `{ nextfile }` for the latter. Let's try it:

```shell
$ git ls-files | xargs awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh
```

Great! `TODOs.org` is gone from the list, and the script is much better: instead of matching against any part of the file that may have a shebang-like line, we only look at the first line.

Let's put it back into the `assert-shellcheck.sh` file and use NUL separators when listing the files, to accommodate files with spaces in the name. Note that Awk still prints one name per line, so the final `xargs` splits its input on whitespace:

```shell
#!/bin/sh -eux
git ls-files -z | \
  xargs -0 awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }' | \
  xargs shellcheck
```

This is where I've stopped, but I imagine a likely improvement: match against only `#!/bin/sh` and `#!/usr/bin/env bash` shebangs (the ones I use most), to avoid running ShellCheck on Perl files or scripts with other shebangs.

Also, when reviewing the text of this article, I found that `{ nextfile }` is a GNU Awk extension. It would be an improvement if `assert-shellcheck.sh` relied only on the POSIX subset of Awk to work correctly.

## *Update*

After publishing, I could remove `{ nextfile }` and even make the script simpler:

```shell
#!/bin/sh -eux
git ls-files -z | \
  xargs -0 awk 'FNR==1 && /^#!\// { print FILENAME }' | \
  xargs shellcheck
```

Now both the shell and Awk usage are POSIX compatible.

[awk-20min]: https://ferd.ca/awk-in-20-minutes.html
[shellcheck]: https://www.shellcheck.net/
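
The improvement suggested before the update, matching only the `#!/bin/sh` and `#!/usr/bin/env bash` shebangs, could be sketched roughly as follows. This is an untested-in-production sketch, not part of the script above: the names `shebang_prog` and `shell_scripts` are hypothetical, and the pattern assumes the shebang is either alone on the first line or followed by a space and flags (such as ` -eux`).

```shell
# Awk program (POSIX subset): print the name of each file whose first line
# is one of the two shebangs, optionally followed by a space and flags.
shebang_prog='FNR == 1 && /^#!(\/bin\/sh|\/usr\/bin\/env bash)( .*)?$/ { print FILENAME }'

# Hypothetical helper: filters its arguments down to those shell scripts.
shell_scripts() {
  [ "$#" -gt 0 ] || return 0   # with no file arguments, awk would read stdin
  awk "$shebang_prog" "$@"
}

# Usage sketch, inside a Git repository with ShellCheck installed:
#   git ls-files -z | xargs -0 awk "$shebang_prog" | xargs shellcheck
```

With this pattern a Perl script starting with `#!/usr/bin/perl` is skipped, and so is any file that only mentions a shebang after its first line.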