
title: 'Awk snippet: ShellCheck all scripts in a repository'

date: 2020-12-15

updated_at: 2020-12-16

layout: post

lang: en

ref: awk-snippet-shellcheck-all-scripts-in-a-repository

eu_categories: shell


Inspired by Fred Hebert's "Awk in 20 Minutes", here's a problem I just solved with a line of Awk: run ShellCheck on all scripts of a repository.

In my repositories I usually have Bash and POSIX scripts, which I want to keep tidy with ShellCheck. Here's the first version of assert-shellcheck.sh:

#!/bin/sh -eux

find . -type f -name '*.sh' -print0 | xargs -0 shellcheck

This is the type of script that I copy around to all repositories, and I want it to be capable of working on any repository, without requiring a list of files to run ShellCheck on.

This first version worked fine, as all my scripts had the '.sh' ending. But I recently added some scripts without any extension, so assert-shellcheck.sh called for a second version. The first attempt was to try grepping the shebang line:

$ grep '^#!/' assert-shellcheck.sh
#!/bin/sh -eux

Good, we have a grep pattern on the first try. Let's try to find all the matching files:

$ find . -type f | xargs grep -l '^#!/'
./TODOs.org
./.git/hooks/pre-commit.sample
./.git/hooks/pre-push.sample
./.git/hooks/pre-merge-commit.sample
./.git/hooks/fsmonitor-watchman.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/pre-push
./.git/hooks/prepare-commit-msg.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
./.git/hooks/pre-receive.sample
./.git/hooks/applypatch-msg.sample
./.git/hooks/pre-rebase.sample
./.git/hooks/update.sample
./build-aux/with-guile-env.in
./build-aux/test-driver
./build-aux/missing
./build-aux/install-sh
./build-aux/install-sh~
./bootstrap
./scripts/assert-todos.sh
./scripts/songbooks
./scripts/compile-readme.sh
./scripts/ci-build.sh
./scripts/generate-tasks-and-bugs.sh
./scripts/songbooks.in
./scripts/with-container.sh
./scripts/assert-shellcheck.sh

This approach has a problem, though: it includes files ignored by Git, such as build-aux/install-sh~, and even goes into the .git/ directory and finds sample hooks in .git/hooks/*.

To list the files that Git is tracking we'll try git ls-files:

$ git ls-files | xargs grep -l '^#!/'
TODOs.org
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh

It looks to be almost there, but the TODOs.org entry shows a flaw: grep looks for the '^#!/' pattern on every line of the file, not just the first. In my case, TODOs.org had a snippet in the middle of the file where a line started with #!/bin/sh.

So what we actually want is to match the first line against the pattern. We could loop through each file, get the first line with head -n 1 and grep against that, but this is starting to look messy. I bet there is another way of doing it concisely...
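For comparison, the head -n 1 loop mentioned above could look roughly like the sketch below. The function name first_line_shebang is hypothetical (it is not part of assert-shellcheck.sh), and the sketch assumes file names are passed as arguments.

```shell
# Print the name of every given file whose FIRST line starts with "#!/".
# first_line_shebang is a hypothetical helper name used only for illustration.
first_line_shebang() {
  for f in "$@"; do
    # head -n 1 keeps only the first line; grep -q tests it without printing.
    if head -n 1 -- "$f" | grep -q '^#!/'; then
      printf '%s\n' "$f"
    fi
  done
}
```

It works, but it spawns two processes per file and needs a wrapper to be fed from a file list, which is the messiness referred to above.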

Let's try Awk. I need a way to select lines by number to replace head -n 1, and a way to stop processing the file once the pattern matches. A quick search points me to using FNR for the former, and { nextfile } for the latter. Let's try it:

$ git ls-files | xargs awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh

Great! TODOs.org is no longer listed, and the script is much better: instead of matching any line of the file that looks like a shebang, we only look at the first one. Let's put it back into the assert-shellcheck.sh file and use NUL separators to accommodate files with spaces in the name:

#!/bin/sh -eux

git ls-files -z | \
  xargs -0 awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }' | \
  xargs shellcheck

This is where I've stopped, but I imagine a likely improvement: match against only #!/bin/sh and #!/usr/bin/env bash shebangs (the ones I use most), to avoid running ShellCheck on Perl files, or other shebangs.
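A minimal sketch of that improvement, keeping the same structure as the Awk program above and only tightening the pattern. The function name list_shell_scripts is hypothetical, and the regular expression is my assumption of what "only #!/bin/sh and #!/usr/bin/env bash" would mean: the interpreter path, optionally followed by a space and arguments such as -eux.

```shell
# Print the names of files whose first line is a #!/bin/sh or
# #!/usr/bin/env bash shebang, optionally followed by arguments.
# list_shell_scripts is a hypothetical name, not part of the original script.
list_shell_scripts() {
  awk 'FNR > 1 { nextfile }
       /^#!\/(bin\/sh|usr\/bin\/env bash)( .*)?$/ { print FILENAME; nextfile }' "$@"
}
```

With this pattern a Perl script starting with #!/usr/bin/perl is skipped, and so is something like #!/bin/shx, since the interpreter must be followed by a space or the end of the line.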

Also, when reviewing the text of this article, I found that { nextfile } is a GNU Awk extension. It would be an improvement if assert-shellcheck.sh relied only on the POSIX subset of Awk to work correctly.

Update

After publishing, I realized I could remove { nextfile } altogether and make the script even simpler:

#!/bin/sh -eux

git ls-files -z | \
  xargs -0 awk 'FNR==1 && /^#!\// { print FILENAME }' | \
  xargs shellcheck

Now both the shell and Awk usage are POSIX compatible.
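One caveat, as far as I can tell: Awk prints one file name per line, and the last xargs splits its input on any whitespace, so a name containing spaces would still be broken apart at that final step. A possible workaround, sketched below, is to translate the newlines to NUL bytes and use xargs -0 there too. The function name check_shebang_files is hypothetical, the sketch assumes an xargs that supports -0 (the script already relies on it), and names containing newlines are still not handled.

```shell
# Read a NUL-separated list of file names on stdin, keep those whose first
# line starts with "#!/", and run the given command on them.
# check_shebang_files is a hypothetical name used only for illustration.
check_shebang_files() {
  xargs -0 awk 'FNR == 1 && /^#!\// { print FILENAME }' |
    tr '\n' '\0' |   # awk emits one name per line; turn that into NUL-separated
    xargs -0 "$@"
}
```

Inside assert-shellcheck.sh it would be used as: git ls-files -z | check_shebang_files shellcheck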