title: 'Awk snippet: ShellCheck all scripts in a repository'
date: 2020-12-15
updated_at: 2020-12-16
layout: post
lang: en
ref: awk-snippet-shellcheck-all-scripts-in-a-repository
eu_categories: shell
Inspired by Fred Hebert's "Awk in 20 Minutes", here's a problem I just solved with a line of Awk: run ShellCheck on all scripts of a repository.
In my repositories I usually have Bash and POSIX scripts, which I want to keep
tidy with ShellCheck. Here's the first version of assert-shellcheck.sh:
#!/bin/sh -eux
find . -type f -name '*.sh' -print0 | xargs -0 shellcheck
This is the kind of script that I copy into every repository, so I want it to work on any of them without requiring a list of files to run ShellCheck on.
This first version worked fine, as all my scripts had the '.sh' ending. But I
recently added some scripts without any extension, so assert-shellcheck.sh
called for a second version. The first attempt was to try grepping the shebang
line:
$ grep '^#!/' assert-shellcheck.sh
#!/bin/sh -eux
Good, we have a grep pattern on the first try. Let's try to find all the matching files:
$ find . -type f | xargs grep -l '^#!/'
./TODOs.org
./.git/hooks/pre-commit.sample
./.git/hooks/pre-push.sample
./.git/hooks/pre-merge-commit.sample
./.git/hooks/fsmonitor-watchman.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/pre-push
./.git/hooks/prepare-commit-msg.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
./.git/hooks/pre-receive.sample
./.git/hooks/applypatch-msg.sample
./.git/hooks/pre-rebase.sample
./.git/hooks/update.sample
./build-aux/with-guile-env.in
./build-aux/test-driver
./build-aux/missing
./build-aux/install-sh
./build-aux/install-sh~
./bootstrap
./scripts/assert-todos.sh
./scripts/songbooks
./scripts/compile-readme.sh
./scripts/ci-build.sh
./scripts/generate-tasks-and-bugs.sh
./scripts/songbooks.in
./scripts/with-container.sh
./scripts/assert-shellcheck.sh
This approach has a problem, though: it includes files ignored by Git, such as
build-aux/install-sh~, and even goes into the .git/ directory and finds sample
hooks in .git/hooks/*.
To list the files that Git is tracking we'll try git ls-files:
$ git ls-files | xargs grep -l '^#!/'
TODOs.org
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh
It looks to be almost there, but the TODOs.org entry shows a flaw in it: grep
is looking for the '^#!/' pattern on any line of the file. In my case,
TODOs.org had a snippet in the middle of the file where a line started with
#!/bin/sh.
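The flaw is easy to reproduce in a scratch directory. A minimal sketch, where the file names notes.org and run.sh are made up for the demonstration:

```shell
# Scratch directory with two hypothetical files.
demo_dir=$(mktemp -d)

# notes.org only mentions a shebang in the middle of the file...
printf '%s\n' '* Notes' 'Example script:' '#!/bin/sh' 'echo hi' > "$demo_dir/notes.org"
# ...while run.sh really starts with one.
printf '%s\n' '#!/bin/sh' 'echo hi' > "$demo_dir/run.sh"

# grep -l reports a file as soon as any of its lines matches,
# so both files end up in the list.
grep_matches=$(cd "$demo_dir" && grep -l '^#!/' notes.org run.sh)
printf '%s\n' "$grep_matches"

rm -rf "$demo_dir"
```

Both names are printed, even though only run.sh is an actual script.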
So what we actually want is to match the first line against the pattern. We
could loop through each file, get the first line with head -n 1 and grep
against that, but this is starting to look messy. I bet there is another way of
doing it concisely...
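For comparison, the loop approach could look roughly like the sketch below. The function name list_shebang_files is invented here, and because it reads one file name per line it assumes that no file name contains a newline:

```shell
# Hypothetical helper: reads file names from stdin, one per line, and prints
# those whose first line starts with "#!/".
list_shebang_files() {
  while IFS= read -r file; do
    # head -n 1 isolates the first line; grep -q only sets the exit status.
    if head -n 1 "$file" | grep -q '^#!/'; then
      printf '%s\n' "$file"
    fi
  done
}
```

It would be fed with something like `git ls-files | list_shebang_files`. It works, but it spawns two processes per file and takes several lines to say something simple.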
Let's try Awk. I need a way to select lines by number, to replace head -n 1,
and to stop processing the file if the pattern matches. A quick search points me
to using FNR for the former, and { nextfile } for the latter. Let's try it:
$ git ls-files | xargs awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
bootstrap
build-aux/with-guile-env.in
old/scripts/assert-docs-spelling.sh
old/scripts/build-site.sh
old/scripts/builder.bats.sh
scripts/assert-shellcheck.sh
scripts/assert-todos.sh
scripts/ci-build.sh
scripts/compile-readme.sh
scripts/generate-tasks-and-bugs.sh
scripts/songbooks.in
scripts/with-container.sh
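What makes this work is that FNR restarts at 1 for every input file, while NR keeps counting across all of them. A small sketch with two throwaway files (the names a.txt and b.txt are arbitrary):

```shell
demo_dir=$(mktemp -d)
printf '%s\n' 'one' 'two' > "$demo_dir/a.txt"
printf '%s\n' 'three' 'four' > "$demo_dir/b.txt"

# Print the file name, the global record number (NR) and the
# per-file record number (FNR) for every line.
counters=$(cd "$demo_dir" && awk '{ print FILENAME, NR, FNR }' a.txt b.txt)
printf '%s\n' "$counters"
# a.txt 1 1
# a.txt 2 2
# b.txt 3 1
# b.txt 4 2

rm -rf "$demo_dir"
```

With a single Awk process reading many files, a condition on FNR therefore plays the role of head -n 1 for each of them.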
Great! Only TODOs.org is gone from the list, and the script is much better:
instead of matching any line of the file that looks like a shebang, we only
look at the first one. Let's put it back into the assert-shellcheck.sh file and
use NUL for separators to accommodate files with spaces in the name:
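A minimal sketch of that step, assuming the one-liner above is kept as-is and only the separators change. The function name shebang_files is made up for illustration; it reads NUL-separated file names from standard input:

```shell
# Hypothetical helper: reads NUL-separated file names on stdin and prints
# the names of those whose first line starts with "#!/".
# Assumes an Awk implementation that supports nextfile.
shebang_files() {
  xargs -0 awk 'FNR>1 { nextfile } /^#!\// { print FILENAME; nextfile }'
}

# Intended use inside a Git repository with ShellCheck installed:
#   git ls-files -z | shebang_files | xargs shellcheck
```

Only the first xargs benefits from the NUL separators here: Awk prints one name per line, so the final xargs still splits names on whitespace.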
This is where I've stopped, but I imagine a likely improvement: match against
only #!/bin/sh and #!/usr/bin/env bash shebangs (the ones I use most), to
avoid running ShellCheck on Perl files, or other shebangs.
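A sketch of that improvement, not tried against a real repository: the pattern below only accepts first lines that are exactly #!/bin/sh or #!/usr/bin/env bash, optionally followed by flags. The helper name shell_scripts is invented for illustration.

```shell
# Hypothetical helper: reads NUL-separated file names on stdin and prints
# only those whose first line is a /bin/sh or "env bash" shebang.
# The trailing group allows flags such as "-eux" but rejects
# look-alikes such as "#!/bin/shx".
shell_scripts() {
  xargs -0 awk 'FNR == 1 && /^#!(\/bin\/sh|\/usr\/bin\/env bash)([ \t].*)?$/ { print FILENAME }'
}
```

It would be used as `git ls-files -z | shell_scripts | xargs shellcheck`. Supporting another interpreter line means adding one more alternative inside the first group.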
Also, when reviewing the text of this article, I found that { nextfile } is a
GNU Awk extension. It would be an improvement if assert-shellcheck.sh relied
on the POSIX subset of Awk for working correctly.
Update
After publishing, I could remove { nextfile } and even make the script
simpler:
#!/bin/sh -eux
git ls-files -z | \
xargs -0 awk 'FNR==1 && /^#!\// { print FILENAME }' | \
xargs shellcheck
Now both the shell and Awk usage are POSIX compatible.