title: Grep online repositories
date: 2020-08-28
layout: post
lang: en
ref: grep-online-repositories
eu_categories: git
I often find interesting source code repositories online that I want to grep for some pattern, but I can't, because either:
- the repository is hosted on cgit or a similar web interface that doesn't support searching inside files; or
- the search function is too limited and doesn't let me use regular expressions to search the code.
Here's a simple script that works around that problem:
#!/usr/bin/env bash
set -eu
end="\033[0m"
red="\033[0;31m"
red() { echo -e "${red}${1}${end}"; }
usage() {
    red "Missing argument $1.\n"
    cat <<EOF
Usage:
$0 <REGEX_PATTERN> <REPOSITORY_URL>
Arguments:
REGEX_PATTERN Regular expression that "git grep" can search
REPOSITORY_URL URL address that "git clone" can download the repository from
Examples:
Searching "make get-git" in cgit repository:
git search 'make get-git' https://git.zx2c4.com/cgit/
git search 'make get-git' https://git.zx2c4.com/cgit/ -- \$(git rev-list --all)
EOF
    exit 2
}
REGEX_PATTERN="${1:-}"
REPOSITORY_URL="${2:-}"
[[ -z "${REGEX_PATTERN}" ]] && usage 'REGEX_PATTERN'
[[ -z "${REPOSITORY_URL}" ]] && usage 'REPOSITORY_URL'
mkdir -p /tmp/git-search
DIRNAME="$(echo "${REPOSITORY_URL%/}" | rev | cut -d/ -f1 | rev)"
if [[ ! -d "/tmp/git-search/${DIRNAME}" ]]; then
    git clone "${REPOSITORY_URL}" "/tmp/git-search/${DIRNAME}"
fi
pushd "/tmp/git-search/${DIRNAME}"
shift 3 || shift 2 # when "--" is missing
git grep "${REGEX_PATTERN}" "${@}"
It is a wrapper around git grep that downloads the repository when it is missing.
Save it in a file called git-search, make the file executable and add it to your PATH.
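As a minimal sketch of that installation step, assuming the script was saved as git-search in the current directory and that ~/.local/bin is a directory already on your PATH (both are assumptions, and the install_git_search helper name is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a script into a bin directory and make it executable.
# Both default values are assumptions; adjust them to your own setup.
install_git_search() {
    local src="${1:-./git-search}"
    local bin_dir="${2:-${HOME}/.local/bin}"

    mkdir -p "${bin_dir}"
    cp "${src}" "${bin_dir}/git-search"
    chmod +x "${bin_dir}/git-search"
}

# Example call (commented out so that sourcing this file has no side effects):
# install_git_search ./git-search "${HOME}/.local/bin"
```

Git runs any executable named git-<name> that it finds on the PATH as the subcommand git <name>, which is why the examples invoke the script as git search.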
Overview:
- lines 1~2:
Bash shebang and the set -eu options to exit on errors or undefined variables.
- lines 4~30:
Usage text, printed when fewer arguments than expected are provided.
- line 33:
Extract the repository name from the URL, removing a trailing slash.
- lines 34~37:
Download the repository when it is missing and change to its folder.
- line 39:
Make the variable $@ contain the remaining, unused arguments.
- line 40:
Run git grep, forwarding the remaining arguments from $@.
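The two least obvious steps in that overview, the name extraction and the argument shifting, can be tried in isolation. The sketch below is not part of the script: repo_dirname and forwarded_args are hypothetical helper names, and repo_dirname uses parameter expansion in place of the script's rev and cut pipeline, which should give the same result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the local directory name from a repository URL.
# Intended to match the script's `rev | cut -d/ -f1 | rev` pipeline.
repo_dirname() {
    local url="${1%/}"     # drop one trailing slash, as the script does
    echo "${url##*/}"      # keep what follows the last remaining slash
}

# Hypothetical helper: print, one per line, the arguments that would be
# forwarded to `git grep` after dropping the pattern, the URL and "--".
forwarded_args() {
    shift 3 || shift 2     # "shift 3" fails when "--" is missing, so fall back
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "${@}"
    fi
}

# repo_dirname 'https://git.zx2c4.com/cgit/'     # prints "cgit"
# forwarded_args 'pattern' 'url' -- -i -n        # prints "-i" and "-n"
```

Note that the third argument is dropped whatever its value, so extra git grep options must always come after a "--" separator.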
Example output:
$ git search 'make get-git' https://git.zx2c4.com/cgit/
Cloning into '/tmp/git-search/cgit'...
remote: Enumerating objects: 542, done.
remote: Counting objects: 100% (542/542), done.
remote: Compressing objects: 100% (101/101), done.
warning: object 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob
remote: Total 7063 (delta 496), reused 446 (delta 441), pack-reused 6521
Receiving objects: 100% (7063/7063), 8.69 MiB | 5.39 MiB/s, done.
Resolving deltas: 100% (5047/5047), done.
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git
$ git search 'make get-git' https://git.zx2c4.com/cgit/
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git
Subsequent greps on the same repository are faster because no download is needed.
When no argument is provided, it prints the usage text:
$ git search
Missing argument REGEX_PATTERN.
Usage:
/home/andreh/dev/libre/dotfiles/scripts/ad-hoc/git-search <REGEX_PATTERN> <REPOSITORY_URL>
Arguments:
REGEX_PATTERN Regular expression that "git grep" can search
REPOSITORY_URL URL address that "git clone" can download the repository from
Examples:
Searching "make get-git" in cgit repository:
git search 'make get-git' https://git.zx2c4.com/cgit/
git search 'make get-git' https://git.zx2c4.com/cgit/ -- $(git rev-list --all)
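The last example forwards a list of revisions so that git grep searches every commit instead of only the checked-out tree. One thing to keep in mind is that the calling shell expands $(git rev-list --all) in its own current directory, so the revisions come from whatever repository you happen to be in. A sketch that takes the revision list from the cached clone itself could look like this, assuming the /tmp/git-search/<name> layout used by the script (the grep_all_history helper name is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical helper: grep every commit reachable from any ref of a local clone.
# $1 = pattern, $2 = path to the clone (for example /tmp/git-search/cgit).
grep_all_history() {
    local pattern="${1}"
    local repo="${2}"

    # "git -C" runs the command inside the clone, so the revision list
    # comes from that repository and not from the current directory.
    # The substitution is intentionally unquoted: one argument per commit hash.
    # shellcheck disable=SC2046
    git -C "${repo}" grep -e "${pattern}" $(git -C "${repo}" rev-list --all)
}

# Example call (assumes the clone already exists):
# grep_all_history 'make get-git' /tmp/git-search/cgit
```

Each match is reported with the commit hash in front of the file name. On repositories with a very long history the expanded list of hashes may exceed the system's limit on command-line length, so treat this as a starting point rather than a general solution.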