---
title: Grep online repositories
date: 2020-08-28
layout: til
lang: en
ref: grep-online-repositories
---
I often find interesting source code repositories online that I want to grep for
some pattern, but I can't, because either:

- the repository is hosted on [cgit][cgit] or a similar web frontend that
  doesn't support searching inside files; or
- the built-in search is too limited and doesn't accept regular expressions.

[cgit]: https://git.zx2c4.com/cgit/

Here's a simple script that works around that problem:

```shell
#!/usr/bin/env bash
set -eu

end="\033[0m"
red="\033[0;31m"
red() { echo -e "${red}${1}${end}"; }

usage() {
  red "Missing argument $1.\n"
  cat <<EOF
Usage:
  $0 <REGEX_PATTERN> <REPOSITORY_URL>

Arguments:
  REGEX_PATTERN     Regular expression that "git grep" can search
  REPOSITORY_URL    URL address that "git clone" can download the repository from

Examples:
  Searching "make get-git" in cgit repository:
    git search 'make get-git' https://git.zx2c4.com/cgit/
    git search 'make get-git' https://git.zx2c4.com/cgit/ -- \$(git rev-list --all)
EOF
  exit 2
}

REGEX_PATTERN="${1:-}"
REPOSITORY_URL="${2:-}"

[[ -z "${REGEX_PATTERN}" ]] && usage 'REGEX_PATTERN'
[[ -z "${REPOSITORY_URL}" ]] && usage 'REPOSITORY_URL'

mkdir -p /tmp/git-search
DIRNAME="$(echo "${REPOSITORY_URL%/}" | rev | cut -d/ -f1 | rev)"
if [[ ! -d "/tmp/git-search/${DIRNAME}" ]]; then
  git clone "${REPOSITORY_URL}" "/tmp/git-search/${DIRNAME}"
fi
pushd "/tmp/git-search/${DIRNAME}"

shift 2; [[ "${1:-}" == "--" ]] && shift  # drop the optional "--" separator
git grep "${REGEX_PATTERN}" "${@}"
```
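
It is a wrapper around `git grep` that clones the repository first when it has
not been downloaded yet. Save it in a file called `git-search`, make the file
executable, and put it in a directory on your `PATH`, so that Git picks it up as
the `git search` subcommand.
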
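
Git treats any executable named `git-<name>` found on `PATH` as the subcommand `git <name>`, which is why the script is invoked as `git search`. A minimal sketch of that mechanism, using a throwaway `git-hello` command in a temporary directory (both are hypothetical names, chosen only for illustration):

```shell
# Hypothetical location; any directory on PATH works the same way.
bin_dir="$(mktemp -d)"

# A stand-in for git-search: an executable whose name starts with "git-".
cat > "${bin_dir}/git-hello" <<'EOF'
#!/usr/bin/env bash
echo "hello from a custom subcommand"
EOF
chmod +x "${bin_dir}/git-hello"

# Once the directory is on PATH, git dispatches "git hello" to the script.
PATH="${bin_dir}:${PATH}"
hello_output="$(git hello)"
echo "${hello_output}"
```
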
Overview:

- *lines 1~2*:
  Bash shebang and the `set -eu` options to exit on error or undefined
  variables.
- *lines 4~30*:
  Usage text, printed when fewer arguments than expected are provided.
- *line 33*:
  Extract the repository name from the URL, ignoring a trailing slash.
- *lines 34~37*:
  Download the repository when missing and go to the folder.
- *line 39*:
  Drop the pattern, the URL and the optional `--` separator, so that `$@` holds
  only the remaining arguments.
- *line 40*:
  Perform `git grep`, forwarding the remaining arguments from `$@`.

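
The name extraction described for line 33 can be tried on its own. The sketch below runs the same `rev | cut | rev` pipeline next to a pure-Bash parameter expansion that gives the same result. The variable names are made up for the example, and the expansion is an alternative, not what the script uses.

```shell
url="https://git.zx2c4.com/cgit/"

# "%/" strips a single trailing slash, if there is one.
trimmed="${url%/}"

# Same pipeline as the script: reverse the string, take the first
# "/"-separated field, then reverse it back.
via_pipeline="$(echo "${trimmed}" | rev | cut -d/ -f1 | rev)"

# Alternative: "##*/" removes everything up to and including the last slash.
via_expansion="${trimmed##*/}"

echo "${via_pipeline} ${via_expansion}"   # prints "cgit cgit"
```
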
Example output:
```shell
$ git search 'make get-git' https://git.zx2c4.com/cgit/
Cloning into '/tmp/git-search/cgit'...
remote: Enumerating objects: 542, done.
remote: Counting objects: 100% (542/542), done.
remote: Compressing objects: 100% (101/101), done.
warning: object 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob
remote: Total 7063 (delta 496), reused 446 (delta 441), pack-reused 6521
Receiving objects: 100% (7063/7063), 8.69 MiB | 5.39 MiB/s, done.
Resolving deltas: 100% (5047/5047), done.
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git
$ git search 'make get-git' https://git.zx2c4.com/cgit/
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git
```

Subsequent searches of the same repository are faster because the clone under
`/tmp/git-search` is reused and nothing is downloaded again.

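
One consequence of reusing the clone is that it is never refreshed, so results can lag behind the upstream repository. The sketch below reproduces this with two throwaway local repositories standing in for the remote and the cached clone (the paths and the commit identity are made up for the demo). Running `git pull --ff-only` inside the cached directory is one way to refresh it; the script itself does not do this.

```shell
work="$(mktemp -d)"

# A local repository plays the role of the remote (assumption for the demo).
git init --quiet "${work}/origin"
echo "first line" > "${work}/origin/README"
git -C "${work}/origin" add README
git -C "${work}/origin" -c user.name=demo -c user.email=demo@example.com \
  commit --quiet -m "initial"

# First search: the clone is created, like the one under /tmp/git-search.
git clone --quiet "${work}/origin" "${work}/cache"

# Upstream moves on after the clone was cached.
echo "make get-git" >> "${work}/origin/README"
git -C "${work}/origin" -c user.name=demo -c user.email=demo@example.com \
  commit --quiet -am "add line"

# The cached clone does not see the new line until it is refreshed.
stale="$(git -C "${work}/cache" grep -c 'make get-git' || true)"
git -C "${work}/cache" pull --quiet --ff-only
fresh="$(git -C "${work}/cache" grep -c 'make get-git')"

echo "before pull: '${stale}', after pull: '${fresh}'"
```
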
When no argument is provided, it prints the usage text:
```shell
$ git search
Missing argument REGEX_PATTERN.

Usage:
  /home/andreh/dev/libre/dotfiles/scripts/ad-hoc/git-search <REGEX_PATTERN> <REPOSITORY_URL>

Arguments:
  REGEX_PATTERN     Regular expression that "git grep" can search
  REPOSITORY_URL    URL address that "git clone" can download the repository from

Examples:
  Searching "make get-git" in cgit repository:
    git search 'make get-git' https://git.zx2c4.com/cgit/
    git search 'make get-git' https://git.zx2c4.com/cgit/ -- $(git rev-list --all)
```
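
The second usage example forwards `$(git rev-list --all)` to `git grep`, which makes it search every commit instead of only the current checkout. A small sketch of the difference, run against a throwaway local repository (the repository, its contents and the commit identity are invented for the demo):

```shell
repo="$(mktemp -d)"
git init --quiet "${repo}"

# Helper for the demo: commit with a fixed, made-up identity.
commit() {
  git -C "${repo}" -c user.name=demo -c user.email=demo@example.com \
    commit --quiet "$@"
}

echo "make get-git" > "${repo}/README"
git -C "${repo}" add README
commit -m "add instructions"

echo "nothing to see" > "${repo}/README"
commit -am "remove instructions"

# Searching only the current checkout finds nothing...
current="$(git -C "${repo}" grep -c 'make get-git' || true)"

# ...while passing every commit as a tree to search finds the old version.
# The command substitution is left unquoted on purpose: one argument per commit.
in_history="$(git -C "${repo}" grep -c 'make get-git' \
  $(git -C "${repo}" rev-list --all) | wc -l)"

echo "current: '${current}', commits with a match: ${in_history}"
```

Each match in the history search is prefixed with the commit it was found in, so the output grows quickly on large repositories.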