---
title: Grep online repositories
date: 2020-08-28
layout: post
lang: en
ref: grep-online-repositories
---

I often find interesting source code repositories online that I want to grep for some pattern, but I can't, because either:

- the repository is hosted on [cgit][cgit] or a similar code browser that doesn't support searching inside files; or
- the search function is really bad, and doesn't let me use regular expressions to search for patterns in the code.

[cgit]: https://git.zx2c4.com/cgit/

Here's a simple script that lets you overcome that problem easily:

```shell
#!/usr/bin/env bash
set -eu

end="\033[0m"
red="\033[0;31m"
red() { echo -e "${red}${1}${end}"; }

usage() {
  red "Missing argument $1.\n"
  cat <<EOF
Usage:
    $0 <REGEX_PATTERN> <REPOSITORY_URL>

Arguments:
    REGEX_PATTERN     Regular expression that "git grep" can search
    REPOSITORY_URL    URL address that "git clone" can download the repository from

Examples:
    Searching "make get-git" in cgit repository:
        git search 'make get-git' https://git.zx2c4.com/cgit/
        git search 'make get-git' https://git.zx2c4.com/cgit/ -- \$(git rev-list --all)
EOF
  exit 2
}


REGEX_PATTERN="${1:-}"
REPOSITORY_URL="${2:-}"
[[ -z "${REGEX_PATTERN}" ]] && usage 'REGEX_PATTERN'
[[ -z "${REPOSITORY_URL}" ]] && usage 'REPOSITORY_URL'

mkdir -p /tmp/git-search
DIRNAME="$(echo "${REPOSITORY_URL%/}" | rev | cut -d/ -f1 | rev)"
if [[ ! -d "/tmp/git-search/${DIRNAME}" ]]; then
  git clone "${REPOSITORY_URL}" "/tmp/git-search/${DIRNAME}"
fi
pushd "/tmp/git-search/${DIRNAME}"

shift 3 || shift 2 # when "--" is missing
git grep "${REGEX_PATTERN}" "${@}"
```

It is a wrapper around `git grep` that downloads the repository when it is missing. Save it in a file called `git-search`, make the file executable and add it to your `PATH`. Because Git runs any executable named `git-<name>` found on your `PATH` as the subcommand `git <name>`, the script can then be invoked as `git search`.

Overview:

- *lines 1~2*: Bash shebang and the `set -eu` options to exit on error or undefined variables.
- *lines 4~30*: Color helper, usage text and argument checks. The usage text is printed when fewer arguments than expected are provided.
- *line 33*: Extract the repository name from the URL, dropping a trailing slash if there is one.
- *lines 34~37*: Clone the repository when it is missing and change into its folder.
- *line 39*: Shift the positional parameters so that `$@` holds only the remaining, unused arguments.
- *line 40*: Perform `git grep`, forwarding the remaining arguments from `$@`.

Example output:

```shell
$ git search 'make get-git' https://git.zx2c4.com/cgit/
Cloning into '/tmp/git-search/cgit'...
remote: Enumerating objects: 542, done.
remote: Counting objects: 100% (542/542), done.
remote: Compressing objects: 100% (101/101), done.
warning: object 51dd1eff1edc663674df9ab85d2786a40f7ae3a5: gitmodulesParse: could not parse gitmodules blob
remote: Total 7063 (delta 496), reused 446 (delta 441), pack-reused 6521
Receiving objects: 100% (7063/7063), 8.69 MiB | 5.39 MiB/s, done.
Resolving deltas: 100% (5047/5047), done.
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git

$ git search 'make get-git' https://git.zx2c4.com/cgit/
/tmp/git-search/cgit ~/dev/libre/songbooks/docs
README: $ make get-git
```

Subsequent greps on the same repository are faster because the clone is already cached under `/tmp/git-search` and no download is needed.

When no argument is provided, it prints the usage text:

```shell
$ git search
Missing argument REGEX_PATTERN.

Usage:
    /home/andreh/dev/libre/dotfiles/scripts/ad-hoc/git-search <REGEX_PATTERN> <REPOSITORY_URL>

Arguments:
    REGEX_PATTERN     Regular expression that "git grep" can search
    REPOSITORY_URL    URL address that "git clone" can download the repository from

Examples:
    Searching "make get-git" in cgit repository:
        git search 'make get-git' https://git.zx2c4.com/cgit/
        git search 'make get-git' https://git.zx2c4.com/cgit/ -- $(git rev-list --all)
```
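
The two less obvious steps of the script, deriving the folder name from the URL and deciding which arguments reach `git grep`, can be tried in isolation without cloning anything. The sketch below is only an illustration: `repo_dirname` and `forwarded_args` are hypothetical helper names that are not part of the script, and `repo_dirname` swaps the `rev | cut | rev` pipeline for Bash parameter expansion, which should give the same result for the URLs shown here.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not in the original script): derive the folder name
# from a clone URL, mirroring what the DIRNAME line computes.
repo_dirname() {
  local url="${1%/}"   # drop one trailing slash, if present
  echo "${url##*/}"    # keep what follows the last remaining slash
}

# Hypothetical helper (not in the original script): print, one per line,
# the arguments left over after the "shift 3 || shift 2" step.
forwarded_args() {
  shift 3 2>/dev/null || shift 2   # fall back when there are only two arguments
  if (( $# > 0 )); then
    printf '%s\n' "$@"
  fi
}

repo_dirname 'https://git.zx2c4.com/cgit/'   # prints: cgit
repo_dirname 'https://git.zx2c4.com/cgit'    # prints: cgit

# Prints "-i" and "README" on separate lines:
forwarded_args 'make get-git' 'https://git.zx2c4.com/cgit/' -- -i README
```

One thing this makes visible: `shift 3` discards the third argument whatever it is, so extra `git grep` arguments are only forwarded intact when they come after a `--` separator, as in the second usage example.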