Remembered the words but forgot the files? Ask Ripgrep.
Ripgrep searches files fast and comes with a rich set of features.
I often have to find a few words among tens, hundreds, or maybe thousands of files in one folder, and I’m not sure whether the words are in subfolders or not.
So I need a helper, and that helper is “Ripgrep”.
Ripgrep
Ripgrep is a command-line tool built for searching through files. There are claims[1] that Ripgrep is faster than many other grep tools.
Below is the GitHub repo of the project.
So far I am satisfied with its speed and with how it displays results: the output is concise, readable, and highlighted in color.
By default, Ripgrep treats the pattern we supply as a regular expression, and it skips hidden files as well as files excluded by “.gitignore”. We can add flags for whole-word matches, for listing file names only, and for including hidden files.
Basic use
Assuming Ripgrep is installed on our computer, we can start searching like this.
rg "<regex>" # current path
rg "<regex>" . # current path as . (dot)
This simple command uses Ripgrep to search that text as a regular expression in all files in the current directory and subdirectories.
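To try this without touching real files, here is a minimal sketch that builds a throwaway folder and searches it. The folder layout, the file names, and the pattern `Bos\w+` are all made up for illustration.

```shell
# Build a throwaway folder with one nested file (all names are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/notes/2023"
printf 'Flight to Boston\n' > "$demo/trip.txt"
printf 'Boston office address\n' > "$demo/notes/2023/work.md"
printf 'Nothing to see here\n' > "$demo/other.txt"

cd "$demo"
# Search the current directory and every subdirectory for the regex.
found=$(rg 'Bos\w+' .)
echo "$found"
```

Both `trip.txt` and the nested `work.md` should show up in the output, while `other.txt` should not. The order of the result lines may vary between runs.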
Find text in hidden directories
We can use the flag --hidden or -. to search in hidden files and folders, like this.
rg --hidden "<regex>" .
rg -. "<regex>" .
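The difference is easy to see in a small experiment. This is only a sketch: the `.config` folder and the file names are assumptions chosen for the demo.

```shell
# One visible file and one file inside a hidden folder (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/.config"
printf 'city = Boston\n' > "$demo/.config/settings.ini"
printf 'Boston\n' > "$demo/visible.txt"

cd "$demo"
default_hits=$(rg 'Boston' .)            # the hidden folder is skipped
hidden_hits=$(rg --hidden 'Boston' .)    # the hidden folder is searched too
echo "default:"; echo "$default_hits"
echo "with --hidden:"; echo "$hidden_hits"
```

Only the second search should report the match inside `.config/settings.ini`.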
List only files
Then we can use the flag -l to list only the names of the files that contain the text.
rg -l "<regex>" . # only files
rg -l. "<regex>" . # only files including hidden files
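As a quick check of what -l prints, the sketch below uses made-up files where one file matches twice. It spells out --hidden instead of the short form, in case an older Ripgrep version does not know `-.`.

```shell
# a.txt matches twice, b.txt never, .secret is a hidden file (hypothetical names).
demo=$(mktemp -d)
printf 'Boston\nBoston again\n' > "$demo/a.txt"
printf 'Chicago\n' > "$demo/b.txt"
printf 'Boston\n' > "$demo/.secret"

cd "$demo"
files=$(rg -l 'Boston' . | sort)               # file names only, each listed once
all_files=$(rg -l --hidden 'Boston' . | sort)  # same, including hidden files
echo "$files"
echo "$all_files"
```

Each matching file is listed once, no matter how many lines match inside it. The `sort` is only there to make the order stable.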
Exact match
If we want to match the text as a whole word, we can use the flag --word-regexp or -w.
rg --word-regexp "<exact_text>" .
rg -w "<exact_text>" .
For example, if the files contain the word "Boston", searching for "Boston" finds it, but "oston" can’t be found because it’s not a word, it’s just a substring.
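The "Boston" versus "oston" behaviour can be reproduced with a one-file sketch. The file name and sentence are invented for the demo.

```shell
# A single file containing the word "Boston" (hypothetical content).
demo=$(mktemp -d)
printf 'I moved to Boston last year\n' > "$demo/city.txt"
cd "$demo"

substring_hit=$(rg 'oston' .)     # plain regex: matches inside "Boston"
word_hit=$(rg -w 'Boston' .)      # whole word: matches

# Whole-word search for a mere substring finds nothing; rg then exits non-zero.
if rg -w 'oston' . > /dev/null; then
  word_miss="found"
else
  word_miss="not found"
fi
echo "rg -w oston: $word_miss"
```

Ripgrep returns a non-zero exit status when nothing matches, which is what the `if` relies on.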
Ripgrep all
rga, or “ripgrep all”, is a command that searches a broader range of file types than rg, such as .pdf, .docx, etc., while rg searches text-based files such as .md, .js, .py, etc.
rga adopts the command-line options of rg and adds some more options for searching in more file types.
Search in DOCXs
Say we have Microsoft Word files. If we use plain rg, we get no results. But with rga, we can get results.
rga "<regex>" .
Search in PDFs
As with DOCX files, we also use rga on PDF files. It displays page numbers as well.
rga "<regex>" .
Search a set of words (in order)
We know that the regex "regex_1|regex_2" means that we are looking for regex_1 or regex_2, and "regex_1.*regex_2" means that we are looking for regex_1 followed by regex_2 on the same line, in that order.
But how can we search for regex_1 and regex_2 in the same file but not on the same line?
We add the flag --multiline to tell it to search across multiple lines. Because . does not match new-line characters by default, we usually also need to adjust the regex with (?s) or add the flag --multiline-dotall so that .* can span several lines.[2]
rga --multiline "<regex_1>.*<regex_2>" .
rga -U "<regex_1>.*<regex_2>" . # same as --multiline
rga --multiline "(?s)<regex_1>.*<regex_2>" . # handle new-line characters
rga -U "(?s)<regex_1>.*<regex_2>" . # same as --multiline, handle new-line characters
rga --multiline --multiline-dotall "<regex_1>.*<regex_2>" . # handle new-line characters
rga -U --multiline-dotall "<regex_1>.*<regex_2>" . # same as --multiline, handle new-line characters
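To see why the new-line handling matters, here is a small sketch. It uses rg rather than rga, on the assumption from the text above that both accept the same flags and because plain-text files are enough here. The words "alpha" and "beta" and the file names are placeholders.

```shell
# One file has both words on one line, the other has them on separate lines.
demo=$(mktemp -d)
printf 'alpha then beta\n' > "$demo/one_line.txt"
printf 'alpha\nsome filler\nbeta\n' > "$demo/two_lines.txt"
cd "$demo"

same_line=$(rg -l 'alpha.*beta' . | sort)        # line-by-line search
no_dotall=$(rg -l -U 'alpha.*beta' . | sort)     # multiline, but "." still stops at new lines
dotall=$(rg -l -U '(?s)alpha.*beta' . | sort)    # "." now matches new lines too
dotall_flag=$(rg -l -U --multiline-dotall 'alpha.*beta' . | sort)

echo "same line:          $same_line"
echo "-U only:            $no_dotall"
echo "-U with (?s):       $dotall"
echo "--multiline-dotall: $dotall_flag"
```

Only the last two searches should list `two_lines.txt`, since the first two never let `.*` cross a line break.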
Search a set of words (regardless of order)
Then how do we search for several words regardless of their order?
We need a trick to do so.
rga "<regex_2>" $(rga "<regex_1>" -l .)
This means that we first find the files containing regex_1 and then search for regex_2 in those files.
The output shows only the matches for regex_2, so we need to make sure that regex_1 is correct.
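One caveat with the command substitution is that file names containing spaces get split into several arguments. A possible variant, sketched here with rg and xargs instead of rga, passes the file list NUL-separated. The words "alpha" and "beta" and the file names are placeholders.

```shell
# Three files: only one contains both words, and its name has a space on purpose.
demo=$(mktemp -d)
printf 'beta first\nalpha later\n' > "$demo/both words.txt"
printf 'alpha only\n' > "$demo/alpha.txt"
printf 'beta only\n' > "$demo/beta.txt"
cd "$demo"

# Stage 1 lists files containing "alpha", each path followed by a NUL byte (--null).
# Stage 2 searches only those files for "beta"; -H always prints the file name.
hits=$(rg -l --null 'alpha' . | xargs -0 rg -H 'beta')
echo "$hits"
```

The order of the two words inside the file does not matter, and `beta.txt` is never searched because it fails stage 1. If stage 1 finds nothing, GNU xargs still runs stage 2 once without any paths, so its -r option (or a check that the list is not empty) is worth considering.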