How can I recursively search file contents by ANDed patterns and print the output like ag/the silver searcher would?
This will be easier to explain via example. Here are my input files:
file1:
x
x
a
b
c
x
x
file2:
x
x
c
b
a
x
x
file3:
x
x
x x a b c x x
x
x
file4:
x
x
x x c b a x x
x
x
file5:
x
x
a b
x
x
file6:
x
x
x x b c x x
x
x
file7:
x
x
x x b b x x
x
x
I want to search for files that have regexp `a` AND `c`. This will return files 1-4.
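For reference, the AND condition on its own (ignoring the `ag`-style output for now) can be sketched with standard tools. This is only an illustration under assumptions: it uses plain `grep` as a stand-in for `ag` so that it runs anywhere, it recreates the seven sample files in a temporary directory, `files_matching_all` is a hypothetical helper name, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# Recreate the seven sample files in a scratch directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
printf '%s\n' x x a b c x x > file1
printf '%s\n' x x c b a x x > file2
printf '%s\n' x x 'x x a b c x x' x x > file3
printf '%s\n' x x 'x x c b a x x' x x > file4
printf '%s\n' x x 'a b' x x > file5
printf '%s\n' x x 'x x b c x x' x x > file6
printf '%s\n' x x 'x x b b x x' x x > file7

# files_matching_all REGEXP... (hypothetical helper)
# Prints the files under the current directory that match EVERY regexp.
# The first grep lists the candidates; each later grep keeps only the
# candidates that also match the next regexp.
files_matching_all() {
    first=$1
    shift
    list=$(grep -rlE -- "$first" . | sort)
    for re in "$@"; do
        [ -n "$list" ] || break
        list=$(printf '%s\n' "$list" | tr '\n' '\0' |
               xargs -0 grep -lE -- "$re" | sort)
    done
    if [ -n "$list" ]; then
        printf '%s\n' "$list"
    fi
}

files_matching_all a c
```

With the sample files this lists `./file1` through `./file4`. It only produces the file list; the highlighted, context-aware output is a separate step.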
I want the output to look as close as possible to the output of `ag -C --pager="less -R" regexp`. Here's what that would look like (I am surrounding results with angle brackets to represent color highlighting):
file1:
2: x
3: <a>
4: b
5: <c>
6: x
file2:
2: x
3: <c>
4: b
5: <a>
6: x
file3:
2: x
3: x x <a> b <c> x x
4: x
file4:
2: x
3: x x <c> b <a> x x
4: x
Or maybe `ag` would print it like this:
file1:
2: x
3: <a>
4: b
--
4: b
5: <c>
6: x
I'm not sure, but this detail doesn't matter to me. Here's what does:
1. The highlighting
1. The relative path of the file above the results
1. The features `less` provides for navigation and search
1. It can find multiple regexps ANDed together
1. The `-C` option still exists on the command line
The line numbers are a nice to have, but not necessary.
---
I've tried many many things, and this is the closest I've gotten:
## Step 1: Precompile a file list of each individual regexp
for x in "${array_of_regexp_file_names[@]}"; do
r=${x%.txt}  # remove .txt from the end
ag -il "$r" | sort > "$x" &  # sorted list of FILES matching this single regexp
done
wait  # block until every background search has finished
This gives a list of files for each regexp, sorted by filename.
## Step 2: Use ag to search the intersection of 2+ file lists
I will break down the following:
ag -C 1 --pager="less -R" "regexp1|regexp2" $(comm -12 regexp1.txt regexp2.txt)
### $(comm -12 regexp1.txt regexp2.txt)
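This command finds the intersection of the two file lists, that is, the files listed in both `regexp1.txt` and `regexp2.txt`. (The original post illustrated this with a picture in which the intersection is shown in red.)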
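As a small illustration of what `comm -12` does, here is a sketch with two made-up lists that mirror the sample files (regexp `a` matching files 1-5, regexp `c` matching files 1-4 and 6). The directory and variable names are hypothetical.

```shell
#!/bin/sh
# Two hypothetical per-regexp file lists, as Step 1 would produce them.
work_dir=$(mktemp -d)
printf '%s\n' file1 file2 file3 file4 file5 | sort > "$work_dir/regexp1.txt"
printf '%s\n' file1 file2 file3 file4 file6 | sort > "$work_dir/regexp2.txt"

# -1 suppresses lines unique to the first list and -2 suppresses lines
# unique to the second, so only lines present in BOTH lists remain.
# comm requires sorted input, which is why Step 1 pipes through sort.
intersection=$(comm -12 "$work_dir/regexp1.txt" "$work_dir/regexp2.txt")
printf '%s\n' "$intersection"
```

This prints `file1` through `file4`, one per line.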

### ag -C 1 --pager="less -R" "regexp1|regexp2" ...
Here, I am giving `ag` a regexp that I *know* exists in every file in that intersection. That may seem redundant, but I'm doing it because I want those words highlighted in the output. It makes my life 1000x easier.
Here's the problem: that intersection contains so many files that running the command gives me this output:
zsh: argument list too long: ag
Other than that, my workaround works. I've tested this by running a command like this:
ag -C 1 --pager="less -R" "regexp1|regexp2" $(comm -12 regexp1.txt regexp2.txt | head -10)
The problem is that the intersecting list is so long it doesn't fit on the command line. If `ag` provided an option to pass a list of files to search, I could get past this, but it doesn't have that functionality.
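The "argument list too long" error comes from the operating system's limit on the total size of a command's arguments rather than from `ag` itself. A common way around such limits is to stream the list through `xargs`, which splits it across as many invocations as needed. The sketch below is an illustration under assumptions, not a tested `ag` solution: it uses `grep` as a stand-in for `ag` so it runs anywhere, the sample files and lists are made up, `search_intersection` is a hypothetical helper, and file names are assumed not to contain newlines. How `ag` handles colors, headings, and its own `--pager` when invoked this way is not verified here.

```shell
#!/bin/sh
# Scratch data: two sample files plus per-regexp lists like those of Step 1.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
printf '%s\n' x x a b c x x > file1
printf '%s\n' x x 'a b' x x > file5
printf '%s\n' file1 file5 | sort > regexp1.txt   # files matching "a"
printf '%s\n' file1       | sort > regexp2.txt   # files matching "c"

# search_intersection (hypothetical helper)
# Streams the intersection to the searcher instead of expanding it on the
# command line; xargs starts another grep whenever the arguments would
# exceed the system limit. -H keeps the file name prefix even when an
# invocation receives a single file.
search_intersection() {
    comm -12 regexp1.txt regexp2.txt |
        tr '\n' '\0' |
        xargs -0 grep -H -n -C 1 -E --color=always -- 'a|c'
}

# Interactive use would pipe everything into one pager:
#   search_intersection | less -R
search_intersection
```

Piping the combined result into a single `less -R` at the end, rather than starting a pager per invocation, keeps the navigation and search features of `less` across all batches.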
Regardless, I'm hoping I don't need to: I'm assuming there's a much easier solution to this problem; I just don't know what it is.
---
Edit: To solidify some highlighting rules, here are some other examples:
### Example 1
Regexes:
regex1 = a.
regex2 = .b
Input:
file8:
x
x
abc
x
x
Output:
2: x
3: <ab>c
4: x
### Example 2
Regexes:
regex1 = foo
regex2 = oba
Input:
file9:
x
x
foobar
x
x
Output:
2: x
3: <foo>bar
4: x
I picked these outputs because that's what grep and ag already do, but I'm pretty ambivalent about both of these scenarios, so if these examples are challenging to implement, I don't mind if highlighting works differently in these edge cases; in general, *my* regexps won't overlap.
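The overlap behavior of an alternation can be checked quickly with `grep -o`, which prints each matched span (the part that would be highlighted) on its own line. This is a small sketch using `grep` only; the variable names are hypothetical, and `ag` is not exercised here.

```shell
#!/bin/sh
# -o prints only the matched spans, one per line.
example1=$(printf 'abc\n' | grep -oE 'a.|.b')
example2=$(printf 'foobar\n' | grep -oE 'foo|oba')
printf 'Example 1 highlights: %s\n' "$example1"
printf 'Example 2 highlights: %s\n' "$example2"
```

This reports `ab` for Example 1 and `foo` for Example 2: scanning resumes after the end of each match, so the second regexp never gets a chance to match the overlapping text.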
Asked by Daniel Kaplan
(1070 rep)
Mar 10, 2022, 10:19 AM
Last activity: Mar 11, 2022, 06:06 AM