Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
2
votes
2
answers
6540
views
Grep /var/log/maillog for email to a certain user, based only on his linux username
I have a learning environment, based on Linux CentOS, with Postfix and SquirrelMail running, but my assignment is more in general.
I need to find in the maillog e-mails **received by** a certain user within a certain time frame, based **only** on his Linux username.
I see my maillog, but I am not experienced in reading maillog and I have two concerns:
1. Whether or not these patterns that I see in the log are reliable, i.e. whether a log entry for an incoming e-mail will always have `to=` in it.
Jan 2 20:31:17 tmcent01 postfix/local: B58C4330038: to=, orig_to=, relay=local, delay=9.7, delays=9.6/0.03/0/0.02, dsn=2.0.0, status=sent (delivered to mailbox)
2. How does a Linux username correspond to the user's e-mail name? It is not always a match (username@domain), is it? We could have an alias for it; how can I take this into consideration when composing the regex for the grep?
My first two attempts were a strike-out.
sudo grep "to=
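(A sketch of one possible approach, not from the original post: it assumes Postfix logs local deliveries with `to=<user@...>` / `orig_to=<user@...>` fields as in the sample line above, and uses a placeholder username `someuser`; aliases would still need to be resolved separately, e.g. against /etc/aliases.)

```shell
# match local Postfix deliveries to a given account, via to= or orig_to=;
# "someuser" and the log path are placeholders
maillog=/var/log/maillog
grep -E 'postfix/local.*(to|orig_to)=<someuser@' "$maillog"
```

A time frame can then be narrowed by prefiltering on the syslog date prefix, e.g. `grep '^Jan  2 2[01]:'`.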
pmihova
(21 rep)
Jul 29, 2015, 07:19 AM
• Last activity: Aug 7, 2025, 01:07 AM
8
votes
3
answers
15160
views
extract lines that match a list of words in another file
I have file 1, which has these lines:
ATM 1434.972183
BMPR2 10762.78192
BMPR2 10762.78192
BMPR2 1469.14535
BMPR2 1469.14535
BMPR2 1738.479639
BMS1 4907.841667
BMS1 4907.841667
BMS1 880.4532628
BMS1 880.4532628
BMS1P17 1249.75
BMS1P17 1249.75
BMS1P17 1606.821429
BMS1P17 1606.821429
BMS1P17 1666.333333
BMS1P17 1666.333333
BMS1P17 2108.460317
BMS1P17 2108
And file 2 have a list of words:
ATM
BMS1
So, the output will be like this:
ATM 1434.972183
BMS1 4907.841667
BMS1 4907.841667
BMS1 880.4532628
BMS1 880.4532628
I know it's really a duplicate question, but I tried all kinds of `grep` and `sed` and `awk`. Maybe they work for you with this tiny example, but I have a very large file (> 1M lines), and none of the previous approaches helped: they return lines where a word from file 2 matches only part of the first field (for example, `BMS1` also matching the `BMS1P17` lines).
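(Not from the original post: a sketch of the usual fix. The substring problem comes from treating the words as loose patterns; comparing the whole first field instead keeps `BMS1` from also hitting `BMS1P17`.)

```shell
# read file2 into an array, then print only lines of file1 whose
# first field is exactly one of those words
awk 'NR == FNR { words[$1]; next } $1 in words' file2 file1
```

`grep -wFf file2 file1` is close, but `-w` still matches a word anywhere on the line; the awk version anchors the comparison to the first column and stays fast on multi-million-line files.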
LamaMo
(223 rep)
Jul 25, 2018, 06:59 PM
• Last activity: Aug 6, 2025, 07:52 PM
28
votes
1
answers
23518
views
tar --list — only the top-level files and folders
I understand that `tar --list --file=x` will list all files and folders. I am looking to list just the top-level data.
Does anyone know how to do that?
Alternatively, does anyone know how to list only the top-level files, but all folders including subfolders? Maybe with grep somehow?
I'm after something that works on most *nix flavors, including macOS.
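(A sketch, not from the original post: it assumes the archive members were stored without a leading `./` prefix and that directories are listed with a trailing slash, as both GNU tar and bsdtar do.)

```shell
# top level only: keep entries with no slash (files) or nothing after
# the slash (directories, listed as "dir/")
tar --list --file=archive.tar | grep -v '/.'

# alternative: top-level files plus ALL directories, including subfolders
tar --list --file=archive.tar | grep -e '/$' -e '^[^/]*$'
```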
Alexander Mills
(10734 rep)
Dec 22, 2018, 03:16 AM
• Last activity: Aug 2, 2025, 06:02 PM
-3
votes
5
answers
2547
views
Extract numbers using grep command
I have the following file example:
some text is here
sometext(1,21);
sometext(2,9);
sometext(3,231);
sometext(10,1112);
sometext(11,17)
Some text is here
I'm trying to extract the second number in the parentheses of the lines containing `sometext`, so in the above example, the numbers 21, 9, 231, 1112, 17.
I didn't find a suitable `grep` command for the above pattern.
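(A sketch, not from the original post. The first form assumes GNU grep built with PCRE support; the second is a portable sed equivalent.)

```shell
# -P enables PCRE; \K discards everything matched so far, so -o
# prints only the second number
grep -oP 'sometext\([0-9]+,\K[0-9]+' file

# portable alternative: capture the second number and print it
sed -n 's/.*sometext([0-9]*,\([0-9]*\)).*/\1/p' file
```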
Pramod Sharma
(1 rep)
Sep 24, 2021, 04:35 PM
• Last activity: Aug 2, 2025, 08:54 AM
101
votes
13
answers
188287
views
How to print all lines after a match up to the end of the file?
Input file1 is:
dog 123 4335
cat 13123 23424
deer 2131 213132
bear 2313 21313
I give the pattern to match from another file (like `dog 123 4335` from file2). I match the line `dog 123 4335` and then print all lines after it, without the matched line itself, so my output is:
cat 13123 23424
deer 2131 213132
bear 2313 21313
Using only the pattern, without a line address (such as `1s`), how can I match and print the lines?
loganaayahee
(1209 rep)
Nov 23, 2012, 07:17 AM
• Last activity: Aug 2, 2025, 07:26 AM
8
votes
1
answers
522
views
Is there any way to see the string that was matched in grep?
**I'm not talking about the -o option.**
[Posix](https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03) says:
> The search for a matching sequence starts at the beginning of a string and stops when the **first sequence matching the expression is found**, where "first" is defined to mean "begins earliest in the string". If the pattern permits a variable number of matching characters and thus there is more than one such sequence starting at that point, the **longest such sequence is matched**. For example, the BRE "bb*" matches the second to fourth characters of the string "abbbc", and the ERE "(wee|week)(knights|night)" matches all ten characters of the string "weeknights".
And I want to **verify** what is being said in posix and this tutorial [regTutorialSite](https://www.regular-expressions.info/posix.html) :
> A POSIX-compliant engine will still find the **leftmost match**. If you **apply** Set|SetValue to Set or SetValue **once**, it **will match Set**.
How do I "apply once"?
When I run **grep -o**, the result is two strings, Set and SetValue, not just the one leftmost match. That is, I read about one thing, but in practice I get something else. So how can I see which string was matched by the regex?
(Perhaps the question was formulated incorrectly or could have been better.)
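(Not part of the original question: a small experiment, assuming GNU grep built with PCRE support, that makes the difference visible. POSIX matching (`-E`) picks the longest match at the leftmost starting position, while PCRE (`-P`) stops at the first alternative that succeeds.)

```shell
printf 'SetValue\n' | grep -oE 'Set|SetValue'   # POSIX: leftmost-longest
printf 'SetValue\n' | grep -oP 'Set|SetValue'   # PCRE: first alternative wins
```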
Mark
(99 rep)
Aug 1, 2025, 12:59 PM
• Last activity: Aug 2, 2025, 07:19 AM
214
votes
7
answers
755346
views
Return only the portion of a line after a matching pattern
So pulling open a file with `cat` and then using `grep` to get matching lines only gets me so far when I am working with the particular log set I am dealing with. I need a way to match lines to a pattern, but to return only the portion of the line after the match. The portion before and after the match will consistently vary. I have played with `sed` and `awk`, but have not been able to figure out how to either delete the part before the match or return just the part after the match; either will work.
This is an example of a line that I need to filter:
2011-11-07T05:37:43-08:00 isi-udb5-ash4-1(id1) /boot/kernel.amd64/kernel: [gmp_info.c:1758](pid 40370="kt: gmp-drive-updat")(tid=100872) new group: : { 1:0-25,27-34,37-38, 2:0-33,35-36, 3:0-35, 4:0-9,11-14,16-32,34-38, 5:0-35, 6:0-15,17-36, 7:0-16,18-36, 8:0-14,16-32,34-36, 9:0-10,12-36, 10-11:0-35, 12:0-5,7-30,32-35, 13-19:0-35, 20:0,2-35, down: 8:15, soft_failed: 1:27, 8:15, stalled: 12:6,31, 20:1 }
The portion I need is everything after "stalled".
The background behind this is that I can find out how often something stalls:
cat messages | grep stalled | wc -l
What I need to do is find out how many times a certain node has stalled (indicated by the portion before each colon after "stalled"). If I just grep for that (i.e. `20:`) it may return lines that have soft fails but no stalls, which doesn't help me. I need to filter out only the stalled portion so I can then grep for a specific node out of those that have stalled.
For all intents and purposes, this is a FreeBSD system with standard GNU core utils, but I cannot install anything extra to assist.
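(A sketch, not from the original post, assuming the `stalled: ` label appears at most once per line, as in the sample:)

```shell
# keep only what follows "stalled: " on lines that contain it
sed -n 's/.*stalled: //p' messages

# counting stalls for one node, e.g. node 20, then becomes
sed -n 's/.*stalled: //p' messages | grep -c '20:'
```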
MaQleod
(2734 rep)
Nov 7, 2011, 11:18 PM
• Last activity: Aug 1, 2025, 11:13 AM
3
votes
1
answers
328
views
How to do non-greedy multiline capture with recent versions of pcre2grep?
I noticed a difference in behavior between an older `pcre2grep` version (10.22) and a more recent one (10.42), and I am wondering how I can get the old behavior back.
Take the following file:
aaa
bbb
XXX
ccc
ddd
eee
XXX
fff
ggg
Back with v10.22 (Debian 9), I could achieve non-greedy multi-line captures:
$ pcre2grep --version
pcre2grep version 10.22 2016-07-29
$ pcre2grep -nM '(.|\n)*?XXX' file
1:aaa
bbb
XXX
4:ccc
ddd
eee
XXX
Notice how it captured two multi-line groups, one starting at line 1 (`1:aaa`), and a second starting at line 4 (`4:ccc`).
Now, with a more recent version (10.42, Debian 12), its behaviour changed:
$ pcre2grep --version
pcre2grep version 10.42 2022-12-11
$ pcre2grep -nM '(.|\n)*?XXX' file
1:aaa
bbb
XXX
ccc
ddd
eee
XXX
Now I only have one group, starting with `1:aaa`. Basically, it seems to ignore the non-greedy operator (`?`). The result is the same if I omit it:
$ pcre2grep -nM '(.|\n)*XXX' file
1:aaa
bbb
XXX
ccc
ddd
eee
XXX
How can I get the behavior of v10.22 back? In other words, how can I do non-greedy multiline captures in recent versions of `pcre2grep`?
ChennyStar
(1969 rep)
Jul 27, 2025, 12:42 PM
• Last activity: Jul 29, 2025, 02:43 PM
23
votes
6
answers
21514
views
Command-line tool to search docx files
Is there a command-line tool to text-search a docx file? I tried `grep`, but it doesn't work with docx even though it works fine with txt and xml files. I could convert the docx to txt first, but I'd prefer a tool that operates directly on docx files. I need the tool to work under Cygwin.
OP edit: Later I found out that the easiest way is indeed to convert the docx files to txt and then grep over them.
RoundPi
(331 rep)
Jan 6, 2012, 04:25 PM
• Last activity: Jul 29, 2025, 01:51 PM
2
votes
4
answers
136
views
Grep (BRE) on surrounding delimiters w/o consuming the delimiter? Counting delimiter-separated strings between filename and extension
I have a dataset of images labeled/classified by characteristics, where an image can have more than one label. I want to count how many of each identifier I have. A toy dataset is created below, with different colors being the labels.
bballdave025@MY-MACHINE /home/bballdave025/toy
$ touch elephant_grey.jpg && touch zebra_white_black.jpg && touch rubik-s_cube_-_1977-first-prod_by_ErnoRubik-_red_orange_yellow_white_blue_green.jpg && touch Radio_Hotel.Washington_Heights.NYC-USA_green_yellow_blue_red_orange_reddish-purple_teal_grey-brown.jpg && touch Big_Bird__yellow_orange_red.jpg
Let's make it more easily visible. The files in the initially labeled dataset are shown below. (The `| awk -F'/' '{print $NF}'` is just meant to take off the `./` or `path/to/where/the/jpegs/are/` that would otherwise be before the filename.)
$ find . -type f | awk -F'/' '{print $NF}'
Big_Bird__yellow_orange_red.jpg
elephant_grey.jpg
Radio_Hotel.Washington_Heights.NYC-USA_green_yellow_blue_red_orange_reddish-purple_teal_grey-brown.jpg
rubik-s_cube_-_1977-first-prod_by_ErnoRubik-_red_orange_yellow_white_blue_green.jpg
zebra_white_black.jpg
Those are the filenames for labeled versions of the images. The corresponding originals are below:
$ find ../toy_orig_bak/ -type f | awk -F'/' '{print $NF}'
Big_Bird_.jpg
elephant.jpg
Radio_Hotel.Washington_Heights.NYC-USA.jpg
rubik-s_cube_-_1977-first-prod_by_ErnoRubik-.jpg
zebra.jpg
This is to show that the color labels are inserted between the filename and the dot extension. They are separated from each other and from the original filename by a (delimiting) `_` character. (There are rules for the label names and for the filenames \[1\].) The only allowed color strings at this initial point are any of {black, white, grey, red, orange, yellow, green, blue, reddish-purple, teal, grey-brown}.
I further want to show that other labels may be added, as long as they're part of my controlled vocabulary, something which can be changed only by me. Imagine a file named `rainbow.jpg` gets put in with the original filenames (`touch ../toy_orig_bak/rainbow.jpg`, for those of you following along for reproducibility). I decide that I want to add `indigo` and `violet` to my controlled vocabulary list, so I can create the labeled filename,
$ touch rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
Desired Output
Again, I want a count of each of the labels. For the dataset I've set up (including that last labeled picture of a rainbow), the correct output would be
1 black
3 blue
3 green
1 grey
1 grey-brown
1 indigo
4 orange
4 red
1 reddish-purple
1 teal
1 violet
2 white
4 yellow
(The counts were performed somewhat manually, due to my `grep` confusion.)
Attempts and a note on the details of the solution I want
Research below
My first thought (although I did worry about delimiter consumption) was to look at the surrounding delimiters: `_` before and `_` or `.` after.
Here's my first `grep` attempt
find . -type f -iname "*.jpg" | \
\
grep -o "[_]\(black\|white\|grey\|red\|orange\|yellow\|green\|blue\|"\
"reddish-purple\|teal\|grey-brown\|indigo\|violet\)[_.]" | \
\
tr -d [_.] | sort | uniq -c
and its output
3 blue
1 green
1 grey
1 orange
3 red
1 teal
1 violet
1 white
3 yellow
Which is not the same as before. Here's the comparison.
Before | Now
-----------------------|---------------------
1 black |
3 blue | 3 blue
3 green | 1 green
1 grey | 1 grey
1 grey-brown |
1 indigo |
4 orange | 1 orange
4 red | 3 red
1 reddish-purple |
1 teal | 1 teal
1 violet | 1 violet
2 white | 1 white
4 yellow | 3 yellow
|
I know this is happening because the regex engine consumes the second delimiter \[2\].
Here is the crux of my main question: (I do want to solve my count problem, and I'll talk about some solutions I've researched and considered myself, but) the detail I want to know is about truly regular expressions and consuming the delimiter.
I want to get a count of each identifier string, and I'm wondering if I can do it with this approach and (POSIX) Basic Regular Expressions (BRE; see Note \[2\] and the [reddit thread](https://www.reddit.com/r/askscience/comments/5rttyo/do_extended_regular_expressions_still_denote_the/) ([archived as a gist](https://gist.github.com/bballdave025/b2f7a190907146151696eed394079a64))), specifically with `grep`.
Any of `sed`, `awk`, `IFS` with `read`, etc. are welcome, too. I'm sure someone has a way to solve this problem with Perl (dermis and feline can be divorced by manifold methods), and I'd be glad to get that one, too.
Basically, I am absolutely okay with other solutions to the task of getting a count of each identifier string. However, if it's true that there's no way of stepping back the engine with a Basic Regular Expression engine (that's truly regular), I want to know. I've thought of zero-width matches, lookaheads, and look-behinds, but I don't know how these play out in POSIX Basic Regular Expressions or in mathematically/grammatically regular language parsers.
One thing I realize I wasn't taking into account
The point of the rules (see note \[2\]) was to allow the regex to take advantage of the fact that we should be able to assure ourselves that we're only getting the part of the classified filename with labels, as we only allow one of a finite set of strings preceded by an underscore and followed by either an underscore or a dot, with the dot only happening before the file extension. (I guess we can't be absolutely certain, as the original, pre-labeled filename could have one of the labels immediately preceding the dot - something like a_sunburn_that_is_bright_red.jpg
, but that's something for which I check and correct by adding a specific non-label string before the dot and extension.)
My regex, imagining that it could get past the delimiter being consumed, would still allow the following example problems
the_new_red_car_-_1989_red_black_silver.jpg
- would return {red, red, silver} as is,
- {red, red, black, silver} if working without consuming the 2nd '_',
- whereas {red, black, silver} is desired
parrot_at_blue_gold_banquet_-_a_black_tie_affair_yellow_red_green.jpg
- would return {blue, black, yellow, green} as is,
- {blue, gold, black, yellow, green} if not consuming the 2nd '_',
- whereas {yellow, red, green} is desired
Extra points for answers and discussions that take that into account. ; )
Research and ideas
There are a few discussions on different StackExchange sites, like [this one](https://web.archive.org/web/20230925145242/https://stackoverflow.com/questions/63821591/how-to-split-a-string-by-underscore-and-extract-an-element-as-a-variable-in-bash) , [that one](https://web.archive.org/web/20250602171231/https://stackoverflow.com/questions/49784912/regex-of-underscore-delimited-string) , [another one](https://web.archive.org/web/20250602171046/https://unix.stackexchange.com/questions/267677/add-quotes-and-new-delimiter-around-space-delimited-words) , but I think the [Unix & Linux discussion here](https://unix.stackexchange.com/a/334551/291375) ([archived](https://web.archive.org/web/20230324152449/https://unix.stackexchange.com/questions/334549/how-do-i-extract-multiple-strings-which-are-comma-delimited-from-a-log-file)) is the best one. I think that one of the approaches in this answer from @terdon ♦ or in the answer with hashes – from @Sobrique – might be useful.
I keep thinking that some version of `^.*\([_][]\)\+[.]jpg$` might be key to the situation, but I haven't been able to put together that solution today. If you know how it can help, you're welcome to give an answer using it; I'm going to wait for a fresh brain tomorrow morning.
Edit: @ilkkachu successfully used this idea.
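(Not a BRE answer, but a sketch of the counting task itself, walking the label fields from the end of the filename so no delimiter is ever consumed twice. It assumes GNU find's `-printf` and the controlled vocabulary listed above, and field 1 is always treated as part of the original filename; note it would still count the trailing `_red` in a name like `a_sunburn_that_is_bright_red.jpg`, matching the caveat discussed above.)

```shell
# count the trailing run of known labels on each *.jpg filename
find . -type f -name '*.jpg' -printf '%f\n' |
awk -F_ '
BEGIN {
    n = split("black white grey red orange yellow green blue " \
              "reddish-purple teal grey-brown indigo violet", a, " ")
    for (i = 1; i <= n; i++) label[a[i]]
}
{
    sub(/\.jpg$/, "", $NF)
    # walk fields from the end while they are known labels
    for (i = NF; i > 1 && $i in label; i--)
        count[$i]++
}
END { for (c in count) print count[c], c }' | sort -k2
```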
Why am I doing this? I'm training a CNN to recognize different occurrences (not colors) in pictures of old and often handwritten books. I want to make sure the classes are balanced as I want. Also, I'll compare this with another method that doesn't look at the delimiter, to make sure I don't have any problems like a `_yllow` (instead of `_yellow`) or a `_whiteorange` (instead of `_white_orange`). Most of the labels are put on through a Java program I've put together, but I've given a little leeway for people to change the filenames themselves in case of multiple labels for one file. Having given that permission, I have the responsibility of verifying legal labeled filenames.
Notes
\[1\] The rules for the identifying/classifying labels are:
The identifiers can be any of a finite set of strings which can contain only characters in `[A-Za-z0-9-]`, but not underscores.
The bare filenames (without dot and extension) can consist of any ASCII characters except: 1) non-printable/control characters; 2) spaces or tabs; OR 3) any of `[!"#$%&/)(\]\[}{*?]`. See the next paragraph for the real 3). (Note that this means the bare filenames CAN have an underscore, `_`, or even several of them.)
Edit: I had my no-no list of characters as is now crossed (struck) out above when @ilkkachu gave the accepted answer. One option of that answer makes excellent use of the `@`, which was then not in the excluded character group, but which I actually don't allow in my filenames. There are other omissions in the original character group. As I actually want it, the above paragraph should be amended with the following.
3) any of
'[] ~@#$%^&|/)(}{[*?>\
Edit: Now this compiles as a BRE. (This was the simplest and most-readable BRE I could come up with.)
that beautifully crazy character group means that any of { `[`, `]`, space, `~`, `@`, `#`, `$`, `%`, `^`, `&`, `|`, `\`, `/`, `)`, `(`, `}`, `{`, `[`, `*`, `?`, `>`, `` ` `` } is not allowed; neither is any tab (`\t`, ...), nor any non-printing/control characters. Some of these are already standard on the no-no list for filenames on different OSs, but I give _my_ complete set (when I'm in charge of creating the filenames).
\[2\] Here is what I mean by the delimiter being consumed. I'll do my best to illustrate an example with our (Basic) Reg(ular)Ex(pression),
"[_]\(black\|white\|grey\|red\|orange\|yellow\|green\|blue\|"\
"reddish-purple\|teal\|grey-brown\|indigo\|violet\)[_.]"
Here goes.
Some of the color strings are missed because the regex engine consumes the second delimiter.
For example, using `O` to denote part of a miss (non-match) and `X` to denote part of a hit (match), with `YYYYY` denoting a complete match for the whole regex pattern, we get the following behavior.
Engine goes along looking for '_'
engine is here
|
v
rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
OOOOOOO
Matches `[_]` with '_'
engine is at
|
v
rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
OOOOOOOX
Matches `\(...\|red\|...\)` with 'red'
engine is at
|
v
rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
OOOOOOOXXXX
Matches `[_.]` with '_'
engine is at
|
v
rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
OOOOOOOXXXXX
We have a whole match!
rainbow_red_orange_yellow_green_blue_indigo_violet.jpg
OOOOOOOYYYYY
Given the -o flag, the engine outputs
'_red_'
The 'tr -d [_.]' takes off the surrounding underscores,
and our output line becomes
'red'
The problem now is that the engine cannot go back to
find the '_' before 'orange', or at least it can't do
so using any process I know about from my admittedly
imperfect knowledge of Basic Regular Expressions. As far
as a REGULAR expression engine, using a REGULAR grammar and
a REGULAR language parser knows, the whole universe in which
it's searching now consists of
orange_yellow_green_blue_indigo_violet.jpg
(I don't know if this statement is correct from a mathematical/formal-language point of view, and I'd be interested to know.)
And the process continues as from the first, beginning with Engine goes along looking for '_'
orange_yellow_green_blue_indigo_violet.jpg
OOOOOOXXXXXXXX
Match!
orange_yellow_green_blue_indigo_violet.jpg
OOOOOOYYYYYYYY
Engine spits out '_yellow_' which is 'tr -d [_.]'-ed
Engine cannot go back, so its search universe is now
green_blue_indigo_violet.jpg
and we continue with
green_blue_indigo_violet.jpg
OOOOOXXXXXX
Match!
green_blue_indigo_violet.jpg
OOOOOYYYYYY
Engine spits out '_blue_', which is 'tr -d [_.]'-ed, and the
search universe shrinks again, to
indigo_violet.jpg
OOOOOOYYYYYYYY
That last match being on the '.' from [_.]
More formally, I want to know if it can be done with a real regular expression, i.e. one which can define a regular language and whose language is a context-free language, cf. Wikipedia's Regex article (archived). I think this is the same as a POSIX regular expression, but I'm not sure.
Refs. [A] (archived), [B] (archived), [C] (archived),
Dang it, I know there's a missing ending parenthesis up there in the text, somewhere, because I noticed it and went up to fix it. When I got up into the text, I couldn't remember the context of the parenthesis, so it's still there, just mocking me. I found it, and I bolded it! I'll probably take the bold formatting and this note down, soon, but I'm sharing my happiness right now.
bballdave025
(418 rep)
Jun 3, 2025, 04:56 AM
• Last activity: Jul 25, 2025, 03:50 AM
0
votes
1
answers
723
views
grep behaviour is different when run using bash -c '...'
I met an interesting issue while working with this code from Stack Overflow: tripleee's answer on "How to check if a file contains only zeros in a Linux shell?"
Why does the same bash code produce different result depending on interactive shell or subshell?
Make an all-zeros file with the name `your_file`.
$ truncate -s 1K your_file
Interactive shell example
$ tr -d '\0'
The same code but using subshell
$ bash -c 'tr -d '\0'
And also an interesting fact: I changed the original example by adding the `-a` option ("equivalent to --binary-files=text"), because without this option the interactive shell works but the subshell does not:
$ bash -c 'tr -d '\0'
P.S. I use bash 5.2.37(1)-release from Ubuntu 25.04
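(Not from the truncated commands above, but one classic cause of interactive-vs-`bash -c` differences is quoting: in `bash -c 'tr -d '\0''`, the outer single quotes end right before the backslash, so `\0` is processed by the *outer* shell and the inner `tr` receives a literal `0` instead of the NUL escape. A minimal sketch of that pitfall:)

```shell
# the outer quotes end before \0: the inner command is really  tr -d 0
printf 'a0b\n' | bash -c 'tr -d '\0''

# keeping \0 inside the quoted string passes the NUL escape to tr
printf 'a0b\n' | bash -c "tr -d '\0'"
```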
Андрей Тернити
(303 rep)
Jul 21, 2025, 01:34 PM
• Last activity: Jul 24, 2025, 04:33 AM
3
votes
2
answers
543
views
Grep command with the side effect of adding a trailing newline character in the last line of file
I've been doing some research on how to correctly read lines from a file whose last line may not have a trailing newline character. Have found the answer in [Read a line-oriented file which may not end with a newline](https://unix.stackexchange.com/questions/418060/read-a-line-oriented-file-which-may-not-end-with-a-newline) .
However, I have a second goal, which is to exclude the comments at the beginning of lines, and I have found a [`grep`](http://man7.org/linux/man-pages/man1/grep.1.html) command that achieves the goal
$ grep -v '^ *#' file
But I have noticed that this command has a (for me unexpected) side behavior: it adds a trailing newline character to the last line if it does not have one
$ cat file
# This is a commentary
aaaaaa
# This is another commentary
bbbbbb
cccccc
$ od -c file
0000000 # T h i s i s a c o m m
0000020 e n t a r y \n a a a a a a \n #
0000040 T h i s i s a n o t h e r
0000060 c o m m e n t a r y \n b b b b b
0000100 b \n c c c c c c \n
0000111
$ truncate -s -1 file
$ od -c file
0000000 # T h i s i s a c o m m
0000020 e n t a r y \n a a a a a a \n #
0000040 T h i s i s a n o t h e r
0000060 c o m m e n t a r y \n b b b b b
0000100 b \n c c c c c c
0000110
$ od -c <(grep -v '^ *#' file)
0000000 a a a a a a \n b b b b b b \n c c
0000020 c c c c \n
0000025
Notice that besides removing the line beginning comments it also adds a trailing newline character in the last line.
How could that be?
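(Not from the original post: this is grep's line-oriented model at work. POSIX defines a text line as ending in a newline; grep still reads the incomplete last line, but always writes each selected line with a terminating newline. A quick byte count shows it:)

```shell
printf 'abc' | wc -c           # 3 bytes: no trailing newline
printf 'abc' | grep . | wc -c  # 4 bytes: grep terminated the line
```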
Paulo Tomé
(3832 rep)
Jan 17, 2020, 06:00 PM
• Last activity: Jul 18, 2025, 07:13 AM
1
votes
2
answers
93
views
How to extract a sub-heading as string which is above a search for word
I'm new to Bash and I've been self-taught. I think I'm learning well, but I do have staggering gaps in my base knowledge. So sorry if this is woefully simple bbuuuttt...
Essentially, I need to sift through a large amount of data and pull out specific phrases. I've been making slow and steady progress, but I'm now stuck on getting a heading for a line of data.
Here's what the file looks like:
A lot (AND I MEAN A LOT) of data above
STATE 1:
133a -> 135a : 0.010884 (c= -0.10432445)
134a -> 135a : 0.933650 (c= -0.96625573)
STATE 2:
129a -> 135a : 0.016601 (c= -0.12884659)
130a -> 135a : 0.896059 (c= -0.94660402)
130a -> 136a : 0.011423 (c= 0.10687638)
130a -> 137a : 0.023884 (c= -0.15454429)
130a -> 138a : 0.020361 (c= -0.14269354)
STATE 3:
133a -> 135a : 0.899436 (c= -0.94838591)
134a -> 136a : 0.012334 (c= -0.11106052)
STATE 4:
129a -> 135a : 0.688049 (c= -0.82948703)
129a -> 136a : 0.212819 (c= -0.46132295)
129a -> 137a : 0.036987 (c= 0.19231930)
130a -> 135a : 0.011990 (c= 0.10949722)
134a -> 135a : 0.922010 (c= -0.98192034)
There are many more states (up to 30) of varying length below, which may also include what I'm looking for.
And then more data below that
I have got the numbers I am looking for saved in variables (134 and 135 for this example), and I can use:
grep "${a}a -> ${b}a" File.Name;
to show me the lines that have 134 -> 135 on, but I need the STATE that they are in.
I've tried using grep to look above the found lines to the nearest line with STATE in it, but I couldn't figure out how to set the length of -B as a condition rather than a number (I don't know if it can be done). I have also tried with awk and sed to find the line with STATE and check whether 134 -> 135 is beneath it before the next STATE, but I couldn't find a way to stop and not print at the next STATE, rather than continuing until it found the next 134 -> 135. The ideal output (for the above example) would be:
STATE 1
STATE 4
but
STATE 1:
133a -> 135a : 0.010884 (c= -0.10432445)
134a -> 135a : 0.933650 (c= -0.96625573)
STATE 4:
129a -> 135a : 0.688049 (c= -0.82948703)
129a -> 136a : 0.212819 (c= -0.46132295)
129a -> 137a : 0.036987 (c= 0.19231930)
130a -> 135a : 0.011990 (c= 0.10949722)
134a -> 135a : 0.922010 (c= -0.98192034)
is also absolutely fine. I just need it to spit out the correct STATEs and no others; it doesn't really matter what other data comes with it.
Also, this is going to be applied to about 40 other files with similar layouts, so I need it not to be specific to this one (i.e. not grepping for STATE 1 and STATE 4 literally).
I'm hoping someone can help me or tell me if this is impossible to do.
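(A sketch, not from the original post: awk can simply remember the most recent STATE heading and print it whenever the wanted transition appears below it. It assumes STATE headings start at the beginning of a line; `a` and `b` stand in for the question's `${a}` and `${b}` variables.)

```shell
awk -v a=134 -v b=135 '
    /^STATE [0-9]+:/     { state = $0 }    # remember the current heading
    $0 ~ a "a -> " b "a" { print state }   # print it on a matching line
' File.Name
```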
TC575
(13 rep)
Jul 11, 2025, 08:37 PM
• Last activity: Jul 16, 2025, 09:53 AM
4
votes
1
answers
355
views
Is there a way to set up default values for grep options such as `--exclude-dir`?
I often have to use the
--exclude-dir
to exclude various folders such as .git
from the search path.
grep -r --exclude-dir=.git --exclude-dir=another_path XXX
Then, the command line becomes quite lengthy every time.
*Is there a way to make the --exclude-dir
options the default, so that grep -r XXX
is equivalent to above?*
(This is with Ubuntu 24.04 LTS).
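GNU grep deprecated and later removed the GREP_OPTIONS environment variable, so the usual route is a shell alias or function in ~/.bashrc. A minimal sketch ("another_path" is the question's own placeholder):

```shell
# Wrap grep so the excludes are always applied; `command` avoids recursion.
grep() {
  command grep --exclude-dir=.git --exclude-dir=another_path "$@"
}
# Now a plain `grep -r XXX` skips .git and another_path automatically.
```

A function (rather than an alias) also works in scripts that source the file and passes any extra flags through untouched.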
tinlyx
(1072 rep)
Jul 10, 2025, 08:55 AM
• Last activity: Jul 10, 2025, 12:29 PM
0
votes
0
answers
26
views
How can I grep the output of ffprobe?
I'd like to only see those lines containing "Stream #" from an ffprobe output. But whatever I do, it continues to show the whole output. Neither "|" nor ">" pipes work. What’s the magic? Thanks a lot! ffprobe -v info video.mp4 | grep "Stream #"
I'd like to see only the lines containing "Stream #" from an ffprobe output, but whatever I do, it continues to show the whole output. Neither the "|" pipe nor ">" redirection works. What's the magic? Thanks a lot!
ffprobe -v info video.mp4 | grep "Stream #"
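The catch is that ffprobe writes its informational output to stderr, not stdout, so a plain pipe hands grep nothing; merging stderr into the pipe with 2>&1 fixes it (ffprobe -v info video.mp4 2>&1 | grep "Stream #"). A self-contained demo of the principle, where the hypothetical emit function stands in for ffprobe:

```shell
# emit, like ffprobe, writes everything to stderr.
emit() { printf 'Stream #0:0: Video: h264\n' >&2; printf 'Duration: 1s\n' >&2; }

emit 2>&1 | grep "Stream #"   # stderr merged into the pipe, so grep can filter it
```

Without the 2>&1, the grep sees an empty stdout and the messages go straight to the terminal, which is why the output looked unfiltered.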
Gary U.U. Unixuser
(339 rep)
Jun 28, 2025, 11:16 AM
• Last activity: Jun 28, 2025, 12:51 PM
0
votes
1
answers
97
views
How to match exact string?
I tried this grep -rn "application_config_project" . I got many application_config_project_name = f"{app_acronym}-application-config" github_application_config_repo = application_github_organization.get_repository(application_config_project_name) f"Initializing Application Config Project {applicatio...
I tried this
grep -rn "application_config_project" .
I got many matches like
application_config_project_name = f"{app_acronym}-application-config"
github_application_config_repo = application_github_organization.get_repository(application_config_project_name)
f"Initializing Application Config Project {application_config_project_name}"
How can I restrict my search to just application_config_project?
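grep's -w flag anchors the pattern at word boundaries, so identifiers that merely contain it (such as application_config_project_name, where the following "_" counts as a word character) no longer match. A small sketch with a hypothetical demo file:

```shell
# Two lines: one exact identifier, one longer identifier containing it.
printf 'application_config_project = 1\napplication_config_project_name = 2\n' > demo.py

# -w keeps only the whole-word hit; -rn as in the question.
grep -rnw "application_config_project" demo.py
```

The equivalent without -w is wrapping the pattern in \b word-boundary anchors.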
MJoao
(47 rep)
Jun 27, 2025, 08:14 AM
• Last activity: Jun 28, 2025, 01:48 AM
11
votes
4
answers
10714
views
What is wrong with using "\t" to grep for tab-separated values?
I have a .tsv file (values separated by tabs) with four values. So each line should have only three tabs and some text around each tab like this: value value2 value3 value4 But it looks that some lines are broken (there is more than three tabs). I need to find out these lines. --- I came up with fol...
I have a .tsv file (values separated by tabs) with four values. So each line should have only three tabs and some text around each tab like this:
value value2 value3 value4
But it looks like some lines are broken (there are more than three tabs on them). I need to find these lines.
---
I came up with the following grep pattern.
grep -v "^[^\t]+\t[^\t]+\t[^\t]+\t[^\t]+$"
My thinking:
- first ^ matches the beginning
- [^\t]+ matches one or more "no tab" characters
- \t matches single tab character
- $ matches end
Then I just put it in the right order the correct number of times. That should match the correct lines, so I inverted it with the -v option to get the wrong lines.
But with the -v option it matches every line in the file, including some random text I tried that doesn't have any tabs in it.
**What is my mistake please?**
EDIT: I am using debian and bash.
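The root cause is that grep's default BRE dialect treats \t as a literal "t" and + as a literal "+", so the pattern can never match a real line. With GNU grep (as on Debian), -P switches to Perl-compatible syntax where both work as intended. A sketch with a hypothetical demo file, one good line and one broken one:

```shell
# Good line: 4 fields, 3 tabs. Bad line: 5 fields, 4 tabs.
printf 'ok1\tok2\tok3\tok4\nbad\tb2\tb3\tb4\tb5\n' > demo.tsv

# -P makes \t and + behave as PCRE; -v then yields only the broken lines.
grep -vP '^[^\t]+\t[^\t]+\t[^\t]+\t[^\t]+$' demo.tsv
```

A more portable alternative is -E with a literal tab character (typed as Ctrl-V Tab, or via "$(printf '\t')") in place of each \t.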
TGar
(307 rep)
Aug 16, 2022, 11:47 AM
• Last activity: Jun 27, 2025, 08:47 AM
6
votes
5
answers
2245
views
How to make grep for a regex that appear multiple times in a line
I want to grep a regex. The pattern I am searching for may appear multiple times in a line. If the pattern appeared many times, I want to separate each occurrence by a comma and print **the match only** not the full line in a new file. If it did not appear in a line I want to print **n.a.** Example....
I want to grep a regex. The pattern I am searching for may appear multiple times in a line. If the pattern appears many times, I want to separate each occurrence with a comma and print **the match only**, not the full line, to a new file. If it does not appear in a line, I want to print **n.a.**
Example: I want to use this regex to find numbers in patterns like [12.123.1.3].
grep -oh "\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\]" 'filename'
input file (input.txt)
blabla [11.335.2.33] xyuoeretrete [43.22.11.88] jfdfjkfbs [55.66.77.88]
blabla [66.223.44.33]
foo bar
blabla [1.2.33.3] xyuoeretrete bla[1.32.2.4]
intended result in a new file (output.csv):
11.335.2.33,43.22.11.88,55.66.77.88
66.223.44.33
n.a.
1.2.33.3,1.32.2.4
**Note: I use Ubuntu**
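One sketch using GNU grep's -P mode (available on Ubuntu): \K and a lookahead drop the brackets from each match, paste joins the per-line matches with commas, and empty results fall back to n.a. The input file is rebuilt from the question's sample:

```shell
cat > input.txt <<'EOF'
blabla [11.335.2.33] xyuoeretrete [43.22.11.88] jfdfjkfbs [55.66.77.88]
blabla [66.223.44.33]
foo bar
blabla [1.2.33.3] xyuoeretrete bla[1.32.2.4]
EOF

while IFS= read -r line; do
  # \K discards the leading "[", the (?=\]) lookahead drops the trailing "]".
  m=$(printf '%s\n' "$line" |
      grep -oP '\[\K\d{1,3}(\.\d{1,3}){3}(?=\])' | paste -sd, -)
  printf '%s\n' "${m:-n.a.}"    # empty match list -> n.a.
done < input.txt > output.csv
cat output.csv
```

Running grep once per line is slow on huge files; an awk or perl one-liner would scale better, but this keeps the grep pattern from the question intact.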
randomname
(161 rep)
Jun 24, 2022, 09:08 AM
• Last activity: Jun 24, 2025, 09:28 AM
2
votes
1
answers
68
views
Redirect `rtf` output to file
### System Info ``` alinuxchap@libertus-desktop:/usr/share/X11/xkb $ uname -a Linux libertus-desktop 6.12.25+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux alinuxchap@libertus-desktop:/usr/share/X11/xkb $ ``` ### Cmd ``` cd /home/alinuxchap/Documents/shared/dat/EDS/...
### System Info
alinuxchap@libertus-desktop:/usr/share/X11/xkb $ uname -a
Linux libertus-desktop 6.12.25+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.12.25-1+rpt1 (2025-04-30) aarch64 GNU/Linux
alinuxchap@libertus-desktop:/usr/share/X11/xkb $
### Cmd
cd /home/alinuxchap/Documents/shared/dat/EDS/it
echo "" > output.txt
while read author; do
echo $author
pdfgrep "$author" *.pdf |& tee -a output.txt
done
### Problem
- grep outputs text matches in bold and red
- I don't want to use grep -p, as I also need to see the 'snippet' of context the term is being used in

It's useful for archiving command output as 'logs'; the same problem arises with copy and paste, as that doesn't preserve the rtf either.
Signor Pizza
(25 rep)
Jun 17, 2025, 05:35 PM
• Last activity: Jun 20, 2025, 08:02 PM
14
votes
12
answers
11414
views
Count the number of blank lines at the end of file
I have a file with blank lines at the end of the file. Can I use `grep` to count the number of blank lines at the end of the file with the file name being passed as variable in the script?
I have a file with blank lines at the end of the file.
Can I use
grep
to count the number of blank lines at the end of the file with the file name being passed as variable in the script?
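grep on its own has no notion of "at the end of the file", so a common sketch (one option among several, not grep-only) reverses the file with tac and counts the leading empty lines:

```shell
# Count trailing empty lines; the file name is passed as "$1".
count_trailing_blanks() {
  tac "$1" | awk '/^$/ { n++; next } { exit } END { print n+0 }'
}

# Hypothetical demo file: two text lines followed by three blank lines.
printf 'line1\nline2\n\n\n\n' > demo.txt
count_trailing_blanks demo.txt
```

The awk exits at the first non-empty (reversed) line, and n+0 ensures 0 is printed for files with no trailing blanks; use /^[[:space:]]*$/ instead of /^$/ if whitespace-only lines should count as blank.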
Raghunath Choudhary
(153 rep)
Nov 30, 2017, 09:43 AM
• Last activity: Jun 14, 2025, 08:57 PM
Showing page 1 of 20 total questions