
awk and egrep for regular expressions

0 votes
3 answers
649 views
I am very new to Unix! I am trying to figure out, from a FASTQ file, how many reads have 3 or MORE As in a row. I used egrep 'A{3}' to tell me how many reads contain AAA. Now I want to match 3 or more As in a row, but writing >= inside the pattern doesn't work. Can I use awk to help me determine this?

Also, how can I use a regular expression to determine how many reads have a run of 4 or more As followed by something other than a T (that is, G, C or A)? So the run of As has to be at least 4 long, and it has to be followed by G, C or A.

EDIT: When I say 3 As in a row, I mean something like this: GGCTAAAAAACGGAT
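For illustration, here is a minimal sketch of what I am after, not a definitive answer. It assumes a standard four-line-per-record FASTQ file (so the sequence is line 2 of every record), and it uses the interval syntax A{3,} for "3 or more" in extended regular expressions. The file name sample.fastq, its contents, and the helper count_reads are made up for this example.

```shell
# Hypothetical four-read FASTQ file, made up for this example.
cat > sample.fastq <<'EOF'
@read1
GGCTAAAAAACGGAT
+
IIIIIIIIIIIIIII
@read2
GGCTAATCGGAAT
+
IIIIIIIIIIIII
@read3
TTAAAATCC
+
AAAAAAAAA
@read4
CCAAAAGTT
+
IIIIIIIII
EOF

# count_reads PATTERN FILE (hypothetical helper):
# awk keeps only the sequence line of each 4-line record, so quality
# strings that happen to contain As (read3 here) are not counted;
# grep -cE then counts the lines matching the extended regex.
count_reads() {
  awk 'NR % 4 == 2' "$2" | grep -cE "$1"
}

# Reads with a run of 3 or more As.
count_reads 'A{3,}' sample.fastq      # prints 3

# Reads with 4 or more As followed by a character other than T.
count_reads 'A{4,}[^T]' sample.fastq  # prints 2
```

Two things to keep in mind with this sketch. Because grep matches anywhere in the line, 'A{3}' already selects every read containing 3 or more As in a row, so it gives the same read count as 'A{3,}'. Also, since [^T] can itself match an A, a read like AAAAAT is counted by 'A{4,}[^T]'; if the run of As must end in G or C, 'A{4,}[CG]' would be the pattern to use instead.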
Asked by Sarah (1 rep)
Mar 25, 2020, 03:41 AM
Last activity: Mar 26, 2020, 12:48 AM