Count the longest stretch of consecutive patterns
5 votes, 6 answers, 570 views
I have a sequence file:

    $ cat file
    CACCGTTGCCAAACAATG
    TTAGAAGCCTGTCAGCCT
    CATTGCTCTCAGACCCAC
    GATGTACGTCACATTAGA
    ACACGGAATCTGCTTTTT
    CAGAATTCCCAAAGATGG

I want to find the longest stretch of consecutive C or T characters (C+T) in each line. So far I can only count the total number of C and T characters per line, not the longest stretch:

    $ cat file | awk '{ print $0, gsub(/[cCtT]/,"",$1)}'
    CACCGTTGCCAAACAATG 9
    TTAGAAGCCTGTCAGCCT 10
    CATTGCTCTCAGACCCAC 12
    GATGTACGTCACATTAGA 8
    ACACGGAATCTGCTTTTT 11
    CAGAATTCCCAAAGATGG 7

The *expected result* would add a third column showing the longest C+T stretch:

    CACCGTTGCCAAACAATG 9 2
    TTAGAAGCCTGTCAGCCT 10 3
    CATTGCTCTCAGACCCAC 12 5
    GATGTACGTCACATTAGA 8 2
    ACACGGAATCTGCTTTTT 11 6
    CAGAATTCCCAAAGATGG 7 5
Asked by CN_229133
(115 rep)
Jun 29, 2018, 09:32 AM
Last activity: Feb 7, 2024, 09:58 AM
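
One possible approach, sketched here rather than taken from the thread, is to stay in awk and walk each line with `match()`, which sets `RLENGTH` to the length of the leftmost run matching `[cCtT]+`. The wrapper name `longest_ct` and the variable names are made up for illustration. The sketch assumes one sequence per line and counts on a copy of the line, so the original text is printed unchanged.

```shell
# Hypothetical helper: prints "<sequence> <total C/T> <longest C/T run>" per line.
# Usage: longest_ct file      (reads standard input when no file is given)
longest_ct() {
  awk '{
    seq = $0
    # Replacing each match with itself ("&") leaves seq intact;
    # gsub returns the number of C/T characters found.
    total = gsub(/[cCtT]/, "&", seq)

    longest = 0
    rest = $0
    # Find each run of C/T in turn, then continue after the end of that run.
    while (match(rest, /[cCtT]+/)) {
      if (RLENGTH > longest) longest = RLENGTH
      rest = substr(rest, RSTART + RLENGTH)
    }
    print $0, total, longest
  }' "$@"
}
```

Run as `longest_ct file` on the sample above, this should print the three-column table given as the expected result. Only `gsub`, `match`, `substr`, `RSTART` and `RLENGTH` are used, all of which are part of POSIX awk, so it should not depend on GNU extensions.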