Sample Header Ad - 728x90

How to fetch fasta sequences corresponding to header lines in another file

2 votes
5 answers
1439 views
I have a file of lines of headers (file1) and another file is sequences in fasta format (file2). I want to grep fasta sequences if a header line from file1 is present in file2. Example: * file1:
>sp|B7UM99|TIR_ECO27
    >sp|P06616|ERA_ECOLI
* file2:
>sp|B7UM99|TIR_ECO27
    MPIGNLGNNVNGNHLIPPAPPLPSQTDGAA
    RGGTGHLISSTGALGSRSLFSPLRNSMADS
    VDSRDIPGLPTNPSRLAAATSETCLLGGFE
    VLHDKGPLDILNTQIGPSAFRVEVQADGTH
    ......
    >sp|P06616|ERA_ECOLI
    MSIDKSYCGFIAIVGRPNVGKSTLLNKLL
    GQKISITSRKAQTTRHRIVGIHTEGAYQAIY
    VDTPGLHMEEKRAINRLMNKAASSSIGDVE
    LVIFVVEGTRWTPDDEMVLNKLREGKAPVI
    ............
    >sp|P0AD68|HUMAN
    MKAAAKTQKPKRQEEHANFISWRFALLCGC
    ILLALAFLLGRVAWLQVISPDMLVKEGDMR
    SLRVQQVSTSRGMITDRSGRPLAVSVPVKA
    IWADPKEVHDAGGISVGDRWKALANALNIP
    .............
* Desired output
>sp|B7UM99|TIR_ECO27
    MPIGNLGNNVNGNHLIPPAPPLPSQTDGAA
    RGGTGHLISSTGALGSRSLFSPLRNSMADS
    VDSRDIPGLPTNPSRLAAATSETCLLGGFE
    VLHDKGPLDILNTQIGPSAFRVEVQADGTH
    ......
    >sp|P06616|ERA_ECOLI
    MSIDKSYCGFIAIVGRPNVGKSTLLNKLL
    GQKISITSRKAQTTRHRIVGIHTEGAYQAIY
    VDTPGLHMEEKRAINRLMNKAASSSIGDVE
    LVIFVVEGTRWTPDDEMVLNKLREGKAPVI
    ............
Asked by Manoj Kumar (41 rep)
May 9, 2019, 08:25 PM
Last activity: Apr 14, 2024, 06:48 AM