Sample Header Ad - 728x90

Delete rows from a simplified CSV where some columns matching specific pattern

2 votes
3 answers
1046 views
I have the following simplified CSV file (no separators or newlines embedded in fields): ID,PDBID,FirstResidue,FirstChain,SecondResidue,SecondChain,ThirdResidue,ThirdChain,FourthResidue,FourthChain,Pattern RZ_AUTO_505,1hmh,A22L,C,A22L,A,G21L,A,A23L,A,AA/GA Naked ribose RZ_AUTO_506,1hmh,A22L,C,A22L,A,G114,A,A23L,A,AA/GA Naked ribose RZ_AUTO_507,1hmh,A130,E,A90,A,G80,A,A130,A,AA/GA Naked ribose RZ_AUTO_508,1hmh,A140,E,A90,E,G120,A,A90,A,AA/GA Naked ribose RZ_AUTO_509,1hmh,G102,A,C103,A,G102,E,A90,E,GC/GA Single ribose RZ_AUTO_510,1hmh,G102,A,C103,A,G120,E,A90,E,GC/GA Single ribose RZ_AUTO_511,1hmh,G113,C,C112,C,G21L,A,A23L,A,GC/GA Single ribose RZ_AUTO_512,1hmh,G113,C,C112,C,G114,A,A23L,A,GC/GA Single ribose RZ_AUTO_513,1hnw,C1496,A,G1497,A,A1518,A,A1519,A,CG/AA Canonical ribose RZ_AUTO_514,1hnw,C1496,A,G1497,A,A1519,A,A1518,A,CG/AA Canonical ribose RZ_AUTO_515,1hnw,C221,A,U222,A,A195,A,A196,A,CU/AA Canonical ribose RZ_AUTO_516,1hnw,C221,A,U222,A,A196,A,A195,A,CU/AA Canonical ribose I need to remove the CSV rows if the value of FirstResidue or SecondResidue or ThirdResidue or FourthResidue doesn't end with an integer. The output should look something like below. RZ_AUTO_507,1hmh,A130,E,A90,A,G80,A,A130,A,AA/GA Naked ribose RZ_AUTO_508,1hmh,A140,E,A90,E,G120,A,A90,A,AA/GA Naked ribose RZ_AUTO_509,1hmh,G102,A,C103,A,G102,E,A90,E,GC/GA Single ribose RZ_AUTO_510,1hmh,G102,A,C103,A,G120,E,A90,E,GC/GA Single ribose RZ_AUTO_513,1hnw,C1496,A,G1497,A,A1518,A,A1519,A,CG/AA Canonical ribose RZ_AUTO_514,1hnw,C1496,A,G1497,A,A1519,A,A1518,A,CG/AA Canonical ribose RZ_AUTO_515,1hnw,C221,A,U222,A,A195,A,A196,A,CU/AA Canonical ribose RZ_AUTO_516,1hnw,C221,A,U222,A,A196,A,A195,A,CU/AA Canonical ribose So I'm just wondering how to achieve this using awk. I'm using Mac OSX.
Asked by Sri (165 rep)
Feb 13, 2015, 04:27 AM
Last activity: Dec 14, 2022, 09:44 PM