How to remove lines with nonsense format numbers?
2 votes, 2 answers, 270 views
I have the following data, which I am processing to extract the 1st and 5th columns, convert the `D` format to `E` format, and delete rows that contain gibberish numbers such as `9.410-316`.
DEG = 1.500
2.600D+01 0.000D+00 0.000D+00 0.000D+00 0.000D+00
2.700D+01 8.720-304 2.369-316 7.556-316 9.410-316
4.300D+01 1.208D-83 4.156D-96 7.360D-96 6.984D-96
1.590D+02 8.002D-07 6.555D-19 7.748D-19 7.376D-19
1.600D+02 1.173D-06 9.669D-19 1.143D-18 1.089D-18
1.610D+02 1.709D-06 1.417D-18 1.676D-18 1.596D+01
1.620D+02 2.468D-06 2.058D-18 2.436D-18 2.320D-10
DEG = 18.500
2.700D+01 2.794-314 0.000D+00 0.000D+00 0.000D+00
2.800D+01 4.352-285 1.224-297 3.685-297 4.412-297
8.800D+01 1.371D-02 6.564D-15 7.852D-15 7.275D-15
My problem is identifying the number formats that I want to delete. So far, I have tried:
```bash
maxa=18.5
maxangle=$(printf "%.3f" $maxa)
if (( $(echo "$maxa 10, else only 5)
else
txt2search="DEG = $maxangle"
fi
line=$(grep -n "$txt2search" file | cut -d : -f 1)
# Once the line number is read for the string, skip a few lines (4) and read next several lines(1000)
beginline=$((line + 4))
endline=$((line + 1002))
awk -v a="$beginline" -v b="$endline" 'NR==a, NR==b {print $1, $5}' fileinput > fileoutput
sed -i 's/D/E/g' fileoutput
```
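As a hedged alternative to the separate `awk` and `sed` steps, the column extraction and the `D` to `E` conversion could be folded into a single `awk` call. This is only a sketch: the function name `extract_cols` is hypothetical, and the `NF == 5` test is an assumption used here to skip header lines such as `DEG = 1.500`; it does not reproduce the line-range selection above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print columns 1 and 5 of five-column rows,
# rewriting the Fortran-style D exponent marker as E.
extract_cols() {
    awk 'NF == 5 {
        out = $1 " " $5        # keep only the 1st and 5th columns
        gsub(/D/, "E", out)    # D exponent marker -> E
        print out
    }'
}

# Small demonstration on rows taken from the sample data.
printf '%s\n' \
    'DEG = 1.500' \
    '2.600D+01 0.000D+00 0.000D+00 0.000D+00 0.000D+00' \
    '2.700D+01 8.720-304 2.369-316 7.556-316 9.410-316' | extract_cols
```

Rows with a malformed fifth column (such as `9.410-316`) still pass through this step unchanged, so they would need to be filtered afterwards.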
Then, to discard the rows with the nonsense numbers, I tried each of the following commands (one at a time), and none of them worked:
```bash
sed -ni '/E/p' fileoutput
sed -E '/(E)/!d' fileoutput > spec2.tempdata
sed '/E/!d' fileoutput > spec2.tempdata
awk '!/E/' fileoutput > spec2.tempdata
```
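A likely reason these attempts do not help: after the `D` to `E` substitution, the first column of every row contains an `E`, so a line-wide `/E/` match succeeds even when the second column is malformed (and `awk '!/E/'` negates the match, so it keeps only lines without any `E`). The sketch below, using a hypothetical `sample` function that stands in for `fileoutput`, contrasts a line-wide match with a test on the second column only.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for fileoutput after the D -> E substitution.
sample() {
    printf '%s\n' \
        '2.600E+01 0.000E+00' \
        '2.700E+01 9.410-316' \
        '4.300E+01 6.984E-96'
}

# Line-wide match: all three rows pass, because column 1 always contains an E.
sample | sed -n '/E/p'

echo '---'

# Column-specific match: only rows whose 2nd column has an exponent letter pass.
sample | awk '$2 ~ /E/'
```

Restricting the test to `$2` is what lets the malformed row be dropped here.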
How can I identify and remove lines with such nonsense numbers? The tool versions are:
* sed (GNU sed) 4.7
* grep (GNU grep) 3.4
* GNU Awk 5.0.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0)
The desired output would be:
2.600D+01 0.000D+00 0.000D+00 0.000D+00 0.000D+00
4.300D+01 1.208D-83 4.156D-96 7.360D-96 6.984D-96
1.590D+02 8.002D-07 6.555D-19 7.748D-19 7.376D-19
1.600D+02 1.173D-06 9.669D-19 1.143D-18 1.089D-18
1.610D+02 1.709D-06 1.417D-18 1.676D-18 1.596D+01
1.620D+02 2.468D-06 2.058D-18 2.436D-18 2.320D-10
**EDIT:** The solution that I was looking for (see the first comment) is:
```bash
grep -v '[0-9]-'
```
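The pattern works because, in a well-formed field, the exponent sign always follows the letter `D` (or `E`), so a digit directly followed by a sign only appears when the exponent letter is missing. As a hedged sketch, the variant below also rejects a missing letter before a `+` sign, on the assumption that such values could occur too; the function name `drop_malformed` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: drop every line in which a digit is directly
# followed by an exponent sign and another digit (exponent letter missing).
drop_malformed() {
    grep -Ev '[0-9][+-][0-9]'
}

# Demonstration on rows taken from the sample data.
printf '%s\n' \
    '2.600D+01 0.000D+00 0.000D+00 0.000D+00 0.000D+00' \
    '2.700D+01 8.720-304 2.369-316 7.556-316 9.410-316' \
    '4.300D+01 1.208D-83 4.156D-96 7.360D-96 6.984D-96' | drop_malformed
```

Note that the filter looks at the whole line: applied to the raw five-column data, it also drops a row like `2.700D+01 2.794-314 0.000D+00 0.000D+00 0.000D+00`, whose fifth column is valid, whereas applied after the column extraction it would keep that row.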
Asked by csnl (35 rep) on Apr 11, 2023, 04:57 PM
Last activity: Apr 16, 2023, 10:16 PM