Split string with 0-2 / (or determine there's none) (Bash)
-1 votes, 2 answers, 62 views
Update: the string contains up to two "/" characters.
The string has one of these structures:
Character set name/LF
Character set name/CRLF
Character set name/CRLF/(unknown purpose, likely a number)
Character set name
Example: "UTF-8/CRLF"
"UCS-2/CRLF/21"
That is, the string may consist of only the character set name (not known beforehand), without any "/" separator.
The character set name may contain "-" and "_" (these do not need to be split).
I need to assign:
VAR1=Character set name
VAR2=The CRLF or LF part between the 1st "/" and the 2nd "/" (or an empty string if there's no "/").
VAR3=The remainder after the 2nd "/".
Some kind of true/false (0/1) value for VAR2 is also OK (it will be processed with if/else later in the script).
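The assignment described above could be sketched with parameter expansion alone, which avoids starting any external process. The function name `split_encoding` and the helper variable `rest` are hypothetical; only VAR1, VAR2 and VAR3 come from the question.

```bash
#!/bin/bash
# Hypothetical helper: split "charset[/eol[/extra]]" into VAR1, VAR2, VAR3.
split_encoding() {
    local rest
    VAR1=${1%%/*}      # text before the first "/" (the whole string if there is none)
    VAR2=
    VAR3=
    case $1 in
        */*)
            rest=${1#*/}        # text after the first "/"
            VAR2=${rest%%/*}    # text before the next "/" (all of rest if there is none)
            case $rest in
                */*) VAR3=${rest#*/} ;;   # text after the second "/"
            esac
            ;;
    esac
}

split_encoding "UCS-2/CRLF/21"
echo "VAR1=$VAR1 VAR2=$VAR2 VAR3=$VAR3"   # prints: VAR1=UCS-2 VAR2=CRLF VAR3=21
```

With an input such as "ASCII", VAR2 and VAR3 are left empty, so a later `if [ -n "$VAR2" ]` can act as the true/false check.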
I tried `cut -d/ -f`, but `cut -d/ -f 2` returns "Character set name" even **if there's no "/"**, so it doesn't work for me.
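The `cut` behaviour described here is standard: a line that contains no delimiter is printed whole unless `-s` is given. A small sketch of the difference follows; the wrapper names `field2_plain` and `field2_strict` are hypothetical. Each call still starts a `cut` process, so this is likely slower than pure shell expansion when run many times.

```bash
#!/bin/bash
# Hypothetical wrappers around cut, for comparison only.

# Without -s: a line that has no "/" is passed through unchanged.
field2_plain() {
    printf '%s\n' "$1" | cut -d/ -f2
}

# With -s: lines that have no "/" are suppressed, so the result is empty.
field2_strict() {
    printf '%s\n' "$1" | cut -s -d/ -f2
}

field2_plain  "ASCII"        # prints: ASCII
field2_strict "ASCII"        # prints nothing
field2_strict "UTF-8/CRLF"   # prints: CRLF
```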
This is for a **Bash** script, and a **faster** solution is preferred because it will run many times.
I do need to call the function via `/bin/bash -c` because it's called from `find -exec`.
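For context, one common way to call an exported function from `find -exec` is sketched below. The names `show_file` and `run_on_tree` are hypothetical stand-ins, not the real conversion routine. Ending the `-exec` with `+` hands many paths to one inner shell, which usually means fewer `bash` start-ups than `\;`.

```bash
#!/bin/bash
# Hypothetical stand-in for the real per-file function.
show_file() {
    printf 'processing: %s\n' "$1"
}
export -f show_file   # make the function visible to child bash processes

# Hypothetical wrapper: run show_file on every regular file under a directory.
run_on_tree() {
    # "_" becomes $0 of the inner shell; the found paths arrive as "$@".
    find "$1" -type f -exec /bin/bash -c 'for f; do show_file "$f"; done' _ {} +
}

run_on_tree .
```

Passing the path as a positional argument, rather than embedding `{}` inside the quoted script, keeps file names with spaces or quotes from being re-parsed by the inner shell.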
Code (mostly based on Choroba's answer):
#!/bin/bash
shopt -s extglob
function convert_single_text_file_to_utf8(){
    CUR_FILE_ENCODING_WITH_CRLF=$1
    echo "CUR_FILE_ENCODING_WITH_CRLF=${CUR_FILE_ENCODING_WITH_CRLF}"
    CUR_FILE_ENCODING_ONLY=${CUR_FILE_ENCODING_WITH_CRLF%%/*} # Remove everything starting from the first slash.
    LINE_FEED=${CUR_FILE_ENCODING_WITH_CRLF##$CUR_FILE_ENCODING_ONLY?(/)} # Remove the charset, followed by a slash if any.
    echo "CUR_FILE_ENCODING_ONLY=${CUR_FILE_ENCODING_ONLY} LINE_FEED=${LINE_FEED}"
}
export -f convert_single_text_file_to_utf8
for ENCODING in ASCII UTF-8/CRLF ISO-8859-2/LF EBCDIC-CA-FR; do
    echo "ENCODING=$ENCODING"
    export ENCODING
    /bin/bash -c 'shopt -s extglob; convert_single_text_file_to_utf8 "$ENCODING" '
done
Asked by strider (43 rep) on Feb 27, 2025, 05:11 PM
Last activity: Feb 27, 2025, 11:28 PM