
regex: why does the trademark symbol (™) match a-z?

0 votes
1 answer
138 views
Sorry if this is a repeat or basic question, but it is hard to search for a ™. I'm writing a script to remove weird characters from file names. How come the trademark symbol is not removed by [^a-z], as if it fell within the range a-z?
$ echo "AMD Ryzen™ 5 2600X Processor rstuv" |sed 's/[^A-Z]//g'
AMDRXP

$ echo "AMD Ryzen™ 5 2600X Processor rstuv" |sed 's/[^a-z]//g'
yzen™rocessorrstuv

$ echo "AMD Ryzen™ 5 2600X Processor rstuv" |sed 's/[^s-t]//g'
ssst

$ echo "AMD Ryzen™ 5 2600X Processor rstuv" |sed 's/[^t-u]//g'
™tu
It seems to sort between t and u?

Edit: System Specs:
$ locale
LANG=en_CA.UTF-8
LANGUAGE=en_CA:en
LC_CTYPE="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME=en_CA.UTF-8
LC_COLLATE="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_PAPER="en_CA.UTF-8"
LC_NAME="en_CA.UTF-8"
LC_ADDRESS="en_CA.UTF-8"
LC_TELEPHONE="en_CA.UTF-8"
LC_MEASUREMENT="en_CA.UTF-8"
LC_IDENTIFICATION="en_CA.UTF-8"
LC_ALL=

$ lsb_release -sdc; uname -sri
Ubuntu 20.04.6 LTS
focal
Linux 5.4.0-172-generic x86_64

$ sed --version
sed (GNU sed) 4.7
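
A possible way to test whether the locale is responsible: as far as I understand, in a non-C locale such as en_CA.UTF-8 a range like a-z may follow the locale's collation order rather than raw code points, and ™ appears to collate between t and u here. The sketch below reruns the same substitution with LC_ALL=C, where sed should compare single bytes, so the three UTF-8 bytes of ™ fall outside a-z. The function name strip_non_lower is only an illustrative helper, not anything provided by sed.

```shell
#!/bin/sh
# Hypothetical helper: delete everything except ASCII lowercase letters.
# LC_ALL=C forces byte-wise matching for this one sed call, so the range
# a-z covers only bytes 0x61-0x7A and the UTF-8 bytes of ™ are removed.
strip_non_lower() {
  LC_ALL=C sed 's/[^a-z]//g'
}

printf '%s\n' "AMD Ryzen™ 5 2600X Processor rstuv" | strip_non_lower
# prints: yzenrocessorrstuv
```

If the ™ disappears with LC_ALL=C but survives without it, that points at locale collation rather than at sed itself.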
Asked by codywohlers (177 rep)
Apr 7, 2024, 08:43 AM
Last activity: Apr 7, 2024, 01:06 PM