Converting a UTF-8 file to ASCII (best-effort)
45 votes, 5 answers, 159442 views
I have a UTF-8 file that contains text in multiple languages. Much of it consists of people's names. I need to convert it to ASCII, and I need the result to look as decent as possible.
There are many ways to approach converting from a wider encoding to a narrower one. The simplest transformation would be to replace every non-ASCII character with a placeholder, such as '_'. If I know the language the file is written in, there are additional possibilities, such as romanization.
What Unix tool or programming language library available on Unix can give me a decent (best-effort) conversion from UTF-8 to ASCII?
Most of the text is in European languages written in Latin-based scripts.
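To make the placeholder idea concrete, here is a minimal sketch, assuming Python 3 and only its standard `unicodedata` module. The function name `to_ascii` and its `placeholder` parameter are hypothetical, chosen for illustration. It first decomposes accented letters and drops the combining marks, then falls back to the placeholder for anything still outside ASCII. It is not language-aware romanization, so letters with no Unicode decomposition (for example 'Ł' or 'ß') end up as the placeholder.

```python
import unicodedata

def to_ascii(text: str, placeholder: str = "_") -> str:
    """Best-effort ASCII conversion: strip diacritics, then use a placeholder."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the combining mark, keep the base letter
        # Keep ASCII as-is; replace everything else with the placeholder.
        out.append(ch if ord(ch) < 128 else placeholder)
    return "".join(out)

print(to_ascii("José Müller"))  # Jose Muller
print(to_ascii("Łukasz"))       # _ukasz
```

The second call shows the limitation: 'Ł' has no decomposition, so only the placeholder is left.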
Asked by user7610 (2188 rep)
Dec 6, 2014, 04:53 PM
Last activity: Apr 18, 2023, 09:38 AM