Advanced CLI tool/code to determine text encoding (besides enca)
0 votes, 0 answers, 75 views
I'm looking for an advanced CLI tool or code to **determine a text file's codepage/language** (besides enca).
Goal: automate, as far as possible, the conversion of hundreds or thousands of 8-bit text files (containing non-ASCII characters) to UTF-8. Most of them were generated on Windows.
A tool that only works when it is told which few languages to expect (even just one language), but is very good at determining the codepage, would also work.
Speed does NOT matter (I would rather run it overnight than babysit every single text file).
I haven't yet found any description of the algorithm enca uses, so I cannot estimate how good it is.
Each text file can be assumed to use a single codepage throughout.
My script is in **BASH**.
It's OK if I have to compile something.
Please don't suggest sending files to online services or an online AI (a 100% offline AI is fine).
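For illustration, the Bash wrapper around such a tool could look roughly like the sketch below. It is only a sketch under assumptions: `detect_codepage` is a hypothetical placeholder for the detector being asked about (here it just echoes a default), `DEFAULT_CODEPAGE` and its `CP1252` fallback are made up for the example, and `iconv` is assumed to be installed and to accept the codepage names the detector prints.

```bash
#!/usr/bin/env bash
# Sketch: convert one 8-bit text file to UTF-8.

# Hypothetical placeholder for the real detector (the tool this question asks for).
# It must print an iconv-compatible codepage name for the file given as $1.
detect_codepage() {
    printf '%s\n' "${DEFAULT_CODEPAGE:-CP1252}"   # assumed default, not a real detection
}

# Succeeds if the file already decodes cleanly as UTF-8 (pure ASCII also passes).
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# to_utf8 SRC DST: write a UTF-8 version of SRC to DST.
to_utf8() {
    local src=$1 dst=$2 codepage
    if is_valid_utf8 "$src"; then
        cp "$src" "$dst"                 # nothing to convert, copy unchanged
        return 0
    fi
    codepage=$(detect_codepage "$src") || return 1
    iconv -f "$codepage" -t UTF-8 "$src" > "$dst"
}
```

The UTF-8 check comes first because an 8-bit detector may otherwise misread a file that was already converted. A batch run would call `to_utf8` once per file from a loop and log any file for which it returns a non-zero status.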
Asked by strider
(43 rep)
Feb 27, 2025, 06:34 PM
Last activity: Feb 27, 2025, 09:51 PM