
How do I properly convert the file to UTF-16LE encoding without strange characters appearing in the file?

4 votes
1 answer
4063 views
I'm running into some peculiarities with a dictionary file in .dsl format that I'm trying to convert. It's essentially a text file with the dictionary pairs. The dictionary software I use is GoldenDict. It requires UTF-16 dictionaries so that they render properly. All the dictionaries I have are in UTF-16LE format, except for one, which is encoded as iso-8859-1. An entry looks like this when I open it with vim:

    abandonarse [m2][c crimson][b]Sinónimos[/b][/c][/m] [m2][i][c green]verbo[/c][/i][/m] [m1][trn][b]desanimarse:[/b] >, >, >, >, >, >[/trn][/m]

I have to convert it to UTF-16LE because GoldenDict renders some Cyrillic characters instead of the Spanish accented characters. So I try:

    iconv -f iso-8859-1 -t utf-16le dictionary.dsl -o test.dsl

The new test.dsl dictionary is rendered correctly by GoldenDict, but I can see some peculiar things I would love to get rid of. The first is that the encoding of the newly converted file is not recognized the way it usually is with the other dictionaries:

    aleksandr@desktop:~/windoc/Dic/Es extra/dictionary.dsl> file dictionary.dsl
    dictionary: data

The second is that when I open test.dsl with vim, every character in it has ^@ added to it. Here is the same entry:

    ^@^@>^@,^@ ^@^@>^@[^@/^@t^@r^@n^@]^@[^@/^@m^@]^@ ^@ ^@[^@m^@2^@]^@[^@c^@ ^@c^@r^@i^@m^@s^@o^@n^@]^@[^@b^@]^@A^@n^@t^@ó^@n^@i^@m^@o^@s^@[^@/^@b^@]^@[^@/^@c^@]^@[^@/^@m^@]^@ ^@ ^@[^@m^@2^@]^@[^@i^@]^@[^@c^@ ^@g^@r^@e^@e^@n^@]^@v^@e^@r^@b^@o^@[^@/^@c^@]^@[^@/^@i^@]^@[^@/^@m^@]^@

I tried removing these characters in vim with:

    %s///g

However, when I save the file afterwards, it has the iso-8859-1 encoding again. I would like this file to be shown without the ^@ characters, because I may need to edit some headings in the dictionary manually.
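One possible explanation, offered as an assumption rather than a confirmed diagnosis: when the target is `utf-16le`, iconv writes no byte order mark (BOM), so `file` and vim have nothing to detect the encoding from, and vim displays the zero high byte of each 2-byte code unit as ^@. The sketch below prepends the UTF-16LE BOM (bytes FF FE) by hand before the converted text. The scratch directory and `sample.dsl` are hypothetical stand-ins for the real dictionary file.

```shell
#!/bin/sh
# Sketch: convert an ISO-8859-1 file to UTF-16LE and prepend a BOM.
# Paths below are illustrative only; substitute the real dictionary file.

workdir=$(mktemp -d)          # scratch directory for the demo
src="$workdir/sample.dsl"     # hypothetical stand-in for dictionary.dsl
out="$workdir/test.dsl"

# Write a small ISO-8859-1 sample; \363 is the octal byte for "ó" (0xF3).
printf 'Sin\363nimos\n' > "$src"

# Emit the UTF-16LE byte order mark (FF FE) first, then the converted text.
{
    printf '\377\376'
    iconv -f ISO-8859-1 -t UTF-16LE "$src"
} > "$out"

# Dump the first bytes; the output should begin with "ff fe".
od -An -tx1 "$out" | head -n 1
```

With the BOM in place, tools that sniff the encoding from the first bytes have something to work with. For a file that lacks a BOM, vim can also be told the encoding explicitly when reopening it, with `:e ++enc=utf-16le`.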
Asked by user7748093 (43 rep)
Sep 8, 2020, 03:14 PM
Last activity: Sep 9, 2020, 04:46 PM