How to fix aspell (bn-BD) invalid strings compiling issue
I tried to re-create the aspell-bn (bn-BD) dictionary, i.e. the Bengali (Bangladesh) spell checker, for aspell6 by following [this tutorial](http://blog-archive.copyninja.info/2011/05/tutorial-creating-aspell-dictionaries.html) and compiling it as follows:
$ cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/aspell co aspell-lang
$ cd aspell-lang
$ ./pre bn u-beng
$ cd bn
$ ./proc create
This creates the configure script and Makefile.pre. Finally, I copied the word list file into the working directory and named it [bn.wl](https://drive.google.com/file/d/1LknCOk0yT59rWIBf55mKD766gfkMxHlP); it contains a list of words separated by newlines. Then I ran:
$ ./configure
$ make
(/usr/sbin/prezip-bin -d < bn.cwl | /usr/sbin/aspell --lang=bn create master ./bn.rws)
But when I run "make", it prints many errors and warnings of the following two types, saying that the Unicode code points [**U+09CE**](https://en.wiktionary.org/wiki/%E0%A7%8E) and [**U+FEFF**](https://en.wikipedia.org/wiki/Byte_order_mark) are unsupported, even though according to [u-beng.txt](https://pastebin.com/qEwaBYby) **U+09CE** is clearly within the defined range:
Warning: The string "হৃৎযন্ত্র" is invalid. The Unicode code point U+09CE is unsupported. Skipping string.
Warning: The string "ক্ল্যারিজ" is invalid. The Unicode code point U+FEFF is unsupported. Skipping string.
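To see how many entries trigger each of the two warnings above, the word list can be scanned for the code points in question. The following is a minimal sketch, not part of the aspell tooling: it assumes bn.wl is UTF-8 with one word per line, and the names `find_code_points` and `report` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict

# Code points named in the two warning messages.
TARGETS = {0x09CE, 0xFEFF}


def find_code_points(words, targets=TARGETS):
    """Map each target code point to the (line number, word) pairs containing it."""
    hits = defaultdict(list)
    for lineno, word in enumerate(words, start=1):
        # Intersect the word's code points with the targets; sort for stable order.
        for cp in sorted({ord(ch) for ch in word} & set(targets)):
            hits[cp].append((lineno, word))
    return dict(hits)


def report(path="bn.wl"):
    # Assumption: the word list is UTF-8, one word per line.
    # Plain "utf-8" (not "utf-8-sig") keeps a leading U+FEFF visible.
    with open(path, encoding="utf-8") as fh:
        words = [line.rstrip("\n") for line in fh]
    for cp, found in sorted(find_code_points(words).items()):
        print(f"U+{cp:04X}: {len(found)} entries, first at line {found[0][0]}")
```

Calling `report("bn.wl")` in the working directory would print one summary line per code point that actually occurs in the list.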
After all of this, I do get a bn.rws file, but it does not include all the words listed in bn.wl.
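For the U+FEFF warnings specifically, one possible workaround is to strip that code point from the word list before building, since it is a byte order mark rather than part of any word. The sketch below is an assumption-laden illustration, not a confirmed fix: `strip_bom`, `clean_word_list`, and the output name bn.clean.wl are hypothetical, and it assumes bn.wl is UTF-8 with one word per line. It does nothing about U+09CE, which is a real Bengali letter and has to be accepted by the character set data instead.

```python
BOM = "\ufeff"  # U+FEFF, byte order mark / zero width no-break space


def strip_bom(words):
    """Remove every U+FEFF from each word and drop entries that become empty."""
    cleaned = (word.replace(BOM, "") for word in words)
    return [word for word in cleaned if word]


def clean_word_list(src="bn.wl", dst="bn.clean.wl"):
    # Assumption: UTF-8 input, one word per line; dst is a hypothetical file name.
    with open(src, encoding="utf-8") as fh:
        words = [line.rstrip("\n") for line in fh]
    with open(dst, "w", encoding="utf-8") as fh:
        fh.write("\n".join(strip_bom(words)) + "\n")
```

The cleaned file would then have to replace bn.wl before re-running the build, so that the skipped words such as "ক্ল্যারিজ" are read without the stray mark.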
I am using parabola-arch x86-64, and pacman -Qi aspell returns:
Name : aspell
Version : 0.60.6.1-5
Description : A spell checker designed to eventually replace Ispell
Architecture : x86_64
URL : http://aspell.net/
Licenses : LGPL
Groups : None
Provides : None
Depends On : gcc-libs ncurses
Optional Deps : perl: to import old dictionaries [installed]
If you can help me correct my mistakes, please do. I am also including the [error.log](https://pastebin.com/AW224Ztr).
I guess I need to change something in [u-beng.txt](https://pastebin.com/qEwaBYby) in the aspell-lang/maps folder, or in [u-beng.cset](https://pastebin.com/r8Za9fmt) and [u-beng.cmap](https://pastebin.com/EkPbYMn4), but I am not sure.
P.S. [Bengali_(Unicode_block)](https://en.wikipedia.org/wiki/Bengali_(Unicode_block))
And [Official Unicode Consortium code chart for Bengali.](https://www.unicode.org/charts/PDF/U0980.pdf)
And [***a pull request.***](https://github.com/GNUAspell/aspell-lang/pull/2)
Asked by Pavel Sayekat
(621 rep)
Mar 27, 2018, 02:01 PM
Last activity: Jan 23, 2019, 05:56 AM