Is it possible to use OPENROWSET to import fixed width UTF8 encoded files?
9 votes, 3 answers, 3773 views
I have an example data file with the following contents, saved with UTF-8 encoding.
oab~opqr
öab~öpqr
öab~öpqr
The file is fixed width: columns 1 to 3 are each allocated 1 character, and column 4 is allocated 5 characters.
I have created an XML format file as below
Disappointingly, running the following SQL...
SELECT *
FROM OPENROWSET
(
    BULK 'mydata.txt',
    FORMATFILE = 'myformat_file.xml',
    CODEPAGE = '65001'
) AS X
Produces the following results
Col1 Col2 Col3 Col4
---- ---- ---- -----
o    a    b    ~opqr
�    �    a    b~öp
�    �    a    b~öp
from which I conclude that LENGTH is counting bytes rather than characters.
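That reading can be checked outside SQL Server. The following is a minimal Python sketch, not part of the SQL setup: it slices a UTF-8 encoded line by byte widths 1, 1, 1, 5 and decodes each piece, then does the same by character. The helper names are hypothetical, and it assumes the ö in the file is the precomposed U+00F6 (two bytes in UTF-8). Bytes beyond the four fields are ignored.

```python
# Widths of the four fixed-width columns described in the question.
WIDTHS = (1, 1, 1, 5)

def split_by_bytes(line, widths=WIDTHS):
    """Hypothetical helper: cut the UTF-8 bytes of `line` at byte offsets."""
    raw = line.encode("utf-8")
    fields, pos = [], 0
    for width in widths:
        # A slice that cuts a multi-byte character in half cannot be decoded,
        # so invalid pieces are shown as U+FFFD (the replacement character).
        fields.append(raw[pos:pos + width].decode("utf-8", errors="replace"))
        pos += width
    return fields

def split_by_chars(line, widths=WIDTHS):
    """Hypothetical helper: cut `line` at character offsets instead."""
    fields, pos = [], 0
    for width in widths:
        fields.append(line[pos:pos + width])
        pos += width
    return fields

if __name__ == "__main__":
    # "\u00f6" is the precomposed o-with-diaeresis (assumption, see above).
    for line in ("oab~opqr", "\u00f6ab~\u00f6pqr"):
        print("bytes:", split_by_bytes(line))
        print("chars:", split_by_chars(line))
```

Under these assumptions the byte-based split of the second line gives two replacement characters, then a, then b~öp, which matches the OPENROWSET output above, while the character-based split gives the intended ö, a, b, ~öpqr.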
Is there any way I can get this working correctly for fixed *character* widths with UTF8 encoding?
(Target environment is Azure SQL Database reading from Blob storage)
NB: It was suggested in the comments that adding COLLATION="LATIN1_GENERAL_100_CI_AS_SC_UTF8" to the FIELD elements might help, but the results remain unchanged with this.
Asked by Martin Smith (87941 rep) on Dec 1, 2021, 01:56 PM
Last activity: Dec 10, 2021, 08:34 PM