Posted to issues@openoffice.apache.org by bu...@apache.org on 2023/01/04 14:10:57 UTC

[Issue 22579] incorrect import : HTML page with CKJ characters coded in hexadecimal

https://bz.apache.org/ooo/show_bug.cgi?id=22579

damjan@apache.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needmoreinfo
                 CC|                            |damjan@apache.org,
                   |                            |lcn@mail.pf

--- Comment #7 from damjan@apache.org ---
(In reply to lcn from comment #5)
> Seems that it affects not only CKJ but all characters (ASCII, 
> accentued, CKJ,... ) coded in hexadecimal in HTML pages.

All 3 sample documents look the same now, and my tests show that hexadecimally
coded ASCII (e.g. &#x5a; for "Z") renders correctly. Please confirm whether this
is still an issue.

I believe the parsing happens in HTMLParser::ScanText() in
main/svtools/source/svhtml/parhtml.cxx, and it handles both hexadecimal and
decimal numeric character references.
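
For anyone who wants to reproduce the check outside the office suite, here is a
minimal standalone sketch of decoding hexadecimal and decimal character
references. It is not the code from parhtml.cxx: the names decodeNumericRefs,
appendUtf8 and digitValue are hypothetical, and the sketch assumes UTF-8 output
and leaves malformed references untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
static void appendUtf8(std::string& out, std::uint32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: value of digit `c` in `base` (10 or 16), or -1.
static int digitValue(char c, std::uint32_t base) {
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return (v >= 0 && static_cast<std::uint32_t>(v) < base) ? v : -1;
}

// Replace &#NNN; (decimal) and &#xHHH; (hexadecimal) with UTF-8 text.
// Anything that is not a well-formed reference is copied through unchanged.
std::string decodeNumericRefs(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in.compare(i, 2, "&#") == 0) {
            std::size_t j = i + 2;
            std::uint32_t base = 10;
            if (j < in.size() && (in[j] == 'x' || in[j] == 'X')) {
                base = 16;
                ++j;
            }
            std::uint32_t cp = 0;
            std::size_t digits = 0;
            while (j < in.size()) {
                int d = digitValue(in[j], base);
                if (d < 0) break;
                cp = cp * base + static_cast<std::uint32_t>(d);
                if (cp > 0x10FFFF) cp = 0x110000;  // clamp so cp cannot wrap
                ++digits;
                ++j;
            }
            bool surrogate = (cp >= 0xD800 && cp <= 0xDFFF);
            if (digits > 0 && j < in.size() && in[j] == ';' &&
                cp != 0 && cp <= 0x10FFFF && !surrogate) {
                appendUtf8(out, cp);
                i = j + 1;  // skip past the terminating ';'
                continue;
            }
        }
        out += in[i];  // ordinary character or malformed reference
        ++i;
    }
    return out;
}
```

With this sketch, decodeNumericRefs("&#x5a;") and decodeNumericRefs("&#90;")
both return "Z", so the hexadecimal and decimal forms decode to the same
character.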
