Posted to dev@lucy.apache.org by Nick Wellnhofer <we...@aevum.de> on 2016/09/30 18:10:56 UTC
[lucy-dev] (LUCY-311) Non-ASCII error messages from strerror cause exceptions
Lucifers,
Regarding issue LUCY-311:
------------------------------------
Summary: Non-ASCII error messages from strerror cause exceptions
Key: LUCY-311
URL: https://issues.apache.org/jira/browse/LUCY-311
Project: Lucy
Issue Type: Bug
Components: Store
Reporter: Nick Wellnhofer
The code in Lucy/Store creates Err objects with error messages returned from
strerror. Especially under non-English locales, these messages aren't
necessarily valid UTF-8. Now that CB_VCatF checks C strings for invalid UTF-8,
this results in an exception.
Here's an example with a German error message in Latin1 encoding:
http://www.cpantesters.org/cpan/report/20d4902a-8673-11e6-9bc4-e52240a03099
------------------------------------
Does anyone have a good idea how to solve this? I can see the following approaches:
1. Switch to numeric error codes. Not very informative. Maybe use custom
messages for a couple of error codes.
2. Replace non-ASCII chars in the error message with the Unicode replacement
character (U+FFFD).
3. Use strerror_l with the "C" locale and hope that error messages are ASCII,
replacing unlikely non-ASCII chars. POSIX only.
4. Call nl_langinfo(CODESET) to detect the character set and try to convert.
POSIX only. Complicated.
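To make approach 2 concrete, here is a minimal sketch. The helper name sanitize_ascii and its signature are hypothetical and not part of Lucy or Clownfish. It copies a C string and replaces every byte outside the 7-bit ASCII range with U+FFFD, so the result is always valid UTF-8 regardless of the original encoding. Because it works byte by byte, a message that was already valid UTF-8 would get one replacement character per byte of each multibyte sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an existing Lucy/Clownfish function).
 * Copies `src` into `dst`, replacing each byte >= 0x80 with the UTF-8
 * encoding of U+FFFD (EF BF BD). Output is truncated if `dst` is too
 * small, but it is always NUL-terminated and always valid UTF-8.
 * Returns the number of bytes written, excluding the terminator. */
static size_t
sanitize_ascii(const char *src, char *dst, size_t dst_size) {
    static const char replacement[] = "\xEF\xBF\xBD";
    size_t out = 0;

    assert(src != NULL && dst != NULL && dst_size > 0);

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        size_t need = c < 0x80 ? 1 : 3;

        /* Stop before a write that would leave no room for the NUL. */
        if (out + need >= dst_size) { break; }

        if (c < 0x80) {
            dst[out++] = (char)c;
        }
        else {
            memcpy(dst + out, replacement, 3);
            out += 3;
        }
    }

    dst[out] = '\0';
    return out;
}
```

A caller would pass the result of strerror(errno) as `src` and hand the sanitized buffer to the Err constructor. The worst case needs three output bytes per input byte plus one for the terminator.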
Nick