Posted to log4cxx-user@logging.apache.org by "Jason S. Whitwill" <Ja...@pleora.com> on 2011/02/01 16:21:37 UTC

PropertyConfigurator::configure file encoding

I'm passing a file to PropertyConfigurator::configure to initialize my log4cxx logging configuration. Inside the configuration file I have a rolling file appender with the file property set to ${APPDATA}/example.log.

For example:
log4j.rootLogger=info, R
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=${APPDATA}/example.log
log4j.appender.R.MaxFileSize=100KB
log4j.appender.R.MaxBackupIndex=1
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%r %p %t %c - %m%n

If the log configuration file is encoded in ANSI, everything works fine as long as the APPDATA path doesn't contain any characters outside the normal ASCII range. However, if the path contains Chinese characters, for example, I see no output file. I attempted to save the log configuration file with Unicode encoding, but no matter what the contents of the file were, log4cxx failed to read it. I am passing the path to the configuration file as a wchar_t*. I'm presuming that the rolling file appender supports outputting to a path containing international characters, as the constructor accepts a LogString parameter.

Is there a particular encoding that would make it such that international characters could be interpreted correctly when parsing the configuration file using PropertyConfigurator::configure?

Thank you for your time.

Jason
_______________________
Jason Whitwill
Software Designer

Pleora Technologies Inc.
Phone: +1-613-270-0625 ext 153
Fax: +1-613-270-1425
Jason.Whitwill@pleora.com
www.pleora.com


This communication contains confidential information intended only for the addressee(s). If you have received this communication in error, please notify us immediately and delete this communication from your mail box.

RE: PropertyConfigurator::configure file encoding

Posted by "Jason S. Whitwill" <Ja...@pleora.com>.
I managed to work around the issue by parsing the file outside of log4cxx, substituting the environment variables and creating a log4cxx::helpers::Properties object to pass to that overload of PropertyConfigurator::configure.
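The substitution step of this workaround can be sketched roughly as follows. This is a minimal illustration, not the code actually used: expandVariables and its lookup callback are hypothetical names, and it assumes the ${NAME} syntax shown in the configuration file. The point is that the text stays in wide characters from the environment lookup to the final property value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper (not part of log4cxx): replaces every ${NAME} in
// `input` with the value returned by `lookup(NAME)`. The text stays in wide
// characters throughout, so no narrowing conversion can garble it.
std::wstring expandVariables(
    const std::wstring& input,
    const std::function<std::wstring(const std::wstring&)>& lookup)
{
    assert(static_cast<bool>(lookup));  // a callable lookup is required

    std::wstring result;
    std::size_t pos = 0;
    while (pos < input.size()) {
        const std::size_t start = input.find(L"${", pos);
        if (start == std::wstring::npos) {
            // No further references: copy the rest unchanged.
            result.append(input, pos, std::wstring::npos);
            break;
        }
        const std::size_t end = input.find(L'}', start + 2);
        if (end == std::wstring::npos) {
            // Unterminated reference: copy the rest unchanged.
            result.append(input, pos, std::wstring::npos);
            break;
        }
        result.append(input, pos, start - pos);  // text before the reference
        result += lookup(input.substr(start + 2, end - start - 2));
        pos = end + 1;
    }
    return result;
}
```

Each expanded value would then be stored in the log4cxx::helpers::Properties object that is passed to PropertyConfigurator::configure. The lookup callback is where a wide-character environment query (for example _wgetenv_s on Windows) would go.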

I created bug https://issues.apache.org/jira/browse/LOGCXX-378

I'm OK for now (I have an acceptable workaround) so prioritize the bug however you like.

Thanks again for your time.

Cheers,
Jason


-----Original Message-----
From: Curt Arnold [mailto:curt.arnld@gmail.com] On Behalf Of Curt Arnold
Sent: Monday, February 07, 2011 12:22 AM
To: Log4CXX User
Subject: Re: PropertyConfigurator::configure file encoding

I'd suggest looking at System::getProperty in system.cpp, specifically the last section that calls apr_env_get. It appears that the problem is specific to the introduction of environment variables. I'd suggest a native Windows alternative that calls GetEnvironmentVariableW, which bypasses the conversion to and from a single-byte representation. Alternatively, you could look at the implementation of apr_env_get and try to determine whether log4cxx called it incorrectly or whether it has a bug or limitation.

I would appreciate it if you could file a bug report at http://issues.apache.org/jira.




Re: PropertyConfigurator::configure file encoding

Posted by Curt Arnold <ca...@apache.org>.
I'd suggest looking at System::getProperty in system.cpp, specifically the last section that calls apr_env_get. It appears that the problem is specific to the introduction of environment variables. I'd suggest a native Windows alternative that calls GetEnvironmentVariableW, which bypasses the conversion to and from a single-byte representation. Alternatively, you could look at the implementation of apr_env_get and try to determine whether log4cxx called it incorrectly or whether it has a bug or limitation.
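A native wide-character lookup along those lines might look like the sketch below. It is not the log4cxx implementation: readEnvironmentWide is a hypothetical name, and how it would be wired into the property lookup in system.cpp is left open. The non-Windows branch exists only so the sketch compiles elsewhere; it goes through the locale's multibyte encoding, which is the detour the Windows branch avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not log4cxx code): returns the value of an
// environment variable as a wide string, or an empty string if it is unset.
std::wstring readEnvironmentWide(const std::wstring& name)
{
#ifdef _WIN32
    // The first call reports the required size, including the terminator.
    const DWORD needed = GetEnvironmentVariableW(name.c_str(), NULL, 0);
    if (needed == 0) {
        return std::wstring();  // not set, or the call failed
    }
    std::vector<wchar_t> buffer(needed);
    // The second call returns the number of characters stored, excluding
    // the terminator.
    const DWORD written =
        GetEnvironmentVariableW(name.c_str(), &buffer[0], needed);
    if (written == 0 || written >= needed) {
        return std::wstring();  // value changed between calls, or error
    }
    return std::wstring(&buffer[0], written);
#else
    // Fallback: convert the name to the locale's multibyte encoding,
    // call getenv, and convert the value back to wide characters.
    std::vector<char> narrowName(name.size() * MB_CUR_MAX + 1);
    if (std::wcstombs(&narrowName[0], name.c_str(), narrowName.size())
            == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    const char* value = std::getenv(&narrowName[0]);
    if (value == NULL) {
        return std::wstring();
    }
    std::vector<wchar_t> wide(std::strlen(value) + 1);
    const std::size_t count = std::mbstowcs(&wide[0], value, wide.size());
    if (count == static_cast<std::size_t>(-1)) {
        return std::wstring();
    }
    return std::wstring(&wide[0], count);
#endif
}
```

On Windows the value never leaves UTF-16, so a user name containing Chinese characters in APPDATA would be returned without going through the ANSI code page.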

I would appreciate it if you could file a bug report at http://issues.apache.org/jira.



On Feb 3, 2011, at 11:24 AM, Jason S. Whitwill wrote:



RE: PropertyConfigurator::configure file encoding

Posted by "Jason S. Whitwill" <Ja...@pleora.com>.
Thank you very much for your reply.

The operating system is Windows 7 64-bit, English version.
Under Region and Language in the Control Panel, everything is set to English (United States).
The user name has Chinese characters, which are interpreted correctly in Windows Explorer and in my Unicode MFC application.
Log4cxx was built with Visual C++ 2005, and I don't recall changing anything in the default configuration when it was built (it has been a while).
The log4cxx version is 0.10.0.
When using an environment variable for the file name (not the path), I get an output file, but the characters of the file name are garbled (they look like they come from some other language).

If I execute the following code snippet, it interprets the environment variable correctly and the file is output with the correct path and file name.

    wchar_t lPathPtr[2048];
    size_t lPathSize = 0;
    // _wgetenv_s returns 0 on success; lPathSize stays 0 if APPDATA is not set.
    if ( _wgetenv_s( &lPathSize, lPathPtr, 2048, L"APPDATA" ) == 0 && lPathSize != 0 )
    {
        wchar_t lPath[2048];
        // StringCbPrintfW takes the destination size in bytes, not characters.
        StringCbPrintfW(lPath, sizeof(lPath), L"%s\\unicode_test.txt", lPathPtr);

        log4cxx::LayoutPtr layout(new log4cxx::SimpleLayout());
        log4cxx::FileAppenderPtr appender(new log4cxx::FileAppender(layout, lPath, true));

        LoggerPtr logger(Logger::getLogger("MyApp"));
        logger->addAppender( appender );

        wchar_t lString[512];
        mTextBox.GetWindowTextW( lString, 512 );
        LOG4CXX_INFO(logger, lString );
    }

However if I create a configuration file that looks like this...

log4j.rootLogger=info, R
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=${TESTENV}
log4j.appender.R.MaxFileSize=100KB
log4j.appender.R.MaxBackupIndex=1
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%r %p %t %c - %m%n

...and open that configuration file with a code snippet that looks like this...
     LoggerPtr logger(Logger::getLogger("MyApp"));
     PropertyConfigurator::configure("test.logcfg");
     LOG4CXX_INFO(logger, "Entering application.");

... the file name ends up being garbled with funny characters.
Obviously, if I replace ${TESTENV} with ${APPDATA}/unicode_test.txt, I don't see any output because the garbled folder path doesn't exist. This seems to be the case for European characters outside the ASCII range as well (like the German umlaut or the French acute accent).

Since I do have a copy of the log4cxx source code, is there a place you would recommend I start if I were to work around the issue by modifying log4cxx?

My main challenge is that I need a default configuration file that outputs to a location on Windows that doesn't require elevated UAC privileges. This wouldn't work for our international customers who have non-Western-European characters in their user name. The problem is further complicated by the fact that our main library (unlike the test applications I created to isolate the problem) delay-loads log4cxx, and manually creating the layout and appender was giving us grief.

Thanks again for your valuable time.
Jason



-----Original Message-----
From: Curt Arnold [mailto:curt.arnld@gmail.com] On Behalf Of Curt Arnold
Sent: Thursday, February 03, 2011 12:33 AM
To: Log4CXX User
Subject: Re: PropertyConfigurator::configure file encoding


Re: PropertyConfigurator::configure file encoding

Posted by Curt Arnold <ca...@apache.org>.
Property files in Java are by definition in ISO-8859-1, which cannot represent Chinese characters without using escape sequences (see http://download.oracle.com/javase/6/docs/api/java/util/Properties.html). log4cxx follows this convention so that it is compatible with log4j configuration files.
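For illustration, the Java properties format writes such characters as \uXXXX, one escape per UTF-16 code unit. The sketch below (C++11, with the hypothetical name escapeForProperties) produces that form for a property value. Whether the log4cxx 0.10.0 parser decodes these escapes is not established in this thread, and escaping does not help when the value comes from an environment variable at run time.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes UTF-16 code units as text that is safe in an
// ISO-8859-1 properties file, using the Java \uXXXX escape syntax.
// It covers property values only; key-specific escapes are not handled.
std::string escapeForProperties(const std::u16string& text)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (std::u16string::size_type i = 0; i < text.size(); ++i) {
        const unsigned int unit = text[i];
        if (unit == '\\') {
            out += "\\\\";                   // a literal backslash is doubled
        } else if (unit >= 0x20 && unit < 0x7F) {
            out += static_cast<char>(unit);  // printable ASCII passes through
        } else {
            out += "\\u";                    // everything else is escaped
            out += hex[(unit >> 12) & 0xF];
            out += hex[(unit >> 8) & 0xF];
            out += hex[(unit >> 4) & 0xF];
            out += hex[unit & 0xF];
        }
    }
    return out;
}
```

A character outside the Basic Multilingual Plane arrives as two code units (a surrogate pair) and therefore becomes two consecutive escapes, which matches the Java convention.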

However, the issue is the substitution of the contents of the APPDATA environment variable into the configuration, which should occur after the properties file is parsed and should happen in LogString (aka Unicode) space.

I'm guessing things are failing since the evaluation of APPDATA does not match an existing directory and therefore the appender fails.  It would be interesting to experiment with an environment variable for the file name (not the path) to see how the name is mangled.

There are a couple of things that would be very useful to know:

What operating system and version is being used?
What is the default character encoding (control panel or $ locale charmap)?
What settings are used to build log4cxx?
What is the observed behavior when using environment variables for the filename (not the path)? What was the expected behavior?

I'm pretty confident that the property files are always correctly interpreted as ISO-8859-1 regardless of the default encoding.

log4cxx depends on APR to get the environment variables and for file I/O, so something unexpected could be happening there, or log4cxx could be mangling the substitution.