You are viewing a plain text version of this content. The canonical link for it is here.
Posted to log4cxx-user@logging.apache.org by pe...@mu.org on 2007/06/29 15:09:26 UTC

Race condition with FileAppender.

Hi, I've been using the subversion log4cxx-10.0.0, which is currently at revision 551872.

I'm running on an ubuntu linux box running kernel 2.6.20-16, and g++ 4.1.2

I've been seeing severe race conditions between threads when logging to a FileAppender.

Basically, when multiple threads write to the same log file, some of the data is corrupt.

Here is a little program will show the issue.

I've been pouring over the docs, and the mailing list archives, but I can't find a clear explanation as to why this shouldn't work.

Any ideas what I'm doing wrong?  Thanks, Pete

----------------------------------------------------------------------------
#include <pthread.h>

#include <iostream>
#include <sstream>
#include <vector>

#include <log4cxx/logger.h>

const int NUMTHREADS = 100 ;
const int NUMWRITES = 100 ;

log4cxx::LoggerPtr thread_logger = log4cxx::Logger::getLogger("ThreadManager");
using std::cerr;
using std::endl;

void *Run( void *ptr ){

    int *thread_number = reinterpret_cast<int*>(ptr);

    for (int i=0 ; i< NUMWRITES ; i++){
        std::ostringstream msg;
        msg << "Thread[" << *thread_number << "], Write [" << ( i + 1 ) << "]";
        LOG4CXX_DEBUG(thread_logger, msg.str());
    }

    pthread_exit(NULL);

}

int 
main(int argc, char *argv[]) {

    std::vector<pthread_t> threads(NUMTHREADS);

    log4cxx::LoggerPtr rootlogger = log4cxx::Logger::getRootLogger();

    int         thread_number = 0; 
    int         *thread_cnt = 0;
    while (thread_number++ < NUMTHREADS){

        thread_cnt =  new int(thread_number);
        pthread_create(&threads[thread_number -1], NULL, Run, (void *) thread_cnt);
    }
    
    thread_number = NUMTHREADS;

    while (--thread_number >= 0){
        pthread_join(threads[thread_number], NULL);
    }

    return 0;

}
----------------------------------------------------------------------------

Here is my log4cxx.properties:

log4j.debug=true

# Loggers
log4j.rootLogger=DEBUG, rootlog
log4j.logger.ThreadManager=DEBUG, ThreadManager

# Additivity
log4j.additivity.ThreadManager=false

# rootLogger
log4j.appender.rootlog.File=root.log
log4j.appender.rootlog.Append=false
log4j.appender.rootlog=org.apache.log4j.FileAppender
log4j.appender.rootlog.layout=org.apache.log4j.PatternLayout 
log4j.appender.rootlog.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

# ThreadManager
log4j.appender.ThreadManager=org.apache.log4j.FileAppender 
log4j.appender.ThreadManager.File=ThreadManager.log
log4j.appender.ThreadManager.Append=false
log4j.appender.ThreadManager.layout=org.apache.log4j.PatternLayout 
log4j.appender.ThreadManager.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n



RE: Race condition with FileAppender.

Posted by Moshe Matitya <Mo...@Kayote.com>.
On Saturday, June 30, 2007 7:27 AM, Curt Arnold wrote:
>
> LOGCXX-129 only reported problems with AsyncAppender which 
> was badly flawed in log4j 1.2.13 and earlier and deadlock 
> reports in log4j started getting more frequent.  The log4cxx 
> AsyncAppender was pretty much a straight port of the early 
> log4j equivalent and would be  
> expected to have the same problems under stress.   The new log4j  
> AsyncAppender should be ported over to log4cxx and that is 
> one of the remaining issues on a push to release.


Please note that with the current HEAD version, it is impossible to
configure log4cxx to use AsyncAppender.  Any appender that is attached
to an AsyncAppender results in the following error message:

    No appender named [<name>] could be found.

For details, see my earlier posting:
http://marc.info/?l=log4cxx-user&m=118244780603300&w=2

Moshe

Re: Race condition with FileAppender.

Posted by pe...@mu.org.
On Sun, Jul 01, 2007 at 03:49:44AM -0700, pete@mu.org wrote:
> Hmm., that's really odd.  I can reproduce it on my two boxes with pent4
> with hyperthreading enabled, but I can't on my weenie single lapetop cpu.

I forgot to mention, that if I disable hyperthreading in the bios, then
the problem goes away.  I don't know if that helps, but there you go.

Pete

> 
> I'll try this and get back to you.
> 
> LOG4CXX_LOCALE_ENCODING_UTF8=1
> 
> Thanks,
> 
> Pete
> 
> On Sun, Jul 01, 2007 at 01:34:22AM -0500, Curt Arnold wrote:
> > I have been unable to reproduce the problem on Ubuntu Linux 6.06-1  
> > and gcc 4.0.3 running under VMWare Fusion on Mac OS/X either in  
> > single processor or double processor mode.  That is the glorious  
> > thing about race conditions, the slightly little change might hide  
> > them.  Thanks for the output when you were seeing corruption, it at  
> > least suggests that the problem isn't in the layout since all the  
> > bogus characters appear between perfectly formatted messages.  That  
> > suggests that the problem is in the FileAppender or probably more  
> > likely in the CharsetEncoder or CharsetDecoder.
> > 
> > I won't be able to work on it until Monday, however it would be  
> > helpful to know if setting LOG4CXX_LOCALE_ENCODING_UTF8=1 (which will  
> > replace the APR encoders/decoders with straight byte copies) or the  
> > equivalent manual hack to src/charsetencoder.cpp and src/ 
> > charsetdecoder.cpp changes the behavior.
> > 
> > I've also run the helgrind tool from valgrind 2.2.0 and generated a  
> > report of possible race conditions in the executed code, but I  
> > haven't had time to review them yet.
> > 
> > 
> 

Re: Race condition with FileAppender.

Posted by pe...@mu.org.
Hmm., that's really odd.  I can reproduce it on my two boxes with pent4
with hyperthreading enabled, but I can't on my weenie single lapetop cpu.

I'll try this and get back to you.

LOG4CXX_LOCALE_ENCODING_UTF8=1

Thanks,

Pete

On Sun, Jul 01, 2007 at 01:34:22AM -0500, Curt Arnold wrote:
> I have been unable to reproduce the problem on Ubuntu Linux 6.06-1  
> and gcc 4.0.3 running under VMWare Fusion on Mac OS/X either in  
> single processor or double processor mode.  That is the glorious  
> thing about race conditions, the slightly little change might hide  
> them.  Thanks for the output when you were seeing corruption, it at  
> least suggests that the problem isn't in the layout since all the  
> bogus characters appear between perfectly formatted messages.  That  
> suggests that the problem is in the FileAppender or probably more  
> likely in the CharsetEncoder or CharsetDecoder.
> 
> I won't be able to work on it until Monday, however it would be  
> helpful to know if setting LOG4CXX_LOCALE_ENCODING_UTF8=1 (which will  
> replace the APR encoders/decoders with straight byte copies) or the  
> equivalent manual hack to src/charsetencoder.cpp and src/ 
> charsetdecoder.cpp changes the behavior.
> 
> I've also run the helgrind tool from valgrind 2.2.0 and generated a  
> report of possible race conditions in the executed code, but I  
> haven't had time to review them yet.
> 
> 

Re: Race condition with FileAppender.

Posted by Curt Arnold <ca...@apache.org>.
I have been unable to reproduce the problem on Ubuntu Linux 6.06-1  
and gcc 4.0.3 running under VMWare Fusion on Mac OS/X either in  
single processor or double processor mode.  That is the glorious  
thing about race conditions, the slightly little change might hide  
them.  Thanks for the output when you were seeing corruption, it at  
least suggests that the problem isn't in the layout since all the  
bogus characters appear between perfectly formatted messages.  That  
suggests that the problem is in the FileAppender or probably more  
likely in the CharsetEncoder or CharsetDecoder.

I won't be able to work on it until Monday, however it would be  
helpful to know if setting LOG4CXX_LOCALE_ENCODING_UTF8=1 (which will  
replace the APR encoders/decoders with straight byte copies) or the  
equivalent manual hack to src/charsetencoder.cpp and src/ 
charsetdecoder.cpp changes the behavior.

I've also run the helgrind tool from valgrind 2.2.0 and generated a  
report of possible race conditions in the executed code, but I  
haven't had time to review them yet.



Re: Race condition with FileAppender.

Posted by pe...@mu.org.
On Sat, Jun 30, 2007 at 12:32:11AM -0500, Curt Arnold wrote:
> What indicates to you that the problem is in FileAppender and not in  
> PatternLayout?  I would think that PatternLayout would be a might  
> more likely source of problems.  Have you tried other appenders  
> (telnet might be reasonable) and not had the problem?  Have you tried  
> a custom layout (all the stock layouts are preconfigured  
> PatternLayout's)?  Does the same issue still appear if you simplify  
> the pattern (maybe down to just %m or %m%n)?

I'm still very new to log4cxx internals and all the options.  I just
learned enough to get it to do what I needed---back with log4cxx9.  

So, I'm afraid I don't have any insight at all into what might be
causing the race condition.  And I haven't used any other appenders or
layouts.

I tried to just put %m%n in the pattern, per your request,  and I've attached the results of the root log so you can see what I'm talking about.  As you'll see
there are lots of valid lines, then just junk, then more good lines.
See attached.

I've also added my log4cxx.properties.  If you want to make any changes
and define some different appenders or whatever, and mail it back, I can run it for you and send you the results.

Please let me know if there is anything else I can do.

Thanks,

Pete






Re: Race condition with FileAppender.

Posted by Curt Arnold <ca...@apache.org>.
On Jun 29, 2007, at 11:37 PM, pete@mu.org wrote:

> If you just build the small program that I sent and link it to the  
> subversion log4cxx, then check out the log files it creates.   
> They're full of junk along with valid lines too.
>
> It does it everytime.  I'm working on a threaded project that uses  
> 100's
> of threads, and it was always working great with log4cxx, then I
> upgraded to subversion source, and it started leaving garbage in the
> logs.
>
> Tonight, I finally decided to go back and reinstall the standard
> log4cxx-9 libs, and link with them.  It completely fixed my problem.
>
> Have you tried to run that program I sent you and link it against the
> newest subversion log4cxx?
>
> Please try and and let me know what you find in the log files.
>
> That's the best example you could have of the problem.
>
> Thanks,
>
> Pete
>

I appreciate the test code, but have not had a chance to attempt to  
build and run it.  I can't rapidly switch projects without endlessly  
churning so when I'm active on log4j development, I usually only  
answer questions or give hints until I finish whatever I'm currently  
working on.  I expect to do so tomorrow morning (US Central Time,  
UTC-5).  If you have any other insights in the interim please post  
them.  I do appreciate the specifics on platform and compiler, that  
helps a huge amount.

What indicates to you that the problem is in FileAppender and not in  
PatternLayout?  I would think that PatternLayout would be a might  
more likely source of problems.  Have you tried other appenders  
(telnet might be reasonable) and not had the problem?  Have you tried  
a custom layout (all the stock layouts are preconfigured  
PatternLayout's)?  Does the same issue still appear if you simplify  
the pattern (maybe down to just %m or %m%n)?




Re: Race condition with FileAppender.

Posted by pe...@mu.org.
If you just build the small program that I sent and link it to the subversion log4cxx, then check out the log files it creates.  They're full of junk along with valid lines too.

It does it everytime.  I'm working on a threaded project that uses 100's
of threads, and it was always working great with log4cxx, then I
upgraded to subversion source, and it started leaving garbage in the
logs.

Tonight, I finally decided to go back and reinstall the standard
log4cxx-9 libs, and link with them.  It completely fixed my problem.

Have you tried to run that program I sent you and link it against the
newest subversion log4cxx?

Please try and and let me know what you find in the log files.

That's the best example you could have of the problem.

Thanks,

Pete

On Fri, Jun 29, 2007 at 11:26:37PM -0500, Curt Arnold wrote:
> 
> On Jun 29, 2007, at 10:53 PM, pete@mu.org wrote:
> 
> >Nevermind.  This is a known bug.
> >
> >http://issues.apache.org/jira/browse/LOGCXX-129
> >
> >I should of just stayed with 97 which was working fine.
> >
> >Thanks,
> >
> >Pete
> >
> 
> 
> LOGCXX-129 only reported problems with AsyncAppender which was badly  
> flawed in log4j 1.2.13 and earlier and deadlock reports in log4j  
> started getting more frequent.  The log4cxx AsyncAppender was pretty  
> much a straight port of the early log4j equivalent and would be  
> expected to have the same problems under stress.   The new log4j  
> AsyncAppender should be ported over to log4cxx and that is one of the  
> remaining issues on a push to release.
> 
> However, your test case does not appear to involve AsyncAppender and  
> the anticipated resolution for LOGCXX-129 would have no impact.  If  
> you have any stack traces or other information to help localize the  
> problem, they would be appreciated.  Otherwise, it will be my top  
> issue to investigate when I shift to log4cxx development tomorrow.
> 

Re: Race condition with FileAppender.

Posted by Curt Arnold <ca...@apache.org>.
On Jun 29, 2007, at 10:53 PM, pete@mu.org wrote:

> Nevermind.  This is a known bug.
>
> http://issues.apache.org/jira/browse/LOGCXX-129
>
> I should of just stayed with 97 which was working fine.
>
> Thanks,
>
> Pete
>


LOGCXX-129 only reported problems with AsyncAppender which was badly  
flawed in log4j 1.2.13 and earlier and deadlock reports in log4j  
started getting more frequent.  The log4cxx AsyncAppender was pretty  
much a straight port of the early log4j equivalent and would be  
expected to have the same problems under stress.   The new log4j  
AsyncAppender should be ported over to log4cxx and that is one of the  
remaining issues on a push to release.

However, your test case does not appear to involve AsyncAppender and  
the anticipated resolution for LOGCXX-129 would have no impact.  If  
you have any stack traces or other information to help localize the  
problem, they would be appreciated.  Otherwise, it will be my top  
issue to investigate when I shift to log4cxx development tomorrow.


Re: Race condition with FileAppender.

Posted by pe...@mu.org.
Nevermind.  This is a known bug.

http://issues.apache.org/jira/browse/LOGCXX-129

I should of just stayed with 97 which was working fine.

Thanks,

Pete




On Fri, Jun 29, 2007 at 06:09:26AM -0700, pete@mu.org wrote:
> 
> Hi, I've been using the subversion log4cxx-10.0.0, which is currently at revision 551872.
> 
> I'm running on an ubuntu linux box running kernel 2.6.20-16, and g++ 4.1.2
> 
> I've been seeing severe race conditions between threads when logging to a FileAppender.
> 
> Basically, when multiple threads write to the same log file, some of the data is corrupt.
> 
> Here is a little program will show the issue.
> 
> I've been pouring over the docs, and the mailing list archives, but I can't find a clear explanation as to why this shouldn't work.
> 
> Any ideas what I'm doing wrong?  Thanks, Pete
> 
> ----------------------------------------------------------------------------
> #include <pthread.h>
> 
> #include <iostream>
> #include <sstream>
> #include <vector>
> 
> #include <log4cxx/logger.h>
> 
> const int NUMTHREADS = 100 ;
> const int NUMWRITES = 100 ;
> 
> log4cxx::LoggerPtr thread_logger = log4cxx::Logger::getLogger("ThreadManager");
> using std::cerr;
> using std::endl;
> 
> void *Run( void *ptr ){
> 
>     int *thread_number = reinterpret_cast<int*>(ptr);
> 
>     for (int i=0 ; i< NUMWRITES ; i++){
>         std::ostringstream msg;
>         msg << "Thread[" << *thread_number << "], Write [" << ( i + 1 ) << "]";
>         LOG4CXX_DEBUG(thread_logger, msg.str());
>     }
> 
>     pthread_exit(NULL);
> 
> }
> 
> int 
> main(int argc, char *argv[]) {
> 
>     std::vector<pthread_t> threads(NUMTHREADS);
> 
>     log4cxx::LoggerPtr rootlogger = log4cxx::Logger::getRootLogger();
> 
>     int         thread_number = 0; 
>     int         *thread_cnt = 0;
>     while (thread_number++ < NUMTHREADS){
> 
>         thread_cnt =  new int(thread_number);
>         pthread_create(&threads[thread_number -1], NULL, Run, (void *) thread_cnt);
>     }
>     
>     thread_number = NUMTHREADS;
> 
>     while (--thread_number >= 0){
>         pthread_join(threads[thread_number], NULL);
>     }
> 
>     return 0;
> 
> }
> ----------------------------------------------------------------------------
> 
> Here is my log4cxx.properties:
> 
> log4j.debug=true
> 
> # Loggers
> log4j.rootLogger=DEBUG, rootlog
> log4j.logger.ThreadManager=DEBUG, ThreadManager
> 
> # Additivity
> log4j.additivity.ThreadManager=false
> 
> # rootLogger
> log4j.appender.rootlog.File=root.log
> log4j.appender.rootlog.Append=false
> log4j.appender.rootlog=org.apache.log4j.FileAppender
> log4j.appender.rootlog.layout=org.apache.log4j.PatternLayout 
> log4j.appender.rootlog.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
> 
> # ThreadManager
> log4j.appender.ThreadManager=org.apache.log4j.FileAppender 
> log4j.appender.ThreadManager.File=ThreadManager.log
> log4j.appender.ThreadManager.Append=false
> log4j.appender.ThreadManager.layout=org.apache.log4j.PatternLayout 
> log4j.appender.ThreadManager.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
> 
>