You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@logging.apache.org by Robert Middleton <rm...@apache.org> on 2021/06/29 01:17:14 UTC

Rolling tests / writing X log messages per second

Is there a reliable way to ensure that X number of log messages get
written out per second?  I ask because one of the problems that we
have with log4cxx is that the TimeBasedRollingTest randomly fails(on
OSX).  I've narrowed it down to the following function:

void logMsgAndSleep(Pool & pool,
size_t howOften,
std::string srcFunc,
size_t srcLine,
size_t startWith = 0,
float waitFactor = 0.5)
{

for (size_t i = startWith; i < startWith + howOften; ++i)
{
std::string message("Hello---");
message.append(pool.itoa(i));

LOG4CXX_DEBUG(logger, message);
apr_sleep(APR_USEC_PER_SEC * waitFactor);
}

}

The issue seems to be that due to various scheduling whims, we may not
always get the correct number of messages printed out per second, so
the test will fail.  Is there a better way to do this, perhaps
something in log4j?

-Robert Middleton

Re: Rolling tests / writing X log messages per second

Posted by Robert Middleton <os...@gmail.com>.

That does sound like a good idea.  We can probably replace all of the calls
to APR for the time with their C++ standard equivalents as well to allow
for easy mocking.

-Robert Middleton

On Mon, Jul 5, 2021 at 3:26 AM Volkan Yazıcı <vo...@gmail.com>
wrote:

> I totally agree with Matt on this. Once the code starts changing behaviour
> via the system time, it becomes practically impossible to decently test it.
> If you can mock the clock, testing becomes tractable. To the best of my
> knowledge, that is what we mostly do in Log4j too.
>
> On Sun, Jul 4, 2021 at 8:37 PM Matt Sicker <bo...@gmail.com> wrote:
>
> > When testing the passage of time, you can always try mocking the clock
> > itself to simulate things like that. In some cases, use of count down
> > latches or cyclic barriers can coordinate the various async actions
> rather
> > than relying on clocks. Once you start testing and modeling these things
> > like that, I think the scheduling bugs and any other weird interactions
> > will become far easier to discover and fix.
> >
> > On Thu, Jul 1, 2021 at 19:09 Robert Middleton <os...@gmail.com>
> wrote:
> >
> > > Firing events at a certain rate.  The test configures the rolling file
> > > appender to roll over once a second.
> > >
> > > Specifically, the test ensures that the following works:
> > > * Write log line 1 <-- second 1
> > > * Write log line 2 <-- second 1
> > > * Write log line 3, causes rollover <-- second 2
> > > * Write log line 4 <-- second 2
> > > * Write log line 5, causes rollover  <-- second 3
> > > ....etc...
> > >
> > > Under certain conditions(mostly when running under Github actions on
> > > OSX) this can result in the file having 1 or 3 log messages in it,
> > > instead of exactly 2.  So it almost certainly has to do with the
> > > particular scheduling and sleep interaction.
> > >
> > > I'm thinking that the best thing to do is to change up the algorithm
> > > slightly so that it does
> > > * write log line 1
> > > * write log line 2
> > > * wait until the next second
> > > * write log line 3, causing rollover
> > > ....etc...
> > >
> > > -Robert Middleton
> > >
> > > On Thu, Jul 1, 2021 at 3:22 PM Volkan Yazıcı <vo...@gmail.com>
> > > wrote:
> > > >
> > > > What do you exactly mean by "ensuring"? Checking if the event rate
> was
> > > > correct? Or rather firing events at a certain rate? For the former,
> you
> > > can
> > > > collect all the events, split the collection into buckets of seconds,
> > and
> > > > count the number of events in each bucket. For the latter, did you
> try
> > > busy
> > > > spinning rather than apr_sleep to avoid context switch costs? If I
> > would
> > > > want to implement this in Java, I would use a stock rate limiter, and
> > > shoot
> > > > using that. That is,
> > > >
> > > > RateLimiter rateLimiter = RateLimiter.ofPermitsPerSecond(10);
> > > > for (int i = 0; i < l; i++) {
> > > >     rateLimiter.acquire(); // This will block if the permit is not
> > > > available.
> > > >     // You might want to sleep here some random amount of time, just
> to
> > > > fluctuate the rate over time.
> > > >     logger.debug("event {}", i);
> > > > }
> > > >
> > > > For RateLimiter, I would either use Guava or Resilience4j. I actually
> > > have a
> > > > single-class rate limiter implementation
> > > > <https://gist.github.com/vy/2f1ff259e0814c4bf32ee89f3a26dfff> that I
> > > stole
> > > > from Resilience4j:
> > > >
> > > > On Tue, Jun 29, 2021 at 3:17 AM Robert Middleton <
> > rmiddleton@apache.org>
> > > > wrote:
> > > >
> > > > > Is there a reliable way to ensure that X number of log messages get
> > > > > written out per second?  I ask because one of the problems that we
> > > > > have with log4cxx is that the TimeBasedRollingTest randomly
> fails(on
> > > > > OSX).  I've narrowed it down to the following function:
> > > > >
> > > > > void logMsgAndSleep(Pool & pool,
> > > > > size_t howOften,
> > > > > std::string srcFunc,
> > > > > size_t srcLine,
> > > > > size_t startWith = 0,
> > > > > float waitFactor = 0.5)
> > > > > {
> > > > >
> > > > > for (size_t i = startWith; i < startWith + howOften; ++i)
> > > > > {
> > > > > std::string message("Hello---");
> > > > > message.append(pool.itoa(i));
> > > > >
> > > > > LOG4CXX_DEBUG(logger, message);
> > > > > apr_sleep(APR_USEC_PER_SEC * waitFactor);
> > > > > }
> > > > >
> > > > > }
> > > > >
> > > > > The issue seems to be that due to various scheduling whims, we may
> > not
> > > > > always get the correct number of messages printed out per second,
> so
> > > > > the test will fail.  Is there a better way to do this, perhaps
> > > > > something in log4j?
> > > > >
> > > > > -Robert Middleton
> > > > >
> > >
> >
>

Re: Rolling tests / writing X log messages per second

Posted by Volkan Yazıcı <vo...@gmail.com>.

I totally agree with Matt on this. Once the code starts changing behaviour
via the system time, it becomes practically impossible to decently test it.
If you can mock the clock, testing becomes tractable. To the best of my
knowledge, that is what we mostly do in Log4j too.

On Sun, Jul 4, 2021 at 8:37 PM Matt Sicker <bo...@gmail.com> wrote:

> When testing the passage of time, you can always try mocking the clock
> itself to simulate things like that. In some cases, use of count down
> latches or cyclic barriers can coordinate the various async actions rather
> than relying on clocks. Once you start testing and modeling these things
> like that, I think the scheduling bugs and any other weird interactions
> will become far easier to discover and fix.
>
> On Thu, Jul 1, 2021 at 19:09 Robert Middleton <os...@gmail.com> wrote:
>
> > Firing events at a certain rate.  The test configures the rolling file
> > appender to roll over once a second.
> >
> > Specifically, the test ensures that the following works:
> > * Write log line 1 <-- second 1
> > * Write log line 2 <-- second 1
> > * Write log line 3, causes rollover <-- second 2
> > * Write log line 4 <-- second 2
> > * Write log line 5, causes rollover  <-- second 3
> > ....etc...
> >
> > Under certain conditions(mostly when running under Github actions on
> > OSX) this can result in the file having 1 or 3 log messages in it,
> > instead of exactly 2.  So it almost certainly has to do with the
> > particular scheduling and sleep interaction.
> >
> > I'm thinking that the best thing to do is to change up the algorithm
> > slightly so that it does
> > * write log line 1
> > * write log line 2
> > * wait until the next second
> > * write log line 3, causing rollover
> > ....etc...
> >
> > -Robert Middleton
> >
> > On Thu, Jul 1, 2021 at 3:22 PM Volkan Yazıcı <vo...@gmail.com>
> > wrote:
> > >
> > > What do you exactly mean by "ensuring"? Checking if the event rate was
> > > correct? Or rather firing events at a certain rate? For the former, you
> > can
> > > collect all the events, split the collection into buckets of seconds,
> and
> > > count the number of events in each bucket. For the latter, did you try
> > busy
> > > spinning rather than apr_sleep to avoid context switch costs? If I
> would
> > > want to implement this in Java, I would use a stock rate limiter, and
> > shoot
> > > using that. That is,
> > >
> > > RateLimiter rateLimiter = RateLimiter.ofPermitsPerSecond(10);
> > > for (int i = 0; i < l; i++) {
> > >     rateLimiter.acquire(); // This will block if the permit is not
> > > available.
> > >     // You might want to sleep here some random amount of time, just to
> > > fluctuate the rate over time.
> > >     logger.debug("event {}", i);
> > > }
> > >
> > > For RateLimiter, I would either use Guava or Resilience4j. I actually
> > have a
> > > single-class rate limiter implementation
> > > <https://gist.github.com/vy/2f1ff259e0814c4bf32ee89f3a26dfff> that I
> > stole
> > > from Resilience4j:
> > >
> > > On Tue, Jun 29, 2021 at 3:17 AM Robert Middleton <
> rmiddleton@apache.org>
> > > wrote:
> > >
> > > > Is there a reliable way to ensure that X number of log messages get
> > > > written out per second?  I ask because one of the problems that we
> > > > have with log4cxx is that the TimeBasedRollingTest randomly fails(on
> > > > OSX).  I've narrowed it down to the following function:
> > > >
> > > > void logMsgAndSleep(Pool & pool,
> > > > size_t howOften,
> > > > std::string srcFunc,
> > > > size_t srcLine,
> > > > size_t startWith = 0,
> > > > float waitFactor = 0.5)
> > > > {
> > > >
> > > > for (size_t i = startWith; i < startWith + howOften; ++i)
> > > > {
> > > > std::string message("Hello---");
> > > > message.append(pool.itoa(i));
> > > >
> > > > LOG4CXX_DEBUG(logger, message);
> > > > apr_sleep(APR_USEC_PER_SEC * waitFactor);
> > > > }
> > > >
> > > > }
> > > >
> > > > The issue seems to be that due to various scheduling whims, we may
> not
> > > > always get the correct number of messages printed out per second, so
> > > > the test will fail.  Is there a better way to do this, perhaps
> > > > something in log4j?
> > > >
> > > > -Robert Middleton
> > > >
> >
>

Re: Rolling tests / writing X log messages per second

Posted by Matt Sicker <bo...@gmail.com>.

When testing the passage of time, you can always try mocking the clock
itself to simulate things like that. In some cases, use of count down
latches or cyclic barriers can coordinate the various async actions rather
than relying on clocks. Once you start testing and modeling these things
like that, I think the scheduling bugs and any other weird interactions
will become far easier to discover and fix.

On Thu, Jul 1, 2021 at 19:09 Robert Middleton <os...@gmail.com> wrote:

> Firing events at a certain rate.  The test configures the rolling file
> appender to roll over once a second.
>
> Specifically, the test ensures that the following works:
> * Write log line 1 <-- second 1
> * Write log line 2 <-- second 1
> * Write log line 3, causes rollover <-- second 2
> * Write log line 4 <-- second 2
> * Write log line 5, causes rollover  <-- second 3
> ....etc...
>
> Under certain conditions(mostly when running under Github actions on
> OSX) this can result in the file having 1 or 3 log messages in it,
> instead of exactly 2.  So it almost certainly has to do with the
> particular scheduling and sleep interaction.
>
> I'm thinking that the best thing to do is to change up the algorithm
> slightly so that it does
> * write log line 1
> * write log line 2
> * wait until the next second
> * write log line 3, causing rollover
> ....etc...
>
> -Robert Middleton
>
> On Thu, Jul 1, 2021 at 3:22 PM Volkan Yazıcı <vo...@gmail.com>
> wrote:
> >
> > What do you exactly mean by "ensuring"? Checking if the event rate was
> > correct? Or rather firing events at a certain rate? For the former, you
> can
> > collect all the events, split the collection into buckets of seconds, and
> > count the number of events in each bucket. For the latter, did you try
> busy
> > spinning rather than apr_sleep to avoid context switch costs? If I would
> > want to implement this in Java, I would use a stock rate limiter, and
> shoot
> > using that. That is,
> >
> > RateLimiter rateLimiter = RateLimiter.ofPermitsPerSecond(10);
> > for (int i = 0; i < l; i++) {
> >     rateLimiter.acquire(); // This will block if the permit is not
> > available.
> >     // You might want to sleep here some random amount of time, just to
> > fluctuate the rate over time.
> >     logger.debug("event {}", i);
> > }
> >
> > For RateLimiter, I would either use Guava or Resilience4j. I actually
> have a
> > single-class rate limiter implementation
> > <https://gist.github.com/vy/2f1ff259e0814c4bf32ee89f3a26dfff> that I
> stole
> > from Resilience4j:
> >
> > On Tue, Jun 29, 2021 at 3:17 AM Robert Middleton <rm...@apache.org>
> > wrote:
> >
> > > Is there a reliable way to ensure that X number of log messages get
> > > written out per second?  I ask because one of the problems that we
> > > have with log4cxx is that the TimeBasedRollingTest randomly fails(on
> > > OSX).  I've narrowed it down to the following function:
> > >
> > > void logMsgAndSleep(Pool & pool,
> > > size_t howOften,
> > > std::string srcFunc,
> > > size_t srcLine,
> > > size_t startWith = 0,
> > > float waitFactor = 0.5)
> > > {
> > >
> > > for (size_t i = startWith; i < startWith + howOften; ++i)
> > > {
> > > std::string message("Hello---");
> > > message.append(pool.itoa(i));
> > >
> > > LOG4CXX_DEBUG(logger, message);
> > > apr_sleep(APR_USEC_PER_SEC * waitFactor);
> > > }
> > >
> > > }
> > >
> > > The issue seems to be that due to various scheduling whims, we may not
> > > always get the correct number of messages printed out per second, so
> > > the test will fail.  Is there a better way to do this, perhaps
> > > something in log4j?
> > >
> > > -Robert Middleton
> > >
>

Re: Rolling tests / writing X log messages per second

Posted by Robert Middleton <os...@gmail.com>.

Firing events at a certain rate.  The test configures the rolling file
appender to roll over once a second.

Specifically, the test ensures that the following works:
* Write log line 1 <-- second 1
* Write log line 2 <-- second 1
* Write log line 3, causes rollover <-- second 2
* Write log line 4 <-- second 2
* Write log line 5, causes rollover  <-- second 3
....etc...

Under certain conditions(mostly when running under Github actions on
OSX) this can result in the file having 1 or 3 log messages in it,
instead of exactly 2.  So it almost certainly has to do with the
particular scheduling and sleep interaction.

I'm thinking that the best thing to do is to change up the algorithm
slightly so that it does
* write log line 1
* write log line 2
* wait until the next second
* write log line 3, causing rollover
....etc...

-Robert Middleton

On Thu, Jul 1, 2021 at 3:22 PM Volkan Yazıcı <vo...@gmail.com> wrote:
>
> What do you exactly mean by "ensuring"? Checking if the event rate was
> correct? Or rather firing events at a certain rate? For the former, you can
> collect all the events, split the collection into buckets of seconds, and
> count the number of events in each bucket. For the latter, did you try busy
> spinning rather than apr_sleep to avoid context switch costs? If I would
> want to implement this in Java, I would use a stock rate limiter, and shoot
> using that. That is,
>
> RateLimiter rateLimiter = RateLimiter.ofPermitsPerSecond(10);
> for (int i = 0; i < l; i++) {
>     rateLimiter.acquire(); // This will block if the permit is not
> available.
>     // You might want to sleep here some random amount of time, just to
> fluctuate the rate over time.
>     logger.debug("event {}", i);
> }
>
> For RateLimiter, I would either use Guava or Resilience4j. I actually have a
> single-class rate limiter implementation
> <https://gist.github.com/vy/2f1ff259e0814c4bf32ee89f3a26dfff> that I stole
> from Resilience4j:
>
> On Tue, Jun 29, 2021 at 3:17 AM Robert Middleton <rm...@apache.org>
> wrote:
>
> > Is there a reliable way to ensure that X number of log messages get
> > written out per second?  I ask because one of the problems that we
> > have with log4cxx is that the TimeBasedRollingTest randomly fails(on
> > OSX).  I've narrowed it down to the following function:
> >
> > void logMsgAndSleep(Pool & pool,
> > size_t howOften,
> > std::string srcFunc,
> > size_t srcLine,
> > size_t startWith = 0,
> > float waitFactor = 0.5)
> > {
> >
> > for (size_t i = startWith; i < startWith + howOften; ++i)
> > {
> > std::string message("Hello---");
> > message.append(pool.itoa(i));
> >
> > LOG4CXX_DEBUG(logger, message);
> > apr_sleep(APR_USEC_PER_SEC * waitFactor);
> > }
> >
> > }
> >
> > The issue seems to be that due to various scheduling whims, we may not
> > always get the correct number of messages printed out per second, so
> > the test will fail.  Is there a better way to do this, perhaps
> > something in log4j?
> >
> > -Robert Middleton
> >

Re: Rolling tests / writing X log messages per second

Posted by Volkan Yazıcı <vo...@gmail.com>.

What do you exactly mean by "ensuring"? Checking if the event rate was
correct? Or rather firing events at a certain rate? For the former, you can
collect all the events, split the collection into buckets of seconds, and
count the number of events in each bucket. For the latter, did you try busy
spinning rather than apr_sleep to avoid context switch costs? If I would
want to implement this in Java, I would use a stock rate limiter, and shoot
using that. That is,

RateLimiter rateLimiter = RateLimiter.ofPermitsPerSecond(10);
for (int i = 0; i < l; i++) {
    rateLimiter.acquire(); // This will block if the permit is not
available.
    // You might want to sleep here some random amount of time, just to
fluctuate the rate over time.
    logger.debug("event {}", i);
}

For RateLimiter, I would either use Guava or Resilience4j. I actually have a
single-class rate limiter implementation
<https://gist.github.com/vy/2f1ff259e0814c4bf32ee89f3a26dfff> that I stole
from Resilience4j:

On Tue, Jun 29, 2021 at 3:17 AM Robert Middleton <rm...@apache.org>
wrote:

> Is there a reliable way to ensure that X number of log messages get
> written out per second?  I ask because one of the problems that we
> have with log4cxx is that the TimeBasedRollingTest randomly fails(on
> OSX).  I've narrowed it down to the following function:
>
> void logMsgAndSleep(Pool & pool,
> size_t howOften,
> std::string srcFunc,
> size_t srcLine,
> size_t startWith = 0,
> float waitFactor = 0.5)
> {
>
> for (size_t i = startWith; i < startWith + howOften; ++i)
> {
> std::string message("Hello---");
> message.append(pool.itoa(i));
>
> LOG4CXX_DEBUG(logger, message);
> apr_sleep(APR_USEC_PER_SEC * waitFactor);
> }
>
> }
>
> The issue seems to be that due to various scheduling whims, we may not
> always get the correct number of messages printed out per second, so
> the test will fail.  Is there a better way to do this, perhaps
> something in log4j?
>
> -Robert Middleton
>