You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by Alex Chaffee <gu...@edamame.stinky.com> on 2000/07/04 09:37:51 UTC

VOTE: Timestamps in log

I've implemented Timetamps in logs as follows:

You can turn them off entirely, turn them on as raw-msecs, or turn
them on as a clean time-date format (yyyy-MM-dd hh:mm:ss)

Default behavior: on as string, e.g.
2000-07-04 10:41:23 - PoolTcpConnector: Starting HttpConnectionHandler on 8080

With timestampFormat="msec":
962719779440 - ContextManager: Adding context Ctx( /examples )

You can also supply any arbitrary format string you want, to pass to
SimpleDateFormat.  It wasn't hard, so I just went and did it :-)

Any objections to adding this to the release branch?  It's much
cleaner than the current behavior (new Date().toString() or not at
all)

VOTE: add above to release branch?


-- 
Alex Chaffee                       mailto:alex@jguru.com
jGuru - Java News and FAQs         http://www.jguru.com/alex/
Creator of Gamelan                 http://www.gamelan.com/
Founder of Purple Technology       http://www.purpletech.com/
Curator of Stinky Art Collective   http://www.stinky.com/

RE: VOTE: Timestamps in log

Posted by Stan Bailes <st...@quadcap.com>.
I have dealt with this kind of thing in the past by working
hard to avoid calling the SimpleDateFormat methods in the 
first place.   My simple strategy is to limit calls to 'format'
to no more than one per second, doing a little bit of buffer
fiddling to handle the milliseconds specially.

Having real dates in the logs is a big win.  There's no
reason that this should be slow...

Here's a reusable wrapper around SimpleDateFormat that really
speeds things up.

Stan Bailes
Quadcap Software


-------------------------------------------------------------
/**
 * Copyright 2000 Quadcap Software.
 *
 * This software may be freely redistributed in source or binary form
 * for any purpose.
 */
import java.util.Date;

import java.text.DateFormat;
import java.text.FieldPosition;
import java.text.SimpleDateFormat;

/**
 * Fast date formatter that caches recently formatted date information
 * and uses it to avoid too-frequent calls to the underlying formatter.
 *
 * @author Stan Bailes
 */
public class FastDateFormat {
    DateFormat                  df;
    long                        lastSec         = -1;
    StringBuffer                sb              = new StringBuffer();
    FieldPosition               fp              = new FieldPosition(DateFormat.MILLISECOND_FIELD);

    public FastDateFormat(DateFormat df) {
        this.df = df;
    }

    public synchronized String format(Date d) {
        final long dt = d.getTime();
        final long ds = dt / 1000;
        if (ds != lastSec) {
            sb.setLength(0);
            df.format(d, sb, fp);
            lastSec = ds;
        } else {
            int ms = (int)(dt % 1000);
            int pos = fp.getEndIndex() - 1;
            sb.setCharAt(pos--, Character.forDigit(ms % 10, 10));
            ms /= 10;
            sb.setCharAt(pos--, Character.forDigit(ms % 10, 10));
            ms /= 10;
            sb.setCharAt(pos--, Character.forDigit(ms % 10, 10));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        FastDateFormat fdf = new FastDateFormat(sdf);
        Date d = new Date();
        final int reps = 100000;
        {
            long start = System.currentTimeMillis();
            for (int i = 0; i < reps; i++) {
                d.setTime(System.currentTimeMillis());
                fdf.format(d);
            }
            long elap = System.currentTimeMillis() - start;
            System.out.println("fast: " + elap + " elapsed");
        }
        {
            long start = System.currentTimeMillis();
            for (int i = 0; i < reps; i++) {
                d.setTime(System.currentTimeMillis());
                sdf.format(d);
            }
            long elap = System.currentTimeMillis() - start;
            System.out.println("slow: " + elap + " elapsed");
        }
    }
}



> -----Original Message-----
> From: costin@costin.dnt.ro [mailto:costin@costin.dnt.ro]On Behalf Of
> Costin Manolache
> Sent: Wednesday, July 05, 2000 12:19 PM
> To: alex@jguru.com; tomcat-dev@jakarta.apache.org
> Subject: Re: VOTE: Timestamps in log
> 
> 
> > > Time-stamping anything but the access log is a waste, and formating the
> > > date is also a waste.
> >
> > I disagree; I've often been able to track down errors, by
> > synchronizing different log file entries, thanks to timestamps.  And
> > since this is usually done with grep and less, it's good that it's
> > formatted so it's legible to a human eye.
> 
> I agree with that - my only problem is that we don't have a good way
> to format the timestamp.
> 
> 
> 
> > > Most of the performance improvements between tocmat 3.1 and 3.2 comes from
> > > reducing the GC and reusing the objects. Date formating is a big GC
> > > generator, and it's almost imposible to reuse the date formating objects
> > > with the current API ( or I coulnd't find a way ).
> >
> > The date formatter itself is actually reusable ( I made it an instance
> > var), however...
> >
> > > I would be happy to vote +1 if you find a way to format the date that
> > > allows object reuse !
> > > ( and again - it wouldn't be pre-optimization )
> >
> > Aha, a challenge! :-)
> >
> > SimpleDateFormat.format(Date date, StringBuffer toAppendTo,
> > FieldPosition pos) is what I'm using, and you're right, it seems to
> > sometimes (though not always) allocate a few StringBuffers along the
> > way, which it discards for GC.  (I'd have to track this down more
> > carefully; it makes heavy use of lookup tables for strings, and uses
> > StringBuffer.append a lot.)
> 
> > Now, the current LogEntry object has a toString() method, which
> > allocates, fills, and converts a StringBuffer, which may cause at
> > least one StringBuffer and a String to be left in the heap for each
> > message, but that happens regardless of timestamping.
> 
> > I suppose that this whole process should be tuned to use only
> > stream.write() methods and no internal Strings or StringBuffers,
> > right?  Until that happens, though, we can safely write off the logger
> > as "untuned" and hopefully the extra object or two left around by the
> > date formatter won't make it *that* much less tuned (relative to other
> > processes)... Given your tuning experience, do you (perhaps with
> > reservations) agree?
> 
> Yes.
> 
> It would be great to be the "extra object or two", it's much more.
> The logging creates more GC than the request processing, and that's
> not good, but for 3.2 I don't think we want to tune up the
> formatter.
> 
> A date formater that is efficient and can be reused is very important,
> we do need date conversion for cookies and some headers
> ( and we can't turn this off :-).
> 
> I guess we can start this dicussion again when we do have a
> date formatter.
> 
> 
> ( BTW,  try to run "ab -c 40 ", with logger turned of and on with
> date and with millis, and maybe OptimizeIt. It's very interesting).
> 
> Costin
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org
> 
> 

Re: VOTE: Timestamps in log

Posted by Costin Manolache <co...@eng.sun.com>.
> > Time-stamping anything but the access log is a waste, and formating the
> > date is also a waste.
>
> I disagree; I've often been able to track down errors, by
> synchronizing different log file entries, thanks to timestamps.  And
> since this is usually done with grep and less, it's good that it's
> formatted so it's legible to a human eye.

I agree with that - my only problem is that we don't have a good way
to format the timestamp.



> > Most of the performance improvements between tocmat 3.1 and 3.2 comes from
> > reducing the GC and reusing the objects. Date formating is a big GC
> > generator, and it's almost imposible to reuse the date formating objects
> > with the current API ( or I coulnd't find a way ).
>
> The date formatter itself is actually reusable ( I made it an instance
> var), however...
>
> > I would be happy to vote +1 if you find a way to format the date that
> > allows object reuse !
> > ( and again - it wouldn't be pre-optimization )
>
> Aha, a challenge! :-)
>
> SimpleDateFormat.format(Date date, StringBuffer toAppendTo,
> FieldPosition pos) is what I'm using, and you're right, it seems to
> sometimes (though not always) allocate a few StringBuffers along the
> way, which it discards for GC.  (I'd have to track this down more
> carefully; it makes heavy use of lookup tables for strings, and uses
> StringBuffer.append a lot.)

> Now, the current LogEntry object has a toString() method, which
> allocates, fills, and converts a StringBuffer, which may cause at
> least one StringBuffer and a String to be left in the heap for each
> message, but that happens regardless of timestamping.

> I suppose that this whole process should be tuned to use only
> stream.write() methods and no internal Strings or StringBuffers,
> right?  Until that happens, though, we can safely write off the logger
> as "untuned" and hopefully the extra object or two left around by the
> date formatter won't make it *that* much less tuned (relative to other
> processes)... Given your tuning experience, do you (perhaps with
> reservations) agree?

Yes.

It would be great to be the "extra object or two", it's much more.
The logging creates more GC than the request processing, and that's
not good, but for 3.2 I don't think we want to tune up the
formatter.

A date formater that is efficient and can be reused is very important,
we do need date conversion for cookies and some headers
( and we can't turn this off :-).

I guess we can start this dicussion again when we do have a
date formatter.


( BTW,  try to run "ab -c 40 ", with logger turned of and on with
date and with millis, and maybe OptimizeIt. It's very interesting).

Costin




Re: VOTE: Timestamps in log

Posted by Alex Chaffee <gu...@edamame.stinky.com>.
> > (Anyone replying may say +1 for "2000-07-04 13:45:11" and -1 for
> > "963636252"; if nobody but Costin objects, I'll keep it how it is by
> > default.)
> 
> AFAIK a -1 with valid reasons is a veto. Anyway, if you feel it's so
> important to have formated dates by default - I'm fine, you can go ahead.

You're right, of course; and thanks for your vote towards consensus.


On Wed, Jul 05, 2000 at 10:53:40AM -0700, costin@eng.sun.com wrote:
> > My rule of thumb in programming is "don't pre-optimize".  Most people
> > using Tomcat will not be running high-load web sites (since most
> > people in the world don't run high-load web sites) so it's better to
> > optimize for clarity over performance.  
> 
> I think this is far from a "pre-optimize" case - we are talking about a
> default option in the final product ( maybe post-optimize). 

Given that anyone who cares can turn it off in a config file, this
argument doesn't sway me.  The default should be "most useful to most
people" which I feel here means "most legible".


> Time-stamping anything but the access log is a waste, and formating the
> date is also a waste.  

I disagree; I've often been able to track down errors, by
synchronizing different log file entries, thanks to timestamps.  And
since this is usually done with grep and less, it's good that it's
formatted so it's legible to a human eye.


> > Also, given that the log writer is running in a backround thread
> > anyway (nice design, whoever wrote that, btw), it will never impact
> > the latency of a request in any case.
> 
> I spent last months tuning up tomcat, and believe me - GC has a huge
> impact in latency.

Ah. OK. Excellent point.


> Most of the performance improvements between tocmat 3.1 and 3.2 comes from
> reducing the GC and reusing the objects. Date formating is a big GC
> generator, and it's almost imposible to reuse the date formating objects
> with the current API ( or I coulnd't find a way ). 

The date formatter itself is actually reusable ( I made it an instance
var), however...

> I would be happy to vote +1 if you find a way to format the date that
> allows object reuse ! 
> ( and again - it wouldn't be pre-optimization )

Aha, a challenge! :-)

SimpleDateFormat.format(Date date, StringBuffer toAppendTo,
FieldPosition pos) is what I'm using, and you're right, it seems to
sometimes (though not always) allocate a few StringBuffers along the
way, which it discards for GC.  (I'd have to track this down more
carefully; it makes heavy use of lookup tables for strings, and uses
StringBuffer.append a lot.)

Now, the current LogEntry object has a toString() method, which
allocates, fills, and converts a StringBuffer, which may cause at
least one StringBuffer and a String to be left in the heap for each
message, but that happens regardless of timestamping.

I suppose that this whole process should be tuned to use only
stream.write() methods and no internal Strings or StringBuffers,
right?  Until that happens, though, we can safely write off the logger
as "untuned" and hopefully the extra object or two left around by the
date formatter won't make it *that* much less tuned (relative to other
processes)... Given your tuning experience, do you (perhaps with
reservations) agree?

 - Alex


-- 
Alex Chaffee                       mailto:alex@jguru.com
jGuru - Java News and FAQs         http://www.jguru.com/alex/
Creator of Gamelan                 http://www.gamelan.com/
Founder of Purple Technology       http://www.purpletech.com/
Curator of Stinky Art Collective   http://www.stinky.com/

Re: VOTE: Timestamps in log

Posted by co...@eng.sun.com.
> My rule of thumb in programming is "don't pre-optimize".  Most people
> using Tomcat will not be running high-load web sites (since most
> people in the world don't run high-load web sites) so it's better to
> optimize for clarity over performance.  

I think this is far from a "pre-optimize" case - we are talking about a
default option in the final product ( maybe post-optimize). 

Time-stamping anything but the access log is a waste, and formating the
date is also a waste.  


> Also, given that the log writer is running in a backround thread
> anyway (nice design, whoever wrote that, btw), it will never impact
> the latency of a request in any case.

I spent last months tuning up tomcat, and believe me - GC has a huge
impact in latency.

Most of the performance improvements between tocmat 3.1 and 3.2 comes from
reducing the GC and reusing the objects. Date formating is a big GC
generator, and it's almost imposible to reuse the date formating objects
with the current API ( or I coulnd't find a way ). 

> (Anyone replying may say +1 for "2000-07-04 13:45:11" and -1 for
> "963636252"; if nobody but Costin objects, I'll keep it how it is by
> default.)

AFAIK a -1 with valid reasons is a veto. Anyway, if you feel it's so
important to have formated dates by default - I'm fine, you can go ahead.

I would be happy to vote +1 if you find a way to format the date that
allows object reuse ! 
( and again - it wouldn't be pre-optimization )

Costin




Re: VOTE: Timestamps in log

Posted by Alex Chaffee <gu...@edamame.stinky.com>.
On Tue, Jul 04, 2000 at 09:06:18PM -0700, Costin Manolache wrote:
> +1 on adding it. If Sam agrees, it can go into 3.2.
> 
> I'm not sure about the default - I would like "none" or "msec" as default.
> It is still a very expensive operation, and if somebody really needs the date it
> can easily convery msec to Date in a post-processor. As long as we don't follow
> the common log format it doesn't matter what default we use.
> 

My rule of thumb in programming is "don't pre-optimize".  Most people
using Tomcat will not be running high-load web sites (since most
people in the world don't run high-load web sites) so it's better to
optimize for clarity over performance.  

Also, given that the log writer is running in a backround thread
anyway (nice design, whoever wrote that, btw), it will never impact
the latency of a request in any case.

(Anyone replying may say +1 for "2000-07-04 13:45:11" and -1 for
"963636252"; if nobody but Costin objects, I'll keep it how it is by
default.)

 - Alex

-- 
Alex Chaffee                       mailto:alex@jguru.com
jGuru - Java News and FAQs         http://www.jguru.com/alex/
Creator of Gamelan                 http://www.gamelan.com/
Founder of Purple Technology       http://www.purpletech.com/
Curator of Stinky Art Collective   http://www.stinky.com/

Re: VOTE: Timestamps in log

Posted by Costin Manolache <co...@eng.sun.com>.
+1 on adding it. If Sam agrees, it can go into 3.2.

I'm not sure about the default - I would like "none" or "msec" as default.
It is still a very expensive operation, and if somebody really needs the date it
can easily convery msec to Date in a post-processor. As long as we don't follow
the common log format it doesn't matter what default we use.

Costin


Alex Chaffee wrote:

> I've implemented Timetamps in logs as follows:
>
> You can turn them off entirely, turn them on as raw-msecs, or turn
> them on as a clean time-date format (yyyy-MM-dd hh:mm:ss)
>
> Default behavior: on as string, e.g.
> 2000-07-04 10:41:23 - PoolTcpConnector: Starting HttpConnectionHandler on 8080
>
> With timestampFormat="msec":
> 962719779440 - ContextManager: Adding context Ctx( /examples )
>
> You can also supply any arbitrary format string you want, to pass to
> SimpleDateFormat.  It wasn't hard, so I just went and did it :-)
>
> Any objections to adding this to the release branch?  It's much
> cleaner than the current behavior (new Date().toString() or not at
> all)
>
> VOTE: add above to release branch?


Re: VOTE: Timestamps in log

Posted by Vasile GABURICI <ga...@ss.pub.ro>.
On Tue, 4 Jul 2000, Alex Chaffee wrote:

> 
> I've implemented Timetamps in logs as follows:
> 
> You can turn them off entirely, turn them on as raw-msecs, or turn
> them on as a clean time-date format (yyyy-MM-dd hh:mm:ss)
> 
> Default behavior: on as string, e.g.
> 2000-07-04 10:41:23 - PoolTcpConnector: Starting HttpConnectionHandler on 8080
> 
> With timestampFormat="msec":
> 962719779440 - ContextManager: Adding context Ctx( /examples )
> 
> You can also supply any arbitrary format string you want, to pass to
> SimpleDateFormat.  It wasn't hard, so I just went and did it :-)
> 
> Any objections to adding this to the release branch?  It's much
> cleaner than the current behavior (new Date().toString() or not at
> all)
> 
> VOTE: add above to release branch?
> 

	+1

> -- 
> Alex Chaffee                       mailto:alex@jguru.com
> jGuru - Java News and FAQs         http://www.jguru.com/alex/
> Creator of Gamelan                 http://www.gamelan.com/
> Founder of Purple Technology       http://www.purpletech.com/
> Curator of Stinky Art Collective   http://www.stinky.com/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org
> 
>