You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@tomcat.apache.org by Gulcu Ceki <cg...@zurich.ibm.com> on 2000/08/23 12:24:05 UTC

Re: Tomcat Logging

Hi Alex,

> I haven't quite finished my proposal, so hopefully when I do you'll
> see what I mean, but briefly...
> 
> As I understand from perusing the docs, in log4j client code specifies
> the target category, which is then directed by log4j to the right
> place.  But in Tomcat, we have common code running in many different
> contexts/webapps, and each may have its own log targets, according to
> dynamic server.xml configuration.  So we still need a higher level of
> specification of the category.

Just to set the record straight, log4j categories are configurable at
run-time. Log4j components can be configured specifying configuration
files either in key=value format or in XML format. 

Thus, different JVMs using based on the same code but using different
configuration files can have radically different logging behavior.

What happens if the same log request is made in different contexts but
within the same JVM? How do we achieve context sensitive logging then?

Here is how log4j addresses this issue:

Most real-world systems need to deal with multiple clients
simultaneously. In a typical multi-threaded implementation of such a
system, different threads will be handling different clients. Logging
is especially well suited to trace and debug complex distributed
applications. A common approach to differentiate the logging output of
one client from another is to instantiate a new separate category for
each client. This promotes the proliferation of categories and will
considerably increase the management overhead of logging.

A simpler technique is to uniquely stamp each log request initiated
form the same client interaction. Neil Harrison described this method
in "Patterns for Logging Diagnostic Messages", in Pattern Languages of
Program Design 3" edited by Martin et al.

To uniquely stamp each request, the user pushes contextual information
into the NDC, the abbreviation of Nested Diagnostic Context. Part of
the NDC class is shown below.

-----------------------------------------------------
package org.log4j;

public class NDC { 
  // Used when printing the diagnostic context
  public static String get(); 

  // Remove the top of the context from the NDC.
  public static String pop(); 

 // Add diagnostic context for the current thread.
  public static void push(String message); 

 // Remove the diagnostic context for this thread.                      
  public static void remove(); 
}
---------------------------------------------------------


The NDC is managed *per thread* as a stack of contextual information.
Note that all methods of the org.log4j.NDC class are static. Assuming
that NDC printing is enabled, every time a log request is made, the
appropriate log4j component will include the NDC for the calling
thread in the log output. This is done without the intervention of the
user. She is only responsible of placing the right information into
the NDC by using the push and pop methods at a few well-defined points
in the code.  In contrast, the per client category approach commands
extensive changes in the code.

To illustrate the point, let us take the example of a servlet
delivering content to numerous clients. The servlet can build the NDC
at the very beginning of the request before executing other code. The
contextual information can be the requesting client's host name and
other information inherent to the request, typically information
contained in cookies. Hence, even if the servlet is serving multiple
clients simultaneously, the logs initiated by the same code,
i.e. belonging to the same category can still be distinguished because
each client request will have a different NDC. Contrast this with the
complexity of passing a freshly instantiated category to all code
exercised during the client's request.

You may refer to the NumberCruncher example in log4j distribution for
a concrete example.

Admittedly, the NDC approach does not send the log output from
different contexts to different places.  This is good because it
avoids the proliferation of log files. It is also bad because it does
not fulfill your original specification, that is having each component
instance have its own separate log target. Given the NDC approach, do
you really need to have separate log targets? What do you think?
Regards, Ceki

Re: Tomcat Logging

Posted by Costin Manolache <cm...@eng.sun.com>.

Hi Ceki,

I have a different concern with tomcat logging - performance.
There are a lot of things happening in the logger - string
concatentations, date formating, queues and objects pushed/
pulled from queues.

Tomcat is moving to almost 0 memory allocation ( i.e. constant
memory, except the Socket and few small objects we can't recycle).
The problem is that as we eliminate hotspots from tomcat the logging
 starts to show up. I'm comfortable with the current logger and
I think I know how to try to improve it, but I don't know too much
about log4j - except that it has a lot of cool features ( and many
times features means harder to tune and optimize).

Have you run log4j in OptimizeIt or somethig similar ?  What is the
amount of GC and processor time ?

Even if logging happen in a separate thread, on most systems ( Apache,
IIS, etc) it have a significant impact on the server perfromance, and
java is probably not an exception.

Regards,
Costin

( and thanks for all the useful information, I learned a lot from this
thread )

> > As I understand from perusing the docs, in log4j client code specifies
> > the target category, which is then directed by log4j to the right
> > place.  But in Tomcat, we have common code running in many different
> > contexts/webapps, and each may have its own log targets, according to
> > dynamic server.xml configuration.  So we still need a higher level of
> > specification of the category.
>
> Just to set the record straight, log4j categories are configurable at
> run-time. Log4j components can be configured specifying configuration
> files either in key=value format or in XML format.
>
> Thus, different JVMs using based on the same code but using different
> configuration files can have radically different logging behavior.
>
> What happens if the same log request is made in different contexts but
> within the same JVM? How do we achieve context sensitive logging then?
>
> Here is how log4j addresses this issue:
>
> Most real-world systems need to deal with multiple clients
> simultaneously. In a typical multi-threaded implementation of such a
> system, different threads will be handling different clients. Logging
> is especially well suited to trace and debug complex distributed
> applications. A common approach to differentiate the logging output of
> one client from another is to instantiate a new separate category for
> each client. This promotes the proliferation of categories and will
> considerably increase the management overhead of logging.
>
> A simpler technique is to uniquely stamp each log request initiated
> form the same client interaction. Neil Harrison described this method
> in "Patterns for Logging Diagnostic Messages", in Pattern Languages of
> Program Design 3" edited by Martin et al.
>
> To uniquely stamp each request, the user pushes contextual information
> into the NDC, the abbreviation of Nested Diagnostic Context. Part of
> the NDC class is shown below.
>
> -----------------------------------------------------
> package org.log4j;
>
> public class NDC {
>   // Used when printing the diagnostic context
>   public static String get();
>
>   // Remove the top of the context from the NDC.
>   public static String pop();
>
>  // Add diagnostic context for the current thread.
>   public static void push(String message);
>
>  // Remove the diagnostic context for this thread.
>   public static void remove();
> }
> ---------------------------------------------------------
>
> The NDC is managed *per thread* as a stack of contextual information.
> Note that all methods of the org.log4j.NDC class are static. Assuming
> that NDC printing is enabled, every time a log request is made, the
> appropriate log4j component will include the NDC for the calling
> thread in the log output. This is done without the intervention of the
> user. She is only responsible of placing the right information into
> the NDC by using the push and pop methods at a few well-defined points
> in the code.  In contrast, the per client category approach commands
> extensive changes in the code.
>
> To illustrate the point, let us take the example of a servlet
> delivering content to numerous clients. The servlet can build the NDC
> at the very beginning of the request before executing other code. The
> contextual information can be the requesting client's host name and
> other information inherent to the request, typically information
> contained in cookies. Hence, even if the servlet is serving multiple
> clients simultaneously, the logs initiated by the same code,
> i.e. belonging to the same category can still be distinguished because
> each client request will have a different NDC. Contrast this with the
> complexity of passing a freshly instantiated category to all code
> exercised during the client's request.
>
> You may refer to the NumberCruncher example in log4j distribution for
> a concrete example.
>
> Admittedly, the NDC approach does not send the log output from
> different contexts to different places.  This is good because it
> avoids the proliferation of log files. It is also bad because it does
> not fulfill your original specification, that is having each component
> instance have its own separate log target. Given the NDC approach, do
> you really need to have separate log targets? What do you think?
> Regards, Ceki
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org

Re: Tomcat Logging

Posted by Costin Manolache <cm...@yahoo.com>.

> Yes, it's important to keep the ability to maintain multiple log
> files. One target architecture for Tomcat is an ISP hosting hundreds
> of different webapps for different clients; with multiple log files
> (one per webapp) the admin can maintain security (so people can't read
> each other's logs).

There are 2 ways to do that:
- either provide a separate log per webapp ( or group of webapps,
i.e. customer).
- or keep all logs in one place and provide an API ( or an alternate
method ) to access the logs.

The most convenient is the first, and if the log is written in the
workdir for the webapp the user can use a simple servlet to get
it ( it'll be tomcat specific ).

The only problem is scalability - an ISP has 100+ customers and
keeping 100+ open file descriptors is not a solution. Keeping a
pool will also have an impact.

I'm not saying this is a bad solution - but I don't think it scales.
It is perfect for small sites.

The second is to store all logs in a single repository and either
provide an API ( where checks for user id is done)
or do a post-processing.

Post processing means every day ( or hour ) the common log filed
is processed and individual log files are generated.
That also means a binary format can be used ( i.e. no formating
or expensive processing ) and it also means an external process
( maybe offline, separate machine ) can do the job without
affecting the production server.
This config has the minimal impact on the server performance.

Even today it is possible to switch the logger ( extending it for
example ), but I would love to see the the second solution.

( BTW, don't even try to store the logs in a database - it's
too slow.)


> I'm wary of the NDC approach; as you said, since Tomcat uses thread
> pooling, it would require activity before and after each request.
> This would surely have some impact on performance (especially if it
> requires allocating new objects -- a main part of the performance
> tuning Costin's doing is minimizing GC).

As long as the logger is configurable I think we are ok.

Keeping the memory and CPU usage low requires a lot
of discipline and effort - but we are talking about a small
server with a clear functionality, not a GUI or complex
application.


> 1. configuration (though I can imagine with a bit of cleverness we
> could delegate most or all of the log config out of server.xml and
> into log4j's config file)

Having the logger as a "standalone" bean ( i.e. no deps on tomcat) is
very important.
Having external configuration ( i.e. bean setters) proved to be
very good for tomcat, and I think it's a pattern we want to keep.
How those setters are called ( server.xml or EmbededFooApp)
is not so important.

Tomcat will have deps on logger, and will have to adapt to
it's configuration anyway.


> 2. standardizing the way a component knows which category/target to
> write to. This could be as simple as a string property (instance
> variable) with log4j category, or maybe an interface, something like

Or we can just keep it simple and trust the developers to follow
some patterns. If it's really simple to use ( like one line to
get the logger, one line to use it) we'll have more chance to have
the logger used and re-used.

LogAware is fine, but I think having a simple
Logger {
  static Logger getLogger( String category, etc)
  log( .... )
}

will be enough.


> 3. regulating "debug" level (so a component won't even execute the
> code that builds the debug log strings -- which may involve some heavy
> lifting -- this simply can't be handled by library code in Java)

For debug ( and we want to have as much debug() as possible )
I think we should use a special rule,
if( debug > level ) debug()
( maybe with static debug )

Smart compilers will remove this - and I don't think there is anything
equivalent.

Costin

Re: Tomcat Logging

Posted by "Craig R. McClanahan" <Cr...@eng.sun.com>.

Alex Chaffee wrote:

> The per-context or per-component category idea seems more sound.
> Again, it would only solve the output side of things (#3 on my
> original list); we'd still have to write Tomcat code to handle
>

I agree with the thinking that per-context (or more generally, per-container) is the
best way to organize logging output, for the reasons that you described earlier.

Alex, have you looked at how loggers are configured in Catalina (i.e. the
"jakarta-tomcat-4.0" module) yet?  They deal with your #3 issue, because they are
attached to a container -- which can be any of the entire server, a particular virtual
host, or a particular web-app -- and you can override the logger assigned to a
lower-level container if you want to, as illustrated in the default server.xml file.

The current Catalina design addresses your other issues as follows:

>
> 1. configuration (though I can imagine with a bit of cleverness we
> could delegate most or all of the log config out of server.xml and
> into log4j's config file)
>

Configuration of Catalina loggers follows the usual pattern for parsing
"conf/server.xml", where it matches a particular element -- in this case a <Logger> --
nested within a container element (<Engine>, <Host>, or <Context>).  You use the
"className" attribute to choose which Logger implementation class to load (see below
for more details), and then does the usual Java reflection to match up the rest of the
attributes with JavaBeans property setters on the Logger implementation object.  That
way, different logger implementations need not have the same configuration properties.

All Logger implementations must implement org.apache.catalina.Logger, which has log()
calls that cover the usual patterns of things you want (including the optional ability
to declare a severity level and have your logger implementation do filtering).  The
current set of available logger implementations (in package
org.apache.catalina.loggers) includes:
* Log to system output
* Log to system error
* Log to a configurable filename pattern in
  a configurable directory, with automatic
  switchover to new log files at midnight
  (the date is in the filename).

>
> 2. standardizing the way a component knows which category/target to
> write to. This could be as simple as a string property (instance
> variable) with log4j category, or maybe an interface, something like
>

Essentially every internal Catalina component is attached to a Container of some sort,
so it's real easy:

    getContainer().getLogger().log("your message goes here");

If your component is connected to a Context (say, it's the session manager), and you
have not assigned a specific Logger to this web app, the getLogger() method
automatically walks up the container hierarchy until it finds an assigned Logger.
Thus, you can change the configuration file to log all the webapps for a virtual host
together, if you want, without modifying any code.

For components that are not internal to Catalina, (such as the Jasper servlet), I would
suggest that we use the ServletContext.log() mechanism -- otherwise, it would be
introducing dependencies between Catalina and Jasper that hinder portability.  Of
course, those issues can be addressed with adapter classes that are used when Catalina
and Jasper really are configured together if we want to.

>
>  interface LogAware {
>   void setLogTarget(String category);
>  }
>
> or, better:
>
>  interface LogAware {
>   void setLogger(Logger l);
>  }
>
>  interface Logger {
>   void setCategory(String category);    // for writing directly to a log4j category
>   void setMaster(Logger logger);        // for delegating to a parent
>   void log(String msg);
>   // etc.
>  }
>
> and then appropriate helper and impl classes, and implementing it in
> all components.)
>

This is addressed differently in the current Catalina implemenation, as described
above.

>
> 3. regulating "debug" level (so a component won't even execute the
> code that builds the debug log strings -- which may involve some heavy
> lifting -- this simply can't be handled by library code in Java)
>

The org.apache.catalina.Logger interface includes the concept of a verbosity level
(with standard constants for the usual suspects:  FATAL, ERROR, WARNING, INFORMATION,
and DEBUG).  If you call a log() method that does not have a verbosity argument, the
message (or message and exception) are logged unconditionally.  Otherwise, they are
filtered based on the verbosity level you have configured for the logger.

To get around performance issues of creating log messages that I'm going to throw away,
most of the existing Catalina components have a "debug" property set to an integer
value (default = 0).  Then, you'll see code like this:

    if (debug >= 1)
        logger.log("some debugging message goes here")

so that you don't have to compute the message string, which might include string
concatenations etc., unless you are really going to use it.  But you can still do
logger-level filtering as well if you want to.

>
> and probably some more things I've forgotten :-)
>

On a long-running server, I've found the "automatic rollover to new files each day"
feature to be incredibly handy, and have built it into all of my server applications
for years.  I would suggest that we keep this concept no matter what ultimate logging
approach is chosen.

>
>  - Alex
>

Craig

Re: Tomcat Logging

Posted by Alex Chaffee <gu...@edamame.stinky.com>.

Ceki -

Thanks for the thorough reply.

Yes, it's important to keep the ability to maintain multiple log
files. One target architecture for Tomcat is an ISP hosting hundreds
of different webapps for different clients; with multiple log files
(one per webapp) the admin can maintain security (so people can't read
each other's logs).

I'm wary of the NDC approach; as you said, since Tomcat uses thread
pooling, it would require activity before and after each request.
This would surely have some impact on performance (especially if it
requires allocating new objects -- a main part of the performance
tuning Costin's doing is minimizing GC).

The per-context or per-component category idea seems more sound.
Again, it would only solve the output side of things (#3 on my
original list); we'd still have to write Tomcat code to handle 

1. configuration (though I can imagine with a bit of cleverness we
could delegate most or all of the log config out of server.xml and
into log4j's config file)

2. standardizing the way a component knows which category/target to
write to. This could be as simple as a string property (instance
variable) with log4j category, or maybe an interface, something like

 interface LogAware {
  void setLogTarget(String category);
 }

or, better:

 interface LogAware {
  void setLogger(Logger l);
 }

 interface Logger {
  void setCategory(String category);	// for writing directly to a log4j category
  void setMaster(Logger logger);	// for delegating to a parent
  void log(String msg);
  // etc.
 }

and then appropriate helper and impl classes, and implementing it in
all components.)

3. regulating "debug" level (so a component won't even execute the
code that builds the debug log strings -- which may involve some heavy
lifting -- this simply can't be handled by library code in Java)

and probably some more things I've forgotten :-)

 - Alex

P.S. This is fun!  It's really helping me flesh out my understanding
of the issues.  Hopefully, even if we don't use log4j with Tomcat, we
can each help the other's product get stronger...

-- 
Alex Chaffee                       mailto:alex@jguru.com
jGuru - Java News and FAQs         http://www.jguru.com/alex/
Creator of Gamelan                 http://www.gamelan.com/
Founder of Purple Technology       http://www.purpletech.com/
Curator of Stinky Art Collective   http://www.stinky.com/