Posted to dev@httpd.apache.org by Leonid Antonenkov <an...@olis.ru> on 2002/05/16 12:14:59 UTC

New "full" log format definition

Hello all!

There are a lot of programs that analyze access log files and
generate statistics reports.

The current most detailed log format ("combined") doesn't contain some
interesting fields
(which could be analyzed by such systems).

Yes, developers of these systems
may ask users to define a custom log format,
but in my opinion global standardization is the better way.

I think it would be better to define a _standard_ "full" log format
and add it to the standard configuration file
(which is included in the installation package)
next to the definitions of the "combined", "common", "referer" and "agent" formats.

I think the following fields are useful.
(In my opinion, all of them should be included.
If somebody doesn't need some of them, he (or she) can ignore them.)

%t:  Time, in common log format time format (standard English format)

%a:  Remote IP-address
%h:  Remote host
%l:  Remote logname (from identd, if supplied)

%A:  Local IP-address 
%p:  The canonical Port of the server serving the request 

%v:  The canonical ServerName of the server serving the request.
%V:  The server name according to the UseCanonicalName setting.
%U:  The URL path requested, not including any query string. 
%q:  The query string (prepended with a ? if a query string exists,
otherwise an empty string) 
%f:  Filename 

%r:  First line of request 
%H:  The request protocol 
%m:  The request method 
%{User-Agent}i: Signature of user agent.

%{Referer}i: Referer document URL

%P:  The process ID of the child that serviced the request. 

%{cookie}n: User tracking cookie
%u:  Remote user (from auth; may be bogus if return status (%s) is 401) 

%s:  Status. For requests that got internally redirected, this is the
status of the *original* request
%>s: Last status for requests that got internally redirected. 
%B:  Bytes sent, excluding HTTP headers. 
%D:  The time taken to serve the request, in microseconds. 
%{Content-type}o: mime-type of response content

%X: Connection status when response is completed. 
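
For illustration, the fields listed above could be assembled into a single
LogFormat directive roughly as follows. This is only a sketch in Python: the
helper name build_logformat and the single-space separator are assumptions,
not part of the proposal. Only the field list and the nickname "full" come
from the text above.

```python
# Field specifiers copied from the list above, in the same order.
FULL_FIELDS = [
    "%t", "%a", "%h", "%l", "%A", "%p", "%v", "%V", "%U", "%q", "%f",
    "%r", "%H", "%m", "%{User-Agent}i", "%{Referer}i", "%P",
    "%{cookie}n", "%u", "%s", "%>s", "%B", "%D", "%{Content-type}o", "%X",
]


def build_logformat(fields, nickname="full", sep=" "):
    """Return a LogFormat directive line (hypothetical helper).

    The separator is an assumption; the proposal does not fix one here.
    """
    return 'LogFormat "{}" {}'.format(sep.join(fields), nickname)


if __name__ == "__main__":
    print(build_logformat(FULL_FIELDS))
```

With a plain space as the separator, values that themselves contain spaces
(such as %r or the User-Agent header) would still be ambiguous to a parser.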

Sincerely yours,
  Leonid Antonenkov

  Email: antonenk@olis.ru
ICQ UIN: 23637980

Re: New "full" log format definition

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 08:53 AM 5/20/2002, you wrote:
>Is this topic interesting to anybody besides me?

Not here... here's why.

>In my opinion, it's a very useful thing, and it's easy to implement...

When it comes to a "Complete" reference, stop and look at the number
of HTTP headers that might be interesting.  Look at the breadth of SSL
tokens available from mod_ssl.  It is reasonably incomprehensible to try
and inventory "EVERY" relevant field that might spew forth from servers
today.  Not to mention the incidental log bloat.

What has worked rather well, is just what has happened for some time.
We define several formats in the default httpd.conf file.  Two of these are
readily recognized by most statistic gathering tools.  Better tools allow
for the user to customize their list of fields.  Certainly, if you install one
of these, it might provide a "Comprehensive" format to format the logs for
the stats it can summarize, recommended for Apache {or IIS, or iPlanet}.

Since there is really little limit to how many fields might be logged, it
becomes an exercise in futility to create the one "recommended complete
log format string".  A noble goal?  Yes... but probably not realistic.



Re: New "full" log format definition

Posted by Leonid Antonenkov <an...@olis.ru>.
Is this topic interesting to anybody besides me?
In my opinion, it's a very useful thing, and it's easy to implement...

Anyway, here are two more fields for the "full" format:

%{X-Forwarded-For}i: list of proxy client IP addresses (separated by
','; the value 'unknown' may be used in some cases)
(thanks to Tony Finch)

%{Content-Range}o: partial content position

Sincerely yours,
  Leonid Antonenkov

  Email: antonenk@olis.ru
ICQ UIN: 23637980

Re: Fields delimiter in new "full" log format

Posted by Tony Finch <do...@dotat.at>.
On Thu, May 16, 2002 at 03:50:39PM +0400, Leonid Antonenkov wrote:
> 
> Not the best option, but, I think, better than the current solution
> is to define a long delimiter made of rare characters (for example ' |:| ')
> 
> The format string would look like this:
> LogFormat "%h |:| %l |:| %u |:| %t |:| %r |:| %>s |:| %b |:| %{Referer}i |:| %{User-Agent}i" new_combined

That doesn't avoid the need for escaping. Before the current effort
to properly escape log lines and to be strict about the request line
format (and you also need to be strict about all other request headers),
the only way to reliably log in Apache is to use a newline delimiter
between fields and a double newline between log records, so instead of
one line a log record would be multiple lines. This is more convenient
for software than for humans, though...
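
A minimal sketch of reading such a log, written in Python. It assumes a
format string that puts "\n" between fields and ends with one extra "\n", so
that records are separated by a blank line. It also assumes that no field is
ever empty and that no field contains a newline itself, since either would be
mistaken for a boundary. The function name parse_records and the sample data
are hypothetical.

```python
def parse_records(text):
    """Split a log into records (blank-line separated) and fields (one per line)."""
    records = []
    for chunk in text.split("\n\n"):
        chunk = chunk.strip("\n")
        if chunk:
            records.append(chunk.split("\n"))
    return records


if __name__ == "__main__":
    # Spaces and double quotes inside a field need no special handling here.
    sample = (
        '192.0.2.1\n-\nGET /a b"c HTTP/1.0\n200\n\n'
        "192.0.2.2\nalice\nGET / HTTP/1.0\n304\n\n"
    )
    for record in parse_records(sample):
        print(record)
```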

I also suggest adding X-Forwarded-For to your full log, to get some
details of proxy forwarding.

Tony.
-- 
f.a.n.finch <do...@dotat.at> http://dotat.at/
SHANNON: SOUTHERLY 5 TO 7, DECREASING 4 IN EAST. RAIN THEN SHOWERS. MODERATE
BECOMING GOOD.

Fields delimiter in new "full" log format

Posted by Leonid Antonenkov <an...@olis.ru>.
Hello all!

Leonid Antonenkov wrote:
> I think it would be better to define a _standard_ "full" log format

About the field delimiter in the new "full" format:

In the "combined" log format all fields are separated by a space character.
Some of the fields (for example, %r - first line of request)
may contain a space, so such fields are enclosed in '"' characters.

The problem is in parsing log files:
if there is a '"' character in the first line of the request,
the results of parsing will be unpredictable.
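
A small Python sketch of the problem, assuming the request line is written to
the log unescaped. The regular expression is one naive way a tool might parse
"combined" lines, not the method of any particular analyzer. The sample
addresses, URLs and user agent are made up.

```python
import re

# Naive pattern for: %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
COMBINED = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]*)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)


def parse_combined(line):
    """Return the nine fields as a tuple, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groups() if match else None


if __name__ == "__main__":
    good = ('192.0.2.1 - - [16/May/2002:12:14:59 +0400] '
            '"GET /index.html HTTP/1.0" 200 512 "-" "Mozilla/4.0"')
    # Same request, but with a '"' inside the requested path.
    bad = ('192.0.2.1 - - [16/May/2002:12:14:59 +0400] '
           '"GET /a"b HTTP/1.0" 200 512 "-" "Mozilla/4.0"')
    print(parse_combined(good))
    print(parse_combined(bad))
```

Here the second line is rejected outright. A more lenient parser might
instead accept it and assign the wrong text to the wrong fields.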

In my opinion, the best way to store such data structures
(without changing the main concept, without using XML, etc.)
is to store fields separated by a 'separator' (tab, for example),
enclosed by an 'encloser' ('"', for example)
and escaped with an 'escaper' ('\', for example).
But this way requires changing the Apache code and so on.
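
The separator/encloser/escaper idea can be sketched in Python as follows. This
only illustrates the encoding itself, not Apache code. The names
encode_record and decode_record are hypothetical, and the sketch assumes one
record per line with no newline characters inside fields.

```python
SEP, ENC, ESC = "\t", '"', "\\"


def encode_record(fields):
    """Escape the escaper and encloser in each field, enclose it, join with SEP."""
    out = []
    for field in fields:
        escaped = field.replace(ESC, ESC + ESC).replace(ENC, ESC + ENC)
        out.append(ENC + escaped + ENC)
    return SEP.join(out)


def decode_record(line):
    """Inverse of encode_record for a single line."""
    fields, buf, in_field, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if not in_field:
            if ch == ENC:          # opening encloser; separators are skipped
                in_field = True
        elif ch == ESC and i + 1 < len(line):
            buf.append(line[i + 1])  # take the escaped character literally
            i += 1
        elif ch == ENC:            # closing encloser ends the field
            fields.append("".join(buf))
            buf, in_field = [], False
        else:
            buf.append(ch)
        i += 1
    return fields


if __name__ == "__main__":
    record = ['GET /a"b HTTP/1.0', "with\ttab", "back\\slash", ""]
    line = encode_record(record)
    print(line)
    print(decode_record(line) == record)
```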

Not the best option, but, I think, better than the current solution
is to define a long delimiter made of rare characters (for example ' |:| ')

The format string would look like this:
LogFormat "%h |:| %l |:| %u |:| %t |:| %r |:| %>s |:| %b |:| %{Referer}i |:| %{User-Agent}i" new_combined
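
A minimal Python sketch of reading lines written with this format string. The
helper name split_record and the sample lines are hypothetical. It also shows
the limit of the approach: the delimiter is only rare, not impossible, so a
client that sends ' |:| ' in its request still shifts the fields.

```python
DELIM = " |:| "
EXPECTED_FIELDS = 9  # %h %l %u %t %r %>s %b Referer User-Agent


def split_record(line):
    """Split one log line on the long delimiter."""
    return line.rstrip("\n").split(DELIM)


if __name__ == "__main__":
    normal = DELIM.join([
        "192.0.2.1", "-", "-", "[16/May/2002:15:50:39 +0400]",
        "GET /index.html HTTP/1.0", "200", "512", "-", "Mozilla/4.0",
    ])
    # The request line itself contains the delimiter.
    hostile = DELIM.join([
        "192.0.2.1", "-", "-", "[16/May/2002:15:50:39 +0400]",
        "GET /x |:| y HTTP/1.0", "200", "512", "-", "Mozilla/4.0",
    ])
    for line in (normal, hostile):
        fields = split_record(line)
        print(len(fields) == EXPECTED_FIELDS, fields)
```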

Sincerely yours,
  Leonid Antonenkov

  Email: antonenk@olis.ru
ICQ UIN: 23637980