Posted to users@httpd.apache.org by Glen Stormbind <gl...@nuws.net> on 2002/09/21 16:28:28 UTC

[users@httpd] Tweaking Access.log

Hi,

I'm new to Apache so I'm not familiar with the logs but I don't like
what I see.

I wrote a custom content management system that relies on the path_info
var in mod_include. The following log entries come from viewing the URL
(/path_info/hw.article/000,000,000.html?PAGE3) but Apache has added an
_extra_ entry both times.

Unless I'm mistaken, this additional log entry will ruin the value of site
statistics and double the size of the log.

Is there any way to fix this problem? I would prefer to use an
AccessFile instead of httpd.conf.

Extract from Access.log: 

127.0.0.1 - - [20/Sep/2002:20:33:24 +0100] "GET /path_info/hw.article/ HTTP/1.0" 200 2323
127.0.0.1 - - [20/Sep/2002:20:33:48 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.0" 200 6842
127.0.0.1 - - [20/Sep/2002:20:33:48 +0100] "GET /path_info/hw.article/ HTTP/1.0" 200 2323
127.0.0.1 - - [20/Sep/2002:20:34:47 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6842

Thanks for your time :)
--Glen



---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Tweaking Access.log

Posted by Glen Stormbind <gl...@nuws.net>.
Ak! I discovered something much more problematic when migrating my work
from one server to another so this small issue has dropped in my list of
priorities like a lead balloon.

Hi Joshua,

I don't think it's the browser as I have used NS4 and IE5 with the same
result.

I'm sorry for being a novice but I don't know how to telnet. I only know
the service is not currently enabled on my WinXP system.
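
Where no telnet client is to hand, a hand-built request can also be sent from a short script. The following is only a minimal sketch, assuming Python is installed and that the server listens on localhost port 80 (both are assumptions, not details from this thread); build_request and send_raw are illustrative names.

```python
import socket

def build_request(path, host="localhost"):
    """Build the raw bytes of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")

def send_raw(path, host="localhost", port=80, timeout=5.0):
    """Send exactly one request and return the raw response bytes.

    host and port are assumptions; adjust them to the server under test.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(path, host))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)

# Show the bytes that would go over the wire for the URL in question.
request = build_request("/path_info/hw.article/000,000,000.html?PAGE3")
print(request.decode("ascii"))

# To actually send it (needs a running server), one could call:
#   raw = send_raw("/path_info/hw.article/000,000,000.html?PAGE3")
#   print(raw.split(b"\r\n", 1)[0].decode("latin-1"))  # status line
```

Because the script sends a single request and fetches no images or linked pages, any second entry that still appears in the log for that moment cannot have come from a browser.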

It looks like it, but the script should not be making any reference to
the directory. If there are no other possible causes, I will assume the
script is at fault and will go bug squashing. Thanks for pointing me in
a direction :)

This is 'one' request for /hw.article/

127.0.0.1 - - [21/Sep/2002:18:03:07 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
127.0.0.1 - - [21/Sep/2002:18:03:07 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329

And here is 'one' request for /hw.article (no trailing slash)

127.0.0.1 - - [21/Sep/2002:18:05:07 +0100] "GET /path_info/hw.article HTTP/1.1" 200 2333
127.0.0.1 - - [21/Sep/2002:18:05:07 +0100] "GET /path_info/ HTTP/1.1" 403 280

As you can see, it is not requesting the script twice. It is clearly
requesting the file once and the directory once.

Joshua Slive wrote:
> 
> Glen Stormbind wrote:
> > I do apologise. The extract from access.log was not checked, here's
> > another one. I just hit refresh a couple of times, as you can see the
> > times for each entry do match.
> >
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
> 
> The times match, but the transfer size differs (2329 versus 6907), so
> this is clearly not exactly the same page.  Are you sure this isn't
> something funky that your browser is doing or perhaps the script itself?
>   Have you tried constructing a manual request using telnet?
> 
> Joshua.


Re: [users@httpd] Tweaking Access.log

Posted by Glen Stormbind <gl...@nuws.net>.
Thanks for the tip Gilles :)

--Glen

Gilles Gros wrote:
> 
> You should also set the log to the combined log format:
> "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
> 
> It will give you more clues about who is doing what.
> 
> Gilles





RE: [users@httpd] Tweaking Access.log

Posted by Gilles Gros <gi...@whitepj.com>.
You should also set the log to the combined log format:
"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""

It will give you more clues about who is doing what.
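
Once the log is in that format, the two extra fields can be pulled out with a small script. This is a rough sketch only, assuming Python; the Referer and User-Agent values in the sample entry are invented for illustration, and the pattern does not handle escaped quotes inside a field.

```python
import re

# Pattern for the combined format quoted above:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

# Hypothetical entry: the last two quoted fields are made up.
sample = (
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] '
    '"GET /path_info/hw.article/ HTTP/1.1" 200 2329 '
    '"http://localhost/path_info/hw.article/000,000,000.html?PAGE3" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows NT)"'
)

entry = parse_combined(sample)
print(entry["request"])
print("  referred by:", entry["referer"])
print("  user agent: ", entry["agent"])
```

If the extra request carries the article page as its Referer, that would suggest something on the page itself is asking for the directory.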

Gilles

> -----Original Message-----
> From: Joshua Slive [mailto:joshua@slive.ca]
> Sent: Saturday, September 21, 2002 9:56 AM
> To: users@httpd.apache.org
> Subject: Re: [users@httpd] Tweaking Access.log
>
>
> Glen Stormbind wrote:
> > I do apologise. The extract from access.log was not checked, here's
> > another one. I just hit refresh a couple of times, as you can see the
> > times for each entry do match.
> >
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
> > 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
>
> The times match, but the transfer size differs (2329 versus 6907), so
> this is clearly not exactly the same page.  Are you sure this isn't
> something funky that your browser is doing or perhaps the script itself?
>   Have you tried constructing a manual request using telnet?
>
> Joshua.


Re: [users@httpd] Tweaking Access.log

Posted by Joshua Slive <jo...@slive.ca>.
Glen Stormbind wrote:
> I do apologise. The extract from access.log was not checked, here's
> another one. I just hit refresh a couple of times, as you can see the
> times for each entry do match.
> 
> 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
> 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299
> 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
> 127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907

The times match, but the transfer size differs (2329 versus 6907), so 
this is clearly not exactly the same page.  Are you sure this isn't 
something funky that your browser is doing or perhaps the script itself? 
  Have you tried constructing a manual request using telnet?
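
The comparison described here (same timestamp, different URL and size) can be made mechanically. A minimal sketch, assuming Python and the common log format shown in the quoted extract; distinct_requests is an illustrative name.

```python
import re
from collections import defaultdict

# Matches the common log format used in the extracts in this thread.
LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def distinct_requests(lines):
    """Map each timestamp to the set of distinct (url, size) pairs seen."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE.match(line)
        if m:
            seen[m["time"]].add((m["url"], m["size"]))
    return dict(seen)

# The four quoted entries, each joined onto a single line.
log = [
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907',
]

result = distinct_requests(log)
for when, requests in result.items():
    print(when)
    for url, size in sorted(requests):
        print(f"  {size:>6}  {url}")
```

For the quoted extract this leaves three distinct resources within the same second: the article page, the image, and the directory URL.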

Joshua.


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] Tweaking Access.log

Posted by Glen Stormbind <gl...@nuws.net>.
I do apologise. The extract from access.log was not checked, here's
another one. I just hit refresh a couple of times, as you can see the
times for each entry do match.

127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299
127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
127.0.0.1 - - [21/Sep/2002:17:43:59 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299
127.0.0.1 - - [21/Sep/2002:17:43:59 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
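
To see how entries like these would feed into per-URL hit counts, they can be tallied with a few lines of script. This is just a sketch, assuming Python; hits_per_url is an illustrative name and real statistics packages may count differently.

```python
import re
from collections import Counter

# Pull the URL out of the quoted request line, e.g. "GET /some/url HTTP/1.1".
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (\S+) [^"]*"')

def hits_per_url(lines):
    """Count how many log entries name each requested URL."""
    counts = Counter()
    for line in lines:
        m = REQUEST.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# The six entries above, each joined onto a single line.
log = [
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329',
    '127.0.0.1 - - [21/Sep/2002:17:43:58 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907',
    '127.0.0.1 - - [21/Sep/2002:17:43:59 +0100] "GET /path_info/testcms/image.jpg HTTP/1.1" 200 6299',
    '127.0.0.1 - - [21/Sep/2002:17:43:59 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329',
]

counts = hits_per_url(log)
for url, n in counts.most_common():
    print(n, url)
```

On these six entries each URL is counted twice, so /path_info/hw.article/ scores as many hits as the article page even though only the article was viewed.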


Glen Stormbind wrote:
> 
> Well it's this... when requesting only the page
> /hw.article/000,000,000.html - the server is recording two requests. One
> to /hw.article/000,000,000.html and one to /hw.article/
> 
> So after a day's worth of requesting, the URL /hw.article/ will be seen
> as the most popular page by quite a margin even though it's possible
> that nobody has requested to see its contents.
> 
> I'm jumping the gun, but I assume this will distort the stats and double
> the number of recorded hits?
> 
> If you tell me I am mistaken then I will accept that, but at the moment
> I don't feel confident about the accuracy of the log entries :(
> 
> 127.0.0.1 - - [21/Sep/2002:17:02:07 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
> 127.0.0.1 - - [21/Sep/2002:17:02:47 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907
> 
> Joshua Slive wrote:
> >
> > Glen Stormbind wrote:
> > > Hi,
> > >
> > > I'm new to Apache so I'm not familiar with the logs but I don't like
> > > what I see.
> > >
> > > I wrote a custom content management system that relies on the path_info
> > > var in mod_include. The following log entries come from viewing the URL
> > > (/path_info/hw.article/000,000,000.html?PAGE3) but Apache has added an
> > > _extra_ entry both times.
> >
> > You'll need to give us more information.  I don't see any obvious
> > duplicate entries.  Either the times or the request-line is different
> > for each entry.
> >
> > Joshua.


Re: [users@httpd] Tweaking Access.log

Posted by Glen Stormbind <gl...@nuws.net>.
Well it's this... when requesting only the page
/hw.article/000,000,000.html - the server is recording two requests. One
to /hw.article/000,000,000.html and one to /hw.article/

So after a day's worth of requesting, the URL /hw.article/ will be seen
as the most popular page by quite a margin even though it's possible
that nobody has requested to see its contents.

I'm jumping the gun, but I assume this will distort the stats and double
the number of recorded hits?

If you tell me I am mistaken then I will accept that, but at the moment
I don't feel confident about the accuracy of the log entries :(

127.0.0.1 - - [21/Sep/2002:17:02:07 +0100] "GET /path_info/hw.article/ HTTP/1.1" 200 2329
127.0.0.1 - - [21/Sep/2002:17:02:47 +0100] "GET /path_info/hw.article/000,000,000.html?PAGE3 HTTP/1.1" 200 6907

Joshua Slive wrote:
> 
> Glen Stormbind wrote:
> > Hi,
> >
> > I'm new to Apache so I'm not familiar with the logs but I don't like
> > what I see.
> >
> > I wrote a custom content management system that relies on the path_info
> > var in mod_include. The following log entries come from viewing the URL
> > (/path_info/hw.article/000,000,000.html?PAGE3) but Apache has added an
> > _extra_ entry both times.
> 
> You'll need to give us more information.  I don't see any obvious
> duplicate entries.  Either the times or the request-line is different
> for each entry.
> 
> Joshua.


Re: [users@httpd] Tweaking Access.log

Posted by Joshua Slive <jo...@slive.ca>.
Glen Stormbind wrote:
> Hi,
> 
> I'm new to Apache so I'm not familiar with the logs but I don't like
> what I see.
> 
> I wrote a custom content management system that relies on the path_info
> var in mod_include. The following log entries come from viewing the URL
> (/path_info/hw.article/000,000,000.html?PAGE3) but Apache has added an
> _extra_ entry both times.

You'll need to give us more information.  I don't see any obvious 
duplicate entries.  Either the times or the request-line is different
for each entry.

Joshua.


