Posted to dev@httpd.apache.org by Brian Akins <br...@turner.com> on 2005/10/03 17:40:27 UTC

[PATCH] Re: Pluggable mod_log_config

Akins, Brian wrote:

> CustomLog mysql://something common env=images
> CustomLog file:///logs/my.log combined
> CustomLog spread://somegroup refere
> CustomLog buffer:///logs/other.log common

This patch implements the above.  Within mod_log_config two providers
are provided: file and buffer.  If no "scheme" is given, file is assumed.
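
For readers following along, the scheme handling described above can be sketched in a few lines of Python. This is only an illustration of the "no scheme means file" rule, not code from the patch; the function name split_log_target is made up.

```python
# Hypothetical sketch: split a CustomLog target into (scheme, location).
# Nothing here is taken from the patch itself.

def split_log_target(target, default_scheme="file"):
    """Return (scheme, location); targets without "scheme://" use the default."""
    scheme, sep, rest = target.partition("://")
    if not sep or not scheme.isalnum():
        # No usable scheme, e.g. "logs/access_log" or "|/bin/foo":
        # fall back to the default provider and keep the target untouched.
        return default_scheme, target
    return scheme, rest


for example in ("file:///logs/my.log", "spread://somegroup", "logs/access_log"):
    print(example, "->", split_log_target(example))
```

A real implementation would then look up the provider registered under the returned scheme and hand it the location string.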

I have tested it and, preliminarily, it works for both file and buffer.  I
had to rearrange some of the buffer code to get it to work "non-globally."

It should be just as easy to write custom log handles. With this patch,
different custom loggers could handle different log "files."

Comments?

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Ondrej Sury <on...@sury.org>.

On Mon, 2005-10-03 at 15:40 -0400, Brian Akins wrote:
> Ondrej Sury wrote:
> 
> > Just a quick thought, maybe we should add:
> > 
> > ap_log_writer_close *close;
> > 
> > to struct log_provider_t.  It's not absolutely necessary, because you
> > can use apr_pool_cleanup_register(...), but it will make writing addon
> > modules much cleaner.
> 
> That's debatable.  I already use apr_pool_cleanup_register in a lot of
> stuff, so it seems "natural" to me.  I guess we could register a cleanup
> on each cls that called close or we could let the module developer do
> it...  I'm leaning toward just letting others register the cleanup on
> their own like buffered logs does now.

The disadvantage of having it outside cls->provider is that you have to keep
a list of open connections/files/sockets/etc. somewhere, and you will be
duplicating this list if you keep it outside of multi_log_state.

Anyway, it's not a big deal, since connections to the server will probably
be opened in init_child.

Ondrej.
-- 
Ondrej Sury <on...@sury.org>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Ondrej Sury wrote:

> Just a quick thought, maybe we should add:
> 
> ap_log_writer_close *close;
> 
> to struct log_provider_t.  It's not absolutely necessary, because you
> can use apr_pool_cleanup_register(...), but it will make writing addon
> modules much cleaner.

That's debatable.  I already use apr_pool_cleanup_register in a lot of
stuff, so it seems "natural" to me.  I guess we could register a cleanup
on each cls that called close or we could let the module developer do
it...  I'm leaning toward just letting others register the cleanup on
their own like buffered logs does now.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Ondrej Sury <on...@sury.org>.

On Mon, 2005-10-03 at 11:40 -0400, Brian Akins wrote:
> Akins, Brian wrote:
> 
> > CustomLog mysql://something common env=images
> > CustomLog file:///logs/my.log combined
> > CustomLog spread://somegroup refere
> > CustomLog buffer:///logs/other.log common
> 
> 
> This patch implements the above.  Within mod_log_config two providers
> are provided: file and buffer.  If no "scheme" is given, file is assumed.
> 
> I have tested it and, preliminarily, it works for both file and buffer.  I
> had to rearrange some of the buffer code to get it to work "non-globally."
> 
> It should be just as easy to write custom log handles. With this patch,
> different custom loggers could handle different log "files."
> 
> Comments?

Wow, you're fast :-).

Just a quick thought, maybe we should add:

ap_log_writer_close *close;

to struct log_provider_t.  It's not absolutely necessary, because you
can use apr_pool_cleanup_register(...), but it will make writing addon
modules much cleaner.

Ondrej.
-- 
Ondrej Sury <on...@sury.org>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Ondrej Sury <on...@sury.org>.

On Tue, 2005-10-04 at 13:55 +0100, Colm MacCarthaigh wrote:
> On Tue, Oct 04, 2005 at 08:37:08AM -0400, Brian Akins wrote:
> > In this case, from my patches:
> > 
> > LogFormat "INSERT INTO foo VALUES ('%h', '%l');" foo-sql
> > 
> > CustomLog mysql://user:password@host/database foo-sql
> > 
> > and the mysql module would get the arrays of strings and lengths.  At
> > init time, it would have prepared the format sql.  At log time, it would
> > bind and execute.
> 
> No, I don't think it's so simple at all. Although it would have to parse
> the SQL at init time (how does LogFormat know to ask the MySQL provider
> to do this?) that can't be simply untied from the actions at log-time;
> in the case of SQL for example you have to be very pedantic about the
> escaping, so that a request for "/foo\'; nasty sql;" doesn't kill us. 

Well, I think you can do something like this:

LogFormat "%h %l" foo
MySQLFormat "INSERT INTO bar SET hostname = '$1', logname = '$2'" bar

MySQLHost mysql0 localhost

CustomLog mysql://mysql0/bar foo

Then ap_mysql_log_writer would do its own processing of the strs/strl
combo.
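
To make the idea concrete, here is a rough Python sketch of what such a provider could do with a '$1'/'$2' template: compile it once into a statement with "?" markers, then pick the matching strs entries at log time. The function names are hypothetical, and no MySQLFormat directive exists today.

```python
import re

# Hypothetical sketch of the MySQLFormat idea; nothing here is a real httpd API.
_PLACEHOLDER = re.compile(r"'\$(\d+)'")


def compile_template(template):
    """Turn "... '$1' ... '$2'" into a "?"-style statement plus an index list."""
    order = []

    def repl(match):
        order.append(int(match.group(1)) - 1)  # '$1' refers to strs[0]
        return "?"

    return _PLACEHOLDER.sub(repl, template), order


def bind_values(order, strs):
    """Select the formatted log items in the order the statement expects."""
    return [strs[i] for i in order]


statement, order = compile_template(
    "INSERT INTO bar SET hostname = '$1', logname = '$2'"
)
print(statement)
print(bind_values(order, ["192.0.2.1", "-"]))
```

Because the values are handed over as bound parameters rather than pasted into the SQL text, the quoting around '$1' disappears from the statement and escaping becomes the database client's job.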

> And we still need additional per-provider directives to provide the
> database, hostname, username and so on information.

That's true.

> If such directives are required anyway, what effort are we saving?
> Why not just replace mod_log_config rather than plug into it?

Because there is a lot of functionality in mod_log_config that would otherwise
need to be reimplemented, and which can be reused in other modules.

Ondrej.
-- 
Ondrej Sury <on...@sury.org>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Tue, Oct 04, 2005 at 09:15:32AM -0400, Brian Akins wrote:
> Why not just replace mod_log_config rather than plug into it?
> 
> Because mod_log_config handles a bunch of stuff for us.  Maybe a
> better solution would be to use the standard log_config way of
> replacing init and writer and replace them with a pluggable one.

O.k., well if people really want to do this, using schemes and providers
is an elegant way, we would just have to signpost the road to handling
the formats. 

> >Though if all of this text parsing is getting expensive, I wonder would
> >anyone be interested in a protocol for binary logging from httpd?
> 
> Hmmm. Interesting. Especially for things like spread, this would make
> a lot of sense.

Whether it represents even more encoding or not is hard to tell though
:)

> >>CustomLog /logs/site.sock common
> >
> >You re-implemented syslog :-) I've done the same myself for our hosting
> >service, we use syslog-ng to do all sorts of weird things with the info. 
> 
> but syslog can't, on its own, distinguish between virtuals (same problem
> as piped loggers).  I may have to look at syslog-ng.  Can you maybe
> off-list share some of your techniques?

Once you see syslog-ng's way of doing things, it becomes fairly
self-explanatory. We use its regular expression engine for logging
vhosts separately and so on, but we also have a centralised log file
which logs everything we consider important (which is basically errors
we have not explicitly filtered out) across all hosts.

We also use its mysql output features to put this latter class of logging
information in a transient database, so that we can integrate and
monitor the important, potentially critical, events within our NMS more
easily.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Colm MacCarthaigh wrote:
> On Tue, Oct 04, 2005 at 08:37:08AM -0400, Brian Akins wrote:
>
> 
> No, I don't think it's so simple at all. Although it would have to parse
> the SQL at init time (how does LogFormat know to ask the MySQL provider
> to do this?) 

LogFormat doesn't, but the init of the provider could.

> that can't be simply untied from the actions at log-time;
> in the case of SQL for example you have to be very pedantic about the
> escaping, so that a request for "/foo\'; nasty sql;" doesn't kill us. 

That's what the binding ensures.
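
As a small illustration of that point, here is a sketch in Python using the standard sqlite3 module as a stand-in for MySQL (an assumption made purely so the example is self-contained; the table and function names are made up). The statement text is fixed at "init time" and the request string is only ever passed as a bound value.

```python
import sqlite3

# Sketch only: sqlite3 stands in for a MySQL client library here.


def make_logger(conn):
    # "Init time": create the table and fix the statement text once.
    # (sqlite3 caches the compiled statement internally.)
    conn.execute("CREATE TABLE IF NOT EXISTS foo (host TEXT, request TEXT)")
    statement = "INSERT INTO foo VALUES (?, ?)"

    def log(host, request):
        # "Log time": values are bound, never spliced into the SQL text.
        conn.execute(statement, (host, request))
        conn.commit()

    return log


conn = sqlite3.connect(":memory:")
log = make_logger(conn)
log("192.0.2.1", "/foo\\'; nasty sql;")
print(conn.execute("SELECT host, request FROM foo").fetchall())
```

The hostile request ends up stored as an ordinary string; it is never interpreted as SQL.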

> And we still need additional per-provider directives to provide the
> database, hostname, username and so on information. 

No, the init of the provider worries about the parameters.  In fact, the
mysql module may just have separate directives to set this:

SqlLoggerUsername User
SqlLoggerPassword password
....

> If such directives
> are required anyway, what effort are we saving? Why not just replace
> mod_log_config rather than plug into it?

Because mod_log_config handles a bunch of stuff for us.  Maybe a better
solution would be to use the standard log_config way of replacing init
and writer and replace them with a pluggable one.

> Though if all of this text parsing is getting expensive, I wonder would
> anyone be interested in a protocol for binary logging from httpd?

Hmmm. Interesting. Especially for things like spread, this would make
a lot of sense.

> 
>>CustomLog /logs/site.sock common
> 
> 
> You re-implemented syslog :-) I've done the same myself for our hosting
> service, we use syslog-ng to do all sorts of weird things with the info. 

But syslog can't, on its own, distinguish between virtuals (same problem
as piped loggers).  I may have to look at syslog-ng.  Can you maybe
off-list share some of your techniques?

> 
>>define "damn busy." 
> 
> 
> We frequently see over 4000 requests per second being logged. Almost any
> time there's a major security update for fedora really. 

Okay, you qualify :)

> 
> Pipes themselves have little overhead, it's basically shared memory with
> a standard IO interface and automatic mutexing. The processes on the
> other side certainly need not be cumbersome - and I really like that you
> can run a pipe-logger as a different uid, in a chroot and so on, it's a
> nice place to sandbox all of that icky SQL parsing and that kind of
> thing.

True.  Parsing text to determine the virtual host just seems so "icky,"
since Apache already knows that.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Tue, Oct 04, 2005 at 08:37:08AM -0400, Brian Akins wrote:
> In this case, from my patches:
> 
> LogFormat "INSERT INTO foo VALUES ('%h', '%l');" foo-sql
> 
> CustomLog mysql://user:password@host/database foo-sql
> 
> and the mysql module would get the arrays of strings and lengths.  At
> init time, it would have prepared the format sql.  At log time, it would
> bind and execute.

No, I don't think it's so simple at all. Although it would have to parse
the SQL at init time (how does LogFormat know to ask the MySQL provider
to do this?) that can't be simply untied from the actions at log-time;
in the case of SQL for example you have to be very pedantic about the
escaping, so that a request for "/foo\'; nasty sql;" doesn't kill us. 

And we still need additional per-provider directives to provide the
database, hostname, username and so on information. If such directives
are required anyway, what effort are we saving? Why not just replace
mod_log_config rather than plug into it?

> >And after all of this, what if any, are the compelling reasons to
> >implement this in httpd at all? Why can't all of this be moved into
> >piped loggers? 
> 
> Try pipe loggers with 60 or so virtual hosts.  It doesn't scale well, as 
> we open a pipe for each virtual that defines custom log.

Ahh, that's a different problem; it's trivial to have a piped logger do
the splitting (it's what we do). 

Though if all of this text parsing is getting expensive, I wonder would
anyone be interested in a protocol for binary logging from httpd?

> CustomLog /logs/site.sock common

You re-implemented syslog :-) I've done the same myself for our hosting
service, we use syslog-ng to do all sorts of weird things with the info. 

> define "damn busy." 

We frequently see over 4000 requests per second being logged. Almost any
time there's a major security update for fedora really. 

> We may have a different scale of "busy."  I've found pipe logs with
> lots of virtuals to be less than spectacular.  In-Apache logging will
> give the best performance, but it lacks some flexibility.  That's why I
> wrote my domain socket stuff.  However, I think the patches I've
> submitted allow for in-Apache logging to be very flexible with no
> additional overhead of pipes.

Pipes themselves have little overhead, it's basically shared memory with
a standard IO interface and automatic mutexing. The processes on the
other side certainly need not be cumbersome - and I really like that you
can run a pipe-logger as a different uid, in a chroot and so on, it's a
nice place to sandbox all of that icky SQL parsing and that kind of
thing.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Jeffrey Burgoyne wrote:
> I run a system with about 250 virtual hosts averaging around 100 hits per
> second with little problem.

100 hits per second is not that much relative to what we do.  We may be 
more of an extreme case, though.

> Rather than having a piped log for each
> virtual host though, we use a single piped log shared by all virtual hosts, and
> add the host name to the log. Your logging program then has to have
> the smarts to figure out which virtual host the log is for, but that is
> relatively easy and inexpensive.

Yes, but Apache already knows which virtual host a log line belongs to, so
it seems redundant to work it out again in the external program.

I would agree that, for 90+% of users, piped logs are probably fine.  But,
for the rest of us, it seems like we ought to be able to come up with
some "standard" way of doing high-performance flexible logging.  If
for nothing else, then to have more eyes on the code and techniques.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Jeffrey Burgoyne <bu...@keenuh.com>.

I run a system with about 250 virtual hosts averaging around 100 hits per
second with little problem. Rather than having a piped log for each
virtual host though, we use a single piped log shared by all virtual hosts, and
add the host name to the log. Your logging program then has to have
the smarts to figure out which virtual host the log is for, but that is
relatively easy and inexpensive.
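
A minimal Python sketch of such a splitting logger, assuming the shared LogFormat starts with the virtual host name (for example "%v" as the first field). The function name and the sink callback are made up for illustration.

```python
import io

# Sketch of a piped logger that splits one shared log stream by virtual host.
# Assumes each line looks like "<vhost> <rest of the log line>".


def split_by_vhost(lines, open_log):
    """Route each log line to a per-vhost sink keyed by its first field."""
    sinks = {}
    for line in lines:
        vhost, _, rest = line.partition(" ")
        if not rest:
            continue  # no host prefix found; skipped in this sketch
        if vhost not in sinks:
            sinks[vhost] = open_log(vhost)  # open each sink once, on first use
        sinks[vhost].write(rest)
    return sinks


# In-memory demonstration; a real logger would read sys.stdin and open files.
demo = split_by_vhost(
    ["a.example 192.0.2.1 GET /\n", "b.example 192.0.2.7 GET /x\n"],
    lambda vhost: io.StringIO(),
)
print(sorted(demo))
```

Run as a piped logger, the same function could be fed sys.stdin and a callback that opens one append-mode file per host; the host name should be validated before it is used in a path.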

Jeffrey Burgoyne

Chief Technology Architect
KCSI Keenuh Consulting Services Inc
burgoyne@keenuh.com

On Tue, 4 Oct 2005, Brian Akins wrote:

> Colm MacCarthaigh wrote:
>
>
> Try pipe loggers with 60 or so virtual hosts.  It doesn't scale well, as
> we open a pipe for each virtual that defines custom log.
>
>
>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Colm MacCarthaigh wrote:

> But in order to save overhead, this would require some intelligence, it
> would not make much sense for the pluggable logger to re-parse this
> string everytime, to figure out what it should be doing. And where does
> it get its database, username and host information from?  Do we require
> per-provider directives? Do we hack the format more to get them in
> somehow?

In this case, from my patches:

LogFormat "INSERT INTO foo VALUES ('%h', '%l');" foo-sql

CustomLog mysql://user:password@host/database foo-sql

and the mysql module would get the arrays of strings and lengths.  At
init time, it would have prepared the format sql.  At log time, it would
bind and execute.

> And after all of this, what if any, are the compelling reasons to
> implement this in httpd at all? Why can't all of this be moved into
> piped loggers? 

Try pipe loggers with 60 or so virtual hosts.  It doesn't scale well, as 
we open a pipe for each virtual that defines custom log.

> Why can't they just parse the logs as they get them and
> do whatever whoever wants with it after that? After all, many people use
> "logger" to log to syslog. The whole mod_log_spread architecture can be
> implemented with two netcat commands, and that can even be done with
> privilege separation, which an in-httpd module never could.

What I did was something similar, although without pipes and not 
portably (at least not to windows).  I wrote a very simple log module 
that logs to Unix domain sockets:

CustomLog /logs/site.sock common

And the log "server" does whatever with it -- it's "pluggable."  I have 
a spread and an asynchronous disk one.

> On ftp.heanet.ie, we've been using piped loggers forever, our logs are
> damn busy, and we've seen our share of crashes, and these issues just
> never arise for us.

Define "damn busy."  We may have a different scale of "busy."  I've found
pipe logs with lots of virtuals to be less than spectacular.  In-Apache
logging will give the best performance, but it lacks some flexibility.
That's why I wrote my domain socket stuff.  However, I think the patches
I've submitted allow for in-Apache logging to be very flexible with no
additional overhead of pipes.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Ondrej Sury <on...@sury.org>.

On Tue, 2005-10-04 at 14:06 +0100, Colm MacCarthaigh wrote:
> On Tue, Oct 04, 2005 at 02:59:40PM +0200, Ondrej Sury wrote:
> > No, it cannot be implemented with two netcat commands (just tried it).
> 
> Sure, it can, use my Multicast Netcat;

Didn't know that anything like that exists.  My comment about restarting
frontends was based on the fact that I didn't realize that apache2
restarts piped loggers automatically.

Ondrej.
-- 
Ondrej Sury <on...@sury.org>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Candler <B....@pobox.com>.

On Thu, Oct 06, 2005 at 03:49:10PM -0400, Brian Akins wrote:
> Colm MacCarthaigh wrote:
> 
> >If httpd writes a complete line, to any kind of a file descriptor,
> >anything beyond that is out of our control and becomes a question of the
> >quality of the piped logger, filesystem or whatever else is on the other
> >side of that file-descriptor.
> 
> Maybe I'm just being difficult, but I'm still not sold on it :)  While I
> understand all that you wrote, the possibility of getting a "partial
> line" with normal piped logs irks me.  Maybe it's just a personal thing...
> 
> I prefer to have something like:
> 
> length of message, message, sync bit
> 
> (message could be many log lines)

Or you could just do read() on a pipe. If the last character you receive is
'\n' then you have a high degree of confidence that you have received an
entire line or a number of whole lines.
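
A small Python sketch of that check, written as a helper that is fed whatever each read() returns and only hands back complete lines. The class name is made up; it holds any trailing partial line until the rest arrives.

```python
# Sketch: assemble complete lines from arbitrary read() chunks.


class LineAssembler:
    """Collect chunks from read() and return only newline-terminated lines."""

    def __init__(self):
        self._pending = b""  # bytes seen after the last newline so far

    def feed(self, chunk):
        data = self._pending + chunk
        complete, newline, tail = data.rpartition(b"\n")
        self._pending = tail  # empty when the chunk ended on a newline
        if not newline:
            return []  # no newline yet, so no complete line to hand out
        return [line + b"\n" for line in complete.split(b"\n")]


assembler = LineAssembler()
print(assembler.feed(b"GET /one\nGET /tw"))
print(assembler.feed(b"o\n"))
```

With a real pipe the chunks would come from os.read() on the logger's standard input.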

I don't think it's possible for logs generated by Apache to contain newlines
mid-way (if there are any such cases, they probably should be fixed)

If you're talking about log output from a CGI, then there's nothing to
guarantee that a CGI will write anything sensible which ends with newline,
or that it will write whole lines atomically rather than writing a few
characters at a time (which therefore could get mixed up with stderr output
from other CGI processes); but equally you cannot enforce that a CGI uses
your tagged protocol either.

CGI error logs are a bit of a pain. Since Apache is the parent process, and
the CGI is the child, Apache would be a good place to:
(a) receive bytes from stderr and reformat them into whole log lines,
prefixed by timestamp, server_name etc to allow post-processing; and
(b) log the CPU time used by a CGI after it has completed

Since Apache does neither of these, I have been doing this in a hacked
version of suexec to avoid having to hack mod_cgi itself. This costs an
extra fork() in suexec though.

Regards,

Brian.

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Colm MacCarthaigh wrote:

> If httpd writes a complete line, to any kind of a file descriptor,
> anything beyond that is out of our control and becomes a question of the
> quality of the piped logger, filesystem or whatever else is on the other
> side of that file-descriptor.

Maybe I'm just being difficult, but I'm still not sold on it :)  While I
understand all that you wrote, the possibility of getting a "partial
line" with normal piped logs irks me.  Maybe it's just a personal thing...

I prefer to have something like:

length of message, message, sync bit

(message could be many log lines)

That way it's easy to ensure you get complete log entries and detect 
when you don't.  I could write a piped logger that did this, but I'd 
have to detect when the lines begin and end and mod_log_config already 
knows that.
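
One possible reading of that framing, sketched in Python. The 4-byte length header and the one-byte trailer value are assumptions chosen for the example; nothing like this exists in mod_log_config.

```python
import struct

# Sketch of "length of message, message, sync" framing (layout is hypothetical).
SYNC = b"\x7e"  # one-byte trailer standing in for the "sync bit"
_HEADER = struct.Struct("!I")  # 4-byte big-endian payload length


def encode_frame(message):
    """Wrap a payload (possibly many log lines) in a length header and trailer."""
    return _HEADER.pack(len(message)) + message + SYNC


def decode_frames(buffer):
    """Return (messages, remaining_bytes); raise ValueError on a bad trailer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= _HEADER.size:
        (length,) = _HEADER.unpack_from(buffer, offset)
        end = offset + _HEADER.size + length
        if len(buffer) < end + len(SYNC):
            break  # frame not complete yet; wait for more data
        if buffer[end:end + len(SYNC)] != SYNC:
            raise ValueError("lost synchronisation")
        messages.append(buffer[offset + _HEADER.size:end])
        offset = end + len(SYNC)
    return messages, buffer[offset:]


frame = encode_frame(b"line1\nline2\n")
print(decode_frames(frame + frame[:5]))
```

The receiver never has to look for line boundaries: an incomplete frame is simply left in the remainder, and a wrong trailer signals that data was lost or corrupted.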

Yes, for 90+% of the time, piped logs are great.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Thu, Oct 06, 2005 at 02:06:24PM -0400, Brian Akins wrote:
> My problem with piped loggers is there is no fast way to make sure you 
> have a "complete" line.  This is especially hard when you buffer the logs.

If httpd writes a complete line, to any kind of a file descriptor,
anything beyond that is out of our control and becomes a question of the
quality of the piped logger, filesystem or whatever else is on the other
side of that file-descriptor.

> Think of a situation where the piped logger is supposed to write to a 
> socket, for example.  If the piped logger has the ability to "fail over" 
> to another socket, there is a great chance that you may get partial lines.

That would be a question of the code and design quality of the piped
logger though. Having the code in-httpd just moves the question, it
wouldn't remove it. 

Within the POSIX model though, we should consider piped loggers more
reliable for similar scenarios. Piped logging and standard logging are
basically just writing to a file descriptor, one which children inherit
and in turn just write to. 

The extreme example is the stderr of a CGI - which can successfully log
to a piped logger, because POSIX makes it trivial.

Now, imagine we had a few different types of logging in-httpd, MySQL,
XML, and all that; in this scenario we can no longer simply write to an
fd, instead we have to intercept the writes and mogrify them by some
means. The first problem is that this makes error logging really really
hard, and the second and even worse problem is that when httpd is
crashing we really aren't going to want the complexity of all of that
going on.  

A piped logger, on the other hand, lives in a separate process entirely,
relatively isolated from any problems httpd itself may be having, and
can be trusted to log the information when a crash is happening.

Of course when implementation is bad, bad things happen, but that can be
applied to anything.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <ba...@web.turner.com>.

My problem with piped loggers is there is no fast way to make sure you 
have a "complete" line.  This is especially hard when you buffer the logs.

Think of a situation where the piped logger is supposed to write to a
socket, for example.  If the piped logger has the ability to "fail over"
to another socket, there is a great chance that you may get partial lines.

Our general policy is that we cannot lose any log lines.  Period.  So,
we have to be a little creative, and many of the common solutions can be
rather "lossy."



-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On 10/4/05, Colm MacCarthaigh <co...@stdlib.net> wrote:
> On Tue, Oct 04, 2005 at 02:59:40PM +0200, Ondrej Sury wrote:
> > No, it cannot be implemented with two netcat commands (just tried it).
>
> Sure, it can, use my Multicast Netcat;
>
>         http://people.heanet.ie/~colmmacc/mnc/
>
> (version 1.3 is experimental and not working right now, avoid that, and
> I'm about to publish the APR version I've been working on for a while).
>
> > Purpose of mod_log_spread is to unify several balanced frontends logging
> > to one common log server.  With netcat you can only do one to one.
>
> On the clients;
>
>         CustomLog "|mnc group-id"
>
> (you can use ordinary netcat for that part if you want).
>
> On the collector;
>
>         mnc -l group-id > logfile
>

That'll work, but spread provides some level of reliability
(retransmitting messages that are not acknowledged, etc), while mnc
appears to rely on the fact that UDP packets will not be dropped. 
Neat program though, I hadn't heard of it either...

-garrett

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Tue, Oct 04, 2005 at 02:59:40PM +0200, Ondrej Sury wrote:
> No, it cannot be implemented with two netcat commands (just tried it).

Sure, it can, use my Multicast Netcat;

	http://people.heanet.ie/~colmmacc/mnc/

(version 1.3 is experimental and not working right now, avoid that, and
I'm about to publish the APR version I've been working on for a while).

> Purpose of mod_log_spread is to unify several balanced frontends logging
> to one common log server.  With netcat you can only do one to one.

On the clients;

	CustomLog "|mnc group-id"

(you can use ordinary netcat for that part if you want). 

On the collector;

	mnc -l group-id > logfile

> The second problem is that a restart of the backend server listener must not
> cause piped loggers to fail (and thus force you to restart all frontends),
> which is what happens with netcat, for example.

I know about the restart problem, but I don't know what you mean about
this forcing you to restart frontends, or how it affects the
architecture.  It wouldn't matter using the above commands.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Ondrej Sury <on...@sury.org>.

On Mon, 2005-10-03 at 23:28 +0100, Colm MacCarthaigh wrote:
> And after all of this, what if any, are the compelling reasons to
> implement this in httpd at all? Why can't all of this be moved into
> piped loggers? Why can't they just parse the logs as they get them and
> do whatever whoever wants with it after that? After all, many people use
> "logger" to log to syslog. The whole mod_log_spread architecture can be
> implemented with two netcat commands, and that can even be done with
> privilege separation, which an in-httpd module never could.

No, it cannot be implemented with two netcat commands (just tried it).

Purpose of mod_log_spread is to unify several balanced frontends logging
to one common log server.  With netcat you can only do one to one.

The second problem is that a restart of the backend server listener must not
cause piped loggers to fail (and thus force you to restart all frontends),
which is what happens with netcat, for example.

Ondrej.
-- 
Ondrej Sury <on...@sury.org>

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Mon, Oct 03, 2005 at 11:40:27AM -0400, Brian Akins wrote:
> Akins, Brian wrote:
> 
> >CustomLog mysql://something common env=images
> >CustomLog file:///logs/my.log combined
> >CustomLog spread://somegroup refere
> >CustomLog buffer:///logs/other.log common

I've been looking more at this, and I'm kind of confused as to what the
aim is here. The line;

	"CustomLog mysql://something common env=images"

doesn't make a lot of sense to me. The "common" LogFormat is
line-oriented, but a database-driven logformat should be anything but
line-oriented. If it's to be of any real use it should have its own
schema and so on, each nugget of log information would probably be its
own field within a record, defined by a table, and so on. 

I think there's a lot more to think about, and I think these kind of
changes to CustomLog may hint at a need for slightly wider-ranging
changes, including to LogFormat. Or maybe strings will do, would we be
happy to see;

	LogFormat "INSERT INTO foo VALUES ('%h', '%l');" foo-sql

But in order to save overhead, this would require some intelligence, it
would not make much sense for the pluggable logger to re-parse this
string every time, to figure out what it should be doing. And where does
it get its database, username and host information from?  Do we require
per-provider directives? Do we hack the format more to get them in
somehow? 

And after all of this, what if any, are the compelling reasons to
implement this in httpd at all? Why can't all of this be moved into
piped loggers? Why can't they just parse the logs as they get them and
do whatever whoever wants with it after that? After all, many people use
"logger" to log to syslog. The whole mod_log_spread architecture can be
implemented with two netcat commands, and that can even be done with
privilege separation, which an in-httpd module never could.

I can understand that it is desirable to have non-blocking logging, but
there are means to deal with this, I've been testing some buffered (on
the piped logger side) piped logging that is I think more than good
enough in that regard. I can also understand that it's desirable to have
reliable logs from a httpd crash, or to guard against a piped logger
crash - but these are a question of code quality (of the piped logger)
more than anything else.

On ftp.heanet.ie, we've been using piped loggers forever, our logs are
damn busy, and we've seen our share of crashes, and these issues just
never arise for us.

What am I not getting?

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Colm MacCarthaigh wrote:

> Looks useful, but file://|/bin/foo would be very non-intuitive for piped
> loggers, balancing the backwards compatibility might need a bit more, I
> guess "pipe://" or "cmd://" schemes might make sense also. 

True.  I just used URIs as they seemed to just "fit."  I'm not
particularly tied to the configuration interface.  Any interest in me 
changing it to use "cmd://"?

Also, since the default is file://, |/some/command would behave just as 
it does now.

I would like to be able to use multiple custom loggers, rather than only 
get one.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Joshua Kogut <jm...@gmail.com>.

On 10/3/05, Brian Akins <br...@turner.com> wrote:
>
> Rüdiger Plüm wrote:
>
> >
> > Or it likes to read many lines in one block as it can commit such things
> > as a batch rather than as single commits per line. Of course it is not required
> > to use DB transactions on mysql. But for Oracle this might improve performance.

Hey! I resent that. I am sure that this topic has been the starter of many a
flame war, so let's not start another one here, but MySQL is surely faster
than Oracle!

--
|| jmkogut ||
email: jmkogut@gmail.com

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Rüdiger Plüm <r....@gmx.de>.

On 10/03/2005 09:53 PM, Brian Akins wrote:
> Rüdiger Plüm wrote:
> 

[..cut..]

> 
> but that would not be buffering in the sense that mod_log_config does
> it.  mod_log_config just keeps appending data to a buffer.  in an sql
> logger, to "buffer" you would keep a list/array/ring of log lines and
> wrap them in a transaction.  This cannot be "buffered" in mod_log_config

Agreed.

[..cut..]

> See latest patch :)

To be honest I haven't tried it yet, but does apr_uri_parse handle the spaces
between the piped log parameters correctly?
Furthermore I guess you should use

pl = ap_open_piped_log(p, name);

instead of

pl = ap_open_piped_log(p, name + 1);

The "|" does not prefix name any longer :-).

[..cut..]

> Interesting.  You could have a "linebuffer" that buffered in a way I
> describe above and a "buffer" that does it the current way.  This may be
> a larger change than we first started.  Of course, you could define a
> log filter chain:
> 
> LogFilter example "monitor param=somevalue|buffer size=1024|compress|file"
> 

Sounds like a good idea

> 
> and then apply the filter in CustomLog:
>
> CustomLog filter://example/logs/some.log common
> 
> filter could supply "/logs/some.log" as user_data to each filter.

I guess /logs/some.log is only interesting for the backend. We also
need to consider that backend providers might need totally different (compared to filenames)
parameter types. Maybe URLs with args.
So far I have found no solution that makes me really happy in conjunction with log filters, but
I will keep thinking about this.

> 
> Of course, this is a much larger change than I think we are ready to
> tackle now.
>

Yes, of course. This would require many changes and will not be done in a quick
patch, but it came to my mind during this discussion and it might be a feature to think about.

Regards

Rüdiger

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Rüdiger Plüm wrote:

> 
> Or it likes to read many lines in one block as it can commit such things
> as a batch rather than as single commits per line. Of course it is not required
> to use DB transactions on mysql. But for Oracle this might improve performance.

but that would not be buffering in the sense that mod_log_config does 
it.  mod_log_config just keeps appending data to a buffer.  in an sql 
logger, to "buffer" you would keep a list/array/ring of log lines and 
wrap them in a transaction.  This cannot be "buffered" in mod_log_config
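
A rough sketch of that kind of line-level buffering, in plain C. All
names here (line_batch, batch_add, BATCH_MAX and so on) are made up for
illustration and are not part of mod_log_config or the patch. The commit
callback is where an sql logger would issue its BEGIN, one INSERT per
line, and COMMIT; the example callback below only counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BATCH_MAX 4     /* illustrative; a real logger would configure this */
#define LINE_LEN  256   /* longer lines are truncated in this sketch */

struct line_batch;
typedef void (*batch_commit_fn)(const struct line_batch *batch, void *ctx);

struct line_batch {
    char lines[BATCH_MAX][LINE_LEN];  /* individual log lines, not one blob */
    size_t count;                     /* lines currently held */
    batch_commit_fn commit;           /* called once per full batch or flush */
    void *ctx;                        /* opaque data for the callback */
};

static void batch_init(struct line_batch *b, batch_commit_fn commit, void *ctx)
{
    b->count = 0;
    b->commit = commit;
    b->ctx = ctx;
}

/* Hand all held lines to the callback as one unit, then empty the batch. */
static void batch_flush(struct line_batch *b)
{
    if (b->count == 0)
        return;
    b->commit(b, b->ctx);
    b->count = 0;
}

/* Store one line; flush automatically when the batch is full. */
static void batch_add(struct line_batch *b, const char *line)
{
    assert(b != NULL && line != NULL);
    snprintf(b->lines[b->count], LINE_LEN, "%s", line);
    b->count++;
    if (b->count == BATCH_MAX)
        batch_flush(b);
}

/* Example callback: stands in for "wrap these lines in a transaction". */
struct commit_stats { size_t commits; size_t lines; };

static void count_commit(const struct line_batch *batch, void *ctx)
{
    struct commit_stats *s = ctx;
    s->commits++;
    s->lines += batch->count;
}
```

The point of the sketch is that the provider sees whole lines and decides
when to commit, which a byte buffer inside mod_log_config cannot give it.
A real module would also register a pool cleanup that calls batch_flush
so the last partial batch is not lost.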

> 
> Hm, the question is how to get all the parameters set with the URL approach (thinking
> of piped loggers, which you said currently do not seem to work because of the spaces).

See latest patch :)

> Maybe as named parameters in the args?
> If one likes to follow a filter approach I think URLs would not be useful, but to be
> honest I currently have no other idea in this case than the shell approach, e.g.
> 
> 
> Customlog "|monitor param=somevalue|buffer size=1024|compress |pipe /bin/foo" combined

> where the last member needs to be a backend provider like pipe, file, mysql.

interesting.  You could have a "linebuffer" that buffered in a way I 
describe above and a "buffer" that does it the current way.  This may be
a larger change than we first started.  Of course, you could define a
log filter chain:

LogFilter example "monitor param=somevalue|buffer size=1024|compress|file"

and then apply the filter in CustomLog:

CustomLog filter://example/logs/some.log common

filter could supply "/logs/some.log" as user_data to each filter.

Of course, this is a much larger change than I think we are ready to 
tackle now.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Rüdiger Plüm <r....@gmx.de>.

On 10/03/2005 09:11 PM, Brian Akins wrote:
> Rüdiger Plüm wrote:
> 
>> That also makes sense to me, since piped logging does not really play
>> well with other things like spread or mysql. On the other hand, what
>> about buffered logging? Would it make sense to make it possible to turn
>> this on and off with each provider? If yes, we might have schemes such as:
> 
> 
> It seems to me this would be provider specific.

I think buffering is on a higher level and thus it might not be needed
nor useful to reimplement this in every provider. But if we want to
follow this way more strictly we might end up having something like a filter
chain before the provider actually writes the data to its target.
A buffer filter would be the first implementation, but there might be others
like compression or embedded monitoring filters.

> 
> 
>> mysql:// for unbuffered mysql backend
>> mysqlb:// for buffered mysql backend
> 
> 
> Probably, a mysql module would want each line individually, rather than a
> large buffer.

Or it likes to read many lines in one block as it can commit such things
as a batch rather than as single commits per line. Of course it is not required
to use DB transactions on mysql. But for Oracle this might improve performance.

> 
>> file:// for unbuffered file backend
>> fileb:// for buffered file backend
> 
> 
> 
> Or, sticking with the uri methods:
> 
> file:///some/log/path?buffered
> 
> or maybe:
> 
> file://buffered@/some/log/path
> 

Hm, the question is how to get all the parameters set with the URL approach (thinking
of piped loggers, which you said currently do not seem to work because of the spaces).
Maybe as named parameters in the args?
If one likes to follow a filter approach I think URLs would not be useful, but to be
honest I currently have no other idea in this case than the shell approach, e.g.

Customlog "|monitor param=somevalue|buffer size=1024|compress |pipe /bin/foo" combined

where the last member needs to be a backend provider like pipe, file, mysql.
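
A minimal sketch of how such a shell-like chain could be split into
stages, in plain C. The names (log_stage, parse_log_chain, MAX_STAGES)
are hypothetical, and the rule that every "|" separates stages is an
assumption of this sketch; a backend command that itself contains "|"
would need quoting, which is not handled here.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_STAGES 8   /* illustrative limit */

struct log_stage {
    char *name;   /* e.g. "buffer" */
    char *args;   /* e.g. "size=1024", or "" when there are none */
};

/* Strip leading and trailing whitespace in place. */
static char *trim(char *s)
{
    char *end;

    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Split spec in place on '|'.  Empty segments, such as the one before
 * the leading '|', are skipped.  Returns the number of stages, or -1
 * if there are more than MAX_STAGES.  The last stage is the backend. */
static int parse_log_chain(char *spec, struct log_stage *stages)
{
    int n = 0;
    char *seg = spec;

    assert(spec != NULL && stages != NULL);
    while (seg != NULL) {
        char *next = strchr(seg, '|');
        char *name, *sp;

        if (next != NULL)
            *next++ = '\0';
        name = trim(seg);
        if (*name != '\0') {
            if (n == MAX_STAGES)
                return -1;
            /* first word is the stage name, the rest are its args */
            sp = name;
            while (*sp != '\0' && !isspace((unsigned char)*sp))
                sp++;
            if (*sp != '\0')
                *sp++ = '\0';
            stages[n].name = name;
            stages[n].args = trim(sp);
            n++;
        }
        seg = next;
    }
    return n;
}
```

For the CustomLog line above this gives four stages: monitor, buffer,
compress and pipe, with "/bin/foo" as the argument of the final backend
stage.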

Regards

Rüdiger

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Brian Akins <br...@turner.com>.

Rüdiger Plüm wrote:

> That also makes sense to me, since piped logging does not really play
> well with other things like spread or mysql. On the other hand, what about
> buffered logging? Would it make sense to make it possible to turn this on
> and off with each provider? If yes, we might have schemes such as:

It seems to me this would be provider specific.


> mysql:// for unbuffered mysql backend
> mysqlb:// for buffered mysql backend

Probably, a mysql module would want each line individually, rather than a
large buffer.

> file:// for unbuffered file backend
> fileb:// for buffered file backend


Or, sticking with the uri methods:

file:///some/log/path?buffered

or maybe:

file://buffered@/some/log/path


Interesting comments.

-- 
Brian Akins
Lead Systems Engineer
CNN Internet Technologies

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Rüdiger Plüm <r....@gmx.de>.

On 10/03/2005 08:49 PM, Colm MacCarthaigh wrote:
> On Mon, Oct 03, 2005 at 11:40:27AM -0400, Brian Akins wrote:
> 

[..cut..]
> 
> Looks useful, but file://|/bin/foo would be very non-intuitive for piped

I guess you still can use |/bin/foo because file is the default provider.

> loggers, balancing the backwards compatibility might need a bit more, I
> guess "pipe://" or "cmd://" schemes might make sense also. 
> 

That also makes sense to me, since piped logging does not really play
well with other things like spread or mysql. On the other hand, what about
buffered logging? Would it make sense to make it possible to turn this on
and off with each provider? If yes, we might have schemes such as:

mysql:// for unbuffered mysql backend
mysqlb:// for buffered mysql backend

file:// for unbuffered file backend
fileb:// for buffered file backend

Regards

Rüdiger

Re: [PATCH] Re: Pluggable mod_log_config

Posted by Colm MacCarthaigh <co...@stdlib.net>.

On Mon, Oct 03, 2005 at 11:40:27AM -0400, Brian Akins wrote:
> >CustomLog mysql://something common env=images
> >CustomLog file:///logs/my.log combined
> >CustomLog spread://somegroup refere
> >CustomLog buffer:///logs/other.log common
> 
> This patch implements the above.  Within mod_log_config two providers
> are provided: file and buffer.  If no "scheme" is given, file is assumed.

Looks useful, but file://|/bin/foo would be very non-intuitive for piped
loggers. Balancing the backwards compatibility might need a bit more; I
guess "pipe://" or "cmd://" schemes might make sense also.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net