You are viewing a plain text version of this content. The canonical link for it is here.
Posted to modules-dev@httpd.apache.org by Neil McKee <ne...@inmon.com> on 2011/01/08 00:06:19 UTC

mod_sflow

Hello all,

A new module that exports log data using the sFlow protocol is available for review:

http://mod-sflow.googlecode.com

For details,  and usage examples,  see:

http://blog.sflow.com/2011/01/http.html

This module is designed for efficient,  centralized, real-time monitoring of large web clusters.
It also maintains a standard block of performance counters which are pushed via sFlow, but
can also be retrieved via a URL,  e.g. for trending in Cacti.

This module is designed to work in both "prefork" and "worker" models.  I would really
appreciate it if someone could review the design to make sure I made appropriate choices
about where to use pipes, shared-memory, mutex locking,  and so on(!)   These choices
are documented in the comment at the top of the mod_sflow.c file,  here:
http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=14

The details of which fields should be exported and which counters should be maintained
is the subject of a discussion on the sFlow discussion group here:
http://groups.google.com/group/sflow/browse_thread/thread/88accb2bad594d1d

Neil


Re: mod_sflow

Posted by Neil McKee <ne...@inmon.com>.
On Jan 10, 2011, at 2:03 PM, Ben Noordhuis wrote:

> On Mon, Jan 10, 2011 at 22:22, Neil McKee <ne...@inmon.com> wrote:
>> I pushed changes to:
>> 
>> 1.  use PIPE_BUF from limits.h (if available).
>> 2.  use apr_file_pipe_timeout_set(<pipe>, 0) on both ends -- just to make absolutely certain that the writes from the critical section are always non-blocking.
>> 3.  use apr_poll() at the read end of the pipe to get my read timeout back.
>> 4.  make sure any critical-section logging is with APLOG_DEBUG.
>> 
>> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=17
> 
> Much better, Neil.
> 
>> setting the "timeout" to 0 seems to be equivalent to setting O_NONBLOCK?
> 
> Even better, apr_file_pipe_timeout_set() with timeout > -1 sets O_NONBLOCK.
> 

I think I understand now.  I can have:

apr_file_pipe_timeout_set(<read_end>,  900000);
apr_file_pipe_timeout_set(<write_end>, 0);

and remove all that apr_poll() code again:
http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=20

Thanks!

Neil


>> I don't see any functions for assembling an apr_pollfd_t object
> 
> There indeed aren't, creating them by hand is the (only) way to go.



Re: mod_sflow

Posted by Ben Noordhuis <in...@bnoordhuis.nl>.
On Mon, Jan 10, 2011 at 22:22, Neil McKee <ne...@inmon.com> wrote:
> I pushed changes to:
>
> 1.  use PIPE_BUF from limits.h (if available).
> 2.  use apr_file_pipe_timeout_set(<pipe>, 0) on both ends -- just to make absolutely certain that the writes from the critical section are always non-blocking.
> 3.  use apr_poll() at the read end of the pipe to get my read timeout back.
> 4.  make sure any critical-section logging is with APLOG_DEBUG.
>
> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=17

Much better, Neil.

> setting the "timeout" to 0 seems to be equivalent to setting O_NONBLOCK?

Even better, apr_file_pipe_timeout_set() with timeout > -1 sets O_NONBLOCK.

> I don't see any functions for assembling an apr_pollfd_t object

There indeed aren't, creating them by hand is the (only) way to go.

Re: mod_sflow

Posted by Neil McKee <ne...@inmon.com>.
I pushed changes to:

1.  use PIPE_BUF from limits.h (if available).
2.  use apr_file_pipe_timeout_set(<pipe>, 0) on both ends -- just to make absolutely certain that the writes from the critical section are always non-blocking.
3.  use apr_poll() at the read end of the pipe to get my read timeout back.
4.  make sure any critical-section logging is with APLOG_DEBUG.

http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=17

Neil

On Jan 9, 2011, at 11:03 PM, Neil McKee wrote:

> Great feedback,  thanks.
> 
> I have several questions,   if you don't mind...
> 
> 1. What is the most portable way to make the pipe non-blocking?  That way we can just drop any samples that get EWOULDBLOCK or EAGAIN on writing to the pipe (and increment the drops counter).    apr_file_pipe_create_ex(.... APR_FULL_NONBLOCK...) was my first choice,  but it only seems to have appeared fairly recently,  in APR version 1.3.
> 
> 2.  It's very unlikely that we would ever be sending a message on the pipe greater than about 200 bytes,  so if we ever see (msgBytes > PIPE_BUF) then I'm OK with just dropping that message.   PIPE_BUF seems to be defined without explicitly adding "limits.h" as another include.   Can I assume it will always be defined,  or is there something like "APR_PIPE_BUF" I should use instead?
> 
> 3.  Making sure we can't block in the critical section is obviously the most important thing,  but let me know if you still think we need to refine it further after that.  It is my assumption that the sampling-rate setting will be at least 1-in-100 (and quite possibly 1-in-50000) so the critical path is really just a few counter increments and a sampling decrement-and-test.    If anyone ever wanted to run it flat out at 1-in-10 or 1-in-1 then obviously we'd have to look closer at this,  but there are other tools for that kind of logging.
> 
> The sFlow standard allows for multiple independent "sub-agents" to maintain their own sequence numbers and send their own datagrams (an important property for a large network switch),  so theoretically we could do without the pipe and have each child process send his own sFlow. We could even have a separate sub-agent for each worker thread, and run with no locks at all.  However we decided against that because it could mean generating an uncontrolled number of these sub-agents,  which would be antisocial for the collector.
> 
> Atomic operations might help to avoid thread-locking here,  but I was hoping to avoid the "dark arts".
> 
> 4.  The pipe is anonymous.  The docs say that some platforms don't allow that and a filename is required.  Do I need to bother with that,  or are those platforms obsolete anyway?
> 
> I apologize if any of these are FAQs / RTFMs.  I'm new to APR.
> 
> Neil
> 
> 
> On Jan 9, 2011, at 12:17 PM, Ben Noordhuis wrote:
> 
>> On Sat, Jan 8, 2011 at 00:06, Neil McKee <ne...@inmon.com> wrote:
>>> This module is designed to work in both "prefork" and "worker" models.  I would really
>>> appreciate it if someone could review the design to make sure I made appropriate choices
>>> about where to use pipes, shared-memory, mutex locking,  and so on(!)   These choices
>>> are documented in the comment at the top of the mod_sflow.c file,  here:
>>> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=14
>> 
>> Neil, two points of critique:
>> 
>> 1. You are doing way too much in your critical section, including
>> potentially blocking actions like logging.
>> 
>> 2. Assuming it's safe to write up to 4K to the pipe is dangerous for
>> several reasons: PIPE_BUF may be < 4096, the pipe may not be empty,
>> etc. This ties in with #1 since you are doing it from within the
>> critical section.
> 


Re: mod_sflow

Posted by Neil McKee <ne...@inmon.com>.
Great feedback,  thanks.

I have several questions,   if you don't mind...

1. What is the most portable way to make the pipe non-blocking?  That way we can just drop any samples that get EWOULDBLOCK or EAGAIN on writing to the pipe (and increment the drops counter).    apr_file_pipe_create_ex(.... APR_FULL_NONBLOCK...) was my first choice,  but it only seems to have appeared fairly recently,  in APR version 1.3.

2.  It's very unlikely that we would ever be sending a message on the pipe greater than about 200 bytes,  so if we ever see (msgBytes > PIPE_BUF) then I'm OK with just dropping that message.   PIPE_BUF seems to be defined without explicitly adding "limits.h" as another include.   Can I assume it will always be defined,  or is there something like "APR_PIPE_BUF" I should use instead?

3.  Making sure we can't block in the critical section is obviously the most important thing,  but let me know if you still think we need to refine it further after that.  It is my assumption that the sampling-rate setting will be at least 1-in-100 (and quite possibly 1-in-50000) so the critical path is really just a few counter increments and a sampling decrement-and-test.    If anyone ever wanted to run it flat out at 1-in-10 or 1-in-1 then obviously we'd have to look closer at this,  but there are other tools for that kind of logging.

The sFlow standard allows for multiple independent "sub-agents" to maintain their own sequence numbers and send their own datagrams (an important property for a large network switch),  so theoretically we could do without the pipe and have each child process send his own sFlow. We could even have a separate sub-agent for each worker thread, and run with no locks at all.  However we decided against that because it could mean generating an uncontrolled number of these sub-agents,  which would be antisocial for the collector.

Atomic operations might help to avoid thread-locking here,  but I was hoping to avoid the "dark arts".

4.  The pipe is anonymous.  The docs say that some platforms don't allow that and a filename is required.  Do I need to bother with that,  or are those platforms obsolete anyway?

I apologize if any of these are FAQs / RTFMs.  I'm new to APR.

Neil


On Jan 9, 2011, at 12:17 PM, Ben Noordhuis wrote:

> On Sat, Jan 8, 2011 at 00:06, Neil McKee <ne...@inmon.com> wrote:
>> This module is designed to work in both "prefork" and "worker" models.  I would really
>> appreciate it if someone could review the design to make sure I made appropriate choices
>> about where to use pipes, shared-memory, mutex locking,  and so on(!)   These choices
>> are documented in the comment at the top of the mod_sflow.c file,  here:
>> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=14
> 
> Neil, two points of critique:
> 
> 1. You are doing way too much in your critical section, including
> potentially blocking actions like logging.
> 
> 2. Assuming it's safe to write up to 4K to the pipe is dangerous for
> several reasons: PIPE_BUF may be < 4096, the pipe may not be empty,
> etc. This ties in with #1 since you are doing it from within the
> critical section.


Re: mod_sflow

Posted by Ben Noordhuis <in...@bnoordhuis.nl>.
On Sat, Jan 8, 2011 at 00:06, Neil McKee <ne...@inmon.com> wrote:
> This module is designed to work in both "prefork" and "worker" models.  I would really
> appreciate it if someone could review the design to make sure I made appropriate choices
> about where to use pipes, shared-memory, mutex locking,  and so on(!)   These choices
> are documented in the comment at the top of the mod_sflow.c file,  here:
> http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=14

Neil, two points of critique:

1. You are doing way too much in your critical section, including
potentially blocking actions like logging.

2. Assuming it's safe to write up to 4K to the pipe is dangerous for
several reasons: PIPE_BUF may be < 4096, the pipe may not be empty,
etc. This ties in with #1 since you are doing it from within the
critical section.

Re: mod_sflow

Posted by Neil McKee <ne...@inmon.com>.
I removed all the ap_assert() calls.

The "master" process that is forked at the beginning will simply bow-out gracefully if it loses self-confidence,  which will result in the pipe being closed.  When a child processes detects this (or any other IPC or mutex problem) it will log an error and go into a "logging disabled" state where it will ignore all subsequent transactions and generally do as little as possible.

http://code.google.com/p/mod-sflow/source/browse/trunk/mod_sflow.c?r=22

Does anyone have any comments about the choice of performance counters for HTTP? 

typedef struct _SFLHTTP_counters {
  uint32_t method_option_count;
  uint32_t method_get_count;
  uint32_t method_head_count;
  uint32_t method_post_count;
  uint32_t method_put_count;
  uint32_t method_delete_count;
  uint32_t method_trace_count;
  uint32_t method_connect_count;
  uint32_t method_other_count;
  uint32_t status_1XX_count;
  uint32_t status_2XX_count;
  uint32_t status_3XX_count;
  uint32_t status_4XX_count;
  uint32_t status_5XX_count;
  uint32_t status_other_count;
} SFLHTTP_counters;

For example,  Ganglia already supports sFlow for monitoring cluster performance and could easily be extended to monitor web-farms using these counters.

http://blog.sflow.com/2010/10/ganglia.html

Neil



On Jan 10, 2011, at 3:52 PM, Nick Kew wrote:

> On Fri, 7 Jan 2011 15:06:19 -0800
> Neil McKee <ne...@inmon.com> wrote:
> 
>> Hello all,
>> 
>> A new module that exports log data using the sFlow protocol is available for review:
>> 
>> http://mod-sflow.googlecode.com
> 
> I started to look at it, then got distracted.
> 
> Five-minute review:
> (1) It looks well-formed as a module.  If the question was
>     meant to be "have I got the basics", then the answer is yes.
> (2) I don't like modules that die on error.  Use of ap_assert
>     makes me uneasy.
> 
> -- 
> Nick Kew
> 
> Available for work, contract or permanent.
> http://www.webthing.com/~nick/cv.html


Re: mod_sflow

Posted by Nick Kew <ni...@apache.org>.
On Fri, 7 Jan 2011 15:06:19 -0800
Neil McKee <ne...@inmon.com> wrote:

> Hello all,
> 
> A new module that exports log data using the sFlow protocol is available for review:
> 
> http://mod-sflow.googlecode.com

I started to look at it, then got distracted.

Five-minute review:
 (1) It looks well-formed as a module.  If the question was
     meant to be "have I got the basics", then the answer is yes.
 (2) I don't like modules that die on error.  Use of ap_assert
     makes me uneasy.

-- 
Nick Kew

Available for work, contract or permanent.
http://www.webthing.com/~nick/cv.html