Posted to user@couchdb.apache.org by Randall Leeds <ra...@gmail.com> on 2010/04/19 22:03:01 UTC

SASL overhead (was Re: CouchDB and Hadoop_)

I've found that replication crashes pin the CPU and can even make couch
unresponsive because the message queues and internal state of the
replication processes are huge and SASL logging serializes it all to the log
file.

I don't really have a good solution to this. Maybe there should be an option
for turning off SASL in couch which would restrict the log messages to
explicitly logged messages from couch's ?LOG_<level> functions. It seems a
reasonable default to me that SASL is only enabled when log level is DEBUG.
If people feel good about this I'd happily make the patch.

On Apr 19, 2010 10:39 AM, "Adam Kocoloski" <ko...@apache.org> wrote:

Thanks Fredrik.  I think I have a pretty good handle on what's happening and
have replied in detail in JIRA.  Best,

Adam

On Apr 19, 2010, at 10:22 AM, Fredrik Widlund wrote:

>
>
> Hi,
>
> https://issues.apache.org/ji...

Re: SASL overhead (was Re: CouchDB and Hadoop_)

Posted by Adam Kocoloski <ko...@apache.org>.

On Apr 19, 2010, at 4:24 PM, Randall Leeds wrote:

> Nice! Definitely the full state should be kept if debug logging is on, but I
> like this approach. I had thought of truncating the state in terminate but
> hadn't put it together to check the log level and was worried about losing
> debugging info.
> 
> At least this portion of the patch might be a great candidate for someone
> who wants to get their hands dirty with patches.
> 
> couch_rep_* processes are a great place to start, clearing document queues
> in the #state record and clearing the process message queue (is there a way
> to do this in erlang that's better than a tight receive loop)?

Nope, a tight receive loop is the way to go afaik.

Adam
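
For reference, the tight receive loop mentioned above could look roughly like the sketch below. The module and function names are made up for illustration and are not taken from the CouchDB source; the only thing it relies on is a receive with an "after 0" clause, which returns as soon as the mailbox is empty.

```erlang
-module(flush_sketch).
-export([flush/0]).

%% Discard every message currently in the calling process's mailbox.
%% Returns the number of messages that were dropped.
flush() ->
    flush(0).

flush(N) ->
    receive
        _Any ->
            %% Match anything, throw it away, and keep draining.
            flush(N + 1)
    after 0 ->
        %% Mailbox is empty right now: stop instead of blocking.
        N
    end.
```

Calling flush_sketch:flush() from inside a process (for example from its terminate function) empties that process's own queue; it cannot touch another process's mailbox.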

> On Apr 19, 2010 4:18 PM, "Adam Kocoloski" <ko...@apache.org> wrote:
> 
> CouchDB definitely needs a bit of spring cleaning in the logging department.
> If you haven't noticed, most error messages are written *twice* in the
> logs, once in a very nice format by SASL and once in a crude format by
> couch_log.  I believe this is because the couch_log event handler is
> installed in the same event manager as the SASL error_logger, and because
> the second handle_event clause in couch_log matches SASL-style error
> reports. So even if you turned off the SASL error_logger I think you'd still
> get these messages in the log.  I've been meaning to fix that ...
> 
> I'd prefer to keep the SASL-style logs, although I understand that we need
> to trim them down.  That can be accomplished on a process-by-process basis
> by identifying the processes with large internal states and truncating those
> states in a terminate() function just before exiting (perhaps keeping the
> full state if ?LOG_DEBUG is on).  Cheers,
> 
> Adam
> 
> 
> On Apr 19, 2010, at 4:03 PM, Randall Leeds wrote:
> 
>> I've found that replication crashes pin the cp...

Re: SASL overhead (was Re: CouchDB and Hadoop_)

Posted by Randall Leeds <ra...@gmail.com>.

Nice! Definitely the full state should be kept if debug logging is on, but I
like this approach. I had thought of truncating the state in terminate but
hadn't put it together to check the log level and was worried about losing
debugging info.

At least this portion of the patch might be a great candidate for someone
who wants to get their hands dirty with patches.

couch_rep_* processes are a great place to start, clearing document queues
in the #state record and clearing the process message queue (is there a way
to do this in erlang that's better than a tight receive loop)?

On Apr 19, 2010 4:18 PM, "Adam Kocoloski" <ko...@apache.org> wrote:

CouchDB definitely needs a bit of spring cleaning in the logging department.
If you haven't noticed, most error messages are written *twice* in the
logs, once in a very nice format by SASL and once in a crude format by
couch_log.  I believe this is because the couch_log event handler is
installed in the same event manager as the SASL error_logger, and because
the second handle_event clause in couch_log matches SASL-style error
reports. So even if you turned off the SASL error_logger I think you'd still
get these messages in the log.  I've been meaning to fix that ...

I'd prefer to keep the SASL-style logs, although I understand that we need
to trim them down.  That can be accomplished on a process-by-process basis
by identifying the processes with large internal states and truncating those
states in a terminate() function just before exiting (perhaps keeping the
full state if ?LOG_DEBUG is on).  Cheers,

Adam


On Apr 19, 2010, at 4:03 PM, Randall Leeds wrote:

> I've found that replication crashes pin the cp...

Re: SASL overhead (was Re: CouchDB and Hadoop_)

Posted by Adam Kocoloski <ko...@apache.org>.

CouchDB definitely needs a bit of spring cleaning in the logging department.  If you haven't noticed, most error messages are written *twice* in the logs, once in a very nice format by SASL and once in a crude format by couch_log.  I believe this is because the couch_log event handler is installed in the same event manager as the SASL error_logger, and because the second handle_event clause in couch_log matches SASL-style error reports. So even if you turned off the SASL error_logger I think you'd still get these messages in the log.  I've been meaning to fix that ...

I'd prefer to keep the SASL-style logs, although I understand that we need to trim them down.  That can be accomplished on a process-by-process basis by identifying the processes with large internal states and truncating those states in a terminate() function just before exiting (perhaps keeping the full state if ?LOG_DEBUG is on).  Cheers,

Adam
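
A minimal sketch of the truncation idea described above, under loud assumptions: the #state record and its doc_queue field are hypothetical stand-ins (the real couch_rep_* records differ), and the log level is passed in as a plain atom rather than checked through couch's ?LOG_DEBUG machinery. It only shows the state-trimming step; how a terminate function would wire this into what finally gets logged is not shown.

```erlang
-module(trim_state_sketch).
-export([trim_for_log/2]).

%% Hypothetical state record, for illustration only.
-record(state, {source, target, doc_queue = []}).

%% Keep the full state when debug logging is on ...
trim_for_log(State, debug) ->
    State;
%% ... otherwise drop the bulky queue and keep the small fields.
trim_for_log(#state{} = State, _Level) ->
    State#state{doc_queue = []}.
```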

On Apr 19, 2010, at 4:03 PM, Randall Leeds wrote:

> I've found that replication crashes pin the CPU and can even make couch
> unresponsive because the message queues and internal state of the
> replication processes are huge and SASL logging serializes it all to the log
> file.
> 
> I don't really have a good solution to this. Maybe there should be an option
> for turning off SASL in couch which would restrict the log messages to
> explicitly logged messages from couch's ?LOG_<level> functions. It seems a
> reasonable default to me that SASL is only enabled when log level is DEBUG.
> If people feel good about this I'd happily make the patch.
> 
> On Apr 19, 2010 10:39 AM, "Adam Kocoloski" <ko...@apache.org> wrote:
> 
> Thanks Fredrik.  I think I have a pretty good handle on what's happening and
> have replied in detail in JIRA.  Best,
> 
> Adam
> 
> 
> On Apr 19, 2010, at 10:22 AM, Fredrik Widlund wrote:
> 
>> 
>> 
>> Hi,
>> 
>> https://issues.apache.org/ji...
