Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2012/10/25 12:15:12 UTC

[jira] [Created] (CASSANDRA-4861) Consider separating tracing from log4j

Sylvain Lebresne created CASSANDRA-4861:
-------------------------------------------

             Summary: Consider separating tracing from log4j
                 Key: CASSANDRA-4861
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4861
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Sylvain Lebresne
             Fix For: 1.2.0 beta 2


Currently (as far as I understand), tracing is implemented as a log4j appender that "intercepts" all log messages and writes them to a system table. I'm sorry not to have brought that up during the initial review (it's hard to follow every ticket), but before we release this I'd like to have a serious discussion on that choice, because I'm not convinced (at all) that it's a good idea. Namely, I see the following drawbacks:
# the main one is that this *forces* every debug message to be traced and, conversely, every traced message to be logged at debug. But I strongly think that debug logging and query tracing are not the same thing. Don't get me wrong, there is clearly a large intersection between the two (which is fine), but I do think that *equating* them is a mistake. More concretely:
 ** Consider some of the messages we log at debug in CFS:
   {noformat}
   logger.debug("memtable is already frozen; another thread must be flushing it");
   logger.debug("forceFlush requested but everything is clean in {}", columnFamily);
   logger.debug("Checking for sstables overlapping {}", sstables);
   {noformat}
   Those messages are useful for debugging and have a place in the log at debug, but they are noise as far as query tracing is concerned (none of them has any concrete impact on query performance; they just describe what the code has done). Or take the following ones from CompactionManager:
   {noformat}
   logger.debug("Background compaction is still running for {}.{} ({} remaining). Skipping", new Object[] {cfs.table.name, cfs.columnFamily, count});
   logger.debug("Scheduling a background task check for {}.{} with {}", new Object[] {cfs.table.name, cfs.columnFamily, cfs.getCompactionStrategy().getClass().getSimpleName()});
   logger.debug("Checking {}.{}", cfs.table.name, cfs.columnFamily);
   logger.debug("Aborting compaction for dropped CF");
   logger.debug("No tasks available");
   logger.debug("Expected bloom filter size : " + expectedBloomFilterSize);
   logger.debug("Cache flushing was already in progress: skipping {}", writer.getCompactionInfo());
   {noformat}
   It is useful to have that in the debug log, but how is any of it useful to users in query tracing? (It may be useful to trace when a compaction starts or stops, because that does influence query performance, but those messages do not.) Also take the following messages, logged when a compaction is interrupted by the user:
   {noformat}
   if (t instanceof CompactionInterruptedException)
   {
       logger.info(t.getMessage());
       logger.debug("Full interruption stack trace:", t);
   }
   {noformat}
   I can buy that you may want the first log message in the query trace, but the second one is definitely something that only makes sense for debug logging, not for query tracing (and as a side note, the current implementation doesn't do anything sensible here: it traces "Full interruption stack trace:" but completely ignores the throwable).
   Lastly, and though that's arguably more of a detail (but why settle for good enough if we can do better), I believe there are cases where you want an event to be both logged at debug and traced, but with different messages. For instance, in CFS we have
   {noformat}
   logger.debug("Snapshot for " + table + " keyspace data file " + ssTable.getFilename() + " created in " + snapshotDirectory);
   {noformat}
   I'm not convinced that snapshots should be part of query tracing, given they don't really have an impact on queries, but even if we do trace them, we probably don't care about having one event per snapshotted file (two events, one for the start of the snapshot and one for the end, would be enough).
   As it stands, I think query tracing will contain a lot of random noise, which will not only be annoying but, I'm sure, will also make users spend time worrying about events that have no impact whatsoever. And I've only looked at the debug messages of 2 classes ...
  ** I also think there could be cases where we want to trace something but not have it in the debug log. For instance, it makes sense in the query trace to know how long parsing the query took, but logging that much per-query information will make the debug log unmanageable in many cases. And if you say "let's log that at TRACE", then you have to put the TracingAppender at TRACE, and now you get all the junk (junk as far as query tracing is concerned) that trace logging has.
# I find it rather unintuitive. Query tracing is enabled per query and writes its trace to a system table. Why should changing some settings in the log4j config file be able to disable that feature? I agree it's not a big deal, but it is a form of leaking an implementation detail.
# It doesn't seem very future-proof. For instance (and that's only an example), I think it could make sense to later add a tracing level. I might want a very detailed tracing mode where I get fine-grained details like which sstable was hit, how many seeks we did on that sstable, and so on. But as said above, using the log4j TRACE level for that is not convenient, because it logs lots of stuff that is completely unrelated to my query.

And the only advantage of using log4j that I can see is that we don't have to go through all our debug statements to check which ones also make sense in query traces. But as explained at length above, that's not a real advantage, as it ends up generating traces that are less useful/user-friendly than they could be.

Now maybe there are killer advantages that I don't see, and the goal of this ticket is to discuss those. But if there aren't, I'm very much in favor of moving away from log4j for this.
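
To make the distinction concrete, here is a minimal sketch of the snapshot example with the two concerns decoupled: the query trace gets two coarse-grained events (start and end of the snapshot) while the debug log keeps its per-file detail. This is purely illustrative and not code from Cassandra or from any patch; the SnapshotExample class and the traceEvent helper are hypothetical names.
{noformat}
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class SnapshotExample
{
    private static final Logger logger = LoggerFactory.getLogger(SnapshotExample.class);

    void snapshot(String table, List<String> dataFiles, String snapshotDirectory)
    {
        // One coarse event for the query trace at the start...
        traceEvent("Creating snapshot of " + table + " in " + snapshotDirectory);
        for (String file : dataFiles)
        {
            createHardlink(file, snapshotDirectory);
            // ...while per-file detail stays in the debug log only, where it is
            // useful to an operator but would be noise in a query trace.
            logger.debug("Snapshot for {} data file {} created in {}",
                         new Object[]{ table, file, snapshotDirectory });
        }
        // ...and one coarse event at the end.
        traceEvent("Snapshot of " + table + " complete (" + dataFiles.size() + " files)");
    }

    // Stand-in for whatever dedicated tracing entry point gets introduced; a real
    // implementation would record an event against the current trace session.
    private void traceEvent(String message)
    {
        logger.trace(message);
    }

    private void createHardlink(String file, String directory)
    {
        // elided
    }
}
{noformat}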


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4861:
--------------------------------------

    Fix Version/s:     (was: 1.2.0 beta 2)
                   1.2.0

Tagging 1.2.0; don't want to block beta2 for this.
                

[jira] [Updated] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4861:
----------------------------------------

    Attachment: 4861.txt

Attaching patch for this.

The gist of the patch is that instead of calling logger.debug(...) to trace stuff, Tracing.trace(...) is now called. Note that I've made it so that this latter method also logs the message at TRACE. We can change that, but I figured that anything that belongs in the query trace probably belongs in trace logging too, and it avoids duplicating two calls (one to trace, one to log). There is also a Tracing.traceAndDebugLog for when you want to log at DEBUG (but as it happens, I think most traced stuff doesn't belong in the DEBUG log, because if we start logging per-query things at DEBUG it very quickly makes the debug log unusable).
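
For readers following along, here is a rough sketch of the shape described above. Only the names Tracing.trace and Tracing.traceAndDebugLog come from this comment; the bodies are guesswork for illustration and are not the attached patch.
{noformat}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class Tracing
{
    private static final Logger logger = LoggerFactory.getLogger(Tracing.class);

    // Records an event for the current trace session and also logs it at TRACE.
    public static void trace(String message)
    {
        logger.trace(message);
        storeTraceEvent(message);
    }

    // Same as trace(), but additionally logs the message at DEBUG on the caller's logger.
    public static void traceAndDebugLog(Logger callerLogger, String message)
    {
        callerLogger.debug(message);
        trace(message);
    }

    private static void storeTraceEvent(String message)
    {
        // write the event to the system_traces tables for the current session (elided)
    }
}
{noformat}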

As far as I can tell, for inserts and selects the things logged are pretty much the same, except that I took the liberty of adding tracing of CQL3 statement parsing/preparation/validation, but that's a detail. I've also made a few very minor updates to one or two messages while at it, where I thought it improved them, but I'm happy to revert those bits if someone disagrees.

I do want to note that this does change what is traced if you, say, trace a CREATE TABLE statement. Currently, in trunk, the trace is unreadable, because we trace messages like:
{noformat}
Renaming /home/mcmanus/.ccm/test/node1/data/system/schema_columnfamilies/system-schema_columnfamilies-tmp-ia-2-CompressionInfo.db to /home/mcmanus/.ccm/test/node1/data/system/schema_columnfamilies/system-schema_columnfamilies-ia-2-CompressionInfo.d
Completed flushing /home/mcmanus/.ccm/test/node1/data/system/schema_columnfamilies/system-schema_columnfamilies-ia-2-Data.db (646 bytes) for commitlog position ReplayPosition(segmentId=1352213221524, position=53258)
Creating IntervalNode from [[DecoratedKey(2008276574632865675, 73797374656d), DecoratedKey(5501786289152180687, 73797374656d5f747261636573)](SSTableReader(path='/home/mcmanus/.ccm/test/node1/data/system/schema_columnfamilies/system-schema_columnfamilies-ia-1-Data.db')), [DecoratedKey(-6017608668500074083, 74657374), DecoratedKey(-6017608668500074083, 74657374)](SSTableReader(path='/home/mcmanus/.ccm/test/node1/data/system/schema_columnfamilies/system-schema_columnfamilies-ia-2-Data.db'))]
completed reading (0 ms; 0 keys) saved cache /home/mcmanus/.ccm/test/node1/saved_caches/test-foo-KeyCache-b.db
...
{noformat}
which are obviously not traced with this patch. This also restores the debug log to its former glory (without all those annoying 'Acquiring switchLock' and 'Appending to commitlog' messages all over the place).

                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492520#comment-13492520 ] 

Jonathan Ellis commented on CASSANDRA-4861:
-------------------------------------------

LGTM in general.  Minor points:

- Using varargs forces an array allocation.  I like log4j's approach here: have overloads for 1/2/3 parameters, and one that takes a (non-varargs) array.  This means that if you do need to pass so many parameters that you allocate an array, you need to be explicit about it.  (See the sketch after this list.)
- Ambivalent about whether getting trace-level log messages for free is worth having to pass logger to each trace call.  Leaning towards "probably not worth it;" many of these messages are "high level" ones that I added recently specifically for tracing, which IMO shows that they are not very useful for debugging (or we would have had them there already).
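
A hypothetical sketch of the overload idea from the first point above (the names, the isTracing() check and the record() helper are all made up for illustration): fixed-arity overloads mean the common one-, two- and three-argument call sites allocate no Object[] at all when tracing is disabled, and callers with more arguments have to build the array explicitly.
{noformat}
public final class Tracing
{
    public static void trace(String format, Object arg)
    {
        if (isTracing())
            record(format, new Object[]{ arg });
    }

    public static void trace(String format, Object arg1, Object arg2)
    {
        if (isTracing())
            record(format, new Object[]{ arg1, arg2 });
    }

    public static void trace(String format, Object arg1, Object arg2, Object arg3)
    {
        if (isTracing())
            record(format, new Object[]{ arg1, arg2, arg3 });
    }

    // Deliberately not varargs: passing more arguments requires an explicit array.
    public static void trace(String format, Object[] args)
    {
        if (isTracing())
            record(format, args);
    }

    private static boolean isTracing()
    {
        return false; // stand-in: would check whether the current request has tracing enabled
    }

    private static void record(String format, Object[] args)
    {
        // format the message and store the trace event (elided)
    }
}
{noformat}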
                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484114#comment-13484114 ] 

Jonathan Ellis commented on CASSANDRA-4861:
-------------------------------------------

ISTM that over half of your objections are addressed by considering that only debug messages *on a request-handling thread* are traced.  So all of CompactionManager etc. doesn't matter.

As for flexibility in the future, point taken but we're not tied to this implementation at all.  As far as clients are concerned, as long as traces get put in system_traces.sessions and events, we can change out the plumbing as much as we need.
                

[jira] [Comment Edited] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492525#comment-13492525 ] 

Jonathan Ellis edited comment on CASSANDRA-4861 at 11/7/12 5:25 PM:
--------------------------------------------------------------------

bq. Using varargs forces an array allocation

Dug a little deeper.  String.format uses varargs itself, and so does Formatter.format.  (Side note: String.format instantiates a new Formatter each time.)  So adjusting Tracing alone won't be enough.

SLF4J MessageFormatter [says|http://www.slf4j.org/api/org/slf4j/helpers/MessageFormatter.html],

bq. The formatting conventions are different than those of MessageFormat which ships with the Java platform. This is justified by the fact that SLF4J's implementation is 10 times faster than that of MessageFormat. This local performance difference is both measurable and significant in the larger context of the complete logging processing chain.

So I would recommend using MessageFormatter (which is public) instead.
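
To illustrate (assuming an SLF4J version, 1.6 or later, where the MessageFormatter helpers return a FormattingTuple; this snippet is not from the patch):
{noformat}
import org.slf4j.helpers.FormattingTuple;
import org.slf4j.helpers.MessageFormatter;

public class FormatExample
{
    public static void main(String[] args)
    {
        // Same "{}" placeholder convention already used by the existing logger.debug(...) calls.
        FormattingTuple tuple = MessageFormatter.arrayFormat("Checking {}.{}",
                                                             new Object[]{ "system", "schema_columnfamilies" });
        String message = tuple.getMessage(); // "Checking system.schema_columnfamilies"
        System.out.println(message);
    }
}
{noformat}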
                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492682#comment-13492682 ] 

Jonathan Ellis commented on CASSANDRA-4861:
-------------------------------------------

+1
                
> Consider separating tracing from log4j
> --------------------------------------
>
>                 Key: CASSANDRA-4861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4861
>             Project: Cassandra
>          Issue Type: Improvement
>    Affects Versions: 1.2.0 beta 1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.2.0
>
>         Attachments: 4861.txt, 4861-v2.txt
>
>

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484158#comment-13484158 ] 

Sylvain Lebresne commented on CASSANDRA-4861:
---------------------------------------------

bq. ISTM that over half of your objections are addressed by considering that only debug messages on a request-handling thread are traced.

It's a good point, and I did forget that only request-handling threads get traced, but that doesn't really "address" the problem, it just makes my particular examples irrelevant. As a developer, when I want to add debug logging I still have to worry about whether I'm on a request-handling thread or not, which I shouldn't have to. "Do I want to log this?" and "Do I want to add this to the query trace?" are two different questions, and having one imply the other is limiting. But granted, the fact that only request-handling threads are affected probably makes fixing this less urgent than I made it sound.

The other thing is that the thread-local business we are currently forced to use also seems a bit limiting/ugly. By design, a lot of what happens in Cassandra doesn't happen on the request-handling thread, but rather on some other stage. We might be fine for an initial version, but I'd be surprised if we don't end up wanting to trace things happening on other threads very soon. We could do that with the current implementation by passing the trace uuid around and making sure every thread that needs to trace things first initializes its thread-local TraceState based on said trace uuid, but it would be much simpler to just be able to pass the TraceState object around.
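
A loose sketch of the "just pass the TraceState around" alternative; the class and method names here are made up for illustration and are not Cassandra's actual API:
{noformat}
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class TraceStateExample
{
    static final class TraceState
    {
        final UUID sessionId;

        TraceState(UUID sessionId)
        {
            this.sessionId = sessionId;
        }

        void trace(String message)
        {
            // would append an event for sessionId to the trace tables (elided)
        }
    }

    private final ExecutorService readStage = Executors.newFixedThreadPool(4);

    void handleRequest(UUID sessionId)
    {
        final TraceState state = new TraceState(sessionId);
        state.trace("Parsing statement");

        // The TraceState travels with the task itself, so the read-stage thread
        // never needs to (re)initialize a thread-local from the session uuid.
        readStage.submit(new Runnable()
        {
            public void run()
            {
                state.trace("Executing local read");
            }
        });
    }
}
{noformat}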

bq. we're not tied to this implementation at all

That's definitely true, and I didn't pretend otherwise. What I'm wondering, however, is whether using a log4j appender brings any benefits, because it seems to me it only brings limitations. As long as we agree on that, I'm willing to spend the < 2 hours (there really isn't much to change) needed to remove tracing's dependency on log4j before 1.2, just for the sake of not having people wonder what that TracingAppender in log4j-server.properties is and why the rootLogger appears to be at debug even though they don't get DEBUG logs.

                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492525#comment-13492525 ] 

Jonathan Ellis commented on CASSANDRA-4861:
-------------------------------------------

bq. Using varargs forces an array allocation

Dug a little deeper.  String.format uses varargs itself, and so does Formatter.format.  (Side note: String.format instantiates a new Formatter each time.)  So adjusting Tracing alone won't be enough.

SLF4J MessageFormatter [says|http://www.slf4j.org/api/org/slf4j/helpers/MessageFormatter.html],

bq. The formatting conventions are different than those of MessageFormat which ships with the Java platform. This is justified by the fact that SLF4J's implementation is 10 times faster than that of MessageFormat. This local performance difference is both measurable and significant in the larger context of the complete logging processing chain.

So I would recommend using MessageFormatter (which is public) instead.
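For concreteness, a minimal sketch of that suggestion, assuming a static Tracing.trace(...) entry point (the signature is an assumption, not the attached patch):
{noformat}
import org.slf4j.helpers.MessageFormatter;

// Hedged sketch: format trace messages with SLF4J's MessageFormatter so the existing
// {}-style patterns keep working, instead of going through String.format/MessageFormat.
public static void trace(String pattern, Object... args)
{
    String message = (args == null || args.length == 0)
                   ? pattern
                   : MessageFormatter.arrayFormat(pattern, args).getMessage();
    // write 'message' into the trace events table (storage call elided)
}
{noformat}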
                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492574#comment-13492574 ] 

Sylvain Lebresne commented on CASSANDRA-4861:
---------------------------------------------

bq. Ambivalent about whether getting trace-level log messages for free is worth having to pass logger to each trace call

I agree actually. That started pissing me off all through the patch, not sure why I kept it.

bq. So I would recommend using MessageFormatter (which is public) instead

That's a good idea. Attached v2 that does that (and stops logging traced messages at TRACE).
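Roughly, the call-site difference between the two versions (signatures hypothetical, only to illustrate the trade-off being dropped):
{noformat}
// v1 style: a Logger is threaded through so the event is also logged at TRACE for free.
Tracing.trace(logger, "Parsing {}", queryString);

// v2 style: tracing stands on its own and no longer logs anything.
Tracing.trace("Parsing {}", queryString);
{noformat}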
                

[jira] [Commented] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484180#comment-13484180 ] 

Jonathan Ellis commented on CASSANDRA-4861:
-------------------------------------------

All right.
                

[jira] [Updated] (CASSANDRA-4861) Consider separating tracing from log4j

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4861:
----------------------------------------

    Attachment: 4861-v2.txt
    