You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Clint Kelly <cl...@gmail.com> on 2014/11/16 22:42:15 UTC

Best practices for route tracing

Hi all,

I am trying to debug some high-latency outliers (99th percentile) in an
application I'm working on.  I thought that I could turn on route tracing,
print the route traces to logs, and then examine my logs after a load test
to find the highest-latency paths and figure out what is going on.

I did something similar to what is described here:
http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/tracing_t.html

Under low amounts of traffic, this looks great and I can see nice trace
logs:

Total duration: 599 us (id: ae159110-6dd8-11e4-a6a1-cd4ac3ad1896)
                                  Executing single-partition query on
lg_users_C |   13:36:52.898 |     /10.29.2.9 |          204
                                                    Acquiring sstable
references |   13:36:52.898 |     /10.29.2.9 |          207
                                                     Merging memtable
tombstones |   13:36:52.898 |     /10.29.2.9 |          213
                                         Bloom filter allows skipping
sstable 40 |   13:36:52.898 |     /10.29.2.9 |          231
                                         Bloom filter allows skipping
sstable 38 |   13:36:52.898 |     /10.29.2.9 |          237
                                         Bloom filter allows skipping
sstable 37 |   13:36:52.898 |     /10.29.2.9 |          243
                                                    Key cache hit for
sstable 39 |   13:36:52.898 |     /10.29.2.9 |          251
                               Seeking to partition indexed section in data
file |   13:36:52.898 |     /10.29.2.9 |          251
       Skipped 0/4 non-slice-intersecting sstables, included 0 due to
tombstones |   13:36:52.898 |     /10.29.2.9 |          455
                                      Merging data from memtables and 1
sstables |   13:36:52.898 |     /10.29.2.9 |          457
                                              Read 0 live and 0 tombstoned
cells |   13:36:52.898 |     /10.29.2.9 |          472


Under heavier load, however, things seem to fall over, and I get a lot of
exceptions like:

14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.9:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.8:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.9:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.12:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.9:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.9:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.10:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.11:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying xwing07.ul.wibidata.net/10.29.2.7:9042, trying next host (error
is: com.datastax.driver.core.exceptions.DriverException: Timeout during
read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.10:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.9:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.8:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying xwing07.ul.wibidata.net/10.29.2.7:9042, trying next host (error
is: com.datastax.driver.core.exceptions.DriverException: Timeout during
read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.8:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 DEBUG com.datastax.driver.core.RequestHandler: Error
querying /10.29.2.11:9042, trying next host (error is:
com.datastax.driver.core.exceptions.DriverException: Timeout during read)
14/11/16 13:39:41 WARN com.datastax.driver.core.Connection: Timeout while
setting keyspace on connection to /10.29.2.11:9042. This should not happen
but is not critical (it will retried)

Is this expected?  I am running only ~100 reads per second on this load
test, which I didn't think should be too much.

One thing I noticed is that the example I referenced above explicitly calls:

      insert.disableTracing();

do I need to do that in my application for a statement that is not used
again?  What if I have a prepared statement that I issue several time
consecutively?

Thanks!

-Clint