Posted to commits@cassandra.apache.org by "T Jake Luciani (JIRA)" <ji...@apache.org> on 2016/06/23 16:30:16 UTC

[jira] [Updated] (CASSANDRA-12080) More detailed compaction log

     [ https://issues.apache.org/jira/browse/CASSANDRA-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-12080:
---------------------------------------
    Description: 
As mentioned by [~zznate] at NGCC, the compaction task info at the end of the compaction log is pretty confusing.

Mainly, we only show the throughput of the sstable writer. But if a lot of merging is being done, compaction can look really slow, since the output might be small even though the inputs were huge.

Also, bytes/sec isn't a great metric of *work*. Really we should be reporting the CQL row throughput, since for the same bytes on disk we might be compacting 100k rows or 1 large one.

I've added a trivial patch that improves the logging info to now show Read Throughput, Write Throughput, Row Throughput (rows/sec) and total source partitions.

{quote}
DEBUG [CompactionExecutor:1] 2016-06-23 12:22:06,114 CompactionTask.java:229 - Compacted (9edcfa50-395e-11e6-9944-3109153b1592) 2 sstables to [/home/jake/workspace/cassandra/data/data/stresscql/userpics-b9d2811038b711e69c04018b580faf7b/mb-11-big,] to level=0.  13.159MiB to 6.590MiB (~50% of original) in 2,474ms.  Read Throughput = 5.317MiB/s, Write Throughput = 2.663MiB/s, Row Throughput = ~166,666/s.  500,000 total partitions merged to 250,000.  Partition merge counts were \{2:250000, \}
{quote}
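
To make the distinction between the three figures concrete, here is a minimal sketch of how such rates can be derived from the input size, output size, row count and elapsed time. It is not the code from the patch: the class and method names are hypothetical, the byte counts are approximations of the sizes printed in the log line above, and no attempt is made to reproduce the exact rounding used by CompactionTask.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Cassandra.
class CompactionThroughput {
    static final double MIB = 1024.0 * 1024.0;

    // Bytes processed per second, expressed in MiB/s.
    static double mibPerSecond(long bytes, long elapsedMillis) {
        double seconds = Math.max(elapsedMillis, 1) / 1000.0; // avoid dividing by zero
        return bytes / MIB / seconds;
    }

    // Items (rows, partitions, ...) processed per second.
    static double perSecond(long count, long elapsedMillis) {
        double seconds = Math.max(elapsedMillis, 1) / 1000.0;
        return count / seconds;
    }

    // Formats the three rates in the same spirit as the new log line.
    static String summarize(long bytesRead, long bytesWritten, long rows, long elapsedMillis) {
        return String.format(Locale.ROOT,
                "Read Throughput = %.3fMiB/s, Write Throughput = %.3fMiB/s, Row Throughput = ~%,d/s",
                mibPerSecond(bytesRead, elapsedMillis),
                mibPerSecond(bytesWritten, elapsedMillis),
                Math.round(perSecond(rows, elapsedMillis)));
    }

    public static void main(String[] args) {
        // Approximate sizes taken from the log line: 13.159MiB in, 6.590MiB out, 2,474ms.
        long bytesRead = (long) (13.159 * MIB);
        long bytesWritten = (long) (6.590 * MIB);
        long elapsedMillis = 2_474;

        // Same bytes on disk, very different amounts of work:
        // 100k small rows versus 1 large row (row counts chosen for illustration).
        System.out.println(summarize(bytesRead, bytesWritten, 100_000, elapsedMillis));
        System.out.println(summarize(bytesRead, bytesWritten, 1, elapsedMillis));
    }
}
```

Both printed lines report identical read and write throughput; only the row figure shows that the first compaction handled far more rows than the second.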


> More detailed compaction log
> ----------------------------
>
>                 Key: CASSANDRA-12080
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12080
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>            Priority: Trivial
>             Fix For: 3.8
>
>
> As mentioned by [~zznate] at NGCC, the compaction task info at the end of the compaction log is pretty confusing.
> Mainly, we only show the throughput of the sstable writer. But if a lot of merging is being done, compaction can look really slow, since the output might be small even though the inputs were huge.
> Also, bytes/sec isn't a great metric of *work*. Really we should be reporting the CQL row throughput, since for the same bytes on disk we might be compacting 100k rows or 1 large one.
> I've added a trivial patch that improves the logging info to now show Read Throughput, Write Throughput, Row Throughput (rows/sec) and total source partitions.
> {quote}
> DEBUG [CompactionExecutor:1] 2016-06-23 12:22:06,114 CompactionTask.java:229 - Compacted (9edcfa50-395e-11e6-9944-3109153b1592) 2 sstables to [/home/jake/workspace/cassandra/data/data/stresscql/userpics-b9d2811038b711e69c04018b580faf7b/mb-11-big,] to level=0.  13.159MiB to 6.590MiB (~50% of original) in 2,474ms.  Read Throughput = 5.317MiB/s, Write Throughput = 2.663MiB/s, Row Throughput = ~166,666/s.  500,000 total partitions merged to 250,000.  Partition merge counts were \{2:250000, \}
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)