Posted to commits@cassandra.apache.org by "Sean Bridges (JIRA)" <ji...@apache.org> on 2010/06/14 00:43:14 UTC
[jira] Created: (CASSANDRA-1187) make the number of compaction threads configurable
make the number of compaction threads configurable
--------------------------------------------------
Key: CASSANDRA-1187
URL: https://issues.apache.org/jira/browse/CASSANDRA-1187
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.6.1
Reporter: Sean Bridges
On our test machines, compaction is the limiting factor when we are writing to Cassandra. It's easy to write to Cassandra faster than the single compaction thread can keep up, leading to a large number of SSTables.
In one extreme example, we inserted a TB of data into a single Cassandra node overnight and ended up with 100,000 SSTables, which took another two days to finish compacting.
If the number of compaction threads were configurable, we could tune Cassandra to support a higher write workload.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1187) make the number of compaction threads configurable
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-1187:
--------------------------------------
Fix Version/s: 0.7
Affects Version/s: (was: 0.6.1)
[jira] Updated: (CASSANDRA-1187) make the number of compaction threads configurable
Posted by "Sean Bridges (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Bridges updated CASSANDRA-1187:
------------------------------------
Attachment: CASSANDRA-1187.patch
This patch allows setting the number of threads used in compaction.
A queue is created for each column family, and only one compaction thread is allowed to compact a column family at a time.
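The scheme described above (a configurable thread count, one queue per column family, at most one compaction running per column family) can be sketched roughly as follows. This is only an illustration and is not taken from the attached patch: the class name PerColumnFamilyCompactionExecutor, its methods, and the use of plain Runnable tasks are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the code from CASSANDRA-1187.patch.
final class PerColumnFamilyCompactionExecutor {
    // Pending work for one column family, plus a flag saying whether
    // a pool thread is currently draining it.
    private static final class CfQueue {
        final Queue<Runnable> pending = new ArrayDeque<>();
        boolean running = false;
    }

    private final ExecutorService pool;
    private final Map<String, CfQueue> queues = new ConcurrentHashMap<>();

    PerColumnFamilyCompactionExecutor(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Enqueue a compaction; start a drainer only if none is active for this column family.
    void submit(String columnFamily, Runnable compaction) {
        CfQueue q = queues.computeIfAbsent(columnFamily, k -> new CfQueue());
        synchronized (q) {
            q.pending.add(compaction);
            if (!q.running) {
                q.running = true;
                pool.execute(() -> drain(q));
            }
        }
    }

    // Runs queued compactions for one column family, one at a time, until the queue is empty.
    private void drain(CfQueue q) {
        while (true) {
            Runnable next;
            synchronized (q) {
                next = q.pending.poll();
                if (next == null) {
                    q.running = false;
                    return;
                }
            }
            try {
                next.run();
            } catch (RuntimeException e) {
                System.err.println("compaction task failed: " + e);
            }
        }
    }

    // Stops accepting new drainers and waits for already scheduled work to finish.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        PerColumnFamilyCompactionExecutor ex = new PerColumnFamilyCompactionExecutor(2);
        for (int i = 0; i < 6; i++) {
            String cf = "cf" + (i % 3);
            int n = i;
            ex.submit(cf, () -> System.out.println("compacting " + cf + " task " + n));
        }
        ex.shutdownAndWait(10);
    }
}
```

In this sketch a drainer holds its pool thread until its column family's queue is empty, so a very busy column family can occupy a thread for a long time. Running one task and then rescheduling would be fairer, at the cost of a little more bookkeeping.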
[jira] Updated: (CASSANDRA-1187) make the number of compaction threads configurable
Posted by "Sean Bridges (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Bridges updated CASSANDRA-1187:
------------------------------------
Attachment: CASSANDRA-1187-2.patch
Is this what you were thinking of?
The patch adds a new ConcurrentCompactedRow which can read columns from multiple SSTables in parallel. I'm not sure how much parallelism this patch gives. For the case where two SSTables have no rows in common, there is no benefit.
Trying to read from multiple rows in parallel seems like it would get messy.
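To make the idea of reading one row's columns from several SSTables in parallel concrete, here is a rough sketch. It is not the ConcurrentCompactedRow from the attached patch: the RowSource interface, the Column class, and the newest-timestamp-wins merge rule are assumptions introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the code from CASSANDRA-1187-2.patch.
final class ParallelRowMergeSketch {
    // Hypothetical column: a value plus the timestamp used to pick a winner.
    static final class Column {
        final String value;
        final long timestamp;
        Column(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Hypothetical stand-in for "read this row's columns from one SSTable".
    interface RowSource {
        Map<String, Column> readColumns(String rowKey);
    }

    // Starts one read per SSTable on the pool, then merges the results on the calling thread.
    static Map<String, Column> mergeRow(String rowKey, List<RowSource> sstables, ExecutorService pool) {
        List<Future<Map<String, Column>>> reads = new ArrayList<>();
        for (RowSource source : sstables) {
            Callable<Map<String, Column>> read = () -> source.readColumns(rowKey);
            reads.add(pool.submit(read));
        }
        Map<String, Column> merged = new TreeMap<>();
        for (Future<Map<String, Column>> pendingRead : reads) {
            Map<String, Column> columns;
            try {
                columns = pendingRead.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
            // Assumed rule: for a column present in several SSTables, keep the newest version.
            for (Map.Entry<String, Column> entry : columns.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(),
                        (a, b) -> b.timestamp > a.timestamp ? b : a);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        RowSource first = key -> Collections.singletonMap("name", new Column("old", 1L));
        RowSource second = key -> Collections.singletonMap("name", new Column("new", 2L));
        Map<String, Column> merged = mergeRow("row1", Arrays.asList(first, second), pool);
        System.out.println("name = " + merged.get("name").value);
        pool.shutdown();
    }
}
```

The sketch also shows the limitation raised in the comment above: the number of concurrent reads equals the number of SSTables that contain the row, so a row found in only one SSTable gets no parallelism.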