Posted to commits@cassandra.apache.org by "Radim Kolar (Created) (JIRA)" <ji...@apache.org> on 2012/02/26 16:04:54 UTC
[jira] [Created] (CASSANDRA-3961) Make index_interval configurable per column family
Make index_interval configurable per column family
--------------------------------------------------
Key: CASSANDRA-3961
URL: https://issues.apache.org/jira/browse/CASSANDRA-3961
Project: Cassandra
Issue Type: Improvement
Affects Versions: 1.0.7
Environment: Cassandra 1.0.7/unix
Reporter: Radim Kolar
Priority: Minor
After various experiments with mixing OLTP and OLAP workloads on a single Cassandra cluster, I discovered that a lot of memory is wasted on holding index samples for CFs that are rarely accessed, or whose index is not used much because they are read with slices over keys.
There is already a per-column-family setting for configuring bloom filters, bloom_filter_fp_chance. Please make index_interval configurable per CF as well. If this setting is not set or is zero, the default from cassandra.yaml will be used.
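The fallback rule requested above could be sketched roughly as follows. This is only an illustration in Python, not code from any attached patch: the function name, the constant, and the value 128 for the cluster-wide default are assumptions made for the example.

```python
from typing import Optional

# Assumed cluster-wide default, standing in for index_interval in cassandra.yaml.
DEFAULT_INDEX_INTERVAL = 128


def effective_index_interval(cf_setting: Optional[int],
                             default: int = DEFAULT_INDEX_INTERVAL) -> int:
    """Return the index interval to use for one column family (hypothetical helper).

    An unset (None) or zero per-CF value falls back to the cluster-wide
    default, as the request describes. Negative values are rejected.
    """
    if cf_setting is None or cf_setting == 0:
        return default
    if cf_setting < 0:
        raise ValueError("index_interval must not be negative")
    return cf_setting
```

With this rule, a rarely read CF could be given a large interval such as 512 while every CF without an explicit value keeps the global default.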
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501464#comment-13501464 ]
Jonathan Ellis commented on CASSANDRA-3961:
-------------------------------------------
From eyeballing it, this looks reasonable. Can you rebase to trunk?
> Make index_interval configurable per column family
> --------------------------------------------------
>
> Key: CASSANDRA-3961
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3961
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: Cassandra 1.0.7/unix
> Reporter: Radim Kolar
> Assignee: Radim Kolar
> Fix For: 1.3
>
> Attachments: cass-interval1.txt, cass-interval2.txt
[jira] [Commented] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507808#comment-13507808 ]
Jonathan Ellis commented on CASSANDRA-3961:
-------------------------------------------
My mistake. v6 lgtm; rebased and committed.
[jira] [Commented] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Jiri Eichler (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236627#comment-13236627 ]
Jiri Eichler commented on CASSANDRA-3961:
-----------------------------------------
If there are column families of very different sizes, such an option will save a lot of memory. Since this is not a technical limitation of Cassandra's design, the option should be available.
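As a back-of-the-envelope illustration of the saving, assume that one index sample is held in memory for every index_interval-th key, starting with the first. Under that assumption the number of samples scales inversely with the interval. The function name and the key count below are made up for the example and are not measurements from a real cluster.

```python
def estimated_samples(num_keys: int, index_interval: int) -> int:
    """Estimate how many index samples are held for num_keys keys.

    Assumes every index_interval-th key is sampled, starting with the
    first key, so the count is the ceiling of num_keys / index_interval.
    """
    if num_keys < 0:
        raise ValueError("num_keys must not be negative")
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-num_keys // index_interval)


# Hypothetical large, rarely read CF with 100 million keys.
samples_at_128 = estimated_samples(100_000_000, 128)  # 781250
samples_at_512 = estimated_samples(100_000_000, 512)  # 195313
```

Raising the interval from 128 to 512 for that one CF would cut its sample count to roughly a quarter, while a small, hot CF could keep the denser sampling.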
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval6.txt
Cleaned up some extra whitespace.
[jira] [Commented] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502390#comment-13502390 ]
Jonathan Ellis commented on CASSANDRA-3961:
-------------------------------------------
Comments on v4:
- Shouldn't need to touch avro, that's just for compatibility with 1.0
- Please observe http://wiki.apache.org/cassandra/CodeStyle
- If index interval is changed, we should regenerate it on restart instead of using the saved summary with the old interval
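The last point could be sketched like this. It is a simplified Python model, not the patch's code: IndexSummary, build_summary, and load_summary are hypothetical names, and a real summary holds index positions as well as keys.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class IndexSummary:
    """Simplified stand-in for a saved index summary (hypothetical type)."""
    index_interval: int
    samples: List[str]


def build_summary(keys: Sequence[str], index_interval: int) -> IndexSummary:
    """Sample every index_interval-th key, starting with the first."""
    if index_interval <= 0:
        raise ValueError("index_interval must be positive")
    return IndexSummary(index_interval, list(keys[::index_interval]))


def load_summary(saved: Optional[IndexSummary],
                 keys: Sequence[str],
                 configured_interval: int) -> IndexSummary:
    """Reuse the saved summary only if it was built with the current interval."""
    if saved is not None and saved.index_interval == configured_interval:
        return saved
    # Interval changed (or nothing was saved): regenerate from the keys.
    return build_summary(keys, configured_interval)
```

The check relies on the summary recording the interval it was built with, so a changed setting is noticed at load time without any extra bookkeeping.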
[jira] [Commented] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502998#comment-13502998 ]
Radim Kolar commented on CASSANDRA-3961:
----------------------------------------
Your review is wrong: #3 is already done by load(boolean).
[jira] [Assigned] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar reassigned CASSANDRA-3961:
--------------------------------------
Assignee: Radim Kolar
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval2.txt
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval1.txt
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval3.txt
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval5.txt
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Component/s: Core
Priority: Major (was: Minor)
Affects Version/s: 1.2.0 beta 1 (was: 1.0.7)
[jira] [Updated] (CASSANDRA-3961) Make index_interval configurable per column family
Posted by "Radim Kolar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radim Kolar updated CASSANDRA-3961:
-----------------------------------
Attachment: cass-interval4.txt