Posted to commits@cassandra.apache.org by "Michael Shuler (JIRA)" <ji...@apache.org> on 2014/07/29 21:24:41 UTC

[jira] [Commented] (CASSANDRA-5503) Large Dataset with Secondary Index

    [ https://issues.apache.org/jira/browse/CASSANDRA-5503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078219#comment-14078219 ] 

Michael Shuler commented on CASSANDRA-5503:
-------------------------------------------

[~bajbnet] would you be so kind as to include some environment details (HW/OS/JVM basics would be great) and C* version info on this ticket?  Any C* non-default configurations might be helpful, as well.

[~yukim] I'm not familiar with recent changes that may have affected this behavior - comments?

> Large Dataset with Secondary Index
> ----------------------------------
>
>                 Key: CASSANDRA-5503
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5503
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Brooke Bryan
>
> We have a cluster with 1 CF and 1 secondary index.  Currently, there are around 12 billion keys across 10 nodes, and we need to grow the cluster to support new data.  (This is only a small % of our total data at the moment.)
> The problem we face is that when a new node joins, it will often sit in the joining state and then fail a stream stage, which fails the whole join.  This has been caused by another node running a compaction and building up its heap too high, or by other issues.  However, I think this problem could be massively reduced, and the join process made more stable, if the joining node first pulled in all the data from the other nodes and only built its secondary indexes after the other nodes had done everything they need to do for the node to complete its join.



--
This message was sent by Atlassian JIRA
(v6.2#6252)