Posted to commits@cassandra.apache.org by "Cyril Scetbon (JIRA)" <ji...@apache.org> on 2013/05/31 10:21:19 UTC

[jira] [Commented] (CASSANDRA-5604) Vnodes decrease Hadoop performances cause it creates too many small splits

    [ https://issues.apache.org/jira/browse/CASSANDRA-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671251#comment-13671251 ] 

Cyril Scetbon commented on CASSANDRA-5604:
------------------------------------------

As I understand the issue, if I have a table of 15*6 million rows spread over 6 nodes with 256 tokens (RF=1 in my case), each vnode will hold fewer than 64k rows, so my Hadoop job will use only one mapper for a table of 15*6 million rows. With that in mind, I really don't think this is a low-priority issue. And of course, the more vnodes we have, the more likely each vnode is to hold fewer than 64k rows.
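
To make the arithmetic above concrete, here is a rough sketch of the rows-per-vnode estimate. It assumes rows are spread evenly across vnodes, that "256 tokens" means 256 tokens per node, and that the 64k threshold is 64 * 1024 rows; the function and variable names are made up for illustration and are not part of Cassandra or Hadoop.

```python
# Back-of-the-envelope estimate only; assumes a perfectly even distribution.
ASSUMED_SPLIT_SIZE = 64 * 1024  # assumed meaning of the "64k" rows threshold


def rows_per_vnode(total_rows, nodes, tokens_per_node):
    """Average number of rows held by a single vnode (token range)."""
    total_vnodes = nodes * tokens_per_node
    return total_rows / total_vnodes


def below_split_size(total_rows, nodes, tokens_per_node,
                     split_size=ASSUMED_SPLIT_SIZE):
    """True when the average vnode holds fewer rows than the split size."""
    return rows_per_vnode(total_rows, nodes, tokens_per_node) < split_size


if __name__ == "__main__":
    total = 15 * 6 * 1_000_000  # 15*6 million rows, as in the comment
    print(rows_per_vnode(total, nodes=6, tokens_per_node=256))
    print(below_split_size(total, nodes=6, tokens_per_node=256))
```

With these numbers each vnode averages about 58.6k rows, which is under the threshold whether "64k" is read as 64,000 or 65,536.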
                
> Vnodes decrease Hadoop performances cause it creates too many small splits
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5604
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5604
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.2.4
>         Environment: Linux Ubuntu 12.04 LTS x86_64
>            Reporter: Cyril Scetbon
>            Priority: Trivial
>

