Posted to commits@cassandra.apache.org by "Cyril Scetbon (JIRA)" <ji...@apache.org> on 2013/05/31 10:21:19 UTC
[jira] [Commented] (CASSANDRA-5604) Vnodes decrease Hadoop
performances cause it creates too many small splits
[ https://issues.apache.org/jira/browse/CASSANDRA-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671251#comment-13671251 ]
Cyril Scetbon commented on CASSANDRA-5604:
------------------------------------------
As I understand the issue, if I have a table of 15*6 million rows spread over 6 nodes with 256 tokens each (RF=1 in my case), I will have fewer than 64k rows per vnode, so my Hadoop job will use only one mapper for a table of 15*6 million rows. With that in mind, I really don't think this is a low-priority issue. And of course, the more vnodes we have, the more likely we are to end up with fewer than 64k rows per vnode.
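A rough sketch of the arithmetic behind that estimate. It assumes rows are evenly distributed across vnodes and that "64k" means a split size of 64 * 1024 rows; both are assumptions, and the names below are purely illustrative, not part of Cassandra or Hadoop:

```python
# Back-of-the-envelope estimate; assumes rows are spread evenly over all vnodes.
TOTAL_ROWS = 15 * 6 * 1_000_000   # table size from the comment above
NODES = 6
TOKENS_PER_NODE = 256             # vnodes per node
SPLIT_SIZE = 64 * 1024            # assumed meaning of "64k" rows per split


def rows_per_vnode(total_rows: int, nodes: int, tokens_per_node: int) -> float:
    """Average number of rows owned by a single vnode (RF=1)."""
    return total_rows / (nodes * tokens_per_node)


avg = rows_per_vnode(TOTAL_ROWS, NODES, TOKENS_PER_NODE)
below_split_size = avg < SPLIT_SIZE
print(f"{avg:.0f} rows per vnode, below split size: {below_split_size}")
```

Under these assumptions each vnode holds on average fewer rows than one split, and the same holds if "64k" is read as 64,000 rows.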
> Vnodes decrease Hadoop performances cause it creates too many small splits
> --------------------------------------------------------------------------
>
> Key: CASSANDRA-5604
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5604
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 1.2.4
> Environment: Linux Ubuntu 12.04 LTS x86_64
> Reporter: Cyril Scetbon
> Priority: Trivial
>