Posted to commits@cassandra.apache.org by "Shamim Ahmed (JIRA)" <ji...@apache.org> on 2013/05/28 08:54:21 UTC

[jira] [Issue Comment Deleted] (CASSANDRA-5544) Hadoop jobs assigns only one mapper in task

     [ https://issues.apache.org/jira/browse/CASSANDRA-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shamim Ahmed updated CASSANDRA-5544:
------------------------------------

    Comment: was deleted

(was: [~alexliu68]
1) I am using Pig and don't actually know how many splits I had (I am very curious to know how to calculate the split count). However, I had more than 30 million rows.
2) I didn't use VNODES.
3) SET mapred.min.split.size 12500000; )
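
On the split-count question in the comment above: as a rough sketch, the number of input splits can be estimated as the row count divided by the configured rows per split, rounded up. The 64 * 1024 figure below is an assumption (believed to be the default for cassandra.input.split.size), and the function name is hypothetical. The real count also depends on how the token ranges are subdivided, so treat the result as an estimate only.

```python
def estimate_split_count(total_rows: int, rows_per_split: int = 64 * 1024) -> int:
    """Estimate how many input splits a row count would produce.

    rows_per_split is an ASSUMED value (thought to be the default of
    cassandra.input.split.size); check the value your job actually uses.
    """
    if total_rows < 0 or rows_per_split <= 0:
        raise ValueError("total_rows must be >= 0 and rows_per_split > 0")
    # Integer ceiling division: round up so a partial split still counts.
    return (total_rows + rows_per_split - 1) // rows_per_split


if __name__ == "__main__":
    # 30 million rows, the figure mentioned in the comment above.
    print(estimate_split_count(30_000_000))  # prints 458
```

With an estimate in the hundreds, a job that runs a single mapper is not explained by the row count alone.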
    
> Hadoop jobs assigns only one mapper in task 
> --------------------------------------------
>
>                 Key: CASSANDRA-5544
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5544
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.2.1
>         Environment: Red hat linux 5.4, Hadoop 1.0.3, pig 0.11.1
>            Reporter: Shamim Ahmed
>            Assignee: Alex Liu
>         Attachments: Screen Shot 2013-05-26 at 4.49.48 PM.png
>
>
> We have seen very strange behaviour from our Hadoop cluster after upgrading 
> Cassandra from 1.1.5 to 1.2.1. We have a 5-node Cassandra cluster, three nodes of which are Hadoop slaves. Now, when we submit a job through a Pig script, only one map task is assigned, and it runs on one of the Hadoop slaves regardless of 
> the volume of data (already tried with more than a million rows).
> Pig is configured as follows:
> export PIG_HOME=/oracle/pig-0.10.0
> export PIG_CONF_DIR=${HADOOP_HOME}/conf
> export PIG_INITIAL_ADDRESS=192.168.157.103
> export PIG_RPC_PORT=9160
> export PIG_PARTITIONER=org.apache.cassandra.dht.Murmur3Partitioner
> We also have the following properties in Hadoop:
>  <property>
>  <name>mapred.tasktracker.map.tasks.maximum</name>
>  <value>10</value>
>  </property>
>  <property>
>  <name>mapred.map.tasks</name>
>  <value>4</value>
>  </property>
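
For reference, the properties above imply the following map slot capacity. This is a minimal sketch, assuming each of the three Hadoop slaves runs one TaskTracker with this configuration; the function name is hypothetical. Note that mapred.map.tasks is only a hint to the input format, so it does not by itself determine the number of mappers.

```python
def total_map_slots(task_trackers: int, max_maps_per_tracker: int) -> int:
    """Upper bound on map tasks that can run at the same time in the cluster."""
    if task_trackers < 0 or max_maps_per_tracker < 0:
        raise ValueError("arguments must be non-negative")
    return task_trackers * max_maps_per_tracker


if __name__ == "__main__":
    # ASSUMPTION: 3 TaskTrackers (one per Hadoop slave), each with
    # mapred.tasktracker.map.tasks.maximum = 10 as quoted above.
    print(total_map_slots(3, 10))  # prints 30
```

Under these assumptions slot capacity is not the limiting factor, which points at how the input is split rather than at the TaskTracker settings.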

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira