Posted to commits@cassandra.apache.org by "Mck SembWever (JIRA)" <ji...@apache.org> on 2013/05/18 23:51:17 UTC

[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

    [ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661436#comment-13661436 ] 

Mck SembWever commented on CASSANDRA-2388:
------------------------------------------

Jonathan,
 I can't say I'm in favour of enforcing data locality, because data locality in Hadoop doesn't work this way. When a tasktracker announces through its next heartbeat that it has a free task slot, the jobtracker does its best to assign it a task with data locality, but failing that it assigns a random task. The number of these random tasks can be quite high, just as I mentioned above:
{quote} Tasks are still being evenly distributed around the ring regardless of what the ColumnFamilySplit.locations is. {quote}

This can be almost solved by upgrading to hadoop-0.21+, using the fair scheduler, and setting the property {code}<property>
        <name>mapred.fairscheduler.locality.delay</name>
        <value>360000000</value>
</property>{code}.
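
To illustrate why a locality delay helps, here is a minimal sketch of the delay-scheduling idea from the Zaharia paper listed in the references below. It is not the fair scheduler's code: the class, its fields and the assign method are hypothetical names, and the real scheduler tracks the delay per job and per locality level. The sketch only shows the core rule: when a free slot is not local to any pending task, decline it until the configured delay has elapsed, and only then give up locality.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of delay scheduling for a single job.
final class DelaySchedulingSketch {

    // A pending task together with the hosts that hold its input split.
    static final class Task {
        final String id;
        final Set<String> locations;

        Task(String id, String... locations) {
            this.id = id;
            this.locations = new HashSet<>(Arrays.asList(locations));
        }
    }

    private final List<Task> pending;
    private final long localityDelayMs;
    // Time at which the job first declined a non-local slot, or -1 if it is not waiting.
    private long waitingSinceMs = -1;

    DelaySchedulingSketch(long localityDelayMs, List<Task> tasks) {
        this.localityDelayMs = localityDelayMs;
        this.pending = new ArrayList<>(tasks);
    }

    // Called when 'host' reports a free slot at time 'nowMs'.
    // Returns the task to launch there, or null if the slot is declined.
    Task assign(String host, long nowMs) {
        // Prefer a task whose input lives on the host offering the slot.
        for (Iterator<Task> it = pending.iterator(); it.hasNext();) {
            Task task = it.next();
            if (task.locations.contains(host)) {
                it.remove();
                waitingSinceMs = -1; // a local launch resets the wait
                return task;
            }
        }
        if (pending.isEmpty()) {
            return null;
        }
        if (waitingSinceMs < 0) {
            waitingSinceMs = nowMs; // start waiting for a local slot
        }
        if (nowMs - waitingSinceMs >= localityDelayMs) {
            return pending.remove(0); // delay expired: accept a non-local slot
        }
        return null; // decline this slot and keep waiting
    }
}
```

With a very large delay, as in the property above, the non-local branch is effectively never reached while local slots keep appearing, which is why this almost, but not strictly, enforces locality.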

At the end of the day, while Hadoop encourages data locality, it does not enforce it.
The ideal approach would be to sort all locations by proximity.
The feasible approach hopefully is still [~tjake]'s above. In addition I'd be in favour of a setting in the job's configuration controlling whether a location from another datacenter can be used.
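
A rough sketch of what sorting by proximity combined with such a setting could look like follows. Everything in it is an assumption for illustration: the Proximity ranks, the Topology interface and the allowRemoteDatacenter flag are hypothetical names, not existing Cassandra or Hadoop APIs, and how proximity is actually determined is left to the caller.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders a split's locations from nearest to farthest.
final class SplitLocationChooser {

    // Assumed proximity ranks; declaration order defines the sort order.
    enum Proximity { LOCAL_HOST, SAME_DATACENTER, REMOTE_DATACENTER }

    // Assumed lookup supplied by the caller (for example backed by a snitch).
    interface Topology {
        Proximity proximityOf(String host);
    }

    // Returns the locations sorted by proximity, dropping hosts in other
    // datacenters unless the (hypothetical) job setting allows them.
    static List<String> candidates(String[] locations, Topology topology,
                                   boolean allowRemoteDatacenter) {
        List<String> result = new ArrayList<>();
        for (String host : locations) {
            boolean remote = topology.proximityOf(host) == Proximity.REMOTE_DATACENTER;
            if (allowRemoteDatacenter || !remote) {
                result.add(host);
            }
        }
        // List.sort is stable, so hosts with equal proximity keep their original order.
        result.sort(Comparator.comparing((String host) -> topology.proximityOf(host)));
        return result;
    }
}
```

The record reader would then work through the returned list in order, so a replica in another datacenter is only ever a last resort, and only when the job explicitly allows it.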

references:
 - http://www.infoq.com/articles/HadoopInputFormat
 - http://www.mentby.com/matei-zaharia/running-only-node-local-jobs.html
 - https://groups.google.com/a/cloudera.org/forum/?fromgroups#!topic/cdh-user/3ggnE5hV0PY
 - http://www.cs.berkeley.edu/~matei/papers/2010/eurosys_delay_scheduling.pdf
                
> ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.2.5
>
>         Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split.
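
The behaviour requested in the description, trying multiple locations for a split, might look roughly like the sketch below. This is not the code of ColumnFamilyRecordReader or of any attached patch: the Connector interface and the method name are hypothetical, and IOException stands in for whatever connection failure the real client reports.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical failover helper: try each replica location in turn.
final class ReplicaFailover {

    // Assumed abstraction over "open a connection to this host".
    interface Connector<C> {
        C connect(String host) throws IOException;
    }

    // Returns the first connection that succeeds. Fails only when every
    // location has been tried, keeping the last failure as the cause.
    static <C> C connectToFirstAvailable(List<String> locations,
                                         Connector<C> connector) throws IOException {
        IOException lastFailure = null;
        for (String host : locations) {
            try {
                return connector.connect(host);
            } catch (IOException e) {
                lastFailure = e; // host is down or unreachable: try the next replica
            }
        }
        throw new IOException("no replica reachable among " + locations, lastFailure);
    }
}
```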
