Posted to common-user@hadoop.apache.org by Bhupesh Bansal <bb...@linkedin.com> on 2009/01/22 20:29:54 UTC
Distributed cache testing in local mode
Hey folks,
I am trying to use the DistributedCache in Hadoop jobs to pass around
configuration files, external jars (job-specific), and some archive data.
I want to test a job end-to-end in local mode, but I think the distributed
cache files are localized in TaskTracker code, which is not called in local
mode through LocalJobRunner.
I can do some fairly simple workarounds for this, but was just wondering if
folks have more ideas about it.
Thanks
Bhupesh
Re: Distributed cache testing in local mode
Posted by Tom White <to...@cloudera.com>.
It would be nice to make this more uniform. There's an outstanding
Jira on this if anyone is interested in looking at it:
https://issues.apache.org/jira/browse/HADOOP-2914
Tom
On Fri, Jan 23, 2009 at 12:14 AM, Aaron Kimball <aa...@cloudera.com> wrote:
> Hi Bhupesh,
>
> I've noticed the same problem -- LocalJobRunner makes the DistributedCache
> effectively not work; so my code often winds up with two codepaths to
> retrieve the local data :\
>
> You could try running in pseudo-distributed mode to test, though then you
> lose the ability to run a single-stepping debugger on the whole end-to-end
> process.
>
> - Aaron
Re: Distributed cache testing in local mode
Posted by Aaron Kimball <aa...@cloudera.com>.
Hi Bhupesh,
I've noticed the same problem -- LocalJobRunner makes the DistributedCache
effectively not work; so my code often winds up with two codepaths to
retrieve the local data :\
You could try running in pseudo-distributed mode to test, though then you
lose the ability to run a single-stepping debugger on the whole end-to-end
process.
- Aaron
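[Editor's sketch] The "two codepaths" workaround described above could look
roughly like the following. This is a minimal illustration, not code from the
thread: the class and method names (CacheFileResolver, resolve) are
hypothetical, and it deliberately avoids Hadoop types so it runs on its own.
The assumption is that the caller passes in the localized paths (for example,
converted from DistributedCache.getLocalCacheFiles(conf), which may come back
null or empty under LocalJobRunner) together with the original locations the
files were registered from.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper: picks a cache file by name, preferring the paths the
// framework localized and falling back to the original locations when the
// localized list is missing (as it may be under LocalJobRunner).
class CacheFileResolver {

    static File resolve(String name, List<String> localized, List<String> fallback) {
        // Codepath 1: normal cluster run, files were localized for the task.
        File found = findByName(name, localized);
        if (found != null) {
            return found;
        }
        // Codepath 2: local mode, read from where the file was registered.
        found = findByName(name, fallback);
        if (found != null) {
            return found;
        }
        throw new IllegalStateException("No cache file named " + name);
    }

    // Matches on the last path component only; does not check that the file exists.
    private static File findByName(String name, List<String> candidates) {
        if (candidates == null) {
            return null;
        }
        for (String candidate : candidates) {
            File file = new File(candidate);
            if (file.getName().equals(name)) {
                return file;
            }
        }
        return null;
    }
}
```

With a helper along these lines, the mapper or reducer setup code calls
resolve() once and does not need to know which mode it is running in.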