Posted to mapreduce-user@hadoop.apache.org by "Botelho, Andrew" <An...@emc.com> on 2013/07/10 19:15:39 UTC

New Distributed Cache

Hi,

I am trying to store a file in the Distributed Cache during my Hadoop job.
In the driver class, I tell the job to store the file in the cache with this code:

Job job = Job.getInstance();
job.addCacheFile(new URI("file name"));

That all compiles fine.  In the Mapper code, I try accessing the cached file with this method:

Path[] localPaths = context.getLocalCacheFiles();

However, I am getting warnings that this method is deprecated.
Does anyone know the newest way to access cached files in the Mapper code? (I am using Hadoop 2.0.5)

Thanks in advance,

Andrew
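
On the driver side of the question above, one detail may be worth a sketch: a literal string that contains a space is rejected by the single-argument URI constructor with a URISyntaxException, while the multi-argument java.net.URI constructor percent-encodes such characters. The helper below is hypothetical (CacheUriBuilder, the "namenode:8020" authority and the "lookup" alias are assumptions for illustration, not names from this thread), and the use of the fragment as an alias is my understanding of how the cache names localized files rather than something stated here.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class CacheUriBuilder {
    private CacheUriBuilder() {}

    /**
     * Builds an hdfs: URI with an optional fragment (alias).
     * The five-argument URI constructor quotes characters that are
     * illegal in a path, such as spaces, instead of throwing.
     */
    static URI build(String authority, String path, String alias)
            throws URISyntaxException {
        // scheme, authority, path, query (none), fragment
        return new URI("hdfs", authority, path, null, alias);
    }
}
```

In a driver this could be passed straight to the call already shown above, for example job.addCacheFile(CacheUriBuilder.build("namenode:8020", "/data/lookup file.txt", "lookup")).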

Re: New Distributed Cache

Posted by Bill Q <bi...@gmail.com>.
Hi Omkar,
Did you find out why getLocalCacheFiles is deprecated? If it is indeed
deprecated, what would be the alternative to use?


-- 
Many thanks.


Bill

Re: New Distributed Cache

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
Andrew, you are running into https://issues.apache.org/jira/browse/MAPREDUCE-5385. This got fixed in 2.1.1-beta (by Omkar :) )

+Vinod
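
Until a release with that fix is in use, a mapper can at least fail with a clearer message than a bare NullPointerException. This is a minimal sketch using only JDK types; CacheFilesCheck is a hypothetical helper, and in a mapper the argument would be the array returned by context.getCacheFiles().

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class CacheFilesCheck {
    private CacheFilesCheck() {}

    /**
     * Returns the array unchanged when it holds at least one URI,
     * otherwise throws with a message that points at the likely cause.
     */
    static URI[] requireCacheFiles(URI[] cacheFiles) {
        if (cacheFiles == null || cacheFiles.length == 0) {
            throw new IllegalStateException(
                "No cache files visible to this task; on releases before "
                + "2.1.1-beta this may be MAPREDUCE-5385");
        }
        return cacheFiles;
    }
}
```

A call such as URI[] uris = CacheFilesCheck.requireCacheFiles(context.getCacheFiles()); in setup() would then surface the problem immediately instead of at the first array access.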


Re: New Distributed Cache

Posted by Omkar Joshi <oj...@hortonworks.com>.
Yeah Andrew, there seems to be a problem with the context.getCacheFiles()
API, which is returning null.

  Path[] cachedFilePaths =
      context.getLocalCacheFiles(); // I am checking why it is deprecated...
  for (Path cachedFilePath : cachedFilePaths) {
    File cachedFile = new File(cachedFilePath.toUri().getRawPath());
    System.out.println("cached file path >> "
        + cachedFile.getAbsolutePath());
  }

I hope this helps for the time being. JobContext was supposed to replace
the DistributedCache API (which will be deprecated); however, there is some
problem with that, or I am missing something. I will reply if I find the
solution.

context.getCacheFiles() gives you the URIs used for localizing the files
(the original URIs used when adding them to the cache). However, you can
use the DistributedCache.getCacheFiles() API until the context API is fixed.

context.getLocalCacheFiles() gives you the actual file paths on the node
manager (after the files are localized).
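
The distinction above between the original URI and the localized path can be bridged without the deprecated call, under one assumption that is mine and not confirmed in this thread: that the framework links each cache file into the task's working directory under the URI fragment if one was given, and otherwise under the file's own name. The sketch below uses only JDK types, and CacheLinkNames is a hypothetical helper.

```java
import java.io.File;
import java.net.URI;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class CacheLinkNames {
    private CacheLinkNames() {}

    /**
     * Derives the name under which a cache file is assumed to appear in
     * the task's working directory: the fragment, else the last path
     * component of the original URI.
     */
    static String linkName(URI cacheUri) {
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = cacheUri.getPath();
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("URI has no path: " + cacheUri);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Relative File, resolved against the current working directory. */
    static File inWorkingDir(URI cacheUri) {
        return new File(linkName(cacheUri));
    }
}
```

With this, each URI from context.getCacheFiles() could be mapped to a local file with CacheLinkNames.inWorkingDir(uri); whether the file really exists under that name should be checked on the Hadoop version in use.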

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Thu, Jul 11, 2013 at 8:19 AM, Botelho, Andrew <An...@emc.com> wrote:

> So in my driver code, I try to store the file in the cache with this line
> of code:
>
> job.addCacheFile(new URI("file location"));
>
> Then in my Mapper code, I do this to try and access the cached file:
>
> URI[] localPaths = context.getCacheFiles();
> File f = new File(localPaths[0]);
>
> However, I get a NullPointerException when I do that in the Mapper code.
>
> Any suggestions?
>
> Andrew
>
> *From:* Shahab Yunus [mailto:shahab.yunus@gmail.com]
> *Sent:* Wednesday, July 10, 2013 9:43 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: New Distributed Cache
>
> Also, once you have the array of URIs after calling getCacheFiles, you
> can iterate over them using the File class or Path (
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI)
> )
>
> Regards,
> Shahab
>
> On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi <oj...@hortonworks.com>
> wrote:
>
> Did you try JobContext.getCacheFiles()?
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>

Re: New Distributed Cache

Posted by Omkar Joshi <oj...@hortonworks.com>.
Yeah Andrew.. there seems to be some problem with context.getCacheFiles()
api which is returning null..

 Path[] cachedFilePaths =

          context.getLocalCacheFiles(); // I am checking why it is
deprecated...

      for (Path cachedFilePath : cachedFilePaths) {

        File cachedFile = new File(cachedFilePath.toUri().getRawPath());

        System.out.println("cached fie path >> "

            + cachedFile.getAbsolutePath());

      }

I hope this helps for the time being.. JobContext was suppose to replace
DistributedCache api (it will be deprecated) however there is some problem
with that or I am missing something... Will reply if I find the solution to
it.

context.getCacheFiles will give you the uri used for localizing files...
(original uri used for adding it to cache)... However you can use
DistributedCache.getCacheFiles() api till context api is fixed.

context.getLocalCacheFiles .. will give you the actual file path on node
manager... (after file is localized).

Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Thu, Jul 11, 2013 at 8:19 AM, Botelho, Andrew <An...@emc.com> wrote:

> So in my driver code, I try to store the file in the cache with this line
> of code:
>
> job.addCacheFile(new URI("file location"));
>
> Then in my Mapper code, I do this to try and access the cached file:
>
> URI[] localPaths = context.getCacheFiles();
> File f = new File(localPaths[0]);
>
> However, I get a NullPointerException when I do that in the Mapper code.
>
> Any suggestions?
>
> Andrew
>
> *From:* Shahab Yunus [mailto:shahab.yunus@gmail.com]
> *Sent:* Wednesday, July 10, 2013 9:43 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: New Distributed Cache
>
> Also, once you have the array of URIs after calling getCacheFiles, you
> can iterate over them using the File class or Path (
> http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI)
> )
>
> Regards,
> Shahab
>
> On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi <oj...@hortonworks.com>
> wrote:
>
> did you try JobContext.getCacheFiles() ?
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>
>
> On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew <An...@emc.com>
> wrote:
>
> Hi,
>
> I am trying to store a file in the Distributed Cache during my Hadoop job.
> In the driver class, I tell the job to store the file in the cache with
> this code:
>
> Job job = Job.getInstance();
> job.addCacheFile(new URI("file name"));
>
> That all compiles fine. In the Mapper code, I try accessing the cached
> file with this method:
>
> Path[] localPaths = context.getLocalCacheFiles();
>
> However, I am getting warnings that this method is deprecated.
> Does anyone know the newest way to access cached files in the Mapper code?
> (I am using Hadoop 2.0.5)
>
> Thanks in advance,
>
> Andrew

RE: New Distributed Cache

Posted by "Botelho, Andrew" <An...@emc.com>.
So in my driver code, I try to store the file in the cache with this line of code:

job.addCacheFile(new URI("file location"));

Then in my Mapper code, I do this to try and access the cached file:

URI[] localPaths = context.getCacheFiles();
File f = new File(localPaths[0]);

However, I get a NullPointerException when I do that in the Mapper code.

Any suggestions?

Andrew

From: Shahab Yunus [mailto:shahab.yunus@gmail.com]
Sent: Wednesday, July 10, 2013 9:43 PM
To: user@hadoop.apache.org
Subject: Re: New Distributed Cache

Also, once you have the array of URIs after calling getCacheFiles, you can iterate over them using the File class or Path (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI))

Regards,
Shahab

On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi <oj...@hortonworks.com> wrote:
did you try JobContext.getCacheFiles() ?


Thanks,
Omkar Joshi
Hortonworks Inc. <http://www.hortonworks.com>

On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew <An...@emc.com> wrote:
Hi,

I am trying to store a file in the Distributed Cache during my Hadoop job.
In the driver class, I tell the job to store the file in the cache with this code:

Job job = Job.getInstance();
job.addCacheFile(new URI("file name"));

That all compiles fine.  In the Mapper code, I try accessing the cached file with this method:

Path[] localPaths = context.getLocalCacheFiles();

However, I am getting warnings that this method is deprecated.
Does anyone know the newest way to access cached files in the Mapper code? (I am using Hadoop 2.0.5)

Thanks in advance,

Andrew



Re: New Distributed Cache

Posted by Shahab Yunus <sh...@gmail.com>.
Also, once you have the array of URIs after calling getCacheFiles, you can
iterate over them using the File class or Path (
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/Path.html#Path(java.net.URI)
)
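
One caveat when using the File class here, shown as a small sketch that is not from the thread: java.io.File only accepts file: URIs and throws IllegalArgumentException for anything else, so a URI that points at HDFS needs Hadoop's Path and FileSystem instead. FileFromUriDemo, toLocalFile, and the example URIs are hypothetical names used for illustration only.

```java
import java.io.File;
import java.net.URI;

class FileFromUriDemo {

    // Converts a URI to a java.io.File only when it uses the file: scheme.
    // Returns null for any other scheme (for example hdfs:), because
    // java.io.File cannot address remote filesystems.
    // Note: new File(URI) also rejects file: URIs that carry a query or
    // fragment; this sketch does not handle that case.
    static File toLocalFile(URI uri) {
        if (uri == null || !"file".equalsIgnoreCase(uri.getScheme())) {
            return null;
        }
        return new File(uri);
    }

    public static void main(String[] args) {
        URI local = URI.create("file:///tmp/lookup.txt");
        URI remote = URI.create("hdfs://namenode:8020/data/lookup.txt");

        System.out.println(toLocalFile(local).getName());
        System.out.println(toLocalFile(remote));

        // Passing the remote URI straight to File fails at construction time.
        try {
            new File(remote);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the URIs returned in the task keep their original hdfs: scheme, wrapping them in Path as linked above avoids this exception.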

Regards,
Shahab


On Wed, Jul 10, 2013 at 5:08 PM, Omkar Joshi <oj...@hortonworks.com> wrote:

> did you try JobContext.getCacheFiles() ?
>
>
> Thanks,
> Omkar Joshi
> *Hortonworks Inc.* <http://www.hortonworks.com>
>
>
> On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew <An...@emc.com> wrote:
>
>> Hi,
>>
>> I am trying to store a file in the Distributed Cache during my Hadoop job.
>> In the driver class, I tell the job to store the file in the cache with
>> this code:
>>
>> Job job = Job.getInstance();
>> job.addCacheFile(new URI("file name"));
>>
>> That all compiles fine. In the Mapper code, I try accessing the cached
>> file with this method:
>>
>> Path[] localPaths = context.getLocalCacheFiles();
>>
>> However, I am getting warnings that this method is deprecated.
>> Does anyone know the newest way to access cached files in the Mapper
>> code? (I am using Hadoop 2.0.5)
>>
>> Thanks in advance,
>>
>> Andrew

Re: New Distributed Cache

Posted by Omkar Joshi <oj...@hortonworks.com>.
Did you try JobContext.getCacheFiles()?


Thanks,
Omkar Joshi
*Hortonworks Inc.* <http://www.hortonworks.com>


On Wed, Jul 10, 2013 at 10:15 AM, Botelho, Andrew <An...@emc.com> wrote:

> Hi,
>
> I am trying to store a file in the Distributed Cache during my Hadoop job.
> In the driver class, I tell the job to store the file in the cache with
> this code:
>
> Job job = Job.getInstance();
> job.addCacheFile(new URI("file name"));
>
> That all compiles fine. In the Mapper code, I try accessing the cached
> file with this method:
>
> Path[] localPaths = context.getLocalCacheFiles();
>
> However, I am getting warnings that this method is deprecated.
> Does anyone know the newest way to access cached files in the Mapper code?
> (I am using Hadoop 2.0.5)
>
> Thanks in advance,
>
> Andrew
