Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2019/03/22 22:22:31 UTC

[GitHub] [accumulo] ctubbsii opened a new issue #1052: Review DistributedCache usage (pretty sure it's very broken)

ctubbsii opened a new issue #1052: Review DistributedCache usage (pretty sure it's very broken)
URL: https://github.com/apache/accumulo/issues/1052
 
 
   The best explanation (if it even works) of how to use Hadoop's distributed cache is at https://stackoverflow.com/a/26421057/196405
   
   However, I have not tested it.
   
   Our code does not do anything like this. We tell the distributed cache to add a cache file, but then we just loop pointlessly through the list of cached URIs and reach directly out to the DFS to read the file.
   
   All of the Hadoop APIs for reading the local cache files are (rightfully) deprecated, because there's no way to map them back to the files that were added to the cache. The APIs for adding files to the cache are not deprecated, but they also aren't documented.
   
   We should test the above solution. If it works, we should use it; otherwise, we should completely remove any use of the distributed cache from our MapReduce code.
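
   A minimal, untested-against-a-cluster sketch of the approach to try, assuming the linked answer relies on Hadoop's URI-fragment convention: a cache URI of the form `path#name` is localized and exposed as a symlink called `name` in the task's working directory, so the task can open it as an ordinary local file instead of going back to the DFS. The class and method names below are hypothetical, and the Hadoop call (`Job.addCacheFile(URI)`) appears only in a comment so that the block runs without Hadoop on the classpath.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper; not part of Accumulo or Hadoop.
final class CacheFileSketch {

  // Job-setup side: append a fragment that names the local symlink.
  // The result would be passed to Hadoop as: job.addCacheFile(uri);
  static URI withLinkName(String dfsPath, String linkName) {
    return URI.create(dfsPath + "#" + linkName);
  }

  // Task side: read the localized copy through the symlink name,
  // resolved against the task's working directory (an assumption
  // about where Hadoop places the link).
  static List<String> readLocalized(Path workDir, URI cacheUri) throws IOException {
    String linkName = cacheUri.getFragment();
    if (linkName == null) {
      throw new IllegalArgumentException("cache URI has no #linkName fragment: " + cacheUri);
    }
    return Files.readAllLines(workDir.resolve(linkName), StandardCharsets.UTF_8);
  }
}
```

   In a real mapper or reducer, `workDir` would simply be the current directory (for example `Paths.get(".")`), and the link name would have to be carried from job setup to the task, for instance in the job configuration.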
