Posted to hdfs-user@hadoop.apache.org by Alberto Cordioli <co...@gmail.com> on 2012/10/19 09:19:24 UTC

DistributedCache: getLocalCacheFiles() always null

Hi all,

I am trying to use the DistributedCache with the new Hadoop API.
According to the documentation, nothing seems to have changed: usage
is the same as with the old API.
However, I am facing some problems. This is the snippet in which I use it:


// setting input/output format classes
....

// DISTRIBUTED CACHE
DistributedCache.addCacheFile(
    new Path("/cdr/input/cgi.csv#cgi.csv").toUri(), getConf());
job.waitForCompletion(true);


and in my reducer:

@Override
protected void setup(Context context) throws IOException {
    Path[] localFiles =
        DistributedCache.getLocalCacheFiles(context.getConfiguration());
    ....
}

localFiles is always null. I read that getLocalCacheFiles() should be
called in the configure() method, but the mapper/reducer of the new
API do not have that method.
What's wrong?
I read that the DistributedCache has some trouble if you run your
program from a client (e.g., inside an IDE), but I also tried running
it directly on the cluster.


Thanks.

-- 
Alberto Cordioli

Re: DistributedCache: getLocalCacheFiles() always null

Posted by Alberto Cordioli <co...@gmail.com>.
Ok, it was my fault.
Instead of using getConf() when adding the cache file, I should use
job.getConfiguration().
Now it works.
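
For the archives: the reason this matters is that Job snapshots the
Configuration it is handed at construction time, so anything added to the
original conf afterwards never reaches the submitted job. The sketch below
uses made-up plain-Java stand-ins (MiniConf/MiniJob are NOT Hadoop classes)
just to illustrate the copy semantics:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Hadoop's Configuration: a bag of key/value settings.
class MiniConf {
    final Map<String, String> props = new HashMap<>();
    MiniConf() {}
    MiniConf(MiniConf other) { props.putAll(other.props); } // copy constructor
}

// Stand-in for Hadoop's Job: snapshots the conf it is given.
class MiniJob {
    private final MiniConf conf;
    MiniJob(MiniConf original) { this.conf = new MiniConf(original); }
    MiniConf getConfiguration() { return conf; }
}

public class CacheConfDemo {
    public static void main(String[] args) {
        MiniConf driverConf = new MiniConf();
        MiniJob job = new MiniJob(driverConf);

        // Wrong: mutating the driver's conf after the job already copied it.
        driverConf.props.put("cache.file", "/cdr/input/cgi.csv#cgi.csv");
        System.out.println(job.getConfiguration().props.get("cache.file")); // null

        // Right: mutate the job's own copy of the conf.
        job.getConfiguration().props.put("cache.file", "/cdr/input/cgi.csv#cgi.csv");
        System.out.println(job.getConfiguration().props.get("cache.file"));
    }
}
```

So the fix is simply to pass job.getConfiguration() (the job's own copy) to
DistributedCache.addCacheFile() instead of the conf the job was built from.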



Cheers,
Alberto


On 19 October 2012 09:19, Alberto Cordioli <co...@gmail.com> wrote:
> Hi all,
>
> I am trying to use the DistributedCache with the new Hadoop API.
> According to the documentation, nothing seems to have changed: usage
> is the same as with the old API.
> However, I am facing some problems. This is the snippet in which I use it:
>
>
> // setting input/output format classes
> ....
>
> //DISTRIBUTED CACHE
> DistributedCache.addCacheFile(
>     new Path("/cdr/input/cgi.csv#cgi.csv").toUri(), getConf());
> job.waitForCompletion(true);
>
>
> and in my reducer:
>
> @Override
> protected void setup(Context context) throws IOException {
>     Path[] localFiles =
>         DistributedCache.getLocalCacheFiles(context.getConfiguration());
>     ....
> }
>
> localFiles is always null. I read that getLocalCacheFiles() should
> be called in the configure() method, but the mapper/reducer of the
> new API do not have that method.
> What's wrong?
> I read that the DistributedCache has some trouble if you run your
> program from a client (e.g., inside an IDE), but I also tried
> running it directly on the cluster.
>
>
> Thanks.
>
> --
> Alberto Cordioli



-- 
Alberto Cordioli
