Posted to mapreduce-user@hadoop.apache.org by John Armstrong <jo...@ccri.com> on 2011/07/26 21:28:46 UTC
Adding files to map/reduce classpath
I'm back to trying to add libraries to the classpath instead of handing
around a fat JAR. This time I've served up my directory full of JARs on
NFS, which each node in my cluster has mounted at /mnt/hadoop-libs. Now my
question is how to add that (local) directory to the classpath of the
mapper and reducer tasks. I've tried adding "-classpath
/mnt/hadoop-libs/*" to mapred.map.child.java.opts, but it doesn't seem to
work; the classpath I actually see the tasks being launched with is just the
usual local /usr/lib/hadoop/lib contents.
Re: Adding files to map/reduce classpath
Posted by John Armstrong <jo...@ccri.com>.
On Tue, 26 Jul 2011 12:35:48 -0700, Shrijeet Paliwal
<sh...@rocketfuel.com> wrote:
> **
> See if this (very old) reply from Mikhail helps.
> http://search-hadoop.com/m/QFVD1kEmQT
> Here is the patch he is referring to.
> http://m1.archiveorange.com/m/att/RNVYm/ArchiveOrange_8dEcdJI4bXFkKHBnsll8YzTc8u8a.patch
>
> **replying in hurry
Thanks; it looks like that would work, but it's a gamble whether the
client will be willing to install that patch. Do you know if it's been
added in CDH3-beta-3?
Re: Merge Reducers Outputs
Posted by David Rosenstrauch <da...@darose.net>.
On 07/26/2011 06:52 PM, Mohamed Riadh Trad wrote:
> Dear All,
>
> Is it possible to set up a task with multiple reducers and merge reducers outputs into one single file?
>
> Bests,
>
> Trad Mohamed Riadh, M.Sc, Ing.
Not within the map-reduce job itself, but you can merge the reducer outputs
after the job is done. At my previous company we used FileUtil.copyMerge()
to do this, and it worked quite well.
See:
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileUtil.html#copyMerge%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path,%20boolean,%20org.apache.hadoop.conf.Configuration,%20java.lang.String%29
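For anyone who wants to see what that post-job merge step amounts to, here is a minimal sketch in plain Java. It uses java.nio on a local directory, not the Hadoop API: it simply concatenates the part-* files into one file. The class and method names (PartMerger, mergeParts, demo) are made up for illustration, and sorting the parts by file name is my own assumption about the order you would want. On HDFS you would call FileUtil.copyMerge() with the arguments listed in the Javadoc linked above rather than anything like this.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the post-job merge step; this is NOT Hadoop code.
final class PartMerger {

    // Concatenates every part-* file in jobOutputDir into mergedFile,
    // in file-name order (part-00000, part-00001, ...).
    // mergedFile should live outside jobOutputDir.
    static void mergeParts(Path jobOutputDir, Path mergedFile) throws IOException {
        List<Path> parts = new ArrayList<Path>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(jobOutputDir, "part-*")) {
            for (Path p : stream) {
                parts.add(p);
            }
        }
        Collections.sort(parts); // directory listing order is unspecified, so sort explicitly
        try (OutputStream out = Files.newOutputStream(mergedFile)) {
            for (Path part : parts) {
                Files.copy(part, out); // append this reducer's output
            }
        }
    }

    // Small self-check: two fake reducer outputs merged into one file.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("job-output");
            Files.write(dir.resolve("part-00001"), "c\t3\n".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("part-00000"), "a\t1\nb\t2\n".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("_SUCCESS"), new byte[0]); // not matched by the part-* glob
            Path merged = Files.createTempFile("merged", ".txt");
            mergeParts(dir, merged);
            return new String(Files.readAllBytes(merged), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Note that concatenating the parts only gives you one file; whether its contents are in any useful global order depends on how the job partitioned the keys.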
DR
Re: Merge Reducers Outputs
Posted by Arun C Murthy <ac...@hortonworks.com>.
No. Either your data is small enough that it can all go to a single reducer, or you can set up a (sampling) partitioner so that the partitions themselves are ordered and you get globally sorted output from multiple reducers. Take a look at the TeraSort example for this.
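To illustrate the partitioner idea, here is a rough sketch in plain Java. It is not the TeraSort code and does not use the Hadoop Partitioner API: it samples some keys, picks split points from the sorted sample, and sends each key to the partition whose range contains it. If every partition is then sorted on its own, reading partitions 0, 1, 2, ... in order gives globally sorted output. All names here (RangePartitionSketch, splitPoints, partitionFor, globallySorted) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of range partitioning; NOT Hadoop or TeraSort code.
final class RangePartitionSketch {

    // Picks numPartitions - 1 evenly spaced split points from a sorted copy of the sample.
    // Assumes the sample holds at least numPartitions keys.
    static String[] splitPoints(List<String> sample, int numPartitions) {
        List<String> sorted = new ArrayList<String>(sample);
        Collections.sort(sorted);
        String[] splits = new String[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sorted.get(i * sorted.size() / numPartitions);
        }
        return splits;
    }

    // Partition p receives keys k with splits[p-1] <= k < splits[p].
    static int partitionFor(String key, String[] splits) {
        int pos = Arrays.binarySearch(splits, key);
        // Found: key equals a split point and starts the next partition.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    // Simulates the job: partition the keys, sort each partition on its own
    // (the per-reducer sort), then concatenate the partitions in index order.
    static List<String> globallySorted(List<String> keys, List<String> sample, int numPartitions) {
        String[] splits = splitPoints(sample, numPartitions);
        List<List<String>> partitions = new ArrayList<List<String>>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<String>());
        }
        for (String key : keys) {
            partitions.get(partitionFor(key, splits)).add(key);
        }
        List<String> out = new ArrayList<String>();
        for (List<String> partition : partitions) {
            Collections.sort(partition);
            out.addAll(partition);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("m", "c", "x", "f", "t", "a");
        List<String> keys = Arrays.asList("zebra", "kiwi", "apple", "t", "f", "banana");
        System.out.println(globallySorted(keys, sample, 3));
    }
}
```

The point of sampling is only to pick split points that spread the keys roughly evenly; the ordering guarantee comes from the ranges not overlapping.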
Arun
On Jul 26, 2011, at 3:52 PM, Mohamed Riadh Trad wrote:
> Dear All,
>
> Is it possible to set up a task with multiple reducers and merge reducers outputs into one single file?
>
> Bests,
>
> Trad Mohamed Riadh, M.Sc, Ing.
Merge Reducers Outputs
Posted by Mohamed Riadh Trad <Mo...@inria.fr>.
Dear All,
Is it possible to set up a job with multiple reducers and merge the reducers' outputs into one single file?
Bests,
Trad Mohamed Riadh, M.Sc, Ing.
PhD. student
INRIA-TELECOM PARISTECH - ENPC School of International Management
Office: 11-15
Phone: (33)-1 39 63 59 33
Fax: (33)-1 39 63 56 74
Email: riadh.trad@inria.fr
Home page: http://www-rocq.inria.fr/who/Mohamed.Trad/
Re: Adding files to map/reduce classpath
Posted by Shrijeet Paliwal <sh...@rocketfuel.com>.
**
See if this (very old) reply from Mikhail helps.
http://search-hadoop.com/m/QFVD1kEmQT
Here is the patch he is referring to.
http://m1.archiveorange.com/m/att/RNVYm/ArchiveOrange_8dEcdJI4bXFkKHBnsll8YzTc8u8a.patch
**replying in hurry
On Tue, Jul 26, 2011 at 12:28 PM, John Armstrong
<jo...@ccri.com> wrote:
> I'm back to trying to add libraries to the classpath instead of handing
> around a fat JAR. This time I've served up my directory full of JARs on
> NFS, which each node in my cluster has mounted at /mnt/hadoop-libs. Now my
> question is how to add that (local) directory to the classpath of the
> mapper and reducer tasks. I've tried adding "-classpath
> /mnt/hadoop-libs/*" to mapred.map.child.java.opts, but it doesn't seem to
> work; the actual classpath I can see being called is just the local
> /usr/lib/hadoop/lib stuff as usual.
>