Posted to common-user@hadoop.apache.org by Alejandro Abdelnur <tu...@gmail.com> on 2009/02/04 08:07:57 UTC

Re: Hadoop 0.19, Cascading 1.0 and MultipleOutputs problem

Mikhail,

You are right, please open a Jira on this.

Alejandro


On Wed, Jan 28, 2009 at 9:23 PM, Mikhail Yakshin
<gr...@gmail.com> wrote:

> Hi,
>
> We have a system based on Hadoop 0.18 / Cascading 0.8.1 and now I'm
> trying to port it to Hadoop 0.19 / Cascading 1.0. The first serious
> problem I've run into is that we use MultipleOutputs extensively in
> our jobs that deal with sequence files storing Cascading's Tuples.
>
> Since Cascading 0.9, Tuples are no longer WritableComparable and
> use Hadoop's generic serialization framework instead.
> However, in Hadoop 0.19, MultipleOutputs still requires the older
> WritableComparable interface. Thus, trying to do something like:
>
> MultipleOutputs.addNamedOutput(conf, "output-name",
> MySpecialMultiSplitOutputFormat.class, Tuple.class, Tuple.class);
> mos = new MultipleOutputs(conf);
> ...
> mos.getCollector("output-name", reporter).collect(tuple1, tuple2);
>
> yields an error:
>
> java.lang.RuntimeException: java.lang.RuntimeException: class cascading.tuple.Tuple not org.apache.hadoop.io.WritableComparable
>        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:752)
>        at org.apache.hadoop.mapred.lib.MultipleOutputs.getNamedOutputKeyClass(MultipleOutputs.java:252)
>        at org.apache.hadoop.mapred.lib.MultipleOutputs$InternalFileOutputFormat.getRecordWriter(MultipleOutputs.java:556)
>        at org.apache.hadoop.mapred.lib.MultipleOutputs.getRecordWriter(MultipleOutputs.java:425)
>        at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:511)
>        at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:476)
>        at my.namespace.MyReducer.reduce(MyReducer.java:xxx)
>
> Is there a known workaround for this? Is any work under way to make
> MultipleOutputs use generic Hadoop serialization?
>
> --
> WBR, Mikhail Yakshin
>
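
For readers following the stack trace quoted above: the exception is raised
by a type check performed when the named output's key class is looked up.
The sketch below reproduces that kind of check in plain Java, with no Hadoop
dependency. Every name in it (InterfaceCheckSketch, WritableComparableLike,
OldKey, TupleLike, getClassChecked) is a hypothetical stand-in, and the shape
of the check is an assumption for illustration, not Hadoop's actual source.

```java
public class InterfaceCheckSketch {

    // Hypothetical stand-in for org.apache.hadoop.io.WritableComparable.
    interface WritableComparableLike {}

    // Hypothetical key type that implements the older interface.
    static final class OldKey implements WritableComparableLike {}

    // Hypothetical stand-in for a Tuple that no longer implements it.
    static final class TupleLike {}

    // Assumed shape of the check: reject any class that does not
    // implement the required interface, otherwise return it typed.
    static <U> Class<? extends U> getClassChecked(Class<?> cls, Class<U> xface) {
        if (!xface.isAssignableFrom(cls)) {
            throw new RuntimeException(cls + " not " + xface.getName());
        }
        return cls.asSubclass(xface);
    }

    public static void main(String[] args) {
        // Passes: OldKey implements the required interface.
        System.out.println(getClassChecked(OldKey.class, WritableComparableLike.class));

        // Fails: TupleLike does not, so the check throws.
        try {
            getClassChecked(TupleLike.class, WritableComparableLike.class);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this assumption, a key or value class registered through
addNamedOutput fails at lookup time if it only supports generic
serialization, regardless of whether a serializer for it is configured.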

Re: Hadoop 0.19, Cascading 1.0 and MultipleOutputs problem

Posted by Mikhail Yakshin <gr...@gmail.com>.
On Wed, Feb 4, 2009 at 10:07 AM, Alejandro Abdelnur <tu...@gmail.com> wrote:
> Mikhail,
>
> You are right, please open a Jira on this.
>
> Alejandro

Done:
https://issues.apache.org/jira/browse/HADOOP-5167

-- 
WBR, Mikhail Yakshin