Posted to dev@mahout.apache.org by Raphael Cendrillon <ce...@gmail.com> on 2011/12/20 04:52:26 UTC

Multiple outputs

A question: is it possible to use multiple outputs with the new Hadoop API?

It seems that multiple outputs were only fully ported in Hadoop 0.21, but I think Mahout uses 0.20.

Does that mean I need to stick with the old API (JobConf etc.)?

Thanks!

Raphael. 

Re: Multiple outputs

Posted by Jake Mannix <ja...@gmail.com>.
Funny that you ask about this, as I was just writing code which required
hacking back to the old API because it needed to use MultipleOutputs.

Short answer: as far as I can tell, the only way to get MultipleOutputs
(or a map-side join via CompositeInputFormat) is to go back to the old and
ugly JobConf / MapReduceBase API.

Sad, but necessary, unless I'm mistaken.

  -jake

On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon <
cendrillon1978@gmail.com> wrote:

> A question, is it possible use multiple outputs with the new Hadoop API?
>
> It seems that multiple outputs were only full ported in Hadoop 0.21, but I
> think Mahout uses 0.20.
>
> Does that mean I need to stick with the old API (JobConf etc.)?
>
> Thanks!
>
> Raphael.

Re: Multiple outputs

Posted by Raphael Cendrillon <ce...@gmail.com>.
Fantastic. Thanks!

On Dec 19, 2011, at 8:47 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> in particular, i think B' job uses new api for Job etc. but yet
> produces old api mutliple outputs (and i think it may  even do it in
> both mapper and reducer).
> 
> On Mon, Dec 19, 2011 at 8:45 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> i hacked it. i use multiple outputs from old api which i pull on the
>> new api context (see code). Surprisingly, it works (most likely, new
>> api just delegates to it in 0.21)
>> 
>> On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon
>> <ce...@gmail.com> wrote:
>>> A question, is it possible use multiple outputs with the new Hadoop API?
>>> 
>>> It seems that multiple outputs were only full ported in Hadoop 0.21, but I think Mahout uses 0.20.
>>> 
>>> Does that mean I need to stick with the old API (JobConf etc.)?
>>> 
>>> Thanks!
>>> 
>>> Raphael.

Re: Multiple outputs

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
In particular, I think the B' job uses the new API for Job etc. but still
produces old-API multiple outputs (and I think it may even do it in
both mapper and reducer).

On Mon, Dec 19, 2011 at 8:45 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> i hacked it. i use multiple outputs from old api which i pull on the
> new api context (see code). Surprisingly, it works (most likely, new
> api just delegates to it in 0.21)
>
> On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon
> <ce...@gmail.com> wrote:
>> A question, is it possible use multiple outputs with the new Hadoop API?
>>
>> It seems that multiple outputs were only full ported in Hadoop 0.21, but I think Mahout uses 0.20.
>>
>> Does that mean I need to stick with the old API (JobConf etc.)?
>>
>> Thanks!
>>
>> Raphael.

Re: Multiple outputs

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Haven't seen a case where it wouldn't work so far. After all, it is all just
property interpretation helpers, so I suppose that for as long as the legacy
classes are still around, there's no compelling reason for it not to work.

apologies for brevity.

Sent from my android.
-Dmitriy
On Dec 19, 2011 9:37 PM, "Jake Mannix" <ja...@gmail.com> wrote:

> Ah, this is nice!  I had not realized this works.  Do you know in which
> hadoop versions it works for?
>
>  -jake
>
> On Mon, Dec 19, 2011 at 8:45 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> > i hacked it. i use multiple outputs from old api which i pull on the
> > new api context (see code). Surprisingly, it works (most likely, new
> > api just delegates to it in 0.21)
> >
> > On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon
> > <ce...@gmail.com> wrote:
> > > A question, is it possible use multiple outputs with the new Hadoop API?
> > >
> > > It seems that multiple outputs were only full ported in Hadoop 0.21, but I think Mahout uses 0.20.
> > >
> > > Does that mean I need to stick with the old API (JobConf etc.)?
> > >
> > > Thanks!
> > >
> > > Raphael.
> >
>

Re: Multiple outputs

Posted by Jake Mannix <ja...@gmail.com>.
Ah, this is nice!  I had not realized this works.  Do you know which
Hadoop versions it works for?

  -jake

On Mon, Dec 19, 2011 at 8:45 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> i hacked it. i use multiple outputs from old api which i pull on the
> new api context (see code). Surprisingly, it works (most likely, new
> api just delegates to it in 0.21)
>
> On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon
> <ce...@gmail.com> wrote:
> > A question, is it possible use multiple outputs with the new Hadoop API?
> >
> > It seems that multiple outputs were only full ported in Hadoop 0.21, but I think Mahout uses 0.20.
> >
> > Does that mean I need to stick with the old API (JobConf etc.)?
> >
> > Thanks!
> >
> > Raphael.
>

Re: Multiple outputs

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I hacked it. I use multiple outputs from the old API, which I pull in on
the new API context (see code). Surprisingly, it works (most likely, the
new API just delegates to it in 0.21).
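
The code referred to above is not part of this thread. As a rough illustration only, here is a toy model of the idea in plain Java with no Hadoop dependency. Every class in it (ToyLegacyConf, ToyNewContext, ToyMultipleOutputs), the property name "toy.multipleoutputs" and the output names "q" and "r" are hypothetical stand-ins, not Hadoop or Mahout code. It assumes the hack works because the old-API helper only reads job properties, so the configuration carried by a new-API context can be wrapped in the legacy type (in real Hadoop, presumably a JobConf built from the context's Configuration) and handed to the old helper.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Toy stand-in for a legacy JobConf: a thin wrapper over key/value properties.
final class ToyLegacyConf {
    final Properties props;
    ToyLegacyConf(Properties props) { this.props = props; }
}

// Toy stand-in for a new-API task context: it exposes the same properties.
final class ToyNewContext {
    private final Properties props;
    ToyNewContext(Properties props) { this.props = props; }
    Properties getConfiguration() { return props; }
}

// Toy stand-in for an old-API multiple-outputs helper (hypothetical, not Hadoop code).
final class ToyMultipleOutputs {
    private static final String KEY = "toy.multipleoutputs"; // hypothetical property name
    private final Map<String, List<String>> sinks = new LinkedHashMap<>();

    // Job-setup side: register a named output by appending it to a property.
    static void addNamedOutput(ToyLegacyConf conf, String name) {
        String current = conf.props.getProperty(KEY, "");
        conf.props.setProperty(KEY, current.isEmpty() ? name : current + "," + name);
    }

    // Task side: rebuild the set of named outputs purely from the properties.
    ToyMultipleOutputs(ToyLegacyConf conf) {
        for (String name : conf.props.getProperty(KEY, "").split(",")) {
            if (!name.isEmpty()) {
                sinks.put(name, new ArrayList<>());
            }
        }
    }

    // Write a record to a named output; unregistered names are rejected.
    void collect(String name, String record) {
        List<String> sink = sinks.get(name);
        if (sink == null) {
            throw new IllegalArgumentException("Undefined named output: " + name);
        }
        sink.add(record);
    }

    Map<String, List<String>> results() { return sinks; }
}

final class ToyMultipleOutputsDemo {
    static Map<String, List<String>> run() {
        // Job setup: named outputs are recorded only as properties.
        Properties props = new Properties();
        ToyLegacyConf setupConf = new ToyLegacyConf(props);
        ToyMultipleOutputs.addNamedOutput(setupConf, "q");
        ToyMultipleOutputs.addNamedOutput(setupConf, "r");

        // Inside a "new API" task: wrap the context's configuration in the
        // legacy type and hand it to the old-style helper.
        ToyNewContext context = new ToyNewContext(props);
        ToyMultipleOutputs outputs =
            new ToyMultipleOutputs(new ToyLegacyConf(context.getConfiguration()));
        outputs.collect("q", "row-1");
        outputs.collect("r", "row-2");
        outputs.collect("q", "row-3");
        return outputs.results();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the sketch is only that nothing in the helper depends on which API launched the task. Whether the real old-API classes behave this way on a given Hadoop release is something to check against that release.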

On Mon, Dec 19, 2011 at 7:52 PM, Raphael Cendrillon
<ce...@gmail.com> wrote:
> A question, is it possible use multiple outputs with the new Hadoop API?
>
> It seems that multiple outputs were only full ported in Hadoop 0.21, but I think Mahout uses 0.20.
>
> Does that mean I need to stick with the old API (JobConf etc.)?
>
> Thanks!
>
> Raphael.