Posted to user@hadoop.apache.org by xeonmailinglist <xe...@gmail.com> on 2015/02/27 16:22:46 UTC

1 job with Input data from 2 HDFS?

Hi,

I would like to have a mapreduce job that reads input data from 2 HDFS. 
Is this possible?

Thanks,

Re: 1 job with Input data from 2 HDFS?

Posted by xeon Mailinglist <xe...@gmail.com>.
Hi,

I don't understand this part of your answer: "read the other as a
side-input directly by creating a client.".

If I consider both inputs through the InputFormat, this means that a job
will contain both input paths in its configuration, and this is enough to
work. So, what is the "other"? Is it the second input? Can you please
explain what you meant?

On Friday, February 27, 2015, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> It is entirely possible. You should treat one of them as the primary
> inputs through the InputFormat/Mapper and read the other as a side-input
> directly by creating a client.
>
> +Vinod
>
> On Feb 27, 2015, at 7:22 AM, xeonmailinglist <xeonmailinglist@gmail.com>
> wrote:
>
> > Hi,
> >
> > I would like to have a mapreduce job that reads input data from 2 HDFS.
> > Is this possible?
> >
> > Thanks,
>
>

Re: 1 job with Input data from 2 HDFS?

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
It is entirely possible. You should treat one of them as the primary inputs through the InputFormat/Mapper and read the other as a side-input directly by creating a client.

+Vinod

On Feb 27, 2015, at 7:22 AM, xeonmailinglist <xe...@gmail.com> wrote:

> Hi,
> 
> I would like to have a mapreduce job that reads input data from 2 HDFS. Is this possible?
> 
> Thanks,
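
A minimal sketch of the distinction in the answer above, offered as an illustration rather than as the poster's own code. It uses only the Java standard library to build fully qualified hdfs:// URIs, so that each path names its own NameNode. The host names, port, paths, and the class and method names are hypothetical. The comments indicate where the Hadoop calls would presumably go in a real job; they are not executed here because they need the Hadoop libraries and two running clusters.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class TwoClusterInputs {

    /** Builds a fully qualified hdfs:// URI so the path carries its own NameNode address. */
    public static URI qualify(String host, int port, String path) {
        try {
            return new URI("hdfs", null, host, port, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("bad HDFS location: " + host + ":" + port + path, e);
        }
    }

    /** True when both URIs point at the same NameNode (same host:port authority). */
    public static boolean sameCluster(URI a, URI b) {
        return a.getAuthority().equalsIgnoreCase(b.getAuthority());
    }

    public static void main(String[] args) {
        // Hypothetical locations on two separate HDFS clusters.
        URI primary = qualify("nn1.example.com", 8020, "/data/logs");
        URI side = qualify("nn2.example.com", 8020, "/data/lookup.txt");

        // Primary input: in a real job this would be registered with the
        // InputFormat, e.g. FileInputFormat.addInputPath(job, new Path(primary)),
        // so the framework splits it and feeds it to the Mapper.
        System.out.println("primary input: " + primary);

        // Side input: in a real job the Mapper would open this itself, e.g. in
        // setup(), with FileSystem.get(side, conf).open(new Path(side)).
        // That FileSystem object is the "client" for the second cluster.
        System.out.println("side input:    " + side);

        System.out.println("same cluster?  " + sameCluster(primary, side));
    }
}
```

The side input is read in full by every map task, so this pattern fits best when the second data set is small, such as a lookup table. If both data sets are large, giving the InputFormat two fully qualified input paths, as the follow-up question suggests, may be the better fit, provided the job's tasks can reach both clusters.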

