Posted to user@trafodion.apache.org by Gunnar Tapper <ta...@gmail.com> on 2016/02/02 21:36:57 UTC

MRv1 vs. MRv2

Hi,

Does Trafodion require YARN with MRv2 or is MRv1 supported, too?

-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Thanks Suresh.

I agree, let's go with YARN/MRv2 for the initial version of the Provisioning
Guide. Later revisions can deal with the more complex cases.

Thanks,

Gunnar

On Tue, Feb 2, 2016 at 7:49 PM, Suresh Subbiah <su...@gmail.com>
wrote:

> Hi Gunnar,
>
> Agree with everything you state above, except "However, YARN is required
> for some of the backup/restore..."
> I am sorry I said the wrong thing way up in this thread. backup/restore
> uses map reduce to copy files. I think it will work with both MRv1 and
> MRv2. So we should be good with this line alone "you install Hive and
> whatever version of MapReduce you want to use for Hive". backup/restore
> should be able to use the same version of MapReduce as the one Hive is
> using.
>
> However there is a caveat that Trafodion backup/restore use HBase's
> ExportSnapshot java class. If this class is changed such that it can use
> only MRv2 then we will have the same dependency. If we are looking for
> simple install instructions then maybe we should just say that MRv2 (YARN)
> is required and can be used for both Hive and backup/restore?
>
> Thanks
> Suresh
>
>
> On Tue, Feb 2, 2016 at 6:20 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am a bit lost. Per previous messages, Hive requires MapReduce. So,
>> MapReduce must be required for full function. I can see that MapReduce is
>> not required if you don't use the Hive functionality.
>>
>> The Jira Hans pointed to seems to suggest to use MapReduce in lieu of
>> YARN, which must mean MRv1 since MRv2 is part of YARN.
>>
>> From what I now understand, you install Hive and whatever version of
>> MapReduce you want to use for Hive. However, YARN is required for some of
>> the backup/restore capabilities so you always need to install MRv2 (since
>> its part of YARN). So, MRv1 is relevant ONLY IF your installation is using
>> MRv1 for Hive processing.
>>
>> Did I get that right?
>>
>> I don't think it's wise to discuss exceptions such as "you don't need
>> MapReduce if you don't plan to use Hive via Trafodion" in the first
>> revision of the Trafodion Provisioning Guide. Too many angles dancing on a
>> needle's head. Instead, let's keep the requirements as simple as we can.
>>
>> Thanks,
>>
>> Gunnar
>>
>> On Tue, Feb 2, 2016 at 3:44 PM, Amanda Moran <am...@esgyn.com>
>> wrote:
>>
>>> I have done many installs without Yarn or MapReduce installed at all.
>>> Trafodion runs fine :)
>>>
>>> On Tue, Feb 2, 2016 at 2:39 PM, Hans Zeller <ha...@esgyn.com>
>>> wrote:
>>>
>>>> No, it is not required in the build environment.
>>>>
>>>> Hans
>>>>
>>>> On Tue, Feb 2, 2016 at 1:54 PM, Gunnar Tapper <ta...@gmail.com>
>>>> wrote:
>>>>
>>>>> Does this mean that MRv1 is now required in the build environment?
>>>>>
>>>>> On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> That decision would be made by Hive, not Trafodion. For people who
>>>>>> use install_local_hadoop, we recently changed that setup to use local
>>>>>> MapReduce, not YARN, see
>>>>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>>>>
>>>>>> Hans
>>>>>>
>>>>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <
>>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Suresh:
>>>>>>>
>>>>>>> Thanks for the information.
>>>>>>>
>>>>>>> Given from what you write, it seems that YARN with MRv2 is required
>>>>>>> for full functionality.
>>>>>>>
>>>>>>> MRv1 is a separate install in current distributions, which is why I
>>>>>>> am asking about it. How does Trafodion decide to run the MapReduce job as
>>>>>>> MRv1 vs. MRv2 if both are installed?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Gunnar
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>>>>> suresh.subbiah60@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>>>>
>>>>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>>>>> for Hive access.
>>>>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>>>>> the backup location.
>>>>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>>>>> commands are executed during the regression run.
>>>>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Suresh
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <
>>>>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Gunnar
>>>>>>>>> *If you think you can you can, if you think you can't you're
>>>>>>>>> right.*
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Gunnar
>>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>>
>>> Amanda Moran
>>>
>>
>>
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Suresh Subbiah <su...@gmail.com>.

Hi Gunnar,

Agree with everything you state above, except "However, YARN is required
for some of the backup/restore..."
I am sorry, I said the wrong thing way up in this thread. Backup/restore
uses MapReduce to copy files. I think it will work with both MRv1 and
MRv2. So we should be good with this line alone: "you install Hive and
whatever version of MapReduce you want to use for Hive". Backup/restore
should be able to use the same version of MapReduce as the one Hive is
using.

However, there is a caveat: Trafodion backup/restore uses HBase's
ExportSnapshot Java class. If this class is changed such that it can use
only MRv2, then we will have the same dependency. If we are looking for
simple install instructions, then maybe we should just say that MRv2 (YARN)
is required and can be used for both Hive and backup/restore?

Thanks
Suresh


On Tue, Feb 2, 2016 at 6:20 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Hi,
>
> I am a bit lost. Per previous messages, Hive requires MapReduce. So,
> MapReduce must be required for full function. I can see that MapReduce is
> not required if you don't use the Hive functionality.
>
> The Jira Hans pointed to seems to suggest to use MapReduce in lieu of
> YARN, which must mean MRv1 since MRv2 is part of YARN.
>
> From what I now understand, you install Hive and whatever version of
> MapReduce you want to use for Hive. However, YARN is required for some of
> the backup/restore capabilities so you always need to install MRv2 (since
> its part of YARN). So, MRv1 is relevant ONLY IF your installation is using
> MRv1 for Hive processing.
>
> Did I get that right?
>
> I don't think it's wise to discuss exceptions such as "you don't need
> MapReduce if you don't plan to use Hive via Trafodion" in the first
> revision of the Trafodion Provisioning Guide. Too many angles dancing on a
> needle's head. Instead, let's keep the requirements as simple as we can.
>
> Thanks,
>
> Gunnar
>
> On Tue, Feb 2, 2016 at 3:44 PM, Amanda Moran <am...@esgyn.com>
> wrote:
>
>> I have done many installs without Yarn or MapReduce installed at all.
>> Trafodion runs fine :)
>>
>> On Tue, Feb 2, 2016 at 2:39 PM, Hans Zeller <ha...@esgyn.com>
>> wrote:
>>
>>> No, it is not required in the build environment.
>>>
>>> Hans
>>>
>>> On Tue, Feb 2, 2016 at 1:54 PM, Gunnar Tapper <ta...@gmail.com>
>>> wrote:
>>>
>>>> Does this mean that MRv1 is now required in the build environment?
>>>>
>>>> On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> That decision would be made by Hive, not Trafodion. For people who use
>>>>> install_local_hadoop, we recently changed that setup to use local
>>>>> MapReduce, not YARN, see
>>>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>>>
>>>>> Hans
>>>>>
>>>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <
>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>
>>>>>> Hi Suresh:
>>>>>>
>>>>>> Thanks for the information.
>>>>>>
>>>>>> Given from what you write, it seems that YARN with MRv2 is required
>>>>>> for full functionality.
>>>>>>
>>>>>> MRv1 is a separate install in current distributions, which is why I
>>>>>> am asking about it. How does Trafodion decide to run the MapReduce job as
>>>>>> MRv1 vs. MRv2 if both are installed?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Gunnar
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>>>> suresh.subbiah60@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>>>
>>>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>>>> for Hive access.
>>>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>>>> the backup location.
>>>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>>>> commands are executed during the regression run.
>>>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Suresh
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <
>>>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Gunnar
>>>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>>
>>>>>> Gunnar
>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Gunnar
>>>> *If you think you can you can, if you think you can't you're right.*
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>>
>> Amanda Moran
>>
>
>
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Hi,

I am a bit lost. Per previous messages, Hive requires MapReduce. So,
MapReduce must be required for full function. I can see that MapReduce is
not required if you don't use the Hive functionality.

The Jira Hans pointed to seems to suggest using MapReduce in lieu of YARN,
which must mean MRv1 since MRv2 is part of YARN.

From what I now understand, you install Hive and whatever version of
MapReduce you want to use for Hive. However, YARN is required for some of
the backup/restore capabilities, so you always need to install MRv2 (since
it's part of YARN). So, MRv1 is relevant ONLY IF your installation is using
MRv1 for Hive processing.

Did I get that right?

I don't think it's wise to discuss exceptions such as "you don't need
MapReduce if you don't plan to use Hive via Trafodion" in the first
revision of the Trafodion Provisioning Guide. Too many angels dancing on a
needle's head. Instead, let's keep the requirements as simple as we can.

Thanks,

Gunnar

On Tue, Feb 2, 2016 at 3:44 PM, Amanda Moran <am...@esgyn.com> wrote:

> I have done many installs without Yarn or MapReduce installed at all.
> Trafodion runs fine :)
>
> On Tue, Feb 2, 2016 at 2:39 PM, Hans Zeller <ha...@esgyn.com> wrote:
>
>> No, it is not required in the build environment.
>>
>> Hans
>>
>> On Tue, Feb 2, 2016 at 1:54 PM, Gunnar Tapper <ta...@gmail.com>
>> wrote:
>>
>>> Does this mean that MRv1 is now required in the build environment?
>>>
>>> On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> That decision would be made by Hive, not Trafodion. For people who use
>>>> install_local_hadoop, we recently changed that setup to use local
>>>> MapReduce, not YARN, see
>>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>>
>>>> Hans
>>>>
>>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <tapper.gunnar@gmail.com
>>>> > wrote:
>>>>
>>>>> Hi Suresh:
>>>>>
>>>>> Thanks for the information.
>>>>>
>>>>> Given from what you write, it seems that YARN with MRv2 is required
>>>>> for full functionality.
>>>>>
>>>>> MRv1 is a separate install in current distributions, which is why I am
>>>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>>>> vs. MRv2 if both are installed?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>>> suresh.subbiah60@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>>
>>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>>> for Hive access.
>>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>>> the backup location.
>>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>>> commands are executed during the regression run.
>>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>>
>>>>>> Thanks
>>>>>> Suresh
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <
>>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Gunnar
>>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>>
>>> Gunnar
>>> *If you think you can you can, if you think you can't you're right.*
>>>
>>
>>
>
>
> --
> Thanks,
>
> Amanda Moran
>



-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Amanda Moran <am...@esgyn.com>.

I have done many installs without YARN or MapReduce installed at all.
Trafodion runs fine :)

On Tue, Feb 2, 2016 at 2:39 PM, Hans Zeller <ha...@esgyn.com> wrote:

> No, it is not required in the build environment.
>
> Hans
>
> On Tue, Feb 2, 2016 at 1:54 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Does this mean that MRv1 is now required in the build environment?
>>
>> On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com>
>> wrote:
>>
>>> Hi,
>>>
>>> That decision would be made by Hive, not Trafodion. For people who use
>>> install_local_hadoop, we recently changed that setup to use local
>>> MapReduce, not YARN, see
>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>
>>> Hans
>>>
>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
>>> wrote:
>>>
>>>> Hi Suresh:
>>>>
>>>> Thanks for the information.
>>>>
>>>> Given from what you write, it seems that YARN with MRv2 is required for
>>>> full functionality.
>>>>
>>>> MRv1 is a separate install in current distributions, which is why I am
>>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>>> vs. MRv2 if both are installed?
>>>>
>>>> Thanks,
>>>>
>>>> Gunnar
>>>>
>>>>
>>>>
>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>> suresh.subbiah60@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>
>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>> for Hive access.
>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>> the backup location.
>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>> commands are executed during the regression run.
>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>
>>>>> Thanks
>>>>> Suresh
>>>>>
>>>>>
>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <tapper.gunnar@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>>
>>>>>> Gunnar
>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Gunnar
>>>> *If you think you can you can, if you think you can't you're right.*
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Thanks,

Amanda Moran

Re: MRv1 vs. MRv2

Posted by Hans Zeller <ha...@esgyn.com>.

No, it is not required in the build environment.

Hans

On Tue, Feb 2, 2016 at 1:54 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Does this mean that MRv1 is now required in the build environment?
>
> On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com> wrote:
>
>> Hi,
>>
>> That decision would be made by Hive, not Trafodion. For people who use
>> install_local_hadoop, we recently changed that setup to use local
>> MapReduce, not YARN, see
>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>
>> Hans
>>
>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
>> wrote:
>>
>>> Hi Suresh:
>>>
>>> Thanks for the information.
>>>
>>> Given from what you write, it seems that YARN with MRv2 is required for
>>> full functionality.
>>>
>>> MRv1 is a separate install in current distributions, which is why I am
>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>> vs. MRv2 if both are installed?
>>>
>>> Thanks,
>>>
>>> Gunnar
>>>
>>>
>>>
>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>> suresh.subbiah60@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I don't think Trafodion requires YARN for most activity.
>>>>
>>>> For Hive table access, Trafodion uses Hive metadata access Java API and
>>>> libhdfs to actually scan the data file. Therefore YARN is not needed for
>>>> Hive access.
>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>> YARN is needed for backup/restore, since the HBase exportSnapshot class
>>>> Trafodion calls, used MapReduce to copy large snapshot files to/from the
>>>> backup location.
>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>> commands are executed during the regression run.
>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>
>>>> Thanks
>>>> Suresh
>>>>
>>>>
>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>>
>>> Gunnar
>>> *If you think you can you can, if you think you can't you're right.*
>>>
>>
>>
>
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Does this mean that MRv1 is now required in the build environment?

On Tue, Feb 2, 2016 at 2:12 PM, Hans Zeller <ha...@esgyn.com> wrote:

> Hi,
>
> That decision would be made by Hive, not Trafodion. For people who use
> install_local_hadoop, we recently changed that setup to use local
> MapReduce, not YARN, see
> https://issues.apache.org/jira/browse/TRAFODION-1781.
>
> Hans
>
> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Hi Suresh:
>>
>> Thanks for the information.
>>
>> Given from what you write, it seems that YARN with MRv2 is required for
>> full functionality.
>>
>> MRv1 is a separate install in current distributions, which is why I am
>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>> vs. MRv2 if both are installed?
>>
>> Thanks,
>>
>> Gunnar
>>
>>
>>
>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>> suresh.subbiah60@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I don't think Trafodion requires YARN for most activity.
>>>
>>> For Hive table access, Trafodion uses Hive metadata access Java API and
>>> libhdfs to actually scan the data file. Therefore YARN is not needed for
>>> Hive access.
>>> YARN is not needed for native Hbase or Trafodion table access too.
>>> YARN is needed for backup/restore, since the HBase exportSnapshot class
>>> Trafodion calls, used MapReduce to copy large snapshot files to/from the
>>> backup location.
>>> YARN is also needed for developer regressions as some vanilla Hive
>>> commands are executed during the regression run.
>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>
>>> Thanks
>>> Suresh
>>>
>>>
>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Gunnar
>>>> *If you think you can you can, if you think you can't you're right.*
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Thanks.

On Tue, Feb 2, 2016 at 2:36 PM, Hans Zeller <ha...@esgyn.com> wrote:

> No, unless a user explicitly puts it into a UDF, none of the built-in UDFs
> and SPJs use MR.
>
> Hans
>
> On Tue, Feb 2, 2016 at 1:21 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I think it's a configuration choice. For example:
>> http://www.cloudera.com/documentation/manager/5-0-x/Cloudera-Manager-Managing-Clusters/cm5mc_mapreduce_service.html
>>
>> Are there UDFs, SPJs, and similar functions that use MapReduce, too?
>>
>> Thanks,
>>
>> Gunnar
>>
>> On Tue, Feb 2, 2016 at 2:19 PM, Qifan Chen <qi...@esgyn.com> wrote:
>>
>>>
>>> Hi Hans,
>>>
>>> I think Hive uses MapReduce to sort the data during table population,
>>> after disabling YARN. This is observed on a workstation.
>>>
>>> Thanks -Qifan
>>>
>>> On Tue, Feb 2, 2016 at 3:12 PM, Hans Zeller <ha...@esgyn.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> That decision would be made by Hive, not Trafodion. For people who use
>>>> install_local_hadoop, we recently changed that setup to use local
>>>> MapReduce, not YARN, see
>>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>>
>>>> Hans
>>>>
>>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <tapper.gunnar@gmail.com
>>>> > wrote:
>>>>
>>>>> Hi Suresh:
>>>>>
>>>>> Thanks for the information.
>>>>>
>>>>> Given from what you write, it seems that YARN with MRv2 is required
>>>>> for full functionality.
>>>>>
>>>>> MRv1 is a separate install in current distributions, which is why I am
>>>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>>>> vs. MRv2 if both are installed?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>>> suresh.subbiah60@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>>
>>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>>> for Hive access.
>>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>>> the backup location.
>>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>>> commands are executed during the regression run.
>>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>>
>>>>>> Thanks
>>>>>> Suresh
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <
>>>>>> tapper.gunnar@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Gunnar
>>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Regards, --Qifan
>>>
>>>
>>
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Hans Zeller <ha...@esgyn.com>.

No. None of the built-in UDFs and SPJs use MR; it would only be involved
if a user explicitly puts it into a UDF.

Hans

On Tue, Feb 2, 2016 at 1:21 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Hi,
>
> I think it's a configuration choice. For example:
> http://www.cloudera.com/documentation/manager/5-0-x/Cloudera-Manager-Managing-Clusters/cm5mc_mapreduce_service.html
>
> Are there UDFs, SPJs, and similar functions that use MapReduce, too?
>
> Thanks,
>
> Gunnar
>
> On Tue, Feb 2, 2016 at 2:19 PM, Qifan Chen <qi...@esgyn.com> wrote:
>
>>
>> Hi Hans,
>>
>> I think Hive uses MapReduce to sort the data during table population,
>> after disabling YARN. This is observed on a workstation.
>>
>> Thanks -Qifan
>>
>> On Tue, Feb 2, 2016 at 3:12 PM, Hans Zeller <ha...@esgyn.com>
>> wrote:
>>
>>> Hi,
>>>
>>> That decision would be made by Hive, not Trafodion. For people who use
>>> install_local_hadoop, we recently changed that setup to use local
>>> MapReduce, not YARN, see
>>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>>
>>> Hans
>>>
>>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
>>> wrote:
>>>
>>>> Hi Suresh:
>>>>
>>>> Thanks for the information.
>>>>
>>>> Given from what you write, it seems that YARN with MRv2 is required for
>>>> full functionality.
>>>>
>>>> MRv1 is a separate install in current distributions, which is why I am
>>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>>> vs. MRv2 if both are installed?
>>>>
>>>> Thanks,
>>>>
>>>> Gunnar
>>>>
>>>>
>>>>
>>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>>> suresh.subbiah60@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I don't think Trafodion requires YARN for most activity.
>>>>>
>>>>> For Hive table access, Trafodion uses Hive metadata access Java API
>>>>> and libhdfs to actually scan the data file. Therefore YARN is not needed
>>>>> for Hive access.
>>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>>> YARN is needed for backup/restore, since the HBase exportSnapshot
>>>>> class Trafodion calls, used MapReduce to copy large snapshot files to/from
>>>>> the backup location.
>>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>>> commands are executed during the regression run.
>>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>>
>>>>> Thanks
>>>>> Suresh
>>>>>
>>>>>
>>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <tapper.gunnar@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>>
>>>>>> Gunnar
>>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Gunnar
>>>> *If you think you can you can, if you think you can't you're right.*
>>>>
>>>
>>>
>>
>>
>> --
>> Regards, --Qifan
>>
>>
>
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Hi,

I think it's a configuration choice. For example:
http://www.cloudera.com/documentation/manager/5-0-x/Cloudera-Manager-Managing-Clusters/cm5mc_mapreduce_service.html
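
As a rough illustration of the "configuration choice" point (this is not
something stated in the thread): in a Hadoop 2 client configuration, the
mapreduce.framework.name property in mapred-site.xml selects where MapReduce
jobs run. Its documented values are local, classic (MRv1), and yarn (MRv2),
with local as the default. The Python sketch below reads that property from
an XML string. The function name and the sample document are hypothetical,
and a distribution or management tool may handle the MRv1/MRv2 choice
differently.

```python
import xml.etree.ElementTree as ET

# Hadoop's documented default when the property is not set.
DEFAULT_FRAMEWORK = "local"


def mapreduce_framework(mapred_site_xml: str) -> str:
    """Return mapreduce.framework.name from a mapred-site.xml document.

    Hypothetical helper for illustration; it only inspects the XML text
    passed in and does not consult a running cluster.
    """
    root = ET.fromstring(mapred_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "mapreduce.framework.name":
            return prop.findtext("value", DEFAULT_FRAMEWORK).strip()
    return DEFAULT_FRAMEWORK


# Made-up sample configuration that selects YARN (MRv2).
SAMPLE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(mapreduce_framework(SAMPLE))
```

A check like this only shows which framework a client is configured to
submit jobs to; it says nothing about which services are actually installed
or running.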

Are there UDFs, SPJs, and similar functions that use MapReduce, too?

Thanks,

Gunnar

On Tue, Feb 2, 2016 at 2:19 PM, Qifan Chen <qi...@esgyn.com> wrote:

>
> Hi Hans,
>
> I think Hive uses MapReduce to sort the data during table population,
> after disabling YARN. This is observed on a workstation.
>
> Thanks -Qifan
>
> On Tue, Feb 2, 2016 at 3:12 PM, Hans Zeller <ha...@esgyn.com> wrote:
>
>> Hi,
>>
>> That decision would be made by Hive, not Trafodion. For people who use
>> install_local_hadoop, we recently changed that setup to use local
>> MapReduce, not YARN, see
>> https://issues.apache.org/jira/browse/TRAFODION-1781.
>>
>> Hans
>>
>> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
>> wrote:
>>
>>> Hi Suresh:
>>>
>>> Thanks for the information.
>>>
>>> Given from what you write, it seems that YARN with MRv2 is required for
>>> full functionality.
>>>
>>> MRv1 is a separate install in current distributions, which is why I am
>>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>>> vs. MRv2 if both are installed?
>>>
>>> Thanks,
>>>
>>> Gunnar
>>>
>>>
>>>
>>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>>> suresh.subbiah60@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I don't think Trafodion requires YARN for most activity.
>>>>
>>>> For Hive table access, Trafodion uses Hive metadata access Java API and
>>>> libhdfs to actually scan the data file. Therefore YARN is not needed for
>>>> Hive access.
>>>> YARN is not needed for native Hbase or Trafodion table access too.
>>>> YARN is needed for backup/restore, since the HBase exportSnapshot class
>>>> Trafodion calls, used MapReduce to copy large snapshot files to/from the
>>>> backup location.
>>>> YARN is also needed for developer regressions as some vanilla Hive
>>>> commands are executed during the regression run.
>>>> For the last 2 lines I think both MRv1 and MRv2 is supported.
>>>>
>>>> Thanks
>>>> Suresh
>>>>
>>>>
>>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>>
>>>>> --
>>>>> Thanks,
>>>>>
>>>>> Gunnar
>>>>> *If you think you can you can, if you think you can't you're right.*
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks,
>>>
>>> Gunnar
>>> *If you think you can you can, if you think you can't you're right.*
>>>
>>
>>
>
>
> --
> Regards, --Qifan
>
>


-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Qifan Chen <qi...@esgyn.com>.

Hi Hans,

I think Hive still uses MapReduce to sort the data during table population
after YARN is disabled. This was observed on a workstation.

Thanks -Qifan

On Tue, Feb 2, 2016 at 3:12 PM, Hans Zeller <ha...@esgyn.com> wrote:

> Hi,
>
> That decision would be made by Hive, not Trafodion. For people who use
> install_local_hadoop, we recently changed that setup to use local
> MapReduce, not YARN, see
> https://issues.apache.org/jira/browse/TRAFODION-1781.
>
> Hans
>
> On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Hi Suresh:
>>
>> Thanks for the information.
>>
>> Given what you write, it seems that YARN with MRv2 is required for
>> full functionality.
>>
>> MRv1 is a separate install in current distributions, which is why I am
>> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
>> vs. MRv2 if both are installed?
>>
>> Thanks,
>>
>> Gunnar
>>
>>
>>
>> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <
>> suresh.subbiah60@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I don't think Trafodion requires YARN for most activity.
>>>
>>> For Hive table access, Trafodion uses the Hive metadata access Java API
>>> and libhdfs to scan the data files. Therefore YARN is not needed for
>>> Hive access.
>>> YARN is not needed for native HBase or Trafodion table access either.
>>> YARN is needed for backup/restore, since the HBase ExportSnapshot class
>>> that Trafodion calls uses MapReduce to copy large snapshot files to/from
>>> the backup location.
>>> YARN is also needed for developer regressions, as some vanilla Hive
>>> commands are executed during the regression run.
>>> For the last two items, I think both MRv1 and MRv2 are supported.
>>>
>>> Thanks
>>> Suresh
>>>
>>>
>>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>>
>>>> --
>>>> Thanks,
>>>>
>>>> Gunnar
>>>> *If you think you can you can, if you think you can't you're right.*
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Regards, --Qifan

Re: MRv1 vs. MRv2

Posted by Hans Zeller <ha...@esgyn.com>.

Hi,

That decision would be made by Hive, not Trafodion. For people who use
install_local_hadoop, we recently changed that setup to use local
MapReduce, not YARN, see
https://issues.apache.org/jira/browse/TRAFODION-1781.

Hans
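
As background to the point that the MapReduce runtime is chosen outside
Trafodion: in Hadoop 2.x the runtime is selected by the
mapreduce.framework.name property (values local, classic, or yarn; the
default is local), normally set in mapred-site.xml. The sketch below is only
an illustration of reading that property. The sample XML and the helper
names are hypothetical and are not taken from install_local_hadoop or from
TRAFODION-1781.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal mapred-site.xml content. A real file sits in the
# Hadoop configuration directory and usually holds many more properties.
SAMPLE_MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>local</value>
  </property>
</configuration>
"""

# Human-readable meaning of each documented value of the property.
RUNTIMES = {
    "local": "local MapReduce (in-process job runner, no YARN)",
    "classic": "MRv1 (JobTracker/TaskTracker)",
    "yarn": "MRv2 on YARN",
}


def framework_name(mapred_site_xml, default="local"):
    """Return mapreduce.framework.name, or the Hadoop default if it is unset."""
    root = ET.fromstring(mapred_site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == "mapreduce.framework.name":
            return prop.findtext("value", default).strip()
    return default


def describe_runtime(mapred_site_xml):
    """Describe which MapReduce runtime the given configuration selects."""
    name = framework_name(mapred_site_xml)
    return RUNTIMES.get(name, "unknown runtime: " + name)


if __name__ == "__main__":
    print(describe_runtime(SAMPLE_MAPRED_SITE))
```

Whatever submits the job (Hive, in this discussion) picks up this setting
from its Hadoop configuration, which is consistent with the decision not
being Trafodion's.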

On Tue, Feb 2, 2016 at 12:58 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Hi Suresh:
>
> Thanks for the information.
>
> Given what you write, it seems that YARN with MRv2 is required for
> full functionality.
>
> MRv1 is a separate install in current distributions, which is why I am
> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
> vs. MRv2 if both are installed?
>
> Thanks,
>
> Gunnar
>
>
>
> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah
> <suresh.subbiah60@gmail.com> wrote:
>
>> Hi,
>>
>> I don't think Trafodion requires YARN for most activity.
>>
>> For Hive table access, Trafodion uses the Hive metadata access Java API
>> and libhdfs to scan the data files. Therefore YARN is not needed for
>> Hive access.
>> YARN is not needed for native HBase or Trafodion table access either.
>> YARN is needed for backup/restore, since the HBase ExportSnapshot class
>> that Trafodion calls uses MapReduce to copy large snapshot files to/from
>> the backup location.
>> YARN is also needed for developer regressions, as some vanilla Hive
>> commands are executed during the regression run.
>> For the last two items, I think both MRv1 and MRv2 are supported.
>>
>> Thanks
>> Suresh
>>
>>
>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>
>>> --
>>> Thanks,
>>>
>>> Gunnar
>>> *If you think you can you can, if you think you can't you're right.*
>>>
>>
>>
>
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>

Re: MRv1 vs. MRv2

Posted by Qifan Chen <qi...@esgyn.com>.

Hi Gunnar,

Hive uses MapReduce to load Hive tables in non-external formats, such as
ORC. So even though YARN is not required directly, it is very relevant for
Hive to get the data into Hive tables that Trafodion can access.

I am not sure whether YARN (or some alternative) is used in any way for
system resource management for Trafodion. That is something we need to
consider in the near future.

Regards, --Qifan

On Tue, Feb 2, 2016 at 2:58 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Hi Suresh:
>
> Thanks for the information.
>
> Given what you write, it seems that YARN with MRv2 is required for
> full functionality.
>
> MRv1 is a separate install in current distributions, which is why I am
> asking about it. How does Trafodion decide to run the MapReduce job as MRv1
> vs. MRv2 if both are installed?
>
> Thanks,
>
> Gunnar
>
>
>
> On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah
> <suresh.subbiah60@gmail.com> wrote:
>
>> Hi,
>>
>> I don't think Trafodion requires YARN for most activity.
>>
>> For Hive table access, Trafodion uses the Hive metadata access Java API
>> and libhdfs to scan the data files. Therefore YARN is not needed for
>> Hive access.
>> YARN is not needed for native HBase or Trafodion table access either.
>> YARN is needed for backup/restore, since the HBase ExportSnapshot class
>> that Trafodion calls uses MapReduce to copy large snapshot files to/from
>> the backup location.
>> YARN is also needed for developer regressions, as some vanilla Hive
>> commands are executed during the regression run.
>> For the last two items, I think both MRv1 and MRv2 are supported.
>>
>> Thanks
>> Suresh
>>
>>
>> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>>
>>> --
>>> Thanks,
>>>
>>> Gunnar
>>> *If you think you can you can, if you think you can't you're right.*
>>>
>>
>>
>
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>



-- 
Regards, --Qifan

Re: MRv1 vs. MRv2

Posted by Gunnar Tapper <ta...@gmail.com>.

Hi Suresh:

Thanks for the information.

Given what you write, it seems that YARN with MRv2 is required for
full functionality.

MRv1 is a separate install in current distributions, which is why I am
asking about it. How does Trafodion decide to run the MapReduce job as MRv1
vs. MRv2 if both are installed?

Thanks,

Gunnar



On Tue, Feb 2, 2016 at 1:50 PM, Suresh Subbiah <su...@gmail.com>
wrote:

> Hi,
>
> I don't think Trafodion requires YARN for most activity.
>
> For Hive table access, Trafodion uses the Hive metadata access Java API
> and libhdfs to scan the data files. Therefore YARN is not needed for
> Hive access.
> YARN is not needed for native HBase or Trafodion table access either.
> YARN is needed for backup/restore, since the HBase ExportSnapshot class
> that Trafodion calls uses MapReduce to copy large snapshot files to/from
> the backup location.
> YARN is also needed for developer regressions, as some vanilla Hive
> commands are executed during the regression run.
> For the last two items, I think both MRv1 and MRv2 are supported.
>
> Thanks
> Suresh
>
>
> On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>>
>> --
>> Thanks,
>>
>> Gunnar
>> *If you think you can you can, if you think you can't you're right.*
>>
>
>


-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*

Re: MRv1 vs. MRv2

Posted by Suresh Subbiah <su...@gmail.com>.

Hi,

I don't think Trafodion requires YARN for most activity.

For Hive table access, Trafodion uses the Hive metadata access Java API and
libhdfs to scan the data files. Therefore YARN is not needed for
Hive access.
YARN is not needed for native HBase or Trafodion table access either.
YARN is needed for backup/restore, since the HBase ExportSnapshot class
that Trafodion calls uses MapReduce to copy large snapshot files to/from the
backup location.
YARN is also needed for developer regressions, as some vanilla Hive commands
are executed during the regression run.
For the last two items, I think both MRv1 and MRv2 are supported.

Thanks
Suresh
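
For readers unfamiliar with the HBase tool mentioned above: ExportSnapshot
can also be run from the command line, where it launches a MapReduce job to
copy the snapshot files. The sketch below only assembles such a command
line. It does not show how Trafodion itself invokes the class, and the
snapshot name, destination URL, and helper name are hypothetical.

```python
import shlex


def export_snapshot_command(snapshot, copy_to, mappers=None):
    """Build the argv for the HBase ExportSnapshot command-line tool.

    snapshot: name of an existing HBase snapshot
    copy_to:  destination file system URL for the exported snapshot
    mappers:  optional number of map tasks used for the copy
    """
    cmd = [
        "hbase",
        "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", copy_to,
    ]
    if mappers is not None:
        cmd += ["-mappers", str(mappers)]
    return cmd


if __name__ == "__main__":
    # Hypothetical snapshot name and backup location, for illustration only.
    argv = export_snapshot_command(
        "my_table_snapshot", "hdfs://backup-host:8020/hbase-backup", mappers=16
    )
    # Print a shell-safe rendering of the command instead of running it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Because the copy runs as a MapReduce job, it depends on whichever MapReduce
runtime the cluster is configured to use.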

On Tue, Feb 2, 2016 at 2:36 PM, Gunnar Tapper <ta...@gmail.com>
wrote:

> Hi,
>
> Does Trafodion require YARN with MRv2 or is MRv1 supported, too?
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>