Posted to user@tez.apache.org by Pala M Muthaia <mc...@rocketfuelinc.com> on 2014/12/02 22:28:59 UTC

Enabling Tez sessions on HiveServer2

Hi,

I am trying to get Tez sessions enabled with HS2. I start the HiveServer2
instance with the flag "-hiveconf hive.execution.engine=tez" and then try
to submit multiple queries one after another, as the same user, to the HS2
instance.

When I check the YARN UI, I find that each of my queries is launched as a
new YARN application. While the new Tez application is running, the old Tez
applications are still alive. This differs from a Tez session in the Hive
CLI, where multiple queries are submitted to the same Tez application (if
launched within the Tez session timeout).


I followed the config instructions at
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html
so far.

Is there a separate config flag that I need to turn on for Tez sessions on
HS2? How should I enable Tez sessions with HiveServer2?


-pala

Re: Enabling Tez sessions on HiveServer2

Posted by Vikram Dixit <vi...@hortonworks.com>.
Hi Pala,

The requirement to turn off doAs is the last entry in the table on
that page. doAs needs to be turned off when using the pool of sessions
because once a session has started, there is currently no way of
changing the user of that session. With doAs turned off, everything
runs as user hive, so that problem is avoided.

However, as you mentioned, if you need auditing and the like, you
cannot use the pool. Disabling doAs is a requirement for SQL standard
authorization (HIVE-5837) as well. The initial goal was to address the
case where some users run an interactive set of queries and others run
batch queries. The users running interactive queries would use the
pool and its cached resources, not minding the lack of auditing,
whereas the users running batch queries can run as themselves and get
their own sessions. We are looking to address the issue of different
users creating tables and of auditing, and this is a good data point.

You are right that if you have doAs turned on, the behavior is the
same as with the CLI.

Thanks
Vikram.
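
The trade-off described above can be illustrated with a small toy model. This
is only a sketch of the reasoning in this message, not Hive's implementation:
the class names, the round-robin choice, and the connection ids are all
hypothetical. The one property it tries to capture is that a session's owner
is fixed at launch, so a shared pool only fits when every query runs as the
same service user.

```python
from dataclasses import dataclass, field
from itertools import count

_app_ids = count(1)  # stand-in for YARN application ids


@dataclass
class TezSession:
    owner: str  # fixed when the session starts; cannot be changed later
    app_id: int = field(default_factory=lambda: next(_app_ids))


class ToyHiveServer2:
    """Hypothetical model: pooled sessions without doAs, per-connection with it."""

    def __init__(self, do_as, pool_size=2):
        self.do_as = do_as
        # Pool sessions are pre-launched as the service user "hive".
        self.pool = [] if do_as else [TezSession("hive") for _ in range(pool_size)]
        self.per_connection = {}
        self._next = 0

    def session_for(self, connection_id, user):
        if not self.do_as:
            # Everything runs as "hive", so any pooled session fits any user.
            # Round-robin is a simplification chosen for this sketch.
            session = self.pool[self._next % len(self.pool)]
            self._next += 1
            return session
        # doAs on: the session must be owned by the submitting user, so each
        # connection gets its own session and reuses it for later queries.
        if connection_id not in self.per_connection:
            self.per_connection[connection_id] = TezSession(user)
        return self.per_connection[connection_id]
```

In this model, two users on a pool of size one share a session owned by
"hive", which is exactly where per-user auditing is lost.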

On Wed, Dec 3, 2014 at 3:34 PM, Pala M Muthaia
<mc...@rocketfuelinc.com> wrote:
> I should have mentioned this earlier: I am using Hive 13, Tez 0.4.1 on
> Hadoop 2.5
>
> Hi Vikram,
>
> Yes my intention was to enable shared sessions on HiveServer2 so that
> queries from multiple users can benefit from YARN app and container reuse.
>
> I didn't know doAs needs to be turned off. But I don't think that is
> something to give up - users create tables, manage data, query etc, and we
> need the queries/jobs to run as the user who submitted them for various
> purposes including authorization, auditing, table ownership etc. (Btw, i
> don't see any mention of turning off doAs to use session pools in the above
> link. Maybe it's a different page?)
>
> I ended up not setting any of the settings mentioned. Hive - HS2 - Tez stack
> works without any of the settings, with doAs turned on.
>
> As of now, what i see is that Tez application reuse happens if a single user
> submits queries one after the other, within a short interval, on the same
> JDBC connection. I suppose this is similar to Hive CLI.
>
> Is there a reason why session pool cannot be made to work with doAs enabled?
> Unless data is cached and reused within the pool containers, this should be
> ok right?
>
> On Tue, Dec 2, 2014 at 9:35 PM, Vikram Dixit <vi...@hortonworks.com> wrote:
>>
>> Hi Pala,
>>
>> Can you share the settings you have used for those mentioned in the
>> document? Are you trying to use the tez session pools? As the document
>> mentions, you need to disable doAs to ensure usage of the session
>> pool. Also, the hive server 2 settings need to be in place at the time
>> of starting the server.
>>
>> Regards
>> Vikram.
>>
>> On Tue, Dec 2, 2014 at 2:42 PM, Hitesh Shah <hi...@apache.org> wrote:
>> > BCC’ed user@tez.
>> >
>> > This question belongs to either the hive user list or the Hortonworks
>> > user forums.
>> >
>> > thanks
>> > — Hitesh
>> >
>> > On Dec 2, 2014, at 1:28 PM, Pala M Muthaia <mc...@rocketfuelinc.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I am trying to get Tez sessions enabled with HS2. I start the
>> >> HiveServer2 instance with the flag "-hiveconf hive.execution.engine=tez" and
>> >> then try to submit multiple queries one after another, as the same user, to
>> >> the HS2 instance.
>> >>
>> >> When i check the YARN UI, i find that each query of mine is launched as
>> >> a new YARN application. While the new Tez application is running, the old
>> >> Tez applications are still alive. This is different from Tez session in Hive
>> >> CLI, where multiple queries are submitted to the same Tez application (if
>> >> launched within the Tez session timeout).
>> >>
>> >>
>> >> I followed the config instructions at
>> >> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html
>> >> so far.
>> >>
>> >> Is there a separate config flag that i need to turn on for Tez sessions
>> >> on HS2? How should i enable Tez sessions with HiveServer2.
>> >>
>> >>
>> >> -pala
>> >

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Enabling Tez sessions on HiveServer2

Posted by Pala M Muthaia <mc...@rocketfuelinc.com>.
I should have mentioned this earlier: I am using Hive 13, Tez 0.4.1 on
Hadoop 2.5.

Hi Vikram,

Yes, my intention was to enable shared sessions on HiveServer2 so that
queries from multiple users can benefit from YARN app and container reuse.

I didn't know doAs needs to be turned off. But I don't think that is
something we can give up: users create tables, manage data, query, etc., and
we need the queries/jobs to run as the user who submitted them for various
purposes, including authorization, auditing, and table ownership. (Btw, I
don't see any mention of turning off doAs to use session pools in the above
link. Maybe it's a different page?)

I ended up not setting any of the settings mentioned. The Hive - HS2 - Tez
stack works without any of them, with doAs turned on.

As of now, what I see is that Tez application reuse happens if a single
user submits queries one after the other, within a short interval, on the
same JDBC connection. I suppose this is similar to the Hive CLI.

Is there a reason why the session pool cannot be made to work with doAs
enabled? Unless data is cached and reused within the pool containers, this
should be OK, right?

On Tue, Dec 2, 2014 at 9:35 PM, Vikram Dixit <vi...@hortonworks.com> wrote:

> Hi Pala,
>
> Can you share the settings you have used for those mentioned in the
> document? Are you trying to use the tez session pools? As the document
> mentions, you need to disable doAs to ensure usage of the session
> pool. Also, the hive server 2 settings need to be in place at the time
> of starting the server.
>
> Regards
> Vikram.
>
> On Tue, Dec 2, 2014 at 2:42 PM, Hitesh Shah <hi...@apache.org> wrote:
> > BCC’ed user@tez.
> >
> > This question belongs to either the hive user list or the Hortonworks
> user forums.
> >
> > thanks
> > — Hitesh
> >
> > On Dec 2, 2014, at 1:28 PM, Pala M Muthaia <mc...@rocketfuelinc.com>
> wrote:
> >
> >> Hi,
> >>
> >> I am trying to get Tez sessions enabled with HS2. I start the
> HiveServer2 instance with the flag "-hiveconf hive.execution.engine=tez"
> and then try to submit multiple queries one after another, as the same
> user, to the HS2 instance.
> >>
> >> When i check the YARN UI, i find that each query of mine is launched as
> a new YARN application. While the new Tez application is running, the old
> Tez applications are still alive. This is different from Tez session in
> Hive CLI, where multiple queries are submitted to the same Tez application
> (if launched within the Tez session timeout).
> >>
> >>
> >> I followed the config instructions at
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html
> so far.
> >>
> >> Is there a separate config flag that i need to turn on for Tez sessions
> on HS2? How should i enable Tez sessions with HiveServer2.
> >>
> >>
> >> -pala
> >

Re: Enabling Tez sessions on HiveServer2

Posted by Vikram Dixit <vi...@hortonworks.com>.
Hi Pala,

Can you share the settings you have used for those mentioned in the
document? Are you trying to use the tez session pools? As the document
mentions, you need to disable doAs to ensure usage of the session
pool. Also, the hive server 2 settings need to be in place at the time
of starting the server.

Regards
Vikram.
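
Since the settings have to be in place when the server starts, one way to
picture this is a small helper that assembles the HiveServer2 start command.
This is a sketch under assumptions: the property names are taken from the
Hive configuration reference rather than from this thread, the values are
placeholders, and it uses the `--hiveconf` spelling where the original
question used `-hiveconf`. The same properties could equally be set in
hive-site.xml.

```python
import shlex

# Placeholder values; adapt the queue name and session count to the cluster.
POOL_SETTINGS = {
    "hive.execution.engine": "tez",
    "hive.server2.enable.doAs": "false",  # the session pool requires doAs off
    "hive.server2.tez.default.queues": "default",
    "hive.server2.tez.sessions.per.default.queue": "2",
    "hive.server2.tez.initialize.default.sessions": "true",
}


def hs2_start_command(settings):
    """Return the argv list that starts HiveServer2 with the given settings."""
    argv = ["hive", "--service", "hiveserver2"]
    for key, value in settings.items():
        argv += ["--hiveconf", f"{key}={value}"]
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of launching anything.
    print(shlex.join(hs2_start_command(POOL_SETTINGS)))
```

The helper only builds the command. Running it is left to the operator, so
the settings can be reviewed before the server is restarted.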

On Tue, Dec 2, 2014 at 2:42 PM, Hitesh Shah <hi...@apache.org> wrote:
> BCC’ed user@tez.
>
> This question belongs to either the hive user list or the Hortonworks user forums.
>
> thanks
> — Hitesh
>
> On Dec 2, 2014, at 1:28 PM, Pala M Muthaia <mc...@rocketfuelinc.com> wrote:
>
>> Hi,
>>
>> I am trying to get Tez sessions enabled with HS2. I start the HiveServer2 instance with the flag "-hiveconf hive.execution.engine=tez" and then try to submit multiple queries one after another, as the same user, to the HS2 instance.
>>
>> When i check the YARN UI, i find that each query of mine is launched as a new YARN application. While the new Tez application is running, the old Tez applications are still alive. This is different from Tez session in Hive CLI, where multiple queries are submitted to the same Tez application (if launched within the Tez session timeout).
>>
>>
>> I followed the config instructions at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html so far.
>>
>> Is there a separate config flag that i need to turn on for Tez sessions on HS2? How should i enable Tez sessions with HiveServer2.
>>
>>
>> -pala
>


Re: Enabling Tez sessions on HiveServer2

Posted by Pala M Muthaia <mc...@rocketfuelinc.com>.
Ok.

It turns out that if I use the same JDBC connection, each successive query
from the same user gets submitted to the same Tez application. If I create
a different JDBC connection for each query, then each query runs in its own
Tez application.
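
On the client side, the observation above suggests opening one connection and
sending every query through it. A minimal sketch, assuming a Python DB-API
2.0 driver for HiveServer2 is available: `run_on_one_connection` and the
`connect` factory are hypothetical names, and sqlite3 stands in for the Hive
driver here only so that the example runs without a cluster.

```python
import sqlite3
from contextlib import closing


def run_on_one_connection(connect, queries):
    """Open a single DB-API connection and run every query through it."""
    results = []
    with closing(connect()) as conn:
        for sql in queries:
            # A fresh cursor per query, but always on the same connection.
            with closing(conn.cursor()) as cur:
                cur.execute(sql)
                results.append(cur.fetchall())
    return results


if __name__ == "__main__":
    opened = []

    def connect():
        # Stand-in factory; with a real driver this would connect to HS2.
        opened.append(1)
        return sqlite3.connect(":memory:")

    print(run_on_one_connection(connect, ["SELECT 1", "SELECT 2"]))
    print("connections opened:", len(opened))
```

Whether the queries then share a Tez application still depends on the server
side, for example on the queries arriving within the session timeout.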

On Tue, Dec 2, 2014 at 2:42 PM, Hitesh Shah <hi...@apache.org> wrote:

> BCC’ed user@tez.
>
> This question belongs to either the hive user list or the Hortonworks user
> forums.
>
> thanks
> — Hitesh
>
> On Dec 2, 2014, at 1:28 PM, Pala M Muthaia <mc...@rocketfuelinc.com>
> wrote:
>
> > Hi,
> >
> > I am trying to get Tez sessions enabled with HS2. I start the
> HiveServer2 instance with the flag "-hiveconf hive.execution.engine=tez"
> and then try to submit multiple queries one after another, as the same
> user, to the HS2 instance.
> >
> > When i check the YARN UI, i find that each query of mine is launched as
> a new YARN application. While the new Tez application is running, the old
> Tez applications are still alive. This is different from Tez session in
> Hive CLI, where multiple queries are submitted to the same Tez application
> (if launched within the Tez session timeout).
> >
> >
> > I followed the config instructions at
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html
> so far.
> >
> > Is there a separate config flag that i need to turn on for Tez sessions
> on HS2? How should i enable Tez sessions with HiveServer2.
> >
> >
> > -pala
>
>

Re: Enabling Tez sessions on HiveServer2

Posted by Hitesh Shah <hi...@apache.org>.
BCC’ed user@tez.

This question belongs to either the hive user list or the Hortonworks user forums. 

thanks
— Hitesh

On Dec 2, 2014, at 1:28 PM, Pala M Muthaia <mc...@rocketfuelinc.com> wrote:

> Hi,
> 
> I am trying to get Tez sessions enabled with HS2. I start the HiveServer2 instance with the flag "-hiveconf hive.execution.engine=tez" and then try to submit multiple queries one after another, as the same user, to the HS2 instance. 
> 
> When i check the YARN UI, i find that each query of mine is launched as a new YARN application. While the new Tez application is running, the old Tez applications are still alive. This is different from Tez session in Hive CLI, where multiple queries are submitted to the same Tez application (if launched within the Tez session timeout).
> 
> 
> I followed the config instructions at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.2/bk_installing_manually_book/content/rpm-chap-tez-configure_hive_for_tez.html so far.
> 
> Is there a separate config flag that i need to turn on for Tez sessions on HS2? How should i enable Tez sessions with HiveServer2.
> 
> 
> -pala

