Posted to dev@hive.apache.org by Edward Capriolo <ed...@gmail.com> on 2015/05/01 18:18:26 UTC

Re: Branch for HIVE-10511: Replacing the implementation of Hive CLI using Beeline

 The health test result for HIVESERVER2_SCM_HEALTH has become bad: This
role's process is starting. This role is supposed to be started.
 Time: May 1, 2015 11:08:59 AM

Speak of the devil.

Does anyone care to comment? Is this "just me", or can anyone claim
moderate/heavy trouble-free usage of hiveserver2? Because it's nice to say
things like 'the CLI is deprecated', but... In our setup we use hue -> thrift
and jdbc -> thrift, supporting maybe 10-30 users on a very recent release, and
the stability is not there. It is hard for me to believe that anything we
are doing is causing this instability.

Like I said, I get around it by running a load balancer etc., but I do not
want to be 100% reliant on it.

On Thu, Apr 30, 2015 at 10:39 AM, Edward Capriolo <ed...@gmail.com>
wrote:

> For what it is worth, even with all the enhancements made to hive-server2,
> it is just not as reliable as the CLI. I am actually using Cloudera 5.3 and
> we have multiple hive-server2 instances behind a load balancer. The load we
> are running is not particularly heavy, and the mean time to failure is < 3
> days. Cloudera Manager even provides an "automatically restart
> hiveserver2" feature; it is very prone to issues. Many times I have been at the
> point where I have considered writing queries to a file and then launching
> them from the CLI with -f because the reliability of hiveserver2 is
> questionable. I am a serious -1 on removing the CLI because it just plain
> works better.
>
> On Thu, Apr 30, 2015 at 9:29 AM, Xuefu Zhang <xz...@cloudera.com> wrote:
>
>> FYI. branch "beeline-cli" is created for the ongoing work in HIVE-10511.
>>
>> Thanks,
>> Xuefu
>>
>> On Mon, Apr 27, 2015 at 8:24 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
>>
>> > Hi all,
>> >
>> > I have created this JIRA per discussion in the "deprecating Hive CLI"
>> > thread. The task is non-trivial, so I think it makes sense to create
>> > a branch for the development to avoid interference with other ongoing work
>> > on master. If there is no objection, I will create a branch in 24 hours.
>> >
>> > Thanks,
>> > Xuefu
>> >
>>
>
>

Re: Branch for HIVE-10511: Replacing the implementation of Hive CLI using Beeline

Posted by Thejas Nair <th...@gmail.com>.
What sort of issue do you see? Is it some memory leak?
Most of the issues I have seen filed recently happen when HS2 uses a
remote metastore.
Can you try switching to an embedded metastore (hiveserver2
--hiveconf hive.metastore.uris=' ')? For multiple reasons (including
the historical reason of there being a bug with the use of a remote metastore in
an old version of HS2), the Hortonworks default configuration has been
using an embedded metastore.



Re: Branch for HIVE-10511: Replacing the implementation of Hive CLI using Beeline

Posted by Alexander Pivovarov <ap...@gmail.com>.
How did we get to a situation where the actions of one PMC member lead another
PMC member to say that he is a serious -1?

Let's look at the document describing how the Apache Software Foundation works:

"...process scaled very well without creating friction."

"...they were only filtering the people that ... matched the human
attitudes required to work well with others, especially in disagreement."

"This process is called "consensus gathering" and we consider it a very
important indication of a healthy community."

http://www.apache.org/foundation/how-it-works.html#meritocracy
http://www.apache.org/foundation/how-it-works.html#management


I believe we all agree that HS2 and the JDBC driver are key parts of the Hive
project.

Should we continue to work on improving them and postpone the hive-cli vs
beeline battle until most people agree that HS2 is stable and the JDBC
driver meets any standard?

Yesterday I tried Apache DBUtils QueryRunner with hive-jdbc-1.1.0. I was able
to make it work only after setting pmdKnownBroken=true.
HiveCallableStatement.getParameterMetaData is not supported in hive-jdbc:
https://github.com/apache/hive/blob/master/jdbc/src/java/org/apache/hive/jdbc/HiveCallableStatement.java
line 1422

The HiveCallableStatement class has 2462 lines. ALL of its methods except
"getConnection()" throw new SQLException("Method not supported");
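
To make the workaround concrete, here is a minimal sketch of the general idea
behind a "parameter metadata is known broken" flag: skip the
getParameterMetaData() call and bind parameters directly, falling back to
VARCHAR for nulls. This is an illustration only, not the DBUtils or hive-jdbc
source. ParamBinder, bind and unsupportedMetadataStatement are hypothetical
names, and the stub statement is a stand-in that merely imitates a driver
throwing "Method not supported".

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ParameterMetaData;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

// Hypothetical helper, not part of DBUtils or hive-jdbc.
final class ParamBinder {
    private ParamBinder() {}

    // Binds params to stmt. Parameter metadata is consulted only when the
    // caller has not declared it broken for this driver.
    static void bind(PreparedStatement stmt, boolean pmdKnownBroken, Object... params)
            throws SQLException {
        ParameterMetaData pmd = null;
        if (!pmdKnownBroken) {
            // Throws on drivers that do not implement this method.
            pmd = stmt.getParameterMetaData();
            if (pmd.getParameterCount() != params.length) {
                throw new SQLException("Wrong number of parameters");
            }
        }
        for (int i = 0; i < params.length; i++) {
            if (params[i] != null) {
                stmt.setObject(i + 1, params[i]);
            } else {
                // Without metadata the real column type is unknown, so
                // VARCHAR is used as an assumed fallback for nulls.
                int sqlType = (pmd != null) ? pmd.getParameterType(i + 1) : Types.VARCHAR;
                stmt.setNull(i + 1, sqlType);
            }
        }
    }

    // Stand-in for a driver statement whose getParameterMetaData() is
    // unsupported. Each setObject/setNull call is recorded in calls.
    static PreparedStatement unsupportedMetadataStatement(List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getParameterMetaData":
                    throw new SQLException("Method not supported");
                case "setObject":
                case "setNull":
                    calls.add(method.getName() + "(" + args[0] + ", " + args[1] + ")");
                    return null;
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (PreparedStatement) Proxy.newProxyInstance(
                ParamBinder.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                handler);
    }
}
```

With the stub, ParamBinder.bind(stmt, false, "a", null) fails with the
"Method not supported" SQLException before anything is bound, while
ParamBinder.bind(stmt, true, "a", null) binds both parameters. The cost of the
flag is that null parameters lose their real SQL type.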

How can Hive be used with BI tools or anything else if the JDBC interface is
that poor?

I think JDBC driver quality and HS2 stability are very important for Hive's
popularity.
A good JDBC interface will significantly extend the user base of the Tez and
Spark engines.

Alex

