Posted to dev@hive.apache.org by Thejas Nair <th...@gmail.com> on 2015/05/01 00:10:29 UTC

Re: [DISCUSS] Deprecating Hive CLI

Hi Xuefu,
What is the plan you have in mind for a transition to using beeline
from within hive?
I assume there is going to be some translation from hive cli options
and commands to beeline. Is that right?
Once the translation is in place, how would the switch happen?

I am thinking that once there is a hive-cli compatible beeline mode,
there can be an option to switch between the beeline and hive cli
codebases.
For example,
In hive version X, when the environment variable CLI_USE_BEELINE=true
is set, the "hive" command uses beeline underneath (the default remains
the cli codepath, so that users can start experimenting with the "hive"
command's beeline mode).
In hive version Y > X, the "hive" command starts using beeline
underneath by default.

Is something like this what you have in mind?
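
A rough sketch of the switch described above, written in Python purely
for illustration. Only the CLI_USE_BEELINE variable name comes from this
thread; the function name, the "cli"/"beeline" labels, and the choice to
treat any value other than "true" as a request for the old codepath are
assumptions, not an existing Hive API.

```python
import os


def choose_codepath(env, beeline_is_default=False):
    """Pick the codepath behind the "hive" command (hypothetical helper).

    beeline_is_default=False models "version X": beeline is opt-in.
    beeline_is_default=True models "version Y > X": beeline is the default.
    """
    value = env.get("CLI_USE_BEELINE")
    if value is None:
        # Variable not set: fall back to the release default.
        return "beeline" if beeline_is_default else "cli"
    # Assumption: only the literal "true" (case-insensitive) selects beeline.
    return "beeline" if value.strip().lower() == "true" else "cli"


if __name__ == "__main__":
    # Version X behaviour against the real process environment.
    print(choose_codepath(os.environ))
```

With this shape, moving from version X to version Y is a one-line change
of the default, and the same variable can still be used to switch back.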

Thanks,
Thejas



On Mon, Apr 27, 2015 at 5:31 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
> FYI, I have created an uber JIRA for this:
> https://issues.apache.org/jira/browse/HIVE-10511.
>
> Thanks,
> Xuefu
>
> On Mon, Apr 27, 2015 at 4:54 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
>
>> Yes, Olga. I will create JIRAs to track those.
>>
>> Thanks,
>> Xuefu
>>
>> On Mon, Apr 27, 2015 at 4:51 PM, Olga L. Natkovich <
>> olgan@yahoo-inc.com.invalid> wrote:
>>
>>> We would need to build a test suite that makes sure that the new
>>> implementation is compatible with the old one for users to adopt it. We
>>> would also need some benchmarks to compare performance. Could you please
>>> include this in the proposal as well?
>>> Thanks,
>>> Olga
>>>       From: Xuefu Zhang <xz...@cloudera.com>
>>>  To: "dev@hive.apache.org" <de...@hive.apache.org>
>>>  Sent: Monday, April 27, 2015 4:46 PM
>>>  Subject: Re: [DISCUSS] Deprecating Hive CLI
>>>
>>> The existing implementation of Hive CLI will be replaced, so that the Hive
>>> community doesn't need to maintain two code paths for the same thing. That's
>>> basically what option #2 provides.
>>>
>>>
>>>
>>> On Mon, Apr 27, 2015 at 4:01 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>>>
>>> > Does it mean that existing Hive CLI will be killed?
>>> >
>>> > On Mon, Apr 27, 2015 at 3:46 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
>>> >
>>> > > To be precise, the proposal is NOT to deprecate the Hive CLI, but rather
>>> > > to change its implementation to use beeline, which seems to have consensus.
>>> > >
>>> > > On Mon, Apr 27, 2015 at 2:47 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>>> > >
>>> > > > I just started a survey on deprecating Hive CLI. Please share your
>>> > > > opinion.
>>> > > >
>>> > > > Deprecating Hive CLI:
>>> > > > https://www.surveymonkey.com/s/XFHLM57
>>> > > >
>>> > > > Results:
>>> > > > https://www.surveymonkey.com/results/SM-JHYY5DR9/
>>> > > >
>>> > > >
>>> > > > On Mon, Apr 27, 2015 at 2:23 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>>> > > >
>>> > > > > Xuefu,
>>> > > > >
>>> > > > > I'm just saying that most shells (e.g. mysql or accumulo) reserve
>>> > > > > -u for user.
>>> > > > >
>>> > > > > I believe lots of stuff in Hive takes MySQL as an example.
>>> > > > >
>>> > > > > Alex
>>> > > > >
>>> > > > >
>>> > > > > On Mon, Apr 27, 2015 at 2:14 PM, Xuefu Zhang <xzhang@cloudera.com> wrote:
>>> > > > >
>>> > > > >> Alex,
>>> > > > >>
>>> > > > >> Just to be sure, we are talking about replacing Hive CLI, not the mysql
>>> > > > >> and accumulo command line shells. Thus, I'm not sure this is relevant.
>>> > > > >> Regardless, I think we'd better have some writeup in the proposed uber
>>> > > > >> JIRA so that everyone knows what we are signing up for.
>>> > > > >>
>>> > > > >> Thanks,
>>> > > > >> Xuefu
>>> > > > >>
>>> > > > >> On Mon, Apr 27, 2015 at 12:57 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>>> > > > >>
>>> > > > >> > Mysql and accumulo command line shells use -u to pass <user>
>>> > > > >> >
>>> > > > >> > Can beeline use -u as well? Currently -u is reserved for URL?
>>> > > > >> > On Apr 27, 2015 12:42 PM, "Xuefu Zhang" <xz...@cloudera.com> wrote:
>>> > > > >> >
>>> > > > >> > > Thanks to all for the input. I assume that we have a consensus that
>>> > > > >> > > we'd like to keep Hive as an alias to beeline with embedded HS2 and
>>> > > > >> > > make the user transition as smooth as possible by identifying gaps and
>>> > > > >> > > fixing issues. I'm going to create an umbrella JIRA and subtasks to
>>> > > > >> > > track the progress. Please let me know if you have further questions.
>>> > > > >> > >
>>> > > > >> > > Thanks,
>>> > > > >> > > Xuefu
>>> > > > >> > >
>>> > > > >> > > On Sat, Apr 25, 2015 at 12:59 AM, Lars Francke <lars.francke@gmail.com> wrote:
>>> > > > >> > >
>>> > > > >> > > > Yes, well put. It is about usability and "least surprise".
>>> > > > >> > > >
>>> > > > >> > > > So if people didn't have to deal with JDBC syntax by default and
>>> > > > >> > > > could use "hive" instead of "beeline" to start, that'd be good.
>>> > > > >> > > >
>>> > > > >> > > >
>>> > > > >> > > > On Sat, Apr 25, 2015 at 12:38 AM, Alan Gates <alanfgates@gmail.com> wrote:
>>> > > > >> > > >
>>> > > > >> > > >> If I understand correctly this is an argument about usability, not
>>> > > > >> > > >> functionality.  So if Hive still had the CLI but it happened to use
>>> > > > >> > > >> either HS2 or embedded HS2 (depending on configuration) underneath,
>>> > > > >> > > >> your concerns would be addressed.  Is that correct?
>>> > > > >> > > >>
>>> > > > >> > > >> Alan.
>>> > > > >> > > >>
>>> > > > >> > > >>  Lars Francke <la...@gmail.com>
>>> > > > >> > > >>  April 23, 2015 at 15:53
>>> > > > >> > > >> I've been at about 20 different customers in the years since Beeline
>>> > > > >> > > >> was added. I can only think of a single one that has used beeline. The
>>> > > > >> > > >> instinct is to use "hive", partly because it is easy to remember and
>>> > > > >> > > >> intuitive and partly because it is easier to use. I end up googling
>>> > > > >> > > >> the stupid JDBC syntax every single time.
>>> > > > >> > > >>
>>> > > > >> > > >> I know this might be a bit "out there" but I propose something else:
>>> > > > >> > > >> 1) Rename (or link) "beeline" to "hive"
>>> > > > >> > > >> 2) Add a "--hiveserver2" (or "--jdbc" or "--beeline") option to the
>>> > > > >> > > >> "hive" command to get the current "beeline"; this'd keep the CLI as
>>> > > > >> > > >> default. We could also add a "--legacy" or "--cli" option and make
>>> > > > >> > > >> "hiveserver2/beeline" the default.
>>> > > > >> > > >> 3) Add a "--embedded-hs2" option to the "hive" command to get an
>>> > > > >> > > >> embedded HS2 in Beeline
>>> > > > >> > > >> 4) Add some documentation to beeline reminding people on startup
>>> > > > >> > > >> how to connect and how to use embedded mode
>>> > > > >> > > >>
>>> > > > >> > > >> The fact is that the old shell just works for lots of people and
>>> > > > >> > > >> there's just no need for beeline for these people. Also the name is
>>> > > > >> > > >> confusing, especially for non-native speakers. It's not a common word
>>> > > > >> > > >> so it's not easy to remember.
>>> > > > >> > > >>
>>> > > > >> > > >>
>>> > > > >> > > >>  Alan Gates <al...@gmail.com>
>>> > > > >> > > >>  April 23, 2015 at 15:35
>>> > > > >> > > >>  Xuefu, thanks for getting this discussion started.  Limiting our code
>>> > > > >> > > >> paths is definitely a plus.  My inclination would be to go towards
>>> > > > >> > > >> option 2.  A few questions:
>>> > > > >> > > >>
>>> > > > >> > > >> 1) Is there any functionality in CLI that's not in beeline?
>>> > > > >> > > >> 2) If I understand correctly option 2 would have an implicit HS2 in
>>> > > > >> > > >> process when a user runs the CLI.  Would this be available in option 1
>>> > > > >> > > >> as well?
>>> > > > >> > > >> 3) Are there any performance implications, since now commands have to
>>> > > > >> > > >> hop through a thrift/jdbc loop even in the embedded mode?
>>> > > > >> > > >> 4) If we choose option 2 how backward compatible can we make it?  Will
>>> > > > >> > > >> users need to change any scripts they have that use the CLI?  Do we
>>> > > > >> > > >> have tests that will make sure of this?
>>> > > > >> > > >>
>>> > > > >> > > >> Alan.
>>> > > > >> > > >>
>>> > > > >> > > >>  Xuefu Zhang <xz...@cloudera.com>
>>> > > > >> > > >>  April 23, 2015 at 14:43
>>> > > > >> > > >> Hi all,
>>> > > > >> > > >>
>>> > > > >> > > >> I'd like to revive the discussion about the fate of Hive CLI, as this
>>> > > > >> > > >> topic has haunted us several times, including [1][2]. It looks to me
>>> > > > >> > > >> that there is a consensus that it's not wise for the Hive community to
>>> > > > >> > > >> keep both Hive CLI as it is and Beeline + HS2. However, I don't
>>> > > > >> > > >> believe that no action is the best action for us. From the discussion
>>> > > > >> > > >> so far, I see the following proposals:
>>> > > > >> > > >>
>>> > > > >> > > >> 1. Deprecate Hive CLI and advise users to use Beeline.
>>> > > > >> > > >> 2. Make Hive CLI a naming flavor of beeline with embedded mode.
>>> > > > >> > > >>
>>> > > > >> > > >> Frankly, I don't see much difference between the two approaches.
>>> > > > >> > > >> Keeping an alias at script or even code level isn't that much work.
>>> > > > >> > > >> However, shouldn't we pick a direction and start moving toward it? If
>>> > > > >> > > >> there are any gaps between beeline embedded and Hive CLI, we should
>>> > > > >> > > >> identify and fill those in.
>>> > > > >> > > >>
>>> > > > >> > > >> I'd love to hear the thoughts from the community and hope this time we
>>> > > > >> > > >> will have concrete action items to work on.
>>> > > > >> > > >>
>>> > > > >> > > >> Thanks,
>>> > > > >> > > >> Xuefu
>>> > > > >> > > >>
>>> > > > >> > > >> [1] http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
>>> > > > >> > > >> [2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html

Re: [DISCUSS] Deprecating Hive CLI

Posted by Thejas Nair <th...@gmail.com>.
That sounds fine to me.  My main concern was that we should allow
users to switch back if they encounter some corner case bugs, for at
least a release or two.
Yes, we can add that warning as well.


On Thu, Apr 30, 2015 at 6:15 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
> Okay. That's fine. I think supporting an env variable doesn't take much.
> What about enabling the new code path by default, and allowing users to
> opt out in case of a serious bug? We would also give users a warning that the
> env variable may be discontinued in the future.
>
> thanks,
> Xuefu
>
> On Thu, Apr 30, 2015 at 5:13 PM, Thejas Nair <th...@gmail.com> wrote:
>
>> In most cases with hive, when a major implementation change is made,
>> we usually let the user fall back to the older implementation. For
>> example, when CBO was added, it was initially not enabled by default,
>> and there is still the option of using the non-CBO path. When new hadoop
>> major versions are added, we still give users the option of using older
>> hadoop versions for some time. Or in the case of jdbc, we allowed users
>> to choose between HiveServer1 and 2 for some time. Even with good effort
>> put into testing, some corner cases sometimes get missed.
>>
>> Along similar lines, it would be good to let users opt in for a release,
>> and then switch the default in the next release. Given that we have been
>> making new releases of hive every few months, I don't see this as a
>> big issue. I think we should at a minimum allow users to opt out of the
>> new implementation for a release or so (if they encounter bugs).
>>
>> Most of the work is going to be in ensuring compatibility.
>> Supporting a flag to choose the implementation should be relatively
>> simple work. What do you think?
>>
>> On Thu, Apr 30, 2015 at 4:42 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
>> > Hi Thejas,
>> >
>> > Thanks for your input. I thought about this, but I don't really feel it is
>> > necessary to have a "transition" stage. After all, Hive CLI is a command
>> > line tool with well-defined command line options. That's the "interface"
>> > that we need to support. We are just changing the implementation. Through
>> > comprehensive testing, we hope to discover most of the issues.
>> >
>> > On the other hand, if we have such a transition, there might never be a
>> > user who bothers to flip the env variable, in which case the transition
>> > doesn't really build up more confidence.
>> >
>> > In addition, if we provide either a transition or a switch for every
>> > implementation change, wouldn't users be overwhelmed by those transitions
>> > or switches?
>> >
>> > Thoughts?
>> >
>> > Thanks,
>> > Xuefu
>> >

Re: [DISCUSS] Deprecating Hive CLI

Posted by Xuefu Zhang <xz...@cloudera.com>.
Okay. That's fine. I think supporting an env variable doesn't take much.
What about enabling the new code path by default, and allowing users to
opt out in case of a serious bug? We would also give users a warning that the
env variable may be discontinued in the future.
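
As an illustration only, the default-on variant with an opt-out and a
warning could look roughly like the Python sketch below. Reusing the
CLI_USE_BEELINE name for the opt-out, the warning text, and the helper
name are all assumptions made for this sketch; nothing here is an
existing Hive interface.

```python
import sys


def resolve_codepath(env, warn=sys.stderr.write):
    """Default to beeline; allow an explicit opt-out (hypothetical helper)."""
    # Assumption: an unset variable means "use the new code path".
    value = env.get("CLI_USE_BEELINE", "true")
    if value.strip().lower() == "false":
        # The user opted out: warn that the escape hatch may go away.
        warn("WARNING: CLI_USE_BEELINE=false selects the old CLI code path; "
             "this variable may be discontinued in a future release.\n")
        return "cli"
    return "beeline"
```

The warning is only emitted on the opt-out path, so users on the default
see no extra output.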

thanks,
Xuefu

On Thu, Apr 30, 2015 at 5:13 PM, Thejas Nair <th...@gmail.com> wrote:

> In most cases with hive, when a major implementation change is made,
> we usually provide the user to fallback to older implementation. For
> example, when CBO was added, it was initially not enabled by default,
> and there still option of using non-CBO path. When new hadoop major
> versions are added, we still give users option of using older hadoop
> versions for some time. Or in case of jdbc, we allowed users to choose
> between HiveServer1 and 2 for sometime. Even with putting good effort
> into testing, some corner cases sometimes get missed.
>
> On similar lines, it would be good to let opt-in for a release, and
> then switch the default in the next release. Given that we have been
> making new releases of hive every few months, I don't see this as a
> big issue. I think we should at the minimum allow users to opt out of
> new implementation for a release or so (if they encounter bugs).
>
> Most of the work is going to be in ensuring the compatibility.
> Supporting a flag to choose implementation should be relatively
> simpler work. What do you think ?
>
>
>
>
>
>
>
>
>
> On Thu, Apr 30, 2015 at 4:42 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
> > Hi Thejas,
> >
> > Thanks for your input. I thought about this, but I don't really feel it
> > necessary to have a "transition" stage. After all, Hive CLI is a command
> > line tool with well-defined command line options. That's the "interface"
> > that we need to support. We are just changing the implementation. Through
> > comprehensive testing, we hope to discover most of the issues.
> >
> > On the other hand, if we have such an transition, there might never be a
> > user bothering to flip the env variable and the transition doesn't really
> > build up more confidence.
> >
> > In addition, if we provide either a transition or switch for every
> > implementation change, wouldn't users be overwhelmed by those transitions
> > or switches.
> >
> > Thoughts?
> >
> > Thanks,
> > Xuefu
> >
> > On Thu, Apr 30, 2015 at 3:10 PM, Thejas Nair <th...@gmail.com>
> wrote:
> >
> >> Hi Xuefu,
> >> What is the plan you have in mind for a transition to using beeline
> >> from within hive?
> >> I assume there is going to be some translation from hive cli options
> >> and commands to beeline. Is that right ?
> >> Once the translation is in place, how would the switch happen ?
> >>
> >> I am thinking that once there is a hive-cli compatible beeline mode,
> >> there can be an option to switch between beeline and hive cli codebase
> >> .
> >> For example,
> >> In hive version X , when an environment variable CLI_USE_BEELINE=true
> >> environment variable is set, "hive" command uses beeline underneath
> >> (default remains cli codepath, so that users can start experimenting
> >> with "hive" commands beeline mode).
> >> In hive version Y > X, by default "hive" command starts using beeline
> >> underneath.
> >>
> >> Is it something like this what you have in mind ?
> >>
> >> Thanks,
> >> Thejas

Re: [DISCUSS] Deprecating Hive CLI

Posted by Thejas Nair <th...@gmail.com>.
In most cases with Hive, when a major implementation change is made,
we usually give users a way to fall back to the older implementation.
For example, when CBO was added, it was initially not enabled by
default, and there is still the option of using the non-CBO path. When
new Hadoop major versions are added, we still give users the option of
using older Hadoop versions for some time. Or in the case of JDBC, we
allowed users to choose between HiveServer1 and 2 for some time. Even
with good effort put into testing, some corner cases sometimes get
missed.

Along similar lines, it would be good to let users opt in for a
release, and then switch the default in the next release. Given that
we have been making new releases of Hive every few months, I don't see
this as a big issue. I think we should at a minimum allow users to opt
out of the new implementation for a release or so (if they encounter
bugs).

Most of the work is going to be in ensuring compatibility.
Supporting a flag to choose the implementation should be relatively
simple work. What do you think?
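
To make the idea concrete, here is a rough sketch of what such a flag
could look like. It is only an illustration, not how the "hive" launcher
is actually written: CLI_USE_BEELINE is the variable name floated earlier
in this thread, while choose_impl, LEGACY and BEELINE are hypothetical
names made up for this example.

```python
# Hypothetical selection logic for the "hive" command.
# CLI_USE_BEELINE is the variable name suggested in this thread; the
# function and constant names below are assumptions for illustration only.
LEGACY = "cli"
BEELINE = "beeline"


def choose_impl(env, default=LEGACY):
    """Return which code path the "hive" command should use.

    env     -- a mapping of environment variables (e.g. os.environ)
    default -- the code path used when the variable is unset or empty
    """
    raw = env.get("CLI_USE_BEELINE")
    if raw is None or raw.strip() == "":
        return default
    value = raw.strip().lower()
    if value == "true":
        return BEELINE
    if value == "false":
        return LEGACY
    # Reject anything else so a typo does not silently pick a code path.
    raise ValueError("CLI_USE_BEELINE must be 'true' or 'false', got %r" % raw)
```

In version X the launcher would call choose_impl(os.environ) so that the
old CLI stays the default and users opt in; in version Y > X it would
call choose_impl(os.environ, default=BEELINE), and setting
CLI_USE_BEELINE=false becomes the opt-out.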

On Thu, Apr 30, 2015 at 4:42 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
> Hi Thejas,
>
> Thanks for your input. I thought about this, but I don't really feel it
> necessary to have a "transition" stage. After all, Hive CLI is a command
> line tool with well-defined command line options. That's the "interface"
> that we need to support. We are just changing the implementation. Through
> comprehensive testing, we hope to discover most of the issues.
>
> On the other hand, if we have such an transition, there might never be a
> user bothering to flip the env variable and the transition doesn't really
> build up more confidence.
>
> In addition, if we provide either a transition or switch for every
> implementation change, wouldn't users be overwhelmed by those transitions
> or switches.
>
> Thoughts?
>
> Thanks,
> Xuefu
>
> On Thu, Apr 30, 2015 at 3:10 PM, Thejas Nair <th...@gmail.com> wrote:
>
>> Hi Xuefu,
>> What is the plan you have in mind for a transition to using beeline
>> from within hive?
>> I assume there is going to be some translation from hive cli options
>> and commands to beeline. Is that right ?
>> Once the translation is in place, how would the switch happen ?
>>
>> I am thinking that once there is a hive-cli compatible beeline mode,
>> there can be an option to switch between beeline and hive cli codebase
>> .
>> For example,
>> In hive version X , when an environment variable CLI_USE_BEELINE=true
>> environment variable is set, "hive" command uses beeline underneath
>> (default remains cli codepath, so that users can start experimenting
>> with "hive" commands beeline mode).
>> In hive version Y > X, by default "hive" command starts using beeline
>> underneath.
>>
>> Is it something like this what you have in mind ?
>>
>> Thanks,
>> Thejas

Re: [DISCUSS] Deprecating Hive CLI

Posted by Xuefu Zhang <xz...@cloudera.com>.
Hi Thejas,

Thanks for your input. I thought about this, but I don't really feel it is
necessary to have a "transition" stage. After all, Hive CLI is a command
line tool with well-defined command line options. That's the "interface"
that we need to support. We are just changing the implementation. Through
comprehensive testing, we hope to discover most of the issues.

On the other hand, if we have such a transition, there might never be a
user who bothers to flip the env variable, so the transition doesn't really
build up more confidence.

In addition, if we provide either a transition or a switch for every
implementation change, wouldn't users be overwhelmed by those transitions
or switches?

Thoughts?

Thanks,
Xuefu

On Thu, Apr 30, 2015 at 3:10 PM, Thejas Nair <th...@gmail.com> wrote:

> Hi Xuefu,
> What is the plan you have in mind for a transition to using beeline
> from within hive?
> I assume there is going to be some translation from hive cli options
> and commands to beeline. Is that right ?
> Once the translation is in place, how would the switch happen ?
>
> I am thinking that once there is a hive-cli compatible beeline mode,
> there can be an option to switch between beeline and hive cli codebase
> .
> For example,
> In hive version X , when an environment variable CLI_USE_BEELINE=true
> environment variable is set, "hive" command uses beeline underneath
> (default remains cli codepath, so that users can start experimenting
> with "hive" commands beeline mode).
> In hive version Y > X, by default "hive" command starts using beeline
> underneath.
>
> Is it something like this what you have in mind ?
>
> Thanks,
> Thejas
>
>
>
> On Mon, Apr 27, 2015 at 5:31 PM, Xuefu Zhang <xz...@cloudera.com> wrote:
> > FYI, I have created an uber JIRA for this:
> > https://issues.apache.org/jira/browse/HIVE-10511.
> >
> > Thanks,
> > Xuefu
> >
> > On Mon, Apr 27, 2015 at 4:54 PM, Xuefu Zhang <xz...@cloudera.com>
> wrote:
> >
> >> Yes, Olga. I  will create JIRAs to track those.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >> On Mon, Apr 27, 2015 at 4:51 PM, Olga L. Natkovich <
> >> olgan@yahoo-inc.com.invalid> wrote:
> >>
> >>> We would need to build a test suite that makes sure that new
> >>> implementation is compatible with the old one for users to adopt it. We
> >>> would also need some benchmarks to compare performance. Could you
> please
> >>> include this in the proposal as well.
> >>> Thanks,
> >>> Olga
> >>>       From: Xuefu Zhang <xz...@cloudera.com>
> >>>  To: "dev@hive.apache.org" <de...@hive.apache.org>
> >>>  Sent: Monday, April 27, 2015 4:46 PM
> >>>  Subject: Re: [DISCUSS] Deprecating Hive CLI
> >>>
> >>> Existing implementation of Hive CLI will be replaced, so that Hive
> >>> community don't need to maintain two code paths for the same thing.
> That's
> >>> basically what option #2 provides.
> >>>
> >>>
> >>>
> >>> On Mon, Apr 27, 2015 at 4:01 PM, Alexander Pivovarov <
> >>> apivovarov@gmail.com>
> >>> wrote:
> >>>
> >>> > Does it mean that existing Hive CLI will be killed?
> >>> >
> >>> > On Mon, Apr 27, 2015 at 3:46 PM, Xuefu Zhang <xz...@cloudera.com>
> >>> wrote:
> >>> >
> >>> > > To be precise, the proposal is NOT deprecating, but more of
> changing
> >>> the
> >>> > > implementation of the Hive CLI using beeline, which seems in
> >>> consensus.
> >>> > >
> >>> > > On Mon, Apr 27, 2015 at 2:47 PM, Alexander Pivovarov <
> >>> > apivovarov@gmail.com
> >>> > > >
> >>> > > wrote:
> >>> > >
> >>> > > > I just started the survey on Deprecating Hive CLI. Please share
> you
> >>> > > > opinion.
> >>> > > >
> >>> > > > Deprecating Hive CLI:
> >>> > > > https://www.surveymonkey.com/s/XFHLM57
> >>> > > >
> >>> > > > Results:
> >>> > > > https://www.surveymonkey.com/results/SM-JHYY5DR9/
> >>> > > >
> >>> > > >
> >>> > > > On Mon, Apr 27, 2015 at 2:23 PM, Alexander Pivovarov <
> >>> > > apivovarov@gmail.com
> >>> > > > >
> >>> > > > wrote:
> >>> > > >
> >>> > > > > Xuefu,
> >>> > > > >
> >>> > > > > I'm just saying that most of the shells (e.g. mysql or
> accumulo)
> >>> > > reserve
> >>> > > > > -u for user.
> >>> > > > >
> >>> > > > > I believe lots of stuff in Hive take MySQL as an example.
> >>> > > > >
> >>> > > > > Alex
> >>> > > > >
> >>> > > > >
> >>> > > > > On Mon, Apr 27, 2015 at 2:14 PM, Xuefu Zhang <xzhang@cloudera.com> wrote:
> >>> > > > >
> >>> > > > >> Alex,
> >>> > > > >>
> >>> > > > >> Just to be sure, we are talking about replacing Hive CLI, not the mysql
> >>> > > > >> and accumulo command line shells. Thus, I'm not sure this is relevant.
> >>> > > > >> Regardless, I think we'd better have some writeup in the proposed uber
> >>> > > > >> JIRA so that everyone knows what we are signing up for.
> >>> > > > >>
> >>> > > > >> Thanks,
> >>> > > > >> Xuefu
> >>> > > > >>
> >>> > > > >> On Mon, Apr 27, 2015 at 12:57 PM, Alexander Pivovarov <apivovarov@gmail.com> wrote:
> >>> > > > >>
> >>> > > > >> > Mysql and accumulo command line shells use -u to pass <user>
> >>> > > > >> >
> >>> > > > >> > Can beeline use -u as well? Currently -u is reserved for URL?
> >>> > > > >> > On Apr 27, 2015 12:42 PM, "Xuefu Zhang" <xzhang@cloudera.com> wrote:
> >>> > > > >> >
> >>> > > > >> > > Thanks to all for the input. I assume that we have a consensus that
> >>> > > > >> > > we'd like to keep Hive as an alias to beeline with embedded HS2 and
> >>> > > > >> > > make user transition as smooth as possible by identifying gaps and
> >>> > > > >> > > fixing issues. I'm going to create an umbrella JIRA and subtasks to
> >>> > > > >> > > track the progress. Please let me know if you have further questions.
> >>> > > > >> > >
> >>> > > > >> > > Thanks,
> >>> > > > >> > > Xuefu
> >>> > > > >> > >
> >>> > > > >> > > On Sat, Apr 25, 2015 at 12:59 AM, Lars Francke <lars.francke@gmail.com> wrote:
> >>> > > > >> > >
> >>> > > > >> > > > Yes, well put. It is about usability and "least surprise".
> >>> > > > >> > > >
> >>> > > > >> > > > So if people wouldn't have to deal with JDBC syntax by default and
> >>> > > > >> > > > could use "hive" instead of "beeline" to start that'd be good.
> >>> > > > >> > > >
> >>> > > > >> > > >
> >>> > > > >> > > > On Sat, Apr 25, 2015 at 12:38 AM, Alan Gates <alanfgates@gmail.com> wrote:
> >>> > > > >> > > >
> >>> > > > >> > > >> If I understand correctly this is an argument about usability, not
> >>> > > > >> > > >> functionality.  So if Hive still had the CLI but it happened to use
> >>> > > > >> > > >> either HS2 or embedded HS2 (depending on configuration) underneath
> >>> > > > >> > > >> your concerns would be addressed.  Is that correct?
> >>> > > > >> > > >>
> >>> > > > >> > > >> Alan.
> >>> > > > >> > > >>
> >>> > > > >> > > >>  Lars Francke <la...@gmail.com>
> >>> > > > >> > > >>  April 23, 2015 at 15:53
> >>> > > > >> > > >> I've been at about 20 different customers in the years since Beeline
> >>> > > > >> > > >> was added. I can only think of a single one that has used beeline.
> >>> > > > >> > > >> The instinct is to use "hive", partially because it is easy to
> >>> > > > >> > > >> remember and intuitive and because it is easier to use. I end up
> >>> > > > >> > > >> googling the stupid JDBC syntax every single time.
> >>> > > > >> > > >>
> >>> > > > >> > > >> I know this might be a bit "out there" but I propose something else:
> >>> > > > >> > > >> 1) Rename (or link) "beeline" to "hive"
> >>> > > > >> > > >> 2) Add a "--hiveserver2" (or "--jdbc" or "--beeline") option to the
> >>> > > > >> > > >> "hive" command to get the current "beeline"; this'd keep the CLI as
> >>> > > > >> > > >> default. We could also add a "--legacy" or "--cli" option and make
> >>> > > > >> > > >> "hiveserver2/beeline" the default.
> >>> > > > >> > > >> 3) Add a "--embedded-hs2" option to the "hive" command to get an
> >>> > > > >> > > >> embedded HS2 in Beeline
> >>> > > > >> > > >> 4) Add some documentation to beeline reminding people on startup of
> >>> > > > >> > > >> beeline how to connect and how to use embedded mode
> >>> > > > >> > > >> The fact is that the old shell just works for lots of people and
> >>> > > > >> > > >> there's just no need for beeline for these people. Also the name is
> >>> > > > >> > > >> confusing - especially for non-native speakers. It's not a common
> >>> > > > >> > > >> word so it's not easy to remember.
> >>> > > > >> > > >>
> >>> > > > >> > > >>
> >>> > > > >> > > >>  Alan Gates <al...@gmail.com>
> >>> > > > >> > > >>  April 23, 2015 at 15:35
> >>> > > > >> > > >> Xuefu, thanks for getting this discussion started.  Limiting our code
> >>> > > > >> > > >> paths is definitely a plus.  My inclination would be to go towards
> >>> > > > >> > > >> option 2.  A few questions:
> >>> > > > >> > > >>
> >>> > > > >> > > >> 1) Is there any functionality in CLI that's not in beeline?
> >>> > > > >> > > >> 2) If I understand correctly option 2 would have an implicit HS2 in
> >>> > > > >> > > >> process when a user runs the CLI.  Would this be available in option 1
> >>> > > > >> > > >> as well?
> >>> > > > >> > > >> 3) Are there any performance implications, since now commands have to
> >>> > > > >> > > >> hop through a thrift/jdbc loop even in the embedded mode?
> >>> > > > >> > > >> 4) If we choose option 2 how backward compatible can we make it?  Will
> >>> > > > >> > > >> users need to change any scripts they have that use the CLI?  Do we
> >>> > > > >> > > >> have tests that will make sure of this?
> >>> > > > >> > > >>
> >>> > > > >> > > >> Alan.
> >>> > > > >> > > >>
> >>> > > > >> > > >>  Xuefu Zhang <xz...@cloudera.com>
> >>> > > > >> > > >>  April 23, 2015 at 14:43
> >>> > > > >> > > >> Hi all,
> >>> > > > >> > > >>
> >>> > > > >> > > >> I'd like to revive the discussion about the fate of Hive CLI, as this
> >>> > > > >> > > >> topic has haunted us several times, including [1][2]. It looks to me
> >>> > > > >> > > >> that there is a consensus that it's not wise for the Hive community to
> >>> > > > >> > > >> keep both Hive CLI as it is and Beeline + HS2. However, I don't
> >>> > > > >> > > >> believe that no action is the best action for us. From the discussion
> >>> > > > >> > > >> so far, I see the following proposals:
> >>> > > > >> > > >>
> >>> > > > >> > > >> 1. Deprecate Hive CLI and advise users to use Beeline.
> >>> > > > >> > > >> 2. Make Hive CLI a naming flavor of beeline with embedded mode.
> >>> > > > >> > > >>
> >>> > > > >> > > >> Frankly, I don't see much difference between the two approaches.
> >>> > > > >> > > >> Keeping an alias at script or even code level isn't that much work.
> >>> > > > >> > > >> However, shouldn't we pick a direction and start moving to it? If
> >>> > > > >> > > >> there are any gaps between embedded beeline and Hive CLI, we should
> >>> > > > >> > > >> identify and fill those in.
> >>> > > > >> > > >>
> >>> > > > >> > > >> I'd love to hear the thoughts from the community and hope this time
> >>> > > > >> > > >> we will have concrete action items to work on.
> >>> > > > >> > > >>
> >>> > > > >> > > >> Thanks,
> >>> > > > >> > > >> Xuefu
> >>> > > > >> > > >>
> >>> > > > >> > > >> [1] http://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3C5485E1BE.3060709%40hortonworks.com%3E
> >>> > > > >> > > >> [2] https://www.mail-archive.com/dev@hive.apache.org/msg112378.html