Posted to general@hadoop.apache.org by Arun C Murthy <ac...@hortonworks.com> on 2012/04/20 08:46:30 UTC

Locking protocols for hadoop-2.x.x

Moving to a separate thread...

On Apr 20, 2012, at 1:24 AM, Todd Lipcon wrote:

> On Thu, Apr 19, 2012 at 12:26 PM, Eli Collins <el...@cloudera.com> wrote:
> 
>> On Thu, Apr 19, 2012 at 11:45 AM, Arun C Murthy <ac...@hortonworks.com>
>> wrote:
>>> However, we should consider whether HDFS protocols are 'ready' for us to
>>> commit to them for the foreseeable future; my sense is that it's a tad
>>> early, particularly with auto-failover not complete.
>> 
>> Agree that we're a little too early on the HDFS protocol side; I think
>> MR2 is probably in a similar boat wrt stability as well.
>> 

Agreed - I didn't mean to point fingers at HDFS; it just happened to be the most recent set of changes.

> Regarding protocols:
> +1 to _not_ locking down "cluster-internal" wire compatibility at this
> point, i.e. we can still break DN<->NN, NN<->SBN, or Admin command -> NN
> compatibility.
> +1 to locking down client wire compatibility with the release of 2.0. After
> 2.0 is released I would like to see all 2.0.x clients continue to be
> compatible. Now that we are protobuf-ified, I think this is doable.
> Should we open a separate discussion thread for the above?

Good points on separating client & internal protocols.

My sense is that locking client protocols is a great start, but not sufficient.
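
To make concrete why the protobuf move makes locking client wire compatibility tractable: protobuf tags every field on the wire and readers skip tags they don't know, so a newer server can add optional fields without breaking an old client. Here's a toy sketch of that skip-unknown-fields property in plain Java - the message, field numbers, and class names are made up for illustration, and this mimics the behaviour rather than PB's actual varint encoding or the real Hadoop IPC code:

  // Toy illustration only -- not the real Hadoop IPC or protobuf code.
  // It mimics the property PB gives us: an old 2.0.x client can skip
  // fields a newer server adds.
  import java.io.ByteArrayInputStream;
  import java.io.ByteArrayOutputStream;
  import java.io.DataInputStream;
  import java.io.DataOutputStream;
  import java.io.IOException;

  public class UnknownFieldDemo {

    // "New" server response: field 2 is a hypothetical addition made
    // after the 2.0 client protocol was frozen.
    static byte[] encodeNewResponse() throws IOException {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      writeField(out, 1, "hdfs://nn1:8020/foo");  // known to old clients
      writeField(out, 2, "some-new-attribute");   // unknown to old clients
      return bytes.toByteArray();
    }

    static void writeField(DataOutputStream out, int tag, String value)
        throws IOException {
      byte[] v = value.getBytes("UTF-8");
      out.writeInt(tag);      // field number
      out.writeInt(v.length); // length prefix lets readers skip unknowns
      out.write(v);
    }

    // "Old" client: understands only field 1 and silently skips the rest,
    // so adding optional fields does not break it.
    static String decodeOldClient(byte[] payload) throws IOException {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
      String path = null;
      while (in.available() > 0) {
        int tag = in.readInt();
        byte[] v = new byte[in.readInt()];
        in.readFully(v);
        if (tag == 1) {
          path = new String(v, "UTF-8");
        }
        // any other tag: ignored
      }
      return path;
    }

    public static void main(String[] args) throws IOException {
      System.out.println(decodeOldClient(encodeNewResponse()));  // hdfs://nn1:8020/foo
    }
  }

The real client protocols obviously go through the protobuf-generated stubs; the point is just that adding optional fields is wire-compatible, whereas changing the meaning of an existing field number is not.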

Ideally, we should be considering things like rolling upgrades, which necessitate compatibility across the board. I'm fully aware it might be too early for us to lock them down...

Maybe we can do some hadoop-2.x-(alpha,beta) releases for a few months and then just bite the bullet as HA & YARN protocols stabilize?

Thoughts?

Arun

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Locking protocols for hadoop-2.x.x

Posted by Steve Loughran <st...@gmail.com>.
On 20 April 2012 15:44, Robert Evans <ev...@yahoo-inc.com> wrote:
> The trick with YARN is that there are a lot more client APIs than for HDFS.  In addition, they have not had as much time as the HDFS APIs to mature.  If we want to lock down the client APIs I am fine with that, because I don't really see any huge problems with the existing APIs, but I think waiting to lock them down is a good thing, at least until we can get Hamster and other non-MapReduce applications up and running and fold any feedback they have about the APIs back in.
>
> --Bobby Evans

+1. The alphas need to be out, and other things need to be working with
YARN, before anything is frozen, as that frozen state will have to be
carried forward into the future. At least it might have to be carried
forward, unless there is a big disclaimer: "YARN developers - things may
change, get on the mailing list".

Re: Locking protocols for hadoop-2.x.x

Posted by Robert Evans <ev...@yahoo-inc.com>.
The trick with YARN is that there are a lot more client APIs than for HDFS.  In addition, they have not had as much time as the HDFS APIs to mature.  If we want to lock down the client APIs I am fine with that, because I don't really see any huge problems with the existing APIs, but I think waiting to lock them down is a good thing, at least until we can get Hamster and other non-MapReduce applications up and running and fold any feedback they have about the APIs back in.

--Bobby Evans


Re: Locking protocols for hadoop-2.x.x

Posted by Eli Collins <el...@cloudera.com>.
On Thu, Apr 19, 2012 at 11:46 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> Moving to a separate thread...
>
> On Apr 20, 2012, at 1:24 AM, Todd Lipcon wrote:
>
>> On Thu, Apr 19, 2012 at 12:26 PM, Eli Collins <el...@cloudera.com> wrote:
>>
>>> On Thu, Apr 19, 2012 at 11:45 AM, Arun C Murthy <ac...@hortonworks.com>
>>> wrote:
>>>> However, we should consider whether HDFS protocols are 'ready' for us to
>>>> commit to them for the foreseeable future; my sense is that it's a tad
>>>> early, particularly with auto-failover not complete.
>>>
>>> Agree that we're a little too early on the HDFS protocol side; I think
>>> MR2 is probably in a similar boat wrt stability as well.
>>>
>
> Agreed - I didn't mean to point fingers at HDFS; it just happened to be the most recent set of changes.
>
>> Regarding protocols:
>> +1 to _not_ locking down "cluster-internal" wire compatibility at this
>> point, i.e. we can still break DN<->NN, NN<->SBN, or Admin command -> NN
>> compatibility.
>> +1 to locking down client wire compatibility with the release of 2.0. After
>> 2.0 is released I would like to see all 2.0.x clients continue to be
>> compatible. Now that we are protobuf-ified, I think this is doable.
>> Should we open a separate discussion thread for the above?
>
> Good points on separating client & internal protocols.
>
> My sense is that locking client protocols is a great start, but not sufficient.
>
> Ideally, we should be considering things like rolling upgrades, which necessitate compatibility across the board. I'm fully aware it might be too early for us to lock them down...
>

Yup, we've put the mechanism for rolling upgrades into HDFS
(HDFS-2983) and filed MR-4150 for the same in MR2, but they'll only
be useful once we lock down the protocols (and use PB to paper over
the differences).  Agree with Todd that it's too early for those right
now, and those are much less painful breakages than client <-> server.
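
To illustrate what that buys us during a rolling upgrade (again a toy sketch with made-up message and field names - this isn't the HDFS-2983 mechanism itself): the upgraded side also has to accept messages from daemons still running the old bits, which with PB amounts to treating fields the old sender never wrote as absent and falling back to defaults.

  // Toy sketch: an upgraded "NameNode-side" handler accepting a heartbeat
  // from a not-yet-upgraded "DataNode" during a rolling upgrade.  Field
  // numbers and names are invented for illustration only.
  import java.util.HashMap;
  import java.util.Map;

  public class MissingFieldDemo {

    // Decoded fields keyed by field number.  An old DataNode only ever
    // sets field 1; field 2 was added in the newer software.
    static Map<Integer, String> oldHeartbeat() {
      Map<Integer, String> fields = new HashMap<Integer, String>();
      fields.put(1, "dn-42");  // field 1: datanode id (pre-upgrade)
      return fields;           // field 2 (say, cache capacity) never sent
    }

    // Upgraded handler: a missing optional field is a default, not an error.
    static void handleHeartbeat(Map<Integer, String> fields) {
      String datanodeId = fields.get(1);
      String cacheCapacity = fields.containsKey(2) ? fields.get(2) : "0";
      System.out.println("heartbeat from " + datanodeId
          + ", cacheCapacity=" + cacheCapacity);
    }

    public static void main(String[] args) {
      handleHeartbeat(oldHeartbeat());  // old-format message still accepted
    }
  }

Protobuf's optional fields behave this way out of the box, which is why locking down the field numbers is the part that really matters once we commit.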

> Maybe we can do some hadoop-2.x-(alpha,beta) releases for a few months and then just bite the bullet as HA & YARN protocols stabilize?
>

Sounds good. We should probably use e.g. "alpha1", "alpha2", etc. in case
we need to do more than a single alpha or beta release.

Thanks,
Eli

> Thoughts?
>
> Arun
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>