Posted to mapreduce-dev@hadoop.apache.org by Arun C Murthy <ac...@hortonworks.com> on 2013/04/26 03:34:17 UTC

Heads up - 2.0.5-beta

Gang,

 With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I believe we are nearly there, exciting times!

 As we have discussed previously, I hope to do a final push to stabilize hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so, and then declare hadoop-2.1 as stable this summer after a short period of intensive testing.

 With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta.

 Vinod is helping out on the YARN/MR side and has tagged a number of final changes (including some of the final API incompatibilities) we'd like to push in before we call hadoop-2.x ready to be supported (Target Version set to 2.0.5-beta):
 http://s.apache.org/target-hadoop-2.0.5-beta
 Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be tagged, but their necessity is implied).

 Similarly on HDFS side, can someone please help out by tagging features, bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs & protocols are locked down too - I'd really appreciate it!

thanks,
Arun


--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Heads up - 2.0.5-beta

Posted by Eli Collins <el...@cloudera.com>.
On Fri, Apr 26, 2013 at 2:42 PM, Suresh Srinivas <su...@hortonworks.com> wrote:
> Eli, I will post a more detailed reply soon. But one small correction:
>
>
>> I'm also not sure there's currently consensus on what an incompatible
>> change is. For example, I think HADOOP-9151 is incompatible because it
>> broke client/server wire compatibility with previous releases and any
>> change that breaks wire compatibility is incompatible.  Suresh felt it
>> was not an incompatible change because it did not affect API
>> compatibility (ie PB is not considered part of the API) and the change
>> occurred while v2 is in alpha.
>>
>
> This is not correct. I did not say it was not an incompatible change.
> It was indeed an incompatible wire protocol change. My argument was that,
> given the phase of development we were in, we could not mark the wire
> protocol as stable and rule out incompatible changes. But once 2.0.5-beta
> is out, as we had discussed earlier, we should not make further incompatible
> changes to the wire protocol.

Sorry for the confusion, I misinterpreted your comments on the jira
(specifically, "This is an incompatible change: I disagree." and "see
my argument that about why this is not incompatible.")  to indicate
that you thought it was not incompatible.



>
> --
> http://hortonworks.com/download/

Re: Heads up - 2.0.5-beta

Posted by Suresh Srinivas <su...@hortonworks.com>.
Eli, I will post a more detailed reply soon. But one small correction:


> I'm also not sure there's currently consensus on what an incompatible
> change is. For example, I think HADOOP-9151 is incompatible because it
> broke client/server wire compatibility with previous releases and any
> change that breaks wire compatibility is incompatible.  Suresh felt it
> was not an incompatible change because it did not affect API
> compatibility (ie PB is not considered part of the API) and the change
> occurred while v2 is in alpha.
>

This is not correct. I did not say it was not an incompatible change.
It was indeed an incompatible wire protocol change. My argument was that,
given the phase of development we were in, we could not mark the wire
protocol as stable and rule out incompatible changes. But once 2.0.5-beta
is out, as we had discussed earlier, we should not make further incompatible
changes to the wire protocol.

-- 
http://hortonworks.com/download/

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
Agreed Luke. Thanks for pointing it out, I'll track it as such.

Arun

On Apr 26, 2013, at 1:37 PM, Luke Lu wrote:

> If protocol compatibility of v2 and v3 is a goal, HADOOP-8990 should be a
> blocker for v2.
> 
> __Luke
> 
> On Fri, Apr 26, 2013 at 12:07 PM, Eli Collins <el...@cloudera.com> wrote:
> 
>> On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>>
>>> On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
>>>
>>>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>>>
>>>>> With that in mind, I really want to make a serious push to lock down
>>>>> APIs and wire-protocols for hadoop-2.0.5-beta.
>>>>> Thus, we can confidently support hadoop-2.x in a compatible manner in
>>>>> the future. So, it's fine to add new features,
>>>>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>>>>
>>>> Arun, since it sounds like you have a pretty definite idea
>>>> in mind for what you want 'beta' label to actually mean,
>>>> could you, please, share the exact criteria?
>>>
>>> Sorry, I'm not sure if this is exactly what you are looking for but, as
>>> I mentioned above, the primary aim would be to make the final set of required
>>> API/wire-protocol changes so that we can call it a 'beta' i.e. once
>>> 2.0.5-beta ships users & downstream projects can be confident about forward
>>> compatibility in hadoop-2.x line. Obviously, we might discover a blocker
>>> bug post 2.0.5 which *might* necessitate an unfortunate change - but that
>>> should be an outstanding exception.
>> 
>> Arun, Suresh,
>> 
>> Mind reviewing the following page Karthik put together on
>> compatibility?   http://wiki.apache.org/hadoop/Compatibility
>> 
>> I think we should do something similar to what Sanjay proposed in
>> HADOOP-5071 for Hadoop v2.   If we get on the same page on
>> compatibility terms/APIs then we can quickly draft the policy, at
>> least for the things we've already got consensus on.  I think our new
>> developers, users, downstream projects, and partners would really
>> appreciate us making this clear.  If people like the content we can
>> move it to the Hadoop website and maintain it in svn like the bylaws.
>> 
>> The reason I think we need to do so is because there's been confusion
>> about what types of compatibility we promise and some open questions
>> which I'm not sure everyone is clear on. Examples:
>> - Are we going to preserve Hadoop v3 clients against v2 servers now
>> that we have protobuf support?  (I think so..)
>> - Can we break rolling upgrade of daemons in updates post GA? (I don't
>> think so..)
>> - Do we disallow HDFS metadata changes that require an HDFS upgrade in
>> an update? (I think so..)
>> - Can we remove methods from v2 and v2 updates that were deprecated in
>> v0.20-22?  (Unclear)
>> - Will we preserve binary compatibility for MR2 going forward? (I think
>> so..)
>> - Does the ability to support multiple versions of MR simultaneously
>> via MR2 change the MR API compatibility story? (I don't think so..)
>> - Are the RM protocols sufficiently stable to disallow incompatible
>> changes potentially required by non-MR projects? (Unclear, most large
>> Yarn deployments I'm aware of are running 0.23, not v2 alphas)
>> 
>> I'm also not sure there's currently consensus on what an incompatible
>> change is. For example, I think HADOOP-9151 is incompatible because it
>> broke client/server wire compatibility with previous releases and any
>> change that breaks wire compatibility is incompatible.  Suresh felt it
>> was not an incompatible change because it did not affect API
>> compatibility (ie PB is not considered part of the API) and the change
>> occurred while v2 is in alpha.  Not sure we need to go through the
>> whole exercise of what's allowed in an alpha and beta (water under the
>> bridge, hopefully), but I do think we should clearly define an
>> incompatible change.  It's fine that v2 has been a bit wild wild west
>> in the alpha development stage but I think we need to get a little
>> more rigorous.
>> 
>> Thanks,
>> Eli
>> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/



Re: Heads up - 2.0.5-beta

Posted by Luke Lu <ll...@apache.org>.
If protocol compatibility of v2 and v3 is a goal, HADOOP-8990 should be a
blocker for v2.

__Luke

On Fri, Apr 26, 2013 at 12:07 PM, Eli Collins <el...@cloudera.com> wrote:

> On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >
> > On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
> >
> >> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >>
> >>> With that in mind, I really want to make a serious push to lock down
> >>> APIs and wire-protocols for hadoop-2.0.5-beta.
> >>> Thus, we can confidently support hadoop-2.x in a compatible manner in
> >>> the future. So, it's fine to add new features,
> >>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
> >>
> >> Arun, since it sounds like you have a pretty definite idea
> >> in mind for what you want 'beta' label to actually mean,
> >> could you, please, share the exact criteria?
> >
> > Sorry, I'm not sure if this is exactly what you are looking for but, as
> > I mentioned above, the primary aim would be to make the final set of required
> > API/wire-protocol changes so that we can call it a 'beta' i.e. once
> > 2.0.5-beta ships users & downstream projects can be confident about forward
> > compatibility in hadoop-2.x line. Obviously, we might discover a blocker
> > bug post 2.0.5 which *might* necessitate an unfortunate change - but that
> > should be an outstanding exception.
>
> Arun, Suresh,
>
> Mind reviewing the following page Karthik put together on
> compatibility?   http://wiki.apache.org/hadoop/Compatibility
>
> I think we should do something similar to what Sanjay proposed in
> HADOOP-5071 for Hadoop v2.   If we get on the same page on
> compatibility terms/APIs then we can quickly draft the policy, at
> least for the things we've already got consensus on.  I think our new
> developers, users, downstream projects, and partners would really
> appreciate us making this clear.  If people like the content we can
> move it to the Hadoop website and maintain it in svn like the bylaws.
>
> The reason I think we need to do so is because there's been confusion
> about what types of compatibility we promise and some open questions
> which I'm not sure everyone is clear on. Examples:
> - Are we going to preserve Hadoop v3 clients against v2 servers now
> that we have protobuf support?  (I think so..)
> - Can we break rolling upgrade of daemons in updates post GA? (I don't
> think so..)
> - Do we disallow HDFS metadata changes that require an HDFS upgrade in
> an update? (I think so..)
> - Can we remove methods from v2 and v2 updates that were deprecated in
> v0.20-22?  (Unclear)
> - Will we preserve binary compatibility for MR2 going forward? (I think
> so..)
> - Does the ability to support multiple versions of MR simultaneously
> via MR2 change the MR API compatibility story? (I don't think so..)
> - Are the RM protocols sufficiently stable to disallow incompatible
> changes potentially required by non-MR projects? (Unclear, most large
> Yarn deployments I'm aware of are running 0.23, not v2 alphas)
>
> I'm also not sure there's currently consensus on what an incompatible
> change is. For example, I think HADOOP-9151 is incompatible because it
> broke client/server wire compatibility with previous releases and any
> change that breaks wire compatibility is incompatible.  Suresh felt it
> was not an incompatible change because it did not affect API
> compatibility (ie PB is not considered part of the API) and the change
> occurred while v2 is in alpha.  Not sure we need to go through the
> whole exercise of what's allowed in an alpha and beta (water under the
> bridge, hopefully), but I do think we should clearly define an
> incompatible change.  It's fine that v2 has been a bit wild wild west
> in the alpha development stage but I think we need to get a little
> more rigorous.
>
> Thanks,
> Eli
>

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Apr 26, 2013, at 12:07 PM, Eli Collins wrote:

> Arun, Suresh,
> 
> Mind reviewing the following page Karthik put together on
> compatibility?   http://wiki.apache.org/hadoop/Compatibility

Sure. Will do.

I just opened https://issues.apache.org/jira/browse/HADOOP-9517 to ensure we capture it for posterity.

Karthik - Would you like to take a crack at it? The wiki would be a good starting point.

thanks,
Arun

Re: Heads up - 2.0.5-beta

Posted by Luke Lu <ll...@apache.org>.
If protocol compatibility of v2 and v3 is a goal, HADOOP-8990 should be a
blocker for v2.

__Luke

On Fri, Apr 26, 2013 at 12:07 PM, Eli Collins <el...@cloudera.com> wrote:

> On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >
> > On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
> >
> >> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >>
> >>> With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
> >>> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
> >>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
> >>
> >> Arun, since it sounds like you have a pretty definite idea
> >> in mind for what you want 'beta' label to actually mean,
> >> could you, please, share the exact criteria?
> >
> > Sorry, I'm not sure if this is exactly what you are looking for but, as
> > I mentioned above, the primary aim would be to make the final set of required
> > API/wire-protocol changes so that we can call it a 'beta', i.e. once
> > 2.0.5-beta ships, users & downstream projects can be confident about forward
> > compatibility in the hadoop-2.x line. Obviously, we might discover a blocker
> > bug post 2.0.5 which *might* necessitate an unfortunate change - but that
> > should be an outstanding exception.
>
> Arun, Suresh,
>
> Mind reviewing the following page Karthik put together on
> compatibility?   http://wiki.apache.org/hadoop/Compatibility
>
> I think we should do something similar to what Sanjay proposed in
> HADOOP-5071 for Hadoop v2.   If we get on the same page on
> compatibility terms/APIs then we can quickly draft the policy, at
> least for the things we've already got consensus on.  I think our new
> developers, users, downstream projects, and partners would really
> appreciate us making this clear.  If people like the content we can
> move it to the Hadoop website and maintain it in svn like the bylaws.
>
> The reason I think we need to do so is because there's been confusion
> about what types of compatibility we promise and some open questions
> which I'm not sure everyone is clear on. Examples:
> - Are we going to preserve Hadoop v3 clients against v2 servers now
> that we have protobuf support?  (I think so..)
> - Can we break rolling upgrade of daemons in updates post GA? (I don't
> think so..)
> - Do we disallow HDFS metadata changes that require an HDFS upgrade in
> an update? (I think so..)
> - Can we remove methods from v2 and v2 updates that were deprecated in
> v0.20-22?  (Unclear)
> - Will we preserve binary compatibility for MR2 going forward? (I think
> so..)
> - Does the ability to support multiple versions of MR simultaneously
> via MR2 change the MR API compatibility story? (I don't think so..)
> - Are the RM protocols sufficiently stable to disallow incompatible
> changes potentially required by non-MR projects? (Unclear, most large
> Yarn deployments I'm aware of are running 0.23, not v2 alphas)
>
> I'm also not sure there's currently consensus on what an incompatible
> change is. For example, I think HADOOP-9151 is incompatible because it
> broke client/server wire compatibility with previous releases and any
> change that breaks wire compatibility is incompatible.  Suresh felt it
> was not an incompatible change because it did not affect API
> compatibility (ie PB is not considered part of the API) and the change
> occurred while v2 is in alpha.  Not sure we need to go through the
> whole exercise of what's allowed in an alpha and beta (water under the
> bridge, hopefully), but I do think we should clearly define an
> incompatible change.  It's fine that v2 has been a bit wild wild west
> in the alpha development stage but I think we need to get a little
> more rigorous.
>
> Thanks,
> Eli
>

Re: Heads up - 2.0.5-beta

Posted by Suresh Srinivas <su...@hortonworks.com>.
Eli, I will post a more detailed reply soon. But one small correction:


> I'm also not sure there's currently consensus on what an incompatible
> change is. For example, I think HADOOP-9151 is incompatible because it
> broke client/server wire compatibility with previous releases and any
> change that breaks wire compatibility is incompatible.  Suresh felt it
> was not an incompatible change because it did not affect API
> compatibility (ie PB is not considered part of the API) and the change
> occurred while v2 is in alpha.
>

This is not correct. I did not say it was not an incompatible change.
It was indeed an incompatible wire protocol change. My argument was that,
in the phase of development we were in, we could not mark the wire protocol
as stable and rule out incompatible changes. But once 2.0.5-beta
is out, as we had discussed earlier, we should not make further incompatible
changes to the wire protocol.

-- 
http://hortonworks.com/download/
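
The difference between a compatible and an incompatible wire change discussed above can be illustrated with a toy example. This is only a sketch under stated assumptions: the tag/length/value layout, the field names, and the helpers `encode` and `decode` are hypothetical, and this is not the Hadoop RPC format or the change made in HADOOP-9151. It assumes a protobuf-like rule that readers skip tags they do not know.

```python
import struct


def encode(fields):
    """Encode {tag: bytes} as repeated (tag, length, value) records."""
    out = b""
    for tag, value in sorted(fields.items()):
        # 1-byte tag, 2-byte big-endian length, then the raw value.
        out += struct.pack(">BH", tag, len(value)) + value
    return out


def decode(data, known_tags):
    """Decode records, skipping any tag this reader does not know."""
    result, pos = {}, 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BH", data, pos)
        pos += 3
        value = data[pos:pos + length]
        pos += length
        if tag in known_tags:
            result[known_tags[tag]] = value
    return result


# Hypothetical "old release" reader: it only knows tags 1 and 2.
OLD_READER = {1: "path", 2: "owner"}

# Compatible change: a newer writer adds tag 3; the old reader ignores it.
added = decode(encode({1: b"/tmp/a", 2: b"eli", 3: b"extra"}), OLD_READER)

# Incompatible change: a newer writer moves "owner" from tag 2 to tag 4,
# so the old reader silently loses the field.
renumbered = decode(encode({1: b"/tmp/a", 4: b"eli"}), OLD_READER)
```

In this sketch, adding a field leaves the old reader's view intact, while renumbering an existing field changes what an old client sees even though no Java API changed.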

Re: Heads up - 2.0.5-beta

Posted by Eli Collins <el...@cloudera.com>.
On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
>
>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>
>>> With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
>>> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
>>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>>
>> Arun, since it sounds like you have a pretty definite idea
>> in mind for what you want 'beta' label to actually mean,
>> could you, please, share the exact criteria?
>
> Sorry, I'm not sure if this is exactly what you are looking for but, as I mentioned above, the primary aim would be to make the final set of required API/wire-protocol changes so that we can call it a 'beta', i.e. once 2.0.5-beta ships, users & downstream projects can be confident about forward compatibility in the hadoop-2.x line. Obviously, we might discover a blocker bug post 2.0.5 which *might* necessitate an unfortunate change - but that should be an outstanding exception.

Arun, Suresh,

Mind reviewing the following page Karthik put together on
compatibility?   http://wiki.apache.org/hadoop/Compatibility

I think we should do something similar to what Sanjay proposed in
HADOOP-5071 for Hadoop v2.   If we get on the same page on
compatibility terms/APIs then we can quickly draft the policy, at
least for the things we've already got consensus on.  I think our new
developers, users, downstream projects, and partners would really
appreciate us making this clear.  If people like the content we can
move it to the Hadoop website and maintain it in svn like the bylaws.

The reason I think we need to do so is because there's been confusion
about what types of compatibility we promise and some open questions
which I'm not sure everyone is clear on. Examples:
- Are we going to preserve Hadoop v3 clients against v2 servers now
that we have protobuf support?  (I think so..)
- Can we break rolling upgrade of daemons in updates post GA? (I don't
think so..)
- Do we disallow HDFS metadata changes that require an HDFS upgrade in
an update? (I think so..)
- Can we remove methods from v2 and v2 updates that were deprecated in
v0.20-22?  (Unclear)
- Will we preserve binary compatibility for MR2 going forward? (I think so..)
- Does the ability to support multiple versions of MR simultaneously
via MR2 change the MR API compatibility story? (I don't think so..)
- Are the RM protocols sufficiently stable to disallow incompatible
changes potentially required by non-MR projects? (Unclear, most large
Yarn deployments I'm aware of are running 0.23, not v2 alphas)

I'm also not sure there's currently consensus on what an incompatible
change is. For example, I think HADOOP-9151 is incompatible because it
broke client/server wire compatibility with previous releases and any
change that breaks wire compatibility is incompatible.  Suresh felt it
was not an incompatible change because it did not affect API
compatibility (i.e. PB is not considered part of the API) and the change
occurred while v2 is in alpha.  Not sure we need to go through the
whole exercise of what's allowed in an alpha and beta (water under the
bridge, hopefully), but I do think we should clearly define an
incompatible change.  It's fine that v2 has been a bit wild wild west
in the alpha development stage but I think we need to get a little
more rigorous.

Thanks,
Eli
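
One way to make the API questions above concrete is to compare the set of public method signatures between two releases and flag removals. The following is a minimal sketch, not an existing Hadoop tool: `api_diff`, `is_api_compatible`, and every signature string are hypothetical names chosen for illustration, and a written policy would decide what goes into `allowed_removals`.

```python
def api_diff(old_api, new_api):
    """Return (removed, added) signatures between two sets of signatures."""
    return sorted(old_api - new_api), sorted(new_api - old_api)


def is_api_compatible(old_api, new_api, allowed_removals=frozenset()):
    """True when every removed signature was explicitly allowed by policy."""
    removed, _added = api_diff(old_api, new_api)
    return all(sig in allowed_removals for sig in removed)


# Hypothetical signatures, not taken from any Hadoop release.
v2_0 = {"Example.read(String)", "Example.write(String)", "Example.oldRead(String)"}
v2_1 = {"Example.read(String)", "Example.write(String)", "Example.stat(String)"}

removed, added = api_diff(v2_0, v2_1)

# Without a policy entry, dropping the old method counts as incompatible.
strict = is_api_compatible(v2_0, v2_1)

# If policy allows removing that long-deprecated method, the check passes.
relaxed = is_api_compatible(
    v2_0, v2_1, allowed_removals={"Example.oldRead(String)"}
)
```

Adding methods never fails the check in this sketch; only removals that the policy has not listed do, which is the kind of rule the "deprecated in v0.20-22" question would have to settle.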

Re: Heads up - 2.0.5-beta

Posted by Roman Shaposhnik <rv...@apache.org>.
On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
>
>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>
>>> With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
>>> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
>>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>>
>> Arun, since it sounds like you have a pretty definite idea
>> in mind for what you want 'beta' label to actually mean,
>> could you, please, share the exact criteria?
>
> Sorry, I'm not sure if this is exactly what you are looking for but, as I mentioned above, the primary aim would be to make the final set of required API/wire-protocol changes so that we can call it a 'beta', i.e. once 2.0.5-beta ships, users & downstream projects can be confident about forward compatibility in the hadoop-2.x line. Obviously, we might discover a blocker bug post 2.0.5 which *might* necessitate an unfortunate change - but that should be an outstanding exception.
>
> Hope that helps.

It does make things a bit easier, but here's what I'd like to find
out: what *level* of feedback from downstream components
and the DevOps community would be considered adequate for a
release to be called beta?

IOW, would it make sense for us, as a community, to make
the following things part of the release criteria as far
as downstream components are concerned:
   * producing Maven artifacts of downstream components
     against branch-2 artifacts.
   * having unit test jobs for all the downstream components
     against branch-2 artifacts
   * having all the failures in those unit tests triaged and filed
     either against a component itself or hadoop
   * running Bigtop integration tests on branch-2 nightly
   * having all the failures of those integration tests triaged and filed
     either against components or hadoop

Obviously, quantifying DevOps feedback and involvement
is more difficult, but would it be completely out of the question
to, essentially, predicate beta on some level of feedback
coming from Yahoo!/LI/FB/etc?

Thanks,
Roman.

P.S. Note that Bigtop can help with most of those things -- so let's
not get hung up on resources too much for now -- but rather on
whether we'd want those to be part of the release criteria
IF we had all the resources.

Re: Heads up - 2.0.5-beta

Posted by Roman Shaposhnik <rv...@apache.org>.
On Fri, Apr 26, 2013 at 11:15 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:
>
>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>>
>>> With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
>>> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
>>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>>
>> Arun, since it sounds like you have a pretty definite idea
>> in mind for what you want 'beta' label to actually mean,
>> could you, please, share the exact criteria?
>
> Sorry, I'm not sure if this is exactly what you are looking for but, as I mentioned above, the primary aim would be to make the final set of required API/wire-protocol changes so that we can call it a 'beta', i.e. once 2.0.5-beta ships, users & downstream projects can be confident about forward compatibility in the hadoop-2.x line. Obviously, we might discover a blocker bug post 2.0.5 which *might* necessitate an unfortunate change - but that should be an outstanding exception.
>
> Hope that helps.

It does make things a bit easier, but here's what I'd like to find
out: what *level* of feedback from downstream components
and the DevOps community would be considered adequate for a
release to be called beta?

IOW, would it make sense for us, as a community, to make
the following things part of the release criteria as far
as downstream components are concerned:
   * producing Maven artifacts of downstream components
     against branch-2 artifacts.
   * having unit test jobs for all the downstream components
     against branch-2 artifacts
   * having all the failures in those unit tests triaged and filed
     either against a component itself or hadoop
   * running Bigtop integration tests on branch-2 nightly
   * having all the failures of unit tests triaged and filed
     either against components or hadoop

Obviously, quantifying DevOps feedback and involvement
is more difficult, but would it be completely out of the question
to, essentially, predicate beta on some level of feedback
coming from Yahoo!/LI/FB/etc?

Thanks,
Roman.

P.S. Note that most of those things Bigtop can help with -- so let's
not get hung up on resources too much for now -- but rather on
whether we'd want those to be part of the release criteria
IF we had all the resources.

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Apr 25, 2013, at 7:31 PM, Roman Shaposhnik wrote:

> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> 
>> With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
>> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
>> but please ensure that all APIs are frozen for hadoop-2.0.5-beta
> 
> Arun, since it sounds like you have a pretty definite idea
> in mind for what you want 'beta' label to actually mean,
> could you, please, share the exact criteria? 

Sorry, I'm not sure if this is exactly what you are looking for but, as I mentioned above, the primary aim would be to make the final set of required API/wire-protocol changes so that we can call it a 'beta', i.e. once 2.0.5-beta ships, users & downstream projects can be confident about forward compatibility in the hadoop-2.x line. Obviously, we might discover a blocker bug post 2.0.5 which *might* necessitate an unfortunate change - but that should be an outstanding exception.

Hope that helps.

thanks,
Arun


Re: Heads up - 2.0.5-beta

Posted by Roman Shaposhnik <rv...@apache.org>.
On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> Gang,
>
>  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our hadoop-2.x alphas.
> We have made lots of progress on hadoop-2.x and I believe we are nearly there, exciting times!

Indeed!

>  As we have discussed previously, I hope to do a final push to stabilize hadoop-2.x, release a
> hadoop-2.0.5-beta in the next month or so; and then declare hadoop-2.1 as stable this summer
> after a short period of intensive testing.
>
>  With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
> but please ensure that all APIs are frozen for hadoop-2.0.5-beta

Arun, since it sounds like you have a pretty definite idea
in mind for what you want 'beta' label to actually mean,
could you, please, share the exact criteria? Either in the
thread I started a few days ago: http://s.apache.org/da5
or here.

That would be appreciated!

Thanks,
Roman.

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Arun,

Could you please define the release plan and put it to a vote,
in accordance with the Bylaws? After this discussion, of course.

http://hadoop.apache.org/bylaws.html
Release Plan
Defines the timetable and actions for a release. The plan also nominates a
Release Manager.
Lazy majority of active committers

Do I understand correctly that you are volunteering to be RM? Just to clarify.
Suresh had already put a list of features for HDFS and common.
So you probably need to indicate features for MapReduce and Yarn.

Thanks,
--Konstantin



On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Gang,
>
>  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> believe we are nearly there, exciting times!
>
>  As we have discussed previously, I hope to do a final push to stabilize
> hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> declare hadoop-2.1 as stable this summer after a short period of intensive
> testing.
>
>  With that in mind, I really want to make a serious push to lock down APIs
> and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>
>  Vinod is helping out on the YARN/MR side and has tagged a number of final
> changes (including some the final API incompatibilities) we'd like to push
> in before we call hadoop-2.x as ready to be supported (Target Version set
> to 2.0.5-beta):
>  http://s.apache.org/target-hadoop-2.0.5-beta
>  Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> tagged, but their necessity is implied).
>
>  Similarly on HDFS side, can someone please help out by tagging features,
> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> protocols are locked down too - I'd really appreciate it!
>
> thanks,
> Arun
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Apr 25, 2013, at 6:36 PM, Suresh Srinivas wrote:

>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>> 
>> Similarly on HDFS side, can someone please help out by tagging features,
>> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
>> protocols are locked down too - I'd really appreciate it!
> 
> To ensure a timely release of 2.0.5-beta, we should not hold back for
> individual features. However, I would like to make necessary API and/or
> protocol changes right away. This will allow us to add features in
> subsequent releases e.g. hadoop-2.2 or hadoop-2.3 etc without breaking
> compatibility. 

+1, sounds like a good plan. Thanks!

Arun

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
If there are no objections, I'll start a vote on this proposal now.

Thanks,
--Konstantin


On Tue, Apr 30, 2013 at 4:28 PM, Konstantin Shvachko
<sh...@gmail.com> wrote:

> Hi Arun,
>
> I am agnostic about version numbers too, as long as the count goes up.
> The discussion you are referring to is somewhat outdated, it was talking
> about 2.0.4-beta, which we already passed. It is talking about producing a
> series "not suitable for general consumption", which isn't correct for the
> latest release 2.0.4. That discussion clearly outlined general (or
> specific) frustration about breaking compatibility from top level projects.
>
> You are not listing new features for MR and YARN.
> So it will only be about the four HDFS features Suresh proposed for 2.0.5.
> As I said earlier my problem with them is that each is big enough to
> destabilize the code base, and big enough to be targeted for a separate
> release. The latter relates to the "streamlining" thread on general@.
> I also think the proposed features will delay stable 2.x beyond the
> time-frame you projected, because some of them are not implemented yet, and
> Windows is in a condition unknown to me, as integration builds are still not
> run for it.
>
> If the next release has to be 2.0.5 I would like to make an alternative
> proposal, which would include
> - stabilization of current 2.0.4
> - making all API changes to allow freezing them post 2.0.5
> And nothing else.
>
> We can add new features in a subsequent release (or releases). Potentially we
> can end up in the same place as you proposed but with more certainty along
> the road.
> The main reason I am asking for stabilization is to make it available for
> large installations such as Yahoo sooner. And this will require commitment
> to compatibility as Bobby mentioned on several occasions.
>
> As a rule of thumb compatibility for me means that I can do a rolling
> upgrade on the cluster. More formal definitions like Karthik's
> Compatibility page are better. BigTop's integration testing proved to be
> very productive.
>
> Thanks,
> --Konstantin
>
>
> On Fri, Apr 26, 2013 at 6:06 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
>> Konstantin,
>>
>> On Apr 26, 2013, at 4:34 PM, Konstantin Shvachko wrote:
>>
>> > Do you think we can call the version you proposed to release
>> > 2.1.0 or 2.1.0-beta?
>> >
>> > The proposed new features imho do not exactly conform with the idea
>> > of a dot-dot release, but definitely qualify for a major number change.
>> > I am just trying to avoid rather ugly 2.0.4.1 versions, which are of course
>> > also possible.
>>
>> I'm agnostic to the schemes.
>>
>> During the long discussion we had just 2 months ago, I proposed that
>> 2.1.x be the beta series initially.
>>
>> The feedback and consensus was that it wasn't the right numbering scheme:
>> http://s.apache.org/1j4
>>
>> thanks,
>> Arun
>>
>
>

Re: Heads up - 2.0.5-beta

Posted by Robert Evans <ev...@yahoo-inc.com>.
I agree that "destructive" is not the correct word to describe features
like snapshots and Windows support.  However, I also agree with Konstantin
that any large feature will have a destabilizing effect on the code base,
even if it is done on a branch and thoroughly tested before being merged
in. HDFS HA from what I have seen and heard is rock solid, but it took a
while to get there even after it was merged into branch-2. And we all know
how long YARN and MRv2 have taken to stabilize.

I also agree that no one individual is able to police all of Hadoop.  We
have to rely on the committers to make sure that what is placed in a
branch is appropriate for that branch in preparation for a release.  As a
community we need to decide what the goals of a branch are so that I as a
committer can know what is and is not appropriate to be placed in that
branch.  This is the reason why we are discussing API and binary
compatibility. This is the reason why I support having a vote for a
release plan.  The question for the community comes down to this: do we
want to release quickly and often off of trunk, trying hard to maintain
compatibility between releases, or do we want to follow what we have done
up to now, where a single branch goes into stabilization, trunk gets
anything that is not "compatible" with that branch, and it takes a huge
effort to switch momentum from one branch to another?  Up to this point we
have almost successfully done this switch once, from 1.0 to 2.0. I have a
hard time believing that we are going to do this again for another 5 years.

There is nothing preventing the community from letting each organization
decide what they want to do and we end up with both.  But this results in
fragmentation of the community, and makes it difficult for those trying to
stabilize a release because there is no critical mass of individuals using
and testing that branch.  It also results in the scrambling we are seeing
now to try and revert the incompatibles between 1.0 and 2.0 that were
introduced in the years between these releases.  If we are going to do the
same and make 3.0 compatible with 2.0 when the switch comes, why do we
even allow any incompatible changes in at all?  It just feels like trunk
is a place to put tech debt that we are going to try and revert later.  I
personally like the Linux and BSD models, where there is a new feature
merge window and any new features can come in, then the entire community
works together to stabilize the release before going on to the next merge
window.  If the release does not stabilize quickly the next merge window
gets pushed back. I realize this is very different from the current model
and is not likely to receive a lot of support, but it has worked for them
for a long time, and they have code bases just as large as Hadoop and even
larger and more diverse communities.

I am +1 for Konstantin's release plan and will vote as such on that thread.

--Bobby

On 5/3/13 3:06 AM, "Konstantin Shvachko" <sh...@gmail.com> wrote:

>Hi Arun and Suresh,
>
>I am glad my choice of words attracted your attention. I consider this
>important for the project otherwise I wouldn't waste everybody's time.
>You tend to react to the latest message taken out of context, which does not
>reveal the full picture.
>I'll try here to summarize my proposal and motivation expressed earlier in
>these two threads:
>http://s.apache.org/fs
>http://s.apache.org/Streamlining
>
>I am advocating
>1. to make 2.0.5 a release that will
>    a) make any necessary changes so that Hadoop APIs could be fixed after that
>    b) fix bugs: internal and those important for stabilizing downstream projects
>2. Release 2.1.0 stable. I.e. both with stable APIs and stable code base.
>3. Produce a series of feature releases. Potentially catching up with the
>state of trunk.
>4. Release from trunk afterwards.
>
>The main motivation to minimize changes in 2.0.5 is to let Hadoop users and
>the downstream projects, that is the Hadoop community, start adapting to
>the new APIs asap. This will provide certainty that people can build their
>products on top of 2.0.5 APIs with minimal risk the next release will break
>them.
>Thus Bobby in http://goo.gl/jm5am
>is saying that the meaning of beta for him is locked down APIs for wire and
>binary compatibility. For Hadoop, Yahoo using 2.x is an opportunity to have
>it tested at very large scale, which in turn will bring other users on board.
>I agree with Arun that we are not disagreeing on much. Just on the order of
>execution: what goes first, stability or features.
>I am not challenging any features, the implementations, or the developers.
>But putting all changes together is destructive for the stability of the
>release. Adding a 500 KB patch invalidates prior testing solely because it
>is a big change that needs testing not only by itself but with upstream
>applications.
>With 2.0.3 and 2.0.4 tested thoroughly and widely in many organizations and
>several distributions it seems like a perfect base for the stable release.
>We could be just two steps away from it.
>
>I tried to explain as well as I could what I suggest, why, and why now. I
>am not here to police, claim, mandate, enforce edicts, be a gatekeeper,
>narrow view, tie up knots ... (did I miss any). If we disagree let's do it
>by the rules we created for ourselves and move on. Life will self-adjust
>and the entropy will keep increasing no matter what.
>
>Thanks,
>--Konstantin


Re: Heads up - 2.0.5-beta

Posted by Robert Evans <ev...@yahoo-inc.com>.
I agree that "destructive" is not the correct word to describe features
like snapshots and windows support.  However, I also agree with Konstantin
that any large feature will have a destabilizing effect on the code base,
even if it is done on a branch and thoroughly tested before being merged
in. HDFS HA from what I have seen and heard is rock solid, but it took a
while to get there even after it was merged into branch-2. And we all know
how long YARN and MRv2 have taken to stabilize.

I also agree that no one individual is able to police all of Hadoop.  We
have to rely on the committers to make sure that what is placed in a
branch is appropriate for that branch in preparation for a release.  As a
community we need to decided what the goals of a branch are so that I as a
committer can know what is and is not appropriate to be placed in that
branch.  This is the reason why we are discussing API and binary
compatibility. This is the reason why I support having a vote for a
release plan.  The question for the community comes down to do we want to
release quickly and often off of trunk trying hard to maintain
compatibility between releases or do we want to follow what we have done
up to now where a single branch goes into stabilization, trunk gets
anything that is not "compatible" with that branch, and it takes a huge
effort to switch momentum from one branch to another.  Up to this point we
have almost successfully done this switch once, from 1.0 to 2.0. I have a
hard time believing that we are going to do this again for another 5 years.

There is nothing preventing the community from letting each organization
decide what they want to do and we end up with both.  But this results in
fragmentation of the community, and makes it difficult for those trying to
stabilize a release because there is no critical mass of individuals using
and testing that branch.  It also results in the scrambling we are seeing
now to try and revert the incompatibles between 1.0 and 2.0 that were
introduced in the years between these releases.  If we are going to do the
same and make 3.0 compatible with 2.0 when the switch comes, why do we
even allow any incompatible changes in at all?  It just feels like trunk
is a place to put tech debt that we are going to try and revert later.  I
personally like the Linux and BSD models, where there is a new feature
merge window and any new features can come in, then the entire community
works together to stabilize the release before going on the the next merge
window.  If the release does not stabilize quickly the next merge window
gets pushed back. I realize this is very different from the current model
and is not likely to receive a lot of support, but it has worked for them
for a long time, and they have code bases just as large as Hadoop and even
larger and more diverse communities.

I am +1 for Konstantin's release plan and will vote as such on that thread.

--Bobby

On 5/3/13 3:06 AM, "Konstantin Shvachko" <sh...@gmail.com> wrote:

>Hi Arun and Suresh,
>
>I am glad my choice of words attracted your attention. I consider this
>important for the project otherwise I wouldn't waste everybody's time.
>You tend reacting on a latest message taken out of context, which does not
>reveal full picture.
>I'll try here to summarize my proposal and motivation expressed earlier in
>these two threads:
>http://s.apache.org/fs
>http://s.apache.org/Streamlining
>
>I am advocating
>1. to make 2.0.5 a release that will
>    a) make any necessary changes so that Hadoop APIs could be fixed after
>that
>    b) fix bugs: internal and those important for stabilizing downstream
>projects
>2. Release 2.1.0 stable. I.e. both with stable APIs and stable code base.
>3. Produce a series of feature releases. Potentially catching up with the
>state of trunk.
>4. Release from trunk afterwards.
>
>The main motivation to minimize changes in 2.0.5 is to let Hadoop users
>and
>the downstream projects, that is the Hadoop community, to start adapting
>to
>the new APIs asap. This will provide certainty that people can build their
>products on top of 2.0.5 APIs with minimal risk the next release will
>break
>them.
>Thus Bobby in http://goo.gl/jm5am
>is saying that the meaning of beta for him is locked down APIs for wire
>and
>binary compatibility. For Hadoop Yahoo using 2.x is an opportunity to have
>it tested at very large scale, which in turn will bring other users on
>board.
>
>I agree with Arun that we are not disagreeing on much. Just on the order
>of
>execution: what goes first stability or features.
>I am not challenging any features, the implementations, or the developers.
>But putting all changes together is destructive for the stability of the
>release. Adding a 500 KB patch invalidates prio testing solely because it
>is a big change that needs testing not only by itself but with upstream
>applications.
>With 2.0.3 , 2.0.4 tested thoroughly and widely in many organizations and
>several distributions it seems like a perfect base for the stable release.
>We could be just two steps away from it.
>
>I tried to explained as good as I could what I suggest, why, and why now.
>I
>am not here to police, claim, mandate, enforce edicts, be a gatekeeper,
>narrow view, tie up knots ... (did I miss any). If we disagree let's do it
>by the rules we created for ourselves and move on. Life will self-adjust
>and the entropy will keep increasing no matter what.
>
>Thanks,
>--Konstantin


Re: Heads up - 2.0.5-beta

Posted by Robert Evans <ev...@yahoo-inc.com>.
I agree that "destructive" is not the correct word to describe features
like snapshots and windows support.  However, I also agree with Konstantin
that any large feature will have a destabilizing effect on the code base,
even if it is done on a branch and thoroughly tested before being merged
in. HDFS HA from what I have seen and heard is rock solid, but it took a
while to get there even after it was merged into branch-2. And we all know
how long YARN and MRv2 have taken to stabilize.

I also agree that no one individual is able to police all of Hadoop.  We
have to rely on the committers to make sure that what is placed in a
branch is appropriate for that branch in preparation for a release.  As a
community we need to decided what the goals of a branch are so that I as a
committer can know what is and is not appropriate to be placed in that
branch.  This is the reason why we are discussing API and binary
compatibility. This is the reason why I support having a vote for a
release plan.  The question for the community comes down to do we want to
release quickly and often off of trunk trying hard to maintain
compatibility between releases or do we want to follow what we have done
up to now where a single branch goes into stabilization, trunk gets
anything that is not "compatible" with that branch, and it takes a huge
effort to switch momentum from one branch to another.  Up to this point we
have almost successfully done this switch once, from 1.0 to 2.0. I have a
hard time believing that we are going to do this again for another 5 years.

There is nothing preventing the community from letting each organization
decide what they want to do and we end up with both.  But this results in
fragmentation of the community, and makes it difficult for those trying to
stabilize a release because there is no critical mass of individuals using
and testing that branch.  It also results in the scrambling we are seeing
now to try and revert the incompatibles between 1.0 and 2.0 that were
introduced in the years between these releases.  If we are going to do the
same and make 3.0 compatible with 2.0 when the switch comes, why do we
even allow any incompatible changes in at all?  It just feels like trunk
is a place to put tech debt that we are going to try and revert later.  I
personally like the Linux and BSD models, where there is a new feature
merge window and any new features can come in, then the entire community
works together to stabilize the release before going on the the next merge
window.  If the release does not stabilize quickly the next merge window
gets pushed back. I realize this is very different from the current model
and is not likely to receive a lot of support, but it has worked for them
for a long time, and they have code bases just as large as Hadoop and even
larger and more diverse communities.

I am +1 for Konstantin's release plan and will vote as such on that thread.

--Bobby

On 5/3/13 3:06 AM, "Konstantin Shvachko" <sh...@gmail.com> wrote:

>Hi Arun and Suresh,
>
>I am glad my choice of words attracted your attention. I consider this
>important for the project otherwise I wouldn't waste everybody's time.
>You tend reacting on a latest message taken out of context, which does not
>reveal full picture.
>I'll try here to summarize my proposal and motivation expressed earlier in
>these two threads:
>http://s.apache.org/fs
>http://s.apache.org/Streamlining
>
>I am advocating
>1. to make 2.0.5 a release that will
>    a) make any necessary changes so that Hadoop APIs could be fixed after
>that
>    b) fix bugs: internal and those important for stabilizing downstream
>projects
>2. Release 2.1.0 stable. I.e. both with stable APIs and stable code base.
>3. Produce a series of feature releases. Potentially catching up with the
>state of trunk.
>4. Release from trunk afterwards.
>
>The main motivation to minimize changes in 2.0.5 is to let Hadoop users
>and
>the downstream projects, that is the Hadoop community, to start adapting
>to
>the new APIs asap. This will provide certainty that people can build their
>products on top of 2.0.5 APIs with minimal risk the next release will
>break
>them.
>Thus Bobby in http://goo.gl/jm5am
>is saying that the meaning of beta for him is locked down APIs for wire
>and
>binary compatibility. For Hadoop Yahoo using 2.x is an opportunity to have
>it tested at very large scale, which in turn will bring other users on
>board.
>
>I agree with Arun that we are not disagreeing on much. Just on the order
>of execution: what goes first, stability or features.
>I am not challenging any features, the implementations, or the developers.
>But putting all changes together is destructive for the stability of the
>release. Adding a 500 KB patch invalidates prior testing solely because it
>is a big change that needs testing not only by itself but with upstream
>applications.
>With 2.0.3, 2.0.4 tested thoroughly and widely in many organizations and
>several distributions it seems like a perfect base for the stable release.
>We could be just two steps away from it.
>
>I tried to explain as well as I could what I suggest, why, and why now.
>I am not here to police, claim, mandate, enforce edicts, be a gatekeeper,
>narrow view, tie up knots ... (did I miss any?). If we disagree, let's do
>it by the rules we created for ourselves and move on. Life will
>self-adjust and the entropy will keep increasing no matter what.
>
>Thanks,
>--Konstantin


Re: Heads up - 2.0.5-beta

Posted by Robert Evans <ev...@yahoo-inc.com>.
I agree that "destructive" is not the correct word to describe features
like snapshots and Windows support.  However, I also agree with Konstantin
that any large feature will have a destabilizing effect on the code base,
even if it is done on a branch and thoroughly tested before being merged
in. HDFS HA from what I have seen and heard is rock solid, but it took a
while to get there even after it was merged into branch-2. And we all know
how long YARN and MRv2 have taken to stabilize.

I also agree that no one individual is able to police all of Hadoop.  We
have to rely on the committers to make sure that what is placed in a
branch is appropriate for that branch in preparation for a release.  As a
community we need to decide what the goals of a branch are so that I as a
committer can know what is and is not appropriate to be placed in that
branch.  This is the reason why we are discussing API and binary
compatibility. This is the reason why I support having a vote for a
release plan.  The question for the community comes down to this: do we
want to release quickly and often off of trunk, trying hard to maintain
compatibility between releases, or do we want to follow what we have done
up to now, where a single branch goes into stabilization, trunk gets
anything that is not "compatible" with that branch, and it takes a huge
effort to switch momentum from one branch to another?  Up to this point we
have almost successfully done this switch once, from 1.0 to 2.0. I have a
hard time believing that we are going to do this again for another 5 years.

There is nothing preventing the community from letting each organization
decide what they want to do and we end up with both.  But this results in
fragmentation of the community, and makes it difficult for those trying to
stabilize a release because there is no critical mass of individuals using
and testing that branch.  It also results in the scrambling we are seeing
now to try and revert the incompatibilities between 1.0 and 2.0 that were
introduced in the years between these releases.  If we are going to do the
same and make 3.0 compatible with 2.0 when the switch comes, why do we
even allow any incompatible changes in at all?  It just feels like trunk
is a place to put tech debt that we are going to try and revert later.  I
personally like the Linux and BSD models, where there is a new feature
merge window and any new features can come in, then the entire community
works together to stabilize the release before going on to the next merge
window.  If the release does not stabilize quickly, the next merge window
gets pushed back. I realize this is very different from the current model
and is not likely to receive a lot of support, but it has worked for them
for a long time, and they have code bases just as large as Hadoop and even
larger and more diverse communities.

I am +1 for Konstantin's release plan and will vote as such on that thread.

--Bobby

On 5/3/13 3:06 AM, "Konstantin Shvachko" <sh...@gmail.com> wrote:

>Hi Arun and Suresh,
>
>I am glad my choice of words attracted your attention. I consider this
>important for the project; otherwise I wouldn't waste everybody's time.
>You tend to react to the latest message taken out of context, which does
>not reveal the full picture.
>I'll try here to summarize my proposal and motivation expressed earlier in
>these two threads:
>http://s.apache.org/fs
>http://s.apache.org/Streamlining
>
>I am advocating
>1. to make 2.0.5 a release that will
>    a) make any necessary changes so that Hadoop APIs can be frozen
>       after that
>    b) fix bugs: internal and those important for stabilizing downstream
>       projects
>2. Release 2.1.0 stable. I.e. both with stable APIs and stable code base.
>3. Produce a series of feature releases. Potentially catching up with the
>   state of trunk.
>4. Release from trunk afterwards.
>
>The main motivation to minimize changes in 2.0.5 is to let Hadoop users
>and the downstream projects, that is the Hadoop community, start adapting
>to the new APIs asap. This will provide certainty that people can build
>their products on top of 2.0.5 APIs with minimal risk that the next
>release will break them.
>Thus Bobby in http://goo.gl/jm5am
>is saying that the meaning of beta for him is locked-down APIs for wire
>and binary compatibility. For Hadoop, Yahoo using 2.x is an opportunity to
>have it tested at very large scale, which in turn will bring other users
>on board.
>
>I agree with Arun that we are not disagreeing on much. Just on the order
>of execution: what goes first, stability or features.
>I am not challenging any features, the implementations, or the developers.
>But putting all changes together is destructive for the stability of the
>release. Adding a 500 KB patch invalidates prior testing solely because it
>is a big change that needs testing not only by itself but with upstream
>applications.
>With 2.0.3, 2.0.4 tested thoroughly and widely in many organizations and
>several distributions it seems like a perfect base for the stable release.
>We could be just two steps away from it.
>
>I tried to explain as well as I could what I suggest, why, and why now.
>I am not here to police, claim, mandate, enforce edicts, be a gatekeeper,
>narrow view, tie up knots ... (did I miss any?). If we disagree, let's do
>it by the rules we created for ourselves and move on. Life will
>self-adjust and the entropy will keep increasing no matter what.
>
>Thanks,
>--Konstantin


Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hi Arun and Suresh,

I am glad my choice of words attracted your attention. I consider this
important for the project; otherwise I wouldn't waste everybody's time.
You tend to react to the latest message taken out of context, which does
not reveal the full picture.
I'll try here to summarize my proposal and motivation expressed earlier in
these two threads:
http://s.apache.org/fs
http://s.apache.org/Streamlining

I am advocating
1. to make 2.0.5 a release that will
    a) make any necessary changes so that Hadoop APIs can be frozen
       after that
    b) fix bugs: internal and those important for stabilizing downstream
       projects
2. Release 2.1.0 stable. I.e. both with stable APIs and stable code base.
3. Produce a series of feature releases. Potentially catching up with the
   state of trunk.
4. Release from trunk afterwards.

The main motivation to minimize changes in 2.0.5 is to let Hadoop users and
the downstream projects, that is the Hadoop community, start adapting to
the new APIs asap. This will provide certainty that people can build their
products on top of 2.0.5 APIs with minimal risk that the next release will
break them.
Thus Bobby in http://goo.gl/jm5am
is saying that the meaning of beta for him is locked-down APIs for wire and
binary compatibility. For Hadoop, Yahoo using 2.x is an opportunity to have
it tested at very large scale, which in turn will bring other users on
board.

I agree with Arun that we are not disagreeing on much. Just on the order of
execution: what goes first, stability or features.
I am not challenging any features, the implementations, or the developers.
But putting all changes together is destructive for the stability of the
release. Adding a 500 KB patch invalidates prior testing solely because it
is a big change that needs testing not only by itself but with upstream
applications.
With 2.0.3, 2.0.4 tested thoroughly and widely in many organizations and
several distributions it seems like a perfect base for the stable release.
We could be just two steps away from it.

I tried to explain as well as I could what I suggest, why, and why now. I
am not here to police, claim, mandate, enforce edicts, be a gatekeeper,
narrow view, tie up knots ... (did I miss any?). If we disagree, let's do it
by the rules we created for ourselves and move on. Life will self-adjust
and the entropy will keep increasing no matter what.

Thanks,
--Konstantin

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
Konstantin,

On May 2, 2013, at 2:08 AM, Konstantin Shvachko wrote:

> I am arguing against invasive and destructive features proposed for the
> release.
> Just to remind here they are again, since the history has been wiped out.
> 
> # Snapshots
> # NFS gateway for HDFS
> # HDFS-347 unix domain socket based short circuits
> # Windows support
> 
> Do I understand correctly that you as a Release Manager will allow any
> changes in your release?
> In the next 3-4 weeks.

It is not appropriate for me, or anyone for that matter, to behave like a gatekeeper for a branch or a release. We have established this many times over as being counter-productive (e.g., see Roy's responses on the role of the RM in the archives). This is particularly because Hadoop is such a complex system, exacerbated by the fact that this really is an umbrella project which needs to be broken up (HDFS, YARN, MapReduce); hence, no one person can sufficiently police all changes or try to enforce 'thou shalt do this, and this alone' edicts. There are too many shades of gray.

IAC, the only role of RM is to gently prod people into working together so that we can get releases out of the door for our users. 

It shouldn't shock anyone when I confess that I do not have sufficient expertise to argue with you about the list of HDFS features you are calling 'destructive' - I'll warn you that people working on those features might not share your opinion of their work as such, either. *smile* 

Given this, I urge you, again, to talk to people working on those features, feature-by-feature. Provide your feedback, review their code and please come to a consensus about what release they should be part of. If possible, help them test it; if not, ask them what testing they have done or plan to do and see if that seems reasonable to you, help them if you can; at the end of the day, please come to a consensus with them.

It's not my place to express opinions about others' work on HDFS; however, under extreme duress, I may confess that my understanding is that HDFS-347 is reasonably well-tested, the only changes needed to support NFS are changes to some APIs/protocols (FileID), while the actual feature might come in later, and that Snapshots have been worked on collaboratively for a long while. Even then we seem to agree that all necessary protocol changes will go into 2.0.5-beta; so I'm not sure whether we are disagreeing on much at all!

If anyone has concerns about YARN/MapReduce, I'm willing to participate in a constructive dialogue. For example, the only opinion I can offer for the list here is that it is my understanding that the proposed changes to support Windows in YARN/MR are very contained and hence non-risky. Lots of people have spent more time on adding that feature than I have; hence I'll assign more weight to their opinion of its stability than my own.

None of this means that I'll withhold the release for any of the features - but if someone steps up and says they want to pull it into branch-2 I will not block them. 

I hope this is reasonable, and that we can all get back to finishing up the release.

Clearly, one thing we all need to agree on (quickly) is the set of rules for compatibility for major/minor/patch releases. I plan to spend more time on it this week and the next.

FTR, my opinion is that within a major release we need to be compatible (both APIs and protocols, both user-facing and internal, i.e. for rolling upgrades etc.), minor releases can add compatible features, and patch releases are meant for bug-fixes.

thanks,
Arun


Fwd: Heads up - 2.0.5-beta

Posted by Suresh Srinivas <su...@hortonworks.com>.
Konstantin,

> I am arguing against invasive and destructive features proposed for the
> release.
>

Your choice of words is deplorable, to say the least.

Can you explain what you mean by *destructive*? Please substantiate your
claim on technical grounds.

So far you have been quiet while we have been developing these
features on multiple jiras for many months. Now for you to suddenly appear
on the release thread and try to block it, by calling them *destructive*,
surprises me.

All of this feature development has happened in the open.
If you are concerned about it being *destructive*, please participate in the
discussions, code reviews and code contribution to make it
*non-destructive*.

> # Snapshots
> # NFS gateway for HDFS
> # HDFS-347 unix domain socket based short circuits
> # Windows support
>

We are not throwing all this code over the wall. These features are tested
well enough and are ready. We have been working at it for many, many months.
Some of this code has been in trunk for quite some time without causing any
instability. I take the responsibility of testing these features and
stabilizing them. Obviously, any help is welcome.

Regards,
Suresh



-- 
http://hortonworks.com/download/




Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
I am not sure what your point was here. You seem to be assuming things I
never mentioned.

I am arguing against invasive and destructive features proposed for the
release.
Just to remind, here they are again, since the history has been wiped out.

# Snapshots
# NFS gateway for HDFS
# HDFS-347 unix domain socket based short circuits
# Windows support

Do I understand correctly that you, as Release Manager, will allow any
changes in your release in the next 3-4 weeks?

Thanks,
--Konstantin


On Wed, May 1, 2013 at 6:24 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

>
> On May 1, 2013, at 4:08 PM, Konstantin Shvachko wrote:
>
> > On Wed, May 1, 2013 at 1:15 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> >>
> >> On Apr 30, 2013, at 4:28 PM, Konstantin Shvachko wrote:
> >>
> >>> If the next release has to be 2.0.5 I would like to make an alternative
> >>> proposal, which would include
> >>> - stabilization of current 2.0.4
> >>> - making all API changes to allow freezing them post 2.0.5
> >>> And nothing else.
> >>
> >> I think it's hard to clearly define - 'nothing else'. For e.g.
> >> YARN-398/YARN-392. It's a 'feature' but worth putting in right-away since
> >> it so low-risk. MAPREDUCE-5108 is a 'feature' but is critical for ensuring
> >> a smooth transition from MR1 to MR2 etc. etc.
> >>
> >
> > Don't see contradictions to the plan here.
> > Both YARN-398, YARN-392 are important optimizations. They require API
> > changes, so it is better to commit them into 2.0.5. If RM sees a low risk
> > in including the implementations, I don't see a problem.
> > MAPREDUCE-5108 as a compatibility issue should go in, imho.
>
> Actually, YARN-398/YARN-392 and other such optimizations can go into
> future releases in a compatible manner too, since we have PB-based
> protocols in YARN (as in HDFS).
>
> However, they serve to illustrate why having a very narrow view of
> 'allowed' changes for the next 3-4 weeks will just add needless complexity.
>
> IAC, like I said it would be better to let individual contributors decide
> on risk of individual changes since they are the ones supporting them.
> Having a strict policy leads to all sorts of further dialogues and issues
> we could do well without.
>
> thanks,
> Arun
>
>

Re: Heads up - 2.0.5-beta

Posted by Chris Douglas <cd...@apache.org>.
On Thu, May 2, 2013 at 2:11 AM, Konstantin Shvachko
<sh...@gmail.com> wrote:
> On Thu, May 2, 2013 at 12:07 AM, Chris Douglas <cd...@apache.org> wrote:
>> Can anyone remember why we vote on release plans? -C
>
> To vote on features to include in the release.

Since most features are developed in branches (requiring a merge
vote), each change is RTC, and the release itself requires a vote... a
vote on the executive summary for a release is a poor time to engage
development. It doesn't seem to accomplish anything when it's not a
formality, so maybe we're better without it. Thoughts?

> I am arguing against invasive and destructive features proposed for the
> release.

Heh; do we need new tags in JIRA?

Setting aside the choice of words, we don't assign work by voting.
Stability is a shared goal, but conflating it with inertia after our
experiences with the 0.20 forks, 0.21, and 0.22 takes exactly the
wrong lessons from those episodes.

If you want to create a 2.x branch, pull out the features you view as
high-risk, and invite others to join your effort: you don't need
anyone's permission. If the bylaws contradict this, then that's a bug.
But one can't vote a set of priorities into preeminence, he can only
convince others to share them *and work on them.* It's cheap to
reassign versions to accommodate whatever shape the community takes,
but voting first and expecting everyone to follow the result has never
worked. Cos's chart gives the lie to the attempt: every time we've
tried, we end up reassigning the versions according to reality,
anyway. -C

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
On Thu, May 2, 2013 at 12:07 AM, Chris Douglas <cd...@apache.org> wrote:
> Can anyone remember why we vote on release plans? -C

To vote on features to include in the release.

Thanks,
--Konstantin

Re: Heads up - 2.0.5-beta

Posted by Chris Douglas <cd...@apache.org>.
On Wed, May 1, 2013 at 6:24 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> Having a strict policy leads to all sorts of further dialogues and issues we could do well without.

+1

Can anyone remember why we vote on release plans? -C

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
On May 1, 2013, at 4:08 PM, Konstantin Shvachko wrote:

> On Wed, May 1, 2013 at 1:15 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>> 
>> On Apr 30, 2013, at 4:28 PM, Konstantin Shvachko wrote:
>> 
>>> If the next release has to be 2.0.5 I would like to make an alternative
>>> proposal, which would include
>>> - stabilization of current 2.0.4
>>> - making all API changes to allow freezing them post 2.0.5
>>> And nothing else.
>> 
>> I think it's hard to clearly define - 'nothing else'. For e.g.
>> YARN-398/YARN-392. It's a 'feature' but worth putting in right-away since
>> it so low-risk. MAPREDUCE-5108 is a 'feature' but is critical for ensuring
>> a smooth transition from MR1 to MR2 etc. etc.
>> 
> 
> Don't see contradictions to the plan here.
> Both YARN-398, YARN-392 are important optimizations. They require API
> changes, so it is better to commit them into 2.0.5. If RM sees a low risk
> in including the implementations, I don't see a problem.
> MAPREDUCE-5108 as a compatibility issue should go in, imho.

Actually, YARN-398/YARN-392 and other such optimizations can go into future releases in a compatible manner too, since we have PB-based protocols in YARN (as in HDFS).

However, they serve to illustrate why having a very narrow view of 'allowed' changes for the next 3-4 weeks will just add needless complexity.

IAC, like I said it would be better to let individual contributors decide on risk of individual changes since they are the ones supporting them. Having a strict policy leads to all sorts of further dialogues and issues we could do well without. 

thanks,
Arun


Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
On Wed, May 1, 2013 at 1:15 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
> On Apr 30, 2013, at 4:28 PM, Konstantin Shvachko wrote:
>
> > If the next release has to be 2.0.5 I would like to make an alternative
> > proposal, which would include
> > - stabilization of current 2.0.4
> > - making all API changes to allow freezing them post 2.0.5
> > And nothing else.
>
> I think it's hard to clearly define - 'nothing else'. For e.g.
> YARN-398/YARN-392. It's a 'feature' but worth putting in right-away since
> it so low-risk. MAPREDUCE-5108 is a 'feature' but is critical for ensuring
> a smooth transition from MR1 to MR2 etc. etc.
>

Don't see contradictions to the plan here.
Both YARN-398, YARN-392 are important optimizations. They require API
changes, so it is better to commit them into 2.0.5. If RM sees a low risk
in including the implementations, I don't see a problem.
MAPREDUCE-5108 as a compatibility issue should go in, imho.

> Rather than get tied up in knots, it would be useful to go with API
> changes as *mandatory* and everything as optional and not hold up the
> release for them (which is what we have done in hadoop-2.x since forever).
> IAC, risk should be quantified by people working on individual jiras.
>

People were and are complaining that every 2.0 release was incompatible
with the previous one.
I would not say any API changes, but only those that help lock the APIs down post 2.0.5.
"Everything as optional" is too wide in my understanding, as it can be
anything, including changes that break downstream components. In order to
avoid that, the changes should be limited to bug fixes.

> Also, it will be useful to actually start testing things as they stand
> rather than continue to discuss endlessly - would you be willing to help
> test on of hadoop-2.x? If so, could you please share your plans? I'm sure
> everyone will appreciate it.
>

Thank you for asking.
We did comprehensive testing internally of hadoop 2.0.3 and hadoop 2.0.4 as
they stand now, using standard Hadoop tools and BigTop for integration. Cos
introduced a Jenkins build for branch 2, which had not been set up before.
Testing things as they currently EVOLVE doesn't make sense to me, as the
volume of changes proposed will invalidate any current testing.
Endless discussions are not productive. I put up the vote for this release
plan.

Thanks,
--Konstantin

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
Konstantin,

On Apr 30, 2013, at 4:28 PM, Konstantin Shvachko wrote:

> Hi Arun,
> 
> I am agnostic about version numbers too, as long as the count goes up.
> The discussion you are referring to is somewhat outdated, it was talking
> about 2.0.4-beta, which we already passed. 

It's very relevant and related, we pushed 2.0.4-beta to 2.0.5-beta since we slipped in a 2.0.4-alpha bug-fix release.

We could re-visit the same discussion again, but it seems hardly worth the time.

> If the next release has to be 2.0.5 I would like to make an alternative
> proposal, which would include
> - stabilization of current 2.0.4
> - making all API changes to allow freezing them post 2.0.5
> And nothing else.

I think it's hard to clearly define - 'nothing else'. For e.g. YARN-398/YARN-392. It's a 'feature' but worth putting in right-away since it is so low-risk. MAPREDUCE-5108 is a 'feature' but is critical for ensuring a smooth transition from MR1 to MR2 etc. etc.

Rather than get tied up in knots, it would be useful to go with API changes as *mandatory* and everything as optional and not hold up the release for them (which is what we have done in hadoop-2.x since forever). IAC, risk should be quantified by people working on individual jiras.

Also, it will be useful to actually start testing things as they stand rather than continue to discuss endlessly - would you be willing to help test on of hadoop-2.x? If so, could you please share your plans? I'm sure everyone will appreciate it.

From my end (and speaking for the rest of my team), we are putting a lot of work into running functional and scale tests, and are also busy ensuring the transition from hadoop-1 to hadoop-2 is smooth (e.g. MAPREDUCE-5108).

thanks,
Arun

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
If there are no objections, I'll start a vote on this proposal now.

Thanks,
--Konstantin


On Tue, Apr 30, 2013 at 4:28 PM, Konstantin Shvachko <sh...@gmail.com> wrote:

> Hi Arun,
>
> I am agnostic about version numbers too, as long as the count goes up.
> The discussion you are referring to is somewhat outdated, it was talking
> about 2.0.4-beta, which we already passed. It is talking about producing a
> series "not suitable for general consumption", which isn't correct for the
> latest release 2.0.4. That discussion clearly outlined general (or
> specific) frustration about breaking compatibility from top level projects.
>
> You are not listing new features for MR and YARN.
> So it will only be about the four HDFS features Suresh proposed for 2.0.5.
> As I said earlier my problem with them is that each is big enough to
> destabilize the code base, and big enough to be targeted for a separate
> release. The latter relates to the "streamlining" thread on general@.
> I also think the proposed features will delay stable 2.x beyond the
> time-frame you projected, because some of them are not implemented yet, and
> Windows is in a condition unknown to me, as integration builds are still not
> run for it.
>
> If the next release has to be 2.0.5 I would like to make an alternative
> proposal, which would include
> - stabilization of current 2.0.4
> - making all API changes to allow freezing them post 2.0.5
> And nothing else.
>
> We can add new features in subsequent releases. Potentially we
> can end up in the same place as you proposed but with more certainty along
> the road.
> The main reason I am asking for stabilization is to make it available for
> large installations such as Yahoo sooner. And this will require commitment
> to compatibility as Bobby mentioned on several occasions.
>
> As a rule of thumb compatibility for me means that I can do a rolling
> upgrade on the cluster. More formal definitions like Karthik's
> Compatibility page are better. BigTop's integration testing proved to be
> very productive.
>
> Thanks,
> --Konstantin
>
>
> On Fri, Apr 26, 2013 at 6:06 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>
>> Konstantin,
>>
>> On Apr 26, 2013, at 4:34 PM, Konstantin Shvachko wrote:
>>
>> > Do you think we can call the version you proposed to release
>> > 2.1.0 or 2.1.0-beta?
>> >
>> > The proposed new features imho do not exactly conform with the idea
>> > of dot-dot release, but definitely qualify for a major number change.
>> > I am just trying to avoid rather ugly 2.0.4.1 versions, which of course
>> > are also possible.
>>
>> I'm agnostic to the schemes.
>>
>> During the long discussion we had just 2 months ago, I proposed that
>> 2.1.x be the beta series initially.
>>
>> The feedback and consensus was that it wasn't the right numbering scheme:
>> http://s.apache.org/1j4
>>
>> thanks,
>> Arun
>>
>
>

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Hi Arun,

I am agnostic about version numbers too, as long as the count goes up.
The discussion you are referring to is somewhat outdated, it was talking
about 2.0.4-beta, which we already passed. It is talking about producing a
series "not suitable for general consumption", which isn't correct for the
latest release 2.0.4. That discussion clearly outlined general (or
specific) frustration about breaking compatibility from top level projects.

You are not listing new features for MR and YARN.
So it will only be about the four HDFS features Suresh proposed for 2.0.5.
As I said earlier my problem with them is that each is big enough to
destabilize the code base, and big enough to be targeted for a separate
release. The latter relates to the "streamlining" thread on general@.
I also think the proposed features will delay stable 2.x beyond the
time-frame you projected, because some of them are not implemented yet, and
Windows is in a condition unknown to me, as integration builds are still not
run for it.

If the next release has to be 2.0.5 I would like to make an alternative
proposal, which would include
- stabilization of current 2.0.4
- making all API changes to allow freezing them post 2.0.5
And nothing else.

We can add new features in subsequent releases. Potentially we can
end up in the same place as you proposed but with more certainty along the
road.
The main reason I am asking for stabilization is to make it available for
large installations such as Yahoo sooner. And this will require commitment
to compatibility as Bobby mentioned on several occasions.

As a rule of thumb compatibility for me means that I can do a rolling
upgrade on the cluster. More formal definitions like Karthik's
Compatibility page are better. BigTop's integration testing proved to be
very productive.

Thanks,
--Konstantin


On Fri, Apr 26, 2013 at 6:06 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Konstantin,
>
> On Apr 26, 2013, at 4:34 PM, Konstantin Shvachko wrote:
>
> > Do you think we can call the version you proposed to release
> > 2.1.0 or 2.1.0-beta?
> >
> > The proposed new features imho do not exactly conform with the idea
> > of dot-dot release, but definitely qualify for a major number change.
> > I am just trying to avoid rather ugly 2.0.4.1 versions, which of course
> > are also possible.
>
> I'm agnostic to the schemes.
>
> During the long discussion we had just 2 months ago, I proposed that 2.1.x
> be the beta series initially.
>
> The feedback and consensus was that it wasn't the right numbering scheme:
> http://s.apache.org/1j4
>
> thanks,
> Arun
>

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
Konstantin,

On Apr 26, 2013, at 4:34 PM, Konstantin Shvachko wrote:

> Do you think we can call the version you proposed to release
> 2.1.0 or 2.1.0-beta?
> 
> The proposed new features imho do not exactly conform with the idea
> of dot-dot release, but definitely qualify for a major number change.
> I am just trying to avoid rather ugly 2.0.4.1 versions, which of course
> are also possible.

I'm agnostic to the schemes. 

During the long discussion we had just 2 months ago, I proposed that 2.1.x be the beta series initially.

The feedback and consensus was that it wasn't the right numbering scheme:
http://s.apache.org/1j4

thanks,
Arun

Re: Heads up - 2.0.5-beta

Posted by Arpit Agarwal <aa...@hortonworks.com>.
On Thu, Apr 25, 2013 at 6:36 PM, Suresh Srinivas <suresh@hortonworks.com> wrote:

> Thanks for starting this discussion. I volunteer to do a final review of
> protocol changes, so we can avoid incompatible changes to API and wire
> protocol post 2.0.5 in Common and HDFS.
>
> We have been working really hard on the following features. I would like to
> get into 2.x and see it reach HDFS users:
> # Snapshots
> # NFS gateway for HDFS
> # HDFS-347 unix domain socket based short circuits
> # Windows support


Thanks Suresh. It would be great to see Windows support and Snapshots
pushed out with 2.0.5 and get picked up by users.

-Arpit

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Arun, Suresh,

Very exciting to hear about this final push to stable Hadoop 2.
But I have a problem. Either with the plan or with the version number.
I'll be arguing for the number change below rather than the plan.

1. Based on the features listed by Suresh, it looks like you plan a heavy,
feature-full release.
2. You are saying you want to complete this within a month (or so).
3. You would like to give it beta quality mark.

Not saying it is impossible. But in line with the common saying
"You can have fast, good or big: pick two"
(a little rephrasing here)
I would like to propose leaving some gap between 2.0.4 and the next
version so that, just in case, there is a version available for bug fixes
on top of the last release.
Do you think we can call the version you proposed to release
2.1.0 or 2.1.0-beta?

The proposed new features imho do not exactly conform with the idea
of dot-dot release, but definitely qualify for a major number change.
I am just trying to avoid rather ugly 2.0.4.1 versions, which of course
are also possible.

Thanks,
--Konstantin


On Thu, Apr 25, 2013 at 6:36 PM, Suresh Srinivas <su...@hortonworks.com> wrote:

> Thanks for starting this discussion. I volunteer to do a final review of
> protocol changes, so we can avoid incompatible changes to API and wire
> protocol post 2.0.5 in Common and HDFS.
>
> We have been working really hard on the following features. I would like to
> get into 2.x and see it reach HDFS users:
> # Snapshots
> # NFS gateway for HDFS
> # HDFS-347 unix domain socket based short circuits
> # Windows support
>
> Other HDFS folks please let me know if I missed anything.
>
> To ensure a timely release of 2.0.5-beta, we should not hold back for
> individual features. However, I would like to make necessary API and/or
> protocol changes right away. This will allow us to add features in
> subsequent releases, e.g. hadoop-2.2 or hadoop-2.3, without breaking
> compatibility. For example, even if we don't complete NFS support, making
> FileID related changes in 2.0.5-beta will ensure future compatibility.
>
> Regards,
> Suresh
>
>
>
> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
>
> > Gang,
> >
> >  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> > hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> > believe we are nearly there, exciting times!
> >
> >  As we have discussed previously, I hope to do a final push to stabilize
> > hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> > declare hadoop-2.1 as stable this summer after a short period of intensive
> > testing.
> >
> >  With that in mind, I really want to make a serious push to lock down APIs
> > and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> > hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> > features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
> >
> >  Vinod is helping out on the YARN/MR side and has tagged a number of final
> > changes (including some of the final API incompatibilities) we'd like to push
> > in before we call hadoop-2.x as ready to be supported (Target Version set
> > to 2.0.5-beta):
> >  http://s.apache.org/target-hadoop-2.0.5-beta
> >  Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> > tagged, but their necessity is implied).
> >
> >  Similarly on HDFS side, can someone please help out by tagging features,
> > bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> > protocols are locked down too - I'd really appreciate it!
> >
> > thanks,
> > Arun
> >
> >
> > --
> > Arun C. Murthy
> > Hortonworks Inc.
> > http://hortonworks.com/
> >
> >
> >
>
>
> --
> http://hortonworks.com/download/
>

Re: Heads up - 2.0.5-beta

Posted by Arun C Murthy <ac...@hortonworks.com>.
On Apr 25, 2013, at 6:36 PM, Suresh Srinivas wrote:

>> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
>> 
>> Similarly on HDFS side, can someone please help out by tagging features,
>> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
>> protocols are locked down too - I'd really appreciate it!
> 
> To ensure a timely release of 2.0.5-beta, we should not hold back for
> individual features. However, I would like to make necessary API and/or
> protocol changes right away. This will allow us to add features in
> subsequent releases e.g. hadoop-2.2 or hadoop-2.3 etc without breaking
> compatibility. 

+1, sounds like a good plan. Thanks!

Arun

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Arun, Suresh,

Very exciting to hear about this final push to stable Hadoop 2.
But I have a problem. Either with the plan or with the version number.
I'll be arguing for the number change below rather than the plan.

1. Based on the features listed by Suresh, it looks like you plan a heavy,
feature-full release.
2. You are saying you want to complete this within a month (or so).
3. You would like to give it beta quality mark.

Not saying it is impossible. But in line with the common saying
"You can have fast, good or big: pick two"
(a little rephrasing here),
I would like to propose leaving some gap between 2.0.4 and the next
version, so that, just in case, there is a version number available for
bug fixes on top of the last release.
Do you think we can call the version you proposed to release
2.1.0 or 2.1.0-beta?

The proposed new features, imho, do not exactly conform to the idea
of a dot-dot release, but definitely qualify for a major number change.
I am just trying to avoid rather ugly 2.0.4.1 versions, which of course
are also possible.

Thanks,
--Konstantin
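
The version-gap argument above can be made concrete with a small sketch. It is only an illustration: the ordering rule (numeric components first, a release without a qualifier sorting after one with "-alpha"/"-beta") and the helper names parse_version, sort_key and leaves_room_for_bugfix are assumptions made here, not the comparison used by Maven or by any Hadoop release tooling.

```python
def parse_version(version):
    """Split e.g. '2.1.0-beta' into ((2, 1, 0), False); hypothetical helper."""
    numbers, _, qualifier = version.partition("-")
    parts = tuple(int(p) for p in numbers.split("."))
    # Assumed rule: no qualifier means a final release.
    return parts, qualifier == ""


def sort_key(version):
    parts, is_final = parse_version(version)
    # Pad to four components so 2.0.4 and 2.0.4.1 compare consistently.
    padded = parts + (0,) * (4 - len(parts))
    return padded, is_final


def leaves_room_for_bugfix(last_release, next_release):
    """True if a plain x.y.z bug-fix version fits strictly between the two."""
    last, _ = parse_version(last_release)
    nxt, _ = parse_version(next_release)
    major, minor, patch = last[:3]
    return (major, minor, patch + 1) < nxt[:3]


versions = ["2.1.0-beta", "2.0.4-alpha", "2.0.5-beta", "2.0.4.1", "2.1.0"]
ordered = sorted(versions, key=sort_key)
print(ordered)

# Going straight from 2.0.4 to 2.0.5 leaves only four-part numbers for fixes.
print(leaves_room_for_bugfix("2.0.4-alpha", "2.0.5-beta"))
# Jumping to 2.1.0 keeps 2.0.5, 2.0.6, ... free for fixes on top of 2.0.4.
print(leaves_room_for_bugfix("2.0.4-alpha", "2.1.0-beta"))
```

Under these assumed rules, naming the next release 2.0.5-beta leaves 2.0.4.1 as the only slot for a fix on top of 2.0.4, while naming it 2.1.0-beta keeps the ordinary three-part numbers available.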


On Thu, Apr 25, 2013 at 6:36 PM, Suresh Srinivas <su...@hortonworks.com> wrote:

> Thanks for starting this discussion. I volunteer to do a final review of
> protocol changes, so we can avoid incompatible changes to API and wire
> protocol post 2.0.5 in Common and HDFS.
>
> We have been working really hard on the following features. I would like to
> get them into 2.x and see them reach HDFS users:
> # Snapshots
> # NFS gateway for HDFS
> # HDFS-347 unix domain socket based short circuits
> # Windows support
>
> Other HDFS folks please let me know if I missed anything.
>
> To ensure a timely release of 2.0.5-beta, we should not hold back for
> individual features. However, I would like to make necessary API and/or
> protocol changes right away. This will allow us to add features in
> subsequent releases, e.g. hadoop-2.2 or hadoop-2.3, without breaking
> compatibility. For example, even if we don't complete NFS support, making
> FileID-related changes in 2.0.5-beta will ensure future compatibility.
>
> Regards,
> Suresh
>
>
>
> On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
>
> > Gang,
> >
> >  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> > hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> > believe we are nearly there, exciting times!
> >
> >  As we have discussed previously, I hope to do a final push to stabilize
> > hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> > declare hadoop-2.1 as stable this summer after a short period of
> > intensive testing.
> >
> >  With that in mind, I really want to make a serious push to lock down
> > APIs and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently
> > support hadoop-2.x in a compatible manner in the future. So, it's fine to
> > add new features, but please ensure that all APIs are frozen for
> > hadoop-2.0.5-beta.
> >
> >  Vinod is helping out on the YARN/MR side and has tagged a number of
> > final changes (including some of the final API incompatibilities) we'd
> > like to push in before we call hadoop-2.x as ready to be supported
> > (Target Version set to 2.0.5-beta):
> >  http://s.apache.org/target-hadoop-2.0.5-beta
> >  Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> > tagged, but their necessity is implied).
> >
> >  Similarly on HDFS side, can someone please help out by tagging features,
> > bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> > protocols are locked down too - I'd really appreciate it!
> >
> > thanks,
> > Arun
> >
> >
> > --
> > Arun C. Murthy
> > Hortonworks Inc.
> > http://hortonworks.com/
> >
> >
> >
>
>
> --
> http://hortonworks.com/download/
>

Re: Heads up - 2.0.5-beta

Posted by Suresh Srinivas <su...@hortonworks.com>.
Thanks for starting this discussion. I volunteer to do a final review of
protocol changes, so we can avoid incompatible changes to API and wire
protocol post 2.0.5 in Common and HDFS.

We have been working really hard on the following features. I would like to
get them into 2.x and see them reach HDFS users:
# Snapshots
# NFS gateway for HDFS
# HDFS-347 unix domain socket based short circuits
# Windows support

Other HDFS folks please let me know if I missed anything.

To ensure a timely release of 2.0.5-beta, we should not hold back for
individual features. However, I would like to make necessary API and/or
protocol changes right away. This will allow us to add features in
subsequent releases, e.g. hadoop-2.2 or hadoop-2.3, without breaking
compatibility. For example, even if we don't complete NFS support, making
FileID-related changes in 2.0.5-beta will ensure future compatibility.

Regards,
Suresh



On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Gang,
>
>  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> believe we are nearly there, exciting times!
>
>  As we have discussed previously, I hope to do a final push to stabilize
> hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> declare hadoop-2.1 as stable this summer after a short period of intensive
> testing.
>
>  With that in mind, I really want to make a serious push to lock down APIs
> and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>
>  Vinod is helping out on the YARN/MR side and has tagged a number of final
> changes (including some of the final API incompatibilities) we'd like to push
> in before we call hadoop-2.x as ready to be supported (Target Version set
> to 2.0.5-beta):
>  http://s.apache.org/target-hadoop-2.0.5-beta
>  Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> tagged, but their necessity is implied).
>
>  Similarly on HDFS side, can someone please help out by tagging features,
> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> protocols are locked down too - I'd really appreciate it!
>
> thanks,
> Arun
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>


-- 
http://hortonworks.com/download/

Re: Heads up - 2.0.5-beta

Posted by Konstantin Shvachko <sh...@gmail.com>.
Arun,

Could you please define the release plan and put it to a vote, in
accordance with the bylaws? After this discussion, of course.

http://hadoop.apache.org/bylaws.html
Release Plan
Defines the timetable and actions for a release. The plan also nominates a
Release Manager.
Lazy majority of active committers

Do I understand correctly that you are volunteering for RM? Just to clarify.
Suresh has already posted a list of features for HDFS and Common.
So you probably need to indicate features for MapReduce and YARN.

Thanks,
--Konstantin



On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Gang,
>
>  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> believe we are nearly there, exciting times!
>
>  As we have discussed previously, I hope to do a final push to stabilize
> hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> declare hadoop-2.1 as stable this summer after a short period of intensive
> testing.
>
>  With that in mind, I really want to make a serious push to lock down APIs
> and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>
>  Vinod is helping out on the YARN/MR side and has tagged a number of final
> changes (including some of the final API incompatibilities) we'd like to push
> in before we call hadoop-2.x as ready to be supported (Target Version set
> to 2.0.5-beta):
>  http://s.apache.org/target-hadoop-2.0.5-beta
>  Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> tagged, but their necessity is implied).
>
>  Similarly on HDFS side, can someone please help out by tagging features,
> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> protocols are locked down too - I'd really appreciate it!
>
> thanks,
> Arun
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>

Re: Heads up - 2.0.5-beta

Posted by Roman Shaposhnik <rv...@apache.org>.
On Thu, Apr 25, 2013 at 6:34 PM, Arun C Murthy <ac...@hortonworks.com> wrote:
> Gang,
>
>  With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our hadoop-2.x alphas.
> We have made lots of progress on hadoop-2.x and I believe we are nearly there, exciting times!

Indeed!

>  As we have discussed previously, I hope to do a final push to stabilize hadoop-2.x, release a
> hadoop-2.0.5-beta in the next month or so; and then declare hadoop-2.1 as stable this summer
> after a short period of intensive testing.
>
>  With that in mind, I really want to make a serious push to lock down APIs and wire-protocols for hadoop-2.0.5-beta.
> Thus, we can confidently support hadoop-2.x in a compatible manner in the future. So, it's fine to add new features,
> but please ensure that all APIs are frozen for hadoop-2.0.5-beta

Arun, since it sounds like you have a pretty definite idea
in mind for what you want the 'beta' label to actually mean,
could you please share the exact criteria? Either in the
thread I started a few days ago (http://s.apache.org/da5)
or here.

That would be appreciated!

Thanks,
Roman.

Re: Heads up - 2.0.5-beta

Posted by Amir Sanjar <v1...@us.ibm.com>.
Thanks Steve, will do that.
The last time we built Hadoop with IBM Java 7 was Hadoop 1.0.4. We didn't
notice many problems except a few socket issues that were later fixed in IBM
Java 7 fixpack 3. We are planning to migrate Hadoop 2.x to Java 7 after
June; that will be fun :)

Best Regards
Amir Sanjar

System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Phone# 512-286-8393
Fax#      512-838-8858





From:	Steve Loughran <st...@hortonworks.com>
To:	common-dev@hadoop.apache.org,
Date:	04/30/2013 11:54 AM
Subject:	Re: Heads up - 2.0.5-beta



OK. If you can create a meta-JIRA, something like "Have Hadoop-2/trunk work
on IBM JVM", linking to those things, I'll review and commit these patches
that strip out the Sun-specific code.

BTW, do you build/test on your Java 7 JVM? Because Hadoop-2 doesn't
currently build on Java 7 + Mac without a lot of effort related to jspc and
the hadoop-annotations dependency on com.sun stuff that moved.

On 29 April 2013 19:05, Amir Sanjar <v1...@us.ibm.com> wrote:

> yes Steve.
>
>
> Best Regards
> Amir Sanjar
>
> System Management Architect
> PowerLinux Open Source Hadoop development lead
> IBM Senior Software Engineer
> Phone# 512-286-8393
> Fax#      512-838-8858
>
>
>
> From: Steve Loughran <st...@hortonworks.com>
> To: common-dev@hadoop.apache.org,
> Date: 04/29/2013 05:40 PM
> Subject: Re: Heads up - 2.0.5-beta
> ------------------------------
>
>
>
> you need those patches to remove sun-specific bits in, don't you?
>
>

Re: Heads up - 2.0.5-beta

Posted by Steve Loughran <st...@hortonworks.com>.
OK. If you can create a meta-JIRA, something like "Have Hadoop-2/trunk work
on IBM JVM", linking to those things, I'll review and commit these patches
that strip out the Sun-specific code.

BTW, do you build/test on your Java 7 JVM? Because Hadoop-2 doesn't
currently build on Java 7 + Mac without a lot of effort related to jspc and
the hadoop-annotations dependency on com.sun stuff that moved.

On 29 April 2013 19:05, Amir Sanjar <v1...@us.ibm.com> wrote:

> yes Steve.
>
>
> Best Regards
> Amir Sanjar
>
> System Management Architect
> PowerLinux Open Source Hadoop development lead
> IBM Senior Software Engineer
> Phone# 512-286-8393
> Fax#      512-838-8858
>
>
>
> From: Steve Loughran <st...@hortonworks.com>
> To: common-dev@hadoop.apache.org,
> Date: 04/29/2013 05:40 PM
> Subject: Re: Heads up - 2.0.5-beta
> ------------------------------
>
>
>
> you need those patches to remove sun-specific bits in, don't you?
>
>

Re: Heads up - 2.0.5-beta

Posted by Amir Sanjar <v1...@us.ibm.com>.
yes Steve.

Best Regards
Amir Sanjar

System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Phone# 512-286-8393
Fax#      512-838-8858





From:	Steve Loughran <st...@hortonworks.com>
To:	common-dev@hadoop.apache.org,
Date:	04/29/2013 05:40 PM
Subject:	Re: Heads up - 2.0.5-beta



You need those patches that remove the Sun-specific bits to go in, don't you?

On 25 April 2013 19:23, Amir Sanjar <v1...@us.ibm.com> wrote:

> Arun, thanks for the update. This is indeed the news we (IBM) have been
> waiting for. Please let us know if there is any way
> we can help.
>
> Best Regards
> Amir Sanjar
>
> System Management Architect
> PowerLinux Open Source Hadoop development lead
> IBM Senior Software Engineer
> Phone# 512-286-8393
> Fax#      512-838-8858
>
>
>
> From: Arun C Murthy <ac...@hortonworks.com>
> To: common-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org,
> mapreduce-dev@hadoop.apache.org, yarn-dev@hadoop.apache.org,
> Date: 04/25/2013 08:34 PM
> Subject: Heads up - 2.0.5-beta
> ------------------------------
>
>
>
> Gang,
>
> With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> believe we are nearly there, exciting times!
>
> As we have discussed previously, I hope to do a final push to stabilize
> hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> declare hadoop-2.1 as stable this summer after a short period of intensive
> testing.
>
> With that in mind, I really want to make a serious push to lock down APIs
> and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>
> Vinod is helping out on the YARN/MR side and has tagged a number of final
> changes (including some of the final API incompatibilities) we'd like to push
> in before we call hadoop-2.x as ready to be supported (Target Version set
> to 2.0.5-beta):
> http://s.apache.org/target-hadoop-2.0.5-beta
> Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> tagged, but their necessity is implied).
>
> Similarly on HDFS side, can someone please help out by tagging features,
> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> protocols are locked down too - I'd really appreciate it!
>
> thanks,
> Arun
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>
>

Re: Heads up - 2.0.5-beta

Posted by Steve Loughran <st...@hortonworks.com>.
You need those patches that remove the Sun-specific bits to go in, don't you?

On 25 April 2013 19:23, Amir Sanjar <v1...@us.ibm.com> wrote:

> Arun, thanks for the update. This is indeed the news we (IBM) have been
> waiting for. Please let us know if there is any way
> we can help.
>
> Best Regards
> Amir Sanjar
>
> System Management Architect
> PowerLinux Open Source Hadoop development lead
> IBM Senior Software Engineer
> Phone# 512-286-8393
> Fax#      512-838-8858
>
>
>
> From: Arun C Murthy <ac...@hortonworks.com>
> To: common-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org,
> mapreduce-dev@hadoop.apache.org, yarn-dev@hadoop.apache.org,
> Date: 04/25/2013 08:34 PM
> Subject: Heads up - 2.0.5-beta
> ------------------------------
>
>
>
> Gang,
>
> With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
> hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
> believe we are nearly there, exciting times!
>
> As we have discussed previously, I hope to do a final push to stabilize
> hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
> declare hadoop-2.1 as stable this summer after a short period of intensive
> testing.
>
> With that in mind, I really want to make a serious push to lock down APIs
> and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
> hadoop-2.x in a compatible manner in the future. So, it's fine to add new
> features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta
>
> Vinod is helping out on the YARN/MR side and has tagged a number of final
> changes (including some of the final API incompatibilities) we'd like to push
> in before we call hadoop-2.x as ready to be supported (Target Version set
> to 2.0.5-beta):
> http://s.apache.org/target-hadoop-2.0.5-beta
> Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
> tagged, but their necessity is implied).
>
> Similarly on HDFS side, can someone please help out by tagging features,
> bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
> protocols are locked down too - I'd really appreciate it!
>
> thanks,
> Arun
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>
>

Re: Heads up - 2.0.5-beta

Posted by Amir Sanjar <v1...@us.ibm.com>.
Arun, thanks for the update. This is indeed the news we (IBM) have been
waiting for. Please let us know if there is any way
we can help.

Best Regards
Amir Sanjar

System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Phone# 512-286-8393
Fax#      512-838-8858





From:	Arun C Murthy <ac...@hortonworks.com>
To:	common-dev@hadoop.apache.org, hdfs-dev@hadoop.apache.org,
            mapreduce-dev@hadoop.apache.org, yarn-dev@hadoop.apache.org,
Date:	04/25/2013 08:34 PM
Subject:	Heads up - 2.0.5-beta



Gang,

 With hadoop-2.0.4-alpha released, I'd like 2.0.4 to be the final of our
hadoop-2.x alphas. We have made lots of progress on hadoop-2.x and I
believe we are nearly there, exciting times!

 As we have discussed previously, I hope to do a final push to stabilize
hadoop-2.x, release a hadoop-2.0.5-beta in the next month or so; and then
declare hadoop-2.1 as stable this summer after a short period of intensive
testing.

 With that in mind, I really want to make a serious push to lock down APIs
and wire-protocols for hadoop-2.0.5-beta. Thus, we can confidently support
hadoop-2.x in a compatible manner in the future. So, it's fine to add new
features, but please ensure that all APIs are frozen for hadoop-2.0.5-beta

 Vinod is helping out on the YARN/MR side and has tagged a number of final
changes (including some of the final API incompatibilities) we'd like to push
in before we call hadoop-2.x as ready to be supported (Target Version set
to 2.0.5-beta):
 http://s.apache.org/target-hadoop-2.0.5-beta
 Thanks Vinod! (Note some of the sub-tasks of umbrella jiras may not be
tagged, but their necessity is implied).

 Similarly on HDFS side, can someone please help out by tagging features,
bug-fixes, protocol/API changes etc.? This way we can ensure HDFS APIs &
protocols are locked down too - I'd really appreciate it!

thanks,
Arun


--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/