Posted to dev@accumulo.apache.org by Sean Busbey <bu...@clouderagovt.com> on 2013/11/12 15:22:53 UTC

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

On Fri, Oct 18, 2013 at 12:29 AM, Sean Busbey <bu...@cloudera.com> wrote:

> On Tue, Oct 15, 2013 at 10:20 AM, Sean Busbey <bu...@cloudera.com> wrote:
>
>>
>> On Tue, Oct 15, 2013 at 10:16 AM, Sean Busbey <bu...@cloudera.com> wrote:
>>
>>>
>>> On Tue, Oct 15, 2013 at 7:16 AM, <dl...@comcast.net> wrote:
>>>
>>>> Just to be clear, we are talking about adding profile support to the
>>>> POMs for Hadoop 2.2.0 for a 1.4.5 and 1.5.1 release, correct? We are not
>>>> talking about changing the default build profile for these branches, are we?
>>>>
>>>>
>>>>
>>> For 1.4.5-SNAPSHOT I am only talking about adding support for Hadoop 2.2.0.
>>> I am not suggesting we change the default from building against Hadoop
>>> 0.23.203.
>>>
>>>
>>>
>> I mean 0.20.203.0. Ugh, Hadoop versions.
>>
>>
>
> Okay, barring additional suggestions, tomorrow afternoon I'll break things
> down into an umbrella and 3 sub tasks:
>
> 1) addition of hadoop 2 support
>
>  - to include backports of commits
>  - to include making the target hadoop 2 version 2.2.0
>  - to include test changes that flex hadoop 2 features like fail over
>
> 2) ensuring compatibility for 0.20.203
>
> - presuming some subset of the commits in 1) will break it since 0.20
> support was left behind in 1.5
>
> 3) doc / packaging updates
>
> - the issue of binary releases per distro
> - doc patch for what version(s) the release tests are expected to run
> against
>
> Once work is put against those tickets, I'd expect things to go into a
> branch based on the umbrella ticket until such time as the complete work
> can pass the test suite that we'll use at the next release. Then it can get
> rebased onto the 1.4.x dev branch.
>
> --
> Sean
>

Based on recent feedback on ACCUMULO-1792 and ACCUMULO-1795, I want to
resurrect this thread to make sure everyone's concerns are addressed.

For context, here's a link to the start of the last thread:

http://bit.ly/1aPqKuH

From ACCUMULO-1792, ctubbsii:

> I'd be reluctant to support any Hadoop 2.x support in the 1.4 release
> line that breaks compatibility with 0.20. I don't think breaking 0.20
> and then possibly fixing it again as a second step is acceptable (because
> that subsequent work may not ever be done, and I don't think
> we should break the compatibility contract that we've established with
> 1.4.0).

Chris, I believe keeping all of the work in a branch under the umbrella
jira of ACCUMULO-1790 will ensure that we don't end up with a 1.4 release
that doesn't have proper support for 0.20.203.

Is there something beyond making sure the branch passes a full set of
release tests on 0.20.203 that you'd like to see? In the event that the
branch only ever contains the work for adding Hadoop 2, it's a simple
matter to abandon without rolling into the 1.4 development line.

From ACCUMULO-1795, bills (and +1ed by elserj and ctubbsii):

> I'm very uncomfortable with risking breaking continuity in such an old
> release, and I don't think managing two lines of 1.4 releases is
> worth the effort. Though we have no official EOL policy, 1.3 was
> practically dead in the water once 1.4 was around, and I hope we start
> encouraging more adoption of 1.5 (and soon 1.6) versus continually
> propping up 1.4.

I'd love to get people to move off of 1.4. However, I think adding Hadoop 2
support to 1.4 encourages this more than leaving it out.

Accumulo 1.5.x places a higher burden on HDFS than 1.4 did, and I'm not
surprised people find relying on 0.20 for the 1.5 WAL intimidating.
Upgrading both HDFS and Accumulo across major versions at once is asking
them to take on a bunch of risk. By adding in Hadoop 2 support to 1.4 we
allow them to break the risk up into steps: they can upgrade HDFS versions
first, get comfortable, then upgrade Accumulo to 1.5.

I think the existing tickets under the umbrella of ACCUMULO-1790 should
ensure that we end up with a single 1.4 line that can work with either the
existing 0.20.203.0 claimed in releases or against 2.2.0.

Bill (or Josh or Chris), is there stronger language you'd like to see
around docs / packaging (area #3 in the original plan and currently
ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
0.20.203.0? Are you looking for something beyond a full release suite to
ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?


-Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Christopher <ct...@apache.org>.
Nope, I think we're on the same page now.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Nov 14, 2013 at 7:39 PM, Sean Busbey <bu...@clouderagovt.com> wrote:
> On Thu, Nov 14, 2013 at 6:27 PM, Christopher <ct...@apache.org> wrote:
>
>> The main thing is that I would not want to see an ACCUMULO-1790
>> *without* ACCUMULO-1795. Having 1792 alone would be insufficient for
>> me.
>>
>>
> That is precisely the intention of ACCUMULO-1790. All of the subtasks
> (including ACCUMULO-1792 and ACCUMULO-1795) have to be complete for things
> to get into the 1.4 branch. Until that time the work would just go into a
> feature branch for ACCUMULO-1790 (to make working and testing easier for
> those implementing the subtasks). If you wanted to see the full
> implementation you would just wait until all of the subtasks were committed
> to the feature branch.
>
> Am I missing something?
>
>
> --
> Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Sean Busbey <bu...@clouderagovt.com>.
On Thu, Nov 14, 2013 at 6:27 PM, Christopher <ct...@apache.org> wrote:

> The main thing is that I would not want to see an ACCUMULO-1790
> *without* ACCUMULO-1795. Having 1792 alone would be insufficient for
> me.
>
>
That is precisely the intention of ACCUMULO-1790. All of the subtasks
(including ACCUMULO-1792 and ACCUMULO-1795) have to be complete for things
to get into the 1.4 branch. Until that time the work would just go into a
feature branch for ACCUMULO-1790 (to make working and testing easier for
those implementing the subtasks). If you wanted to see the full
implementation you would just wait until all of the subtasks were committed
to the feature branch.

Am I missing something?


-- 
Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Christopher <ct...@apache.org>.
The main thing is that I would not want to see an ACCUMULO-1790
*without* ACCUMULO-1795. Having 1792 alone would be insufficient for
me.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Tue, Nov 12, 2013 at 9:22 AM, Sean Busbey <bu...@clouderagovt.com> wrote:
> On Fri, Oct 18, 2013 at 12:29 AM, Sean Busbey <bu...@cloudera.com> wrote:
>
>> On Tue, Oct 15, 2013 at 10:20 AM, Sean Busbey <bu...@cloudera.com> wrote:
>>
>>>
>>> On Tue, Oct 15, 2013 at 10:16 AM, Sean Busbey <bu...@cloudera.com> wrote:
>>>
>>>>
>>>> On Tue, Oct 15, 2013 at 7:16 AM, <dl...@comcast.net> wrote:
>>>>
>>>>> Just to be clear, we are talking about adding profile support to the
>>>>> POMs for Hadoop 2.2.0 for a 1.4.5 and 1.5.1 release, correct? We are not
>>>>> talking about changing the default build profile for these branches, are we?
>>>>>
>>>>>
>>>>>
>>>> For 1.4.5-SNAPSHOT I am only talking about adding support for Hadoop 2.2.0.
>>>> I am not suggesting we change the default from building against Hadoop
>>>> 0.23.203.
>>>>
>>>>
>>>>
>>> I mean 0.20.203.0. Ugh, Hadoop versions.
>>>
>>>
>>
>> Okay, barring additional suggestions, tomorrow afternoon I'll break things
>> down into an umbrella and 3 sub tasks:
>>
>> 1) addition of hadoop 2 support
>>
>>  - to include backports of commits
>>  - to include making the target hadoop 2 version 2.2.0
>>  - to include test changes that flex hadoop 2 features like fail over
>>
>> 2) ensuring compatibility for 0.20.203
>>
>> - presuming some subset of the commits in 1) will break it since 0.20
>> support was left behind in 1.5
>>
>> 3) doc / packaging updates
>>
>> - the issue of binary releases per distro
>> - doc patch for what version(s) the release tests are expected to run
>> against
>>
>> Once work is put against those tickets, I'd expect things to go into a
>> branch based on the umbrella ticket until such time as the complete work
>> can pass the test suite that we'll use at the next release. Then it can get
>> rebased onto the 1.4.x dev branch.
>>
>> --
>> Sean
>>
>
> Based on recent feedback on ACCUMULO-1792 and ACCUMULO-1795, I want to
> resurrect this thread to make sure everyone's concerns are addressed.
>
> For context, here's a link to the start of the last thread:
>
> http://bit.ly/1aPqKuH
>
> From ACCUMULO-1792, ctubbsii:
>
>> I'd be reluctant to support any Hadoop 2.x support in the 1.4 release
>> line that breaks compatibility with 0.20. I don't think breaking 0.20
>> and then possibly fixing it again as a second step is acceptable (because
>> that subsequent work may not ever be done, and I don't think
>> we should break the compatibility contract that we've established with
>> 1.4.0).
>
> Chris, I believe keeping all of the work in a branch under the umbrella
> jira of ACCUMULO-1790 will ensure that we don't end up with a 1.4 release
> that doesn't have proper support for 0.20.203.
>
> Is there something beyond making sure the branch passes a full set of
> release tests on 0.20.203 that you'd like to see? In the event that the
> branch only ever contains the work for adding Hadoop 2, it's a simple
> matter to abandon without rolling into the 1.4 development line.
>
> From ACCUMULO-1795, bills (and +1ed by elserj and ctubbsii):
>
>> I'm very uncomfortable with risking breaking continuity in such an old
>> release, and I don't think managing two lines of 1.4 releases is
>> worth the effort. Though we have no official EOL policy, 1.3 was
>> practically dead in the water once 1.4 was around, and I hope we start
>> encouraging more adoption of 1.5 (and soon 1.6) versus continually
>> propping up 1.4.
>
> I'd love to get people to move off of 1.4. However, I think adding Hadoop 2
> support to 1.4 encourages this more than leaving it out.
>
> Accumulo 1.5.x places a higher burden on HDFS than 1.4 did, and I'm not
> surprised people find relying on 0.20 for the 1.5 WAL intimidating.
> Upgrading both HDFS and Accumulo across major versions at once is asking
> them to take on a bunch of risk. By adding in Hadoop 2 support to 1.4 we
> allow them to break the risk up into steps: they can upgrade HDFS versions
> first, get comfortable, then upgrade Accumulo to 1.5.
>
> I think the existing tickets under the umbrella of ACCUMULO-1790 should
> ensure that we end up with a single 1.4 line that can work with either the
> existing 0.20.203.0 claimed in releases or against 2.2.0.
>
> Bill (or Josh or Chris), is there stronger language you'd like to see
> around docs / packaging (area #3 in the original plan and currently
> ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
> 0.20.203.0? Are you looking for something beyond a full release suite to
> ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?
>
>
> -Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Christopher <ct...@apache.org>.
On Tue, Nov 12, 2013 at 4:49 PM, Sean Busbey <bu...@clouderagovt.com> wrote:
> On Tue, Nov 12, 2013 at 3:14 PM, William Slacum <
> wilhelm.von.cloud@accumulo.net> wrote:
>
>> The language of ACCUMULO-1795 indicated that an acceptable state was
>> something that wasn't binary compatible. That's my #1 thing to avoid.
>>
>>
> Ah. So I see, not sure why I phrased that that way. Since the default build
> should still be 0.20.203.0, I'm not sure how it'd end up not being binary
> compatible. I can update the ticket to clarify the language. Any need to
> compile should be limited to running Hadoop 2.2.0.
>
> Sound good?

+1
(The confusing wording was the basis for my concerns also.)

>> > Maybe expressly only doing a binary convenience package for
>> > 0.20.203.0?
>>
>> If we need an extra package, doesn't that mean a user can't just upgrade
>> Accumulo?
>>
>
> By "binary convenience package" I mean the binary distribution tarball (or
> rpms, or whatevs) that we make as a part of the release process. For users
> of Hadoop 0.20.203.0, upgrading should be unchanged from how they would
> normally get their Accumulo 1.4.x distribution.
>
> ACCUMULO-1796 has some leeway about the convenience packages for people who
> want Hadoop 2 support. On the extreme end, they'd have to build from source
> and then run a normal upgrade process.

I'd prefer binary compatibility with a single build, but if that's too
hard to achieve, I have no objection to providing a mechanism to
perform an alternate build against 2.x (whether or not we provide a
pre-built binary package for it), so long as the default build is
0.20.x.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Sean Busbey <bu...@clouderagovt.com>.
On Tue, Nov 12, 2013 at 3:14 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> The language of ACCUMULO-1795 indicated that an acceptable state was
> something that wasn't binary compatible. That's my #1 thing to avoid.
>
>
Ah. So I see, not sure why I phrased that that way. Since the default build
should still be 0.20.203.0, I'm not sure how it'd end up not being binary
compatible. I can update the ticket to clarify the language. Any need to
compile should be limited to running Hadoop 2.2.0.

Sound good?



> > Maybe expressly only doing a binary convenience package for
> > 0.20.203.0?
>
> If we need an extra package, doesn't that mean a user can't just upgrade
> Accumulo?
>

By "binary convenience package" I mean the binary distribution tarball (or
rpms, or whatevs) that we make as a part of the release process. For users
of Hadoop 0.20.203.0, upgrading should be unchanged from how they would
normally get their Accumulo 1.4.x distribution.

ACCUMULO-1796 has some leeway about the convenience packages for people who
want Hadoop 2 support. On the extreme end, they'd have to build from source
and then run a normal upgrade process.

-- 
Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by William Slacum <wi...@accumulo.net>.
herp, ignore that "As a side note" line. I was crafting something else and
came back later. Forgot to clean it up.


On Tue, Nov 12, 2013 at 4:14 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> The language of ACCUMULO-1795 indicated that an acceptable state was
> something that wasn't binary compatible. That's my #1 thing to avoid.
>
> > Maybe expressly only doing a binary convenience package for
> > 0.20.203.0?
>
> If we need an extra package, doesn't that mean a user can't just upgrade
> Accumulo?
>
> As a side note, 0.20.203.0 is 1.4,
>
>
> On Tue, Nov 12, 2013 at 3:28 PM, Sean Busbey <bu...@clouderagovt.com> wrote:
>
>> On Tue, Nov 12, 2013 at 1:28 PM, William Slacum <
>> wilhelm.von.cloud@accumulo.net> wrote:
>>
>> > A user of 1.4.a should be able to move to 1.4.b without any "major"
>> > infrastructure changes, such as swapping out HDFS or installing extra
>> > add-ons.
>> >
>> >
>>
>> Right, exactly. Hopefully no part of the original plan contradicts this. Is
>> there something that appears to?
>>
>>
>> --
>> Sean
>>
>
>

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by William Slacum <wi...@accumulo.net>.
The language of ACCUMULO-1795 indicated that an acceptable state was
something that wasn't binary compatible. That's my #1 thing to avoid.

> Maybe expressly only doing a binary convenience package for
> 0.20.203.0?

If we need an extra package, doesn't that mean a user can't just upgrade
Accumulo?

As a side note, 0.20.203.0 is 1.4,

On Tue, Nov 12, 2013 at 3:28 PM, Sean Busbey <bu...@clouderagovt.com> wrote:

> On Tue, Nov 12, 2013 at 1:28 PM, William Slacum <
> wilhelm.von.cloud@accumulo.net> wrote:
>
> > A user of 1.4.a should be able to move to 1.4.b without any "major"
> > infrastructure changes, such as swapping out HDFS or installing extra
> > add-ons.
> >
> >
>
> Right, exactly. Hopefully no part of the original plan contradicts this. Is
> there something that appears to?
>
>
> --
> Sean
>

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Sean Busbey <bu...@clouderagovt.com>.
On Tue, Nov 12, 2013 at 1:28 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> A user of 1.4.a should be able to move to 1.4.b without any "major"
> infrastructure changes, such as swapping out HDFS or installing extra
> add-ons.
>
>

Right, exactly. Hopefully no part of the original plan contradicts this. Is
there something that appears to?


-- 
Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by William Slacum <wi...@accumulo.net>.
A user of 1.4.a should be able to move to 1.4.b without any "major"
infrastructure changes, such as swapping out HDFS or installing extra
add-ons.

I don't find much merit in debating local WAL vs HDFS WAL cost/benefit
since the only quantifiable evidence we have supported the move.

I should note, Sean, that if you see merit in the work, you don't need
community approval for forking and sharing. However, I do not think it is
in the community's best interest to continue to upgrade 1.4.



On Tue, Nov 12, 2013 at 2:12 PM, Josh Elser <jo...@gmail.com> wrote:

>
>> Based on recent feedback on ACCUMULO-1792 and ACCUMULO-1795, I want to
>> resurrect this thread to make sure everyone's concerns are addressed.
>>
>> For context, here's a link to the start of the last thread:
>>
>> http://bit.ly/1aPqKuH
>>
>> From ACCUMULO-1792, ctubbsii:
>>
>>> I'd be reluctant to support any Hadoop 2.x support in the 1.4 release
>>> line that breaks compatibility with 0.20. I don't think breaking 0.20
>>> and then possibly fixing it again as a second step is acceptable (because
>>> that subsequent work may not ever be done, and I don't think
>>> we should break the compatibility contract that we've established with
>>> 1.4.0).
>>
>> Chris, I believe keeping all of the work in a branch under the umbrella
>> jira of ACCUMULO-1790 will ensure that we don't end up with a 1.4 release
>> that doesn't have proper support for 0.20.203.
>>
>> Is there something beyond making sure the branch passes a full set of
>> release tests on 0.20.203 that you'd like to see? In the event that the
>> branch only ever contains the work for adding Hadoop 2, it's a simple
>> matter to abandon without rolling into the 1.4 development line.
>>
>> From ACCUMULO-1795, bills (and +1ed by elserj and ctubbsii):
>>
>>> I'm very uncomfortable with risking breaking continuity in such an old
>>> release, and I don't think managing two lines of 1.4 releases is
>>> worth the effort. Though we have no official EOL policy, 1.3 was
>>> practically dead in the water once 1.4 was around, and I hope we start
>>> encouraging more adoption of 1.5 (and soon 1.6) versus continually
>>> propping up 1.4.
>>
>> I'd love to get people to move off of 1.4. However, I think adding Hadoop 2
>> support to 1.4 encourages this more than leaving it out.
>>
>
> I'm not sure I agree that adding Hadoop2 support to 1.4 encourages people
> to upgrade Accumulo. My gut reaction would be that it allows people to
> completely ignore Accumulo updates (ignoring moving to 1.4.5 which would
> allow them to do hadoop2 with your proposed changes)
>
>
>> Accumulo 1.5.x places a higher burden on HDFS than 1.4 did, and I'm not
>> surprised people find relying on 0.20 for the 1.5 WAL intimidating.
>> Upgrading both HDFS and Accumulo across major versions at once is asking
>> them to take on a bunch of risk. By adding in Hadoop 2 support to 1.4 we
>> allow them to break the risk up into steps: they can upgrade HDFS versions
>> first, get comfortable, then upgrade Accumulo to 1.5.
>>
>
> Personally, maintaining 0.20 compatibility is not a big concern on my
> radar. If you're still running an 0.20 release, I'd *really* hope that you
> have an upgrade path to 1.2.x (if not 2.2.x) scheduled.
>
> I think claiming that 1.5 places a higher burden on HDFS than 1.4 is a bit
> of a fallacy. There were many problems and pains regarding WALs in <=1.4 that
> are very difficult to work with in a large environment (try finding WALs in
> server failure cases). I think the increased I/O on HDFS is a much smaller
> cost than the completely different I/O path that the old loggers have.
>
> I also think upgrading Accumulo is much less scary than upgrading HDFS,
> but that's just me.
>
> To me, it seems like the argument may be coming down to whether or not we
> break 0.20 hadoop compatibility on a bug-fix release and how concerned we
> are about letting users lag behind the upstream development.
>
>
>> I think the existing tickets under the umbrella of ACCUMULO-1790 should
>> ensure that we end up with a single 1.4 line that can work with either the
>> existing 0.20.203.0 claimed in releases or against 2.2.0.
>>
>> Bill (or Josh or Chris), is there stronger language you'd like to see
>> around docs / packaging (area #3 in the original plan and currently
>> ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
>> 0.20.203.0? Are you looking for something beyond a full release suite to
>> ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?
>>
>>
> Again, my biggest concern here is not following our own guidelines of
> breaking changes across minor releases, but I'd hope 0.20 users have an
> upgrade path outlined for themselves.
>

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Josh Elser <jo...@gmail.com>.
On 11/12/13, 1:26 PM, Sean Busbey wrote:
> On Tue, Nov 12, 2013 at 2:48 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> What about the other half: "encouraging" users to lag (soon to be) two
>> major releases behind?
>>
>
>
> I don't think our current user base needs to be encouraged strongly to
> upgrade. And as I said previously I think this change provides them with an
> upgrade path that's easier to stomach, but I suspect this is a point we
> disagree on.
>

Yes, I believe that disagreement to be accurate :). IMO, HDFS WALs are 
themselves a sufficient feature to strongly encourage users to upgrade 
from 1.4 to 1.5 when running Accumulo in a production environment.

However, I do want to say that my feelings are not solely influenced by 
1.5 features that 1.4 does not have. Providing avenues for users to use 
old versions of code is extremely degrading to the morale of developers 
who are trying to innovate and make things better upstream.

Having to repeatedly answer questions about why things are the way they 
are or why they are sub-optimal with "If you upgraded, it wouldn't be an 
issue" can be very depressing to a growing community. There are multiple 
amazing devs who have invested crazy units of effort into making 1.5 and 
(soon) 1.6 be a reality -- I want to make sure as a community we 
encourage our users to reap the benefits of that work, and let 
aforementioned devs see successes from their effort.

Also, users still aren't precluded from using 1.4 if they so choose (and 
if critical bugs are found, it's very likely for them to be patched in 
1.4). I don't see it as unreasonable to force users to upgrade to a 
newer version of Accumulo to use a newer version of Hadoop.

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Sean Busbey <bu...@clouderagovt.com>.
On Tue, Nov 12, 2013 at 2:48 PM, Josh Elser <jo...@gmail.com> wrote:

> What about the other half: "encouraging" users to lag (soon to be) two
> major releases behind?
>


I don't think our current user base needs to be encouraged strongly to
upgrade. And as I said previously I think this change provides them with an
upgrade path that's easier to stomach, but I suspect this is a point we
disagree on.

-- 
Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Josh Elser <jo...@gmail.com>.
On 11/12/13, 12:24 PM, Sean Busbey wrote:
> On Tue, Nov 12, 2013 at 1:12 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> To me, it seems like the argument may be coming down to whether or not we
>> break 0.20 hadoop compatibility on a bug-fix release and how concerned we
>> are about letting users lag behind the upstream development.
>>
>>
>>> I think the existing tickets under the umbrella of ACCUMULO-1790 should
>>> ensure that we end up with a single 1.4 line that can work with either the
>>> existing 0.20.203.0 claimed in releases or against 2.2.0.
>>>
>>> Bill (or Josh or Chris), is there stronger language you'd like to see
>>> around docs / packaging (area #3 in the original plan and currently
>>> ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
>>> 0.20.203.0? Are you looking for something beyond a full release suite to
>>> ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?
>>>
>>>
>> Again, my biggest concern here is not following our own guidelines of
>> breaking changes across minor releases, but I'd hope 0.20 users have an
>> upgrade path outlined for themselves.
>>
>
>
> The plan outlined in the original thread, and in the subtasks under
> ACCUMULO-1790, is expressly aimed at not breaking 0.20 compatibility in the
> 1.4 bugfix line. If there's anything we can do besides running through the
> release test suite on a 0.20 cluster to help ensure that, I am interested
> in adding it to the existing plan.
>
>

What about the other half: "encouraging" users to lag (soon to be) two 
major releases behind?

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Sean Busbey <bu...@clouderagovt.com>.
On Tue, Nov 12, 2013 at 1:12 PM, Josh Elser <jo...@gmail.com> wrote:

> To me, it seems like the argument may be coming down to whether or not we
> break 0.20 hadoop compatibility on a bug-fix release and how concerned we
> are about letting users lag behind the upstream development.
>
>
>> I think the existing tickets under the umbrella of ACCUMULO-1790 should
>> ensure that we end up with a single 1.4 line that can work with either the
>> existing 0.20.203.0 claimed in releases or against 2.2.0.
>>
>> Bill (or Josh or Chris), is there stronger language you'd like to see
>> around docs / packaging (area #3 in the original plan and currently
>> ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
>> 0.20.203.0? Are you looking for something beyond a full release suite to
>> ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?
>>
>>
> Again, my biggest concern here is not following our own guidelines of
> breaking changes across minor releases, but I'd hope 0.20 users have an
> upgrade path outlined for themselves.
>


The plan outlined in the original thread, and in the subtasks under
ACCUMULO-1790, is expressly aimed at not breaking 0.20 compatibility in the
1.4 bugfix line. If there's anything we can do besides running through the
release test suite on a 0.20 cluster to help ensure that, I am interested
in adding it to the existing plan.


-- 
Sean

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

Posted by Josh Elser <jo...@gmail.com>.
>
> Based on recent feedback on ACCUMULO-1792 and ACCUMULO-1795, I want to
> resurrect this thread to make sure everyone's concerns are addressed.
>
> For context, here's a link to the start of the last thread:
>
> http://bit.ly/1aPqKuH
>
> From ACCUMULO-1792, ctubbsii:
>
>> I'd be reluctant to support any Hadoop 2.x support in the 1.4 release
>> line that breaks compatibility with 0.20. I don't think breaking 0.20
>> and then possibly fixing it again as a second step is acceptable (because
>> that subsequent work may not ever be done, and I don't think
>> we should break the compatibility contract that we've established with
>> 1.4.0).
>
> Chris, I believe keeping all of the work in a branch under the umbrella
> jira of ACCUMULO-1790 will ensure that we don't end up with a 1.4 release
> that doesn't have proper support for 0.20.203.
>
> Is there something beyond making sure the branch passes a full set of
> release tests on 0.20.203 that you'd like to see? In the event that the
> branch only ever contains the work for adding Hadoop 2, it's a simple
> matter to abandon without rolling into the 1.4 development line.
>
> From ACCUMULO-1795, bills (and +1ed by elserj and ctubbsii):
>
>> I'm very uncomfortable with risking breaking continuity in such an old
>> release, and I don't think managing two lines of 1.4 releases is
>> worth the effort. Though we have no official EOL policy, 1.3 was
>> practically dead in the water once 1.4 was around, and I hope we start
>> encouraging more adoption of 1.5 (and soon 1.6) versus continually
>> propping up 1.4.
>
> I'd love to get people to move off of 1.4. However, I think adding Hadoop 2
> support to 1.4 encourages this more than leaving it out.

I'm not sure I agree that adding Hadoop2 support to 1.4 encourages 
people to upgrade Accumulo. My gut reaction would be that it allows 
people to completely ignore Accumulo updates (ignoring moving to 1.4.5 
which would allow them to do hadoop2 with your proposed changes)

> Accumulo 1.5.x places a higher burden on HDFS than 1.4 did, and I'm not
> surprised people find relying on 0.20 for the 1.5 WAL intimidating.
> Upgrading both HDFS and Accumulo across major versions at once is asking
> them to take on a bunch of risk. By adding in Hadoop 2 support to 1.4 we
> allow them to break the risk up into steps: they can upgrade HDFS versions
> first, get comfortable, then upgrade Accumulo to 1.5.

Personally, maintaining 0.20 compatibility is not a big concern on my 
radar. If you're still running an 0.20 release, I'd *really* hope that 
you have an upgrade path to 1.2.x (if not 2.2.x) scheduled.

I think claiming that 1.5 places a higher burden on HDFS than 1.4 is a bit
of a fallacy. There were many problems and pains regarding WALs in <=1.4 that
are very difficult to work with in a large environment (try finding WALs 
in server failure cases). I think the increased I/O on HDFS is a much 
smaller cost than the completely different I/O path that the old loggers 
have.

I also think upgrading Accumulo is much less scary than upgrading HDFS, 
but that's just me.

To me, it seems like the argument may be coming down to whether or not 
we break 0.20 hadoop compatibility on a bug-fix release and how 
concerned we are about letting users lag behind the upstream development.

> I think the existing tickets under the umbrella of ACCUMULO-1790 should
> ensure that we end up with a single 1.4 line that can work with either the
> existing 0.20.203.0 claimed in releases or against 2.2.0.
>
> Bill (or Josh or Chris), is there stronger language you'd like to see
> around docs / packaging (area #3 in the original plan and currently
> ACCUMULO-1796)? Maybe expressly only doing a binary convenience package for
> 0.20.203.0? Are you looking for something beyond a full release suite to
> ensure 1.4 is still maintaining compatibility on Hadoop 0.20.203?
>

Again, my biggest concern here is not following our own guidelines of 
breaking changes across minor releases, but I'd hope 0.20 users have an 
upgrade path outlined for themselves.