Posted to general@hadoop.apache.org by Arun C Murthy <ac...@yahoo-inc.com> on 2011/04/08 00:52:44 UTC
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
On Feb 14, 2011, at 1:34 PM, Arun C Murthy wrote:
>
> As the final installment in this process, I've started a discussion on
> us contributing a re-factor of Map-Reduce in
> https://issues.apache.org/jira/browse/MAPREDUCE-279.
Hi Folks,
We wanted to share our thoughts on the co-development of the NextGen
MapReduce branch (Jira MR-279), maintaining branch-0.20-security, and
merging the work on the security branch with trunk.
We've concluded that it does not make sense for us to port a very
small subset of the work from branch-0.20-security to the Hadoop
mainline. The JIRAs we don't plan to port all affect areas of the
mainline that are going to be replaced by work in the NextGen
MapReduce branch
(http://svn.apache.org/viewvc/hadoop/mapreduce/branches/MR-279/).
We've been working on the NextGen MapReduce branch (MAPREDUCE-279)
within Apache for a while now and are excited about its progress. We
think that this branch will be a huge improvement in scalability,
performance and functionality. We are now confident that we can get
it ready for release in the next few months. We believe that the
next major release of Apache Hadoop that we test at Yahoo will include
the work in this branch, and we are committed to merging the NextGen
branch into the mainline after the PMC approves the merge.
Meanwhile, we have continued to find and fix bugs on
branch-0.20-security and have been working to port that work into the
Hadoop mainline. Most of this work is done, and we've also brought all
the patches from our GitHub branch into Apache Subversion, so that it
is easy for everyone to see the work remaining. What we've found is
that some of the work in branch-0.20-security is in code sections that
have been completely replaced or refactored in the NextGen MapReduce
branch. Since we are committed to the NextGen branch, we don't think
there is any upside in porting this code into portions of mainline we
expect to discard. All of these JIRAs will be fixed in the NextGen
MapReduce branch and, from there, ultimately in trunk (assuming the
PMC approves the merge).
So at this point it is our intent to not port the JIRAs listed above
to trunk, but to wait until we merge NextGen into trunk to resolve
these issues there. If you are interested in seeing these issues
ported to mainline, let us know. We are happy to help review your
patches and explain context to anyone who is interested in doing this
work.
Arun and Eric
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Posted by Arun C Murthy <ac...@yahoo-inc.com>.
On Apr 8, 2011, at 11:08 AM, Todd Lipcon wrote:
> These all have patches that are pretty small, and I'd imagine would
> apply pretty easily to trunk. Let me know if you'd like any help
> forward-porting.
>
Thanks Todd, I'm happy to help review etc.
> The other ones, as new features/improvements, I'd agree it makes
> sense not to waste effort re-implementing them for trunk MR, but
> rather to make sure they're incorporated in next-gen.
Yep, exactly. Glad to know it makes sense.
thanks,
Arun
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Posted by Eric Baldeschwieler <er...@yahoo-inc.com>.
Thanks Todd, your help with the JIRAs you identified would be welcome!
---
E14 - typing on glass
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Posted by Todd Lipcon <to...@cloudera.com>.
On Fri, Apr 8, 2011 at 10:34 AM, Arun C Murthy <ac...@yahoo-inc.com> wrote:
>
> On Apr 7, 2011, at 4:22 PM, Todd Lipcon wrote:
>
>> Is there a list available of which patches you've made this decision
>> about? I'm curious, for example, about MAPREDUCE-2178 -- as of today, the MR
>> security in trunk has a serious vulnerability. Do we plan on fixing it, or
>> will the answer be that, if anyone needs security, they must update to "MR
>> Next Gen"?
>>
>
> Apologies if my original message was abstruse - I want to ensure that there
> is no confusion between 'forward-port' and 'merge from yahoo-merge branch'.
>
> Let me try to explain again: there are several forward ports from the
> hadoop-0.20.2xx (branch-0.20-security) which are complete, including
> MAPREDUCE-2178. They are currently part of the 'yahoo-merge' branch in
> MapReduce. These are awaiting a merge into trunk. Trunk (with a few merges
> from yahoo-merge) will have a complete security implementation.
>
Ah, OK, I see. That makes sense.
>
> My message was intended to highlight some small number of features/bugs
> which are/will-be in hadoop-0.20.2xx. Here is a nearly complete list of such
> jiras: MAPREDUCE-517, MAPREDUCE-1872, MAPREDUCE-291, MAPREDUCE-2418,
> MAPREDUCE-2409, MAPREDUCE-2411. I'll check to ensure there aren't others.
>
>
>
Looking briefly at those, it seems that the ones that are clear bugs (with
small fixes) should be put in the current MR implementation:
MAPREDUCE-2411
MAPREDUCE-2409
MAPREDUCE-2418 (maybe)
These all have patches that are pretty small, and I'd imagine would apply
pretty easily to trunk. Let me know if you'd like any help forward-porting.
The other ones, as new features/improvements, I'd agree it makes sense not
to waste effort re-implementing them for trunk MR, but rather to make sure
they're incorporated in next-gen.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Posted by Arun C Murthy <ac...@yahoo-inc.com>.
Todd,
On Apr 7, 2011, at 4:22 PM, Todd Lipcon wrote:
> Is there a list available of which patches you've made this decision
> about? I'm curious, for example, about MAPREDUCE-2178 -- as of
> today, the MR security in trunk has a serious vulnerability. Do we
> plan on fixing it, or will the answer be that, if anyone needs
> security, they must update to "MR Next Gen"?
Apologies if my original message was abstruse - I want to ensure that
there is no confusion between 'forward-port' and 'merge from
yahoo-merge branch'.
Let me try to explain again: there are several forward ports from the
hadoop-0.20.2xx (branch-0.20-security) which are complete, including
MAPREDUCE-2178. They are currently part of the 'yahoo-merge' branch in
MapReduce. These are awaiting a merge into trunk. Trunk (with a few
merges from yahoo-merge) will have a complete security implementation.
My message was intended to highlight a small number of
features/bugs which are/will-be in hadoop-0.20.2xx. Here is a nearly
complete list of such JIRAs: MAPREDUCE-517, MAPREDUCE-1872,
MAPREDUCE-291, MAPREDUCE-2418, MAPREDUCE-2409, MAPREDUCE-2411. I'll
check to ensure there aren't others.
Hope that makes sense. Again, apologies for any confusion I've caused.
thanks,
Arun
Re: [ANNOUNCEMENT] Yahoo focusing on Apache Hadoop, discontinuing "The Yahoo Distribution of Hadoop"
Posted by Todd Lipcon <to...@cloudera.com>.
Is there a list available of which patches you've made this decision about?
I'm curious, for example, about MAPREDUCE-2178 -- as of today, the MR
security in trunk has a serious vulnerability. Do we plan on fixing it, or
will the answer be that, if anyone needs security, they must update to "MR
Next Gen"?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera