Posted to mapreduce-dev@hadoop.apache.org by sanjay Radia <sa...@gmail.com> on 2018/03/01 01:10:11 UTC

Re: [VOTE] Merging branch HDFS-7240 to trunk

Andrew, thanks for your response.

1) With respect to NN on top of HDSL: you raised the issue of FSN lock separation. This was a key issue we discussed heavily in the past in the context of “Show the community a way to connect NN into the new block layer”. We heard you clearly, thought deeply, and showed how NN can be put on top of HDSL WITHOUT removing the FSN lock. We described this in detail in HDFS-10419 and also in the summary of the DISCUSSION thread:
 ---- Milestone 1 (no removal of the FSN lock) gives almost 2x scalability and does not require separating the FSN lock, while milestone 2, which removes the FSN lock, gives 2x scalability.

You have conveniently ignored this. Let me reemphasize: removing the FSN lock is not necessary for NN/HDFS to benefit from HDSL, and you get almost the same scalability benefit without it. Hence the FSN lock issue is moot.

2) You have also conveniently ignored our arguments, given in the vote and in the discussion thread summary, that there is benefit in keeping HDSL and HDFS together:
  A) Side by side usage and resulting operational concerns
>>"In the short term and medium term, the new system and HDFS
>> will be used side-by-side by users. ……  
>> During this time, sharing the DN daemon and admin functions
>> between the two systems is operationally important”

   B) Sharing code 
>>"Need to easily share the block layer code between the two systems
>> when used side-by-side. Areas where sharing code is desired over time: 
>>  - Sharing new block layer’s  new netty based protocol engine
>>     for old HDFS DNs (a long time sore issue for HDFS block layer). 
>> - Shallow data copy from old system to new system is practical
>> only if within same project and daemon otherwise have to deal
>> with security setting and coordinations across daemons.
>> Shallow copy is useful as customers migrate from old to new.
>> - Shared disk scheduling in the future"



3) You argue for a separate project from 2 conflicting arguments: (1) Separate then merge later, what’s the hurry. (2) Keep separate and focus on non-HDFS storage use cases. The HDFS community members built HDSL to address HDFS scalability; they were not trying to go after object store users or that market (Ceph etc.). As explained multiple times, OzoneFS is an intermediate step to stabilize HDSL, but it is of immediate value for apps such as Hive and Spark. So even if there might be value in being separate (your motivation 2) and going after new storage use cases, the HDFS community members that built HDSL want to focus on improving HDFS; you may not agree with that, but the engineers who are writing the code should be able to drive the direction. Further, look at the security design we posted: it shows a Hadoop/HDFS focus, not a focus on some other object store market. It fits into the Hadoop security model, especially supporting the use case of jobs and the resulting need to support delegation tokens.

4) You argue that the HDSL and OzoneFS modules are separate and therefore should go into a separate project. Looks like one can’t win here: damned if you do and damned if you don’t. In the discussion with the Cloudera team, one of the issues raised was that there is a lot of new code and it will destabilize HDFS. We explained that we have kept the code in separate modules so that it will not impact current HDFS stability, and that features like HDSL’s new protocol engine will be plugged into the old HDFS block layer only after stabilization. You argue for stability and hence separate modules, and then use that separation against us to push it out as a separate project.

sanjay


> On Feb 28, 2018, at 12:10 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Resending since the formatting was messed up, let's try plain text this
> time:
> 
> Hi Jitendra and all,
> 
> Thanks for putting this together. I caught up on the discussion on JIRA and
> document at HDFS-10419, and still have the same concerns raised earlier
> about merging the Ozone branch to trunk.
> 
> To recap these questions/concerns at a very high level:
> 
> * Wouldn't Ozone benefit from being a separate project?
> * Why should it be merged now?
> 
> I still believe that both Ozone and Hadoop would benefit from Ozone being a
> separate project, and that there is no pressing reason to merge Ozone/HDSL
> now.
> 
> The primary reason I've heard for merging is that Ozone is at
> a stage where it's ready for user feedback. Second, that it needs to be
> merged to start on the NN refactoring for HDFS-on-HDSL.
> 
> First, without HDFS-on-HDSL support, users are testing against the Ozone
> object storage interface. Ozone and HDSL themselves are implemented as
> separate masters and new functionality bolted onto the datanode. It also
> doesn't look like HDFS in terms of API or featureset; yes, it speaks
> FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
> Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
> erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
> thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
> a new, different system that could reasonably be deployed and tested
> separately from HDFS. It's unlikely to replace many of today's HDFS
> deployments, and from what I understand, Ozone was not designed to do this.
> 
> Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
> undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
> clear what the ultimate refactoring will be, but I do know that the earlier
> FSN/BM refactoring during 2.x was very painful (introducing new bugs and
> making backports difficult) and probably should have been deferred to a new
> major release instead. I think this refactoring is important for the
> long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
> item. Merging HDSL is also not a prerequisite for starting this
> refactoring. Really, I see the refactoring as the prerequisite for
> HDFS-on-HDSL to be possible.
> 
> Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements. There are also publicity and community
> benefits; it's an opportunity to build a community focused on the novel
> capabilities and architectural choices of Ozone/HDSL. There are examples of
> other projects that were "incubated" on a branch in the Hadoop repo before
> being spun off to great success.
> 
> In conclusion, I'd like to see Ozone succeeding and thriving as a separate
> project. Meanwhile, we can work on the HDFS refactoring required to
> separate the FSN and BM and make it pluggable. At that point (likely in the
> Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
> 
> Best,
> Andrew
> 
> On Tue, Feb 27, 2018 at 11:23 PM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> *Hi Jitendra and all,Thanks for putting this together. I caught up on the
>> discussion on JIRA and document at HDFS-10419, and still have the same
>> concerns raised earlier
>> <https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16257730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16257730>
>> about merging the Ozone branch to trunk.To recap these questions/concerns
>> at a very high level:* Wouldn't Ozone benefit from being a separate
>> project?* Why should it be merged now?I still believe that both Ozone and
>> Hadoop would benefit from Ozone being a separate project, and that there is
>> no pressing reason to merge Ozone/HDSL now.The primary reason I've heard
>> for merging is that the Ozone is that it's at a stage where it's ready for
>> user feedback. Second, that it needs to be merged to start on the NN
>> refactoring for HDFS-on-HDSL.First, without HDFS-on-HDSL support, users are
>> testing against the Ozone object storage interface. Ozone and HDSL
>> themselves are implemented as separate masters and new functionality bolted
>> onto the datanode. It also doesn't look like HDFS in terms of API or
>> featureset; yes, it speaks FileSystem, but so do many out-of-tree storage
>> systems like S3, Ceph, Swift, ADLS etc. Ozone/HDSL does not support popular
>> HDFS features like erasure coding, encryption, high-availability,
>> snapshots, hflush/hsync (and thus HBase), or APIs like WebHDFS or NFS. This
>> means that Ozone feels like a new, different system that could reasonably
>> be deployed and tested separately from HDFS. It's unlikely to replace many
>> of today's HDFS deployments, and from what I understand, Ozone was not
>> designed to do this.Second, the NameNode refactoring for HDFS-on-HDSL by
>> itself is a major undertaking. The discussion on HDFS-10419 is still
>> ongoing so it’s not clear what the ultimate refactoring will be, but I do
>> know that the earlier FSN/BM refactoring during 2.x was very painful
>> (introducing new bugs and making backports difficult) and probably should
>> have been deferred to a new major release instead. I think this refactoring
>> is important for the long-term maintainability of the NN and worth
>> pursuing, but as a Hadoop 4.0 item. Merging HDSL is also not a prerequisite
>> for starting this refactoring. Really, I see the refactoring as the
>> prerequisite for HDFS-on-HDSL to be possible.Finally, I earnestly believe
>> that Ozone/HDSL itself would benefit from being a separate project. Ozone
>> could release faster and iterate more quickly if it wasn't hampered by
>> Hadoop's release schedule and security and compatibility requirements.
>> There are also publicity and community benefits; it's an opportunity to
>> build a community focused on the novel capabilities and architectural
>> choices of Ozone/HDSL. There are examples of other projects that were
>> "incubated" on a branch in the Hadoop repo before being spun off to great
>> success.In conclusion, I'd like to see Ozone succeeding and thriving as a
>> separate project. Meanwhile, we can work on the HDFS refactoring required
>> to separate the FSN and BM and make it pluggable. At that point (likely in
>> the Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.*
>> Best,
>> Andrew
>> 
>> On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <jitendra@hortonworks.com> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3)There were questions around stability given ozone
>>> brings in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address community’s
>>> concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed simply milestone 2 and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and a
>>> simpler solution would be useful.
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the old
>>> HDFS continues to remain stable. (For example, we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual production
>>> use till the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance and common management of the bandwidth used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather than a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security setting and coordinations across daemons. Shallow copy is
>>> useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software),  it is significantly harder
>>> typically requiring  cleaner public apis etc. Sharing within a project
>>> though internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
>>> 
>>> 
>>>                5) Security design, including a threat model and the
>>> solution has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now.  We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merging the new code back into HDFS later will be
>>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>> 
>>> 
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org