Posted to common-dev@hadoop.apache.org by Jitendra Pandey <ji...@hortonworks.com> on 2018/02/26 21:18:56 UTC

[VOTE] Merging branch HDFS-7240 to trunk

    Dear folks,
           We would like to start a vote to merge HDFS-7240 branch into trunk. The context can be reviewed in the DISCUSSION thread, and in the jiras (See references below).
              
    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is a distributed, replicated block layer.
    The old HDFS namespace and NN can be connected to this new block layer as we have described in HDFS-10419.
    We also introduce a key-value namespace called Ozone built on HDSL.
              
    The code is in a separate module and is turned off by default. In a secure setup, HDSL and Ozone daemons cannot be started.
            
    The detailed documentation is available at 
             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
    
            
    I will start with my vote.
            +1 (binding)
            
            
    Discussion Thread:
              https://s.apache.org/7240-merge
              https://s.apache.org/4sfU
            
    Jiras:
               https://issues.apache.org/jira/browse/HDFS-7240
               https://issues.apache.org/jira/browse/HDFS-10419
               https://issues.apache.org/jira/browse/HDFS-13074
               https://issues.apache.org/jira/browse/HDFS-13180
            
               
    Thanks
    jitendra
            
            
            
            
            
            DISCUSSION THREAD SUMMARY:
            
            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com> wrote:
            
                Sorry, the formatting got messed up by my email client. Here it is again.


                Dear Hadoop Community Members,
                
                   We had multiple community discussions, a few meetings in smaller groups and also jira discussions with respect to this thread. We express our gratitude for participation and valuable comments. 
                
                The key questions raised were the following:
                1) How do the new block storage layer and OzoneFS benefit HDFS? We were asked to chalk out a roadmap towards the goal of a scalable namenode working with the new storage layer.
                2) We were asked to provide a security design.
                3) There were questions around stability, given that Ozone brings in a large body of code.
                4) Why can’t they be separate projects forever, or be merged in when production-ready?
                
                We have responded to all the above questions with detailed explanations and answers on the JIRA as well as in the discussions. We believe that should sufficiently address the community’s concerns.
                
                Please see the summary below:
                
                1) The new code base benefits HDFS scaling and a roadmap has been provided. 
                
                Summary:
                  - The new block storage layer addresses the scalability of the block layer. We have shown how the existing NN can be connected to the new block layer and the benefits of doing so. We have shown 2 milestones; the 1st milestone is much simpler than the 2nd while giving almost the same scaling benefits. Originally we had proposed only milestone 2, and the community felt that removing the FSN/BM lock was a fair amount of work and that a simpler solution would be useful.
                  - We provide a new K-V namespace called Ozone FS with FileSystem/FileContext plugins to allow users to use the new system. BTW, Hive and Spark work very well on KV namespaces in the cloud. This will facilitate stabilizing the new block layer.
                  - The new block layer has a new Netty-based protocol engine in the Datanode which, when stabilized, can be used by the old HDFS block layer. See details below on sharing of code.
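To make the OzoneFS point above concrete, here is a deliberately simplified Java sketch of a FileSystem-style facade over a key-value store, which is essentially what the FileSystem/FileContext plugins do for Ozone's KV namespace. All class and method names here are invented for illustration; they are not actual Hadoop or Ozone classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: a toy FileSystem-style facade over a flat
// key-value store. Illustrates the idea of exposing a KV namespace
// through a file-system API; not actual Hadoop or Ozone code.
interface ToyFileSystem {
    void create(String path, byte[] data);   // analogous to a create call
    byte[] open(String path);                // analogous to an open/read call
}

class KeyValueFileSystem implements ToyFileSystem {
    // Paths map directly onto keys in a flat KV namespace.
    private final Map<String, byte[]> store = new HashMap<>();
    public void create(String path, byte[] data) { store.put(path, data); }
    public byte[] open(String path) { return store.get(path); }
}

public class OzoneFsSketch {
    public static void main(String[] args) {
        ToyFileSystem fs = new KeyValueFileSystem();
        // An app like Hive or Spark only sees the file-system facade.
        fs.create("/warehouse/t1/part-0", "row1".getBytes());
        System.out.println(new String(fs.open("/warehouse/t1/part-0")));
    }
}
```

This is also why KV namespaces work well for Hive and Spark in the cloud: those apps mostly create, list, and read whole objects through the FileSystem interface and do not depend on rename-heavy POSIX semantics.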
                
                
                2) Stability impact on the existing HDFS code base, and code separation. The new block layer and OzoneFS are in modules that are separate from the old HDFS code - currently there are no calls from HDFS into Ozone except for the DN starting the new block layer module if configured to do so. It does not add instability (the instability argument has been raised many times). Over time, as we share code, we will ensure that the old HDFS continues to remain stable. (For example, we plan to stabilize the new Netty-based protocol engine in the new block layer before sharing it with HDFS’s old block layer.)
                
                
                3) In the short term and medium term, the new system and HDFS will be used side-by-side by users: side-by-side in the short term for testing, and side-by-side in the medium term for actual production use until the new system has feature parity with the old HDFS. During this time, sharing the DN daemon and admin functions between the two systems is operationally important:
                  - Sharing DN daemon to avoid additional operational daemon lifecycle management
                  - Common decommissioning of the daemon and DN: One place to decommission for a node and its storage.
                  - Replacing failed disks and internal balancing capacity across disks - this needs to be done for both the current HDFS blocks and the new block-layer blocks.
                  - Balancer: we would like to use the same balancer and provide a common way to balance, with common management of the bandwidth used for balancing.
                  - Security configuration setup - reuse the existing setup for DNs rather than a new one for an independent cluster.
                
                
                4) Need to easily share the block layer code between the two systems when used side-by-side. Areas where sharing code is desired over time: 
                  - Sharing the new block layer’s Netty-based protocol engine with old HDFS DNs (a long-time sore issue for the HDFS block layer).
                  - Shallow data copy from the old system to the new system is practical only if within the same project and daemon; otherwise one has to deal with security settings and coordination across daemons. Shallow copy is useful as customers migrate from old to new.
                  - Shared disk scheduling in the future; in the short term, a single round robin rather than independent round robins.
                While sharing code across projects is technically possible (anything is possible in software), it is significantly harder, typically requiring cleaner public APIs, etc. Sharing within a project through internal APIs is often simpler (such as the protocol engine that we want to share).
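The "share within a project through internal APIs" argument can be sketched as follows. This is a hypothetical illustration, not actual Hadoop code: the interface and class names are invented, and the real DN protocol engines are far more involved.

```java
// Hypothetical sketch: an internal API that lets one daemon host two
// protocol engines, so the old HDFS block layer could later adopt the
// new Netty-based engine once it stabilizes. All names are invented
// for illustration; this is not actual Hadoop code.
interface DataTransferEngine {
    String name();
    byte[] handle(byte[] request);  // toy stand-in for a transfer protocol
}

class LegacyEngine implements DataTransferEngine {
    public String name() { return "legacy"; }
    public byte[] handle(byte[] request) { return request; }
}

class NettyStyleEngine implements DataTransferEngine {
    public String name() { return "netty"; }
    public byte[] handle(byte[] request) { return request; }
}

public class EngineSelector {
    // Within one project, a single config flag swaps the engine behind an
    // internal interface; across two separate projects, the same swap
    // would require a stable public API and cross-project coordination.
    public static DataTransferEngine select(boolean useNewEngine) {
        return useNewEngine ? new NettyStyleEngine() : new LegacyEngine();
    }

    public static void main(String[] args) {
        System.out.println(EngineSelector.select(true).name()); // prints "netty"
    }
}
```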
                
                
                5) The security design, including a threat model and the solution, has been posted.
                6) Temporary separation and merge later: several of the comments in the JIRA have argued that we temporarily separate the two code bases now and merge them later when the new code is stable:
                
                  - If there is agreement to merge later, why bother separating now - there need to be good reasons to separate now. We have addressed the stability and separation of the new code from the existing code above.
                  - Merging the new code back into HDFS later will be harder.
                
                    ** The code and goals will diverge further.
                    ** We will be taking on extra work to split now and extra work to merge later.
                    ** The issues raised today will be raised all the same then.
                
                
                ---------------------------------------------------------------------
                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
                
                
            
            
        
        
    
    


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew, thanks for your response.

1) Wrt NN on top of HDSL. You raised the issue of FSN lock separation. This was a key issue we discussed heavily in the past, in the context of “show the community a way to connect the NN into the new block layer”. We heard you clearly, thought deeply, and showed how the NN can be put on top of HDSL WITHOUT removing the FSN lock. We described this in detail in HDFS-10419 and also in the summary of the DISCUSSION thread:
 ---- Milestone 1 (no removal of the FSN lock) gives almost 2x scalability and does not require separation of the FSN lock; milestone 2, which removes the FSN lock, gives 2x scalability.

You have conveniently ignored this. Let me re-emphasize: removing the FSN lock is not necessary for the NN/HDFS to benefit from HDSL, and you get almost the same scalability benefit. Hence the FSN lock issue is moot.

2) You have also conveniently ignored our arguments, in the vote and discussion thread summary, that there is benefit in keeping HDSL and HDFS together:
  A) Side by side usage and resulting operational concerns
>>"In the short term and medium term, the new system and HDFS
>> will be used side-by-side by users. ……  
>> During this time, sharing the DN daemon and admin functions
>> between the two systems is operationally important”

   B) Sharing code 
>>"Need to easily share the block layer code between the two systems
>> when used side-by-side. Areas where sharing code is desired over time: 
>>  - Sharing new block layer’s  new netty based protocol engine
>>     for old HDFS DNs (a long time sore issue for HDFS block layer). 
>> - Shallow data copy from old system to new system is practical
>> only if within same project and daemon otherwise have to deal
>> with security setting and coordinations across daemons.
>> Shallow copy is useful as customer migrate from old to new.
>> - Shared disk scheduling in the future"



3) You argue for a separate project from 2 conflicting positions: (1) separate now and merge later, what’s the hurry; (2) keep it separate and focus on non-HDFS storage use cases. The HDFS community members built HDSL to address HDFS scalability; they were not trying to go after object store users or that market (Ceph, etc.). As explained multiple times, OzoneFS is an intermediate step to stabilize HDSL, but one of immediate value for apps such as Hive and Spark. So even if there might be value in being separate (your motivation 2) and going after new storage use cases, the HDFS community members that built HDSL want to focus on improving HDFS; you may not agree with that, but the engineers that are writing the code should be able to drive the direction. Further, look at the security design we posted - it shows a Hadoop/HDFS focus, not a focus on some other object store market: it fits into the Hadoop security model, especially supporting the use case of jobs and the resulting need to support delegation tokens.

4) You argue that the HDSL and OzoneFS modules are separate and therefore should go as a separate project. Looks like one can’t win here - damned if you do and damned if you don’t. In the discussion with the Cloudera team, one of the issues raised was that there is a lot of new code and it would destabilize HDFS. We explained that we have kept the code in separate modules so that it will not impact current HDFS stability, and that features like HDSL’s new protocol engine will be plugged into the old HDFS block layer only after stabilization. You argue for stability, and hence separate modules, and then use that against us to push it out as a separate project.

sanjay


> On Feb 28, 2018, at 12:10 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Resending since the formatting was messed up, let's try plain text this
> time:
> 
> Hi Jitendra and all,
> 
> Thanks for putting this together. I caught up on the discussion on JIRA and
> document at HDFS-10419, and still have the same concerns raised earlier
> about merging the Ozone branch to trunk.
> 
> To recap these questions/concerns at a very high level:
> 
> * Wouldn't Ozone benefit from being a separate project?
> * Why should it be merged now?
> 
> I still believe that both Ozone and Hadoop would benefit from Ozone being a
> separate project, and that there is no pressing reason to merge Ozone/HDSL
> now.
> 
> The primary reason I've heard for merging is that Ozone is at a stage
> where it's ready for user feedback. Second, that it needs to be merged
> to start on the NN refactoring for HDFS-on-HDSL.
> 
> First, without HDFS-on-HDSL support, users are testing against the Ozone
> object storage interface. Ozone and HDSL themselves are implemented as
> separate masters and new functionality bolted onto the datanode. It also
> doesn't look like HDFS in terms of API or featureset; yes, it speaks
> FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
> Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
> erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
> thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
> a new, different system that could reasonably be deployed and tested
> separately from HDFS. It's unlikely to replace many of today's HDFS
> deployments, and from what I understand, Ozone was not designed to do this.
> 
> Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
> undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
> clear what the ultimate refactoring will be, but I do know that the earlier
> FSN/BM refactoring during 2.x was very painful (introducing new bugs and
> making backports difficult) and probably should have been deferred to a new
> major release instead. I think this refactoring is important for the
> long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
> item. Merging HDSL is also not a prerequisite for starting this
> refactoring. Really, I see the refactoring as the prerequisite for
> HDFS-on-HDSL to be possible.
> 
> Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements. There are also publicity and community
> benefits; it's an opportunity to build a community focused on the novel
> capabilities and architectural choices of Ozone/HDSL. There are examples of
> other projects that were "incubated" on a branch in the Hadoop repo before
> being spun off to great success.
> 
> In conclusion, I'd like to see Ozone succeeding and thriving as a separate
> project. Meanwhile, we can work on the HDFS refactoring required to
> separate the FSN and BM and make it pluggable. At that point (likely in the
> Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
> 
> Best,
> Andrew




Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew, thanks for your response.

1) Wrt to NN on top of HDSL. You raised the issue of FSN lock separation . This was a key issue we discussed heavily in the past in the context of “Show the community a way to connect NN into the the new block layer”. We heard you clearly and thought deeply and showed how NN can be put on top of  WITHOUT removing the FSN.  We described this in detail  in HDFS-10419 and also  in the summary of the DISCUSSION thread:
 ---- Milestone 1 (no removal of FSN) gives almost 2x scalability and does not require separation of FSN lock and that milestone 2 which removes the FSN lock gives 2x scalability. 

You have conveniently ignored this. Let me reemphasize: Removing the FSN lock is not necessary for NN/HDFS to benefit from HDSL and you get almost the same scalability benefit. Hence the FSN local issue is moot. 

2) You have also conveniently ignored our arguments that there is benefit in keeping HDSL and HDFS together that are in the vote and discussion thread summary:
  A) Side by side usage and resulting operational concerns
>>"In the short term and medium term, the new system and HDFS
>> will be used side-by-side by users. ……  
>> During this time, sharing the DN daemon and admin functions
>> between the two systems is operationally important”

   B) Sharing code 
>>"Need to easily share the block layer code between the two systems
>> when used side-by-side. Areas where sharing code is desired over time: 
>>  - Sharing new block layer’s  new netty based protocol engine
>>     for old HDFS DNs (a long time sore issue for HDFS block layer). 
>> - Shallow data copy from old system to new system is practical
>> only if within same project and daemon otherwise have to deal
>> with security setting and coordinations across daemons.
>> Shallow copy is useful as customer migrate from old to new.
>> - Shared disk scheduling in the future"



3) You argue for separate project from 2 conflicting arguments: (1) Separate then merge later, what’s the hurry.  (2) keep seperate and focus on non-HDFS storage use cases. The HDFS community members built HDSL to address HDFS scalability; they were  not trying go after object store users or market (ceph etc). As explained multiple times OzoneFS is an intermediate step to stabilize HDSL but of immediate value for apps such as Hive and Spark. So even if there might be value in being separate (your motivation 2)  and go after a new storage use cases, the HDFS community members that built HDSL want to focus on improving HDFS; you may not agree with that but the engineers that are writing the code should be able to drive the direction.  Further look at the Security design we posted  - shows a Hadoop/HDFS focus not a focus for some other object store market: it fits into the Hadoop security model, especially supporting the use case of Jobs and the resulting need to support delegation tokens. 

4) You argue that the  HDSL and OzoneFS modules are separate and therefore one should go as a separate project. * Looks like one can’t win here. Damned if you do and Damned if you don’t. In the discussion with the Cloudera team one of the issues raised was that there a lot of new code and it will destabilized HDFS. We explained that  we have kept the code in separate modules so that it will not impact current HDFS stability, and that features like HDSL’s  new protocol engine will be plugged into the old HDFS block layer only after stabilization. You argue for stability and hence separate modules and then use it against to push it out as a separate project.

sanjay


> On Feb 28, 2018, at 12:10 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Resending since the formatting was messed up, let's try plain text this
> time:
> 
> Hi Jitendra and all,
> 
> Thanks for putting this together. I caught up on the discussion on JIRA and
> document at HDFS-10419, and still have the same concerns raised earlier
> about merging the Ozone branch to trunk.
> 
> To recap these questions/concerns at a very high level:
> 
> * Wouldn't Ozone benefit from being a separate project?
> * Why should it be merged now?
> 
> I still believe that both Ozone and Hadoop would benefit from Ozone being a
> separate project, and that there is no pressing reason to merge Ozone/HDSL
> now.
> 
> The primary reason I've heard for merging is that the Ozone is that it's at
> a stage where it's ready for user feedback. Second, that it needs to be
> merged to start on the NN refactoring for HDFS-on-HDSL.
> 
> First, without HDFS-on-HDSL support, users are testing against the Ozone
> object storage interface. Ozone and HDSL themselves are implemented as
> separate masters and new functionality bolted onto the datanode. It also
> doesn't look like HDFS in terms of API or featureset; yes, it speaks
> FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
> Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
> erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
> thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
> a new, different system that could reasonably be deployed and tested
> separately from HDFS. It's unlikely to replace many of today's HDFS
> deployments, and from what I understand, Ozone was not designed to do this.
> 
> Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
> undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
> clear what the ultimate refactoring will be, but I do know that the earlier
> FSN/BM refactoring during 2.x was very painful (introducing new bugs and
> making backports difficult) and probably should have been deferred to a new
> major release instead. I think this refactoring is important for the
> long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
> item. Merging HDSL is also not a prerequisite for starting this
> refactoring. Really, I see the refactoring as the prerequisite for
> HDFS-on-HDSL to be possible.
> 
> Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements. There are also publicity and community
> benefits; it's an opportunity to build a community focused on the novel
> capabilities and architectural choices of Ozone/HDSL. There are examples of
> other projects that were "incubated" on a branch in the Hadoop repo before
> being spun off to great success.
> 
> In conclusion, I'd like to see Ozone succeeding and thriving as a separate
> project. Meanwhile, we can work on the HDFS refactoring required to
> separate the FSN and BM and make it pluggable. At that point (likely in the
> Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
> 
> Best,
> Andrew
> 
> On Tue, Feb 27, 2018 at 11:23 PM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> 
>> On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <jitendra@hortonworks.com
>>> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were the following:
>>>                1) How do the new block storage layer and OzoneFS benefit
>>> HDFS? We were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer.
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability, given that Ozone
>>> brings in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address the community’s
>>> concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones; the 1st milestone
>>> is much simpler than the 2nd while giving almost the same scaling
>>> benefits. Originally we had proposed only milestone 2, and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and that a
>>> simpler solution would be useful.
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>>> block layer. See details below on sharing of code.
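To make the OzoneFS point above concrete, here is a minimal, self-contained sketch of how a flat key-value namespace can sit behind a hierarchical FileSystem-style facade; the class and method names are illustrative, not the actual OzoneFS code:

```python
# Minimal sketch: a flat key-value store exposed through a
# hierarchical, FileSystem-like facade (illustrative only; not
# the actual OzoneFS implementation).

class KeyValueStore:
    """Flat namespace: full path strings map to byte values."""
    def __init__(self):
        self._kv = {}

    def put(self, key, value):
        self._kv[key] = value

    def get(self, key):
        return self._kv[key]

    def keys(self):
        return list(self._kv)


class KVFileSystem:
    """Hierarchical facade over the flat store: directories are
    implied by key prefixes, as in many object stores."""
    def __init__(self, store):
        self._store = store

    def create(self, path, data):
        self._store.put(path, data)

    def open(self, path):
        return self._store.get(path)

    def list_status(self, dir_path):
        # A "directory listing" is just a prefix scan of the keys.
        prefix = dir_path.rstrip("/") + "/"
        return sorted(k for k in self._store.keys() if k.startswith(prefix))


fs = KVFileSystem(KeyValueStore())
fs.create("/warehouse/t1/part-0", b"rows")
fs.create("/warehouse/t1/part-1", b"more rows")
listing = fs.list_status("/warehouse/t1")
print(listing)  # → ['/warehouse/t1/part-0', '/warehouse/t1/part-1']
```

This is also why Hive and Spark work well on KV namespaces: their dominant operations (create, open, list a directory) map directly onto put, get, and a key-prefix scan.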
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for the DN starting the new block layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the old
>>> HDFS continues to remain stable. (for example, we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
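The opt-in wiring described above — the DN touching the new module only when explicitly configured — follows a standard plugin pattern. A hypothetical sketch (the configuration key and class names here are invented for illustration; they are not the real HDFS/HDSL identifiers):

```python
# Hypothetical sketch of a datanode conditionally starting a pluggable
# service; config key and class names are illustrative, not the real
# HDFS/HDSL ones.

class NewBlockLayerService:
    """Stands in for the new block-layer module."""
    def __init__(self):
        self.running = False

    def start(self):
        self.running = True


class DataNode:
    def __init__(self, conf):
        self.conf = conf
        self.plugins = []

    def start(self):
        # The old block layer always starts; the new module is only
        # instantiated when explicitly enabled, so a default install
        # never executes any of its code.
        if self.conf.get("dn.new.block.layer.enabled", False):
            svc = NewBlockLayerService()
            svc.start()
            self.plugins.append(svc)


dn_default = DataNode(conf={})
dn_default.start()
print(len(dn_default.plugins))  # → 0: new code never runs by default

dn_opt_in = DataNode(conf={"dn.new.block.layer.enabled": True})
dn_opt_in.start()
print(len(dn_opt_in.plugins))   # → 1: module started only when enabled
```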
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS will be used side-by-side by users. Side-by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual production
>>> use till the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance and common management of the bandwidth used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather than a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from the old system to the new system
>>> is practical only if both are within the same project and daemon; otherwise
>>> one has to deal with security settings and coordination across daemons.
>>> Shallow copy is useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future; in the short
>>> term, have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software), it is significantly harder,
>>> typically requiring cleaner public APIs etc. Sharing within a project
>>> through internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
>>> 
>>> 
>>>                5) Security design, including a threat model and the
>>> solution, has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now. We
>>> have addressed the stability and separation of the new code from the
>>> existing code above.
>>>                  - Merging the new code back into HDFS later will be
>>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>> 
>>> 
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail:
>>> hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew, thanks for your response.

1) Wrt NN on top of HDSL. You raised the issue of FSN lock separation. This was a key issue we discussed heavily in the past in the context of “Show the community a way to connect the NN into the new block layer”. We heard you clearly, thought deeply, and showed how the NN can be put on top of HDSL WITHOUT removing the FSN lock. We described this in detail in HDFS-10419 and also in the summary of the DISCUSSION thread:
 ---- Milestone 1 (no removal of the FSN lock) gives almost 2x scalability and does not require separation of the FSN lock; milestone 2, which removes the FSN lock, gives 2x scalability.

You have conveniently ignored this. Let me reemphasize: removing the FSN lock is not necessary for NN/HDFS to benefit from HDSL, and you get almost the same scalability benefit. Hence the FSN lock issue is moot.
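The milestone-1 arithmetic can be sketched as follows; the 50/50 split of FSN-lock time between namespace and block-management work is an illustrative assumption for the sketch, not a measured number:

```python
# Back-of-the-envelope sketch of the milestone-1 scalability claim.
# Illustrative assumption: namespace ops and block-management ops each
# consume about half of the time under the NN's global FSN lock.

def namespace_capacity(ns_share, blk_share):
    """Relative namespace throughput when the lock is saturated:
    the namespace gets its fraction of the available lock time."""
    total = ns_share + blk_share
    return ns_share / total

before = namespace_capacity(ns_share=0.5, blk_share=0.5)

# Milestone 1: block management moves out to the new block layer, so
# block ops no longer compete for the FSN lock; the lock is unchanged.
after = namespace_capacity(ns_share=0.5, blk_share=0.0)

print(after / before)  # ~2.0: almost 2x namespace capacity, no lock rework
```

The point of the sketch: removing block-management work from under the lock roughly doubles the lock time available to namespace operations, without restructuring the lock itself.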

2) You have also conveniently ignored our arguments, stated in the vote and discussion thread summary, that there is benefit in keeping HDSL and HDFS together:
  A) Side by side usage and resulting operational concerns
>>"In the short term and medium term, the new system and HDFS
>> will be used side-by-side by users. ……  
>> During this time, sharing the DN daemon and admin functions
>> between the two systems is operationally important”

   B) Sharing code 
>>"Need to easily share the block layer code between the two systems
>> when used side-by-side. Areas where sharing code is desired over time: 
>>  - Sharing new block layer’s  new netty based protocol engine
>>     for old HDFS DNs (a long time sore issue for HDFS block layer). 
>> - Shallow data copy from the old system to the new system is practical
>> only if both are within the same project and daemon; otherwise one has
>> to deal with security settings and coordination across daemons.
>> Shallow copy is useful as customers migrate from old to new.
>> - Shared disk scheduling in the future"
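To illustrate the shared-round-robin point in the excerpt above: two independent round-robin cursors over the same disks can stay in lockstep and pile their writes onto the same disk, while one shared cursor spreads the combined load evenly. A small sketch (illustrative only, not the actual DN volume-choosing code):

```python
# Sketch: two block layers writing to the same 3 disks, with
# independent vs shared round-robin placement.
import itertools

DISKS = 3

def independent_round_robins(writes_a, writes_b):
    """Each system keeps its own cursor, so both start at disk 0;
    their write counts stack up on the same disks."""
    rr_a = itertools.cycle(range(DISKS))
    rr_b = itertools.cycle(range(DISKS))
    load = [0] * DISKS
    for _ in range(writes_a):
        load[next(rr_a)] += 1
    for _ in range(writes_b):
        load[next(rr_b)] += 1
    return load

def shared_round_robin(writes_a, writes_b):
    """A single cursor interleaves both systems' writes evenly."""
    rr = itertools.cycle(range(DISKS))
    load = [0] * DISKS
    for _ in range(writes_a + writes_b):
        load[next(rr)] += 1
    return load

print(independent_round_robins(4, 4))  # [4, 2, 2]: disk 0 takes double load
print(shared_round_robin(4, 4))        # [3, 3, 2]: near-even spread
```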



3) You argue for a separate project from two conflicting positions: (1) separate now and merge later - what’s the hurry; (2) keep it separate and focus on non-HDFS storage use cases. The HDFS community members built HDSL to address HDFS scalability; they were not trying to go after object store users or markets (Ceph etc). As explained multiple times, OzoneFS is an intermediate step to stabilize HDSL, but one of immediate value for apps such as Hive and Spark. So even if there might be value in being separate (your motivation 2) and going after new storage use cases, the HDFS community members that built HDSL want to focus on improving HDFS; you may not agree with that, but the engineers who are writing the code should be able to drive the direction. Further, look at the security design we posted - it shows a Hadoop/HDFS focus, not a focus on some other object store market: it fits into the Hadoop security model, especially supporting the use case of jobs and the resulting need to support delegation tokens.
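The delegation-token requirement mentioned above exists so that a distributed job can authenticate to storage without shipping the user's Kerberos credentials to every task. A minimal HMAC-based sketch of the idea (this is not Hadoop's actual token implementation; the field layout and secret handling are invented for illustration):

```python
# Minimal sketch of a Hadoop-style delegation token: an identifier
# plus an HMAC computed with a secret shared by issuer and verifier.
# Illustrative only; not the real Hadoop token code or wire format.
import hashlib
import hmac

MASTER_SECRET = b"shared-by-issuer-and-verifier"  # illustrative only

def issue_token(owner, renewer, max_date):
    """Issuer (e.g. the storage master) signs the token identifier."""
    ident = f"{owner}|{renewer}|{max_date}".encode()
    password = hmac.new(MASTER_SECRET, ident, hashlib.sha256).digest()
    return ident, password

def verify_token(ident, password):
    """Verifier (e.g. a worker-side server) recomputes and compares."""
    expected = hmac.new(MASTER_SECRET, ident, hashlib.sha256).digest()
    return hmac.compare_digest(expected, password)

ident, pwd = issue_token("alice", "yarn", 1735689600)
print(verify_token(ident, pwd))         # True: job task can authenticate
print(verify_token(ident + b"x", pwd))  # False: tampered identifier rejected
```

A real implementation adds key rolling, renewal, and expiry checks; the sketch only shows the issue/verify core that makes the model fit job-based workloads.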


sanjay


> On Feb 28, 2018, at 12:10 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Resending since the formatting was messed up, let's try plain text this
> time:
> 
> Hi Jitendra and all,
> 
> Thanks for putting this together. I caught up on the discussion on JIRA and
> document at HDFS-10419, and still have the same concerns raised earlier
> about merging the Ozone branch to trunk.
> 
> To recap these questions/concerns at a very high level:
> 
> * Wouldn't Ozone benefit from being a separate project?
> * Why should it be merged now?
> 
> I still believe that both Ozone and Hadoop would benefit from Ozone being a
> separate project, and that there is no pressing reason to merge Ozone/HDSL
> now.
> 
> The primary reason I've heard for merging is that the Ozone is that it's at
> a stage where it's ready for user feedback. Second, that it needs to be
> merged to start on the NN refactoring for HDFS-on-HDSL.
> 
> First, without HDFS-on-HDSL support, users are testing against the Ozone
> object storage interface. Ozone and HDSL themselves are implemented as
> separate masters and new functionality bolted onto the datanode. It also
> doesn't look like HDFS in terms of API or featureset; yes, it speaks
> FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
> Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
> erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
> thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
> a new, different system that could reasonably be deployed and tested
> separately from HDFS. It's unlikely to replace many of today's HDFS
> deployments, and from what I understand, Ozone was not designed to do this.
> 
> Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
> undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
> clear what the ultimate refactoring will be, but I do know that the earlier
> FSN/BM refactoring during 2.x was very painful (introducing new bugs and
> making backports difficult) and probably should have been deferred to a new
> major release instead. I think this refactoring is important for the
> long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
> item. Merging HDSL is also not a prerequisite for starting this
> refactoring. Really, I see the refactoring as the prerequisite for
> HDFS-on-HDSL to be possible.
> 
> Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements. There are also publicity and community
> benefits; it's an opportunity to build a community focused on the novel
> capabilities and architectural choices of Ozone/HDSL. There are examples of
> other projects that were "incubated" on a branch in the Hadoop repo before
> being spun off to great success.
> 
> In conclusion, I'd like to see Ozone succeeding and thriving as a separate
> project. Meanwhile, we can work on the HDFS refactoring required to
> separate the FSN and BM and make it pluggable. At that point (likely in the
> Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
> 
> Best,
> Andrew
> 
> On Tue, Feb 27, 2018 at 11:23 PM, Andrew Wang <an...@cloudera.com>
> wrote:
> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> *Hi Jitendra and all,Thanks for putting this together. I caught up on the
>> discussion on JIRA and document at HDFS-10419, and still have the same
>> concerns raised earlier
>> <https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16257730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16257730>
>> about merging the Ozone branch to trunk.To recap these questions/concerns
>> at a very high level:* Wouldn't Ozone benefit from being a separate
>> project?* Why should it be merged now?I still believe that both Ozone and
>> Hadoop would benefit from Ozone being a separate project, and that there is
>> no pressing reason to merge Ozone/HDSL now.The primary reason I've heard
>> for merging is that the Ozone is that it's at a stage where it's ready for
>> user feedback. Second, that it needs to be merged to start on the NN
>> refactoring for HDFS-on-HDSL.First, without HDFS-on-HDSL support, users are
>> testing against the Ozone object storage interface. Ozone and HDSL
>> themselves are implemented as separate masters and new functionality bolted
>> onto the datanode. It also doesn't look like HDFS in terms of API or
>> featureset; yes, it speaks FileSystem, but so do many out-of-tree storage
>> systems like S3, Ceph, Swift, ADLS etc. Ozone/HDSL does not support popular
>> HDFS features like erasure coding, encryption, high-availability,
>> snapshots, hflush/hsync (and thus HBase), or APIs like WebHDFS or NFS. This
>> means that Ozone feels like a new, different system that could reasonably
>> be deployed and tested separately from HDFS. It's unlikely to replace many
>> of today's HDFS deployments, and from what I understand, Ozone was not
>> designed to do this.Second, the NameNode refactoring for HDFS-on-HDSL by
>> itself is a major undertaking. The discussion on HDFS-10419 is still
>> ongoing so it’s not clear what the ultimate refactoring will be, but I do
>> know that the earlier FSN/BM refactoring during 2.x was very painful
>> (introducing new bugs and making backports difficult) and probably should
>> have been deferred to a new major release instead. I think this refactoring
>> is important for the long-term maintainability of the NN and worth
>> pursuing, but as a Hadoop 4.0 item. Merging HDSL is also not a prerequisite
>> for starting this refactoring. Really, I see the refactoring as the
>> prerequisite for HDFS-on-HDSL to be possible.Finally, I earnestly believe
>> that Ozone/HDSL itself would benefit from being a separate project. Ozone
>> could release faster and iterate more quickly if it wasn't hampered by
>> Hadoop's release schedule and security and compatibility requirements.
>> There are also publicity and community benefits; it's an opportunity to
>> build a community focused on the novel capabilities and architectural
>> choices of Ozone/HDSL. There are examples of other projects that were
>> "incubated" on a branch in the Hadoop repo before being spun off to great
>> success.In conclusion, I'd like to see Ozone succeeding and thriving as a
>> separate project. Meanwhile, we can work on the HDFS refactoring required
>> to separate the FSN and BM and make it pluggable. At that point (likely in
>> the Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.*
>> Best,
>> Andrew
>> 
>> On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <jitendra@hortonworks.com
>>> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+
>>> Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3)There were questions around stability given ozone
>>> brings in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address community’s
>>> concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed simply milestone 2 and the community
>>> felt that removing the FSN/BM lock was was a fair amount of work and a
>>> simpler solution would be useful
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the old
>>> HDFS continues to remains stable. (for example we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual production
>>> use till the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like use the same balancer and
>>> provide a common way to balance and common management of the bandwidth used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather then a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security setting and coordinations across daemons. Shallow copy is
>>> useful as customer migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software),  it is significantly harder
>>> typically requiring  cleaner public apis etc. Sharing within a project
>>> though internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
>>> 
>>> 
>>>                5) Security design, including a threat model and and the
>>> solution has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there needs to be to be good reasons to separate now.  We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merge the new code back into HDFS later will be
>>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>> 
>>> 
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew, thanks for your response.

1) Wrt NN on top of HDSL. You raised the issue of FSN lock separation. This was a key issue we discussed heavily in the past, in the context of “Show the community a way to connect NN into the new block layer”. We heard you clearly, thought deeply, and showed how the NN can be put on top of HDSL WITHOUT removing the FSN lock.  We described this in detail in HDFS-10419 and also in the summary of the DISCUSSION thread:
 ---- Milestone 1 (no removal of the FSN lock) gives almost 2x scalability and does not require separation of the FSN lock; milestone 2, which removes the FSN lock, gives 2x scalability.

You have conveniently ignored this. Let me reemphasize: removing the FSN lock is not necessary for the NN/HDFS to benefit from HDSL, and you get almost the same scalability benefit. Hence the FSN lock issue is moot.
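The "almost 2x" claim is just arithmetic; here is an illustrative sketch with assumed (not measured) per-op costs:

```python
# Illustrative arithmetic for the milestone-1 claim; the numbers are
# assumptions, not measurements. If block-layer bookkeeping is roughly
# half the work done under the NameNode's global FSN lock, offloading it
# to a separate block layer nearly doubles namespace throughput -- without
# touching the lock itself.

namespace_us = 50   # hypothetical namespace work per op (microseconds)
block_us = 45       # hypothetical block-layer work per op (microseconds)

ops_before = 1_000_000 / (namespace_us + block_us)  # one lock does both
ops_after = 1_000_000 / namespace_us                # block work offloaded

speedup = ops_after / ops_before
print(round(speedup, 2))  # 1.9 -- "almost 2x" with the FSN lock kept
```

With these assumed costs the speedup is 95/50 = 1.9x; the real ratio depends on how much per-op work is actually block-layer bookkeeping.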

2) You have also conveniently ignored the arguments, made in the vote and discussion thread summary, that there is benefit in keeping HDSL and HDFS together:
  A) Side by side usage and resulting operational concerns
>>"In the short term and medium term, the new system and HDFS
>> will be used side-by-side by users. ……  
>> During this time, sharing the DN daemon and admin functions
>> between the two systems is operationally important”

   B) Sharing code 
>>"Need to easily share the block layer code between the two systems
>> when used side-by-side. Areas where sharing code is desired over time: 
>>  - Sharing new block layer’s  new netty based protocol engine
>>     for old HDFS DNs (a long time sore issue for HDFS block layer). 
>> - Shallow data copy from old system to new system is practical
>> only within the same project and daemon; otherwise one has to deal
>> with security settings and coordination across daemons.
>> Shallow copy is useful as customers migrate from old to new.
>> - Shared disk scheduling in the future"
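The shared disk scheduling point can be sketched concretely (a hypothetical volume chooser, not Hadoop's actual volume-choosing code). Two block layers that each keep their own round-robin pointer pile writes onto the same disks; a single shared round robin spreads them evenly:

```python
# Hypothetical sketch: independent vs. shared round-robin disk scheduling
# when two block layers share the same DataNode disks.

def round_robin(num_disks):
    """Yield disk indices 0, 1, ..., num_disks-1, repeating forever."""
    i = 0
    while True:
        yield i % num_disks
        i += 1

def place(block_sizes, chooser, usage):
    """Assign each block to the next disk the chooser picks."""
    for size in block_sizes:
        usage[next(chooser)] += size

DISKS, BLOCK_MB = 3, 128

# Independent round robins: each service starts over at disk 0.
indep = [0] * DISKS
place([BLOCK_MB] * 4, round_robin(DISKS), indep)  # old HDFS block layer
place([BLOCK_MB] * 2, round_robin(DISKS), indep)  # new HDSL block layer
print(indep)   # [384, 256, 128] -- disk 0 holds 3x what disk 2 does

# One shared round robin across both services.
shared = [0] * DISKS
chooser = round_robin(DISKS)
place([BLOCK_MB] * 4, chooser, shared)
place([BLOCK_MB] * 2, chooser, shared)
print(shared)  # [256, 256, 256] -- even
```

In one process this is a trivially shared iterator; across two independent daemons each scheduler would need cross-daemon coordination to achieve the same balance.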



3) You argue for a separate project from two conflicting positions: (1) separate now and merge later - what's the hurry; (2) keep it separate and focus on non-HDFS storage use cases. The HDFS community members built HDSL to address HDFS scalability; they were not trying to go after object-store users or markets (Ceph etc.). As explained multiple times, OzoneFS is an intermediate step to stabilize HDSL, but it is of immediate value for apps such as Hive and Spark. So even if there might be value in being separate (your motivation 2) and going after new storage use cases, the HDFS community members who built HDSL want to focus on improving HDFS; you may not agree with that, but the engineers who are writing the code should be able to drive the direction. Further, look at the security design we posted - it shows a Hadoop/HDFS focus, not a focus on some other object-store market: it fits into the Hadoop security model, especially supporting the use case of jobs and the resulting need to support delegation tokens.
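For context on how apps such as Hive and Spark would consume OzoneFS: Hadoop file systems plug in via the standard fs.&lt;scheme&gt;.impl configuration convention, roughly as in the fragment below. The o3 scheme and the OzoneFileSystem class name are assumptions taken from branch-era documentation and changed during development; treat both as illustrative, not authoritative.

```xml
<!-- Hypothetical core-site.xml fragment: registering OzoneFS as a Hadoop
     FileSystem so existing FileSystem-based apps can use it unchanged. -->
<property>
  <name>fs.o3.impl</name>
  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
</property>
```

With such a mapping in place, jobs address Ozone paths through the ordinary FileSystem/FileContext APIs under the configured scheme, which is what lets a K-V namespace serve as a stabilization vehicle for the block layer.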

4) You argue that the HDSL and OzoneFS modules are separate and therefore should go as a separate project. Looks like one can't win here - damned if you do and damned if you don't. In the discussion with the Cloudera team, one of the issues raised was that there is a lot of new code and it will destabilize HDFS. We explained that we have kept the code in separate modules so that it will not impact current HDFS stability, and that features like HDSL's new protocol engine will be plugged into the old HDFS block layer only after stabilization. You argue for stability and hence separate modules, and then use that very separation as an argument to push it out as a separate project.

sanjay


> On Feb 28, 2018, at 12:10 AM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Resending since the formatting was messed up, let's try plain text this
> time:
> 
> Hi Jitendra and all,
> 
> Thanks for putting this together. I caught up on the discussion on JIRA and
> document at HDFS-10419, and still have the same concerns raised earlier
> about merging the Ozone branch to trunk.
> 
> To recap these questions/concerns at a very high level:
> 
> * Wouldn't Ozone benefit from being a separate project?
> * Why should it be merged now?
> 
> I still believe that both Ozone and Hadoop would benefit from Ozone being a
> separate project, and that there is no pressing reason to merge Ozone/HDSL
> now.
> 
> The primary reason I've heard for merging is that Ozone is at a stage
> where it's ready for user feedback. Second, that it needs to be
> merged to start on the NN refactoring for HDFS-on-HDSL.
> 
> First, without HDFS-on-HDSL support, users are testing against the Ozone
> object storage interface. Ozone and HDSL themselves are implemented as
> separate masters and new functionality bolted onto the datanode. It also
> doesn't look like HDFS in terms of API or featureset; yes, it speaks
> FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
> Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
> erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
> thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
> a new, different system that could reasonably be deployed and tested
> separately from HDFS. It's unlikely to replace many of today's HDFS
> deployments, and from what I understand, Ozone was not designed to do this.
> 
> Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
> undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
> clear what the ultimate refactoring will be, but I do know that the earlier
> FSN/BM refactoring during 2.x was very painful (introducing new bugs and
> making backports difficult) and probably should have been deferred to a new
> major release instead. I think this refactoring is important for the
> long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
> item. Merging HDSL is also not a prerequisite for starting this
> refactoring. Really, I see the refactoring as the prerequisite for
> HDFS-on-HDSL to be possible.
> 
> Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements. There are also publicity and community
> benefits; it's an opportunity to build a community focused on the novel
> capabilities and architectural choices of Ozone/HDSL. There are examples of
> other projects that were "incubated" on a branch in the Hadoop repo before
> being spun off to great success.
> 
> In conclusion, I'd like to see Ozone succeeding and thriving as a separate
> project. Meanwhile, we can work on the HDFS refactoring required to
> separate the FSN and BM and make it pluggable. At that point (likely in the
> Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
> 
> Best,
> Andrew


---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Resending since the formatting was messed up, let's try plain text this
time:

Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
document at HDFS-10419, and still have the same concerns raised earlier
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage
where it's ready for user feedback. Second, that it needs to be
merged to start on the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew

On Tue, Feb 27, 2018 at 11:23 PM, Andrew Wang <an...@cloudera.com>
wrote:

>
>
>
>
>
>
>
>
>
> *Hi Jitendra and all,Thanks for putting this together. I caught up on the
> discussion on JIRA and document at HDFS-10419, and still have the same
> concerns raised earlier
> <https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16257730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16257730>
> about merging the Ozone branch to trunk.To recap these questions/concerns
> at a very high level:* Wouldn't Ozone benefit from being a separate
> project?* Why should it be merged now?I still believe that both Ozone and
> Hadoop would benefit from Ozone being a separate project, and that there is
> no pressing reason to merge Ozone/HDSL now.The primary reason I've heard
> for merging is that the Ozone is that it's at a stage where it's ready for
> user feedback. Second, that it needs to be merged to start on the NN
> refactoring for HDFS-on-HDSL.First, without HDFS-on-HDSL support, users are
> testing against the Ozone object storage interface. Ozone and HDSL
> themselves are implemented as separate masters and new functionality bolted
> onto the datanode. It also doesn't look like HDFS in terms of API or
> featureset; yes, it speaks FileSystem, but so do many out-of-tree storage
> systems like S3, Ceph, Swift, ADLS etc. Ozone/HDSL does not support popular
> HDFS features like erasure coding, encryption, high-availability,
> snapshots, hflush/hsync (and thus HBase), or APIs like WebHDFS or NFS. This
> means that Ozone feels like a new, different system that could reasonably
> be deployed and tested separately from HDFS. It's unlikely to replace many
> of today's HDFS deployments, and from what I understand, Ozone was not
> designed to do this.Second, the NameNode refactoring for HDFS-on-HDSL by
> itself is a major undertaking. The discussion on HDFS-10419 is still
> ongoing so it’s not clear what the ultimate refactoring will be, but I do
> know that the earlier FSN/BM refactoring during 2.x was very painful
> (introducing new bugs and making backports difficult) and probably should
> have been deferred to a new major release instead. I think this refactoring
> is important for the long-term maintainability of the NN and worth
> pursuing, but as a Hadoop 4.0 item. Merging HDSL is also not a prerequisite
> for starting this refactoring. Really, I see the refactoring as the
> prerequisite for HDFS-on-HDSL to be possible.Finally, I earnestly believe
> that Ozone/HDSL itself would benefit from being a separate project. Ozone
> could release faster and iterate more quickly if it wasn't hampered by
> Hadoop's release schedule and security and compatibility requirements.
> There are also publicity and community benefits; it's an opportunity to
> build a community focused on the novel capabilities and architectural
> choices of Ozone/HDSL. There are examples of other projects that were
> "incubated" on a branch in the Hadoop repo before being spun off to great
> success.In conclusion, I'd like to see Ozone succeeding and thriving as a
> separate project. Meanwhile, we can work on the HDFS refactoring required
> to separate the FSN and BM and make it pluggable. At that point (likely in
> the Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.*
> Best,
> Andrew
>
> On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <jitendra@hortonworks.com
> > wrote:
>
>>     Dear folks,
>>            We would like to start a vote to merge HDFS-7240 branch into
>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>> jiras (See references below).
>>
>>     HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>> is a distributed, replicated block layer.
>>     The old HDFS namespace and NN can be connected to this new block
>> layer as we have described in HDFS-10419.
>>     We also introduce a key-value namespace called Ozone built on HDSL.
>>
>>     The code is in a separate module and is turned off by default. In a
>> secure setup, HDSL and Ozone daemons cannot be started.
>>
>>     The detailed documentation is available at
>>              https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+
>> Distributed+Storage+Layer+and+Applications
>>
>>
>>     I will start with my vote.
>>             +1 (binding)
>>
>>
>>     Discussion Thread:
>>               https://s.apache.org/7240-merge
>>               https://s.apache.org/4sfU
>>
>>     Jiras:
>>                https://issues.apache.org/jira/browse/HDFS-7240
>>                https://issues.apache.org/jira/browse/HDFS-10419
>>                https://issues.apache.org/jira/browse/HDFS-13074
>>                https://issues.apache.org/jira/browse/HDFS-13180
>>
>>
>>     Thanks
>>     jitendra
>>
>>
>>
>>
>>
>>             DISCUSSION THREAD SUMMARY :
>>
>>             On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>> wrote:
>>
>>                 Sorry the formatting got messed by my email client.  Here
>> it is again
>>
>>
>>                 Dear
>>                  Hadoop Community Members,
>>
>>                    We had multiple community discussions, a few meetings
>> in smaller groups and also jira discussions with respect to this thread. We
>> express our gratitude for participation and valuable comments.
>>
>>                 The key questions raised were following
>>                 1) How the new block storage layer and OzoneFS benefit
>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>> scalable namenode working with the new storage layer
>>                 2) We were asked to provide a security design
>>                 3)There were questions around stability given ozone
>> brings in a large body of code.
>>                 4) Why can’t they be separate projects forever or merged
>> in when production ready?
>>
>>                 We have responded to all the above questions with
>> detailed explanations and answers on the jira as well as in the
>> discussions. We believe that should sufficiently address community’s
>> concerns.
>>
>>                 Please see the summary below:
>>
>>                 1) The new code base benefits HDFS scaling and a roadmap
>> has been provided.
>>
>>                 Summary:
>>                   - New block storage layer addresses the scalability of
>> the block layer. We have shown how existing NN can be connected to the new
>> block layer and its benefits. We have shown 2 milestones, 1st milestone is
>> much simpler than 2nd milestone while giving almost the same scaling
>> benefits. Originally we had proposed simply milestone 2 and the community
>> felt that removing the FSN/BM lock was was a fair amount of work and a
>> simpler solution would be useful
>>                   - We provide a new K-V namespace called Ozone FS with
>> FileSystem/FileContext plugins to allow the users to use the new system.
>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>> facilitate stabilizing the new block layer.
>>                   - The new block layer has a new netty based protocol
>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>> block layer. See details below on sharing of code.
>>
>>
>>                 2) Stability impact on the existing HDFS code base and
>> code separation. The new block layer and the OzoneFS are in modules that
>> are separate from old HDFS code - currently there are no calls from HDFS
>> into Ozone except for DN starting the new block  layer module if configured
>> to do so. It does not add instability (the instability argument has been
>> raised many times). Over time as we share code, we will ensure that the old
>> HDFS continues to remains stable. (for example we plan to stabilize the new
>> netty based protocol engine in the new block layer before sharing it with
>> HDFS’s old block layer)
>>
>>
>>                 3) In the short term and medium term, the new system and
>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>> term for testing and side-by-side in the medium term for actual production
>> use till the new system has feature parity with old HDFS. During this time,
>> sharing the DN daemon and admin functions between the two systems is
>> operationally important:
>>                   - Sharing DN daemon to avoid additional operational
>> daemon lifecycle management
>>                   - Common decommissioning of the daemon and DN: One
>> place to decommission for a node and its storage.
>>                   - Replacing failed disks and internal balancing
>> capacity across disks - this needs to be done for both the current HDFS
>> blocks and the new block-layer blocks.
>>                   - Balancer: we would like use the same balancer and
>> provide a common way to balance and common management of the bandwidth used
>> for balancing
>>                   - Security configuration setup - reuse existing set up
>> for DNs rather then a new one for an independent cluster.
>>
>>
>>                 4) Need to easily share the block layer code between the
>> two systems when used side-by-side. Areas where sharing code is desired
>> over time:
>>                   - Sharing new block layer’s  new netty based protocol
>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>                   - Shallow data copy from old system to new system is
>> practical only if within same project and daemon otherwise have to deal
>> with security setting and coordinations across daemons. Shallow copy is
>> useful as customer migrate from old to new.
>>                   - Shared disk scheduling in the future and in the short
>> term have a single round robin rather than independent round robins.
>>                 While sharing code across projects is technically
>> possible (anything is possible in software), it is significantly harder,
>> typically requiring cleaner public APIs, etc. Sharing within a project
>> through internal APIs is often simpler (such as the protocol engine that
>> we want to share).
>>
>>
>>                 5) The security design, including a threat model and the
>> solution, has been posted.
>>                 6) Temporary Separation and merge later: Several of the
>> comments in the jira have argued that we temporarily separate the two code
>> bases for now and then later merge them when the new code is stable:
>>
>>                   - If there is agreement to merge later, why bother
>> separating now? There need to be good reasons to separate now. We have
>> addressed the stability and separation of the new code from the existing
>> code above.
>>                   - Merging the new code back into HDFS later will be
>> harder.
>>
>>                     ** The code and goals will diverge further.
>>                     ** We will be taking on extra work to split and then
>> take extra work to merge.
>>                     ** The issues raised today will be raised all the
>> same then.
>>
>>
>>                 ---------------------------------------------------------------------
>>                 To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>                 For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Resending since the formatting was messed up, let's try plain text this
time:

Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
document at HDFS-10419, and still have the same concerns raised earlier
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage
where it's ready for user feedback. Second, that it needs to be merged to
start on the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.
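"Speaks FileSystem" here means Hadoop's generic FileSystem abstraction, the
same plug-in point used by s3a, Swift, and ADLS. A minimal sketch of wiring a
client to Ozone through that abstraction could look like the following; the
o3fs scheme and the bucket/volume/host layout in the URI are illustrative
assumptions, not details taken from this thread:

```xml
<!-- core-site.xml sketch: the scheme and authority below are illustrative
     assumptions, not confirmed Ozone configuration from this thread. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- Point the generic FileSystem API at an Ozone bucket, much as one
         would point it at an s3a or adl URI. -->
    <value>o3fs://bucket.volume.om-host/</value>
  </property>
</configuration>
```

With such a mapping, FileSystem.get(conf) would hand back the Ozone
implementation, which is why Hive/Spark-style workloads can run against it
unchanged even though HDFS-specific features (hflush/hsync, snapshots,
WebHDFS) are unavailable.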

Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew

On Tue, Feb 27, 2018 at 11:23 PM, Andrew Wang <an...@cloudera.com>
wrote:



Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Resending since the formatting was messed up, let's try plain text this
time:

Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
document at HDFS-10419, and still have the same concerns raised earlier
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that the Ozone is that it's at
a stage where it's ready for user feedback. Second, that it needs to be
merged to start on the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, ADLS etc. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high-availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL by itself is a major
undertaking. The discussion on HDFS-10419 is still ongoing so it’s not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew



Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
> On Mar 5, 2018, at 4:08 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> - NN on top HDSL where the NN uses the new block layer (Both Daryn and Owen acknowledge the benefit of the new block layer).  We have two choices here
>  ** a) Evolve NN so that it can interact with both old and new block layer,
>  **  b) Fork and create new NN that works only with new block layer, the old NN will continue to work with old block layer.
> There are trade-offs but clearly the 2nd option has least impact on the old HDFS code.
> 
> Are you proposing that we pursue the 2nd option to integrate HDSL with HDFS?


Originally I would have preferred (a), but Owen made a strong case for (b) in my discussions with him last week.
Overall we need a broader discussion around the next steps for NN evolution and how to chart the course; I am not locked into any particular path or how we would do it.
Let me make a more detailed response in HDFS-10419.

sanjay


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay, thanks for the response, replying inline:

> - NN on top HDSL where the NN uses the new block layer (Both Daryn and Owen
> acknowledge the benefit of the new block layer).  We have two choices here
>  ** a) Evolve NN so that it can interact with both old and new block layer,
>  ** b) Fork and create new NN that works only with new block layer, the
> old NN will continue to work with old block layer.
> There are trade-offs but clearly the 2nd option has the least impact on the
> old HDFS code.

Are you proposing that we pursue the 2nd option to integrate HDSL with
HDFS?

> - Share the HDSL’s netty protocol engine with HDFS block layer.  After
> HDSL and Ozone have stabilized the engine, put the new netty engine in
> either HDFS or in Hadoop common - HDSL will use it from there. The HDFS
> community has been talking about moving to a better thread model for HDFS
> DNs since release 0.16!!

The Netty-based protocol engine seems like it could be contributed
separately from HDSL. I'd be interested to learn more about the performance
and other improvements from this new engine.

> - Shallow copy. Here HDSL needs a way to get the actual linux file system
> links - HDFS block layer needs to provide a private secure API to get file
> names of blocks so that HDSL can do a hard link (hence shallow copy).

Why isn't this possible with two processes? SCR for instance securely
passes file descriptors between the DN and client over a unix domain
socket. I'm sure we can construct a protocol that securely and efficiently
creates hardlinks.
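[For readers following along: the two OS mechanisms being debated here — a
hard link for zero-copy "shallow copy", and handing an open file descriptor
across a unix domain socket, which is how short-circuit read (SCR) works —
can both be sketched with the Python standard library. This is illustrative
only, not Hadoop code; the block file name is made up, and it assumes a Unix
system with Python 3.9+.]

```python
# Sketch (not Hadoop code) of the two mechanisms discussed above.
# Assumes Unix and Python 3.9+ (for socket.send_fds / recv_fds).
import os
import socket
import tempfile

# 1) "Shallow copy" via a hard link: the new name shares the same inode
#    (and therefore the same on-disk data) as the original; no bytes copied.
tmpdir = tempfile.mkdtemp()
block = os.path.join(tmpdir, "blk_1001")   # hypothetical block file name
with open(block, "wb") as f:
    f.write(b"block data")
copy = os.path.join(tmpdir, "blk_1001.hdsl")
os.link(block, copy)                        # O(1) regardless of block size
assert os.stat(copy).st_nlink == 2          # both names point at one inode

# 2) Securely passing an open file descriptor to another process over a
#    unix domain socket (the SCR-style mechanism). Here both "processes"
#    are just the two ends of a socketpair in one process.
dn_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
fd = os.open(block, os.O_RDONLY)
socket.send_fds(dn_side, [b"blk_1001"], [fd])        # "DN" hands over the fd
msg, fds, _, _ = socket.recv_fds(client_side, 1024, 1)
with os.fdopen(fds[0], "rb") as received:
    assert received.read() == b"block data"  # reader never opened the path
os.close(fd)
dn_side.close()
client_side.close()
```

A hardlink-granting protocol would layer authentication and path validation
on top of primitives like these; the point is only that neither primitive
requires the two sides to live in one daemon.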

It also sounds like this shallow copy won't work with features like HDFS
encryption or erasure coding, which diminishes its utility. We also don't
even have HDFS-to-HDFS shallow copy yet, so HDFS-to-Ozone shallow copy is
even further out.

Best,
Andrew

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay, thanks for the response, replying inline:

- NN on top HDSL where the NN uses the new block layer (Both Daryn and Owen
> acknowledge the benefit of the new block layer).  We have two choices here
>  ** a) Evolve NN so that it can interact with both old and new block layer,
>  **  b) Fork and create new NN that works only with new block layer, the
> old NN will continue to work with old block layer.
> There are trade-offs but clearly the 2nd option has least impact on the
> old HDFS code.
>
> Are you proposing that we pursue the 2nd option to integrate HDSL with
HDFS?


> - Share the HDSL’s netty  protocol engine with HDFS block layer.  After
> HDSL and Ozone has stabilized the engine, put the new netty engine in
> either HDFS or in Hadoop common - HDSL will use it from there. The HDFS
> community  has been talking about moving to better thread model for HDFS
> DNs since release 0.16!!
>
> The Netty-based protocol engine seems like it could be contributed
separately from HDSL. I'd be interested to learn more about the performance
and other improvements from this new engine.


> - Shallow copy. Here HDSL needs a way to get the actual linux file system
> links - HDFS block layer needs  to provide a private secure API to get file
> names of blocks so that HDSL can do a hard link (hence shallow copy)o
>

Why isn't this possible with two processes? SCR for instance securely
passes file descriptors between the DN and client over a unix domain
socket. I'm sure we can construct a protocol that securely and efficiently
creates hardlinks.

It also sounds like this shallow copy won't work with features like HDFS
encryption or erasure coding, which diminishes its utility. We also don't
even have HDFS-to-HDFS shallow copy yet, so HDFS-to-Ozone shallow copy is
even further out.

Best,
Andrew

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew
  Thanks for your response. 

 In this email let me focus on maintenance and unnecessary impact on HDFS.
Daryn also touched on this topic and looked at the code base from the developer-impact point of view. He appreciated that the code is separate, and I agree with his suggestion to move it further up the src tree (e.g. hadoop-hdsl-project or hadoop-hdfs-project/hadoop-hdsl). He also gave a good store-owner analogy: do not break things as you change and evolve the store. Let’s look at the areas of future interaction as examples.

- NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen acknowledge the benefit of the new block layer). We have two choices here:
 ** a) Evolve the NN so that it can interact with both the old and new block layers, or
 ** b) Fork and create a new NN that works only with the new block layer; the old NN will continue to work with the old block layer.
There are trade-offs, but clearly the 2nd option has the least impact on the old HDFS code.

- Share HDSL’s Netty protocol engine with the HDFS block layer. After HDSL and Ozone have stabilized the engine, put the new Netty engine in either HDFS or Hadoop common; HDSL will use it from there. The HDFS community has been talking about moving to a better thread model for HDFS DNs since release 0.16!!

- Shallow copy. Here HDSL needs a way to get at the actual Linux file system links; the HDFS block layer needs to provide a private, secure API to get the file names of blocks so that HDSL can create a hard link (hence a shallow copy).

The first 2 examples are beneficial to existing HDFS, and the maintenance burden can be minimized and is worth the benefits (2x NN scalability!! And a more efficient protocol engine). The 3rd is only beneficial to HDFS users who want the scalability of the new HDSL/Ozone code in a side-by-side system; here the cost is providing a private API to access the block file name.
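For concreteness, here is a minimal sketch of what a hard-link based shallow copy looks like at the file-system level, using Java NIO. The paths, class name, and directory layout are purely illustrative (not the real HDFS/HDSL on-disk format); the point is only that a hard link shares the inode, so no block data is copied.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch only: "shallow copy" of a block file into a
// hypothetical HDSL container directory via a hard link. The real
// systems would do this behind a private, secure API.
public class ShallowCopySketch {
    /** Hard-link an existing block file into a target directory. */
    static Path shallowCopy(Path blockFile, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path link = targetDir.resolve(blockFile.getFileName());
        // A hard link shares the same inode: no data is copied, and both
        // names stay valid until the last one is deleted.
        return Files.createLink(link, blockFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blocks");
        Path block = Files.write(dir.resolve("blk_1001"), "block-data".getBytes());
        Path copy = shallowCopy(block, dir.resolve("hdsl-container-42"));
        // Same bytes are visible through both names, with no duplication on disk.
        System.out.println(new String(Files.readAllBytes(copy)));
    }
}
```

This is why the two block layers must agree on a private API for block file names: the hard link can only be created by a process that knows the source path and runs with sufficient privilege on the same local file system.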


sanjay

> On Mar 1, 2018, at 11:03 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi Sanjay,
> 
> I have different opinions about what's important and how to eventually
> integrate this code, and that's not because I'm "conveniently ignoring"
> your responses. I'm also not making some of the arguments you claim I am
> making. Attacking arguments I'm not making is not going to change my mind,
> so let's bring it back to the arguments I am making.
> 
> Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
> near-term, and it comes with a maintenance cost.
> 
> I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
> integration does not necessarily require a lock split. However, there still
> needs to be refactoring to clearly define the FSN and BM interfaces and
> make the BM pluggable so HDSL can be swapped in. This is a major
> undertaking and risky. We did a similar refactoring in 2.x which made
> backports hard and introduced bugs. I don't think we should have done this
> in a minor release.
> 
> Furthermore, I don't know what your expectation is on how long it will take
> to stabilize HDSL, but this horizon for other storage systems is typically
> measured in years rather than months.
> 
> Both of these feel like Hadoop 4 items: a ways out yet.
> 
> Moving on, there is a non-trivial maintenance cost to having this new code
> in the code base. Ozone bugs become our bugs. Ozone dependencies become our
> dependencies. Ozone's security flaws are our security flaws. All of this
> negatively affects our already lumbering release schedule, and thus our
> ability to deliver and iterate on the features we're already trying to
> ship. Even if Ozone is separate and off by default, this is still a large
> amount of code that comes with a large maintenance cost. I don't want to
> incur this cost when the benefit is still a ways out.
> 
> We disagree on the necessity of sharing a repo and sharing operational
> behaviors. Libraries exist as a method for sharing code. HDFS also hardly
> has a monopoly on intermediating storage today. Disks are shared with MR
> shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
> we've made this work. Having Ozone/HDSL in a separate process can even be
> seen as an operational advantage since it's isolated. I firmly believe that
> we can solve any implementation issues even with separate processes.
> 
> This is why I asked about making this a separate project. Given that these
> two efforts (HDSL stabilization and NN refactoring) are a ways out, the
> best way to get Ozone/HDSL in the hands of users today is to release it as
> its own project. Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.
> 
> I'm excited about the possibilities of both HDSL and the NN refactoring in
> ensuring a future for HDFS for years to come. A pluggable block manager
> would also let us experiment with things like HDFS-on-S3, increasingly
> important in a cloud-centric world. CBlock would bring HDFS to new usecases
> around generic container workloads. However, given the timeline for
> completing these efforts, now is not the time to merge.
> 
> Best,
> Andrew
> 
> On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:
> 
>> I’m generally neutral and looked foremost at developer impact.  Ie.  Will
>> it be so intertwined with hdfs that each project risks destabilizing the
>> other?  Will developers with no expertise in ozone be impeded?  I
>> think the answer is currently no.  These are the intersections and some
>> concerns based on the assumption ozone is accepted into the project:
>> 
>> 
>> Common
>> 
>> There appear to be a number of superfluous changes.  The conf servlet must not be
>> polluted with specific references and logic for ozone.  We don’t create
>> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
>> “ozone free”.
>> 
>> 
>> Datanode
>> 
>> I expected ozone changes to be intricately linked with the existing blocks
>> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
>> service, the DN should not be polluted with specific references to ozone.
>> If ozone is in the project, the DN should have a generic plugin interface
>> conceptually similar to the NM aux services.
>> 
>> 
>> Namenode
>> 
>> No impact, currently, but certainly will be…
>> 
>> 
>> Code Location
>> 
>> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
>> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
>> hadoop-hdsl-project.  This clean separation will make it easier to later
>> spin off or pull in depending on which way we vote.
>> 
>> 
>> Dependencies
>> 
>> Owen hit upon this before I could send.  Hadoop is already bursting with
>> dependencies, I hope this doesn’t pull in a lot more.
>> 
>> 
>> ––
>> 
>> 
>> Do I think ozone should be a separate project?  If we view it only as a
>> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
>> step with near-term benefits, no, we want to keep it close and help it
>> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
>> umbrella term for too many technologies that should perhaps be split.  I'm
>> interested in the container block management.  I have little interest at
>> this time in the key store.
>> 
>> 
>> The usability of ozone, specifically container management, is unclear to
>> me.  It lacks basic features like changing replication factors, append, a
>> migration path, security, etc - I know there are good plans for all of it -
>> yet another goal is splicing into the NN.  That’s a lot of high priority
>> items to tackle that need to be carefully orchestrated before contemplating
>> BM replacement.  Each of those is a non-starter for (my) production
>> environment.  We need to make sure we can reach a consensus on the block
>> level functionality before rushing it into the NN.  That’s independent of
>> whether allowing it into the project.
>> 
>> 
>> The BM/SCM changes to the NN are realistically going to be contentious &
>> destabilizing.  If done correctly, the BM separation will be a big win for
>> the NN.  If ozone is out, by necessity interfaces will need to be stable
>> and well-defined but we won’t get that right for a long time.  Interface
>> and logic changes that break the other will be difficult to coordinate and
>> we’ll likely veto changes that impact the other.  If ozone is in, we can
>> hopefully synchronize the changes with less friction, but it greatly
>> increases the chances of developers riddling the NN with hacks and/or ozone
>> specific logic that makes it even more brittle.  I will note we need to be
>> vigilant against pervasive conditionals (ie. EC, snapshots).
>> 
>> 
>> In either case, I think ozone must agree to not impede current hdfs work.
>> I’ll compare it to this: hdfs is a store owner that plans to maybe retire in 5
>> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
>> them no-rent space (the DN).  Precondition is help improve the store.
>> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
>> that complicate hdfs but ignore it due to anticipation of its
>> departure/demise.  I’m not implying that’s currently happening, it’s just
>> what I don’t want to see.
>> 
>> 
>> We as a community and our customers need an evolution, not a revolution,
>> and definitively not a civil war.  Hdfs has too much legacy code rot that
>> is hard to change.  Too many poorly implemented features.   Perhaps I’m
>> overly optimistic that freshly redesigned code can counterbalance
>> performance degradations in the NN.  I’m also reluctant, but realize it is
>> being driven by some hdfs veterans that know/understand historical hdfs
>> design strengths and flaws.
>> 
>> 
>> If the initially cited issues are addressed, I’m +0.5 for the concept of
>> bringing in ozone if it's not going to be a proverbial bull in the china
>> shop.
>> 
>> 
>> Daryn
>> 
>> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
>>> 
>> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>> is
>>> a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>> layer
>>> as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/
>>> Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread.
>> We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability given ozone
>> brings
>>> in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>> detailed
>>> explanations and answers on the jira as well as in the discussions. We
>>> believe that should sufficiently address community’s concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the
>> new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone
>> is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed only milestone 2, and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and a
>>> simpler solution would be useful
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This
>> will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old
>> hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if
>> configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the
>> old
>>> HDFS continues to remain stable (for example, we plan to stabilize the
>> new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual
>> production
>>> use till the new system has feature parity with old HDFS. During this
>> time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>> place
>>> to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>> capacity
>>> across disks - this needs to be done for both the current HDFS blocks and
>>> the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance and common management of the bandwidth
>> used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather than a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security settings and coordination across daemons. Shallow copy is
>>> useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>> possible
>>> (anything is possible in software),  it is significantly harder typically
>>> requiring  cleaner public apis etc. Sharing within a project though
>>> internal APIs is often simpler (such as the protocol engine that we want
>> to
>>> share).
>>> 
>>> 
>>>                5) Security design, including a threat model and the
>>> solution has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two
>> code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now.
>> We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merge the new code back into HDFS later will be
>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>> same
>>> then.
>>> 
>>> 
>>>                ------------------------------
>>> ---------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.
>>> apache.org
>>>                For additional commands, e-mail:
>>> hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> 
>> Daryn
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org


>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread.
>> We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were the following:
>>>                1) How do the new block storage layer and OzoneFS benefit
>>> HDFS? We were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer.
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability, given that ozone
>> brings
>>> in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>> detailed
>>> explanations and answers on the jira as well as in the discussions. We
>>> believe that should sufficiently address the community’s concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the
>> new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone
>> is
>>> much simpler than the 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed only milestone 2, and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and that a
>>> simpler solution would be useful.
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This
>> will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old
>> hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if
>> configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the
>> old
>>> HDFS continues to remain stable. (For example, we plan to stabilize the
>> new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual
>> production
>>> use till the new system has feature parity with old HDFS. During this
>> time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>> place
>>> to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>> capacity
>>> across disks - this needs to be done for both the current HDFS blocks and
>>> the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance and common management of the bandwidth
>> used
>>> for balancing
>>>                  - Security configuration setup - reuse the existing setup
>>> for DNs rather than a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security settings and coordination across daemons. Shallow copy is
>>> useful as customers migrate from the old system to the new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>> possible
>>> (anything is possible in software), it is significantly harder, typically
>>> requiring cleaner public APIs etc. Sharing within a project through
>>> internal APIs is often simpler (such as the protocol engine that we want
>> to
>>> share).
>>> 
>>> 
>>>                5) Security design, including a threat model and the
>>> solution, has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two
>> code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now.
>> We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merging the new code back into HDFS later will be
>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>> same
>>> then.
>>> 
>>> 
>>>                ------------------------------
>>> ---------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.
>>> apache.org
>>>                For additional commands, e-mail:
>>> hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> 
>> Daryn
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Joep,  You raise a number of points:

(1) Ozone vs. other object stores: “Some users would choose Ozone as that layer, some might use S3, others GCS, or Azure, or something else”.
(2) How HDSL/Ozone fits into Hadoop and whether it is necessary.
(3) You raise the release issue, to which we will respond in a separate email.

Let me respond to 1 & 2:
***Wrt to (1) Ozone vs other object stores***
Neither HDFS nor Ozone has any real role in the cloud except for temp data. The cost of local disk or EBS is so high that long-term data storage on HDFS or even Ozone is prohibitive.
So why the hell create the KV namespace? Because we need to stabilize HDSL, where the data is stored - we are targeting Hive and Spark apps to stabilize HDSL by running real Hadoop apps over OzoneFS.
But HDSL/Ozone is not feature-compatible with HDFS, so how will users exercise it enough to stabilize it? Users can run HDFS and Ozone side by side in the same cluster with two namespaces (just as in Federation) and run apps on both: some Hive and Spark apps on Ozone, and others that need full HDFS features (e.g. encryption) on HDFS. As it becomes stable, they can start using HDSL/Ozone in production for a portion of their data.



***Wrt to (2) HDSL/Ozone fitting into Hadoop and why the same repository***
Ozone KV is a temporary step. The real goal is to put the NN on top of HDSL; we have shown how to do that in the roadmap that Konstantine and Chris D asked for. Milestone 1 is feasible and doesn't require removal of the FSN lock. We have also shown several cases of sharing other code in the future (the protocol engine). This co-development will be easier in the same repo. Over time, HDSL + the ported NN will create a new HDFS and become feature-compatible - some features will come for free because they live in the NN and will port over to the new NN; some are in the block layer (erasure coding) and will have to be added to HDSL.

--- You compare with Yarn, HDFS and Common. HDFS and Yarn are independent, but both depend on Hadoop Common (e.g. HBase runs on HDFS without Yarn). HDSL and Ozone will depend on Hadoop Common; indeed, the new protocol engine of HDSL might move to Hadoop Common or HDFS. We have made sure that there are currently no dependencies of HDFS on HDSL or Ozone.


***The Repo issue and conclusion***
The HDFS community will need to work together as we evolve old HDFS to use HDSL, the new protocol engine, and Raft, and together evolve to a newer, more powerful set of sub-components. It is important that they be in the same repo so that we can share code through private interfaces. We are not trying to build a competing object store but to improve HDFS; fixing scalability fundamentally is hard, and we are asking for an environment where that can happen easily over the next year while heeding the stability concerns of HDFS developers (e.g. we removed the compile-time dependency and added a Maven profile). This work is not being done by members of a foreign project trying to insert code into Hadoop, but by Hadoop/HDFS developers with proven track records and active participation in Hadoop and HDFS. Our jobs depend on HDFS/Hadoop stability - destabilizing it is the last thing we want to do; we have responded to every piece of constructive feedback.


sanjay


> On Mar 6, 2018, at 6:50 PM, J. Rottinghuis <jr...@gmail.com> wrote:
> 
> Sorry for jumping in late into the fray of this discussion.
> 
> It seems Ozone is a large feature. I appreciate the development effort and
> the desire to get this into the hands of users.
> I understand the need to iterate quickly and to reduce overhead for
> development.
> I also agree that Hadoop can benefit from a quicker release cycle. For our
> part, this is a challenge as we have a large installation with multiple
> clusters and thousands of users. It is a constant balance between jumping
> to the newest release and the cost of this integration and test at our
> scale, especially when things aren't backwards compatible. We try to be
> good citizens and upstream our changes and contribute back.
> 
> The point was made that splitting the projects such as common and Yarn
> didn't work and had to be reverted. That was painful and a lot of work for
> those involved for sure. This project may be slightly different in that
> hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
> run a project without the other.
> 
> Having a separate block management layer with possibly multiple block
> implementation as pluggable under the covers would be a good future
> development for HDFS. Some users would choose Ozone as that layer, some
> might use S3, others GCS, or Azure, or something else.
> If the argument is made that nobody will be able to run Hadoop as a
> consistent stack without Ozone, then that would be a strong case to keep
> things in the same repo.
> 
> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases. If there is a change in Ozone
> and a new release needed, it would have to wait for a Hadoop release. Ditto
> if there is a Hadoop release and there is an issue with Ozone. The case
> that one could turn off Ozone through a Maven profile works only to some
> extent.
> If we have done a 3.x release with Ozone in it, would it make sense to do a
> 3.y release with y>x without Ozone in it? That would be weird.
> 
> This does sound like a Hadoop 4 feature. Compatibility with lots of new
> features in Hadoop 3 need to be worked out. We're still working on jumping
> to a Hadoop 2.9 release and then working on getting a step-store release to
> 3.0 to bridge compatibility issues. I'm afraid that adding a very large new
> feature into trunk now, essentially makes going to Hadoop 3 not viable for
> quite a while. That would be a bummer for all the feature work that has
> gone into Hadoop 3. Encryption and erasure encoding are very appealing
> features, especially in light of meeting GDPR requirements.
> 
> I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
> in and keep the rest in a separate project. Iterate quickly in that
> separate project, you can have a separate set of committers, you can do
> separate release cycle. If that develops Ozone into _the_ new block layer
> for all use cases (even when people want to give up on encryption, erasure
> encoding, or feature parity is reached) then we can cross that bridge
> when we reach it. I think adding a very large chunk of code that relatively
> few people in the community are familiar with isn't necessarily going to
> help Hadoop at this time.
> 
> Cheers,
> 
> Joep
> 
> On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
> wrote:
> 
>> Hi Andrew,
>> 
>> I think we can eliminate the maintenance costs even in the same repo. We
>> can make following changes that incorporate suggestions from Daryn and Owen
>> as well.
>> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
>> directory.
>> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
>> 3. Based on Daryn’s suggestion, HDSL can optionally (via config) be
>> loaded in the DN as a pluggable module.
>>     If not loaded, there will be absolutely no code path through hdsl or
>> ozone.
>> 4. To further make it easier for folks building hadoop, we can support a
>> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
>> not be built.
>>     For example, Cloudera can choose to skip even compiling/building
>> hdsl/ozone and therefore no maintenance overhead whatsoever.
>>     HADOOP-14453 has a patch that shows how it can be done.
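To make point 3 concrete, config-gated loading of an optional module could look roughly like the sketch below. This is only an illustration of the pattern: the property name (`dfs.datanode.plugins`) and the `PluginLoader` class are hypothetical, not the actual HDSL/HADOOP-14453 patch.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical sketch of config-gated plugin loading in a DataNode-like
// daemon. The property key and class names are illustrative only.
public class PluginLoader {
    static final String PLUGIN_KEY = "dfs.datanode.plugins"; // hypothetical key

    // Returns the configured plugin, or null when the key is unset.
    // When unset (the default), no plugin code path is ever touched.
    public static Runnable loadOptionalPlugin(Properties conf) throws Exception {
        String className = conf.getProperty(PLUGIN_KEY);
        if (className == null || className.isEmpty()) {
            return null; // module disabled: nothing loaded, nothing run
        }
        // Reflection keeps the daemon free of compile-time dependencies
        // on the optional module.
        Constructor<?> ctor = Class.forName(className).getDeclaredConstructor();
        return (Runnable) ctor.newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties(); // key unset by default
        System.out.println(loadOptionalPlugin(conf) == null
                ? "plugin disabled" : "plugin loaded");
    }
}
```

With the key unset the daemon behaves exactly as before (the sketch prints "plugin disabled"); setting it to a class name on the classpath activates the module, which is the spirit of Daryn's suggestion.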
>> 
>> Arguably, there are two kinds of maintenance costs. Costs for developers
>> and the cost for users.
>> - Developers: A Maven profile, as noted in points (3) and (4) above,
>> completely addresses the concern for developers
>>                                 as there are no compile time dependencies
>> and further, they can choose not to build ozone/hdsl.
>> - User: Cost to users will be completely alleviated if ozone/hdsl is not
>> loaded as mentioned in point (3) above.
>> 
>> jitendra
>> 
>> From: Andrew Wang <an...@cloudera.com>
>> Date: Monday, March 5, 2018 at 3:54 PM
>> To: Wangda Tan <wh...@gmail.com>
>> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
>> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
>> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
>> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
>> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
>> mapreduce-dev@hadoop.apache.org>
>> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>> 
>> Hi Owen, Wangda,
>> 
>> Thanks for clearly laying out the subproject options, that helps the
>> discussion.
>> 
>> I'm all onboard with the idea of regular releases, and it's something I
>> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
>> of commitment from feature developers like Sanjay or Jitendra; far from it!
>> I think every feature developer makes a reasonable effort to test their
>> code before it's merged. Yet, my experience as an RM is that more code
>> comes with more risk. I don't believe that Ozone is special or different in
>> this regard. It comes with a maintenance cost, not a maintenance benefit.
>> 
>> 
>> I'm advocating for #3: separate source, separate release. Since HDSL
>> stability and FSN/BM refactoring are still a ways out, I don't want to
>> incur a maintenance cost now. I sympathize with the sentiment that working
>> cross-repo is harder than within same repo, but the right tooling can make
>> this a lot easier (e.g. git submodule, Google's repo tool). We have
>> experience doing this internally here at Cloudera, and I'm happy to share
>> knowledge and possibly code.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
>> I like the idea of same source / same release and put Ozone's source under
>> a different directory.
>> 
>> Like Owen mentioned, it's going to be important for all parties to keep a
>> regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
>> releases. Users can try features and give feedback to stabilize features
>> earlier; developers can be happier since efforts will be consumed by users
>> soon after features get merged. In addition to this, if features merged to
>> trunk after reasonable tests/review, Andrew's concern may not be a problem
>> anymore:
>> 
>> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
>> being a separate project. Ozone could release faster and iterate more
>> quickly if it wasn't hampered by Hadoop's release schedule and security and
>> compatibility requirements.
>> 
>> Thanks,
>> Wangda
>> 
>> 
>> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
>> wrote:
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>> 
>> Owen mentioned making a Hadoop subproject; we'd have to
>>> hash out what exactly this means (I assume a separate repo still managed
>> by
>>> the Hadoop project), but I think we could make this work if it's more
>>> attractive than incubation or a new TLP.
>> 
>> 
>> Ok, there are multiple levels of sub-projects that all make sense:
>> 
>>   - Same source tree, same releases - examples like HDFS & YARN
>>   - Same master branch, separate releases and release branches - Hive's
>>   Storage API vs Hive. It is in the source tree for the master branch, but
>>   has distinct releases and release branches.
>>   - Separate source, separate release - Apache Commons.
>> 
>> There are advantages and disadvantages to each. I'd propose that we use the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a top
>> level directory in Hadoop and making sure that nothing in Common, HDFS, or
>> YARN depends on it. That way, if a Release Manager doesn't think it is ready
>> for release, it can be trivially removed before the release.
>> 
>> One thing about using the same releases, Sanjay and Jitendra are signing up
>> to make much more regular bugfix and minor releases in the near future. For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big bang releases.
>> 
>> .. Owen
>> 
>> 
>> 
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Joep,  You raise a number of points:

(1) Ozone vs and object stores. “Some users would choose Ozone as that layer, some might use S3, others GCS, or Azure, or something else”.
(2) How HDSL/Ozone fits into Hadoop and whether it is necessary.
(3) You raise the release issue which we will respond in a separate email.

Let me respond to 1 & 2:
***Wrt to (1) Ozone vs other object stores***
Neither HDFS or Ozone has any real role in cloud except for temp data. The cost of local disk or EBS is so high that long term data storage on HDFS or even Ozone is prohibitive.
So why the hell create the KV namespace? We need to stabilize the HDSL where data is stored.  - We are targeting Hive and SPark apps to stabilize HDSL using real Hadoop apps over OzoneFS.
But HDSL/Ozone is not feature compatible with HDFS so how will users even use it for real to stability. Users can run HDFS and Ozone side by side in same cluster and have two namespace (just like in Federation) and run apps on both: run some hive and spark apps on Ozone and others that need full HDFS feature (encryption) on HDFS. As it becomes stable they can start using HDSL/Ozone for production use for a portion of their data.



***Wrt to (2) HDSL/Ozone fitting into Hadoop and why the same repository***
Ozone KV is a temporary step. Real goal is to put NN on top of HDSL, We have shown how to do that in the roadmap that Konstantine and Chris D asked. Milestone 1 is feasible and doesn't require removal of FSN lock. We have also shown several cases of sharing other code in future (protocol engine). This co-development will be easier if in the same repo. Over time HDSL + ported NN  will create a new HDFS and become feature compatible - some of the feature will come for free because they are in NN and will port over to the new NN, Some are in block layer (erasure code) and will have to be added to HDSL.

--- You compare with Yarn, HDFS and Common. HDFS and Yarn are independent but both depend on Hadoop common (e.g. HBase runs on HDFS without Yarn).   HDSL and Ozone will depend on Hadoop common, Indeed the new protocol engine of HDSL might move to Hadoop common or HDFS. We have made sure that there are no dependencies of HDFS on HDSL or currently.


***The Repo issue and conclusion***
HDFS community will need to work together as we evolve old HDFS to use HDSL, new protocol engine and Raft. and together evolve to a newer more powerful set of sub components. It is important that they are in same repo and that we can share code through both private interface. We are not trying to build a competing Object store but to improve HDFS and fixing scalability fundamentally is hard and we are asking for an environment for that to happen easily over the next year while heeding to the stability concerns of HDFS developers (eg we  remove compile time dependency, maven profile). This work is not being done by members of foreign project trying to insert code in Hadoop, but by Hadoop/HDFS developers with given track record s and are active participation in Hadoop and HDFS. Our jobs depend on HDFS/Hadoop stability - destabilizing is the last thing we want to do; we have responded every constructive feedback 


sanjay


> On Mar 6, 2018, at 6:50 PM, J. Rottinghuis <jr...@gmail.com> wrote:
> 
> Sorry for jumping in late into the fray of this discussion.
> 
> It seems Ozone is a large feature. I appreciate the development effort and
> the desire to get this into the hands of users.
> I understand the need to iterate quickly and to reduce overhead for
> development.
> I also agree that Hadoop can benefit from a quicker release cycle. For our
> part, this is a challenge as we have a large installation with multiple
> clusters and thousands of users. It is a constant balance between jumping
> to the newest release and the cost of this integration and test at our
> scale, especially when things aren't backwards compatible. We try to be
> good citizens and upstream our changes and contribute back.
> 
> The point was made that splitting the projects such as common and Yarn
> didn't work and had to be reverted. That was painful and a lot of work for
> those involved for sure. This project may be slightly different in that
> hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
> run a project without the other.
> 
> Having a separate block management layer with possibly multiple block
> implementation as pluggable under the covers would be a good future
> development for HDFS. Some users would choose Ozone as that layer, some
> might use S3, others GCS, or Azure, or something else.
> If the argument is made that nobody will be able to run Hadoop as a
> consistent stack without Ozone, then that would be a strong case to keep
> things in the same repo.
> 
> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases. If there is a change in Ozone
> and a new release needed, it would have to wait for a Hadoop release. Ditto
> if there is a Hadoop release and there is an issue with Ozone. The case
> that one could turn off Ozone through a Maven profile works only to some
> extend.
> If we have done a 3.x release with Ozone in it, would it make sense to do a
> 3.y release with y>x without Ozone in it? That would be weird.
> 
> This does sound like a Hadoop 4 feature. Compatibility with lots of new
> features in Hadoop 3 need to be worked out. We're still working on jumping
> to a Hadoop 2.9 release and then working on getting a step-store release to
> 3.0 to bridge compatibility issues. I'm afraid that adding a very large new
> feature into trunk now, essentially makes going to Hadoop 3 not viable for
> quite a while. That would be a bummer for all the feature work that has
> gone into Hadoop 3. Encryption and erasure encoding are very appealing
> features, especially in light of meeting GDPR requirements.
> 
> I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
> in and keep the rest in a separate project. Iterate quickly in that
> separate project, you can have a separate set of committers, you can do
> separate release cycle. If that develops Ozone into _the_ new block layer
> for all use cases (even when people want to give up on encryption, erasure
> encoding, or feature parity is reached) then we can jump of that bridge
> when we reach it. I think adding a very large chunk of code that relatively
> few people in the community are familiar with isn't necessarily going to
> help Hadoop at this time.
> 
> Cheers,
> 
> Joep
> 
> On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
> wrote:
> 
>> Hi Andrew,
>> 
>> I think we can eliminate the maintenance costs even in the same repo. We
>> can make following changes that incorporate suggestions from Daryn and Owen
>> as well.
>> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
>> directory.
>> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
>> 3. Based on Daryn’s suggestion, the Hdsl can be optionally (via config) be
>> loaded in DN as a pluggable module.
>>     If not loaded, there will be absolutely no code path through hdsl or
>> ozone.
>> 4. To further make it easier for folks building hadoop, we can support a
>> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
>> not be built.
>>     For example, Cloudera can choose to skip even compiling/building
>> hdsl/ozone and therefore no maintenance overhead whatsoever.
>>     HADOOP-14453 has a patch that shows how it can be done.
>> 
>> Arguably, there are two kinds of maintenance costs. Costs for developers
>> and the cost for users.
>> - Developers: A maven profile as noted in point (3) and (4) above
>> completely addresses the concern for developers
>>                                 as there are no compile time dependencies
>> and further, they can choose not to build ozone/hdsl.
>> - User: Cost to users will be completely alleviated if ozone/hdsl is not
>> loaded as mentioned in point (3) above.
>> 
>> jitendra
>> 
>> From: Andrew Wang <an...@cloudera.com>
>> Date: Monday, March 5, 2018 at 3:54 PM
>> To: Wangda Tan <wh...@gmail.com>
>> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
>> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
>> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
>> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
>> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
>> mapreduce-dev@hadoop.apache.org>
>> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>> 
>> Hi Owen, Wangda,
>> 
>> Thanks for clearly laying out the subproject options, that helps the
>> discussion.
>> 
>> I'm all onboard with the idea of regular releases, and it's something I
>> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
>> of commitment from feature developers like Sanjay or Jitendra; far from it!
>> I think every feature developer makes a reasonable effort to test their
>> code before it's merged. Yet, my experience as an RM is that more code
>> comes with more risk. I don't believe that Ozone is special or different in
>> this regard. It comes with a maintenance cost, not a maintenance benefit.
>> 
>> 
>> I'm advocating for #3: separate source, separate release. Since HDSL
>> stability and FSN/BM refactoring are still a ways out, I don't want to
>> incur a maintenance cost now. I sympathize with the sentiment that working
>> cross-repo is harder than within same repo, but the right tooling can make
>> this a lot easier (e.g. git submodule, Google's repo tool). We have
>> experience doing this internally here at Cloudera, and I'm happy to share
>> knowledge and possibly code.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
>> I like the idea of same source / same release and put Ozone's source under
>> a different directory.
>> 
>> Like Owen mentioned, it's going to be important for all parties to keep a
>> regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
>> releases. Users can try features and give feedback to stabilize them
>> earlier; developers can be happier since their efforts reach users soon
>> after features get merged. In addition to this, if features are merged to
>> trunk after reasonable tests/review, Andrew's concern may not be a problem
>> anymore:
>> 
>> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
>> being a separate project. Ozone could release faster and iterate more
>> quickly if it wasn't hampered by Hadoop's release schedule and security and
>> compatibility requirements.
>> 
>> Thanks,
>> Wangda
>> 
>> 
>> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
>> wrote:
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>> 
>> Owen mentioned making a Hadoop subproject; we'd have to
>>> hash out what exactly this means (I assume a separate repo still managed
>> by
>>> the Hadoop project), but I think we could make this work if it's more
>>> attractive than incubation or a new TLP.
>> 
>> 
>> Ok, there are multiple levels of sub-projects that all make sense:
>> 
>>   - Same source tree, same releases - examples like HDFS & YARN
>>   - Same master branch, separate releases and release branches - Hive's
>>   Storage API vs Hive. It is in the source tree for the master branch, but
>>   has distinct releases and release branches.
>>   - Separate source, separate release - Apache Commons.
>> 
>> There are advantages and disadvantages to each. I'd propose that we use the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a top
>> level directory in Hadoop and making sure that nothing in Common, HDFS, or
>> YARN depend on it. That way if a Release Manager doesn't think it is ready
>> for release, it can be trivially removed before the release.
>> 
>> One thing about using the same releases, Sanjay and Jitendra are signing up
>> to make much more regular bugfix and minor releases in the near future. For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big bang releases.
>> 
>> .. Owen
>> 
>> 
>> 
>> 
>> 
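[Editor's note] The git submodule approach Andrew mentions above would, at the repo level, come down to a .gitmodules entry in the Hadoop tree pinning the HDSL repository at a specific commit. A minimal sketch; the URL and path here are hypothetical, not actual ASF repositories:

```ini
# .gitmodules -- pins a separate HDSL repo inside the Hadoop checkout
[submodule "hadoop-hdsl"]
	path = hadoop-hdsl
	url = https://git.example.org/hadoop-hdsl.git
```

A `git clone --recurse-submodules` of the parent repo then fetches HDSL at the recorded revision, keeping the two histories separate while preserving a single working tree.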


---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Anu Engineer <ae...@hortonworks.com>.
Hi Owen,

   Thanks for the proposal. I was hoping for the same releases, but I am okay with different releases as well.
   @Konstantin, I am completely open to the name changes; let us discuss that in HDFS-10419
   and we can make the corresponding change.

--Anu


On 3/19/18, 10:52 AM, "Owen O'Malley" <ow...@gmail.com> wrote:

    Andrew and Daryn,
       Do you have any feedback on the proposal? Otherwise, we can start a vote
    for "adoption of new codebase" tomorrow.
    
    .. Owen
    
    On Wed, Mar 14, 2018 at 1:50 PM, Owen O'Malley <ow...@gmail.com>
    wrote:
    
    > This discussion seems to have died down, coming closer to consensus without a
    > resolution.
    >
    > I'd like to propose the following compromise:
    >
    > * HDSL become a subproject of Hadoop.
    > * HDSL will release separately from Hadoop. Hadoop releases will not
    > contain HDSL and vice versa.
    > * HDSL will get its own jira instance so that the release tags stay
    > separate.
    > * On trunk (as opposed to release branches) HDSL will be a separate module
    > in Hadoop's source tree. This will enable the HDSL to work on their trunk
    > and the Hadoop trunk without making releases for every change.
    > * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
    > * When Hadoop creates a release branch, the RM will delete the HDSL module
    > from the branch.
    > * HDSL will have their own Yetus checks and won't cause failures in the
    > Hadoop patch check.
    >
    > I think this accomplishes most of the goals of encouraging HDSL
    > development while minimizing the potential for disruption of HDFS
    > development.
    >
    > Thoughts? Andrew, Jitendra, & Sanjay?
    >
    > Thanks,
    >    Owen
    >
    



Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
Andrew and Daryn,
   Do you have any feedback on the proposal? Otherwise, we can start a vote
for "adoption of new codebase" tomorrow.

.. Owen

On Wed, Mar 14, 2018 at 1:50 PM, Owen O'Malley <ow...@gmail.com>
wrote:

> This discussion seems to have died down, coming closer to consensus without a
> resolution.
>
> I'd like to propose the following compromise:
>
> * HDSL become a subproject of Hadoop.
> * HDSL will release separately from Hadoop. Hadoop releases will not
> contain HDSL and vice versa.
> * HDSL will get its own jira instance so that the release tags stay
> separate.
> * On trunk (as opposed to release branches) HDSL will be a separate module
> in Hadoop's source tree. This will enable the HDSL to work on their trunk
> and the Hadoop trunk without making releases for every change.
> * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
> * When Hadoop creates a release branch, the RM will delete the HDSL module
> from the branch.
> * HDSL will have their own Yetus checks and won't cause failures in the
> Hadoop patch check.
>
> I think this accomplishes most of the goals of encouraging HDSL
> development while minimizing the potential for disruption of HDFS
> development.
>
> Thoughts? Andrew, Jitendra, & Sanjay?
>
> Thanks,
>    Owen
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Konstantin Shvachko <sh...@gmail.com>.
The proposal to add it as a subproject of Hadoop makes sense to me. Thank
you Owen.
I am glad to have a path for scaling HDFS further, especially as it enters
areas like IoT and self-driving cars, where storage requirements are huge.

I am not very fond of the name HDSL, though. "Storage Layer" sounds too
generic.
Maybe something more descriptive, like HDDS / HDSS (Hadoop Dynamically
Distributed/Scaling Storage).
We can discuss this in the jira HDFS-10419.

Thanks,
--Konstantin

On Fri, Mar 16, 2018 at 3:17 PM, Sanjay Radia <sa...@hortonworks.com>
wrote:

> Owen,
> Thanks for your proposal.
> While I would have preferred to have HDSL in HDFS and also to be part
> of Hadoop releases, for the reasons stated earlier in this thread,
> I am willing to accept your proposal as a compromise to move this forward.
>
> Jitendra, Anu, Daryn, Andrew, Konstantin, your thoughts?
>
> Thanks
>
> Sanjay
>
>
> On Mar 14, 2018, at 1:50 PM, Owen O'Malley <owen.omalley@gmail.com> wrote:
>
> This discussion seems to have died down, coming closer to consensus without a
> resolution.
>
> I'd like to propose the following compromise:
>
> * HDSL become a subproject of Hadoop.
> * HDSL will release separately from Hadoop. Hadoop releases will not
> contain HDSL and vice versa.
> * HDSL will get its own jira instance so that the release tags stay
> separate.
> * On trunk (as opposed to release branches) HDSL will be a separate module
> in Hadoop's source tree. This will enable the HDSL to work on their trunk
> and the Hadoop trunk without making releases for every change.
> * Hadoop's trunk will only build HDSL if a non-default profile is enabled.
> * When Hadoop creates a release branch, the RM will delete the HDSL module
> from the branch.
> * HDSL will have their own Yetus checks and won't cause failures in the
> Hadoop patch check.
>
> I think this accomplishes most of the goals of encouraging HDSL development
> while minimizing the potential for disruption of HDFS development.
>
> Thoughts? Andrew, Jitendra, & Sanjay?
>
> Thanks,
>   Owen
>
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Sanjay Radia <sa...@hortonworks.com>.
Owen,
Thanks for your proposal.
While I would have preferred to have HDSL in HDFS and also to be part
of Hadoop releases, for the reasons stated earlier in this thread,
I am willing to accept your proposal as a compromise to move this forward.

Jitendra, Anu, Daryn, Andrew, Konstantin, your thoughts?

Thanks

Sanjay


On Mar 14, 2018, at 1:50 PM, Owen O'Malley <ow...@gmail.com> wrote:

This discussion seems to have died down, coming closer to consensus without a
resolution.

I'd like to propose the following compromise:

* HDSL become a subproject of Hadoop.
* HDSL will release separately from Hadoop. Hadoop releases will not
contain HDSL and vice versa.
* HDSL will get its own jira instance so that the release tags stay
separate.
* On trunk (as opposed to release branches) HDSL will be a separate module
in Hadoop's source tree. This will enable the HDSL to work on their trunk
and the Hadoop trunk without making releases for every change.
* Hadoop's trunk will only build HDSL if a non-default profile is enabled.
* When Hadoop creates a release branch, the RM will delete the HDSL module
from the branch.
* HDSL will have their own Yetus checks and won't cause failures in the
Hadoop patch check.

I think this accomplishes most of the goals of encouraging HDSL development
while minimizing the potential for disruption of HDFS development.

Thoughts? Andrew, Jitendra, & Sanjay?

Thanks,
  Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
This discussion seems to have died down, coming closer to consensus without a
resolution.

I'd like to propose the following compromise:

* HDSL become a subproject of Hadoop.
* HDSL will release separately from Hadoop. Hadoop releases will not
contain HDSL and vice versa.
* HDSL will get its own jira instance so that the release tags stay
separate.
* On trunk (as opposed to release branches) HDSL will be a separate module
in Hadoop's source tree. This will enable the HDSL to work on their trunk
and the Hadoop trunk without making releases for every change.
* Hadoop's trunk will only build HDSL if a non-default profile is enabled.
* When Hadoop creates a release branch, the RM will delete the HDSL module
from the branch.
* HDSL will have their own Yetus checks and won't cause failures in the
Hadoop patch check.

I think this accomplishes most of the goals of encouraging HDSL development
while minimizing the potential for disruption of HDFS development.

Thoughts? Andrew, Jitendra, & Sanjay?

Thanks,
   Owen
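
[Editor's note] The "non-default profile" item in the proposal above could be realized with an inactive Maven profile in the root pom.xml, so that the HDSL module is skipped unless explicitly requested. A sketch under that assumption; the profile id and module name are hypothetical:

```xml
<!-- Root pom.xml fragment: the hdsl profile has no activation element,
     so trunk builds skip the module unless -Phdsl is passed. -->
<profiles>
  <profile>
    <id>hdsl</id>
    <modules>
      <module>hadoop-hdsl</module>
    </modules>
  </profile>
</profiles>
```

Developers working on HDSL would build with `mvn install -Phdsl`; a release manager cutting a Hadoop release would simply never enable the profile.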


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
Hi Joep,

On Tue, Mar 6, 2018 at 6:50 PM, J. Rottinghuis <jr...@gmail.com>
wrote:

Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases.
>

Apache projects are about the group of people who are working together.
There is a large overlap between the team working on HDFS and Ozone, which
is a lot of the motivation to keep project overhead to a minimum and not
start a new project.

Using the same releases or separate releases is a distinct choice. Many
Apache projects, such as Common and Maven, have multiple artifacts that
release independently. In Hive, we have two sub-projects that release
independently: Hive Storage API and Hive.

One thing we did during that split to minimize the challenges to the
developers was that Storage API and Hive have the same master branch.
However, since they have different releases, they have their own release
branches and release numbers.

If there is a change in Ozone and a new release needed, it would have to
> wait for a Hadoop release. Ditto if there is a Hadoop release and there is
> an issue with Ozone. The case that one could turn off Ozone through a Maven
> profile works only to some extent.
> If we have done a 3.x release with Ozone in it, would it make sense to do
> a 3.y release with y>x without Ozone in it? That would be weird.
>

Actually, if Ozone is marked as unstable/evolving (we should actually have
an even stronger warning for a feature preview), we could remove it in a
3.x. If a user picks up a feature before it is stable, we try to provide a
stable platform, but mistakes happen. Introducing an incompatible change to
the Ozone API between 3.1 and 3.2 wouldn't be good, but it wouldn't be the
end of the world.

.. Owen


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Joep,  You raise a number of points:

(1) Ozone vs and object stores. “Some users would choose Ozone as that layer, some might use S3, others GCS, or Azure, or something else”.
(2) How HDSL/Ozone fits into Hadoop and whether it is necessary.
(3) You raise the release issue which we will respond in a separate email.

Let me respond to 1 & 2:
***Wrt to (1) Ozone vs other object stores***
Neither HDFS nor Ozone has any real role in the cloud except for temp data. The cost of local disk or EBS is so high that long-term data storage on HDFS or even Ozone is prohibitive.
So why the hell create the KV namespace? We need to stabilize HDSL, where the data is stored. We are targeting Hive and Spark apps to stabilize HDSL using real Hadoop apps over OzoneFS.
But HDSL/Ozone is not feature compatible with HDFS, so how will users even exercise it for real to stabilize it? Users can run HDFS and Ozone side by side in the same cluster with two namespaces (just like in Federation) and run apps on both: some Hive and Spark apps on Ozone, and others that need full HDFS features (e.g. encryption) on HDFS. As it becomes stable, they can start using HDSL/Ozone in production for a portion of their data.



***Wrt to (2) HDSL/Ozone fitting into Hadoop and why the same repository***
Ozone KV is a temporary step. The real goal is to put the NN on top of HDSL; we have shown how to do that in the roadmap that Konstantin and Chris D asked for. Milestone 1 is feasible and doesn't require removal of the FSN lock. We have also shown several cases of sharing other code in the future (the protocol engine). This co-development will be easier in the same repo. Over time HDSL + the ported NN will create a new HDFS and become feature compatible: some of the features will come for free because they are in the NN and will port over to the new NN; some are in the block layer (erasure coding) and will have to be added to HDSL.

--- You compare with YARN, HDFS and Common. HDFS and YARN are independent but both depend on Hadoop common (e.g. HBase runs on HDFS without YARN). HDSL and Ozone will depend on Hadoop common; indeed, the new protocol engine of HDSL might move to Hadoop common or HDFS. We have made sure that there are currently no dependencies of HDFS on HDSL.


***The Repo issue and conclusion***
The HDFS community will need to work together as we evolve old HDFS to use HDSL, the new protocol engine, and Raft, and together evolve to a newer, more powerful set of subcomponents. It is important that they are in the same repo and that we can share code through private interfaces. We are not trying to build a competing object store but to improve HDFS. Fixing scalability fundamentally is hard, and we are asking for an environment where that can happen easily over the next year while heeding the stability concerns of HDFS developers (e.g. we removed the compile-time dependency and added a Maven profile). This work is not being done by members of a foreign project trying to insert code into Hadoop, but by Hadoop/HDFS developers with proven track records and active participation in Hadoop and HDFS. Our jobs depend on HDFS/Hadoop stability; destabilizing is the last thing we want to do, and we have responded to every piece of constructive feedback.


sanjay


> On Mar 6, 2018, at 6:50 PM, J. Rottinghuis <jr...@gmail.com> wrote:
> 
> Sorry for jumping in late into the fray of this discussion.
> 
> It seems Ozone is a large feature. I appreciate the development effort and
> the desire to get this into the hands of users.
> I understand the need to iterate quickly and to reduce overhead for
> development.
> I also agree that Hadoop can benefit from a quicker release cycle. For our
> part, this is a challenge as we have a large installation with multiple
> clusters and thousands of users. It is a constant balance between jumping
> to the newest release and the cost of this integration and test at our
> scale, especially when things aren't backwards compatible. We try to be
> good citizens and upstream our changes and contribute back.
> 
> The point was made that splitting the projects such as common and Yarn
> didn't work and had to be reverted. That was painful and a lot of work for
> those involved for sure. This project may be slightly different in that
> hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
> run a project without the other.
> 
> Having a separate block management layer with possibly multiple block
> implementation as pluggable under the covers would be a good future
> development for HDFS. Some users would choose Ozone as that layer, some
> might use S3, others GCS, or Azure, or something else.
> If the argument is made that nobody will be able to run Hadoop as a
> consistent stack without Ozone, then that would be a strong case to keep
> things in the same repo.
> 
> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases. If there is a change in Ozone
> and a new release needed, it would have to wait for a Hadoop release. Ditto
> if there is a Hadoop release and there is an issue with Ozone. The case
> that one could turn off Ozone through a Maven profile works only to some
> extend.
> If we have done a 3.x release with Ozone in it, would it make sense to do a
> 3.y release with y>x without Ozone in it? That would be weird.
> 
> This does sound like a Hadoop 4 feature. Compatibility with lots of new
> features in Hadoop 3 need to be worked out. We're still working on jumping
> to a Hadoop 2.9 release and then working on getting a step-store release to
> 3.0 to bridge compatibility issues. I'm afraid that adding a very large new
> feature into trunk now, essentially makes going to Hadoop 3 not viable for
> quite a while. That would be a bummer for all the feature work that has
> gone into Hadoop 3. Encryption and erasure encoding are very appealing
> features, especially in light of meeting GDPR requirements.
> 
> I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
> in and keep the rest in a separate project. Iterate quickly in that
> separate project, you can have a separate set of committers, you can do
> separate release cycle. If that develops Ozone into _the_ new block layer
> for all use cases (even when people want to give up on encryption, erasure
> encoding, or feature parity is reached) then we can jump of that bridge
> when we reach it. I think adding a very large chunk of code that relatively
> few people in the community are familiar with isn't necessarily going to
> help Hadoop at this time.
> 
> Cheers,
> 
> Joep
> 
> On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
> wrote:
> 
>> Hi Andrew,
>> 
>> I think we can eliminate the maintenance costs even in the same repo. We
>> can make following changes that incorporate suggestions from Daryn and Owen
>> as well.
>> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
>> directory.
>> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
>> 3. Based on Daryn’s suggestion, the Hdsl can be optionally (via config) be
>> loaded in DN as a pluggable module.
>>     If not loaded, there will be absolutely no code path through hdsl or
>> ozone.
>> 4. To further make it easier for folks building hadoop, we can support a
>> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
>> not be built.
>>     For example, Cloudera can choose to skip even compiling/building
>> hdsl/ozone and therefore no maintenance overhead whatsoever.
>>     HADOOP-14453 has a patch that shows how it can be done.
>> 
>> Arguably, there are two kinds of maintenance costs. Costs for developers
>> and the cost for users.
>> - Developers: A maven profile as noted in point (3) and (4) above
>> completely addresses the concern for developers
>>                                 as there are no compile time dependencies
>> and further, they can choose not to build ozone/hdsl.
>> - User: Cost to users will be completely alleviated if ozone/hdsl is not
>> loaded as mentioned in point (3) above.
>> 
>> jitendra
>> 
>> From: Andrew Wang <an...@cloudera.com>
>> Date: Monday, March 5, 2018 at 3:54 PM
>> To: Wangda Tan <wh...@gmail.com>
>> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
>> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
>> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
>> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
>> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
>> mapreduce-dev@hadoop.apache.org>
>> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>> 
>> Hi Owen, Wangda,
>> 
>> Thanks for clearly laying out the subproject options, that helps the
>> discussion.
>> 
>> I'm all onboard with the idea of regular releases, and it's something I
>> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
>> of commitment from feature developers like Sanjay or Jitendra; far from it!
>> I think every feature developer makes a reasonable effort to test their
>> code before it's merged. Yet, my experience as an RM is that more code
>> comes with more risk. I don't believe that Ozone is special or different in
>> this regard. It comes with a maintenance cost, not a maintenance benefit.
>> 
>> 
>> I'm advocating for #3: separate source, separate release. Since HDSL
>> stability and FSN/BM refactoring are still a ways out, I don't want to
>> incur a maintenance cost now. I sympathize with the sentiment that working
>> cross-repo is harder than within same repo, but the right tooling can make
>> this a lot easier (e.g. git submodule, Google's repo tool). We have
>> experience doing this internally here at Cloudera, and I'm happy to share
>> knowledge and possibly code.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
>> I like the idea of same source / same release and put Ozone's source under
>> a different directory.
>> 
>> Like Owen mentioned, It gonna be important for all parties to keep a
>> regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
>> releases. Users can try features and give feedbacks to stabilize feature
>> earlier; developers can be happier since efforts will be consumed by users
>> soon after features get merged. In addition to this, if features merged to
>> trunk after reasonable tests/review, Andrew's concern may not be a problem
>> anymore:
>> 
>> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
>> being a separate project. Ozone could release faster and iterate more
>> quickly if it wasn't hampered by Hadoop's release schedule and security and
>> compatibility requirements.
>> 
>> Thanks,
>> Wangda
>> 
>> 
>> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
>> wrote:
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>> 
>> Owen mentioned making a Hadoop subproject; we'd have to
>>> hash out what exactly this means (I assume a separate repo still managed
>> by
>>> the Hadoop project), but I think we could make this work if it's more
>>> attractive than incubation or a new TLP.
>> 
>> 
>> Ok, there are multiple levels of sub-projects that all make sense:
>> 
>>   - Same source tree, same releases - examples like HDFS & YARN
>>   - Same master branch, separate releases and release branches - Hive's
>>   Storage API vs Hive. It is in the source tree for the master branch, but
>>   has distinct releases and release branches.
>>   - Separate source, separate release - Apache Commons.
>> 
>> There are advantages and disadvantages to each. I'd propose that we use the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a top
>> level directory in Hadoop and making sure that nothing in Common, HDFS, or
>> YARN depend on it. That way if a Release Manager doesn't think it is ready
>> for release, it can be trivially removed before the release.
>> 
>> One thing about using the same releases, Sanjay and Jitendra are signing up
>> to make much more regular bugfix and minor releases in the near future. For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big bang releases.
>> 
>> .. Owen
>> 
>> 
>> 
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Joep,  You raise a number of points:

(1) Ozone vs other object stores. “Some users would choose Ozone as that layer, some might use S3, others GCS, or Azure, or something else”.
(2) How HDSL/Ozone fits into Hadoop and whether it is necessary.
(3) You raise the release issue, to which we will respond in a separate email.

Let me respond to 1 & 2:
***Wrt to (1) Ozone vs other object stores***
Neither HDFS nor Ozone has any real role in the cloud except for temp data. The cost of local disk or EBS is so high that long-term data storage on HDFS or even Ozone is prohibitive.
So why the hell create the KV namespace? We need to stabilize HDSL, where the data is stored. We are targeting Hive and Spark apps to stabilize HDSL, using real Hadoop apps over OzoneFS.
But HDSL/Ozone is not feature compatible with HDFS, so how will users even exercise it with real workloads to stabilize it? Users can run HDFS and Ozone side by side in the same cluster with two namespaces (just as in Federation) and run apps on both: some Hive and Spark apps on Ozone, and others that need full HDFS features (e.g. encryption) on HDFS. As it becomes stable, they can start using HDSL/Ozone in production for a portion of their data.



***Wrt to (2) HDSL/Ozone fitting into Hadoop and why the same repository***
Ozone KV is a temporary step. The real goal is to put the NN on top of HDSL; we have shown how to do that in the roadmap that Konstantine and Chris D asked for. Milestone 1 is feasible and doesn't require removal of the FSN lock. We have also shown several cases of sharing other code in the future (e.g. the protocol engine). This co-development will be easier if done in the same repo. Over time, HDSL plus the ported NN will create a new HDFS and become feature compatible: some features will come for free because they live in the NN and will port over to the new NN; others are in the block layer (e.g. erasure coding) and will have to be added to HDSL.

--- You compare with YARN, HDFS and Common. HDFS and YARN are independent, but both depend on Hadoop Common (e.g. HBase runs on HDFS without YARN). HDSL and Ozone will depend on Hadoop Common; indeed, the new protocol engine of HDSL might move to Hadoop Common or HDFS. We have made sure that HDFS currently has no dependencies on HDSL or Ozone.
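A guarantee like "no dependencies of HDFS on HDSL/Ozone" can also be made mechanical in the build. The fragment below is only a sketch (the `hadoop-hdsl-*`/`hadoop-ozone-*` artifact names are hypothetical, not taken from the branch) using the real maven-enforcer-plugin bannedDependencies rule:

```xml
<!-- Hypothetical fragment for hadoop-hdfs/pom.xml: fail the build if any
     HDSL/Ozone artifact appears anywhere in the HDFS dependency tree. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>ban-hdsl-deps</id>
      <goals><goal>enforce</goal></goals>
      <configuration>
        <rules>
          <bannedDependencies>
            <excludes>
              <!-- wildcards match any hdsl/ozone module -->
              <exclude>org.apache.hadoop:hadoop-hdsl-*</exclude>
              <exclude>org.apache.hadoop:hadoop-ozone-*</exclude>
            </excludes>
          </bannedDependencies>
        </rules>
        <fail>true</fail>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With such a rule in place, the independence claim is verified on every `mvn verify` run rather than by code review alone.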


***The Repo issue and conclusion***
The HDFS community will need to work together as we evolve the old HDFS to use HDSL, the new protocol engine, and Raft, and together evolve to a newer, more powerful set of subcomponents. It is important that they are in the same repo and that we can share code through private interfaces. We are not trying to build a competing object store but to improve HDFS. Fixing scalability fundamentally is hard, and we are asking for an environment in which that can happen easily over the next year while heeding the stability concerns of HDFS developers (e.g. we removed the compile-time dependency and added a Maven profile). This work is not being done by members of a foreign project trying to insert code into Hadoop, but by Hadoop/HDFS developers with proven track records and active participation in Hadoop and HDFS. Our jobs depend on HDFS/Hadoop stability; destabilizing it is the last thing we want to do, and we have responded to every piece of constructive feedback.
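The Maven profile mentioned above could look roughly like the fragment below. This is only a sketch (the profile id `hdsl` and the module names are hypothetical): the hdsl/ozone modules are attached to the build via a profile that is active by default, so a builder can skip them entirely with `mvn package -P '!hdsl'`.

```xml
<!-- Hypothetical fragment for the root hadoop pom.xml: hdsl/ozone modules
     are built only when the "hdsl" profile is active; deactivating it
     removes them from the reactor entirely. -->
<profiles>
  <profile>
    <id>hdsl</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <modules>
      <module>hadoop-hdsl-project</module>
      <module>hadoop-ozone</module>
    </modules>
  </profile>
</profiles>
```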


sanjay


> On Mar 6, 2018, at 6:50 PM, J. Rottinghuis <jr...@gmail.com> wrote:
> 
> Sorry for jumping in late into the fray of this discussion.
> 
> It seems Ozone is a large feature. I appreciate the development effort and
> the desire to get this into the hands of users.
> I understand the need to iterate quickly and to reduce overhead for
> development.
> I also agree that Hadoop can benefit from a quicker release cycle. For our
> part, this is a challenge as we have a large installation with multiple
> clusters and thousands of users. It is a constant balance between jumping
> to the newest release and the cost of this integration and test at our
> scale, especially when things aren't backwards compatible. We try to be
> good citizens and upstream our changes and contribute back.
> 
> The point was made that splitting the projects such as common and Yarn
> didn't work and had to be reverted. That was painful and a lot of work for
> those involved for sure. This project may be slightly different in that
> hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
> run a project without the other.
> 
> Having a separate block management layer with possibly multiple block
> implementation as pluggable under the covers would be a good future
> development for HDFS. Some users would choose Ozone as that layer, some
> might use S3, others GCS, or Azure, or something else.
> If the argument is made that nobody will be able to run Hadoop as a
> consistent stack without Ozone, then that would be a strong case to keep
> things in the same repo.
> 
> Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases. If there is a change in Ozone
> and a new release needed, it would have to wait for a Hadoop release. Ditto
> if there is a Hadoop release and there is an issue with Ozone. The case
> that one could turn off Ozone through a Maven profile works only to some
> extend.
> If we have done a 3.x release with Ozone in it, would it make sense to do a
> 3.y release with y>x without Ozone in it? That would be weird.
> 
> This does sound like a Hadoop 4 feature. Compatibility with lots of new
> features in Hadoop 3 need to be worked out. We're still working on jumping
> to a Hadoop 2.9 release and then working on getting a step-store release to
> 3.0 to bridge compatibility issues. I'm afraid that adding a very large new
> feature into trunk now, essentially makes going to Hadoop 3 not viable for
> quite a while. That would be a bummer for all the feature work that has
>> gone into Hadoop 3. Encryption and erasure coding are very appealing
> features, especially in light of meeting GDPR requirements.
> 
> I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
> in and keep the rest in a separate project. Iterate quickly in that
> separate project, you can have a separate set of committers, you can do
> separate release cycle. If that develops Ozone into _the_ new block layer
>> for all use cases (even when people want to give up on encryption and erasure
>> coding, or when feature parity is reached) then we can jump off that bridge
> when we reach it. I think adding a very large chunk of code that relatively
> few people in the community are familiar with isn't necessarily going to
> help Hadoop at this time.
> 
> Cheers,
> 
> Joep
> 
> On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
> wrote:
> 
>> Hi Andrew,
>> 
>> I think we can eliminate the maintenance costs even in the same repo. We
>> can make following changes that incorporate suggestions from Daryn and Owen
>> as well.
>> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
>> directory.
>> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
>> 3. Based on Daryn’s suggestion, the Hdsl can be optionally (via config) be
>> loaded in DN as a pluggable module.
>>     If not loaded, there will be absolutely no code path through hdsl or
>> ozone.
>> 4. To further make it easier for folks building hadoop, we can support a
>> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
>> not be built.
>>     For example, Cloudera can choose to skip even compiling/building
>> hdsl/ozone and therefore no maintenance overhead whatsoever.
>>     HADOOP-14453 has a patch that shows how it can be done.
>> 
>> Arguably, there are two kinds of maintenance costs. Costs for developers
>> and the cost for users.
>> - Developers: A maven profile as noted in point (3) and (4) above
>> completely addresses the concern for developers
>>                                 as there are no compile time dependencies
>> and further, they can choose not to build ozone/hdsl.
>> - User: Cost to users will be completely alleviated if ozone/hdsl is not
>> loaded as mentioned in point (3) above.
>> 
>> jitendra
>> 
>> From: Andrew Wang <an...@cloudera.com>
>> Date: Monday, March 5, 2018 at 3:54 PM
>> To: Wangda Tan <wh...@gmail.com>
>> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
>> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
>> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
>> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
>> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
>> mapreduce-dev@hadoop.apache.org>
>> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>> 
>> Hi Owen, Wangda,
>> 
>> Thanks for clearly laying out the subproject options, that helps the
>> discussion.
>> 
>> I'm all onboard with the idea of regular releases, and it's something I
>> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
>> of commitment from feature developers like Sanjay or Jitendra; far from it!
>> I think every feature developer makes a reasonable effort to test their
>> code before it's merged. Yet, my experience as an RM is that more code
>> comes with more risk. I don't believe that Ozone is special or different in
>> this regard. It comes with a maintenance cost, not a maintenance benefit.
>> 
>> 
>> I'm advocating for #3: separate source, separate release. Since HDSL
>> stability and FSN/BM refactoring are still a ways out, I don't want to
>> incur a maintenance cost now. I sympathize with the sentiment that working
>> cross-repo is harder than within same repo, but the right tooling can make
>> this a lot easier (e.g. git submodule, Google's repo tool). We have
>> experience doing this internally here at Cloudera, and I'm happy to share
>> knowledge and possibly code.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
>> I like the idea of same source / same release and put Ozone's source under
>> a different directory.
>> 
>> Like Owen mentioned, it's going to be important for all parties to keep a
>> regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
>> releases. Users can try features and give feedback to stabilize features
>> earlier; developers can be happier since their efforts will be consumed by users
>> soon after features get merged. In addition to this, if features merged to
>> trunk after reasonable tests/review, Andrew's concern may not be a problem
>> anymore:
>> 
>> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
>> being a separate project. Ozone could release faster and iterate more
>> quickly if it wasn't hampered by Hadoop's release schedule and security and
>> compatibility requirements.
>> 
>> Thanks,
>> Wangda
>> 
>> 
>> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
>> wrote:
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>> 
>> Owen mentioned making a Hadoop subproject; we'd have to
>>> hash out what exactly this means (I assume a separate repo still managed
>> by
>>> the Hadoop project), but I think we could make this work if it's more
>>> attractive than incubation or a new TLP.
>> 
>> 
>> Ok, there are multiple levels of sub-projects that all make sense:
>> 
>>   - Same source tree, same releases - examples like HDFS & YARN
>>   - Same master branch, separate releases and release branches - Hive's
>>   Storage API vs Hive. It is in the source tree for the master branch, but
>>   has distinct releases and release branches.
>>   - Separate source, separate release - Apache Commons.
>> 
>> There are advantages and disadvantages to each. I'd propose that we use the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a top
>> level directory in Hadoop and making sure that nothing in Common, HDFS, or
>> YARN depend on it. That way if a Release Manager doesn't think it is ready
>> for release, it can be trivially removed before the release.
>> 
>> One thing about using the same releases, Sanjay and Jitendra are signing up
>> to make much more regular bugfix and minor releases in the near future. For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big bang releases.
>> 
>> .. Owen
>> 
>> 
>> 
>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
Hi Joep,

On Tue, Mar 6, 2018 at 6:50 PM, J. Rottinghuis <jr...@gmail.com>
wrote:

Obviously when people do want to use Ozone, then having it in the same repo
> is easier. The flipside is that, separate top-level project in the same
> repo or not, it adds to the Hadoop releases.
>

Apache projects are about the group of people who are working together.
There is a large overlap between the teams working on HDFS and on Ozone, which
is a lot of the motivation to keep project overhead to a minimum and not
start a new project.

Using the same releases or separate releases is a distinct choice. Many
Apache projects, such as Commons and Maven, have multiple artifacts that
release independently. In Hive, we have two sub-projects that release
independently: Hive Storage API, and Hive.

One thing we did during that split to minimize the challenges to the
developers was that Storage API and Hive have the same master branch.
However, since they have different releases, they have their own release
branches and release numbers.
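The "same master branch, separate release branches" pattern Owen describes can be sketched with plain git. The branch names below are hypothetical, purely illustrative: both components share one master history, but each cuts its own release branches with its own version numbers.

```shell
# Sketch: one shared master, two independent release-branch families.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "shared master history"
git branch branch-3.1   # hypothetical Hadoop 3.1 release branch
git branch ozone-0.2    # hypothetical, independently numbered Ozone release branch
git branch --list       # shows both release branches plus the shared default branch
```

Releases are then tagged on the respective branch, so Ozone could ship a 0.3 without waiting for (or forcing) a Hadoop 3.2.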

If there is a change in Ozone and a new release needed, it would have to
> wait for a Hadoop release. Ditto if there is a Hadoop release and there is
> an issue with Ozone. The case that one could turn off Ozone through a Maven
> profile works only to some extent.
> If we have done a 3.x release with Ozone in it, would it make sense to do
> a 3.y release with y>x without Ozone in it? That would be weird.
>

Actually, if Ozone is marked as unstable/evolving (we should actually have
an even stronger warning for a feature preview), we could remove it in a
3.x. If a user picks up a feature before it is stable, we try to provide a
stable platform, but mistakes happen. Introducing an incompatible change to
the Ozone API between 3.1 and 3.2 wouldn't be good, but it wouldn't be the
end of the world.

.. Owen

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "J. Rottinghuis" <jr...@gmail.com>.
Sorry for jumping in late into the fray of this discussion.

It seems Ozone is a large feature. I appreciate the development effort and
the desire to get this into the hands of users.
I understand the need to iterate quickly and to reduce overhead for
development.
I also agree that Hadoop can benefit from a quicker release cycle. For our
part, this is a challenge as we have a large installation with multiple
clusters and thousands of users. It is a constant balance between jumping
to the newest release and the cost of this integration and test at our
scale, especially when things aren't backwards compatible. We try to be
good citizens and upstream our changes and contribute back.

The point was made that splitting the projects such as common and Yarn
didn't work and had to be reverted. That was painful and a lot of work for
those involved for sure. This project may be slightly different in that
hadoop-common, Yarn and HDFS made for one consistent whole. One couldn't
run a project without the other.

Having a separate block management layer with possibly multiple block
implementations pluggable under the covers would be a good future
development for HDFS. Some users would choose Ozone as that layer, some
might use S3, others GCS, or Azure, or something else.
If the argument is made that nobody will be able to run Hadoop as a
consistent stack without Ozone, then that would be a strong case to keep
things in the same repo.

Obviously when people do want to use Ozone, then having it in the same repo
is easier. The flipside is that, separate top-level project in the same
repo or not, it adds to the Hadoop releases. If there is a change in Ozone
and a new release needed, it would have to wait for a Hadoop release. Ditto
if there is a Hadoop release and there is an issue with Ozone. The case
that one could turn off Ozone through a Maven profile works only to some
extent.
If we have done a 3.x release with Ozone in it, would it make sense to do a
3.y release with y>x without Ozone in it? That would be weird.

This does sound like a Hadoop 4 feature. Compatibility with lots of new
features in Hadoop 3 need to be worked out. We're still working on jumping
to a Hadoop 2.9 release and then working on getting a step-store release to
3.0 to bridge compatibility issues. I'm afraid that adding a very large new
feature into trunk now, essentially makes going to Hadoop 3 not viable for
quite a while. That would be a bummer for all the feature work that has
gone into Hadoop 3. Encryption and erasure coding are very appealing
features, especially in light of meeting GDPR requirements.

I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
in and keep the rest in a separate project. Iterate quickly in that
separate project, you can have a separate set of committers, you can do
separate release cycle. If that develops Ozone into _the_ new block layer
for all use cases (even when people want to give up on encryption and erasure
coding, or when feature parity is reached) then we can jump off that bridge
when we reach it. I think adding a very large chunk of code that relatively
few people in the community are familiar with isn't necessarily going to
help Hadoop at this time.

Cheers,

Joep

On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
wrote:

> Hi Andrew,
>
>  I think we can eliminate the maintenance costs even in the same repo. We
> can make following changes that incorporate suggestions from Daryn and Owen
> as well.
> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
> directory.
> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
> 3. Based on Daryn’s suggestion, the Hdsl can be optionally (via config) be
> loaded in DN as a pluggable module.
>      If not loaded, there will be absolutely no code path through hdsl or
> ozone.
> 4. To further make it easier for folks building hadoop, we can support a
> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
> not be built.
>      For example, Cloudera can choose to skip even compiling/building
> hdsl/ozone and therefore no maintenance overhead whatsoever.
>      HADOOP-14453 has a patch that shows how it can be done.
>
> Arguably, there are two kinds of maintenance costs. Costs for developers
> and the cost for users.
> - Developers: A maven profile as noted in point (3) and (4) above
> completely addresses the concern for developers
>                                  as there are no compile time dependencies
> and further, they can choose not to build ozone/hdsl.
> - User: Cost to users will be completely alleviated if ozone/hdsl is not
> loaded as mentioned in point (3) above.
>
> jitendra
>
> From: Andrew Wang <an...@cloudera.com>
> Date: Monday, March 5, 2018 at 3:54 PM
> To: Wangda Tan <wh...@gmail.com>
> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
> mapreduce-dev@hadoop.apache.org>
> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>
> Hi Owen, Wangda,
>
> Thanks for clearly laying out the subproject options, that helps the
> discussion.
>
> I'm all onboard with the idea of regular releases, and it's something I
> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
> of commitment from feature developers like Sanjay or Jitendra; far from it!
> I think every feature developer makes a reasonable effort to test their
> code before it's merged. Yet, my experience as an RM is that more code
> comes with more risk. I don't believe that Ozone is special or different in
> this regard. It comes with a maintenance cost, not a maintenance benefit.
>
>
> I'm advocating for #3: separate source, separate release. Since HDSL
> stability and FSN/BM refactoring are still a ways out, I don't want to
> incur a maintenance cost now. I sympathize with the sentiment that working
> cross-repo is harder than within same repo, but the right tooling can make
> this a lot easier (e.g. git submodule, Google's repo tool). We have
> experience doing this internally here at Cloudera, and I'm happy to share
> knowledge and possibly code.
>
> Best,
> Andrew
>
> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
> I like the idea of same source / same release and put Ozone's source under
> a different directory.
>
> Like Owen mentioned, it's going to be important for all parties to keep a
> regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
> releases. Users can try features and give feedback to stabilize features
> earlier; developers can be happier since their efforts will be consumed by users
> soon after features get merged. In addition to this, if features merged to
> trunk after reasonable tests/review, Andrew's concern may not be a problem
> anymore:
>
> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements.
>
> Thanks,
> Wangda
>
>
> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
> wrote:
> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> Owen mentioned making a Hadoop subproject; we'd have to
> > hash out what exactly this means (I assume a separate repo still managed
> by
> > the Hadoop project), but I think we could make this work if it's more
> > attractive than incubation or a new TLP.
>
>
> Ok, there are multiple levels of sub-projects that all make sense:
>
>    - Same source tree, same releases - examples like HDFS & YARN
>    - Same master branch, separate releases and release branches - Hive's
>    Storage API vs Hive. It is in the source tree for the master branch, but
>    has distinct releases and release branches.
>    - Separate source, separate release - Apache Commons.
>
> There are advantages and disadvantages to each. I'd propose that we use the
> same source, same release pattern for Ozone. Note that we tried and later
> reverted doing Common, HDFS, and YARN as separate source, separate release
> because it was too much trouble. I like Daryn's idea of putting it as a top
> level directory in Hadoop and making sure that nothing in Common, HDFS, or
> YARN depend on it. That way if a Release Manager doesn't think it is ready
> for release, it can be trivially removed before the release.
>
> One thing about using the same releases, Sanjay and Jitendra are signing up
> to make much more regular bugfix and minor releases in the near future. For
> example, they'll need to make 3.2 relatively soon to get it released and
> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
> project. Hadoop needs more regular releases and fewer big bang releases.
>
> .. Owen
>
>
>
>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "J. Rottinghuis" <jr...@gmail.com>.
Sorry for jumping in late into the fray of this discussion.

It seems Ozone is a large feature. I appreciate the development effort and
the desire to get this into the hands of users.
I understand the need to iterate quickly and to reduce overhead for
development.
I also agree that Hadoop can benefit from a quicker release cycle. For our
part, this is a challenge as we have a large installation with multiple
clusters and thousands of users. It is a constant balance between jumping
to the newest release and the cost of integration and testing at our
scale, especially when things aren't backwards compatible. We try to be
good citizens and upstream our changes and contribute back.

The point was made that splitting out projects such as common and Yarn
didn't work and had to be reverted. That was painful and a lot of work for
those involved, for sure. This project may be slightly different in that
hadoop-common, Yarn, and HDFS made up one consistent whole. One couldn't
run one without the others.

Having a separate block management layer, with possibly multiple block
implementations pluggable under the covers, would be a good future
development for HDFS. Some users would choose Ozone as that layer; some
might use S3, others GCS, or Azure, or something else.
If the argument is made that nobody will be able to run Hadoop as a
consistent stack without Ozone, then that would be a strong case to keep
things in the same repo.

Obviously when people do want to use Ozone, having it in the same repo
is easier. The flip side is that, separate top-level project in the same
repo or not, it adds to the Hadoop releases. If there is a change in Ozone
and a new release is needed, it would have to wait for a Hadoop release. Ditto
if there is a Hadoop release and there is an issue with Ozone. The case
that one could turn off Ozone through a Maven profile works only to some
extent.
If we have done a 3.x release with Ozone in it, would it make sense to do a
3.y release with y>x without Ozone in it? That would be weird.

This does sound like a Hadoop 4 feature. Compatibility with lots of new
features in Hadoop 3 needs to be worked out. We're still working on jumping
to a Hadoop 2.9 release and then on a stepping-stone release to
3.0 to bridge compatibility issues. I'm afraid that adding a very large new
feature into trunk now essentially makes going to Hadoop 3 not viable for
quite a while. That would be a bummer for all the feature work that has
gone into Hadoop 3. Encryption and erasure coding are very appealing
features, especially in light of meeting GDPR requirements.

I'd argue to pull out those pieces that make sense in Hadoop 3, merge those
in, and keep the rest in a separate project. Iterate quickly in that
separate project: you can have a separate set of committers, and you can do a
separate release cycle. If that develops Ozone into _the_ new block layer
for all use cases (even when people want to give up on encryption or erasure
coding, or feature parity is reached), then we can jump off that bridge
when we reach it. I think adding a very large chunk of code that relatively
few people in the community are familiar with isn't necessarily going to
help Hadoop at this time.

Cheers,

Joep

On Tue, Mar 6, 2018 at 2:32 PM, Jitendra Pandey <ji...@hortonworks.com>
wrote:

> Hi Andrew,
>
>  I think we can eliminate the maintenance costs even in the same repo. We
> can make the following changes that incorporate suggestions from Daryn and Owen
> as well.
> 1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate
> directory.
> 2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
> 3. Based on Daryn’s suggestion, HDSL can optionally (via config) be
> loaded in the DN as a pluggable module.
>      If not loaded, there will be absolutely no code path through hdsl or
> ozone.
> 4. To further make it easier for folks building hadoop, we can support a
> maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will
> not be built.
>      For example, Cloudera can choose to skip even compiling/building
> hdsl/ozone and therefore no maintenance overhead whatsoever.
>      HADOOP-14453 has a patch that shows how it can be done.
>
> Arguably, there are two kinds of maintenance costs. Costs for developers
> and the cost for users.
> - Developers: A maven profile as noted in points (3) and (4) above
> completely addresses the concern for developers, as there are no
> compile-time dependencies and, further, they can choose not to build
> ozone/hdsl.
> - User: Cost to users will be completely alleviated if ozone/hdsl is not
> loaded as mentioned in point (3) above.
>
> jitendra
>
> From: Andrew Wang <an...@cloudera.com>
> Date: Monday, March 5, 2018 at 3:54 PM
> To: Wangda Tan <wh...@gmail.com>
> Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp
> <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>,
> hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <
> common-dev@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <
> yarn-dev@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <
> mapreduce-dev@hadoop.apache.org>
> Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk
>
> Hi Owen, Wangda,
>
> Thanks for clearly laying out the subproject options, that helps the
> discussion.
>
> I'm all onboard with the idea of regular releases, and it's something I
> tried to do with the 3.0 alphas and betas. The problem though isn't a lack
> of commitment from feature developers like Sanjay or Jitendra; far from it!
> I think every feature developer makes a reasonable effort to test their
> code before it's merged. Yet, my experience as an RM is that more code
> comes with more risk. I don't believe that Ozone is special or different in
> this regard. It comes with a maintenance cost, not a maintenance benefit.
>
>
> I'm advocating for #3: separate source, separate release. Since HDSL
> stability and FSN/BM refactoring are still a ways out, I don't want to
> incur a maintenance cost now. I sympathize with the sentiment that working
> cross-repo is harder than within same repo, but the right tooling can make
> this a lot easier (e.g. git submodule, Google's repo tool). We have
> experience doing this internally here at Cloudera, and I'm happy to share
> knowledge and possibly code.
>
> Best,
> Andrew
>
> On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
> I like the idea of same source / same release and putting Ozone's source under
> a different directory.
>
> Like Owen mentioned, it's going to be important for all parties to keep a
> regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
> releases. Users can try features and give feedback to stabilize features
> earlier; developers can be happier since their efforts will be consumed by
> users soon after features get merged. In addition, if features are merged to
> trunk after reasonable tests/review, Andrew's concern may not be a problem
> anymore:
>
> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements.
>
> Thanks,
> Wangda
>
>
> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
> wrote:
> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> Owen mentioned making a Hadoop subproject; we'd have to
> > hash out what exactly this means (I assume a separate repo still managed
> by
> > the Hadoop project), but I think we could make this work if it's more
> > attractive than incubation or a new TLP.
>
>
> Ok, there are multiple levels of sub-projects that all make sense:
>
>    - Same source tree, same releases - examples like HDFS & YARN
>    - Same master branch, separate releases and release branches - Hive's
>    Storage API vs Hive. It is in the source tree for the master branch, but
>    has distinct releases and release branches.
>    - Separate source, separate release - Apache Commons.
>
> There are advantages and disadvantages to each. I'd propose that we use the
> same source, same release pattern for Ozone. Note that we tried and later
> reverted doing Common, HDFS, and YARN as separate source, separate release
> because it was too much trouble. I like Daryn's idea of putting it as a top
> level directory in Hadoop and making sure that nothing in Common, HDFS, or
> YARN depend on it. That way if a Release Manager doesn't think it is ready
> for release, it can be trivially removed before the release.
>
> One thing about using the same releases, Sanjay and Jitendra are signing up
> to make much more regular bugfix and minor releases in the near future. For
> example, they'll need to make 3.2 relatively soon to get it released and
> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
> project. Hadoop needs more regular releases and fewer big bang releases.
>
> .. Owen
>
>
>
>
>


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Jitendra Pandey <ji...@hortonworks.com>.
Hi Andrew, 
  
 I think we can eliminate the maintenance costs even in the same repo. We can make the following changes that incorporate suggestions from Daryn and Owen as well.
1. Hadoop-hdsl-project will be at the root of hadoop repo, in a separate directory.
2. There will be no dependencies from common, yarn and hdfs to hdsl/ozone.
3. Based on Daryn’s suggestion, HDSL can optionally (via config) be loaded in the DN as a pluggable module.
     If not loaded, there will be absolutely no code path through hdsl or ozone.
4. To further make it easier for folks building hadoop, we can support a maven profile for hdsl/ozone. If the profile is deactivated hdsl/ozone will not be built.
     For example, Cloudera can choose to skip even compiling/building hdsl/ozone and therefore no maintenance overhead whatsoever.
     HADOOP-14453 has a patch that shows how it can be done.
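As a rough illustration of point 4, the optional build could be expressed as a Maven profile in the root pom that pulls the hdsl/ozone modules in only when active. The module and profile names below are illustrative assumptions, not the actual layout from the HADOOP-14453 patch:

```xml
<!-- Sketch of a fragment of the root pom.xml (names are hypothetical;
     see the HADOOP-14453 patch for the real layout). The hdsl/ozone
     modules are compiled only when this profile is active. -->
<profiles>
  <profile>
    <id>hdsl</id>
    <!-- Built by default, but deactivatable from the command line. -->
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <modules>
      <module>hadoop-hdsl-project</module>
    </modules>
  </profile>
</profiles>
```

With a setup like this, a plain `mvn package` still builds everything, while `mvn package -P '!hdsl'` deactivates the profile and skips compiling the hdsl/ozone tree entirely.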

Arguably, there are two kinds of maintenance costs. Costs for developers and the cost for users.
- Developers: A maven profile as noted in points (3) and (4) above completely addresses the concern for developers,
  as there are no compile-time dependencies and, further, they can choose not to build ozone/hdsl.
- User: Cost to users will be completely alleviated if ozone/hdsl is not loaded as mentioned in point (3) above.

jitendra

From: Andrew Wang <an...@cloudera.com>
Date: Monday, March 5, 2018 at 3:54 PM
To: Wangda Tan <wh...@gmail.com>
Cc: Owen O'Malley <ow...@gmail.com>, Daryn Sharp <da...@oath.com.invalid>, Jitendra Pandey <ji...@hortonworks.com>, hdfs-dev <hd...@hadoop.apache.org>, "common-dev@hadoop.apache.org" <co...@hadoop.apache.org>, "yarn-dev@hadoop.apache.org" <ya...@hadoop.apache.org>, "mapreduce-dev@hadoop.apache.org" <ma...@hadoop.apache.org>
Subject: Re: [VOTE] Merging branch HDFS-7240 to trunk

Hi Owen, Wangda, 

Thanks for clearly laying out the subproject options, that helps the discussion.

I'm all onboard with the idea of regular releases, and it's something I tried to do with the 3.0 alphas and betas. The problem though isn't a lack of commitment from feature developers like Sanjay or Jitendra; far from it! I think every feature developer makes a reasonable effort to test their code before it's merged. Yet, my experience as an RM is that more code comes with more risk. I don't believe that Ozone is special or different in this regard. It comes with a maintenance cost, not a maintenance benefit.


I'm advocating for #3: separate source, separate release. Since HDSL stability and FSN/BM refactoring are still a ways out, I don't want to incur a maintenance cost now. I sympathize with the sentiment that working cross-repo is harder than within same repo, but the right tooling can make this a lot easier (e.g. git submodule, Google's repo tool). We have experience doing this internally here at Cloudera, and I'm happy to share knowledge and possibly code.

Best,
Andrew

On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:
I like the idea of same source / same release, with Ozone's source under a different directory.

As Owen mentioned, it's going to be important for all parties to keep a regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor releases. Users can try features and give feedback to stabilize them earlier; developers can be happier since their work reaches users soon after features get merged. In addition, if features are merged to trunk after reasonable tests/review, Andrew's concern may not be a problem anymore:

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com> wrote:
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
wrote:

Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN as separate source, separate release
because it was too much trouble. I like Daryn's idea of putting it as a top
level directory in Hadoop and making sure that nothing in Common, HDFS, or
YARN depend on it. That way if a Release Manager doesn't think it is ready
for release, it can be trivially removed before the release.

One thing about using the same releases, Sanjay and Jitendra are signing up
to make much more regular bugfix and minor releases in the near future. For
example, they'll need to make 3.2 relatively soon to get it released and
then 3.3 somewhere in the next 3 to 6 months. That would be good for the
project. Hadoop needs more regular releases and fewer big bang releases.

.. Owen






Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Owen, Wangda,

Thanks for clearly laying out the subproject options, that helps the
discussion.

I'm all onboard with the idea of regular releases, and it's something I
tried to do with the 3.0 alphas and betas. The problem though isn't a lack
of commitment from feature developers like Sanjay or Jitendra; far from it!
I think every feature developer makes a reasonable effort to test their
code before it's merged. Yet, my experience as an RM is that more code
comes with more risk. I don't believe that Ozone is special or different in
this regard. It comes with a maintenance cost, not a maintenance benefit.

I'm advocating for #3: separate source, separate release. Since HDSL
stability and FSN/BM refactoring are still a ways out, I don't want to
incur a maintenance cost now. I sympathize with the sentiment that working
cross-repo is harder than within the same repo, but the right tooling can make
this a lot easier (e.g. git submodule, Google's repo tool). We have
experience doing this internally here at Cloudera, and I'm happy to share
knowledge and possibly code.

Best,
Andrew

On Fri, Mar 2, 2018 at 4:41 PM, Wangda Tan <wh...@gmail.com> wrote:

> I like the idea of same source / same release and put Ozone's source under
> a different directory.
>
> Like Owen mentioned, It gonna be important for all parties to keep a
> regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
> releases. Users can try features and give feedbacks to stabilize feature
> earlier; developers can be happier since efforts will be consumed by users
> soon after features get merged. In addition to this, if features merged to
> trunk after reasonable tests/review, Andrew's concern may not be a problem
> anymore:
>
> bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
> being a separate project. Ozone could release faster and iterate more
> quickly if it wasn't hampered by Hadoop's release schedule and security and
> compatibility requirements.
>
> Thanks,
> Wangda
>
>
> On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
> wrote:
>
>> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
>> wrote:
>>
>> Owen mentioned making a Hadoop subproject; we'd have to
>> > hash out what exactly this means (I assume a separate repo still
>> managed by
>> > the Hadoop project), but I think we could make this work if it's more
>> > attractive than incubation or a new TLP.
>>
>>
>> Ok, there are multiple levels of sub-projects that all make sense:
>>
>>    - Same source tree, same releases - examples like HDFS & YARN
>>    - Same master branch, separate releases and release branches - Hive's
>>    Storage API vs Hive. It is in the source tree for the master branch,
>> but
>>    has distinct releases and release branches.
>>    - Separate source, separate release - Apache Commons.
>>
>> There are advantages and disadvantages to each. I'd propose that we use
>> the
>> same source, same release pattern for Ozone. Note that we tried and later
>> reverted doing Common, HDFS, and YARN as separate source, separate release
>> because it was too much trouble. I like Daryn's idea of putting it as a
>> top
>> level directory in Hadoop and making sure that nothing in Common, HDFS, or
>> YARN depend on it. That way if a Release Manager doesn't think it is ready
>> for release, it can be trivially removed before the release.
>>
>> One thing about using the same releases, Sanjay and Jitendra are signing
>> up
>> to make much more regular bugfix and minor releases in the near future.
>> For
>> example, they'll need to make 3.2 relatively soon to get it released and
>> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
>> project. Hadoop needs more regular releases and fewer big bang releases.
>>
>> .. Owen
>>
>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Anu Engineer <ae...@hortonworks.com>.
Hi Owen,

  >> 1. It is hard to tell what has changed. git rebase -i tells me the
  >> branch has 722 commits. The rebase failed with a conflict. It would really
  >> help if you rebased to current trunk.

Thanks for the comments. I have merged trunk into the HDFS-7240 branch.
Hopefully, this makes it easier to look at the changes; I have committed the
change required to fix the conflict as a separate commit so it is easy for you to see.
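As a side note, the kind of review this enables can be sketched with plain git; in the toy repo below, "main" stands in for trunk and "feature" for a long-lived branch like HDFS-7240 (the contents are synthetic, purely for illustration):

```shell
# Toy illustration of reviewing a long-lived branch against trunk.
set -e
work=$(mktemp -d)
git init -q -b main "$work/repo"   # -b requires git >= 2.28
g() { git -C "$work/repo" -c user.email=dev@example.com -c user.name=dev "$@"; }

g commit -q --allow-empty -m "base"
g branch feature
g commit -q --allow-empty -m "trunk-only change"    # trunk moves ahead
g checkout -q feature
g commit -q --allow-empty -m "feature-only change"  # branch moves ahead

# Two-dot range: commits on the branch that trunk does not have yet.
g log --oneline main..feature

# Three-dot diff: the net change a merge would apply, from the merge base.
g diff --stat main...feature
```

The two-dot log lists only the branch's own commits, which is exactly what a reviewer wants once trunk has been merged back into the branch.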

Thanks
Anu


On 3/2/18, 4:42 PM, "Wangda Tan" <wh...@gmail.com> wrote:

    I like the idea of same source / same release and put Ozone's source under
    a different directory.
    
    Like Owen mentioned, It gonna be important for all parties to keep a
    regular and shorter release cycle for Hadoop, e.g. 3-4 months between minor
    releases. Users can try features and give feedbacks to stabilize feature
    earlier; developers can be happier since efforts will be consumed by users
    soon after features get merged. In addition to this, if features merged to
    trunk after reasonable tests/review, Andrew's concern may not be a problem
    anymore:
    
    bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
    being a separate project. Ozone could release faster and iterate more
    quickly if it wasn't hampered by Hadoop's release schedule and security and
    compatibility requirements.
    
    Thanks,
    Wangda
    
    
    On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
    wrote:
    
    > On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
    > wrote:
    >
    > Owen mentioned making a Hadoop subproject; we'd have to
    > > hash out what exactly this means (I assume a separate repo still managed
    > by
    > > the Hadoop project), but I think we could make this work if it's more
    > > attractive than incubation or a new TLP.
    >
    >
    > Ok, there are multiple levels of sub-projects that all make sense:
    >
    >    - Same source tree, same releases - examples like HDFS & YARN
    >    - Same master branch, separate releases and release branches - Hive's
    >    Storage API vs Hive. It is in the source tree for the master branch, but
    >    has distinct releases and release branches.
    >    - Separate source, separate release - Apache Commons.
    >
    > There are advantages and disadvantages to each. I'd propose that we use the
    > same source, same release pattern for Ozone. Note that we tried and later
    > reverted doing Common, HDFS, and YARN as separate source, separate release
    > because it was too much trouble. I like Daryn's idea of putting it as a top
    > level directory in Hadoop and making sure that nothing in Common, HDFS, or
    > YARN depend on it. That way if a Release Manager doesn't think it is ready
    > for release, it can be trivially removed before the release.
    >
    > One thing about using the same releases, Sanjay and Jitendra are signing up
    > to make much more regular bugfix and minor releases in the near future. For
    > example, they'll need to make 3.2 relatively soon to get it released and
    > then 3.3 somewhere in the next 3 to 6 months. That would be good for the
    > project. Hadoop needs more regular releases and fewer big bang releases.
    >
    > .. Owen
    >
    


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Anu Engineer <ae...@hortonworks.com>.
Hi Owen,

  >> 1. It is hard to tell what has changed. git rebase -i tells me the
  >> branch has 722 commits. The rebase failed with a conflict. It would
  >> really help if you rebased to current trunk.

Thanks for the comments. I have merged trunk into the HDFS-7240 branch.
Hopefully, this makes it easier to look at the changes; I have committed the
change required to fix the conflict as a separate commit, to make it easy
for you to see.
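The branch hygiene described here (merging current trunk into a long-lived feature branch rather than rebasing hundreds of commits) can be sketched with a throwaway repository; branch and file names below are illustrative.

```shell
set -e
# Sketch: merge trunk into a long-lived feature branch so reviewers can
# diff it cleanly. Everything here is a local, throwaway stand-in.
repo=$(mktemp -d)
git init -q -b trunk "$repo"
g() { git -C "$repo" -c user.email=dev@example.org -c user.name=dev "$@"; }

g commit -q --allow-empty -m "trunk base"

# Long-lived feature branch with its own work
g checkout -q -b HDFS-7240
echo "hdsl code" > "$repo/hdsl.txt"
g add hdsl.txt && g commit -q -m "feature work"

# Meanwhile trunk moves on
g checkout -q trunk
echo "core change" > "$repo/core.txt"
g add core.txt && g commit -q -m "trunk work"

# Merge trunk INTO the feature branch: one merge commit, history preserved,
# and any conflict resolution shows up as part of that single commit
g checkout -q HDFS-7240
g merge -q --no-edit trunk
ls "$repo"   # both trunk and feature files are now present
```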

Thanks
Anu


On 3/2/18, 4:42 PM, "Wangda Tan" <wh...@gmail.com> wrote:

    I like the idea of same source / same release, putting Ozone's source under
    a different directory.

    Like Owen mentioned, it is going to be important for all parties to keep a
    regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
    releases. Users can try features and give feedback to stabilize them
    earlier; developers can be happier since their work reaches users soon
    after features get merged. In addition, if features are merged to trunk
    after reasonable testing/review, Andrew's concern may not be a problem
    anymore:
    
    bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
    being a separate project. Ozone could release faster and iterate more
    quickly if it wasn't hampered by Hadoop's release schedule and security and
    compatibility requirements.
    
    Thanks,
    Wangda
    
    
    On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
    wrote:
    
    > On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
    > wrote:
    >
    > Owen mentioned making a Hadoop subproject; we'd have to
    > > hash out what exactly this means (I assume a separate repo still managed
    > by
    > > the Hadoop project), but I think we could make this work if it's more
    > > attractive than incubation or a new TLP.
    >
    >
    > Ok, there are multiple levels of sub-projects that all make sense:
    >
    >    - Same source tree, same releases - examples like HDFS & YARN
    >    - Same master branch, separate releases and release branches - Hive's
    >    Storage API vs Hive. It is in the source tree for the master branch, but
    >    has distinct releases and release branches.
    >    - Separate source, separate release - Apache Commons.
    >
    > There are advantages and disadvantages to each. I'd propose that we use the
    > same source, same release pattern for Ozone. Note that we tried and later
    > reverted doing Common, HDFS, and YARN as separate source, separate release
    > because it was too much trouble. I like Daryn's idea of putting it as a top
    > level directory in Hadoop and making sure that nothing in Common, HDFS, or
    > YARN depend on it. That way if a Release Manager doesn't think it is ready
    > for release, it can be trivially removed before the release.
    >
    > One thing about using the same releases, Sanjay and Jitendra are signing up
    > to make much more regular bugfix and minor releases in the near future. For
    > example, they'll need to make 3.2 relatively soon to get it released and
    > then 3.3 somewhere in the next 3 to 6 months. That would be good for the
    > project. Hadoop needs more regular releases and fewer big bang releases.
    >
    > .. Owen
    >
    


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Wangda Tan <wh...@gmail.com>.
I like the idea of same source / same release, putting Ozone's source under
a different directory.

Like Owen mentioned, it is going to be important for all parties to keep a
regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
releases. Users can try features and give feedback to stabilize them
earlier; developers can be happier since their work reaches users soon
after features get merged. In addition, if features are merged to trunk
after reasonable testing/review, Andrew's concern may not be a problem
anymore:

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley <ow...@gmail.com>
wrote:

> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
> wrote:
>
> Owen mentioned making a Hadoop subproject; we'd have to
> > hash out what exactly this means (I assume a separate repo still managed
> by
> > the Hadoop project), but I think we could make this work if it's more
> > attractive than incubation or a new TLP.
>
>
> Ok, there are multiple levels of sub-projects that all make sense:
>
>    - Same source tree, same releases - examples like HDFS & YARN
>    - Same master branch, separate releases and release branches - Hive's
>    Storage API vs Hive. It is in the source tree for the master branch, but
>    has distinct releases and release branches.
>    - Separate source, separate release - Apache Commons.
>
> There are advantages and disadvantages to each. I'd propose that we use the
> same source, same release pattern for Ozone. Note that we tried and later
> reverted doing Common, HDFS, and YARN as separate source, separate release
> because it was too much trouble. I like Daryn's idea of putting it as a top
> level directory in Hadoop and making sure that nothing in Common, HDFS, or
> YARN depend on it. That way if a Release Manager doesn't think it is ready
> for release, it can be trivially removed before the release.
>
> One thing about using the same releases, Sanjay and Jitendra are signing up
> to make much more regular bugfix and minor releases in the near future. For
> example, they'll need to make 3.2 relatively soon to get it released and
> then 3.3 somewhere in the next 3 to 6 months. That would be good for the
> project. Hadoop needs more regular releases and fewer big bang releases.
>
> .. Owen
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
wrote:

Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN as separate source, separate release
because it was too much trouble. I like Daryn's idea of putting it as a top
level directory in Hadoop and making sure that nothing in Common, HDFS, or
YARN depends on it. That way, if a Release Manager doesn't think it is ready
for release, it can be trivially removed before the release.

One thing about using the same releases: Sanjay and Jitendra are signing up
to make much more regular bugfix and minor releases in the near future. For
example, they'll need to make 3.2 relatively soon to get it released and
then 3.3 somewhere in the next 3 to 6 months. That would be good for the
project. Hadoop needs more regular releases and fewer big bang releases.

.. Owen

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew
  Thanks for your response. 

In this email let me focus on maintenance and on unnecessary impact on HDFS.
Daryn also touched on this topic and looked at the code base from the developer-impact point of view. He appreciated that the code is separate, and I agree with his suggestion to move it further up the source tree (e.g., hadoop-hdsl-project or hadoop-hdfs-project/hadoop-hdsl). He also gave a good store-owner analogy: do not break things as you change and evolve the store. Let’s look at the areas of future interaction as examples.

- NN on top of HDSL, where the NN uses the new block layer (both Daryn and Owen acknowledge the benefit of the new block layer). We have two choices here:
 ** a) Evolve the NN so that it can interact with both the old and the new block layer.
 ** b) Fork and create a new NN that works only with the new block layer; the old NN continues to work with the old block layer.
There are trade-offs, but clearly the second option has the least impact on the old HDFS code.
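
To make option (a) concrete, here is a minimal Java sketch, under the assumption that the block layer sits behind a single interface. BlockLayer, LegacyBlockLayer, and HdslBlockLayer are hypothetical names for illustration, not actual Hadoop classes:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction: the NN core codes against BlockLayer, so which
// block layer sits behind it becomes a construction-time choice.
interface BlockLayer {
    long allocateBlock();
    List<String> locateBlock(long blockId);
}

// Stand-in for today's NN-resident block manager.
class LegacyBlockLayer implements BlockLayer {
    private long nextId = 0;
    private final Map<Long, List<String>> locations = new HashMap<>();
    public long allocateBlock() {
        long id = nextId++;
        locations.put(id, List.of("dn1:9866", "dn2:9866", "dn3:9866"));
        return id;
    }
    public List<String> locateBlock(long blockId) { return locations.get(blockId); }
}

// Stand-in for the new layer; a real one would call HDSL's container/SCM APIs.
class HdslBlockLayer implements BlockLayer {
    private long nextId = 1_000_000;
    public long allocateBlock() { return nextId++; }
    public List<String> locateBlock(long blockId) { return List.of("scm:9860"); }
}

public class PluggableNN {
    private final BlockLayer blocks;
    PluggableNN(BlockLayer blocks) { this.blocks = blocks; }
    long createFileBlock() { return blocks.allocateBlock(); }

    public static void main(String[] args) {
        // Same NN code path, two different block layers behind it.
        System.out.println(new PluggableNN(new LegacyBlockLayer()).createFileBlock());
        System.out.println(new PluggableNN(new HdslBlockLayer()).createFileBlock());
    }
}
```

Option (b) avoids even this indirection in the old code path, which is why it has the least impact on existing HDFS.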

- Share HDSL’s netty protocol engine with the HDFS block layer. After HDSL and Ozone have stabilized the engine, put the new netty engine in either HDFS or Hadoop Common; HDSL will use it from there. The HDFS community has been talking about moving to a better thread model for HDFS DNs since release 0.16!
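
For context on what a "better thread model" means: the old DN block layer dedicates roughly one thread per active transfer (the DataXceiver model), while an event-driven engine multiplexes many connections over a few threads. The sketch below only illustrates that model with plain java.nio — no netty, and nothing from the actual HDSL engine:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class EventLoopEcho {

    // Echo one message through a selector-driven server: a single event-loop
    // thread serves every connection, instead of one thread per connection.
    static String echoOnce(String msg) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

        Thread loop = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    selector.select(100);
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove();
                        if (key.isAcceptable()) {
                            SocketChannel ch = server.accept();
                            ch.configureBlocking(false);
                            ch.register(selector, SelectionKey.OP_READ);
                        } else if (key.isReadable()) {
                            SocketChannel ch = (SocketChannel) key.channel();
                            ByteBuffer buf = ByteBuffer.allocate(1024);
                            int n = ch.read(buf);
                            if (n > 0) { buf.flip(); ch.write(buf); }      // echo back
                            else if (n < 0) { key.cancel(); ch.close(); }  // peer closed
                        }
                    }
                }
            } catch (IOException ignored) { }
        });
        loop.setDaemon(true);
        loop.start();

        try (Socket s = new Socket("127.0.0.1", port)) {  // plain blocking client
            byte[] out = msg.getBytes(StandardCharsets.UTF_8);
            s.getOutputStream().write(out);
            byte[] reply = new byte[out.length];
            int off = 0;
            while (off < reply.length) {
                off += s.getInputStream().read(reply, off, reply.length - off);
            }
            return new String(reply, StandardCharsets.UTF_8);
        } finally {
            loop.interrupt();
            server.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(echoOnce("heartbeat"));
    }
}
```

One loop thread here handles any number of concurrent connections; scaling a thread-per-connection design instead means thousands of mostly idle threads on a busy DN.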

- Shallow copy. Here HDSL needs a way to get at the actual Linux file system links: the HDFS block layer needs to provide a private, secure API that returns the file names of blocks so that HDSL can create a hard link (hence a shallow copy).
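
The hard-link mechanism behind a shallow copy can be sketched with plain java.nio.file; the block file name below is made up, and in reality the source path would come from the private HDFS API discussed above:

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class ShallowCopyDemo {

    // "Shallow copy" a block file by creating a hard link: both names point
    // at the same inode, so no block data is duplicated on disk.
    static Path shallowCopy(Path source, Path target) throws Exception {
        return Files.createLink(target, source);  // target is the new link name
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("hdsl-demo");
        Path block = Files.write(dir.resolve("blk_1073741825"), "block bytes".getBytes());
        Path copy = shallowCopy(block, dir.resolve("blk_1073741825.hdsl"));
        // Both names see identical content; only one copy exists on disk.
        System.out.println(new String(Files.readAllBytes(copy)));
    }
}
```

Because the link is just a second directory entry for the same inode, the "copy" is instant and consumes no extra data space — which is also why it only works when both systems share the same local file system, i.e. the same daemon and volumes.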

The first two examples are beneficial to existing HDFS; the maintenance burden can be minimized and is worth the benefits (2x NN scalability, and a more efficient protocol engine). The third benefits only HDFS users who want the scalability of the new HDSL/Ozone code in a side-by-side system; here the cost is providing a private API to access the block file name.


sanjay

> On Mar 1, 2018, at 11:03 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi Sanjay,
> 
> I have different opinions about what's important and how to eventually
> integrate this code, and that's not because I'm "conveniently ignoring"
> your responses. I'm also not making some of the arguments you claim I am
> making. Attacking arguments I'm not making is not going to change my mind,
> so let's bring it back to the arguments I am making.
> 
> Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
> near-term, and it comes with a maintenance cost.
> 
> I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
> integration does not necessarily require a lock split. However, there still
> needs to be refactoring to clearly define the FSN and BM interfaces and
> make the BM pluggable so HDSL can be swapped in. This is a major
> undertaking and risky. We did a similar refactoring in 2.x which made
> backports hard and introduced bugs. I don't think we should have done this
> in a minor release.
> 
> Furthermore, I don't know what your expectation is on how long it will take
> to stabilize HDSL, but this horizon for other storage systems is typically
> measured in years rather than months.
> 
> Both of these feel like Hadoop 4 items: a ways out yet.
> 
> Moving on, there is a non-trivial maintenance cost to having this new code
> in the code base. Ozone bugs become our bugs. Ozone dependencies become our
> dependencies. Ozone's security flaws are our security flaws. All of this
> negatively affects our already lumbering release schedule, and thus our
> ability to deliver and iterate on the features we're already trying to
> ship. Even if Ozone is separate and off by default, this is still a large
> amount of code that comes with a large maintenance cost. I don't want to
> incur this cost when the benefit is still a ways out.
> 
> We disagree on the necessity of sharing a repo and sharing operational
> behaviors. Libraries exist as a method for sharing code. HDFS also hardly
> has a monopoly on intermediating storage today. Disks are shared with MR
> shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
> we've made this work. Having Ozone/HDSL in a separate process can even be
> seen as an operational advantage since it's isolated. I firmly believe that
> we can solve any implementation issues even with separate processes.
> 
> This is why I asked about making this a separate project. Given that these
> two efforts (HDSL stabilization and NN refactoring) are a ways out, the
> best way to get Ozone/HDSL in the hands of users today is to release it as
> its own project. Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.
> 
> I'm excited about the possibilities of both HDSL and the NN refactoring in
> ensuring a future for HDFS for years to come. A pluggable block manager
> would also let us experiment with things like HDFS-on-S3, increasingly
> important in a cloud-centric world. CBlock would bring HDFS to new use cases
> around generic container workloads. However, given the timeline for
> completing these efforts, now is not the time to merge.
> 
> Best,
> Andrew
> 
> On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:
> 
>> I’m generally neutral and looked foremost at developer impact.  I.e., will
>> it be so intertwined with hdfs that each project risks destabilizing the
>> other?  Will developers with no expertise in ozone be impeded?  I
>> think the answer is currently no.  These are the intersections and some
>> concerns based on the assumption ozone is accepted into the project:
>> 
>> 
>> Common
>> 
>> There appear to be a number of superfluous changes.  The conf servlet must not be
>> polluted with specific references and logic for ozone.  We don’t create
>> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
>> “ozone free”.
>> 
>> 
>> Datanode
>> 
>> I expected ozone changes to be intricately linked with the existing blocks
>> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
>> service, the DN should not be polluted with specific references to ozone.
>> If ozone is in the project, the DN should have a generic plugin interface
>> conceptually similar to the NM aux services.
>> 
>> 
>> Namenode
>> 
>> No impact, currently, but certainly will be…
>> 
>> 
>> Code Location
>> 
>> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
>> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
>> hadoop-hdsl-project.  This clean separation will make it easier to later
>> spin off or pull in depending on which way we vote.
>> 
>> 
>> Dependencies
>> 
>> Owen hit upon this before I could send.  Hadoop is already bursting with
>> dependencies, I hope this doesn’t pull in a lot more.
>> 
>> 
>> ––
>> 
>> 
>> Do I think ozone should be a separate project?  If we view it only as a
>> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
>> step with near-term benefits, no, we want to keep it close and help it
>> evolve.  I think ozone/hdsl/whatever has been poorly marketed and is an
>> umbrella term for too many technologies that should perhaps be split.  I'm
>> interested in the container block management.  I have little interest at
>> this time in the key store.
>> 
>> 
>> The usability of ozone, specifically container management, is unclear to
>> me.  It lacks basic features like changing replication factors, append, a
>> migration path, security, etc - I know there are good plans for all of it -
>> yet another goal is splicing into the NN.  That’s a lot of high priority
>> items to tackle that need to be carefully orchestrated before contemplating
>> BM replacement.  Each of those is a non-starter for (my) production
>> environment.  We need to make sure we can reach a consensus on the block
>> level functionality before rushing it into the NN.  That’s independent of
>> whether allowing it into the project.
>> 
>> 
>> The BM/SCM changes to the NN are realistically going to be contentious &
>> destabilizing.  If done correctly, the BM separation will be a big win for
>> the NN.  If ozone is out, by necessity interfaces will need to be stable
>> and well-defined but we won’t get that right for a long time.  Interface
>> and logic changes that break the other will be difficult to coordinate and
>> we’ll likely veto changes that impact the other.  If ozone is in, we can
>> hopefully synchronize the changes with less friction, but it greatly
>> increases the chances of developers riddling the NN with hacks and/or ozone
>> specific logic that makes it even more brittle.  I will note we need to be
>> vigilant against pervasive conditionals (e.g. EC, snapshots).
>> 
>> 
>> In either case, I think ozone must agree to not impede current hdfs work.
>> I’ll compare hdfs to a store owner that plans to maybe retire in 5
>> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
>> them rent-free space (the DN).  The precondition: help improve the store.
>> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
>> that complicate hdfs but ignore it due to anticipation of its
>> departure/demise.  I’m not implying that’s currently happening, it’s just
>> what I don’t want to see.
>> 
>> 
>> We as a community and our customers need an evolution, not a revolution,
>> and definitively not a civil war.  Hdfs has too much legacy code rot that
>> is hard to change.  Too many poorly implemented features.   Perhaps I’m
>> overly optimistic that freshly redesigned code can counterbalance
>> performance degradations in the NN.  I’m also reluctant, but realize it is
>> being driven by some hdfs veterans that know/understand historical hdfs
>> design strengths and flaws.
>> 
>> 
>> If the initially cited issues are addressed, I’m +0.5 for the concept of
>> bringing in ozone if it's not going to be a proverbial bull in the china
>> shop.
>> 
>> 
>> Daryn
>> 
>> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
>>> 
>> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>> is
>>> a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>> layer
>>> as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/
>>> Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread.
>> We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability given ozone
>> brings
>>> in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>> detailed
>>> explanations and answers on the jira as well as in the discussions. We
>>> believe that should sufficiently address community’s concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the
>> new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone
>> is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed simply milestone 2 and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and a
>>> simpler solution would be useful
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This
>> will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old
>> hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if
>> configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the
>> old
>>> HDFS continues to remains stable. (for example we plan to stabilize the
>> new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual
>> production
>>> use till the new system has feature parity with old HDFS. During this
>> time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>> place
>>> to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>> capacity
>>> across disks - this needs to be done for both the current HDFS blocks and
>>> the new block-layer blocks.
>>>                  - Balancer: we would like use the same balancer and
>>> provide a common way to balance and common management of the bandwidth
>> used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather then a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer’s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security setting and coordinations across daemons. Shallow copy is
>>> useful as customer migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>> possible
>>> (anything is possible in software),  it is significantly harder typically
>>> requiring  cleaner public apis etc. Sharing within a project though
>>> internal APIs is often simpler (such as the protocol engine that we want
>> to
>>> share).
>>> 
>>> 
>>>                5) Security design, including a threat model and and the
>>> solution has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two
>> code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there needs to be to be good reasons to separate now.
>> We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merge the new code back into HDFS later will be
>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>> same
>>> then.
>>> 
>>> 
>>>                ------------------------------
>>> ---------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.
>>> apache.org
>>>                For additional commands, e-mail:
>>> hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> 
>> Daryn
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang <an...@cloudera.com>
wrote:

Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.


Ok, there are multiple levels of sub-projects that all make sense:

   - Same source tree, same releases - examples like HDFS & YARN
   - Same master branch, separate releases and release branches - Hive's
   Storage API vs Hive. It is in the source tree for the master branch, but
   has distinct releases and release branches.
   - Separate source, separate release - Apache Commons.

There are advantages and disadvantages to each. I'd propose that we use the
same source, same release pattern for Ozone. Note that we tried and later
reverted doing Common, HDFS, and YARN as separate source, separate release
because it was too much trouble. I like Daryn's idea of putting it as a top
level directory in Hadoop and making sure that nothing in Common, HDFS, or
YARN depend on it. That way if a Release Manager doesn't think it is ready
for release, it can be trivially removed before the release.

One thing about using the same releases, Sanjay and Jitendra are signing up
to make much more regular bugfix and minor releases in the near future. For
example, they'll need to make 3.2 relatively soon to get it released and
then 3.3 somewhere in the next 3 to 6 months. That would be good for the
project. Hadoop needs more regular releases and fewer big bang releases.

.. Owen

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by sanjay Radia <sa...@gmail.com>.
Andrew
  Thanks for your response. 

 In this email let me focus on maintenance and unnecessary impact on HDFS.
Daryn also touched on this topic and looked at the code base from the developer impact point of view. He appreciated that the code is separate and I agree with his suggestion to move it further up the src tree (e.g. Hadoop-hdsl-project or hadoop-hdfs-project/hadoop-hdsl). He also gave a good analogy to the store: do not break things as you change and evolve the store. Let’s look at the areas of future interaction as examples.

- NN on top HDSL where the NN uses the new block layer (Both Daryn and Owen acknowledge the benefit of the new block layer).  We have two choices here 
 ** a) Evolve NN so that it can interact with both old and new block layer, 
 **  b) Fork and create new NN that works only with new block layer, the old NN will continue to work with old block layer. 
There are trade-offs but clearly the 2nd option has least impact on the old HDFS code.  

- Share the HDSL’s netty  protocol engine with HDFS block layer.  After HDSL and Ozone has stabilized the engine, put the new netty engine in either HDFS or in Hadoop common - HDSL will use it from there. The HDFS community  has been talking about moving to better thread model for HDFS DNs since release 0.16!!

- Shallow copy. Here HDSL needs a way to get the actual linux file system links - HDFS block layer needs  to provide a private secure API to get file names of blocks so that HDSL can do a hard link (hence shallow copy)o

The first 2 examples are beneficial to existing HDFS and the maintenance burden can be minimized and worth the benefits (2x NN scalability!! And more efficient protocol engine). The 3rd is only beneficial to HDFS users who want the scalability of the new HDSL/Ozone code in a side-by-side system; here the cost is providing a  private API to access the block file name. 


sanjay

> On Mar 1, 2018, at 11:03 PM, Andrew Wang <an...@cloudera.com> wrote:
> 
> Hi Sanjay,
> 
> I have different opinions about what's important and how to eventually
> integrate this code, and that's not because I'm "conveniently ignoring"
> your responses. I'm also not making some of the arguments you claim I am
> making. Attacking arguments I'm not making is not going to change my mind,
> so let's bring it back to the arguments I am making.
> 
> Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
> near-term, and it comes with a maintenance cost.
> 
> I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
> integration does not necessarily require a lock split. However, there still
> needs to be refactoring to clearly define the FSN and BM interfaces and
> make the BM pluggable so HDSL can be swapped in. This is a major
> undertaking and risky. We did a similar refactoring in 2.x which made
> backports hard and introduced bugs. I don't think we should have done this
> in a minor release.
> 
> Furthermore, I don't know what your expectation is on how long it will take
> to stabilize HDSL, but this horizon for other storage systems is typically
> measured in years rather than months.
> 
> Both of these feel like Hadoop 4 items: a ways out yet.
> 
> Moving on, there is a non-trivial maintenance cost to having this new code
> in the code base. Ozone bugs become our bugs. Ozone dependencies become our
> dependencies. Ozone's security flaws are our security flaws. All of this
> negatively affects our already lumbering release schedule, and thus our
> ability to deliver and iterate on the features we're already trying to
> ship. Even if Ozone is separate and off by default, this is still a large
> amount of code that comes with a large maintenance cost. I don't want to
> incur this cost when the benefit is still a ways out.
> 
> We disagree on the necessity of sharing a repo and sharing operational
> behaviors. Libraries exist as a method for sharing code. HDFS also hardly
> has a monopoly on intermediating storage today. Disks are shared with MR
> shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
> we've made this work. Having Ozone/HDSL in a separate process can even be
> seen as an operational advantage since it's isolated. I firmly believe that
> we can solve any implementation issues even with separate processes.
> 
> This is why I asked about making this a separate project. Given that these
> two efforts (HDSL stabilization and NN refactoring) are a ways out, the
> best way to get Ozone/HDSL in the hands of users today is to release it as
> its own project. Owen mentioned making a Hadoop subproject; we'd have to
> hash out what exactly this means (I assume a separate repo still managed by
> the Hadoop project), but I think we could make this work if it's more
> attractive than incubation or a new TLP.
> 
> I'm excited about the possibilities of both HDSL and the NN refactoring in
> ensuring a future for HDFS for years to come. A pluggable block manager
> would also let us experiment with things like HDFS-on-S3, increasingly
> important in a cloud-centric world. CBlock would bring HDFS to new usecases
> around generic container workloads. However, given the timeline for
> completing these efforts, now is not the time to merge.
> 
> Best,
> Andrew
> 
> On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:
> 
>> I’m generally neutral and looked foremost at developer impact.  Ie.  Will
>> it be so intertwined with hdfs that each project risks destabilizing the
>> other?  Will developers with no expertise in ozone will be impeded?  I
>> think the answer is currently no.  These are the intersections and some
>> concerns based on the assumption ozone is accepted into the project:
>> 
>> 
>> Common
>> 
>> Appear to be a number of superfluous changes.  The conf servlet must not be
>> polluted with specific references and logic for ozone.  We don’t create
>> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
>> “ozone free”.
>> 
>> 
>> Datanode
>> 
>> I expected ozone changes to be intricately linked with the existing blocks
>> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
>> service, the DN should not be polluted with specific references to ozone.
>> If ozone is in the project, the DN should have a generic plugin interface
>> conceptually similar to the NM aux services.
>> 
>> 
>> Namenode
>> 
>> No impact, currently, but certainly will be…
>> 
>> 
>> Code Location
>> 
>> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
>> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
>> hadoop-hdsl-project.  This clean separation will make it easier to later
>> spin off or pull in depending on which way we vote.
>> 
>> 
>> Dependencies
>> 
>> Owen hit upon his before I could send.  Hadoop is already bursting with
>> dependencies, I hope this doesn’t pull in a lot more.
>> 
>> 
>> ––
>> 
>> 
>> Do I think ozone be should be a separate project?  If we view it only as a
>> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
>> step with near-term benefits, no, we want to keep it close and help it
>> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
>> umbrella term for too many technologies that should perhaps be split.  I'm
>> interested in the container block management.  I have little interest at
>> this time in the key store.
>> 
>> 
>> The usability of ozone, specifically container management, is unclear to
>> me.  It lacks basic features like changing replication factors, append, a
>> migration path, security, etc - I know there are good plans for all of it -
>> yet another goal is splicing into the NN.  That’s a lot of high priority
>> items to tackle that need to be carefully orchestrated before contemplating
>> BM replacement.  Each of those is a non-starter for (my) production
>> environment.  We need to make sure we can reach a consensus on the block
>> level functionality before rushing it into the NN.  That’s independent of
>> whether allowing it into the project.
>> 
>> 
>> The BM/SCM changes to the NN are realistically going to be contentious &
>> destabilizing.  If done correctly, the BM separation will be a big win for
>> the NN.  If ozone is out, by necessity interfaces will need to be stable
>> and well-defined but we won’t get that right for a long time.  Interface
>> and logic changes that break the other will be difficult to coordinate and
>> we’ll likely veto changes that impact the other.  If ozone is in, we can
>> hopefully synchronize the changes with less friction, but it greatly
>> increases the chances of developers riddling the NN with hacks and/or ozone
>> specific logic that makes it even more brittle.  I will note we need to be
>> vigilant against pervasive conditionals (ie. EC, snapshots).
>> 
>> 
>> In either case, I think ozone must agree to not impede current hdfs work.
>> I’ll compare to hdfs is a store owner that plans to maybe retire in 5
>> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
>> them no-rent space (the DN).  Precondition is help improve the store.
>> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
>> that complicate hdfs but ignore it due to anticipation of its
>> departure/demise.  I’m not implying that’s currently happening, it’s just
>> what I don’t want to see.
>> 
>> 
>> We as a community and our customers need an evolution, not a revolution,
>> and definitively not a civil war.  Hdfs has too much legacy code rot that
>> is hard to change.  Too many poorly implemented features.   Perhaps I’m
>> overly optimistic that freshly redesigned code can counterbalance
>> performance degradations in the NN.  I’m also reluctant, but realize it is
>> being driven by some hdfs veterans that know/understand historical hdfs
>> design strengths and flaws.
>> 
>> 
>> If the initially cited issues are addressed, I’m +0.5 for the concept of
>> bringing in ozone if it's not going to be a proverbial bull in the china
>> shop.
>> 
>> 
>> Daryn
>> 
>> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
>>> 
>> wrote:
>> 
>>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>> 
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>> is
>>> a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>> layer
>>> as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>> 
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>> 
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/
>>> Hadoop+Distributed+Storage+Layer+and+Applications
>>> 
>>> 
>>>    I will start with my vote.
>>>            +1 (binding)
>>> 
>>> 
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>> 
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>> 
>>> 
>>>    Thanks
>>>    jitendra
>>> 
>>> 
>>> 
>>> 
>>> 
>>>            DISCUSSION THREAD SUMMARY :
>>> 
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>> 
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>> 
>>> 
>>>                Dear
>>>                 Hadoop Community Members,
>>> 
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread.
>> We
>>> express our gratitude for participation and valuable comments.
>>> 
>>>                The key questions raised were the following:
>>>                1) How do the new block storage layer and OzoneFS benefit
>>> HDFS? We were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer.
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability given ozone
>> brings
>>> in a large body of code.
>>>                4) Why can’t they be separate projects forever or merged
>>> in when production ready?
>>> 
>>>                We have responded to all the above questions with
>> detailed
>>> explanations and answers on the jira as well as in the discussions. We
>>> believe that should sufficiently address the community’s concerns.
>>> 
>>>                Please see the summary below:
>>> 
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>> 
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how the existing NN can be connected to the
>> new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone
>> is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed only milestone 2, and the community
>>> felt that removing the FSN/BM lock was a fair amount of work and that a
>>> simpler solution would be useful.
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This
>> will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old
>> hdfs
>>> block layer. See details below on sharing of code.
>>> 
>>> 
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for the DN starting the new block layer module if
>> configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the
>> old
>>> HDFS continues to remain stable. (For example, we plan to stabilize the
>> new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS’s old block layer)
>>> 
>>> 
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual
>> production
>>> use till the new system has feature parity with old HDFS. During this
>> time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>> place
>>> to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>> capacity
>>> across disks - this needs to be done for both the current HDFS blocks and
>>> the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance and common management of the bandwidth
>> used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather than a new one for an independent cluster.
>>> 
>>> 
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing the new block layer’s new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security setting and coordinations across daemons. Shallow copy is
>>> useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>> possible
>>> (anything is possible in software),  it is significantly harder typically
>>> requiring cleaner public APIs, etc. Sharing within a project through
>>> internal APIs is often simpler (such as the protocol engine that we want
>> to
>>> share).
>>> 
>>> 
>>>                5) Security design, including a threat model and the
>>> solution has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two
>> code
>>> bases for now and then later merge them when the new code is stable:
>>> 
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now.
>> We
>>> have addressed the stability and separation of the new code from existing
>>> above.
>>>                  - Merging the new code back into HDFS later will be
>> harder.
>>> 
>>>                    **The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>> same
>>> then.
>>> 
>>> 
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> 
>> Daryn
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay,

I have different opinions about what's important and how to eventually
integrate this code, and that's not because I'm "conveniently ignoring"
your responses. I'm also not making some of the arguments you claim I am
making. Attacking arguments I'm not making is not going to change my mind,
so let's bring it back to the arguments I am making.

Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
near-term, and it comes with a maintenance cost.

I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
integration does not necessarily require a lock split. However, there still
needs to be refactoring to clearly define the FSN and BM interfaces and
make the BM pluggable so HDSL can be swapped in. This is a major
undertaking and risky. We did a similar refactoring in 2.x which made
backports hard and introduced bugs. I don't think we should have done this
in a minor release.
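To make the scope of that refactoring concrete, here is a minimal sketch of the kind of seam it would have to carve out of the NameNode: a block-manager interface the namespace code programs against, so an HDSL-backed implementation could be swapped in. All of the interface and class names below are hypothetical illustrations, not Hadoop's actual API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pluggable seam: the NameNode would talk to blocks only
// through this interface rather than through its internal block map.
interface BlockManagerApi {
    long allocateBlock(String path);         // returns a new block id
    List<String> locateBlock(long blockId);  // datanode addresses holding it
}

// The existing in-NameNode block map would be one implementation;
// an HDSL/SCM-backed manager would be another, swapped in via config.
class InMemoryBlockManager implements BlockManagerApi {
    private final Map<Long, List<String>> blocks = new HashMap<>();
    private long nextId = 0;

    @Override
    public long allocateBlock(String path) {
        long id = nextId++;
        // Illustrative placement: three fixed replica locations.
        blocks.put(id, Arrays.asList("dn1:9866", "dn2:9866", "dn3:9866"));
        return id;
    }

    @Override
    public List<String> locateBlock(long blockId) {
        return blocks.get(blockId);
    }
}

public class PluggableBmSketch {
    public static void main(String[] args) {
        BlockManagerApi bm = new InMemoryBlockManager();
        long id = bm.allocateBlock("/user/foo/part-0");
        System.out.println("block " + id + " -> " + bm.locateBlock(id));
    }
}
```

The risk Andrew describes lives in drawing this boundary inside the real FSNamesystem/BlockManager code, where the two are currently entangled under a shared lock.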

Furthermore, I don't know what your expectation is on how long it will take
to stabilize HDSL, but this horizon for other storage systems is typically
measured in years rather than months.

Both of these feel like Hadoop 4 items: a ways out yet.

Moving on, there is a non-trivial maintenance cost to having this new code
in the code base. Ozone bugs become our bugs. Ozone dependencies become our
dependencies. Ozone's security flaws are our security flaws. All of this
negatively affects our already lumbering release schedule, and thus our
ability to deliver and iterate on the features we're already trying to
ship. Even if Ozone is separate and off by default, this is still a large
amount of code that comes with a large maintenance cost. I don't want to
incur this cost when the benefit is still a ways out.

We disagree on the necessity of sharing a repo and sharing operational
behaviors. Libraries exist as a method for sharing code. HDFS also hardly
has a monopoly on intermediating storage today. Disks are shared with MR
shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
we've made this work. Having Ozone/HDSL in a separate process can even be
seen as an operational advantage since it's isolated. I firmly believe that
we can solve any implementation issues even with separate processes.

This is why I asked about making this a separate project. Given that these
two efforts (HDSL stabilization and NN refactoring) are a ways out, the
best way to get Ozone/HDSL in the hands of users today is to release it as
its own project. Owen mentioned making a Hadoop subproject; we'd have to
hash out what exactly this means (I assume a separate repo still managed by
the Hadoop project), but I think we could make this work if it's more
attractive than incubation or a new TLP.

I'm excited about the possibilities of both HDSL and the NN refactoring in
ensuring a future for HDFS for years to come. A pluggable block manager
would also let us experiment with things like HDFS-on-S3, increasingly
important in a cloud-centric world. CBlock would bring HDFS to new use cases
around generic container workloads. However, given the timeline for
completing these efforts, now is not the time to merge.

Best,
Andrew

On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:

> I’m generally neutral and looked foremost at developer impact.  Ie.  Will
> it be so intertwined with hdfs that each project risks destabilizing the
> other?  Will developers with no expertise in ozone be impeded?  I
> think the answer is currently no.  These are the intersections and some
> concerns based on the assumption ozone is accepted into the project:
>
>
> Common
>
> There appear to be a number of superfluous changes.  The conf servlet must not be
> polluted with specific references and logic for ozone.  We don’t create
> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
> “ozone free”.
>
>
> Datanode
>
> I expected ozone changes to be intricately linked with the existing blocks
> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
> service, the DN should not be polluted with specific references to ozone.
> If ozone is in the project, the DN should have a generic plugin interface
> conceptually similar to the NM aux services.
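A generic DN plugin seam of the sort described above, in the spirit of the NodeManager aux-services model, might look like the following. This is purely an illustrative sketch; the interface and names are hypothetical, not an actual Hadoop API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical generic DataNode service plugin: the DN core stays free of
// direct references to any one service (ozone/HDSL or otherwise).
interface DataNodeServicePlugin {
    String name();
    void start(Map<String, String> conf);  // invoked once the DN is up
    void stop();
}

// An HDSL block service would register as one such plugin.
class HdslPlugin implements DataNodeServicePlugin {
    private boolean running = false;
    @Override public String name() { return "hdsl"; }
    @Override public void start(Map<String, String> conf) { running = true; }
    @Override public void stop() { running = false; }
    public boolean isRunning() { return running; }
}

public class DnPluginSketch {
    public static void main(String[] args) {
        // A real DN would load configured plugin classes reflectively;
        // here one is wired by hand to show the lifecycle.
        List<DataNodeServicePlugin> plugins = new ArrayList<>();
        plugins.add(new HdslPlugin());
        for (DataNodeServicePlugin p : plugins) {
            p.start(Collections.emptyMap());
            System.out.println("started plugin: " + p.name());
        }
    }
}
```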
>
>
> Namenode
>
> No impact, currently, but certainly will be…
>
>
> Code Location
>
> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
> hadoop-hdsl-project.  This clean separation will make it easier to later
> spin off or pull in depending on which way we vote.
>
>
> Dependencies
>
> Owen hit upon this before I could send.  Hadoop is already bursting with
> dependencies, I hope this doesn’t pull in a lot more.
>
>
> ––
>
>
> Do I think ozone should be a separate project?  If we view it only as a
> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
> step with near-term benefits, no, we want to keep it close and help it
> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
> umbrella term for too many technologies that should perhaps be split.  I'm
> interested in the container block management.  I have little interest at
> this time in the key store.
>
>
> The usability of ozone, specifically container management, is unclear to
> me.  It lacks basic features like changing replication factors, append, a
> migration path, security, etc - I know there are good plans for all of it -
> yet another goal is splicing into the NN.  That’s a lot of high priority
> items to tackle that need to be carefully orchestrated before contemplating
> BM replacement.  Each of those is a non-starter for (my) production
> environment.  We need to make sure we can reach a consensus on the block
> level functionality before rushing it into the NN.  That’s independent of
> whether allowing it into the project.
>
>
> The BM/SCM changes to the NN are realistically going to be contentious &
> destabilizing.  If done correctly, the BM separation will be a big win for
> the NN.  If ozone is out, by necessity interfaces will need to be stable
> and well-defined but we won’t get that right for a long time.  Interface
> and logic changes that break the other will be difficult to coordinate and
> we’ll likely veto changes that impact the other.  If ozone is in, we can
> hopefully synchronize the changes with less friction, but it greatly
> increases the chances of developers riddling the NN with hacks and/or ozone
> specific logic that makes it even more brittle.  I will note we need to be
> vigilant against pervasive conditionals (ie. EC, snapshots).
>
>
> In either case, I think ozone must agree to not impede current hdfs work.
> I’ll compare hdfs to a store owner that plans to maybe retire in 5
> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
> them no-rent space (the DN).  Precondition is help improve the store.
> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
> that complicate hdfs but ignore it due to anticipation of its
> departure/demise.  I’m not implying that’s currently happening, it’s just
> what I don’t want to see.
>
>
> We as a community and our customers need an evolution, not a revolution,
> and definitively not a civil war.  Hdfs has too much legacy code rot that
> is hard to change.  Too many poorly implemented features.   Perhaps I’m
> overly optimistic that freshly redesigned code can counterbalance
> performance degradations in the NN.  I’m also reluctant, but realize it is
> being driven by some hdfs veterans that know/understand historical hdfs
> design strengths and flaws.
>
>
> If the initially cited issues are addressed, I’m +0.5 for the concept of
> bringing in ozone if it's not going to be a proverbial bull in the china
> shop.
>
>
> Daryn
>
>
>
> --
>
> Daryn
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay,

I have different opinions about what's important and how to eventually
integrate this code, and that's not because I'm "conveniently ignoring"
your responses. I'm also not making some of the arguments you claim I am
making. Attacking arguments I'm not making is not going to change my mind,
so let's bring it back to the arguments I am making.

Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
near-term, and it comes with a maintenance cost.

I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
integration does not necessarily require a lock split. However, there still
needs to be refactoring to clearly define the FSN and BM interfaces and
make the BM pluggable so HDSL can be swapped in. This is a major
undertaking and risky. We did a similar refactoring in 2.x which made
backports hard and introduced bugs. I don't think we should have done this
in a minor release.

Furthermore, I don't know what your expectation is on how long it will take
to stabilize HDSL, but this horizon for other storage systems is typically
measured in years rather than months.

Both of these feel like Hadoop 4 items: a ways out yet.

Moving on, there is a non-trivial maintenance cost to having this new code
in the code base. Ozone bugs become our bugs. Ozone dependencies become our
dependencies. Ozone's security flaws are our security flaws. All of this
negatively affects our already lumbering release schedule, and thus our
ability to deliver and iterate on the features we're already trying to
ship. Even if Ozone is separate and off by default, this is still a large
amount of code that comes with a large maintenance cost. I don't want to
incur this cost when the benefit is still a ways out.

We disagree on the necessity of sharing a repo and sharing operational
behaviors. Libraries exist as a method for sharing code. HDFS also hardly
has a monopoly on intermediating storage today. Disks are shared with MR
shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
we've made this work. Having Ozone/HDSL in a separate process can even be
seen as an operational advantage since it's isolated. I firmly believe that
we can solve any implementation issues even with separate processes.

This is why I asked about making this a separate project. Given that these
two efforts (HDSL stabilization and NN refactoring) are a ways out, the
best way to get Ozone/HDSL in the hands of users today is to release it as
its own project. Owen mentioned making a Hadoop subproject; we'd have to
hash out what exactly this means (I assume a separate repo still managed by
the Hadoop project), but I think we could make this work if it's more
attractive than incubation or a new TLP.

I'm excited about the possibilities of both HDSL and the NN refactoring in
ensuring a future for HDFS for years to come. A pluggable block manager
would also let us experiment with things like HDFS-on-S3, increasingly
important in a cloud-centric world. CBlock would bring HDFS to new usecases
around generic container workloads. However, given the timeline for
completing these efforts, now is not the time to merge.

Best,
Andrew

On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:

> I’m generally neutral and looked foremost at developer impact.  Ie.  Will
> it be so intertwined with hdfs that each project risks destabilizing the
> other?  Will developers with no expertise in ozone will be impeded?  I
> think the answer is currently no.  These are the intersections and some
> concerns based on the assumption ozone is accepted into the project:
>
>
> Common
>
> Appear to be a number of superfluous changes.  The conf servlet must not be
> polluted with specific references and logic for ozone.  We don’t create
> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
> “ozone free”.
>
>
> Datanode
>
> I expected ozone changes to be intricately linked with the existing blocks
> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
> service, the DN should not be polluted with specific references to ozone.
> If ozone is in the project, the DN should have a generic plugin interface
> conceptually similar to the NM aux services.
>
>
> Namenode
>
> No impact, currently, but certainly will be…
>
>
> Code Location
>
> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
> hadoop-hdsl-project.  This clean separation will make it easier to later
> spin off or pull in depending on which way we vote.
>
>
> Dependencies
>
> Owen hit upon his before I could send.  Hadoop is already bursting with
> dependencies, I hope this doesn’t pull in a lot more.
>
>
> ––
>
>
> Do I think ozone be should be a separate project?  If we view it only as a
> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
> step with near-term benefits, no, we want to keep it close and help it
> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
> umbrella term for too many technologies that should perhaps be split.  I'm
> interested in the container block management.  I have little interest at
> this time in the key store.
>
>
> The usability of ozone, specifically container management, is unclear to
> me.  It lacks basic features like changing replication factors, append, a
> migration path, security, etc - I know there are good plans for all of it -
> yet another goal is splicing into the NN.  That’s a lot of high priority
> items to tackle that need to be carefully orchestrated before contemplating
> BM replacement.  Each of those is a non-starter for (my) production
> environment.  We need to make sure we can reach a consensus on the block
> level functionality before rushing it into the NN.  That’s independent of
> whether allowing it into the project.
>
>
> The BM/SCM changes to the NN are realistically going to be contentious &
> destabilizing.  If done correctly, the BM separation will be a big win for
> the NN.  If ozone is out, by necessity interfaces will need to be stable
> and well-defined but we won’t get that right for a long time.  Interface
> and logic changes that break the other will be difficult to coordinate and
> we’ll likely veto changes that impact the other.  If ozone is in, we can
> hopefully synchronize the changes with less friction, but it greatly
> increases the chances of developers riddling the NN with hacks and/or ozone
> specific logic that makes it even more brittle.  I will note we need to be
> vigilant against pervasive conditionals (ie. EC, snapshots).
>
>
> In either case, I think ozone must agree to not impede current hdfs work.
> I’ll compare to hdfs is a store owner that plans to maybe retire in 5
> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
> them no-rent space (the DN).  Precondition is help improve the store.
> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
> that complicate hdfs but ignore it due to anticipation of its
> departure/demise.  I’m not implying that’s currently happening, it’s just
> what I don’t want to see.
>
>
> We as a community and our customers need an evolution, not a revolution,
> and definitively not a civil war.  Hdfs has too much legacy code rot that
> is hard to change.  Too many poorly implemented features.   Perhaps I’m
> overly optimistic that freshly redesigned code can counterbalance
> performance degradations in the NN.  I’m also reluctant, but realize it is
> being driven by some hdfs veterans that know/understand historical hdfs
> design strengths and flaws.
>
>
> If the initially cited issues are addressed, I’m +0.5 for the concept of
> bringing in ozone if it's not going to be a proverbial bull in the china
> shop.
>
>
> Daryn
>
> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
> >
> wrote:
>
> >     Dear folks,
> >            We would like to start a vote to merge HDFS-7240 branch into
> > trunk. The context can be reviewed in the DISCUSSION thread, and in the
> > jiras (See references below).
> >
> >     HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
> is
> > a distributed, replicated block layer.
> >     The old HDFS namespace and NN can be connected to this new block
> layer
> > as we have described in HDFS-10419.
> >     We also introduce a key-value namespace called Ozone built on HDSL.
> >
> >     The code is in a separate module and is turned off by default. In a
> > secure setup, HDSL and Ozone daemons cannot be started.
> >
> >     The detailed documentation is available at
> >              https://cwiki.apache.org/confluence/display/HADOOP/
> > Hadoop+Distributed+Storage+Layer+and+Applications
> >
> >
> >     I will start with my vote.
> >             +1 (binding)
> >
> >
> >     Discussion Thread:
> >               https://s.apache.org/7240-merge
> >               https://s.apache.org/4sfU
> >
> >     Jiras:
> >                https://issues.apache.org/jira/browse/HDFS-7240
> >                https://issues.apache.org/jira/browse/HDFS-10419
> >                https://issues.apache.org/jira/browse/HDFS-13074
> >                https://issues.apache.org/jira/browse/HDFS-13180
> >
> >
> >     Thanks
> >     jitendra
> >
> >
> >
> >
> >
> >             DISCUSSION THREAD SUMMARY :
> >
> >             On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
> > wrote:
> >
> >                 Sorry the formatting got messed by my email client.  Here
> > it is again
> >
> >
> >                 Dear
> >                  Hadoop Community Members,
> >
> >                    We had multiple community discussions, a few meetings
> > in smaller groups and also jira discussions with respect to this thread.
> We
> > express our gratitude for participation and valuable comments.
> >
> >                 The key questions raised were the following:
> >                 1) How do the new block storage layer and OzoneFS benefit
> > HDFS? We were asked to chalk out a roadmap towards the goal of a
> > scalable namenode working with the new storage layer.
> >                 2) We were asked to provide a security design
> >                 3) There were questions around stability given ozone
> brings
> > in a large body of code.
> >                 4) Why can’t they be separate projects forever or merged
> > in when production ready?
> >
> >                 We have responded to all the above questions with
> detailed
> > explanations and answers on the jira as well as in the discussions. We
> > believe that this should sufficiently address the community’s concerns.
> >
> >                 Please see the summary below:
> >
> >                 1) The new code base benefits HDFS scaling and a roadmap
> > has been provided.
> >
> >                 Summary:
> >                   - New block storage layer addresses the scalability of
> > the block layer. We have shown how existing NN can be connected to the
> new
> > block layer and its benefits. We have shown 2 milestones; the 1st is
> > much simpler than the 2nd while giving almost the same scaling
> > benefits. Originally we had proposed only milestone 2, but the community
> > felt that removing the FSN/BM lock was a fair amount of work and a
> > simpler solution would be useful.
> >                   - We provide a new K-V namespace called Ozone FS with
> > FileSystem/FileContext plugins to allow the users to use the new system.
> > BTW Hive and Spark work very well on KV-namespaces on the cloud. This
> will
> > facilitate stabilizing the new block layer.
> >                   - The new block layer has a new netty based protocol
> > engine in the Datanode which, when stabilized, can be used by  the old
> hdfs
> > block layer. See details below on sharing of code.
> >
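[Editor's note: a FileSystem plugin of this kind is typically wired in through client configuration alone, roughly like the sketch below. The property name and class are illustrative guesses, not the final Ozone keys.]

```xml
<!-- core-site.xml (hypothetical): map a URI scheme to the new
     FileSystem implementation so existing apps can use it -->
<property>
  <name>fs.o3.impl</name>
  <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
</property>
```

Existing Hive/Spark jobs would then address the K-V namespace with paths like o3://bucket.volume/key, without code changes.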
> >
> >                 2) Stability impact on the existing HDFS code base and
> > code separation. The new block layer and the OzoneFS are in modules that
> > are separate from old HDFS code - currently there are no calls from HDFS
> > into Ozone except for DN starting the new block  layer module if
> configured
> > to do so. It does not add instability (the instability argument has been
> > raised many times). Over time as we share code, we will ensure that the
> old
> > HDFS continues to remain stable. (For example, we plan to stabilize the
> new
> > netty based protocol engine in the new block layer before sharing it with
> > HDFS’s old block layer)
> >
> >
> >                 3) In the short term and medium term, the new system and
> > HDFS will be used side-by-side by users. Side-by-side usage in the short
> > term for testing and side-by-side in the medium term for actual
> production
> > use till the new system has feature parity with old HDFS. During this
> time,
> > sharing the DN daemon and admin functions between the two systems is
> > operationally important:
> >                   - Sharing DN daemon to avoid additional operational
> > daemon lifecycle management
> >                   - Common decommissioning of the daemon and DN: One
> place
> > to decommission for a node and its storage.
> >                   - Replacing failed disks and internal balancing
> capacity
> > across disks - this needs to be done for both the current HDFS blocks and
> > the new block-layer blocks.
> >                   - Balancer: we would like to use the same balancer and
> > provide a common way to balance, and common management of the bandwidth
> > used for balancing.
> >                   - Security configuration setup - reuse the existing setup
> > for DNs rather than a new one for an independent cluster.
> >
> >
> >                 4) Need to easily share the block layer code between the
> > two systems when used side-by-side. Areas where sharing code is desired
> > over time:
> >                   - Sharing the new block layer’s netty based protocol
> > engine for old HDFS DNs (a long-time sore issue for the HDFS block layer).
> >                   - Shallow data copy from the old system to the new one is
> > practical only within the same project and daemon; otherwise we have to deal
> > with security settings and coordination across daemons. Shallow copy is
> > useful as customers migrate from old to new.
> >                   - Shared disk scheduling in the future; in the short
> > term, have a single round robin rather than independent round robins.
> >                 While sharing code across projects is technically possible
> > (anything is possible in software), it is significantly harder, typically
> > requiring cleaner public APIs etc. Sharing within a project through
> > internal APIs is often simpler (such as the protocol engine that we want
> > to share).
> >
> >
> >                 5) Security design, including a threat model and the
> > solution, has been posted.
> >                 6) Temporary Separation and merge later: Several of the
> > comments in the jira have argued that we temporarily separate the two
> code
> > bases for now and then later merge them when the new code is stable:
> >
> >                   - If there is agreement to merge later, why bother
> > separating now - there need to be good reasons to separate now.
> We
> > have addressed the stability and separation of the new code from existing
> > above.
> >                   - Merging the new code back into HDFS later will be
> harder.
> >
> >                     ** The code and goals will diverge further.
> >                     ** We will be taking on extra work to split and then
> > take extra work to merge.
> >                     ** The issues raised today will be raised all the
> same
> > then.
> >
> >
> >                 ------------------------------
> > ---------------------------------------
> >                 To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.
> > apache.org
> >                 For additional commands, e-mail:
> > hdfs-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
> --
>
> Daryn
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay,

I have different opinions about what's important and how to eventually
integrate this code, and that's not because I'm "conveniently ignoring"
your responses. I'm also not making some of the arguments you claim I am
making. Attacking arguments I'm not making is not going to change my mind,
so let's bring it back to the arguments I am making.

Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
near-term, and it comes with a maintenance cost.

I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
integration does not necessarily require a lock split. However, there still
needs to be refactoring to clearly define the FSN and BM interfaces and
make the BM pluggable so HDSL can be swapped in. This is a major
undertaking and risky. We did a similar refactoring in 2.x which made
backports hard and introduced bugs. I don't think we should have done this
in a minor release.
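[Editor's note: to make the refactoring being discussed concrete, here is a rough sketch of the kind of seam involved — the namespace layer codes only against a narrow block-manager interface so a different backend could be swapped in behind it. All names and signatures below are hypothetical, for illustration only; this is not the actual HDFS API.]

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seam between the namespace (FSN) and the block
// manager. The FSN would call only this interface, so an
// HDSL-backed implementation could be plugged in behind it.
interface BlockManagerApi {
    long allocateBlock(String path, int replication);
    List<String> getBlockLocations(long blockId);
    void removeBlock(long blockId);
}

// Toy in-process implementation standing in for the current
// in-NameNode block manager.
class InMemoryBlockManager implements BlockManagerApi {
    private long nextId = 1;
    private final Map<Long, List<String>> blocks = new HashMap<>();

    public long allocateBlock(String path, int replication) {
        long id = nextId++;
        // Pretend placement: take the first `replication` datanodes.
        List<String> dns = List.of("dn1:9866", "dn2:9866", "dn3:9866");
        blocks.put(id, dns.subList(0, replication));
        return id;
    }

    public List<String> getBlockLocations(long blockId) {
        return blocks.getOrDefault(blockId, List.of());
    }

    public void removeBlock(long blockId) {
        blocks.remove(blockId);
    }
}
```

An HDSL-backed implementation of such an interface would make RPCs out to SCM instead of mutating in-NameNode maps; extracting and stabilizing this boundary is where the refactoring risk Andrew describes concentrates.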

Furthermore, I don't know what your expectation is on how long it will take
to stabilize HDSL, but this horizon for other storage systems is typically
measured in years rather than months.

Both of these feel like Hadoop 4 items: a ways out yet.

Moving on, there is a non-trivial maintenance cost to having this new code
in the code base. Ozone bugs become our bugs. Ozone dependencies become our
dependencies. Ozone's security flaws are our security flaws. All of this
negatively affects our already lumbering release schedule, and thus our
ability to deliver and iterate on the features we're already trying to
ship. Even if Ozone is separate and off by default, this is still a large
amount of code that comes with a large maintenance cost. I don't want to
incur this cost when the benefit is still a ways out.

We disagree on the necessity of sharing a repo and sharing operational
behaviors. Libraries exist as a method for sharing code. HDFS also hardly
has a monopoly on intermediating storage today. Disks are shared with MR
shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
we've made this work. Having Ozone/HDSL in a separate process can even be
seen as an operational advantage since it's isolated. I firmly believe that
we can solve any implementation issues even with separate processes.

This is why I asked about making this a separate project. Given that these
two efforts (HDSL stabilization and NN refactoring) are a ways out, the
best way to get Ozone/HDSL in the hands of users today is to release it as
its own project. Owen mentioned making a Hadoop subproject; we'd have to
hash out what exactly this means (I assume a separate repo still managed by
the Hadoop project), but I think we could make this work if it's more
attractive than incubation or a new TLP.

I'm excited about the possibilities of both HDSL and the NN refactoring in
ensuring a future for HDFS for years to come. A pluggable block manager
would also let us experiment with things like HDFS-on-S3, increasingly
important in a cloud-centric world. CBlock would bring HDFS to new use cases
around generic container workloads. However, given the timeline for
completing these efforts, now is not the time to merge.

Best,
Andrew

On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:

> I’m generally neutral and looked foremost at developer impact.  I.e., will
> it be so intertwined with hdfs that each project risks destabilizing the
> other?  Will developers with no expertise in ozone be impeded?  I
> think the answer is currently no.  These are the intersections and some
> concerns based on the assumption ozone is accepted into the project:
>
>
> Common
>
> There appear to be a number of superfluous changes.  The conf servlet must not be
> polluted with specific references and logic for ozone.  We don’t create
> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
> “ozone free”.
>
>
> Datanode
>
> I expected ozone changes to be intricately linked with the existing blocks
> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
> service, the DN should not be polluted with specific references to ozone.
> If ozone is in the project, the DN should have a generic plugin interface
> conceptually similar to the NM aux services.
>
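[Editor's note: a generic hook of the kind suggested — the DN loading extra services from configuration instead of referencing ozone directly — could look roughly like the sketch below. This is hypothetical, in the spirit of Hadoop's `ServicePlugin` / NM aux services; the names are illustrative, not the real API.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DN-side plugin contract: the DN knows only this
// interface, keeping it free of ozone-specific references.
interface DataNodePlugin {
    void start(Object dataNodeHandle); // invoked after DN startup
    void stop();                       // invoked on DN shutdown
}

// Loads plugin implementations by class name (e.g. from a
// dfs.datanode.plugins-style config key) and manages lifecycle.
class DataNodePluginHost {
    private final List<DataNodePlugin> loaded = new ArrayList<>();

    void startAll(List<String> classNames, Object dnHandle) throws Exception {
        for (String name : classNames) {
            // Reflective load keeps the DN decoupled from plugin code.
            DataNodePlugin p = (DataNodePlugin)
                Class.forName(name).getDeclaredConstructor().newInstance();
            p.start(dnHandle);
            loaded.add(p);
        }
    }

    void stopAll() {
        for (DataNodePlugin p : loaded) {
            p.stop();
        }
    }
}
```

Hadoop's existing `dfs.datanode.plugins` key and `org.apache.hadoop.util.ServicePlugin` already work roughly this way; the point is that an HDSL service attached through such a hook leaves the DN code itself untouched.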
>
> Namenode
>
> No impact, currently, but certainly will be…
>
>
> Code Location
>
> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
> hadoop-hdsl-project.  This clean separation will make it easier to later
> spin off or pull in depending on which way we vote.
>
>
> Dependencies
>
> Owen hit upon this before I could send.  Hadoop is already bursting with
> dependencies, I hope this doesn’t pull in a lot more.
>
>
> ––
>
>
> Do I think ozone should be a separate project?  If we view it only as a
> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
> step with near-term benefits, no, we want to keep it close and help it
> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
> umbrella term for too many technologies that should perhaps be split.  I'm
> interested in the container block management.  I have little interest at
> this time in the key store.
>
>
> The usability of ozone, specifically container management, is unclear to
> me.  It lacks basic features like changing replication factors, append, a
> migration path, security, etc - I know there are good plans for all of it -
> yet another goal is splicing into the NN.  That’s a lot of high priority
> items to tackle that need to be carefully orchestrated before contemplating
> BM replacement.  Each of those is a non-starter for (my) production
> environment.  We need to make sure we can reach a consensus on the block
> level functionality before rushing it into the NN.  That’s independent of
> whether allowing it into the project.
>
>
> The BM/SCM changes to the NN are realistically going to be contentious &
> destabilizing.  If done correctly, the BM separation will be a big win for
> the NN.  If ozone is out, by necessity interfaces will need to be stable
> and well-defined but we won’t get that right for a long time.  Interface
> and logic changes that break the other will be difficult to coordinate and
> we’ll likely veto changes that impact the other.  If ozone is in, we can
> hopefully synchronize the changes with less friction, but it greatly
> increases the chances of developers riddling the NN with hacks and/or ozone
> specific logic that makes it even more brittle.  I will note we need to be
> vigilant against pervasive conditionals (e.g. EC, snapshots).
>
>
> In either case, I think ozone must agree to not impede current hdfs work.
> I’ll compare hdfs to a store owner that plans to maybe retire in 5
> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
> them no-rent space (the DN).  Precondition is help improve the store.
> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
> that complicate hdfs but ignore it due to anticipation of its
> departure/demise.  I’m not implying that’s currently happening, it’s just
> what I don’t want to see.
>
>
> We as a community and our customers need an evolution, not a revolution,
> and definitely not a civil war.  Hdfs has too much legacy code rot that
> is hard to change.  Too many poorly implemented features.   Perhaps I’m
> overly optimistic that freshly redesigned code can counterbalance
> performance degradations in the NN.  I’m also reluctant, but realize it is
> being driven by some hdfs veterans that know/understand historical hdfs
> design strengths and flaws.
>
>
> If the initially cited issues are addressed, I’m +0.5 for the concept of
> bringing in ozone if it's not going to be a proverbial bull in the china
> shop.
>
>
> Daryn
>
> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
> >
> wrote:
>
>
> --
>
> Daryn
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Sanjay,

I have different opinions about what's important and how to eventually
integrate this code, and that's not because I'm "conveniently ignoring"
your responses. I'm also not making some of the arguments you claim I am
making. Attacking arguments I'm not making is not going to change my mind,
so let's bring it back to the arguments I am making.

Here's what it comes down to: HDFS-on-HDSL is not going to be ready in the
near-term, and it comes with a maintenance cost.

I did read the proposal on HDFS-10419 and I understood that HDFS-on-HDSL
integration does not necessarily require a lock split. However, there still
needs to be refactoring to clearly define the FSN and BM interfaces and
make the BM pluggable so HDSL can be swapped in. This is a major
undertaking and risky. We did a similar refactoring in 2.x which made
backports hard and introduced bugs. I don't think we should have done this
in a minor release.

Furthermore, I don't know what your expectation is on how long it will take
to stabilize HDSL, but this horizon for other storage systems is typically
measured in years rather than months.

Both of these feel like Hadoop 4 items: a ways out yet.

Moving on, there is a non-trivial maintenance cost to having this new code
in the code base. Ozone bugs become our bugs. Ozone dependencies become our
dependencies. Ozone's security flaws are our security flaws. All of this
negatively affects our already lumbering release schedule, and thus our
ability to deliver and iterate on the features we're already trying to
ship. Even if Ozone is separate and off by default, this is still a large
amount of code that comes with a large maintenance cost. I don't want to
incur this cost when the benefit is still a ways out.

We disagree on the necessity of sharing a repo and sharing operational
behaviors. Libraries exist as a method for sharing code. HDFS also hardly
has a monopoly on intermediating storage today. Disks are shared with MR
shuffle, Spark/Impala spill, log output, Kudu, Kafka, etc. Operationally
we've made this work. Having Ozone/HDSL in a separate process can even be
seen as an operational advantage since it's isolated. I firmly believe that
we can solve any implementation issues even with separate processes.

This is why I asked about making this a separate project. Given that these
two efforts (HDSL stabilization and NN refactoring) are a ways out, the
best way to get Ozone/HDSL in the hands of users today is to release it as
its own project. Owen mentioned making a Hadoop subproject; we'd have to
hash out what exactly this means (I assume a separate repo still managed by
the Hadoop project), but I think we could make this work if it's more
attractive than incubation or a new TLP.

I'm excited about the possibilities of both HDSL and the NN refactoring in
ensuring a future for HDFS for years to come. A pluggable block manager
would also let us experiment with things like HDFS-on-S3, increasingly
important in a cloud-centric world. CBlock would bring HDFS to new usecases
around generic container workloads. However, given the timeline for
completing these efforts, now is not the time to merge.

Best,
Andrew

On Thu, Mar 1, 2018 at 5:33 PM, Daryn Sharp <da...@oath.com.invalid> wrote:

> I’m generally neutral and looked foremost at developer impact.  Ie.  Will
> it be so intertwined with hdfs that each project risks destabilizing the
> other?  Will developers with no expertise in ozone will be impeded?  I
> think the answer is currently no.  These are the intersections and some
> concerns based on the assumption ozone is accepted into the project:
>
>
> Common
>
> Appear to be a number of superfluous changes.  The conf servlet must not be
> polluted with specific references and logic for ozone.  We don’t create
> dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
> “ozone free”.
>
>
> Datanode
>
> I expected ozone changes to be intricately linked with the existing blocks
> map, dataset, volume, etc.  Thankfully it’s not.  As an independent
> service, the DN should not be polluted with specific references to ozone.
> If ozone is in the project, the DN should have a generic plugin interface
> conceptually similar to the NM aux services.
>
>
> Namenode
>
> No impact, currently, but certainly will be…
>
>
> Code Location
>
> I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
> I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
> hadoop-hdsl-project.  This clean separation will make it easier to later
> spin off or pull in depending on which way we vote.
>
>
> Dependencies
>
> Owen hit upon his before I could send.  Hadoop is already bursting with
> dependencies, I hope this doesn’t pull in a lot more.
>
>
> ––
>
>
> Do I think ozone be should be a separate project?  If we view it only as a
> competing filesystem, then clearly yes.  If it’s a low risk evolutionary
> step with near-term benefits, no, we want to keep it close and help it
> evolve.  I think ozone/hdsl/whatever has been poorly marketed and an
> umbrella term for too many technologies that should perhaps be split.  I'm
> interested in the container block management.  I have little interest at
> this time in the key store.
>
>
> The usability of ozone, specifically container management, is unclear to
> me.  It lacks basic features like changing replication factors, append, a
> migration path, security, etc - I know there are good plans for all of it -
> yet another goal is splicing into the NN.  That’s a lot of high priority
> items to tackle that need to be carefully orchestrated before contemplating
> BM replacement.  Each of those is a non-starter for (my) production
> environment.  We need to make sure we can reach a consensus on the block
> level functionality before rushing it into the NN.  That’s independent of
> whether allowing it into the project.
>
>
> The BM/SCM changes to the NN are realistically going to be contentious &
> destabilizing.  If done correctly, the BM separation will be a big win for
> the NN.  If ozone is out, by necessity interfaces will need to be stable
> and well-defined but we won’t get that right for a long time.  Interface
> and logic changes that break the other will be difficult to coordinate and
> we’ll likely veto changes that impact the other.  If ozone is in, we can
> hopefully synchronize the changes with less friction, but it greatly
> increases the chances of developers riddling the NN with hacks and/or ozone
> specific logic that makes it even more brittle.  I will note we need to be
> vigilant against pervasive conditionals (ie. EC, snapshots).
>
>
> In either case, I think ozone must agree to not impede current hdfs work.
> I’ll compare to hdfs is a store owner that plans to maybe retire in 5
> years.  A potential new owner (ozone) is lined up and hdfs graciously gives
> them no-rent space (the DN).  Precondition is help improve the store.
> Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
> that complicate hdfs but ignore it due to anticipation of its
> departure/demise.  I’m not implying that’s currently happening, it’s just
> what I don’t want to see.
>
>
> We as a community and our customers need an evolution, not a revolution,
> and definitively not a civil war.  Hdfs has too much legacy code rot that
> is hard to change.  Too many poorly implemented features.   Perhaps I’m
> overly optimistic that freshly redesigned code can counterbalance
> performance degradations in the NN.  I’m also reluctant, but realize it is
> being driven by some hdfs veterans that know/understand historical hdfs
> design strengths and flaws.
>
>
> If the initially cited issues are addressed, I’m +0.5 for the concept of
> bringing in ozone if it's not going to be a proverbial bull in the china
> shop.
>
>
> Daryn
>
> On Mon, Feb 26, 2018 at 3:18 PM, Jitendra Pandey <jitendra@hortonworks.com
> >
> wrote:
>
> >     Dear folks,
> >            We would like to start a vote to merge HDFS-7240 branch into
> > trunk. The context can be reviewed in the DISCUSSION thread, and in the
> > jiras (See references below).
> >
> >     HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
> is
> > a distributed, replicated block layer.
> >     The old HDFS namespace and NN can be connected to this new block
> layer
> > as we have described in HDFS-10419.
> >     We also introduce a key-value namespace called Ozone built on HDSL.
> >
> >     The code is in a separate module and is turned off by default. In a
> > secure setup, HDSL and Ozone daemons cannot be started.
> >
> >     The detailed documentation is available at
> >              https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
> >
> >
> >     I will start with my vote.
> >             +1 (binding)
> >
> >
> >     Discussion Thread:
> >               https://s.apache.org/7240-merge
> >               https://s.apache.org/4sfU
> >
> >     Jiras:
> >                https://issues.apache.org/jira/browse/HDFS-7240
> >                https://issues.apache.org/jira/browse/HDFS-10419
> >                https://issues.apache.org/jira/browse/HDFS-13074
> >                https://issues.apache.org/jira/browse/HDFS-13180
> >
> >
> >     Thanks
> >     jitendra
> >
> >
> >
> >
> >
> >             DISCUSSION THREAD SUMMARY :
> >
> >             On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
> > wrote:
> >
> >                 Sorry the formatting got messed by my email client.  Here
> > it is again
> >
> >
> >                 Dear
> >                  Hadoop Community Members,
> >
> >                    We had multiple community discussions, a few meetings
> > in smaller groups, and also jira discussions with respect to this thread. We
> > express our gratitude for the participation and valuable comments.
> >
> >                 The key questions raised were the following:
> >                 1) How do the new block storage layer and OzoneFS benefit
> > HDFS? We were asked to chalk out a roadmap towards the goal of a
> > scalable namenode working with the new storage layer.
> >                 2) We were asked to provide a security design.
> >                 3) There were questions around stability, given that ozone
> > brings in a large body of code.
> >                 4) Why can’t they be separate projects forever, or be merged
> > in when production ready?
> >
> >                 We have responded to all the above questions with detailed
> > explanations and answers on the jira as well as in the discussions. We
> > believe that should sufficiently address the community’s concerns.
> >
> >                 Please see the summary below:
> >
> >                 1) The new code base benefits HDFS scaling and a roadmap
> > has been provided.
> >
> >                 Summary:
> >                   - The new block storage layer addresses the scalability of
> > the block layer. We have shown how the existing NN can be connected to the
> > new block layer, and the benefits of doing so. We have shown 2 milestones;
> > the 1st milestone is much simpler than the 2nd while giving almost the same
> > scaling benefits. Originally we had proposed only milestone 2, and the
> > community felt that removing the FSN/BM lock was a fair amount of work and
> > a simpler solution would be useful.
> >                   - We provide a new K-V namespace called Ozone FS with
> > FileSystem/FileContext plugins to allow users to use the new system.
> > BTW, Hive and Spark work very well on KV-namespaces in the cloud. This will
> > facilitate stabilizing the new block layer.
> >                   - The new block layer has a new netty-based protocol
> > engine in the Datanode which, when stabilized, can be used by the old hdfs
> > block layer. See details below on sharing of code.
> >
> >
> >                 2) Stability impact on the existing HDFS code base and
> > code separation. The new block layer and the OzoneFS are in modules that
> > are separate from the old HDFS code - currently there are no calls from HDFS
> > into Ozone except for the DN starting the new block layer module if configured
> > to do so. It does not add instability (the instability argument has been
> > raised many times). Over time, as we share code, we will ensure that the old
> > HDFS continues to remain stable. (For example, we plan to stabilize the new
> > netty-based protocol engine in the new block layer before sharing it with
> > HDFS’s old block layer.)
> >
> >
> >                 3) In the short term and medium term, the new system and
> > HDFS will be used side-by-side by users: side-by-side in the short
> > term for testing, and side-by-side in the medium term for actual production
> > use until the new system has feature parity with old HDFS. During this time,
> > sharing the DN daemon and admin functions between the two systems is
> > operationally important:
> >                   - Sharing the DN daemon to avoid additional operational
> > daemon lifecycle management.
> >                   - Common decommissioning of the daemon and DN: one place
> > to decommission a node and its storage.
> >                   - Replacing failed disks and internally balancing capacity
> > across disks - this needs to be done for both the current HDFS blocks and
> > the new block-layer blocks.
> >                   - Balancer: we would like to use the same balancer and
> > provide a common way to balance, with common management of the bandwidth used
> > for balancing.
> >                   - Security configuration setup - reuse the existing setup
> > for DNs rather than a new one for an independent cluster.
> >
> >
> >                 4) Need to easily share the block layer code between the
> > two systems when used side-by-side. Areas where sharing code is desired
> > over time:
> >                   - Sharing the new block layer’s netty-based protocol
> > engine for old HDFS DNs (a long-time sore issue for the HDFS block layer).
> >                   - Shallow data copy from the old system to the new system
> > is practical only if within the same project and daemon; otherwise we have
> > to deal with security settings and coordination across daemons. Shallow
> > copy is useful as customers migrate from old to new.
> >                   - Shared disk scheduling in the future; in the short
> > term, have a single round robin rather than independent round robins.
> >                 While sharing code across projects is technically possible
> > (anything is possible in software), it is significantly harder, typically
> > requiring cleaner public APIs etc. Sharing within a project through
> > internal APIs is often simpler (such as the protocol engine that we want to
> > share).
> >
> >
> >                 5) The security design, including a threat model and the
> > solution, has been posted.
> >                 6) Temporary separation and merge later: several of the
> > comments in the jira have argued that we temporarily separate the two code
> > bases for now and then later merge them when the new code is stable:
> >
> >                   - If there is agreement to merge later, why bother
> > separating now - there need to be good reasons to separate now.  We
> > have addressed the stability and separation of the new code from the
> > existing code above.
> >                   - Merging the new code back into HDFS later will be harder.
> >
> >                     ** The code and goals will diverge further.
> >                     ** We will be taking on extra work to split and then
> > extra work to merge.
> >                     ** The issues raised today will be raised all the same
> > then.
> >
> >
> >                 ---------------------------------------------------------------------
> >                 To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
> >                 For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>
> --
>
> Daryn
>
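The "K-V namespace called Ozone FS with FileSystem/FileContext plugins" idea in the quoted summary can be illustrated with a minimal sketch: a flat key-value store whose keys are full paths, exposed through file-system-style operations, where a directory listing becomes a key-prefix range scan. All class and method names below are hypothetical illustrations, not the actual Ozone/HDSL or Hadoop FileSystem APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative sketch only: a flat key-value store exposed through
// file-system-style operations, roughly how a K-V namespace might sit
// behind a FileSystem-style plugin. Names here are hypothetical.
class KvNamespace {
    // Keys are full paths; a sorted map makes prefix "listing" a range scan.
    private final TreeMap<String, byte[]> store = new TreeMap<>();

    void put(String path, byte[] data) {
        store.put(path, data);
    }

    byte[] get(String path) {
        return store.get(path);
    }

    // Emulates a directory listing by scanning keys under a path prefix.
    List<String> list(String dirPrefix) {
        List<String> names = new ArrayList<>();
        for (String key : store.tailMap(dirPrefix).keySet()) {
            if (!key.startsWith(dirPrefix)) {
                break;  // keys are sorted, so we are past the "directory"
            }
            names.add(key);
        }
        return names;
    }
}

public class KvNamespaceDemo {
    public static void main(String[] args) {
        KvNamespace fs = new KvNamespace();
        fs.put("/user/alice/a.txt", "hello".getBytes());
        fs.put("/user/alice/b.txt", "world".getBytes());
        fs.put("/user/bob/c.txt", "!".getBytes());
        System.out.println(fs.list("/user/alice/"));  // [/user/alice/a.txt, /user/alice/b.txt]
        System.out.println(new String(fs.get("/user/alice/a.txt")));  // hello
    }
}
```

The sorted map means listing a directory is a contiguous range scan over keys sharing the path prefix - the same property an LSM- or table-backed key-value store would give a real implementation, and part of why KV-namespaces work well for Hive/Spark-style workloads that mostly list and read.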

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
the document at HDFS-10419, and still have the same concerns raised earlier
<https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16257730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16257730>
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage where
it's ready for user feedback. Second, that it needs to be merged to start
on the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters and new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or featureset; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph,
Swift, and ADLS. Ozone/HDSL does not support popular HDFS features like
erasure coding, encryption, high availability, snapshots, hflush/hsync (and
thus HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like
a new, different system that could reasonably be deployed and tested
separately from HDFS. It's unlikely to replace many of today's HDFS
deployments, and from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL is by itself a major
undertaking. The discussion on HDFS-10419 is still ongoing, so it's not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.
Best,
Andrew

On Mon, Feb 26, 2018 at 1:18 PM, Jitendra Pandey <ji...@hortonworks.com>
wrote:

>     Dear folks,
>            We would like to start a vote to merge HDFS-7240 branch into
> trunk. The context can be reviewed in the DISCUSSION thread, and in the
> jiras (See references below).
>
>     HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which is
> a distributed, replicated block layer.
>     The old HDFS namespace and NN can be connected to this new block layer
> as we have described in HDFS-10419.
>     We also introduce a key-value namespace called Ozone built on HDSL.
>
>     The code is in a separate module and is turned off by default. In a
> secure setup, HDSL and Ozone daemons cannot be started.
>
>     The detailed documentation is available at
>              https://cwiki.apache.org/confluence/display/HADOOP/
> Hadoop+Distributed+Storage+Layer+and+Applications
>
>
>     I will start with my vote.
>             +1 (binding)
>
>
>     Discussion Thread:
>               https://s.apache.org/7240-merge
>               https://s.apache.org/4sfU
>
>     Jiras:
>                https://issues.apache.org/jira/browse/HDFS-7240
>                https://issues.apache.org/jira/browse/HDFS-10419
>                https://issues.apache.org/jira/browse/HDFS-13074
>                https://issues.apache.org/jira/browse/HDFS-13180
>
>
>     Thanks
>     jitendra
>
>
>
>
>
>             DISCUSSION THREAD SUMMARY :
>
>             On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
> wrote:
>
>                 Sorry the formatting got messed by my email client.  Here
> it is again
>
>
>                 Dear
>                  Hadoop Community Members,
>
>                    We had multiple community discussions, a few meetings
> in smaller groups and also jira discussions with respect to this thread. We
> express our gratitude for participation and valuable comments.
>
>                 The key questions raised were the following:
>                 1) How the new block storage layer and OzoneFS benefit
> HDFS and we were asked to chalk out a roadmap towards the goal of a
> scalable namenode working with the new storage layer
>                 2) We were asked to provide a security design
>                 3) There were questions around stability, given that Ozone
> brings in a large body of code.
>                 4) Why can’t they be separate projects forever or merged
> in when production ready?
>
>                 We have responded to all the above questions with detailed
> explanations and answers on the jira as well as in the discussions. We
> believe that should sufficiently address the community's concerns.
>
>                 Please see the summary below:
>
>                 1) The new code base benefits HDFS scaling and a roadmap
> has been provided.
>
>                 Summary:
>                   - New block storage layer addresses the scalability of
> the block layer. We have shown how existing NN can be connected to the new
> block layer and its benefits. We have shown 2 milestones, 1st milestone is
> much simpler than the 2nd milestone while giving almost the same scaling
> benefits. Originally we had proposed only milestone 2, and the community
> felt that removing the FSN/BM lock was a fair amount of work and that a
> simpler solution would be useful.
>                   - We provide a new K-V namespace called Ozone FS with
> FileSystem/FileContext plugins to allow the users to use the new system.
> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
> facilitate stabilizing the new block layer.
>                   - The new block layer has a new netty-based protocol
> engine in the Datanode which, when stabilized, can be used by the old HDFS
> block layer. See details below on sharing of code.
>
>
>                 2) Stability impact on the existing HDFS code base and
> code separation. The new block layer and the OzoneFS are in modules that
> are separate from old HDFS code - currently there are no calls from HDFS
> into Ozone except for DN starting the new block  layer module if configured
> to do so. It does not add instability (the instability argument has been
> raised many times). Over time as we share code, we will ensure that the old
> HDFS continues to remain stable. (For example, we plan to stabilize the new
> netty based protocol engine in the new block layer before sharing it with
> HDFS’s old block layer)
>
>
>                 3) In the short term and medium term, the new system and
> HDFS will be used side-by-side by users: side-by-side in the short
> term for testing, and side-by-side in the medium term for actual production
> use until the new system has feature parity with old HDFS. During this time,
> sharing the DN daemon and admin functions between the two systems is
> operationally important:
>                   - Sharing DN daemon to avoid additional operational
> daemon lifecycle management
>                   - Common decommissioning of the daemon and DN: One place
> to decommission for a node and its storage.
>                   - Replacing failed disks and internal balancing capacity
> across disks - this needs to be done for both the current HDFS blocks and
> the new block-layer blocks.
>                   - Balancer: we would like to use the same balancer and
> provide a common way to balance and common management of the bandwidth used
> for balancing.
>                   - Security configuration setup - reuse the existing setup
> for DNs rather than a new one for an independent cluster.
>
>
>                 4) Need to easily share the block layer code between the
> two systems when used side-by-side. Areas where sharing code is desired
> over time:
>                   - Sharing the new block layer's new netty-based protocol
> engine for old HDFS DNs (a long-time sore issue for the HDFS block layer).
>                   - Shallow data copy from old system to new system is
> practical only if within the same project and daemon; otherwise one has to
> deal with security settings and coordination across daemons. Shallow copy
> is useful as customers migrate from old to new.
>                   - Shared disk scheduling in the future, and in the short
> term a single round robin rather than independent round robins.
>                 While sharing code across projects is technically possible
> (anything is possible in software), it is significantly harder, typically
> requiring cleaner public APIs etc. Sharing within a project through
> internal APIs is often simpler (such as the protocol engine that we want to
> share).
>
>
>                 5) The security design, including a threat model and the
> solution, has been posted.
>                 6) Temporary Separation and merge later: Several of the
> comments in the jira have argued that we temporarily separate the two code
> bases for now and then later merge them when the new code is stable:
>
>                   - If there is agreement to merge later, why bother
> separating now - there need to be good reasons to separate now. We have
> addressed the stability and separation of the new code from the existing
> code above.
>                   - Merging the new code back into HDFS later will be harder.
>
>                     **The code and goals will diverge further.
>                     ** We will be taking on extra work to split and then
> take extra work to merge.
>                     ** The issues raised today will be raised all the same
> then.
>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Suresh Srinivas <su...@gmail.com>.
Anu, Jing, Nicholas, Sanjay, Jitendra and many others, thank you for
staying focused on this effort. It has been almost 3.5 years since
HDFS-7240 was created and all of the work has happened over the years in
the open in the feature branch.

Storage layer improvements from HDFS-7240 are important not only for
scalability but also for enabling flexibility for new innovations. One of the
challenges with HDFS is balancing the need for stability for the majority
of the existing deployments with the need for newer features and
enhancements. Both of these are current needs we see at Uber. A single
storage layer with the right abstractions can enable new use cases beyond
file system and avoid silos of storage, simplifying management. HDSL and
storage container abstraction is the architectural foundation toward that
future.

My main concern with the merge is the stability impact on existing HDFS
users. It needs to be a priority for the project. The modular approach
taken to ensure the clean separation of code and dependencies addresses
that concern.

I am +1 (binding) on merging HDFS-7240 to HDFS repository.



On Thu, Mar 1, 2018 at 4:05 PM, Owen O'Malley <ow...@gmail.com>
wrote:

> I think it would be good to get this in sooner rather than later, but I
> have some thoughts.
>
>    1. It is hard to tell what has changed. git rebase -i tells me the
>    branch has 722 commits. The rebase failed with a conflict. It would
> really
>    help if you rebased to current trunk.
>    2. I think Ozone would be a good Hadoop subproject, but it should be
>    outside of HDFS.
>    3. CBlock, which is also coming in this merge, would benefit from more
>    separation from HDFS.
>    4. What are the new transitive dependencies that Ozone, HDSL, and CBlock
>    are adding to the clients? The servers matter too, but the client
> dependencies
>    have a huge impact on our users.
>    5. Have you checked the new dependencies for compatibility with ASL?
>
>
> On Thu, Mar 1, 2018 at 2:45 PM, Clay B. <cw...@clayb.net> wrote:
>
> > Oops, retrying now subscribed to more than solely yarn-dev.
> >
> > -Clay
> >
> >
> > On Wed, 28 Feb 2018, Clay B. wrote:
> >
> > +1 (non-binding)
> >>
> >> I have walked through the code and find it very compelling as a user; I
> >> really look forward to seeing the Ozone code mature and it maturing HDFS
> >> features together. The points which excite me as an eight year HDFS user
> >> are:
> >>
> >> * Excitement for making the datanode a storage technology container -
> this
> >>  patch clearly brings fresh thought to HDFS keeping it from growing
> stale
> >>
> >> * Ability to build upon a shared storage infrastructure for diverse
> >>  loads: I do not want to have "stranded" storage capacity or have to
> >>  manage competing storage systems on the same disks (and further I want
> >>  the metrics datanodes can provide me today, so I do not have to
> >>  instrument two systems or evolve their instrumentation separately).
> >>
> >> * Looking forward to supporting object-sized files!
> >>
> >> * Moves HDFS in the right direction to test out new block management
> >>  techniques for scaling HDFS. I am really excited to see the raft
> >>  integration; I hope it opens a new era in Hadoop matching modern
> systems
> >>  design with new consistency and replication options in our ever
> >>  distributed ecosystem.
> >>
> >> -Clay

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
I think it would be good to get this in sooner rather than later, but I
have some thoughts.

   1. It is hard to tell what has changed. git rebase -i tells me the
   branch has 722 commits. The rebase failed with a conflict. It would really
   help if you rebased to current trunk.
   2. I think Ozone would be a good Hadoop subproject, but it should be
   outside of HDFS.
   3. CBlock, which is also coming in this merge, would benefit from more
   separation from HDFS.
   4. What are the new transitive dependencies that Ozone, HDSL, and CBlock
   are adding to the clients? The servers matter too, but the client
   dependencies have a huge impact on our users.
   5. Have you checked the new dependencies for compatibility with ASL?
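Point 1 above is essentially a branch-divergence measurement, and it does not
require a successful rebase. A minimal, hedged sketch of how one might count
the commits unique to a feature branch (the throwaway repository, branch
names, and commit layout below are illustrative stand-ins, not the actual
Hadoop repo):

```shell
# Build a scratch repo with a mainline and a diverged "feature" branch,
# then count the commits that exist only on the feature branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
main=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on git version
commit() { git -c user.email=dev@example.com -c user.name=dev \
           commit -q --allow-empty -m "$1"; }
commit base                  # common ancestor
git branch feature           # feature branch forks here
commit trunk-work            # mainline moves ahead independently
git checkout -q feature
for i in 1 2 3; do commit "feature-$i"; done
# Commits reachable from feature but not from the mainline
count=$(git rev-list --count "$main..feature")
echo "$count"
```

Against the real repo, `git rev-list --count trunk..HDFS-7240` (or `git log
--oneline trunk..HDFS-7240`) would give the same divergence number that
`git rebase -i` reported, and `git merge-base trunk HDFS-7240` shows where
the branch last met trunk. For point 4, running `mvn dependency:tree` on the
client modules before and after the merge is one way to see which new
transitive dependencies would reach users.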


On Thu, Mar 1, 2018 at 2:45 PM, Clay B. <cw...@clayb.net> wrote:

> Oops, retrying now subscribed to more than solely yarn-dev.
>
> -Clay
>
>
> On Wed, 28 Feb 2018, Clay B. wrote:
>
> +1 (non-binding)
>>
>> I have walked through the code and find it very compelling as a user; I
>> really look forward to seeing the Ozone code mature and it maturing HDFS
>> features together. The points which excite me as an eight year HDFS user
>> are:
>>
>> * Excitement for making the datanode a storage technology container - this
>>  patch clearly brings fresh thought to HDFS keeping it from growing stale
>>
>> * Ability to build upon a shared storage infrastructure for diverse
>>  loads: I do not want to have "stranded" storage capacity or have to
>>  manage competing storage systems on the same disks (and further I want
>>  the metrics datanodes can provide me today, so I do not have to
>>  instrument two systems or evolve their instrumentation separately).
>>
>> * Looking forward to supporting object-sized files!
>>
>> * Moves HDFS in the right direction to test out new block management
>>  techniques for scaling HDFS. I am really excited to see the raft
>>  integration; I hope it opens a new era in Hadoop matching modern systems
>>  design with new consistency and replication options in our ever
>>  distributed ecosystem.
>>
>> -Clay
>>
>> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>>
>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>>
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>>
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>>
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+
>>> Distributed+Storage+Layer+and+Applications
>>>
>>>
>>>    I will start with my vote.
>>>            +1 (binding)
>>>
>>>
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>>
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>>
>>>
>>>    Thanks
>>>    jitendra
>>>
>>>
>>>
>>>
>>>
>>>            DISCUSSION THREAD SUMMARY :
>>>
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>>
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>>
>>>
>>>                Dear
>>>                 Hadoop Community Members,
>>>
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>>
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3)There were questions around stability given ozone
>>> brings in a large body of code.
>>>                4) Why can?t they be separate projects forever or merged
>>> in when production ready?
>>>
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address community?s
>>> concerns.
>>>
>>>                Please see the summary below:
>>>
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>>
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed simply milestone 2 and the community
>>> felt that removing the FSN/BM lock was was a fair amount of work and a
>>> simpler solution would be useful
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>>> block layer. See details below on sharing of code.
>>>
>>>
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the old
>>> HDFS continues to remains stable. (for example we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS?s old block layer)
>>>
>>>
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual production
>>> use till the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like use the same balancer and
>>> provide a common way to balance and common management of the bandwidth used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather then a new one for an independent cluster.
>>>
>>>
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing new block layer?s  new netty based protocol
>>> engine for old HDFS DNs (a long time sore issue for HDFS block layer).
>>>                  - Shallow data copy from old system to new system is
>>> practical only if within same project and daemon otherwise have to deal
>>> with security setting and coordinations across daemons. Shallow copy is
>>> useful as customer migrate from old to new.
>>>                  - Shared disk scheduling in the future and in the short
>>> term have a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software), it is significantly harder,
>>> typically requiring cleaner public APIs, etc. Sharing within a project
>>> through internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
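The single-round-robin point above is concrete enough to sketch. The snippet below is plain illustrative Python, not DataNode code; the class and disk names are invented. With one cursor shared by both block layers, interleaved writes from the two systems still spread evenly across disks, whereas two independent cursors can both land on the same disk at the same time.

```python
# Sketch (not Hadoop code): one round-robin disk cursor shared by every
# block layer on a DataNode, instead of one cursor per block layer.
from itertools import cycle

class SharedRoundRobin:
    """A single disk cursor shared by all block layers on a DataNode."""
    def __init__(self, disks):
        self._cursor = cycle(disks)

    def next_disk(self):
        # Each caller, regardless of which block layer it belongs to,
        # advances the same shared cursor.
        return next(self._cursor)

disks = ["disk0", "disk1", "disk2"]
shared = SharedRoundRobin(disks)

# Writes alternating between old-HDFS blocks and new-layer blocks:
placements = [shared.next_disk() for _ in range(6)]
assert placements == ["disk0", "disk1", "disk2"] * 2  # even spread
```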
>>>
>>>
>>>                5) The security design, including a threat model and the
>>> solution, has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>>
>>>                  - If there is agreement to merge later, why bother
>>> separating now? There need to be good reasons to separate now. We have
>>> addressed the stability and separation of the new code from the existing
>>> code above.
>>>                  - Merging the new code back into HDFS later will be
>>> harder:
>>>
>>>                    ** The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>>
>>>
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>
>>>
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
I think it would be good to get this in sooner rather than later, but I
have some thoughts.

   1. It is hard to tell what has changed. git rebase -i tells me the
   branch has 722 commits. The rebase failed with a conflict. It would really
   help if you rebased to current trunk.
   2. I think Ozone would be a good Hadoop subproject, but it should be
   outside of HDFS.
   3. CBlock, which is also coming in this merge, would benefit from more
   separation from HDFS.
   4. What are the new transitive dependencies that Ozone, HDSL, and CBlock
   are adding to the clients? The servers matter too, but the client
   dependencies have a huge impact on our users.
   5. Have you checked the new dependencies for compatibility with ASL?
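On point 1, the branch-only commits can be listed without attempting a rebase: `git log trunk..branch` shows exactly the commits reachable from the branch but not from trunk. The sketch below is illustrative only; it builds a throwaway repository rather than a real Hadoop checkout (where the final command would be run against trunk and the fetched HDFS-7240 branch).

```python
# Illustrative only: build a tiny throwaway repo, then count the commits
# on the feature branch that are not on trunk -- the same question
# "git rebase -i" was answering, without the conflict risk.
import subprocess
import tempfile

def git(repo, *args):
    """Run a git command in `repo` and return its stdout."""
    return subprocess.run(
        ["git", "-C", repo, "-c", "user.email=a@b", "-c", "user.name=a", *args],
        check=True, capture_output=True, text=True).stdout

repo = tempfile.mkdtemp()
git(repo, "init", "-q")
git(repo, "checkout", "-q", "-b", "trunk")
git(repo, "commit", "-q", "--allow-empty", "-m", "base")
git(repo, "checkout", "-q", "-b", "HDFS-7240")
for i in range(3):
    git(repo, "commit", "-q", "--allow-empty", "-m", f"feature-{i}")

# Commits reachable from HDFS-7240 but not from trunk:
count = len(git(repo, "log", "--oneline", "trunk..HDFS-7240").splitlines())
print(count)  # 3 in this toy repo; the real branch reportedly had 722
```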


On Thu, Mar 1, 2018 at 2:45 PM, Clay B. <cw...@clayb.net> wrote:

> Oops, retrying now subscribed to more than solely yarn-dev.
>
> -Clay
>
>
> On Wed, 28 Feb 2018, Clay B. wrote:
>
> +1 (non-binding)
>>
>> I have walked through the code and find it very compelling as a user; I
>> really look forward to seeing the Ozone code mature, and to it maturing HDFS
>> features along with it. The points which excite me as an eight-year HDFS user
>> are:
>>
>> * Excitement for making the datanode a storage technology container - this
>>  patch clearly brings fresh thought to HDFS, keeping it from growing stale
>>
>> * Ability to build upon a shared storage infrastructure for diverse
>>  loads: I do not want to have "stranded" storage capacity or have to
>>  manage competing storage systems on the same disks (and further I want
>>  the metrics datanodes can provide me today, so I do not have to
>>  instrument two systems or evolve their instrumentation separately).
>>
>> * Looking forward to supporting object-sized files!
>>
>> * Moves HDFS in the right direction to test out new block management
>>  techniques for scaling HDFS. I am really excited to see the Raft
>>  integration; I hope it opens a new era in Hadoop, matching modern systems
>>  design with new consistency and replication options in our ever more
>>  distributed ecosystem.
>>
>> -Clay
>>
>> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>>
>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>>
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>>
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>>
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+
>>> Distributed+Storage+Layer+and+Applications
>>>
>>>
>>>    I will start with my vote.
>>>            +1 (binding)
>>>
>>>
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>>
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>>
>>>
>>>    Thanks
>>>    jitendra
>>>
>>>
>>>
>>>
>>>
>>>            DISCUSSION THREAD SUMMARY :
>>>
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>>
>>>                Sorry, the formatting got messed up by my email client. Here
>>> it is again.
>>>
>>>
>>>                Dear
>>>                 Hadoop Community Members,
>>>
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>>
>>>                The key questions raised were the following:
>>>                1) How do the new block storage layer and OzoneFS benefit
>>> HDFS? We were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer.
>>>                2) We were asked to provide a security design
>>>                3) There were questions around stability, given that Ozone
>>> brings in a large body of code.
>>>                4) Why can't they be separate projects forever, or merged
>>> in when production-ready?
>>>
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address the community's
>>> concerns.
>>>
>>>                Please see the summary below:
>>>
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>>
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how the existing NN can be connected to the
>>> new block layer, and its benefits. We have shown 2 milestones; the 1st is
>>> much simpler than the 2nd while giving almost the same scaling
>>> benefits. Originally we had proposed only milestone 2, but the community
>>> felt that removing the FSN/BM lock was a fair amount of work and that a
>>> simpler solution would be useful.
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty-based protocol
>>> engine in the Datanode which, when stabilized, can be used by the old HDFS
>>> block layer. See details below on sharing of code.
>>>
>>>
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from the old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for the DN starting the new block layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time, as we share code, we will ensure that the old
>>> HDFS continues to remain stable. (For example, we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS's old block layer.)
>>>
>>>
>>>                3) In the short term and medium term, the new system and
>>> HDFS will be used side-by-side by users: side-by-side in the short
>>> term for testing, and in the medium term for actual production
>>> use until the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like to use the same balancer and
>>> provide a common way to balance, with common management of the bandwidth used
>>> for balancing.
>>>                  - Security configuration setup - reuse the existing setup
>>> for DNs rather than a new one for an independent cluster.
>>>
>>>
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing the new block layer's netty-based protocol
>>> engine with old HDFS DNs (a long-time sore issue for the HDFS block layer).
>>>                  - Shallow data copy from the old system to the new one is
>>> practical only within the same project and daemon; otherwise we have to deal
>>> with security settings and coordination across daemons. Shallow copy is
>>> useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future; in the short
>>> term, a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software), it is significantly harder,
>>> typically requiring cleaner public APIs, etc. Sharing within a project
>>> through internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
>>>
>>>
>>>                5) The security design, including a threat model and the
>>> solution, has been posted.
>>>                6) Temporary Separation and merge later: Several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>>
>>>                  - If there is agreement to merge later, why bother
>>> separating now? There need to be good reasons to separate now. We have
>>> addressed the stability and separation of the new code from the existing
>>> code above.
>>>                  - Merging the new code back into HDFS later will be
>>> harder:
>>>
>>>                    ** The code and goals will diverge further.
>>>                    ** We will be taking on extra work to split and then
>>> take extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>>
>>>
>>>                ---------------------------------------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
>>>                For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>>> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>>>
>>>
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org
>
>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Owen O'Malley <ow...@gmail.com>.
I think it would be good to get this in sooner rather than later, but I
have some thoughts.

   1. It is hard to tell what has changed. git rebase -i tells me the
   branch has 722 commits. The rebase failed with a conflict. It would really
   help if you rebased to current trunk.
   2. I think Ozone would be a good Hadoop subproject, but it should be
   outside of HDFS.
   3. CBlock, which is also coming in this merge, would benefit from more
   separation from HDFS.
   4. What are the new transitive dependencies that Ozone, HDSL, and CBlock
   adding to the clients? The servers matter too, but the client dependencies
   have a huge impact on our users.
   5. Have you checked the new dependencies for compatibility with ASL?


On Thu, Mar 1, 2018 at 2:45 PM, Clay B. <cw...@clayb.net> wrote:

> Oops, retrying now subscribed to more than solely yarn-dev.
>
> -Clay
>
>
> On Wed, 28 Feb 2018, Clay B. wrote:
>
> +1 (non-binding)
>>
>> I have walked through the code and find it very compelling as a user; I
>> really look forward to seeing the Ozone code mature and it maturing HDFS
>> features together. The points which excite me as an eight year HDFS user
>> are:
>>
>> * Excitement for making the datanode a storage technology container - this
>>  patch clearly brings fresh thought to HDFS keeping it from growing stale
>>
>> * Ability to build upon a shared storage infrastructure for diverse
>>  loads: I do not want to have "stranded" storage capacity or have to
>>  manage competing storage systems on the same disks (and further I want
>>  the metrics datanodes can provide me today, so I do not have to
>>  instrument two systems or evolve their instrumentation separately).
>>
>> * Looking forward to supporting object-sized files!
>>
>> * Moves HDFS in the right direction to test out new block management
>>  techniques for scaling HDFS. I am really excited to see the raft
>>  integration; I hope it opens a new era in Hadoop matching modern systems
>>  design with new consistency and replication options in our ever
>>  distributed ecosystem.
>>
>> -Clay
>>
>> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>>
>>    Dear folks,
>>>           We would like to start a vote to merge HDFS-7240 branch into
>>> trunk. The context can be reviewed in the DISCUSSION thread, and in the
>>> jiras (See references below).
>>>
>>>    HDFS-7240 introduces Hadoop Distributed Storage Layer (HDSL), which
>>> is a distributed, replicated block layer.
>>>    The old HDFS namespace and NN can be connected to this new block
>>> layer as we have described in HDFS-10419.
>>>    We also introduce a key-value namespace called Ozone built on HDSL.
>>>
>>>    The code is in a separate module and is turned off by default. In a
>>> secure setup, HDSL and Ozone daemons cannot be started.
>>>
>>>    The detailed documentation is available at
>>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+
>>> Distributed+Storage+Layer+and+Applications
>>>
>>>
>>>    I will start with my vote.
>>>            +1 (binding)
>>>
>>>
>>>    Discussion Thread:
>>>              https://s.apache.org/7240-merge
>>>              https://s.apache.org/4sfU
>>>
>>>    Jiras:
>>>               https://issues.apache.org/jira/browse/HDFS-7240
>>>               https://issues.apache.org/jira/browse/HDFS-10419
>>>               https://issues.apache.org/jira/browse/HDFS-13074
>>>               https://issues.apache.org/jira/browse/HDFS-13180
>>>
>>>
>>>    Thanks
>>>    jitendra
>>>
>>>
>>>
>>>
>>>
>>>            DISCUSSION THREAD SUMMARY :
>>>
>>>            On 2/13/18, 6:28 PM, "sanjay Radia" <sa...@gmail.com>
>>> wrote:
>>>
>>>                Sorry the formatting got messed by my email client.  Here
>>> it is again
>>>
>>>
>>>                Dear
>>>                 Hadoop Community Members,
>>>
>>>                   We had multiple community discussions, a few meetings
>>> in smaller groups and also jira discussions with respect to this thread. We
>>> express our gratitude for participation and valuable comments.
>>>
>>>                The key questions raised were following
>>>                1) How the new block storage layer and OzoneFS benefit
>>> HDFS and we were asked to chalk out a roadmap towards the goal of a
>>> scalable namenode working with the new storage layer
>>>                2) We were asked to provide a security design
>>>                3)There were questions around stability given ozone
>>> brings in a large body of code.
>>>                4) Why can?t they be separate projects forever or merged
>>> in when production ready?
>>>
>>>                We have responded to all the above questions with
>>> detailed explanations and answers on the jira as well as in the
>>> discussions. We believe that should sufficiently address community?s
>>> concerns.
>>>
>>>                Please see the summary below:
>>>
>>>                1) The new code base benefits HDFS scaling and a roadmap
>>> has been provided.
>>>
>>>                Summary:
>>>                  - New block storage layer addresses the scalability of
>>> the block layer. We have shown how existing NN can be connected to the new
>>> block layer and its benefits. We have shown 2 milestones, 1st milestone is
>>> much simpler than 2nd milestone while giving almost the same scaling
>>> benefits. Originally we had proposed simply milestone 2 and the community
>>> felt that removing the FSN/BM lock was was a fair amount of work and a
>>> simpler solution would be useful
>>>                  - We provide a new K-V namespace called Ozone FS with
>>> FileSystem/FileContext plugins to allow the users to use the new system.
>>> BTW Hive and Spark work very well on KV-namespaces on the cloud. This will
>>> facilitate stabilizing the new block layer.
>>>                  - The new block layer has a new netty based protocol
>>> engine in the Datanode which, when stabilized, can be used by  the old hdfs
>>> block layer. See details below on sharing of code.
>>>
>>>
>>>                2) Stability impact on the existing HDFS code base and
>>> code separation. The new block layer and the OzoneFS are in modules that
>>> are separate from old HDFS code - currently there are no calls from HDFS
>>> into Ozone except for DN starting the new block  layer module if configured
>>> to do so. It does not add instability (the instability argument has been
>>> raised many times). Over time as we share code, we will ensure that the old
>>> HDFS continues to remains stable. (for example we plan to stabilize the new
>>> netty based protocol engine in the new block layer before sharing it with
>>> HDFS?s old block layer)
>>>
>>>
>>>                3) In the short term and medium term, the new system and
>>> HDFS  will be used side-by-side by users. Side by-side usage in the short
>>> term for testing and side-by-side in the medium term for actual production
>>> use till the new system has feature parity with old HDFS. During this time,
>>> sharing the DN daemon and admin functions between the two systems is
>>> operationally important:
>>>                  - Sharing DN daemon to avoid additional operational
>>> daemon lifecycle management
>>>                  - Common decommissioning of the daemon and DN: One
>>> place to decommission for a node and its storage.
>>>                  - Replacing failed disks and internal balancing
>>> capacity across disks - this needs to be done for both the current HDFS
>>> blocks and the new block-layer blocks.
>>>                  - Balancer: we would like use the same balancer and
>>> provide a common way to balance and common management of the bandwidth used
>>> for balancing
>>>                  - Security configuration setup - reuse existing set up
>>> for DNs rather then a new one for an independent cluster.
>>>
>>>
>>>                4) Need to easily share the block layer code between the
>>> two systems when used side-by-side. Areas where sharing code is desired
>>> over time:
>>>                  - Sharing the new block layer's netty-based protocol
>>> engine for old HDFS DNs (a long-time sore issue for the HDFS block layer).
>>>                  - Shallow data copy from the old system to the new system is
>>> practical only within the same project and daemon; otherwise we have to deal
>>> with security settings and coordination across daemons. Shallow copy is
>>> useful as customers migrate from old to new.
>>>                  - Shared disk scheduling in the future; in the short
>>> term, a single round robin rather than independent round robins.
>>>                While sharing code across projects is technically
>>> possible (anything is possible in software), it is significantly harder,
>>> typically requiring cleaner public APIs etc. Sharing within a project
>>> through internal APIs is often simpler (such as the protocol engine that we
>>> want to share).
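The "single round robin rather than independent round robins" idea above can be illustrated with a small sketch: one shared chooser that both the old HDFS block layer and the new block layer would consult, so writes interleave evenly across the same disks instead of each system keeping its own cursor. This is an illustrative sketch under that assumption, not the actual DataNode volume-choosing code; the class and method names are invented for the example.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Sketch of a single round-robin volume chooser shared by two block
 * layers running in the same DataNode process. Because both write
 * paths call the same chooser, capacity use stays balanced across
 * disks; with two independent choosers, the cursors drift apart.
 */
public class SharedRoundRobinChooser {
    private final List<String> volumes;
    // One shared cursor for all callers; AtomicInteger keeps it thread-safe.
    private final AtomicInteger next = new AtomicInteger(0);

    public SharedRoundRobinChooser(List<String> volumes) {
        this.volumes = List.copyOf(volumes);
    }

    /** Returns the next volume in strict rotation, wrapping around. */
    public String chooseVolume() {
        // floorMod guards against a negative index if the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), volumes.size());
        return volumes.get(i);
    }

    public static void main(String[] args) {
        SharedRoundRobinChooser chooser =
                new SharedRoundRobinChooser(List.of("/data/1", "/data/2", "/data/3"));
        // Simulate the old and new block layers sharing one cursor.
        System.out.println(chooser.chooseVolume()); // /data/1
        System.out.println(chooser.chooseVolume()); // /data/2
        System.out.println(chooser.chooseVolume()); // /data/3
        System.out.println(chooser.chooseVolume()); // /data/1
    }
}
```

The atomic cursor matters because, in the shared-daemon setup described above, both write paths would hit the chooser concurrently.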
>>>
>>>
>>>                5) The security design, including a threat model and the
>>> solution, has been posted.
>>>                6) Temporary separation and merge later: several of the
>>> comments in the jira have argued that we temporarily separate the two code
>>> bases for now and then later merge them when the new code is stable:
>>>
>>>                  - If there is agreement to merge later, why bother
>>> separating now - there need to be good reasons to separate now. We
>>> have addressed the stability and separation of the new code from the
>>> existing code above.
>>>                  - Merging the new code back into HDFS later will be
>>> harder:
>>>
>>>                    ** The code and goals will diverge further.
>>>                    ** We will take on extra work to split and then
>>> extra work to merge.
>>>                    ** The issues raised today will be raised all the
>>> same then.
>>>
>>>
>>>                -----------------------------
>>> ----------------------------------------
>>>                To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.ap
>>> ache.org
>>>                For additional commands, e-mail:
>>> hdfs-dev-help@hadoop.apache.org
>>>

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "Clay B." <cw...@clayb.net>.
Oops, retrying now subscribed to more than solely yarn-dev.

-Clay

On Wed, 28 Feb 2018, Clay B. wrote:

> +1 (non-binding)
>
> I have walked through the code and find it very compelling as a 
> user; I really look forward to seeing the Ozone code mature and 
> it maturing HDFS features together. The points which excite me 
> as an eight-year HDFS user are:
>
> * Excitement for making the datanode a storage technology 
>   container - this patch clearly brings fresh thought to HDFS, 
>   keeping it from growing stale.
>
> * Ability to build upon a shared storage infrastructure for 
>   diverse loads: I do not want to have "stranded" storage 
>   capacity or have to manage competing storage systems on the 
>   same disks (and further, I want the metrics datanodes can 
>   provide me today, so I do not have to instrument two systems 
>   or evolve their instrumentation separately).
>
> * Looking forward to supporting object-sized files!
>
> * Moves HDFS in the right direction to test out new block 
>   management techniques for scaling HDFS. I am really excited 
>   to see the raft integration; I hope it opens a new era in 
>   Hadoop, matching modern systems design with new consistency 
>   and replication options in our ever distributed ecosystem.
>
> -Clay
>
> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>
>>    Dear folks,
>>           We would like to start a vote to merge HDFS-7240 
>> branch into trunk. The context can be reviewed in the 
>> DISCUSSION thread, and in the jiras (See references below).
>>
>>    HDFS-7240 introduces Hadoop Distributed Storage Layer 
>> (HDSL), which is a distributed, replicated block layer.
>>    The old HDFS namespace and NN can be connected to this new 
>> block layer as we have described in HDFS-10419.
>>    We also introduce a key-value namespace called Ozone built 
>> on HDSL.
>>
>>    The code is in a separate module and is turned off by 
>> default. In a secure setup, HDSL and Ozone daemons cannot be 
>> started.
>>
>>    The detailed documentation is available at
>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>> 
>>
>>    I will start with my vote.
>>            +1 (binding)
>> 
>>
>>    Discussion Thread:
>>              https://s.apache.org/7240-merge
>>              https://s.apache.org/4sfU
>>
>>    Jiras:
>>               https://issues.apache.org/jira/browse/HDFS-7240
>>               https://issues.apache.org/jira/browse/HDFS-10419
>>               https://issues.apache.org/jira/browse/HDFS-13074
>>               https://issues.apache.org/jira/browse/HDFS-13180
>> 
>>
>>    Thanks
>>    jitendra
>> 
>> 
>> 
>> 
>>
>>            DISCUSSION THREAD SUMMARY :
>>
>>            On 2/13/18, 6:28 PM, "sanjay Radia" 
>> <sa...@gmail.com> wrote:
>>
>>                Sorry the formatting got messed by my email 
>> client.  Here it is again
>> 
>>
>>                Dear
>>                 Hadoop Community Members,
>>
>>                   We had multiple community discussions, a few 
>> meetings in smaller groups and also jira discussions with 
>> respect to this thread. We express our gratitude for 
>> participation and valuable comments.
>>
>>                The key questions raised were the following:
>>                1) How do the new block storage layer and OzoneFS 
>> benefit HDFS? We were asked to chalk out a roadmap towards 
>> the goal of a scalable namenode working with the new storage 
>> layer.
>>                2) We were asked to provide a security design.
>>                3) There were questions around stability, given 
>> that ozone brings in a large body of code.
>>                4) Why can't they be separate projects forever, 
>> or be merged in when production ready?
>>
>>                We have responded to all the above questions 
>> with detailed explanations and answers on the jira as well as 
>> in the discussions. We believe that should sufficiently 
>> address the community's concerns.
>>
>>                Please see the summary below:
>>
>>                1) The new code base benefits HDFS scaling and 
>> a roadmap has been provided.
>>
>>                Summary:
>>                  - The new block storage layer addresses the 
>> scalability of the block layer. We have shown how the existing 
>> NN can be connected to the new block layer, and its benefits. 
>> We have shown 2 milestones; the 1st milestone is much simpler 
>> than the 2nd while giving almost the same scaling benefits. 
>> Originally we had proposed only milestone 2, and the 
>> community felt that removing the FSN/BM lock was a fair 
>> amount of work and a simpler solution would be useful.
>>                  - We provide a new K-V namespace called Ozone 
>> FS with FileSystem/FileContext plugins to allow users to 
>> use the new system. BTW Hive and Spark work very well on 
>> KV-namespaces in the cloud. This will facilitate stabilizing 
>> the new block layer.
>>                  - The new block layer has a new netty-based 
>> protocol engine in the Datanode which, when stabilized, can be 
>> used by the old hdfs block layer. See details below on 
>> sharing of code.
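A FileSystem plugin of the kind described above is normally wired in through Hadoop configuration, using the `fs.<scheme>.impl` convention. The sketch below shows the general shape; the scheme, property names, class name, and URI authority are placeholders for illustration, not the actual ozone configuration values.

```xml
<!-- core-site.xml (sketch): register a FileSystem implementation for a
     new URI scheme. Hadoop resolves fs.<scheme>.impl to the plugin class.
     All names below are illustrative placeholders. -->
<configuration>
  <property>
    <name>fs.ozfs.impl</name>
    <value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
  </property>
  <property>
    <!-- Clients could then address the K-V namespace with ozfs:// URIs. -->
    <name>fs.defaultFS</name>
    <value>ozfs://bucket.volume.host:9864/</value>
  </property>
</configuration>
```

Once such a mapping is registered, existing FileSystem-based applications (Hive, Spark, distcp) can address the K-V namespace by URI without code changes, which is what makes the plugin approach useful for stabilizing the new block layer.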
>> 
>>
>>                2) Stability impact on the existing HDFS code 
>> base, and code separation. The new block layer and OzoneFS 
>> are in modules that are separate from the old HDFS code - 
>> currently there are no calls from HDFS into Ozone except for 
>> the DN starting the new block layer module if configured to 
>> do so. It does not add instability (the instability argument 
>> has been raised many times). Over time, as we share code, we 
>> will ensure that the old HDFS continues to remain stable. (For 
>> example, we plan to stabilize the new netty-based protocol 
>> engine in the new block layer before sharing it with HDFS's 
>> old block layer.)
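The "off by default" behavior described above would typically come down to a single DataNode-side configuration gate, along these lines; the property name here is an assumption for illustration, not the real key.

```xml
<!-- hdfs-site.xml (sketch): the DN starts the new block-layer module
     only when explicitly enabled. The key below is a placeholder. -->
<configuration>
  <property>
    <name>dfs.datanode.hdsl.enabled</name>
    <!-- Default false: the new block layer is never started unless opted in. -->
    <value>false</value>
  </property>
</configuration>
```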
>> 
>>
>>                3) In the short term and medium term, the new 
>> system and HDFS will be used side-by-side by users: side-by-side 
>> in the short term for testing, and side-by-side 
>> in the medium term for actual production use until the new 
>> system has feature parity with old HDFS. During this time, 
>> sharing the DN daemon and admin functions between the two 
>> systems is operationally important:
>>                  - Sharing the DN daemon to avoid additional 
>> operational daemon lifecycle management.
>>                  - Common decommissioning of the daemon and 
>> DN: one place to decommission a node and its storage.
>>                  - Replacing failed disks and internally 
>> balancing capacity across disks - this needs to be done for 
>> both the current HDFS blocks and the new block-layer blocks.
>>                  - Balancer: we would like to use the same 
>> balancer and provide a common way to balance, and common 
>> management of the bandwidth used for balancing.
>>                  - Security configuration setup - reuse the 
>> existing setup for DNs rather than a new one for an 
>> independent cluster.
>> 
>>
>>                4) Need to easily share the block layer code 
>> between the two systems when used side-by-side. Areas where 
>> sharing code is desired over time:
>>                  - Sharing the new block layer's netty-based 
>> protocol engine for old HDFS DNs (a long-time sore issue for 
>> the HDFS block layer).
>>                  - Shallow data copy from the old system to 
>> the new system is practical only within the same project and 
>> daemon; otherwise we have to deal with security settings and 
>> coordination across daemons. Shallow copy is useful as 
>> customers migrate from old to new.
>>                  - Shared disk scheduling in the future; in 
>> the short term, a single round robin rather than 
>> independent round robins.
>>                While sharing code across projects is 
>> technically possible (anything is possible in software), it 
>> is significantly harder, typically requiring cleaner public 
>> APIs etc. Sharing within a project through internal APIs is 
>> often simpler (such as the protocol engine that we want to 
>> share).
>> 
>>
>>                5) The security design, including a threat 
>> model and the solution, has been posted.
>>                6) Temporary separation and merge later: 
>> several of the comments in the jira have argued that we 
>> temporarily separate the two code bases for now and then later 
>> merge them when the new code is stable:
>>
>>                  - If there is agreement to merge later, why 
>> bother separating now - there need to be good reasons 
>> to separate now. We have addressed the stability and 
>> separation of the new code from the existing code above.
>>                  - Merging the new code back into HDFS later 
>> will be harder:
>>
>>                    ** The code and goals will diverge further.
>>                    ** We will take on extra work to split 
>> and then extra work to merge.
>>                    ** The issues raised today will be raised 
>> all the same then.
>> 
>>


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "Clay B." <cw...@clayb.net>.
Oops, retrying now subscribed to more than solely yarn-dev.

-Clay

On Wed, 28 Feb 2018, Clay B. wrote:

> +1 (non-binding)
>
> I have walked through the code and find it very compelling as a 
> user; I really look forward to seeing the Ozone code mature and 
> it maturing HDFS features together. The points which excite me 
> as an eight year HDFS user are:
>
> * Excitement for making the datanode a storage technology 
> container - this
>  patch clearly brings fresh thought to HDFS keeping it from 
> growing stale
>
> * Ability to build upon a shared storage infrastructure for 
> diverse
>  loads: I do not want to have "stranded" storage capacity or 
> have to
>  manage competing storage systems on the same disks (and 
> further I want
>  the metrics datanodes can provide me today, so I do not have 
> to
>  instrument two systems or evolve their instrumentation 
> separately).
>
> * Looking forward to supporting object-sized files!
>
> * Moves HDFS in the right direction to test out new block 
> management
>  techniques for scaling HDFS. I am really excited to see the 
> raft
>  integration; I hope it opens a new era in Hadoop matching 
> modern systems
>  design with new consistency and replication options in our 
> ever
>  distributed ecosystem.
>
> -Clay
>
> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>
>>    Dear folks,
>>           We would like to start a vote to merge HDFS-7240 
>> branch into trunk. The context can be reviewed in the 
>> DISCUSSION thread, and in the jiras (See references below).
>>
>>    HDFS-7240 introduces Hadoop Distributed Storage Layer 
>> (HDSL), which is a distributed, replicated block layer.
>>    The old HDFS namespace and NN can be connected to this new 
>> block layer as we have described in HDFS-10419.
>>    We also introduce a key-value namespace called Ozone built 
>> on HDSL.
>>
>>    The code is in a separate module and is turned off by 
>> default. In a secure setup, HDSL and Ozone daemons cannot be 
>> started.
>>
>>    The detailed documentation is available at
>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>> 
>>
>>    I will start with my vote.
>>            +1 (binding)
>> 
>>
>>    Discussion Thread:
>>              https://s.apache.org/7240-merge
>>              https://s.apache.org/4sfU
>>
>>    Jiras:
>>               https://issues.apache.org/jira/browse/HDFS-7240
>>               https://issues.apache.org/jira/browse/HDFS-10419
>>               https://issues.apache.org/jira/browse/HDFS-13074
>>               https://issues.apache.org/jira/browse/HDFS-13180
>> 
>>
>>    Thanks
>>    jitendra
>> 
>> 
>> 
>> 
>>
>>            DISCUSSION THREAD SUMMARY :
>>
>>            On 2/13/18, 6:28 PM, "sanjay Radia" 
>> <sa...@gmail.com> wrote:
>>
>>                Sorry the formatting got messed by my email 
>> client.  Here it is again
>> 
>>
>>                Dear
>>                 Hadoop Community Members,
>>
>>                   We had multiple community discussions, a few 
>> meetings in smaller groups and also jira discussions with 
>> respect to this thread. We express our gratitude for 
>> participation and valuable comments.
>>
>>                The key questions raised were following
>>                1) How the new block storage layer and OzoneFS 
>> benefit HDFS and we were asked to chalk out a roadmap towards 
>> the goal of a scalable namenode working with the new storage 
>> layer
>>                2) We were asked to provide a security design
>>                3)There were questions around stability given 
>> ozone brings in a large body of code.
>>                4) Why can?t they be separate projects forever 
>> or merged in when production ready?
>>
>>                We have responded to all the above questions 
>> with detailed explanations and answers on the jira as well as 
>> in the discussions. We believe that should sufficiently 
>> address community?s concerns.
>>
>>                Please see the summary below:
>>
>>                1) The new code base benefits HDFS scaling and 
>> a roadmap has been provided.
>>
>>                Summary:
>>                  - New block storage layer addresses the 
>> scalability of the block layer. We have shown how existing NN 
>> can be connected to the new block layer and its benefits. We 
>> have shown 2 milestones, 1st milestone is much simpler than 
>> 2nd milestone while giving almost the same scaling benefits. 
>> Originally we had proposed simply milestone 2 and the 
>> community felt that removing the FSN/BM lock was was a fair 
>> amount of work and a simpler solution would be useful
>>                  - We provide a new K-V namespace called Ozone 
>> FS with FileSystem/FileContext plugins to allow the users to 
>> use the new system. BTW Hive and Spark work very well on 
>> KV-namespaces on the cloud. This will facilitate stabilizing 
>> the new block layer.
>>                  - The new block layer has a new netty based 
>> protocol engine in the Datanode which, when stabilized, can be 
>> used by  the old hdfs block layer. See details below on 
>> sharing of code.
>> 
>>
>>                2) Stability impact on the existing HDFS code 
>> base and code separation. The new block layer and the OzoneFS 
>> are in modules that are separate from old HDFS code - 
>> currently there are no calls from HDFS into Ozone except for 
>> DN starting the new block  layer module if configured to do 
>> so. It does not add instability (the instability argument has 
>> been raised many times). Over time as we share code, we will 
>> ensure that the old HDFS continues to remains stable. (for 
>> example we plan to stabilize the new netty based protocol 
>> engine in the new block layer before sharing it with HDFS?s 
>> old block layer)
>> 
>>
>>                3) In the short term and medium term, the new 
>> system and HDFS  will be used side-by-side by users. Side 
>> by-side usage in the short term for testing and side-by-side 
>> in the medium term for actual production use till the new 
>> system has feature parity with old HDFS. During this time, 
>> sharing the DN daemon and admin functions between the two 
>> systems is operationally important:
>>                  - Sharing DN daemon to avoid additional 
>> operational daemon lifecycle management
>>                  - Common decommissioning of the daemon and 
>> DN: One place to decommission for a node and its storage.
>>                  - Replacing failed disks and internal 
>> balancing capacity across disks - this needs to be done for 
>> both the current HDFS blocks and the new block-layer blocks.
>>                  - Balancer: we would like use the same 
>> balancer and provide a common way to balance and common 
>> management of the bandwidth used for balancing
>>                  - Security configuration setup - reuse 
>> existing set up for DNs rather then a new one for an 
>> independent cluster.
>> 
>>
>>                4) Need to easily share the block layer code 
>> between the two systems when used side-by-side. Areas where 
>> sharing code is desired over time:
>>                  - Sharing new block layer?s  new netty based 
>> protocol engine for old HDFS DNs (a long time sore issue for 
>> HDFS block layer).
>>                  - Shallow data copy from old system to new 
>> system is practical only if within same project and daemon 
>> otherwise have to deal with security setting and coordinations 
>> across daemons. Shallow copy is useful as customer migrate 
>> from old to new.
>>                  - Shared disk scheduling in the future and in 
>> the short term have a single round robin rather than 
>> independent round robins.
>>                While sharing code across projects is 
>> technically possible (anything is possible in software),  it 
>> is significantly harder typically requiring  cleaner public 
>> apis etc. Sharing within a project though internal APIs is 
>> often simpler (such as the protocol engine that we want to 
>> share).
>> 
>>
>>                5) Security design, including a threat model 
>> and and the solution has been posted.
>>                6) Temporary Separation and merge later: 
>> Several of the comments in the jira have argued that we 
>> temporarily separate the two code bases for now and then later 
>> merge them when the new code is stable:
>>
>>                  - If there is agreement to merge later, why 
>> bother separating now - there needs to be to be good reasons 
>> to separate now.  We have addressed the stability and 
>> separation of the new code from existing above.
>>                  - Merge the new code back into HDFS later 
>> will be harder.
>>
>>                    **The code and goals will diverge further.
>>                    ** We will be taking on extra work to split 
>> and then take extra work to merge.
>>                    ** The issues raised today will be raised 
>> all the same then.
>> 
>>
>>                ---------------------------------------------------------------------
>>                To unsubscribe, e-mail: 
>> hdfs-dev-unsubscribe@hadoop.apache.org
>>                For additional commands, e-mail: 
>> hdfs-dev-help@hadoop.apache.org
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: 
>> yarn-dev-help@hadoop.apache.org
>> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "Clay B." <cw...@clayb.net>.
Oops, retrying now subscribed to more than solely yarn-dev.

-Clay

On Wed, 28 Feb 2018, Clay B. wrote:

> +1 (non-binding)
>
> I have walked through the code and find it very compelling as a 
> user; I really look forward to seeing the Ozone code mature and 
> it maturing HDFS features together. The points which excite me 
> as an eight year HDFS user are:
>
> * Excitement for making the datanode a storage technology 
> container - this
>  patch clearly brings fresh thought to HDFS keeping it from 
> growing stale
>
> * Ability to build upon a shared storage infrastructure for 
> diverse
>  loads: I do not want to have "stranded" storage capacity or 
> have to
>  manage competing storage systems on the same disks (and 
> further I want
>  the metrics datanodes can provide me today, so I do not have 
> to
>  instrument two systems or evolve their instrumentation 
> separately).
>
> * Looking forward to supporting object-sized files!
>
> * Moves HDFS in the right direction to test out new block 
> management
>  techniques for scaling HDFS. I am really excited to see the 
> raft
>  integration; I hope it opens a new era in Hadoop matching 
> modern systems
>  design with new consistency and replication options in our 
> ever
>  distributed ecosystem.
>
> -Clay
>
> On Mon, 26 Feb 2018, Jitendra Pandey wrote:
>
>>    Dear folks,
>>           We would like to start a vote to merge HDFS-7240 
>> branch into trunk. The context can be reviewed in the 
>> DISCUSSION thread, and in the jiras (See references below).
>>
>>    HDFS-7240 introduces Hadoop Distributed Storage Layer 
>> (HDSL), which is a distributed, replicated block layer.
>>    The old HDFS namespace and NN can be connected to this new 
>> block layer as we have described in HDFS-10419.
>>    We also introduce a key-value namespace called Ozone built 
>> on HDSL.
>>
>>    The code is in a separate module and is turned off by 
>> default. In a secure setup, HDSL and Ozone daemons cannot be 
>> started.
>>
>>    The detailed documentation is available at
>>             https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
>> 
>>
>>    I will start with my vote.
>>            +1 (binding)
>> 
>>
>>    Discussion Thread:
>>              https://s.apache.org/7240-merge
>>              https://s.apache.org/4sfU
>>
>>    Jiras:
>>               https://issues.apache.org/jira/browse/HDFS-7240
>>               https://issues.apache.org/jira/browse/HDFS-10419
>>               https://issues.apache.org/jira/browse/HDFS-13074
>>               https://issues.apache.org/jira/browse/HDFS-13180
>> 
>>
>>    Thanks
>>    jitendra
>> 
>> 
>> 
>> 
>>
>>            DISCUSSION THREAD SUMMARY :
>>
>>            On 2/13/18, 6:28 PM, "sanjay Radia" 
>> <sa...@gmail.com> wrote:
>>
>>                Sorry the formatting got messed by my email 
>> client.  Here it is again
>> 
>>
>>                Dear
>>                 Hadoop Community Members,
>>
>>                   We had multiple community discussions, a few 
>> meetings in smaller groups and also jira discussions with 
>> respect to this thread. We express our gratitude for 
>> participation and valuable comments.
>>
>>                The key questions raised were following
>>                1) How the new block storage layer and OzoneFS 
>> benefit HDFS and we were asked to chalk out a roadmap towards 
>> the goal of a scalable namenode working with the new storage 
>> layer
>>                2) We were asked to provide a security design
>>                3)There were questions around stability given 
>> ozone brings in a large body of code.
>>                4) Why can?t they be separate projects forever 
>> or merged in when production ready?
>>
>>                We have responded to all the above questions 
>> with detailed explanations and answers on the jira as well as 
>> in the discussions. We believe that should sufficiently 
>> address community?s concerns.
>>
>>                Please see the summary below:
>>
>>                1) The new code base benefits HDFS scaling and 
>> a roadmap has been provided.
>>
>>                Summary:
>>                  - New block storage layer addresses the 
>> scalability of the block layer. We have shown how existing NN 
>> can be connected to the new block layer and its benefits. We 
>> have shown 2 milestones, 1st milestone is much simpler than 
>> 2nd milestone while giving almost the same scaling benefits. 
>> Originally we had proposed simply milestone 2 and the 
>> community felt that removing the FSN/BM lock was was a fair 
>> amount of work and a simpler solution would be useful
>>                  - We provide a new K-V namespace called Ozone 
>> FS with FileSystem/FileContext plugins to allow the users to 
>> use the new system. BTW Hive and Spark work very well on 
>> KV-namespaces on the cloud. This will facilitate stabilizing 
>> the new block layer.
>>                  - The new block layer has a new netty based 
>> protocol engine in the Datanode which, when stabilized, can be 
>> used by  the old hdfs block layer. See details below on 
>> sharing of code.
>> 
>>
>>                2) Stability impact on the existing HDFS code 
>> base and code separation. The new block layer and the OzoneFS 
>> are in modules that are separate from old HDFS code - 
>> currently there are no calls from HDFS into Ozone except for 
>> DN starting the new block  layer module if configured to do 
>> so. It does not add instability (the instability argument has 
>> been raised many times). Over time as we share code, we will 
>> ensure that the old HDFS continues to remains stable. (for 
>> example we plan to stabilize the new netty based protocol 
>> engine in the new block layer before sharing it with HDFS?s 
>> old block layer)
>> 
>>
>>                3) In the short term and medium term, the new 
>> system and HDFS  will be used side-by-side by users. Side 
>> by-side usage in the short term for testing and side-by-side 
>> in the medium term for actual production use till the new 
>> system has feature parity with old HDFS. During this time, 
>> sharing the DN daemon and admin functions between the two 
>> systems is operationally important:
>>                  - Sharing DN daemon to avoid additional 
>> operational daemon lifecycle management
>>                  - Common decommissioning of the daemon and 
>> DN: One place to decommission for a node and its storage.
>>                  - Replacing failed disks and internal 
>> balancing capacity across disks - this needs to be done for 
>> both the current HDFS blocks and the new block-layer blocks.
>>                  - Balancer: we would like use the same 
>> balancer and provide a common way to balance and common 
>> management of the bandwidth used for balancing
>>                  - Security configuration setup - reuse 
>> existing set up for DNs rather then a new one for an 
>> independent cluster.
>> 
>>
>>                4) Need to easily share the block layer code 
>> between the two systems when used side-by-side. Areas where 
>> sharing code is desired over time:
>>                  - Sharing new block layer?s  new netty based 
>> protocol engine for old HDFS DNs (a long time sore issue for 
>> HDFS block layer).
>>                  - Shallow data copy from old system to new 
>> system is practical only if within same project and daemon 
>> otherwise have to deal with security setting and coordinations 
>> across daemons. Shallow copy is useful as customer migrate 
>> from old to new.
>>                  - Shared disk scheduling in the future and in 
>> the short term have a single round robin rather than 
>> independent round robins.
>>                While sharing code across projects is 
>> technically possible (anything is possible in software),  it 
>> is significantly harder typically requiring  cleaner public 
>> apis etc. Sharing within a project though internal APIs is 
>> often simpler (such as the protocol engine that we want to 
>> share).
>> 
>>
>>                5) Security design, including a threat model 
>> and and the solution has been posted.
>>                6) Temporary Separation and merge later: 
>> Several of the comments in the jira have argued that we 
>> temporarily separate the two code bases for now and then later 
>> merge them when the new code is stable:
>>
>>                  - If there is agreement to merge later, why 
>> bother separating now - there needs to be to be good reasons 
>> to separate now.  We have addressed the stability and 
>> separation of the new code from existing above.
>>                  - Merge the new code back into HDFS later 
>> will be harder.
>>
>>                    **The code and goals will diverge further.
>>                    ** We will be taking on extra work to split 
>> and then take extra work to merge.
>>                    ** The issues raised today will be raised 
>> all the same then.
>> 
>>
>>                ---------------------------------------------------------------------
>>                To unsubscribe, e-mail: 
>> hdfs-dev-unsubscribe@hadoop.apache.org
>>                For additional commands, e-mail: 
>> hdfs-dev-help@hadoop.apache.org
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
>> For additional commands, e-mail: 
>> yarn-dev-help@hadoop.apache.org
>> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-dev-help@hadoop.apache.org


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "Clay B." <cw...@clayb.net>.
Oops, retrying now subscribed to more than solely yarn-dev.

-Clay



Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by "Clay B." <cw...@clayb.net>.
+1 (non-binding)

I have walked through the code and find it very compelling as a user; I 
really look forward to seeing the Ozone code mature and to it maturing HDFS 
features alongside. The points which excite me as an eight-year HDFS user 
are:

* Excitement for making the datanode a storage technology container - this
   patch clearly brings fresh thought to HDFS, keeping it from growing stale

* Ability to build upon a shared storage infrastructure for diverse
   loads: I do not want to have "stranded" storage capacity or have to
   manage competing storage systems on the same disks (and further I want
   the metrics datanodes can provide me today, so I do not have to
   instrument two systems or evolve their instrumentation separately).

* Looking forward to supporting object-sized files!

* Moves HDFS in the right direction to test out new block management
   techniques for scaling HDFS. I am really excited to see the Raft
   integration; I hope it opens a new era in Hadoop matching modern systems
   design with new consistency and replication options in our ever
   distributed ecosystem.

-Clay



Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Daryn Sharp <da...@oath.com.INVALID>.
I’m generally neutral and looked foremost at developer impact, i.e.: Will
it be so intertwined with hdfs that each project risks destabilizing the
other?  Will developers with no expertise in ozone be impeded?  I think
the answer is currently no.  These are the intersections and some
concerns, based on the assumption ozone is accepted into the project:


Common

There appear to be a number of superfluous changes.  The conf servlet must not be
polluted with specific references and logic for ozone.  We don’t create
dependencies from common to hdfs, mapred, yarn, hive, etc.  Common must be
“ozone free”.


Datanode

I expected ozone changes to be intricately linked with the existing blocks
map, dataset, volume, etc.  Thankfully it’s not.  As an independent
service, the DN should not be polluted with specific references to ozone.
If ozone is in the project, the DN should have a generic plugin interface
conceptually similar to the NM aux services.
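A generic plugin hook of the sort suggested here could look roughly like the sketch below. This is purely illustrative: DataNodePlugin, PluginHost, and HdslService are hypothetical names invented for this sketch, not actual Hadoop or HDSL APIs, and a real DN would load plugin classes from configuration rather than registering them in code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a generic DataNode plugin SPI, in the spirit of
// the NodeManager's aux services. None of these names are real Hadoop APIs.
interface DataNodePlugin {
    String name();
    void start();   // called once the DN has initialized
    void stop();    // called on DN shutdown
}

// A toy plugin standing in for an HDSL/Ozone block service.
class HdslService implements DataNodePlugin {
    private boolean running;
    public String name() { return "hdsl"; }
    public void start() { running = true; }
    public void stop() { running = false; }
    public boolean isRunning() { return running; }
}

// The DN would drive only the generic lifecycle, without compiling
// against any plugin-specific classes.
class PluginHost {
    private final List<DataNodePlugin> plugins = new ArrayList<>();
    void register(DataNodePlugin p) { plugins.add(p); }
    void startAll() { plugins.forEach(DataNodePlugin::start); }
    void stopAll() { plugins.forEach(DataNodePlugin::stop); }
}

public class Main {
    public static void main(String[] args) {
        PluginHost host = new PluginHost();
        HdslService hdsl = new HdslService();
        host.register(hdsl);
        host.startAll();
        System.out.println(hdsl.name() + " running=" + hdsl.isRunning());
        host.stopAll();
        System.out.println(hdsl.name() + " running=" + hdsl.isRunning());
    }
}
```

The point of the shape is that the DN sees only the generic interface, mirroring how the NM treats aux services.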


Namenode

No impact, currently, but certainly will be…


Code Location

I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
hadoop-hdsl-project.  This clean separation will make it easier to later
spin off or pull in depending on which way we vote.


Dependencies

Owen hit upon this before I could send.  Hadoop is already bursting with
dependencies, I hope this doesn’t pull in a lot more.


––


Do I think ozone be should be a separate project?  If we view it only as a
competing filesystem, then clearly yes.  If it’s a low risk evolutionary
step with near-term benefits, no, we want to keep it close and help it
evolve.  I think ozone/hdsl/whatever has been poorly marketed and is an
umbrella term for too many technologies that should perhaps be split.  I'm
interested in the container block management.  I have little interest at
this time in the key store.


The usability of ozone, specifically container management, is unclear to
me.  It lacks basic features like changing replication factors, append, a
migration path, security, etc - I know there are good plans for all of it -
yet another goal is splicing into the NN.  That’s a lot of high priority
items to tackle that need to be carefully orchestrated before contemplating
BM replacement.  Each of those is a non-starter for (my) production
environment.  We need to make sure we can reach a consensus on the block
level functionality before rushing it into the NN.  That’s independent of
whether allowing it into the project.


The BM/SCM changes to the NN are realistically going to be contentious &
destabilizing.  If done correctly, the BM separation will be a big win for
the NN.  If ozone is out, by necessity interfaces will need to be stable
and well-defined but we won’t get that right for a long time.  Interface
and logic changes that break the other will be difficult to coordinate and
we’ll likely veto changes that impact the other.  If ozone is in, we can
hopefully synchronize the changes with less friction, but it greatly
increases the chances of developers riddling the NN with hacks and/or ozone
specific logic that makes it even more brittle.  I will note we need to be
vigilant against pervasive conditionals (i.e. EC, snapshots).


In either case, I think ozone must agree to not impede current hdfs work.
I’ll offer a comparison: hdfs is a store owner that plans to maybe retire in 5
years.  A potential new owner (ozone) is lined up and hdfs graciously gives
them no-rent space (the DN).  Precondition is help improve the store.
Don’t make a mess and expect hdfs to clean it up.  Don’t make renovations
that complicate hdfs but ignore it due to anticipation of its
departure/demise.  I’m not implying that’s currently happening, it’s just
what I don’t want to see.


We as a community and our customers need an evolution, not a revolution,
and definitely not a civil war.  Hdfs has too much legacy code rot that
is hard to change.  Too many poorly implemented features.   Perhaps I’m
overly optimistic that freshly redesigned code can counterbalance
performance degradations in the NN.  I’m also reluctant, but realize it is
being driven by some hdfs veterans that know/understand historical hdfs
design strengths and flaws.


If the initially cited issues are addressed, I’m +0.5 for the concept of
bringing in ozone if it's not going to be a proverbial bull in the china
shop.


Daryn



-- 

Daryn

Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Andrew Wang <an...@cloudera.com>.
Hi Jitendra and all,

Thanks for putting this together. I caught up on the discussion on JIRA and
the document at HDFS-10419, and still have the same concerns raised earlier
<https://issues.apache.org/jira/browse/HDFS-7240?focusedCommentId=16257730&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16257730>
about merging the Ozone branch to trunk.

To recap these questions/concerns at a very high level:

* Wouldn't Ozone benefit from being a separate project?
* Why should it be merged now?

I still believe that both Ozone and Hadoop would benefit from Ozone being a
separate project, and that there is no pressing reason to merge Ozone/HDSL
now.

The primary reason I've heard for merging is that Ozone is at a stage where
it's ready for user feedback. Second, that it needs to be merged to start on
the NN refactoring for HDFS-on-HDSL.

First, without HDFS-on-HDSL support, users are testing against the Ozone
object storage interface. Ozone and HDSL themselves are implemented as
separate masters with new functionality bolted onto the datanode. It also
doesn't look like HDFS in terms of API or feature set; yes, it speaks
FileSystem, but so do many out-of-tree storage systems like S3, Ceph, Swift,
ADLS, etc. Ozone/HDSL does not support popular HDFS features like erasure
coding, encryption, high availability, snapshots, hflush/hsync (and thus
HBase), or APIs like WebHDFS or NFS. This means that Ozone feels like a new,
different system that could reasonably be deployed and tested separately
from HDFS. It's unlikely to replace many of today's HDFS deployments, and
from what I understand, Ozone was not designed to do this.

Second, the NameNode refactoring for HDFS-on-HDSL is by itself a major
undertaking. The discussion on HDFS-10419 is still ongoing, so it's not
clear what the ultimate refactoring will be, but I do know that the earlier
FSN/BM refactoring during 2.x was very painful (introducing new bugs and
making backports difficult) and probably should have been deferred to a new
major release instead. I think this refactoring is important for the
long-term maintainability of the NN and worth pursuing, but as a Hadoop 4.0
item. Merging HDSL is also not a prerequisite for starting this
refactoring. Really, I see the refactoring as the prerequisite for
HDFS-on-HDSL to be possible.

Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements. There are also publicity and community
benefits; it's an opportunity to build a community focused on the novel
capabilities and architectural choices of Ozone/HDSL. There are examples of
other projects that were "incubated" on a branch in the Hadoop repo before
being spun off to great success.

In conclusion, I'd like to see Ozone succeeding and thriving as a separate
project. Meanwhile, we can work on the HDFS refactoring required to
separate the FSN and BM and make it pluggable. At that point (likely in the
Hadoop 4 timeframe), we'll be ready to pursue HDFS-on-HDSL integration.

Best,
Andrew


Re: [VOTE] Merging branch HDFS-7240 to trunk

Posted by Daryn Sharp <da...@oath.com.INVALID>.
I’m generally neutral and looked foremost at developer impact.  I.e., will
it be so intertwined with hdfs that each project risks destabilizing the
other?  Will developers with no expertise in ozone be impeded?  I think the
answer is currently no.  These are the intersections and some concerns,
based on the assumption ozone is accepted into the project:


Common

There appear to be a number of superfluous changes.  The conf servlet must
not be polluted with specific references and logic for ozone.  We don’t
create dependencies from common to hdfs, mapred, yarn, hive, etc.  Common
must be “ozone free”.
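One hedged sketch of how common could stay “ozone free” while still
offering a hook: look up any optional extension by configured class name
via reflection, so nothing in common references ozone classes at compile
time. The class name and helper below are hypothetical, not an existing
Hadoop API.

```java
public class Main {
    // Hypothetical helper: load an optional extension if it is on the
    // classpath; common itself never names the class in code it links to.
    static Runnable loadOptional(String className) {
        try {
            return (Runnable) Class.forName(className)
                .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            return null; // extension absent: common carries on unaffected
        }
    }

    public static void main(String[] args) {
        Runnable ozoneHook =
            loadOptional("org.apache.hadoop.ozone.ConfExtension"); // hypothetical name
        // prints "no extension" when ozone is not on the classpath
        System.out.println(ozoneHook == null ? "no extension" : "extension loaded");
    }
}
```

The same effect is often achieved with java.util.ServiceLoader; either way,
the dependency points from ozone to common, never the reverse.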


Datanode

I expected ozone changes to be intricately linked with the existing blocks
map, dataset, volume, etc.  Thankfully it’s not.  As an independent
service, the DN should not be polluted with specific references to ozone.
If ozone is in the project, the DN should have a generic plugin interface
conceptually similar to the NM aux services.
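A minimal sketch of what such a generic plugin interface might look like.
All names here are hypothetical (the DN has no such API today, and the NM
aux-services analogy is only conceptual); the point is that the DN would
drive plugin lifecycle without compile-time references to ozone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical plugin contract the DN could expose to services like HDSL.
interface DataNodePlugin {
    String name();
    void start(Map<String, String> conf) throws Exception; // at DN startup
    void stop();                                           // at DN shutdown
}

// The DN owns a registry and the lifecycle; plugins stay opaque to it.
class DataNodePluginRegistry {
    private final List<DataNodePlugin> plugins = new ArrayList<>();

    void register(DataNodePlugin p) { plugins.add(p); }

    void startAll(Map<String, String> conf) {
        for (DataNodePlugin p : plugins) {
            try {
                p.start(conf);
                System.out.println("started plugin: " + p.name());
            } catch (Exception e) {
                // a failing plugin must not take the DN down with it
                System.out.println("plugin " + p.name() + " failed: " + e.getMessage());
            }
        }
    }

    void stopAll() {
        for (DataNodePlugin p : plugins) p.stop();
    }
}

public class Main {
    public static void main(String[] args) {
        DataNodePluginRegistry registry = new DataNodePluginRegistry();
        // A hypothetical HDSL block service registered as just another plugin.
        registry.register(new DataNodePlugin() {
            public String name() { return "hdsl-block-service"; }
            public void start(Map<String, String> conf) {
                if (!"true".equals(conf.get("dfs.hdsl.enabled"))) { // hypothetical key
                    throw new IllegalStateException("disabled by configuration");
                }
            }
            public void stop() { }
        });
        registry.startAll(Map.of("dfs.hdsl.enabled", "true"));
        registry.stopAll();
    }
}
```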


Namenode

No impact, currently, but certainly will be…


Code Location

I don’t feel hadoop-hdfs-project/hadoop-hdfs is an acceptable location.
I’d rather see hadoop-hdfs-project/hadoop-hdsl, or even better
hadoop-hdsl-project.  This clean separation will make it easier to later
spin off or pull in depending on which way we vote.


Dependencies

Owen hit upon this before I could send.  Hadoop is already bursting with
dependencies; I hope this doesn’t pull in a lot more.


––


Do I think ozone should be a separate project?  If we view it only as a
competing filesystem, then clearly yes.  If it’s a low-risk evolutionary
step with near-term benefits, no; we want to keep it close and help it
evolve.  I think ozone/hdsl/whatever has been poorly marketed and is an
umbrella term for too many technologies that should perhaps be split.  I'm
interested in the container block management.  I have little interest at
this time in the key store.


The usability of ozone, specifically container management, is unclear to
me.  It lacks basic features like changing replication factors, append, a
migration path, security, etc - I know there are good plans for all of it -
yet another goal is splicing into the NN.  That’s a lot of high priority
items to tackle that need to be carefully orchestrated before contemplating
BM replacement.  Each of those is a non-starter for (my) production
environment.  We need to make sure we can reach a consensus on the block
level functionality before rushing it into the NN.  That’s independent of
whether allowing it into the project.


The BM/SCM changes to the NN are realistically going to be contentious &
destabilizing.  If done correctly, the BM separation will be a big win for
the NN.  If ozone is out, by necessity interfaces will need to be stable
and well-defined but we won’t get that right for a long time.  Interface
and logic changes that break the other will be difficult to coordinate and
we’ll likely veto changes that impact the other.  If ozone is in, we can
hopefully synchronize the changes with less friction, but it greatly
increases the chances of developers riddling the NN with hacks and/or ozone
specific logic that makes it even more brittle.  I will note we need to be
vigilant against pervasive conditionals (e.g. EC, snapshots).


In either case, I think ozone must agree to not impede current hdfs work.
I’ll make a comparison: hdfs is a store owner that plans to maybe retire in
5 years.  A potential new owner (ozone) is lined up and hdfs graciously
gives them no-rent space (the DN).  The precondition is to help improve the
store.  Don’t make a mess and expect hdfs to clean it up.  Don’t make
renovations that complicate hdfs but ignore it in anticipation of its
departure/demise.  I’m not implying that’s currently happening; it’s just
what I don’t want to see.


We as a community and our customers need an evolution, not a revolution,
and definitively not a civil war.  Hdfs has too much legacy code rot that
is hard to change.  Too many poorly implemented features.   Perhaps I’m
overly optimistic that freshly redesigned code can counterbalance
performance degradations in the NN.  I’m also reluctant, but realize it is
being driven by some hdfs veterans that know/understand historical hdfs
design strengths and flaws.


If the initially cited issues are addressed, I’m +0.5 for the concept of
bringing in ozone if it's not going to be a proverbial bull in the china
shop.


Daryn



-- 

Daryn
