Posted to common-issues@hadoop.apache.org by "Eli Collins (Created) (JIRA)" <ji...@apache.org> on 2012/03/30 02:43:26 UTC

[jira] [Created] (HADOOP-8230) Enable sync by default and disable append

Enable sync by default and disable append
-----------------------------------------

                 Key: HADOOP-8230
                 URL: https://issues.apache.org/jira/browse/HADOOP-8230
             Project: Hadoop Common
          Issue Type: Improvement
    Affects Versions: 1.0.0
            Reporter: Eli Collins
            Assignee: Eli Collins


Per HDFS-3120 for 1.x let's:
- Always enable the sync path, which is currently only enabled if dfs.support.append is set
- Remove the dfs.support.append configuration option. We'll keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267819#comment-13267819 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.

That is great.

Still, I think we should retain the ability to turn it off, because I want to continue running my installation that way and this patch removes that ability.
                
> Enable sync by default and disable append
> -----------------------------------------
>
>                 Key: HADOOP-8230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8230
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 1.1.0
>
>         Attachments: hadoop-8230.txt
>
>
> Per HDFS-3120 for 1.x let's:
> - Always enable the sync path, which is currently only enabled if dfs.support.append is set
> - Remove the dfs.support.append configuration option. We'll keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back


        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269858#comment-13269858 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled.
dfs.support.append turned off some code paths. These code paths are not just related to append; they also enable durable sync. See where the patch changes "if support append then do x else do y" to do "x" without any check. That is the behavior I want a user to be able to turn off with a flag.
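
A minimal sketch of the pattern described above, for illustration only. It is not taken from hadoop-8230.txt: the class and method names are hypothetical, java.util.Properties stands in for Hadoop's configuration object, and "x" and "y" are the placeholder code paths from the comment.

```java
import java.util.Properties;

public class AppendGateSketch {
    // Hypothetical "before" behavior: the flag selects the code path.
    // The default is assumed to be off, as stated elsewhere in this thread.
    public static String pathBeforePatch(Properties conf) {
        boolean supportAppend =
            Boolean.parseBoolean(conf.getProperty("dfs.support.append", "false"));
        return supportAppend ? "x" : "y";
    }

    // Hypothetical "after" behavior: path "x" is taken unconditionally,
    // so the flag no longer has any effect on this choice.
    public static String pathAfterPatch(Properties conf) {
        return "x";
    }

    public static void main(String[] args) {
        Properties conf = new Properties(); // flag not set
        System.out.println(pathBeforePatch(conf)); // y
        System.out.println(pathAfterPatch(conf));  // x
    }
}
```

With the flag unset, the "before" version takes path "y" while the "after" version takes path "x", which is the behavior change for installations that never set the flag.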

                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269690#comment-13269690 ] 

stack commented on HADOOP-8230:
-------------------------------

bq. This was turned on in the 20 originally and then we had to turn it off due to bugs.

The "bugs" were fixed a good while ago.

bq. Can you please explain the reason to make this change?

+ HBase needs it.
+ It's broken that users have to flip a configuration flag, then stop+start, to make a basic fs api method work.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267169#comment-13267169 ] 

stack commented on HADOOP-8230:
-------------------------------

bq. When an installation upgrades to a release with this patch, suddenly sync is enabled and there is no way to disable it.

Would such an installation be using the sync call?
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269286#comment-13269286 ] 

Sanjay Radia commented on HADOOP-8230:
--------------------------------------

> Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release, 
> I don't see why this should be considered an incompatible change.
This was turned on in the 20 originally and then we had to turn it off due to bugs.
Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on?
The current default is off and I don't see a reason to change that default in 1.1.
There are installations that are using the current default.
Can you please explain the reason to make this change?
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265151#comment-13265151 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

Eli, sorry for the late comment. I agree with the general direction of splitting hflush/hsync feature from append. Perhaps these features should be using two different flags.  

I have concerns with this change:
# I thought the proposal from HDFS-3120 was to add "dfs.support.sync". I do not see that flag in this patch.
# There are installations where hsync/hflush is disabled, using dfs.support.append. That option should be preserved.
# "dfs.support.broken.append" - why add this and not delete the tests that are testing append functionality?
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262903#comment-13262903 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

I've run the full unit test suite and there were no new failures.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408408#comment-13408408 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

Patch for HADOOP-8365 coming, please review when you get a sec.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269283#comment-13269283 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

Adding a dfs.support.sync flag is along the lines of my previous comments. I am reluctantly okay with enabling it by default. This should be a blocker on 1.1. It might be easy to revert this patch and add the new flag, as a lot of the paths to be enabled by the new flag are removed in this patch.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267242#comment-13267242 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

bq. For testing sync, with this patch, since it is enabled by default, you do not need the flag right?

Correct, after my patch the tests that no longer use append no longer set the append flag. The tests that call append to get its side effects still use the append flag.

Agree w Stack wrt the previous comment.  Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267584#comment-13267584 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. Would such an installation be using the sync call?
No, from what I know.

From what I understand, the intention of this change is to:
# Disable append, since 1.x has bugs in that implementation.
# Enable sync by default.

bq. Making sync actually work is a bug fix, it was a bug that we allowed people to call sync and unlike append there wasn't a flag to enable it that was disabled by default. Better to fix the default behavior (which allows you to sync).
The implementation earlier used dfs.support.append to support both durable sync and append. When this flag is off, a whole bunch of code is turned off that relates to sync functionality: how the blocks are stored, block reports, etc. Now with this change, this code can no longer be turned off. I agree with enabling sync by default. However, for folks who chose not to enable the related code and are not impacted by it, we need to add a flag to turn off that functionality.
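
The kind of opt-out flag being asked for here could look roughly like the sketch below. This is an assumption-laden illustration, not code from any patch: dfs.support.sync is only the name proposed in this thread, the class and method names are hypothetical, and java.util.Properties stands in for Hadoop's configuration object.

```java
import java.util.Properties;

public class DurableSyncFlagSketch {
    // Flag name as proposed in this thread; not confirmed to exist in any release.
    static final String SYNC_KEY = "dfs.support.sync";

    // Durable sync is on unless the operator explicitly sets the flag to false.
    public static boolean durableSyncEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(SYNC_KEY, "true"));
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();      // flag not set
        Properties optedOut = new Properties();
        optedOut.setProperty(SYNC_KEY, "false");     // operator opts out

        System.out.println(durableSyncEnabled(defaults)); // true
        System.out.println(durableSyncEnabled(optedOut)); // false
    }
}
```

Defaulting to on while still reading the flag would give HBase users working sync out of the box and leave existing installations a way to keep the sync-related code paths disabled.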
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266339#comment-13266339 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

Thanks for chiming in Suresh.

Wrt #1 see [this comment in HDFS-3120|https://issues.apache.org/jira/browse/HDFS-3120?focusedCommentId=13241903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13241903] that outlines the proposal that Todd, Nicholas and I thought was best. Feel free to file a follow-on jira for an improvement.

Wrt #2 add a new option to disable durable sync? Personally I don't think we should HADOOP-8230 
                

        

[jira] [Issue Comment Edited] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266339#comment-13266339 ] 

Eli Collins edited comment on HADOOP-8230 at 5/2/12 4:46 AM:
-------------------------------------------------------------

Thanks for chiming in Suresh.

Wrt #1 see [this comment in HDFS-3120|https://issues.apache.org/jira/browse/HDFS-3120?focusedCommentId=13241903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13241903] that outlines the proposal that Todd, Nicholas and I thought was best. Feel free to file a follow-on jira for an improvement, happy to review. I'll update the description to match the proposal.

Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.

Wrt #3 the rationale was two-fold: (1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.
                
      was (Author: eli2):
    Thanks for chiming in Suresh.

Wrt #1 see [this comment in HDFS-3120|https://issues.apache.org/jira/browse/HDFS-3120?focusedCommentId=13241903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13241903] that outlines the proposal that Todd, Nicholas and I thought was best. Feel free to file a follow-on jira for an improvement.

Wrt #2 add a new option to disable durable sync? Personally I don't think we should HADOOP-8230 
                  

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269311#comment-13269311 ] 

Todd Lipcon commented on HADOOP-8230:
-------------------------------------

bq. This was turned on in the 20 originally and then we had to turn it off due to bugs.

Then several of us spent many months fixing those bugs, and we haven't seen any since.

{quote}
Given that the default in Hadoop 1 is off, why not leave it off and give a way to turn it on?
The current default is off and I don't see a reason to change that default in 1.1.
There are installations that are using the current default.
Can you please explain the reason to make this change?
{quote}

It's a pain for HBase users to have to manually flip this, and risk data loss if they don't. Changing the default also means we have fewer code paths to maintain for the average user.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Sanjay Radia (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267596#comment-13267596 ] 

Sanjay Radia commented on HADOOP-8230:
--------------------------------------

The sync fixes were a large number of individual fixes. They added risk for users that were not using the sync feature. Hence sync was kept off by default for such users - this was a very conscious decision.
Sync should be left off by default with an option to turn it on. This is an incompatible change.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267691#comment-13267691 ] 

Todd Lipcon commented on HADOOP-8230:
-------------------------------------

Couldn't the same be said for any new feature? Given that sync was fixed prior to the 1.0 release, I don't see why this should be considered an incompatible change. Many much bigger incompatible changes went into 1.0 when compared to earlier 0.20.20x or 0.20.x releases (e.g., the tarball layout changed entirely). This doesn't affect any of the existing APIs, only the underlying implementations.

Given this is committed for 1.1, not 1.0.x, I don't buy the reasoning that it's too much risk. We've had sync on by default in hundreds of our customer clusters for almost two years now and have yet to see a related data-loss event. The only bugs we've seen have been bugs where sync() wouldn't provide the correct semantics, but for installs which don't use sync, that doesn't matter.
                
> Enable sync by default and disable append
> -----------------------------------------
>
>                 Key: HADOOP-8230
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8230
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 1.1.0
>
>         Attachments: hadoop-8230.txt
>
>
> Per HDFS-3120 for 1.x let's:
> - Always enable the sync path, which is currently only enabled if dfs.support.append is set
> - Remove the dfs.support.append configuration option. We'll keep the code paths though in case we ever fix append on branch-1, in which case we can add the config option back


        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262836#comment-13262836 ] 

Todd Lipcon commented on HADOOP-8230:
-------------------------------------

+1, looks good to me.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269360#comment-13269360 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

Suresh, Sanjay, the rationale is discussed in HDFS-3120. In short:
# We shouldn't provide a flag that enables append because we know append has data loss issues
# HBase and other programs have data loss when running against a default Hadoop 1.x install

The rationale is pretty clear - this prevents data loss.

Per my earlier comment, I'm open to having an option to disable durable sync if you think that use case is important, but what is that use case?  Given that the sync code path is well tested and debugged, why would you want to run with a buggy sync implementation?
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269882#comment-13269882 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

I get that; what I'm saying is that I don't see the rationale for disabling the durable sync code paths. I'm -0 on HADOOP-8365; if you feel strongly that we should have a config option that lets people keep the previous/broken sync behavior, go for it. I just don't see it.
                

        

[jira] [Updated] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HADOOP-8230:
--------------------------------

    Attachment: hadoop-8230.txt

Patch attached.

test-patch is clean. I'm running the full unit test suite now and will also do some cluster testing for sanity.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270111#comment-13270111 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

This change was proposed and discussed over a month ago:
https://issues.apache.org/jira/browse/HDFS-3120?focusedCommentId=13241903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13241903

Since you're worried about the sync path being buggy, I think the best remedy is testing. The sync path in 1.0.x has been tested extensively by HBase users (including MR jobs), but we need to spend time looking at any bugs/changed behavior that might crop up as part of testing the Hadoop 1.x release anyway.

This change prevents *known* data loss on out-of-the-box Hadoop 1.x installs, which seems more important than issues that could potentially come up during testing. I think our HBase users feel similarly. The reason I filed a separate jira for adding the flag is that I don't think we should let users enable a code path that we know can result in data loss, and that would mean we have to test two code paths. I also see your POV, which is why I'm not -1 on this flag even though I don't like it.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269920#comment-13269920 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. Do you think there are many users who'd want to do this, Suresh?
There are several clusters that I support that do not use sync and that currently run with append turned off.

bq. I'd think the number is few, and if there are any still conscious that this option even exists, they are probably suffering from the FUD that sync is buggy/broken. We should help them get over their misconception?
I agree that the code that is being enabled has been stable for some time, which is the main reason why it was ported to 0.20.205. However, I would like to retain the existing behavior and not enable a change unnecessarily on these clusters. This avoids having to worry about or spend time looking at any bugs/changed behavior that might crop up.

For these kinds of changes (see several token-related changes that happened in 1.x), I have always advocated adding a flag so existing deployments can stay unaffected. I am asking the same here. It matters more here because this patch removed an existing option for turning off the new code.

bq. if you feel strongly that we should have a config option that lets people keep the previous/broken sync behavior, go for it
The need for an option is a comment on the patch committed in this jira. Sorry I could not comment quickly enough, as this patch was committed with a short turnaround time. I think it should be addressed as a subsequent patch for this jira and not as a separate optional item. Alternatively, we could revert this change and rework it to add a flag.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263850#comment-13263850 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

{noformat}
     [exec] 
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 4 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     -1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings.
     [exec] 
{noformat}

The 8 Findbugs warnings are those from HADOOP-7847.
                

        

[jira] [Updated] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HADOOP-8230:
------------------------------------

    Release Note: Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you're OK, as durable sync (the reason HBase required dfs.support.append) is now enabled by default. If you really need the previous append functionality, set the flag "dfs.support.broken.append" to true.  (was: Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you're OK, as durable sync (why HBase required dfs.support.append) is now enabled by default.)
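
The behavior described in this release note can be sketched as a small model. This is only an illustration, not Hadoop source code: the helper names `parse_bool` and `resolve_features` are hypothetical, and the assumption that a leftover dfs.support.append setting has no effect on append is read from the release note and the issue description, not from the patch itself.

```python
# Illustrative model only; this is not Hadoop source code.
# parse_bool and resolve_features are hypothetical helper names.

def parse_bool(value, default):
    """Read a config string as a boolean, falling back to a default."""
    if value is None:
        return default
    return value.strip().lower() == "true"


def resolve_features(conf):
    """Return which write features are active under the release note."""
    return {
        # Durable sync no longer depends on any flag.
        "durable_sync": True,
        # Append stays off unless the operator opts in explicitly.
        # Assumption: a leftover dfs.support.append value is ignored here.
        "append": parse_bool(conf.get("dfs.support.broken.append"), False),
    }


# An HBase cluster that still carries the old setting.
print(resolve_features({"dfs.support.append": "true"}))
# An operator who explicitly opts back in to the append code path.
print(resolve_features({"dfs.support.broken.append": "true"}))
```

In this model an existing HBase deployment needs no configuration change after upgrading, while append requires a deliberate opt-in under a flag name that advertises the risk.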
    

        

[jira] [Resolved] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins resolved HADOOP-8230.
---------------------------------

          Resolution: Fixed
       Fix Version/s: 1.1.0
    Target Version/s:   (was: 1.1.0)
        Hadoop Flags: Incompatible change,Reviewed  (was: Incompatible change)

I've committed this. Thanks for the review Todd!
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269845#comment-13269845 ] 

Eli Collins commented on HADOOP-8230:
-------------------------------------

@Suresh, I understand that someone said they want to run their installation w/ broken sync; the question is *why* they want to run an installation with broken sync. I filed HADOOP-8365 for such a flag, but there's currently no rationale for why you'd want to do that. Also, someone choosing to upgrade from 1.0.x to 1.1 is going to pick up new changes - and even new features - and most of them don't have a flag to disable them. Nothing new there.

@Koji, per your comment..

bq. Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time. During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.

There may be a misunderstanding: the dfs.support.append flag never controlled whether sync was enabled. DFSClient#sync and NN#fsync have always been available - *enabling sync by default does not expose any new APIs to your users that were not previously available*. The difference is that this fixes bugs for your users that were already using sync, which I think you'll want.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269678#comment-13269678 ] 

Koji Noguchi commented on HADOOP-8230:
--------------------------------------

bq. , but what is that use case?

I'm probably in the minority here, but I do want "an option to disable durable sync", not because it could be buggy but because it could be too stable compared to 0.23/2.0.  Before moving all of our non-HBase clusters to 2.0, we might use 1.1 for some time.  During this period, I do not want some production projects to start relying on the sync features then find some regression/difference on 2.0 blocking our upgrade schedule.
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269785#comment-13269785 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. what is that use case?
I think I have explained it in the comments above. To repeat:

"Still, I think we should retain ability to turn it off, because I want to continue running my installation that way and this patch removes that ability."
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408193#comment-13408193 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

I had marked HADOOP-8365 as a blocker for 1.1.0. 

Since HADOOP-8365 has not been fixed yet for 1.1.0, I am -1 on this patch. If HADOOP-8365 gets fixed, I will remove my -1.

                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Suresh Srinivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266700#comment-13266700 ] 

Suresh Srinivas commented on HADOOP-8230:
-----------------------------------------

bq. Wrt #2 personally I don't think we should allow people to disable durable sync as that can result in data loss for people running HBase. See HADOOP-8230 for more info. I'm open to having an option to disable durable sync if you think that use case is important.
There are installations where HBase is not used and sync was disabled. Now this patch has removed that option. When an installation upgrades to a release with this patch, suddenly sync is enabled and there is no way to disable it.

bq. (1) there are tests that are using append not to test append per se but for the side effects and we'd lose sync test coverage by removing those tests and (2) per the description we're keeping the append code path in case someone wants to fix the data loss issues in which case it makes sense to keep the test coverage as well.
For testing sync, with this patch, since it is enabled by default, you do not need the flag, right?
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267828#comment-13267828 ] 

Todd Lipcon commented on HADOOP-8230:
-------------------------------------

Seems a reasonable compromise is to instate a dfs.support.sync flag, but set it to true by default. That way, those who are nervous about the bug fix can disable it, but the average user can have HBase/etc. working without additional configuration.
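
A minimal sketch of the default-true flag proposed in this comment, written as a stand-alone model and not as Hadoop code. The function name `sync_enabled` is hypothetical, and dfs.support.sync is only the name suggested above, not necessarily the flag that was eventually committed.

```python
# Hypothetical model of a flag that defaults to true; not Hadoop source code.

def sync_enabled(conf):
    """Durable sync is on unless the operator explicitly sets the flag to false."""
    raw = conf.get("dfs.support.sync", "true")  # a missing key means enabled
    return raw.strip().lower() == "true"


# A stock install with no extra configuration gets durable sync.
print(sync_enabled({}))
# An operator who wants the previous behavior opts out explicitly.
print(sync_enabled({"dfs.support.sync": "false"}))
```

The design choice is that the safe behavior requires no action, and only the operator who wants the old behavior has to touch the configuration.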
                

        

[jira] [Commented] (HADOOP-8230) Enable sync by default and disable append

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269909#comment-13269909 ] 

stack commented on HADOOP-8230:
-------------------------------

bq. That is the behavior I want a user to be able to turn off with a flag.

Do you think there are many users who'd want to do this, Suresh?  I'd think the number is few, and if there are any still conscious that this option even exists, they are probably suffering from the FUD that sync is buggy/broken. We should help them get over their misconception? (Pardon me if I am way off on this.  Just offering an opinion from outer left field.)
                