You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@zookeeper.apache.org by "Christian Ziech (JIRA)" <ji...@apache.org> on 2012/06/18 11:10:43 UTC

[jira] [Created] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Christian Ziech created ZOOKEEPER-1489:
------------------------------------------

             Summary: Data loss after truncate on transaction log
                 Key: ZOOKEEPER-1489
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.4.3
         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
            Reporter: Christian Ziech
            Priority: Critical


The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).

This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.

I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 

Steps to reproduce:
- Start an ensemble of zookeeper servers with at least 3 participants
- Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
- Create another entry which is hence only reflected on zk-1 and zk-3
- Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
- Bring back zk-2 and zk-3 so that they form a quorum
- Allow zk-1 to connect again
- zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
- Create a third node
- Force zk-1 to become master somehow
- That third node will be gone


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Christian Ziech (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Ziech updated ZOOKEEPER-1489:
---------------------------------------

    Attachment: TruncateTxLogCorruption.tgz

Re-uploading as requested. 

I'm happy that you liked the idea - note that some work may be required on that class to make it reliable. I just hacked that together as a demonstrator. I'm happy that I could help.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412348#comment-13412348 ] 

Hadoop QA commented on ZOOKEEPER-1489:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536130/ZOOKEEPER-1489.patch
  against trunk revision 1357711.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1135//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1135//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1135//console

This message is automatically generated.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412566#comment-13412566 ] 

Hadoop QA commented on ZOOKEEPER-1489:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536181/ZOOKEEPER-1489.patch
  against trunk revision 1357711.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1136//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1136//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1136//console

This message is automatically generated.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409594#comment-13409594 ] 

Benjamin Reed commented on ZOOKEEPER-1489:
------------------------------------------

sorry i haven't been able to look at this more deeply. i was more swamped this weekend than expected. i have two small comments (given the previous statement feel free to ignore :)

1) i think the correct fix is to simply ensure that a new log file is created after we do the initial handshake. that is what we were doing earlier; i'm not sure when it changed...
2) you should be able to reproduce this quickly and reliably using a Zab1_0Test style

i'm planning on checking more deeply into these two points, but given that i have a time crunch for the next week, i'm afraid i might not get to it :(
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489_br34.patch
                ZOOKEEPER-1489_br33.patch
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409732#comment-13409732 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

I saw that, thanks Camille (will get back to this once I have a few min free...)
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Christian Ziech (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412231#comment-13412231 ] 

Christian Ziech commented on ZOOKEEPER-1489:
--------------------------------------------

One should note that I only quickly put together that port forwarder and did not really use it anywhere except for producing the test case here to demonstrate the issue ... so there could be any amount of race conditions I did not think about.

PS: Great to read that work is progressing so quickly :) ...
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489.patch

Updated patches with Camille's f/b (more sane timeouts in 3 cases)
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403936#comment-13403936 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

I have created a patch that fixes the issue and passes all our tests, however it's still hanging up when running the tests you (Christian) provided. I'm getting close (I think I see the issue, working on it)

Christian, I'd like to include the test you created in our unit test infra. Would it be possible for you to contribute these? If so please re-attach the tgz archive to this jira and click the "grant" option when doing so. I would take what you've provided and massage it a bit to get into our test infra (pkg names, license headers, etc...). I like the port forwarder idea alot(!) if you're unable to contribute the code I'll likely add this concept to our test infra at some point myself. Thanks again!

                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Camille Fournier (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409717#comment-13409717 ] 

Camille Fournier commented on ZOOKEEPER-1489:
---------------------------------------------

I left my comments in the review btw, but I think in general I'm +1 if we get rid of the 300000ms waits in that test.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411831#comment-13411831 ] 

Hadoop QA commented on ZOOKEEPER-1489:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536083/ZOOKEEPER-1489.patch
  against trunk revision 1357711.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1132//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1132//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1132//console

This message is automatically generated.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Camille Fournier (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413045#comment-13413045 ] 

Camille Fournier commented on ZOOKEEPER-1489:
---------------------------------------------

+1 looks good
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409650#comment-13409650 ] 

Benjamin Reed commented on ZOOKEEPER-1489:
------------------------------------------

after further investigation it appears that pat's fix is the best. we do not restart the log file on reconnect only on restart or snapshot change.

+1 on the fix. i'll try to get a better test in, but i don't want to hold up the fix.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Camille Fournier (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408307#comment-13408307 ] 

Camille Fournier commented on ZOOKEEPER-1489:
---------------------------------------------

Can we stick this guy in reviewboard? I've got some comments that might be more easily made there. I'm most concerned with some of the timeouts, printlns etc in TruncateCorruptionTest. Like, no way should we have a test with 300000ms timeouts in it. 
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489.patch
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412548#comment-13412548 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

Christian no worries at all. I really appreciate your submission, happy to work out the kinks. Will be a great addition to our testing regime.

                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt reassigned ZOOKEEPER-1489:
---------------------------------------

    Assignee: Patrick Hunt
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Critical
>         Attachments: TruncateTxLogCorruption.tgz
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412581#comment-13412581 ] 

Hadoop QA commented on ZOOKEEPER-1489:
--------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536181/ZOOKEEPER-1489.patch
  against trunk revision 1357711.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1137//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1137//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1137//console

This message is automatically generated.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416995#comment-13416995 ] 

Hudson commented on ZOOKEEPER-1489:
-----------------------------------

Integrated in ZooKeeper-trunk #1617 (See [https://builds.apache.org/job/ZooKeeper-trunk/1617/])
    ZOOKEEPER-1489. Data loss after truncate on transaction log (phunt) (Revision 1362660)

     Result = SUCCESS
phunt : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1362660
Files : 
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZKDatabase.java
* /zookeeper/trunk/src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/TruncateCorruptionTest.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/server/util/PortForwarder.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/TruncateTest.java

                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489.patch

This update adds more logging, I also believe it fixes the test failure we were seeing. The main issue was that the port forwarder was not attempting to reconnect on outbound connection failure. In the case of follower connecting to the quorum port on the leader it might take some time for the leader to open the port once the election has completed. The core follower code retries this. However in the port forwarder case it would originally try once and close the inbound connection if the outbound failed. I now retry 10 times.

I managed to find a box which would fail on this test (my laptop and a number of other boxes didn't display this issue). With this latest fix the previously failing box is now passing the test consistently.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412068#comment-13412068 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

One other thing, noticed that some of the socket timeouts were very short on the port forwarder - increased the timeouts to values more likely to pass on various hardware/loads.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13408373#comment-13408373 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

That's great, thanks: https://reviews.apache.org/r/5815/

fwiw the test was/is complicated. I didn't want to change too much and lose the ability to reproduce the problem. I did replace a number of timeouts and such, might not have caught everything. I'm also fine to log everything is that's an issue.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409631#comment-13409631 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

:-(
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

             Priority: Blocker  (was: Critical)
    Affects Version/s: 3.3.5
        Fix Version/s: 3.5.0
                       3.4.4
                       3.3.6

I looked at the code for this and I agree, it seems the existing txn log is not being repositioned. I'll work on some tests and a patch for this.

Thanks Christian!
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412554#comment-13412554 ] 

Patrick Hunt commented on ZOOKEEPER-1489:
-----------------------------------------

I also verified that the latest test code still fails without the fix applied.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489_br34.patch
                ZOOKEEPER-1489_br33.patch
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Christian Ziech (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Ziech updated ZOOKEEPER-1489:
---------------------------------------

    Attachment: TruncateTxLogCorruption.tgz

Attaching a small maven project which contains the before mentioned test case. That test works reliably to reproduce the failure for 3.4.3 but for 3.3.5 it seems to pass, although the problem in the transaction log seems to be present there too. This difference seems to be caused by 3.3.5 not sending the truncate message to its peers in that flow the test case is following. 
However I cannot exclude that other flows may show the same problem for 3.3.5 as well.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Priority: Critical
>         Attachments: TruncateTxLogCorruption.tgz
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412331#comment-13412331 ] 

Hadoop QA commented on ZOOKEEPER-1489:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536130/ZOOKEEPER-1489.patch
  against trunk revision 1357711.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 13 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1134//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1134//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1134//console

This message is automatically generated.
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489_br34.patch
                ZOOKEEPER-1489_br33.patch
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489.patch

I updated the patches to do more logging on portforward issues - to enable debugging. Also moved the test printlns into logs (verified the test still fails w/o the fix).
                
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (ZOOKEEPER-1489) Data loss after truncate on transaction log

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-1489:
------------------------------------

    Attachment: ZOOKEEPER-1489_br34.patch
                ZOOKEEPER-1489_br33.patch
    
> Data loss after truncate on transaction log
> -------------------------------------------
>
>                 Key: ZOOKEEPER-1489
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1489
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.3, 3.3.5
>         Environment: Tested on Ubuntu 12.04 and CentOS 6, should be reproducible elsewhere
>            Reporter: Christian Ziech
>            Assignee: Patrick Hunt
>            Priority: Blocker
>             Fix For: 3.3.6, 3.4.4, 3.5.0
>
>         Attachments: TruncateTxLogCorruption.tgz, TruncateTxLogCorruption.tgz, ZOOKEEPER-1489.patch, ZOOKEEPER-1489_br33.patch, ZOOKEEPER-1489_br34.patch
>
>
> The truncate method on the transaction log in the class org.apache.zookeeper.server.persistence.FileTxnLog will reduce the file size to the required amount without either closing or re-positioning the logStream (which could also be dangerous since the truncate method is not synchronized against concurrent writes to the log).
> This causes the next append to that log to create a small "hole" in the file which java would interpret as binary zeroes when reading it. This then causes to the FileTxnIterator.next() implementation to detect the end of the log file too early.
> I'll attach a small maven project with one junit test which can be used to reproduce the issue. Due to the blackbox nature of the test it will run for roughly 50 seconds unfortunately. 
> Steps to reproduce:
> - Start an ensemble of zookeeper servers with at least 3 participants
> - Create one entry and the remove one of the servers from the ensemble temporarily (e.g. zk-2)
> - Create another entry which is hence only reflected on zk-1 and zk-3
> - Take zk-1 out of the ensemble without shutting it down (that is important, I did that by interrupting the network connection to that node) and clean zk-3
> - Bring back zk-2 and zk-3 so that they form a quorum
> - Allow zk-1 to connect again
> - zk-1 will receive a TRUNC message from zk-2 since zk-1 is now a minority knowing about that second node creation event
> - Create a third node
> - Force zk-1 to become master somehow
> - That third node will be gone

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira