Posted to dev@zookeeper.apache.org by "Patrick Hunt (JIRA)" <ji...@apache.org> on 2010/02/02 18:32:19 UTC
[jira] Created: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
hudson failure in ZKDatabaseCorruptionTest
------------------------------------------
Key: ZOOKEEPER-663
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-663
Project: Zookeeper
Issue Type: Bug
Components: server
Reporter: Patrick Hunt
Assignee: Mahadev konar
Priority: Critical
Fix For: 3.3.0
http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/686/
java.lang.RuntimeException: Unable to run quorum server
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:380)
    at org.apache.zookeeper.test.ZkDatabaseCorruptionTest.testCorruption(ZkDatabaseCorruptionTest.java:99)
Caused by: java.io.IOException: Invalid magic number 0 != 1514884167
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:455)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:471)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:438)
    at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:519)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:145)
    at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:193)
    at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:377)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842160#action_12842160 ]
Hadoop QA commented on ZOOKEEPER-663:
-------------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12438038/ZOOKEEPER-663.patch
against trunk revision 919640.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h8.grid.sp2.yahoo.net/129/console
This message is automatically generated.
[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Henry Robinson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henry Robinson updated ZOOKEEPER-663:
-------------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I just committed this. Thanks Mahadev!
[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Benjamin Reed (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Benjamin Reed updated ZOOKEEPER-663:
------------------------------------
Hadoop Flags: [Reviewed]
No need for a test. It just changes messages and docs.
[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated ZOOKEEPER-663:
------------------------------------
Attachment: ZOOKEEPER-663.patch
This patch fixes the logging to mention which file is corrupted, and adds forrest docs on handling this kind of failure.
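To illustrate the kind of logging change described here, the sketch below builds an exception message that names the offending file alongside the magic-number mismatch. It is only an assumption about what such a message could look like: the class name, method name, and message wording are hypothetical and are not taken from ZOOKEEPER-663.patch.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of ZooKeeper: shows an error message
// that tells the admin which transaction log file failed the check.
class CorruptLogMessage {
    // Expected value as reported in the stack trace for this issue.
    static final int EXPECTED_MAGIC = 1514884167;

    // Builds the exception; the caller decides whether to throw or log it.
    static IOException invalidMagic(File logFile, int found) {
        return new IOException("Transaction log " + logFile
                + " has invalid magic number " + found
                + " != " + EXPECTED_MAGIC);
    }
}
```

With the file name in the message, an admin can go straight to the affected file rather than inspecting every log in the data directory.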
[jira] Commented: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841977#action_12841977 ]
Mahadev konar commented on ZOOKEEPER-663:
-----------------------------------------
This looks like a quorum peer was creating a new txn log file and was shut down in the middle of that. This probably led to corruption of the txn logs in the data directory of one of the quorum peers. We do not actually have a good story for handling corruption of the transaction logs. Currently we depend on admins manually going to the node and deciding how to resolve it.
As part of this jira we can, for now, add documentation to the forrest docs on how to deal with such situations. Also, the logging needs to change to point out which file was corrupted.
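For anyone checking a node by hand, the header check that fails in the trace for this issue can be sketched as below. The expected value 1514884167 is the big-endian integer formed by the ASCII bytes "ZKLG", and a header of four zero bytes decodes to 0, which matches the value reported in the exception. The class and method names are hypothetical; this is a standalone sketch and not code from the ZooKeeper source or the patch.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical admin-side check, not part of ZooKeeper.
class TxnLogMagicCheck {
    // "ZKLG" read as a big-endian int equals 1514884167.
    static final int EXPECTED_MAGIC =
            ByteBuffer.wrap("ZKLG".getBytes(StandardCharsets.US_ASCII)).getInt();

    // True if the first four bytes decode to the expected magic number.
    static boolean hasValidMagic(byte[] header) {
        return header.length >= 4
                && ByteBuffer.wrap(header).getInt() == EXPECTED_MAGIC;
    }

    // Reads the first four bytes of a file; a file shorter than that is invalid.
    static boolean fileHasValidMagic(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] header = new byte[4];
            try {
                in.readFully(header);
            } catch (EOFException e) {
                return false;
            }
            return hasValidMagic(header);
        }
    }

    // Usage: java TxnLogMagicCheck <file> [<file> ...]
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            Path file = Paths.get(name);
            System.out.println(file + ": "
                    + (fileHasValidMagic(file) ? "ok" : "invalid magic number"));
        }
    }
}
```

This only inspects the header. It does not validate the transactions that follow, so a file that passes may still be truncated or damaged further in.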
[jira] Updated: (ZOOKEEPER-663) hudson failure in ZKDatabaseCorruptionTest
Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mahadev konar updated ZOOKEEPER-663:
------------------------------------
Status: Patch Available (was: Open)