You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@zookeeper.apache.org by GitBox <gi...@apache.org> on 2022/04/05 13:57:55 UTC
[GitHub] [zookeeper] kezhuw opened a new pull request, #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
kezhuw opened a new pull request, #1852:
URL: https://github.com/apache/zookeeper/pull/1852
This test writes txns to trigger snapshot and expects some txns remain
in txn log. But snapshot taking is asynchronous, thus all txns could be
written to snapshot. So in restarting, it is possible that no txns to
load after snapshot restored. This will fail assertion.
This commit solves this by disable automic snapshot taking by
`SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE)`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] kezhuw commented on pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
kezhuw commented on PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#issuecomment-1162954259
Ci fails due to unrelated flaky test [ZOOKEEPER-4308](https://issues.apache.org/jira/browse/ZOOKEEPER-4308)(#1851).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] kezhuw commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
kezhuw commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r843376225
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
Review Comment:
Not sure, but I think almost all tests run and should run in dedicated jvm(aka. false `reuseForks` in `zookeeper-server` module). This test is not an exception. Restoring a primitive value has no means in this case.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] kezhuw commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
kezhuw commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r843373366
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
QuorumUtil util = new QuorumUtil(1);
util.startAll();
- allCreatedLatch = new CountDownLatch(1);
-
byte[] data = new byte[500];
- // make sure a snapshot is taken and some txns are not in a snapshot
ZooKeeper zk = ClientBase.createZKClient(util.getConnString());
for (int i = 0; i < 150; i++) {
zk.create("/path" + i, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
}
- if (null == zk.exists("/path149", new MockWatcher())) {
Review Comment:
`zk.create` above is synchronous. `zk.exists` will return non null stat if I understand correct.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] symat commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
symat commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r901700884
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
Review Comment:
I'm not exactly sure how we run the tests... sometimes I even run them in IntelliJ during development. I think usually setting back all "static stuff" we changed in a test is a good practice.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] kezhuw commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
kezhuw commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r902455844
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
Review Comment:
Cleanup is always a complicated problem. No ones run in identical environment.
> setting back all "static stuff" we changed in a test is a good practice.
It is actually hard to follow. Given `SyncRequestProcessor.setSnapCount` as an example, there are 20+ usages in test, only `RecoveryTest.testRecovery` restores.
Personally, I treat all tests as oneshot and resort to junit's `@Rule`(eg. `@TempDir`) or similars to cleanup persistent resources.
Anyway, I will add a `@AfterEach` to restore it back in next fixup. 😄
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] symat commented on pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
symat commented on PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#issuecomment-1164039781
I really like your solution @kezhuw - you improved this test a lot. Thanks!
I think we are good to merge this. We should have the flaky test fixed in older branches too, if the patch applies smoothly. I'm a bit overloaded today. @eolivelli feel free to merge this. Otherwise I'll try to do it tomorrow. Thanks!!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] asfgit closed pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
URL: https://github.com/apache/zookeeper/pull/1852
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] eolivelli commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
eolivelli commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r843073938
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
QuorumUtil util = new QuorumUtil(1);
util.startAll();
- allCreatedLatch = new CountDownLatch(1);
-
byte[] data = new byte[500];
- // make sure a snapshot is taken and some txns are not in a snapshot
ZooKeeper zk = ClientBase.createZKClient(util.getConnString());
for (int i = 0; i < 150; i++) {
zk.create("/path" + i, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
}
- if (null == zk.exists("/path149", new MockWatcher())) {
Review Comment:
Why are we removing this part?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] kezhuw commented on pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
kezhuw commented on PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#issuecomment-1161553240
@symat I think you are right about what the test try to express. But the original test does not assert that "we have a snapshot"(aka. "cnt_snapshottime"). That is why this fix pass. I think we could utilize `ZKTestCase.waitFor` to wait a snapshot and `FileSnap.close` to logs more txns after snapshot paused. I will push a fixup commit soon.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] eolivelli commented on a diff in pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
eolivelli commented on code in PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#discussion_r843073437
##########
zookeeper-server/src/test/java/org/apache/zookeeper/server/persistence/FileTxnSnapLogMetricsTest.java:
##########
@@ -43,38 +40,20 @@
private static final Logger LOG = LoggerFactory.getLogger(FileTxnSnapLogMetricsTest.class);
- CountDownLatch allCreatedLatch;
-
- private class MockWatcher implements Watcher {
-
- @Override
- public void process(WatchedEvent e) {
- LOG.info("all nodes created");
- allCreatedLatch.countDown();
- }
-
- }
-
@Test
public void testFileTxnSnapLogMetrics() throws Exception {
- SyncRequestProcessor.setSnapCount(100);
+ // disable automatic snapshot taking to leave writes in txn log
+ SyncRequestProcessor.setSnapCount(Integer.MAX_VALUE);
Review Comment:
We should reset this in a finally block.
That was a pre existing problem
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] eolivelli commented on pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
eolivelli commented on PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#issuecomment-1160388021
@maoling @symat @ztzg PTAL
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [zookeeper] symat commented on pull request #1852: ZOOKEEPER-4511: Fix flaky test FileTxnSnapLogMetricsTest.testFileTxnSnapLogMetrics
Posted by GitBox <gi...@apache.org>.
symat commented on PR #1852:
URL: https://github.com/apache/zookeeper/pull/1852#issuecomment-1165591879
thanks @kezhuw for fixing this test!!
I pushed it to master, branch-3.8 and branch-3.7. The cherry-pick on branch-3.6 wasn't clean, so we will need a separate PR there, if we want to push this to branch-3.6.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@zookeeper.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org