Posted to commits@giraph.apache.org by ed...@apache.org on 2016/06/21 17:14:44 UTC
git commit: updated refs/heads/trunk to 2185f59
Repository: giraph
Updated Branches:
refs/heads/trunk d827c97fc -> 2185f5946
GIRAPH-1076 Race condition in FileTxnSnapLog
Summary:
org.apache.zookeeper.server.persistence.FileTxnSnapLog has a potential race condition:
if (!this.dataDir.exists()) {
if (!this.dataDir.mkdirs()) {
throw new IOException("Unable to create data directory " + this.dataDir);
}
}
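If two threads construct FileTxnSnapLog at the same time, both can see exists() return false; one mkdirs() call then succeeds while the other returns false because the directory already exists, which triggers the IOException.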
We saw this happen in Giraph, where FileTxnSnapLog is created both by the PurgeTask that DatadirCleanupManager schedules and by InProcessZooKeeperRunner#runFromConfig.
Until the ZooKeeper code is fixed (if it ever is), we need to make sure ZooKeeper starts first and only then start PurgeTask.
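For illustration only, here is a minimal sketch of a directory check that tolerates a concurrent creator. It is not the ZooKeeper code and not part of this commit; the class SafeDirCreate and the helper ensureDirectory are hypothetical names. The idea is to re-check isDirectory() when mkdirs() returns false, since false can simply mean another thread created the directory first.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.concurrent.atomic.AtomicReference;

public class SafeDirCreate {
  /**
   * Hypothetical helper (not ZooKeeper API): creates dir if it is absent
   * and tolerates another thread creating it at the same time.
   */
  static void ensureDirectory(File dir) throws IOException {
    // mkdirs() returns false both on real failure and when the directory
    // already exists, so only fail if the directory is still missing.
    if (!dir.mkdirs() && !dir.isDirectory()) {
      throw new IOException("Unable to create data directory " + dir);
    }
  }

  public static void main(String[] args) throws Exception {
    // Assumed layout for the demo: a fresh temp directory with a "data" child.
    File base = Files.createTempDirectory("snaplog-demo").toFile();
    File dataDir = new File(base, "data");
    AtomicReference<IOException> failure = new AtomicReference<>();

    // Two threads race to create the same directory, as in the summary above.
    Runnable task = () -> {
      try {
        ensureDirectory(dataDir);
      } catch (IOException e) {
        failure.set(e);
      }
    };
    Thread first = new Thread(task);
    Thread second = new Thread(task);
    first.start();
    second.start();
    first.join();
    second.join();

    System.out.println("directory exists: " + dataDir.isDirectory());
    System.out.println("failure: " + failure.get());
  }
}
```

Because the ZooKeeper class cannot be changed from Giraph, this commit takes the other route and orders the two callers so that only one of them ever creates the directory.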
Test Plan: run a few jobs and mvn clean verify
Reviewers: majakabiljo, dionysis.logothetis, heslami, maja.kabiljo
Reviewed By: maja.kabiljo
Differential Revision: https://reviews.facebook.net/D59883
Project: http://git-wip-us.apache.org/repos/asf/giraph/repo
Commit: http://git-wip-us.apache.org/repos/asf/giraph/commit/2185f594
Tree: http://git-wip-us.apache.org/repos/asf/giraph/tree/2185f594
Diff: http://git-wip-us.apache.org/repos/asf/giraph/diff/2185f594
Branch: refs/heads/trunk
Commit: 2185f5946edfddcca8a5bcb76160212bfe2ef797
Parents: d827c97
Author: Sergey Edunov <ed...@fb.com>
Authored: Tue Jun 21 10:14:34 2016 -0700
Committer: Sergey Edunov <ed...@fb.com>
Committed: Tue Jun 21 10:14:34 2016 -0700
----------------------------------------------------------------------
.../org/apache/giraph/zk/InProcessZooKeeperRunner.java | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/giraph/blob/2185f594/giraph-core/src/main/java/org/apache/giraph/zk/InProcessZooKeeperRunner.java
----------------------------------------------------------------------
diff --git a/giraph-core/src/main/java/org/apache/giraph/zk/InProcessZooKeeperRunner.java b/giraph-core/src/main/java/org/apache/giraph/zk/InProcessZooKeeperRunner.java
index 9502c24..4f15f3a 100644
--- a/giraph-core/src/main/java/org/apache/giraph/zk/InProcessZooKeeperRunner.java
+++ b/giraph-core/src/main/java/org/apache/giraph/zk/InProcessZooKeeperRunner.java
@@ -88,16 +88,22 @@ public class InProcessZooKeeperRunner
* @throws IOException if can't start zookeeper
*/
public int start(ZookeeperConfig config) throws IOException {
+ serverRunner = new ZooKeeperServerRunner();
+ //Make sure zookeeper starts first and purge manager last
+ //This is important because zookeeper creates a folder
+ //strucutre on the local disk. Purge manager also tries
+ //to create it but from a different thread and can run into
+ //race condition. See FileTxnSnapLog source code for details.
+ int port = serverRunner.start(config);
// Start and schedule the the purge task
DatadirCleanupManager purgeMgr = new DatadirCleanupManager(
config
- .getDataDir(), config.getDataLogDir(),
+ .getDataDir(), config.getDataLogDir(),
GiraphConstants.ZOOKEEPER_SNAP_RETAIN_COUNT,
GiraphConstants.ZOOKEEPER_PURGE_INTERVAL);
purgeMgr.start();
- serverRunner = new ZooKeeperServerRunner();
- return serverRunner.start(config);
+ return port;
}
/**