Posted to hdfs-dev@hadoop.apache.org by "Tummy Bunny (JIRA)" <ji...@apache.org> on 2017/05/13 13:54:04 UTC

[jira] [Created] (HDFS-11820) Thread safety in logEdit?

Tummy Bunny created HDFS-11820:
----------------------------------

             Summary: Thread safety in logEdit?
                 Key: HDFS-11820
                 URL: https://issues.apache.org/jira/browse/HDFS-11820
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs
            Reporter: Tummy Bunny


Hi there,

I am new to Hadoop and am trying to understand how things work under the hood by browsing through some of the code.

I noticed a potential thread-safety issue in FSEditLog.java in version 2.7.1, where the following pattern is used (the current trunk also uses the same pattern):
1. An instance of FSEditLogOp is retrieved from the cache for reuse.
2. Its attributes (e.g. path, timestamp, etc.) are set.
3. logEdit(op) is invoked. This method has a synchronized block in it, but it also calls "wait" if auto-sync is scheduled.

Now, suppose two rename operations arrive almost simultaneously, and each is about to write its edit log entry:
Thread #1 acquires an instance of RenameOp, sets the attributes, and invokes logEdit, then waits because auto-sync is scheduled.
Thread #2 acquires the same instance of RenameOp, sets *different* attributes, and also invokes logEdit.
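
To make the interleaving concrete, here is a minimal sketch that replays it with two threads and latches. It assumes, as this report does, that both threads receive the same cached instance. SketchRenameOp, SharedOpRaceSketch and the journal list are hypothetical stand-ins, not the real FSEditLog or FSEditLogOp classes, and the latch only imitates the wait inside logEdit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a cached, mutable edit-log op (not the real FSEditLogOp).
class SketchRenameOp {
    String src;
    String dst;
}

class SharedOpRaceSketch {
    // Assumption: every caller gets this one shared instance from the cache.
    static final SketchRenameOp CACHED = new SketchRenameOp();

    // Waits on a latch, converting interruption into an unchecked exception.
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Replays the interleaving from the report and returns what got "logged".
    static List<String> replay() {
        List<String> journal = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch t1AttributesSet = new CountDownLatch(1);
        CountDownLatch syncFinished = new CountDownLatch(1);

        Thread t1 = new Thread(() -> {
            CACHED.src = "/a";
            CACHED.dst = "/b";
            t1AttributesSet.countDown();
            await(syncFinished); // imitates waiting for the scheduled auto-sync
            journal.add(CACHED.src + " -> " + CACHED.dst);
        });
        Thread t2 = new Thread(() -> {
            await(t1AttributesSet); // run only after thread #1 has set its attributes
            CACHED.src = "/c";      // overwrites thread #1's values on the shared op
            CACHED.dst = "/d";
            journal.add(CACHED.src + " -> " + CACHED.dst);
            syncFinished.countDown();
        });

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return journal;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [/c -> /d, /c -> /d]
    }
}
```

In this sketch the rename from /a to /b never reaches the journal, while the rename from /c to /d appears twice.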

The second RenameOp could end up being logged twice because both RenameOps are actually the same instance.
One fix would be to take synchronized(anyCachedOp) { ... } before setting the attributes and calling logEdit, or to clone the op (using the cached instance as a template).
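
The clone-based option could look roughly like the sketch below, which runs the same two-thread interleaving but gives each caller its own copy of the op. TemplateRenameOp, its copy() method, ClonedOpSketch and the journal list are hypothetical names for illustration only; they are not part of the Hadoop code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a cached edit-log op (not the real FSEditLogOp).
class TemplateRenameOp {
    String src;
    String dst;

    // Each caller works on its own copy, so the cached template is never mutated.
    TemplateRenameOp copy() {
        TemplateRenameOp c = new TemplateRenameOp();
        c.src = src;
        c.dst = dst;
        return c;
    }
}

class ClonedOpSketch {
    static final TemplateRenameOp TEMPLATE = new TemplateRenameOp();

    // Waits on a latch, converting interruption into an unchecked exception.
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Same interleaving as in the report, but every thread fills in a private copy.
    static List<String> replay() {
        List<String> journal = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch t1AttributesSet = new CountDownLatch(1);
        CountDownLatch syncFinished = new CountDownLatch(1);

        Thread t1 = new Thread(() -> {
            TemplateRenameOp op = TEMPLATE.copy();
            op.src = "/a";
            op.dst = "/b";
            t1AttributesSet.countDown();
            await(syncFinished); // imitates waiting for the scheduled auto-sync
            journal.add(op.src + " -> " + op.dst);
        });
        Thread t2 = new Thread(() -> {
            await(t1AttributesSet);
            TemplateRenameOp op = TEMPLATE.copy(); // separate object from thread #1's op
            op.src = "/c";
            op.dst = "/d";
            journal.add(op.src + " -> " + op.dst);
            syncFinished.countDown();
        });

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return journal;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints [/c -> /d, /a -> /b]
    }
}
```

Copying costs one allocation per logged op, but it avoids holding a lock on the op while a thread waits, which the synchronized(anyCachedOp) alternative would do and which would serialize callers behind the waiting thread.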

I could be wrong. Am I missing something?

Thanks,

Alexander Koentjara



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org