Posted to issues@flink.apache.org by tillrohrmann <gi...@git.apache.org> on 2017/05/12 16:49:38 UTC

[GitHub] flink pull request #3888: [FLINK-6020] Introduce BlobServer#readWriteLock to...

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/3888

    [FLINK-6020] Introduce BlobServer#readWriteLock to synchronize file creation and deletion

    This PR is based on #3873 and #3864.
    
    This commit introduces a BlobServer#readWriteLock in order to synchronize file creation
    and deletion operations in BlobServerConnection and BlobServer. This prevents
    concurrent put, delete and get operations from interfering with each other.
    
    The get operations are synchronized using the read lock in order to allow a degree of
    parallelism among concurrent reads.
    
    What this PR does not address is the handling of concurrent writes and reads to the `BlobStore`. This could be solved via SUCCESS files in order to indicate the completion of a file. However, the first read operation should now happen strictly after the write operation due to the locking.
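    The locking scheme described above could be sketched roughly as follows. This is a minimal illustration with hypothetical class and method names (an in-memory map stands in for the blob files), not Flink's actual BlobServer code:

    ```java
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Sketch: gets share the read lock, while file creation and deletion
    // take the exclusive write lock. Names are illustrative, not Flink's.
    class BlobStoreSketch {
        private final ReentrantReadWriteLock readWriteLock = new ReentrantReadWriteLock();
        private final ConcurrentHashMap<String, byte[]> files = new ConcurrentHashMap<>();

        byte[] get(String key) {
            readWriteLock.readLock().lock();   // many gets may proceed in parallel
            try {
                return files.get(key);
            } finally {
                readWriteLock.readLock().unlock();
            }
        }

        void put(String key, byte[] data) {
            readWriteLock.writeLock().lock();  // exclusive: no get/put/delete interferes
            try {
                files.put(key, data);
            } finally {
                readWriteLock.writeLock().unlock();
            }
        }

        void delete(String key) {
            readWriteLock.writeLock().lock();  // deletes also exclude all other operations
            try {
                files.remove(key);
            } finally {
                readWriteLock.writeLock().unlock();
            }
        }
    }
    ```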

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink concurrentBlobUploads

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3888.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3888
    
----
commit 28bc0acd5c2d3858e41fd29113fff1c7a40471f5
Author: Till Rohrmann <tr...@apache.org>
Date:   2017-05-11T15:36:17Z

    [FLINK-6555] [futures] Generalize ConjunctFuture to return results
    
    The ConjunctFuture now returns the set of future values once it is completed.

commit f221353584c9089552572387eee9e162695311cd
Author: Till Rohrmann <tr...@apache.org>
Date:   2017-05-12T09:05:13Z

    Introduce WaitingConjunctFuture; Fix thread safety issue with ResultConjunctFuture
    
    The WaitingConjunctFuture waits for the completion of its futures. The future values
    are discarded making it more efficient than the ResultConjunctFuture which returns
    the futures' values. The WaitingConjunctFuture is instantiated via
    FutureUtils.waitForAll(Collection<Future>).
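    The two conjunct-future flavours described in the commit messages above can be sketched with plain java.util.concurrent.CompletableFuture. This is illustrative only: Flink uses its own Future abstraction, and the method bodies here are assumptions, not the actual FutureUtils implementation:

    ```java
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;
    import java.util.concurrent.CompletableFuture;

    // Sketch of the two conjunct-future flavours: one waits and discards
    // the values, the other collects them. Names mirror the commit text.
    class FutureUtilsSketch {

        // WaitingConjunctFuture: completes when all futures complete,
        // discarding their values (cheaper, nothing to collect).
        static CompletableFuture<Void> waitForAll(Collection<? extends CompletableFuture<?>> futures) {
            return CompletableFuture.allOf(futures.toArray(new CompletableFuture<?>[0]));
        }

        // ResultConjunctFuture: completes with the list of all result values.
        static <T> CompletableFuture<List<T>> combineAll(Collection<CompletableFuture<T>> futures) {
            CompletableFuture<?>[] arr = futures.toArray(new CompletableFuture<?>[0]);
            return CompletableFuture.allOf(arr).thenApply(ignored -> {
                List<T> results = new ArrayList<>(futures.size());
                for (CompletableFuture<T> f : futures) {
                    results.add(f.join()); // safe: allOf guarantees completion
                }
                return results;
            });
        }
    }
    ```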

commit 79eafa2671f4a4dfdc9ab135443c339ef2e8001a
Author: Till Rohrmann <tr...@apache.org>
Date:   2017-05-09T08:26:37Z

    [FLINK-6519] Integrate BlobStore in lifecycle management of HighAvailabilityServices
    
    The HighAvailabilityService creates a single BlobStoreService instance which is
    shared by all BlobServer and BlobCache instances. The BlobStoreService's lifecycle
    is exclusively managed by the HighAvailabilityServices. This means that the
    BlobStore's content is only cleaned up if the HighAvailabilityService's HA data
    is cleaned up. Having this single point of control, makes it easier to decide when
    to discard HA data (e.g. in case of a successful job execution) and when to retain
    the data (e.g. for recovery).
    
    Close and cleanup all data of BlobStore in HighAvailabilityServices
    
    Use HighAvailabilityServices to create BlobStore
    
    Introduce BlobStoreService interface to hide close and closeAndCleanupAllData methods
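    The interface split described in this commit could be sketched as follows. Most components program against BlobStore only, while the HighAvailabilityServices holds the BlobStoreService and is the sole caller of the lifecycle methods. The method signatures here are assumptions for illustration, not copied from Flink:

    ```java
    import java.io.File;

    // Sketch: the store operations visible to regular components.
    // Real code would declare IOException; omitted here for brevity.
    interface BlobStore {
        void put(File localFile, String blobKey);
        boolean get(String blobKey, File localTarget);
    }

    // Sketch: lifecycle methods hidden behind a sub-interface that only
    // the HighAvailabilityServices sees.
    interface BlobStoreService extends BlobStore {
        // Clean shutdown that retains the HA data for later recovery.
        void close();

        // Invoked only when the HA data itself is discarded,
        // e.g. after a successful job execution.
        void closeAndCleanupAllData();
    }
    ```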

commit c6ff2ced58e60f63d0236e53c83192f64479c44a
Author: Till Rohrmann <tr...@apache.org>
Date:   2017-05-10T15:38:49Z

    [FLINK-6020] Introduce BlobServer#readWriteLock to synchronize file creation and deletion
    
    This commit introduces a BlobServer#readWriteLock in order to synchronize file creation
    and deletion operations in BlobServerConnection and BlobServer. This prevents
    concurrent put, delete and get operations from interfering with each other.
    
    The get operations are synchronized using the read lock in order to allow a degree of
    parallelism among concurrent reads.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #3888: [FLINK-6020] Introduce BlobServer#readWriteLock to...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3888#discussion_r116457501
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServerConnection.java ---
    @@ -235,8 +235,54 @@ else if (contentAddressable == CONTENT_ADDRESSABLE) {
     			return;
     		}
     
    -		// from here on, we started sending data, so all we can do is close the connection when something happens
    +		readLock.lock();
    +
     		try {
    +			try {
    +				if (!blobFile.exists()) {
    +					// first we have to release the read lock in order to acquire the write lock
    +					readLock.unlock();
    +					writeLock.lock();
    --- End diff --
    
    True, I'm also not sure which version would be more efficient. Given that the file transfer dominates the execution time here, I assume that the additional check won't hurt.


---

[GitHub] flink pull request #3888: [FLINK-6020] Introduce BlobServer#readWriteLock to...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/3888


---

[GitHub] flink issue #3888: [FLINK-6020] Introduce BlobServer#readWriteLock to synchr...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the issue:

    https://github.com/apache/flink/pull/3888
  
    Thanks for the quick review @StefanRRichter. I will address your comment and also add a test case for the delete path.


---

[GitHub] flink pull request #3888: [FLINK-6020] Introduce BlobServer#readWriteLock to...

Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3888#discussion_r116446914
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobServerConnection.java ---
    @@ -235,8 +235,54 @@ else if (contentAddressable == CONTENT_ADDRESSABLE) {
     			return;
     		}
     
    -		// from here on, we started sending data, so all we can do is close the connection when something happens
    +		readLock.lock();
    +
     		try {
    +			try {
    +				if (!blobFile.exists()) {
    +					// first we have to release the read lock in order to acquire the write lock
    +					readLock.unlock();
    +					writeLock.lock();
    --- End diff --
    
    In between upgrading from the read lock to the write lock, multiple threads can reach this point, and as far as I can see, the file can then be written more often than required. I assume the code still produces a correct result, but it could do duplicate work. An obvious fix would be to re-check `blobFile.exists()` under the write lock, but I am not sure whether the cost of another metadata query per write outweighs the occasional, but unlikely, duplicate work.
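    The suggested re-check under the write lock could look roughly like this. This is a hedged sketch with illustrative names, not the actual BlobServerConnection code:

    ```java
    import java.io.File;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    // Sketch of the lock-upgrade path with the re-check the comment
    // suggests: only the first thread through the unlocked gap writes.
    class LockUpgradeSketch {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        void ensureFile(File blobFile, Runnable writeFile) {
            lock.readLock().lock();
            try {
                if (!blobFile.exists()) {
                    // Release the read lock before taking the write lock;
                    // ReentrantReadWriteLock does not support a direct upgrade.
                    lock.readLock().unlock();
                    lock.writeLock().lock();
                    try {
                        // Several threads may have passed the first check in
                        // the unlocked gap, so check again under the write lock.
                        if (!blobFile.exists()) {
                            writeFile.run();
                        }
                    } finally {
                        // Downgrade: take the read lock before releasing the
                        // write lock so the subsequent read stays protected.
                        lock.readLock().lock();
                        lock.writeLock().unlock();
                    }
                }
                // ... serve the file under the read lock ...
            } finally {
                lock.readLock().unlock();
            }
        }
    }
    ```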


---