You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-dev@hadoop.apache.org by "James Clampffer (JIRA)" <ji...@apache.org> on 2017/03/21 16:22:42 UTC

[jira] [Created] (HDFS-11556) libhdfs++: Improve the usage of std::promise and std::future for sync shims

James Clampffer created HDFS-11556:
--------------------------------------

             Summary: libhdfs++: Improve the usage of std::promise and std::future for sync shims
                 Key: HDFS-11556
                 URL: https://issues.apache.org/jira/browse/HDFS-11556
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: James Clampffer
            Assignee: James Clampffer


There's two issues with the current promise/future situation being used in the synchronous shims.

1) C++ std::promise and std::futures aren't particularly intuitive (at least for me).  Logically it makes sense to think of the promise and future objects as handles that each hold onto a shared object containing the promised type + a mutex and condition variable for synchronization.  When promise<T>::set(T val) is called it seems reasonable to expect that it writes val into the shared state, and and wakes the thread blocked in future<>::wait.  This would be standard conformant but generally isn't how it's implemented in the standard libraries.

The common implementation is to bundle the value, mutex, and cond_var into the promise.  This makes it really easy to introduce race conditions that will often pass tests.. until they don't.  Example of most common case:

{code}
auto foo = std::promise<int>()
std::thread async_task([&foo](){
  ... do some work ...
  foo.set_value(1);
})

auto future_val = foo.get_future();
int result = future_val.get_value();
{code}

That last line has a race between promise<T>::~promise() which happens right after it gets set and any reads of the value, mutex, or condition variable owned by the promise but referenced by the future.

When it does show up there will typically be a thread hung in pthread_cond_wait and an invalid 32 bit read of the futex state that the condition variable used under the hood.

2) If a callback happens to throw in an operation that has synchronous shims on top of it the stack will unwind and io_service.run_once will be called to keep getting work done (HDFS-9699).  The problem here is that once the stack is unwound there's nothing around to set the promise so the thread blocked on a future will never wake up.  The simple/unrealistic fix is to assert that exceptions should never be used but that precludes the use of this library in most C++ projects and the use of most third party dependencies that weren't made by Google.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org