You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/12 03:11:44 UTC

[GitHub] [arrow] westonpace opened a new pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

westonpace opened a new pull request #12408:
URL: https://github.com/apache/arrow/pull/12408


   The particular cause of the failure was:
   
   ```
     void InputReceived(ExecNode* input, ExecBatch batch) override {
       EVENT(span_, "InputReceived", {{"batch.length", batch.length}});
       util::tracing::Span span;
       START_SPAN_WITH_PARENT(span, span_, "InputReceived",
                              {{"node.label", label()}, {"batch.length", batch.length}});
   
       DCHECK_EQ(input, inputs_[0]);
   
       bool did_push = producer_.Push(std::move(batch));
       if (!did_push) return;  // producer_ was Closed already
   
       if (input_counter_.Increment()) {
         // This will mark the exec plan finished
         Finish();
         // At this point the main thread is generally free to exit but span hasn't been destructed yet
       }
     }
   ```
   
   It could be fixed with scoping:
   
   ```
     void InputReceived(ExecNode* input, ExecBatch batch) override {
       EVENT(span_, "InputReceived", {{"batch.length", batch.length}});
       {
         util::tracing::Span span;
         START_SPAN_WITH_PARENT(span, span_, "InputReceived",
                                {{"node.label", label()}, {"batch.length", batch.length}});
   
         DCHECK_EQ(input, inputs_[0]);
   
         bool did_push = producer_.Push(std::move(batch));
         if (!did_push) return;  // producer_ was Closed already
       }
       if (input_counter_.Increment()) {
         // This will mark the exec plan finished
         Finish();
         // At this point the main thread is generally free to exit but span hasn't been destructed yet
       }
     }
   ```
   However, I suspect this would quickly get tedious.  Instead it seems we can control the lifetime of the OT runtime storage and bind it to our worker threads.  In the future we may want to consider doing similar tricks to keep alive the memory pool, global thread pools, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r813981342



##########
File path: cpp/src/arrow/util/thread_pool.h
##########
@@ -201,6 +201,27 @@ class ARROW_EXPORT Executor {
   // Executor. Returns false if this Executor does not support this property.
   virtual bool OwnsThisThread() { return false; }
 
+  /// \brief An interface to represent something with a custom destructor
+  ///
+  /// \see KeepAlive
+  class Resource {
+   public:
+    virtual ~Resource() = default;
+  };
+
+  /// \brief Keeps a resource alive until all executor threads have terminated

Review comment:
       Nit, sorry :-)
   ```suggestion
     /// \brief Keep a resource alive until all executor threads have terminated
   ```

##########
File path: cpp/src/arrow/util/thread_pool.h
##########
@@ -201,6 +201,27 @@ class ARROW_EXPORT Executor {
   // Executor. Returns false if this Executor does not support this property.
   virtual bool OwnsThisThread() { return false; }
 
+  /// \brief An interface to represent something with a custom destructor
+  ///
+  /// \see KeepAlive
+  class Resource {

Review comment:
       Hmm... does this need ARROW_EXPORT?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#issuecomment-1049117504


   I've made an attempt at addressing the PR feedback in preparation for the OT release.  This should be pretty close to what we will use (I think only two lines will change once we can use their new API).
   
   The unit test is a little unreliable.  If I run it on repeat I can trigger the failure much more reliably than running it a single time.  I played around with a few different approaches but couldn't come up with a variation that failed very reliably.  I'm not sure if we want to leave it in or just get rid of it and rely on catching this via TSAN were there to be a regression.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805186057



##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {
+ public:
+  OtHandle(opentelemetry::nostd::shared_ptr<opentelemetry::context::RuntimeContextStorage>

Review comment:
       Should this be a private constructor?

##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Right, this should generally be left to the application. For us, I suppose that's technically each individual test; that may get tedious, though.

##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Maybe we should file an upstream issue? If we could access the upstream shared_ptr, we could keep references to it ourselves and be independent of what the application wants to do: https://github.com/open-telemetry/opentelemetry-cpp/blob/3a3bf25289901079534b1cabe14e9c4fb3b35968/api/include/opentelemetry/context/runtime_context.h#L154
   
   A brief reproduction using a thread and TSAN might be enough to convince them.

##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {

Review comment:
       Can we add a brief docstring for the class?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805132924



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);
+  }
+  nostd::shared_ptr<otel::context::RuntimeContextStorage> storage_;
+};
+
 }  // namespace
 
+static StorageSingleton storage_singleton;

Review comment:
       Nit: what about moving the definition inside Attach()?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r806197721



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       open-telemetry/opentelemetry-cpp#1211




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r810638389



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Looks like the implemented it indeed. Maybe we can bump the bundled OpenTelemetry, or wait for a new release.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r813225912



##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {
+ public:
+  OtHandle(opentelemetry::nostd::shared_ptr<opentelemetry::context::RuntimeContextStorage>

Review comment:
       Yes it should have been.  The implementation has changed however, all tracing_internal changes are in the anonymous namespace now.

##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {

Review comment:
       No longer relevant.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#issuecomment-1049937344


   We can keep the test, I think. Even if we plan to catch regressions via TSAN, presumably this will trigger it more reliably?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r813221511



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);
+  }
+  nostd::shared_ptr<otel::context::RuntimeContextStorage> storage_;
+};
+
 }  // namespace
 
+static StorageSingleton storage_singleton;

Review comment:
       I moved the static definition inside the method (although the method names have changed a little)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r806154621



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Ah, good points.  I'll file an upstream issue and put this on hold for now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] cyb70289 commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
cyb70289 commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805132951



##########
File path: docker-compose.yml
##########
@@ -773,8 +773,6 @@ services:
     shm_size: *shm-size
     environment:
       <<: *ccache
-      # Bundled build of OpenTelemetry needs a git client
-      ARROW_WITH_OPENTELEMETRY: "OFF"

Review comment:
       Intended or spurious change?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#issuecomment-1075379259


   What is the status of this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805146753



##########
File path: cpp/src/arrow/util/thread_pool.cc
##########
@@ -144,6 +145,13 @@ struct ThreadPool::State {
 // after the ThreadPool is destroyed.
 static void WorkerLoop(std::shared_ptr<ThreadPool::State> state,
                        std::list<std::thread>::iterator it) {
+#ifdef ARROW_WITH_OPENTELEMETRY
+  // The main thread may exit and start shutting down static state before
+  // this thread ends.  This means that any calls to OpenTelemetry's static
+  // state will potentially access freed memory.  By grabbing a handle we
+  // keep OpenTelemetry's static state alive until this thread ends.
+  internal::tracing::OtHandle handle = internal::tracing::Attach();

Review comment:
       This would probably deserve a more general approach. For example:
   ```c++
   class Executor {
    public:
     class Resource {
      public:
       virtual ~Resource();
     };
     // All live executors should keep this object alive
     static void KeepAlive(std::shared_ptr<Resource>);
   };
   ```
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#issuecomment-1036969736


   https://issues.apache.org/jira/browse/ARROW-15604


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r812335164



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Looks like they are releasing monthly. Let's just disable OT in our CI builds until the release is out and we can address this at that point.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r813224851



##########
File path: cpp/src/arrow/util/thread_pool.cc
##########
@@ -144,6 +145,13 @@ struct ThreadPool::State {
 // after the ThreadPool is destroyed.
 static void WorkerLoop(std::shared_ptr<ThreadPool::State> state,
                        std::list<std::thread>::iterator it) {
+#ifdef ARROW_WITH_OPENTELEMETRY
+  // The main thread may exit and start shutting down static state before
+  // this thread ends.  This means that any calls to OpenTelemetry's static
+  // state will potentially access freed memory.  By grabbing a handle we
+  // keep OpenTelemetry's static state alive until this thread ends.
+  internal::tracing::OtHandle handle = internal::tracing::Attach();

Review comment:
       I switched to what I think you were intending.  This basically inverts the dependency so that OT has to call keepalive on the thread pools instead of the other way around.  The implementation's KeepAlive method was not static though.  Can you double check that I addressed your concerns here?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805146224



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Is it ok to do this in a library? Applications or libraries using Arrow C++ may want to use their own context storage and will be surprised that Arrow overrides it. @lidavidm 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805186057



##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {
+ public:
+  OtHandle(opentelemetry::nostd::shared_ptr<opentelemetry::context::RuntimeContextStorage>

Review comment:
       Should this be a private constructor?

##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Right, this should generally be left to the application. For us, I suppose that's technically each individual test; that may get tedious, though.

##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Maybe we should file an upstream issue? If we could access the upstream shared_ptr, we could keep references to it ourselves and be independent of what the application wants to do: https://github.com/open-telemetry/opentelemetry-cpp/blob/3a3bf25289901079534b1cabe14e9c4fb3b35968/api/include/opentelemetry/context/runtime_context.h#L154
   
   A brief reproduction using a thread and TSAN might be enough to convince them.

##########
File path: cpp/src/arrow/util/tracing_internal.h
##########
@@ -101,6 +101,17 @@ AsyncGenerator<T> WrapAsyncGenerator(AsyncGenerator<T> wrapped,
   };
 }
 
+class OtHandle {

Review comment:
       Can we add a brief docstring for the class?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r805146753



##########
File path: cpp/src/arrow/util/thread_pool.cc
##########
@@ -144,6 +145,13 @@ struct ThreadPool::State {
 // after the ThreadPool is destroyed.
 static void WorkerLoop(std::shared_ptr<ThreadPool::State> state,
                        std::list<std::thread>::iterator it) {
+#ifdef ARROW_WITH_OPENTELEMETRY
+  // The main thread may exit and start shutting down static state before
+  // this thread ends.  This means that any calls to OpenTelemetry's static
+  // state will potentially access freed memory.  By grabbing a handle we
+  // keep OpenTelemetry's static state alive until this thread ends.
+  internal::tracing::OtHandle handle = internal::tracing::Attach();

Review comment:
       This would probably deserve a more general approach. For example:
   ```c++
   class Executor {
    public:
     class Resource {
      public:
       virtual ~Resource();
     };
     // All live executors should keep this object alive
     static void KeepAlive(Resource);
   };
   ```
   (is it possible to mandate that `Resource` has to be copy-constructible?)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r806234546



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Thanks a lot for filing this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r806234546



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Thanks a lot for filing this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r806154621



##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       Ah, good points.  I'll file an upstream issue and put this on hold for now.

##########
File path: cpp/src/arrow/util/tracing_internal.cc
##########
@@ -176,8 +176,23 @@ otel::trace::TracerProvider* GetTracerProvider() {
   static nostd::shared_ptr<otel::trace::TracerProvider> provider = InitializeTracing();
   return provider.get();
 }
+
+struct StorageSingleton {
+  StorageSingleton() : storage_(otel::context::GetDefaultStorage()) {
+    otel::context::RuntimeContext::SetRuntimeContextStorage(storage_);

Review comment:
       open-telemetry/opentelemetry-cpp#1211




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #12408: ARROW-15604: [C++][CI] Sporadic ThreadSanitizer failure with OpenTracing

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #12408:
URL: https://github.com/apache/arrow/pull/12408#discussion_r813223235



##########
File path: docker-compose.yml
##########
@@ -773,8 +773,6 @@ services:
     shm_size: *shm-size
     environment:
       <<: *ccache
-      # Bundled build of OpenTelemetry needs a git client
-      ARROW_WITH_OPENTELEMETRY: "OFF"

Review comment:
       Good catch.  This was not intentional.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org