You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/01 20:25:08 UTC

[GitHub] [arrow] pitrou opened a new pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

pitrou opened a new pull request #10644:
URL: https://github.com/apache/arrow/pull/10644


   Followup to https://github.com/apache/arrow/pull/10632


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10644:
URL: https://github.com/apache/arrow/pull/10644


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#issuecomment-872527986


   https://issues.apache.org/jira/browse/ARROW-13244


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
bkietz commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663119109



##########
File path: cpp/src/arrow/util/io_util.cc
##########
@@ -1734,5 +1665,21 @@ int64_t GetRandomSeed() {
   return static_cast<int64_t>(seed_gen());
 }
 
+uint64_t GetThreadId() {
+  uint64_t equiv{0};
+  // std::thread::id is trivially copyable as per C++ spec,
+  // so type punning as a uint64_t should work
+  static_assert(sizeof(std::thread::id) <= sizeof(uint64_t),
+                "std::thread::id can't fit into uint64_t");
+  const auto tid = std::this_thread::get_id();
+  memcpy(&equiv, reinterpret_cast<const void*>(&tid), sizeof(tid));
+  return equiv;
+}
+
+uint64_t GetOptionalThreadId() {
+  auto tid = GetThreadId();
+  return (tid == 0) ? tid - 1 : tid;

Review comment:
       If we're going to use -1 in multiple places like this, let's define it as a global constant:
   ```suggestion
     return (tid == 0) ? kUnlikelyThreadId : tid;
   ```

##########
File path: cpp/src/arrow/util/io_util.h
##########
@@ -399,5 +338,12 @@ Status SendSignalToThread(int signum, uint64_t thread_id);
 ARROW_EXPORT
 int64_t GetRandomSeed();
 
+/// \brief Get the current thread id

Review comment:
       It seems odd to put this in io_util.h and not thread_pool.h

##########
File path: cpp/src/arrow/util/io_util.cc
##########
@@ -1734,5 +1665,21 @@ int64_t GetRandomSeed() {
   return static_cast<int64_t>(seed_gen());
 }
 
+uint64_t GetThreadId() {
+  uint64_t equiv{0};
+  // std::thread::id is trivially copyable as per C++ spec,
+  // so type punning as a uint64_t should work
+  static_assert(sizeof(std::thread::id) <= sizeof(uint64_t),
+                "std::thread::id can't fit into uint64_t");
+  const auto tid = std::this_thread::get_id();
+  memcpy(&equiv, reinterpret_cast<const void*>(&tid), sizeof(tid));
+  return equiv;
+}
+
+uint64_t GetOptionalThreadId() {
+  auto tid = GetThreadId();
+  return (tid == 0) ? tid - 1 : tid;

Review comment:
       Actually I'm not sure I understand the intent of this function and it's not exported

##########
File path: cpp/src/arrow/util/io_util.h
##########
@@ -399,5 +338,12 @@ Status SendSignalToThread(int signum, uint64_t thread_id);
 ARROW_EXPORT
 int64_t GetRandomSeed();
 
+/// \brief Get the current thread id
+///
+/// In addition to having the same properties as std::thread, the returned value
+/// is a regular integer value, which is more convenient than an opaque type.
+ARROW_EXPORT
+uint64_t GetThreadId();
+

Review comment:
       ```suggestion
   ARROW_EXPORT
   uint64_t GetOptionalThreadId();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663143143



##########
File path: cpp/src/arrow/util/io_util.cc
##########
@@ -1734,5 +1665,21 @@ int64_t GetRandomSeed() {
   return static_cast<int64_t>(seed_gen());
 }
 
+uint64_t GetThreadId() {
+  uint64_t equiv{0};
+  // std::thread::id is trivially copyable as per C++ spec,
+  // so type punning as a uint64_t should work
+  static_assert(sizeof(std::thread::id) <= sizeof(uint64_t),
+                "std::thread::id can't fit into uint64_t");
+  const auto tid = std::this_thread::get_id();
+  memcpy(&equiv, reinterpret_cast<const void*>(&tid), sizeof(tid));
+  return equiv;
+}
+
+uint64_t GetOptionalThreadId() {
+  auto tid = GetThreadId();
+  return (tid == 0) ? tid - 1 : tid;

Review comment:
       Hmm, that's a remnant from a previous approach, sorry. I should simply remove it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#issuecomment-874153432


   Rebased, will merge if green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #10644:
URL: https://github.com/apache/arrow/pull/10644


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663968454



##########
File path: cpp/src/arrow/util/async_generator.h
##########
@@ -1247,6 +1248,8 @@ class BackgroundGenerator {
   }
 
  protected:
+  static constexpr uint64_t kUnlikelyThreadId{std::numeric_limits<uint64_t>::max()};

Review comment:
       Hmm, I hadn't thought of that. But it seems simpler to use a hardcoded number like this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663968454



##########
File path: cpp/src/arrow/util/async_generator.h
##########
@@ -1247,6 +1248,8 @@ class BackgroundGenerator {
   }
 
  protected:
+  static constexpr uint64_t kUnlikelyThreadId{std::numeric_limits<uint64_t>::max()};

Review comment:
       Hmm, I hadn't thought of that. But it seems simpler to use a hardcoded number like this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#issuecomment-874153432


   Rebased, will merge if green.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663142972



##########
File path: cpp/src/arrow/util/io_util.h
##########
@@ -399,5 +338,12 @@ Status SendSignalToThread(int signum, uint64_t thread_id);
 ARROW_EXPORT
 int64_t GetRandomSeed();
 
+/// \brief Get the current thread id

Review comment:
       It's not thread pool-related, it could be useful in other contexts.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] westonpace commented on a change in pull request #10644: ARROW-13244: [C++] Add facility to get current thread id as uint64

Posted by GitBox <gi...@apache.org>.
westonpace commented on a change in pull request #10644:
URL: https://github.com/apache/arrow/pull/10644#discussion_r663227068



##########
File path: cpp/src/arrow/util/async_generator.h
##########
@@ -1247,6 +1248,8 @@ class BackgroundGenerator {
   }
 
  protected:
+  static constexpr uint64_t kUnlikelyThreadId{std::numeric_limits<uint64_t>::max()};

Review comment:
       I couldn't see any reason this wouldn't work but why not just use the `uint64_t` version of `std::thread::id()`?  Is it to allow this to be `constexpr`?

##########
File path: cpp/src/arrow/util/io_util.cc
##########
@@ -1734,5 +1665,21 @@ int64_t GetRandomSeed() {
   return static_cast<int64_t>(seed_gen());
 }
 
+uint64_t GetThreadId() {

Review comment:
       So I did a bit of research on this out of curiosity.  TL;DR: this should be safe enough.
   
   `std::thread::id` is often a number, sometimes it is a pointer to a struct, and sometimes it is a struct (https://github.com/nginx/unit/blob/master/src/nxt_thread_id.h attempts to catalogue all the possible methods).  I couldn't find any example where it is larger than 8 bytes.  This approach has some precedent. OpenSSL originally required thread_id be coerced into `unsigned long` and now it has an implementation where it is either `unsigned long` or a pointer.  However, I think that had to be done because on some systems a pointer may be larger than `unsigned long`.  So I think `uint64_t` is safe since a pointer should never be larger than 8 bytes.
   
   One potential concern is that technically `==` could be overloaded so that two threads could in theory have the same `uint64_t` representation but not be equal.  One case is when a thread id is a pointer and that pointer is reused.  However, the spec for std::thread states `Once a thread has finished, the value of std::thread::id may be reused by another thread` so this is already a given.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org