You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2020/01/20 08:49:53 UTC

[GitHub] [incubator-tvm] FrozenGene opened a new pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

FrozenGene opened a new pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747
 
 
   Follow up #4344 . Previous PR solves OpenCV + TVM slow performance and AutoTVM Python CPU affinity problem. However, previous PR has two problem:
   
   - doesn't solve ARM  BIG.LITTLE heterogeneous multicores.
      For example, we have 2xA72 + 4xA53, previous pr will add all cpu to CPU SET even we call `config_thread_pool` to restrict TVM runs on 4 little cores. So, TVM_MASTER_THREAD could run A72 big core. This is not we want.
      Solution: we should restrict master in the little cpus or big cpus according to users's setting.
   - Not all Linux variant OS has `pthread_atfork`. For example, Alibaba's AliOS doesn't have this api.
      Solution: unified android and linux's way. We will call `SetFullCpuAffinity` if we don't have `TVM_BIND_MASTER_THREAD`.
   
   @tqchen @yidawang @vinx13 @ajtulloch Could you help to review it? Thanks.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] u99127 commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
u99127 commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576192552
 
 
   > 
   > 
   > Follow up #4344 . Previous PR solves OpenCV + TVM slow performance and AutoTVM Python CPU affinity problem. However, previous PR has two problem:
   > 
   >     * doesn't solve ARM  BIG.LITTLE heterogeneous multicores.
   >       For example, we have 2xA72 + 4xA53, previous pr will add all cpu to CPU SET even we call `config_thread_pool` to restrict TVM runs on 4 little cores. So, TVM_MASTER_THREAD could run A72 big core. This is not we want.
   >       Solution: we should restrict master in the little cpus or big cpus according to users's setting.
   
   This approach needs to be added as comments in the code explaining how this is expected to work. Further I suspect it is TVM_BIND_MASTER_THREAD that you refer to above ?  What is the role of the TVM master thread ? Is it another thread that does compute or is it a "controlling" thread ? 
   
   > 
   >     * Not all Linux variant OS has `pthread_atfork`. For example, Alibaba's AliOS doesn't have this api.
   
   pthread_atfork IIRC originally comes from the posix group from Issue 5 as documented [here](https://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html). Checking the other libcs suggest that all other mainstream Linux runtimes like glibc, musl and bionic support this and thus are we working around a limitation of AliOS here ?
   
   
   >       Solution: unified android and linux's way. We will call `SetFullCpuAffinity` if we don't have `TVM_BIND_MASTER_THREAD`.
   > 
   > 
   > @tqchen @yidawang @vinx13 @ajtulloch Could you help to review it? Thanks.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-581216205
 
 
   @yidawang Would you like to review it again? Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-578388738
 
 
   > To clarify your meaning, as hyper thread will make the big_count_ is twice times compared with physical cores. For example, phisical cores are 4, hyper thread will make the concurrency number be 8 (big_count_ is 8 too). So for hyper thread, we wish master thread could run the physical cores (id: 0-3), not cpu id 4-7.&nbsp;
   > […](#)
   > ------------------ Original ------------------ From: Yida Wang <notifications@github.com&gt; Date: Sat,Jan 25,2020 3:21 AM To: apache/incubator-tvm <incubator-tvm@noreply.github.com&gt; Cc: Zhao Wu <zhaowu@apache.org&gt;, Author <author@noreply.github.com&gt; Subject: Re: [apache/incubator-tvm] [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores (#4747) FYI, we just notice that if we don't bind the master thread, which is the default behavior, the master thread may cause the contention issue as it is allocated to a seemingly idle logical thread under hyperthreading. So I would propose to at least have the master thread bound to the intended running cores by default. @anijain2305 — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.
   
   Yes

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r373950412
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,34 +133,42 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
-      // if we set TVM_BIND_MASTER_THREAD to be 1, we will bind master thread
-      // to core 0.
-      const char* bind_master_thread = getenv("TVM_BIND_MASTER_THREAD");
-      if (bind_master_thread && atoi(bind_master_thread) == 1) {
-        cpu_set_t cpuset;
-        CPU_ZERO(&cpuset);
-        if (reverse) {
-          CPU_SET(sorted_order_[sorted_order_.size() - 1], &cpuset);
-        } else {
-          CPU_SET(sorted_order_[0], &cpuset);
-        }
-        pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
-      }
-      pthread_atfork(nullptr, nullptr, ThreadGroup::Impl::SetFullCpuAffinity);
-#endif
+      // Master thread will have free migration on needed cores.
+      // See the comment inside SetFullCpuAffinity function to get more detail.
+      SetFullCpuAffinity(reverse);
     }
 #endif
   }
 
-  static void SetFullCpuAffinity() {
+  void SetFullCpuAffinity(bool reverse) {
 
 Review comment:
   This function is to set master thread to full cpu affinity, the function name should indicate that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r368744581
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,9 +133,7 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
+#if defined(__linux__) || defined(__ANDROID__)
 
 Review comment:
   Thanks. Will fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r368705761
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,9 +133,7 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
+#if defined(__linux__) || defined(__ANDROID__)
 
 Review comment:
   In the current setting, this is redundant given line 115. Furthermore, in this case, I wouldn't suggest combining the two cases together.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r373962677
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,34 +133,42 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
-      // if we set TVM_BIND_MASTER_THREAD to be 1, we will bind master thread
-      // to core 0.
-      const char* bind_master_thread = getenv("TVM_BIND_MASTER_THREAD");
-      if (bind_master_thread && atoi(bind_master_thread) == 1) {
-        cpu_set_t cpuset;
-        CPU_ZERO(&cpuset);
-        if (reverse) {
-          CPU_SET(sorted_order_[sorted_order_.size() - 1], &cpuset);
-        } else {
-          CPU_SET(sorted_order_[0], &cpuset);
-        }
-        pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
-      }
-      pthread_atfork(nullptr, nullptr, ThreadGroup::Impl::SetFullCpuAffinity);
-#endif
+      // Master thread will have free migration on needed cores.
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r373962704
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,34 +133,42 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
-      // if we set TVM_BIND_MASTER_THREAD to be 1, we will bind master thread
-      // to core 0.
-      const char* bind_master_thread = getenv("TVM_BIND_MASTER_THREAD");
-      if (bind_master_thread && atoi(bind_master_thread) == 1) {
-        cpu_set_t cpuset;
-        CPU_ZERO(&cpuset);
-        if (reverse) {
-          CPU_SET(sorted_order_[sorted_order_.size() - 1], &cpuset);
-        } else {
-          CPU_SET(sorted_order_[0], &cpuset);
-        }
-        pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
-      }
-      pthread_atfork(nullptr, nullptr, ThreadGroup::Impl::SetFullCpuAffinity);
-#endif
+      // Master thread will have free migration on needed cores.
+      // See the comment inside SetFullCpuAffinity function to get more detail.
+      SetFullCpuAffinity(reverse);
     }
 #endif
   }
 
-  static void SetFullCpuAffinity() {
+  void SetFullCpuAffinity(bool reverse) {
 
 Review comment:
   Done

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576368620
 
 
   Overall I like the idea of binding master thread to the set of cores needed rather than a single one, whichi means free migration of master thread, would be great if we can document this idea in the code, let others to check.
   
   Given that pthread_atfork is indeed available in most cases, we should put that as a separate item(perhaps using macro to detect the existence) if we still decides to use it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576368620
 
 
   Overall I like the idea of binding master thread to the set of cores needed rather than a single one, whichi means free migration of master thread, would be great if we can document this idea in the code, let others to check

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-578374172
 
 
   To clarify your meaning, as hyper thread will make the big_count_ is twice times compared with physical cores. For example, phisical cores are 4, hyper thread will make the concurrency number be 8 (big_count_ is 8 too). So for hyper thread, we wish master thread could run the physical cores (id: 0-3), not cpu id 4-7.&nbsp;
   
   
   ------------------ Original ------------------
   From: Yida Wang <notifications@github.com&gt;
   Date: Sat,Jan 25,2020 3:21 AM
   To: apache/incubator-tvm <incubator-tvm@noreply.github.com&gt;
   Cc: Zhao Wu <zhaowu@apache.org&gt;, Author <author@noreply.github.com&gt;
   Subject: Re: [apache/incubator-tvm] [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores (#4747)
   
   
   
   
   FYI, we just notice that if we don't bind the master thread, which is the default behavior, the master thread may cause the contention issue as it is allocated to a seemingly idle logical thread under hyperthreading. So I would propose to at least have the master thread bound to the intended running cores by default. @anijain2305
    
   —
   You are receiving this because you authored the thread.
   Reply to this email directly, view it on GitHub, or unsubscribe.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r369340807
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -148,19 +146,36 @@ class ThreadGroup::Impl {
           CPU_SET(sorted_order_[0], &cpuset);
         }
         pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+      } else {
 
 Review comment:
   Good. I will fix it when I back to work ASAP.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r369162396
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -148,19 +146,36 @@ class ThreadGroup::Impl {
           CPU_SET(sorted_order_[0], &cpuset);
         }
         pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+      } else {
 
 Review comment:
   Sounds good. And I am fine with enabling the master thread to migrate across cores.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r368745458
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -148,19 +146,36 @@ class ThreadGroup::Impl {
           CPU_SET(sorted_order_[0], &cpuset);
         }
         pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+      } else {
 
 Review comment:
   Hmm...How about call SetFullCpuAffinity(reverse) directly like Android? I think make master thread could have free migration on needed cores is better than bind just one specific core. Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r373950272
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -133,34 +133,42 @@ class ThreadGroup::Impl {
 #endif
     }
     if (exclude_worker0) {  // master thread run task
-#if defined(__ANDROID__)
-      SetFullCpuAffinity();
-#else
-      // if we set TVM_BIND_MASTER_THREAD to be 1, we will bind master thread
-      // to core 0.
-      const char* bind_master_thread = getenv("TVM_BIND_MASTER_THREAD");
-      if (bind_master_thread && atoi(bind_master_thread) == 1) {
-        cpu_set_t cpuset;
-        CPU_ZERO(&cpuset);
-        if (reverse) {
-          CPU_SET(sorted_order_[sorted_order_.size() - 1], &cpuset);
-        } else {
-          CPU_SET(sorted_order_[0], &cpuset);
-        }
-        pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
-      }
-      pthread_atfork(nullptr, nullptr, ThreadGroup::Impl::SetFullCpuAffinity);
-#endif
+      // Master thread will have free migration on needed cores.
 
 Review comment:
   I would suggest add one sentence in the comments: Typically, the OS will schedule the master thread to run at core 0, which is idle otherwise, when other works are running.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
tqchen edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576368620
 
 
   Overall I like the idea of binding master thread to the set of cores needed rather than a single one, whichi means free migration of master thread, would be great if we can document this idea in the code, let others to check.
   
   Given that pthread_atfork is indeed available in most cases, we should put that as a separate item(perhaps using macro to detect the existence) if we still decides to use it. 
   
   For example, we can keep the existing behavior of binding master thread on linux, but use the new behavior on android

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-578264862
 
 
   FYI, we just notice that if we don't bind the master thread, which is the default behavior, the master thread may cause the contention issue as it is allocated to a seemingly idle logical thread under hyperthreading. So I would propose to at least have the master thread bound to the intended running cores by default. @anijain2305 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
yidawang commented on a change in pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#discussion_r368708893
 
 

 ##########
 File path: src/runtime/threading_backend.cc
 ##########
 @@ -148,19 +146,36 @@ class ThreadGroup::Impl {
           CPU_SET(sorted_order_[0], &cpuset);
         }
         pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
+      } else {
 
 Review comment:
   Under current logic, if you set `TVM_BIND_MASTER_THREAD=1`, the issue fixed in #4344 will come back. I would suggest keeping the current way, but making sure binding it to the right big/small core that user sets.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576376485
 
 
   > Overall I like the idea of binding master thread to the set of cores needed rather than a single one, whichi means free migration of master thread, would be great if we can document this idea in the code, let others to check.
   
   Newest code has provided one detail example to show which cores the master thread could run in the comment. Could check it and to see whether to have anything should supply. 
   
   > Given that pthread_atfork is indeed available in most cases, we should put that as a separate item(perhaps using macro to detect the existence) if we still decides to use it.
   > For example, we can keep the existing behavior of binding master thread on linux, but use the new behavior on android
   
   pthread_atfork is not a must for solving this problem in fact. The key is we should let master thread could do free migration on available cores. Previous pr wants to do but provide cpu cores set wrong as described. Existing behavior of linux is still has bug if we deploy it on one arm (has big little cpus) linux embed board. So we should resolve linux / android together in my opinion. :-)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576208993
 
 
   > This approach needs to be added as comments in the code explaining how this is expected to work. Further I suspect it is TVM_BIND_MASTER_THREAD that you refer to above ? What is the role of the TVM master thread ? Is it another thread that does compute or is it a "controlling" thread ?
   
   TVM_BIND_MASTER_THREAD is related with `exclude_worker0` , the default value of `exclude_worker0` is false. https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/threading_backend.h#L50. As the comment said, if it is false, we will make main thread compute worker0 and worker0 will not be launched in a new thread. Normally main thread doesn't have much to do. If TVM_BIND_MASTER_THREAD env var is set, we will bind the main thread be cpu core of `sorted_order_[0]` or `sorted_order_[-1]` (if we are in the little mode)
   
   So previous solution is let us doesn't bind the master thread be one specific core. so that our master thread could run any cores. But previous solution omit one situation that ARM CPU has BIG.LITTLE, we can not add all cpus to CPU SET. It will result in main thread could run big core (even users specify tvm run little cores). Also could run little cores (even users specify tvm run big cores).
   
   This PR solves this by restrict tvm master thread's cpu set. That is to say: if we are in the little mode, we should restrict master thread run only little cores. if we are in the big mode, we should restrict master thread run only big cores. Note: This could work well on x86 cpu too. Because x86 cpu doesn't have BIG.Little. our code implementation will use big_count_ to calculate the cpu cores by default. So master thread could run any cores.
   
   I will add comment to explain a bit in code.
   
   > pthread_atfork IIRC originally comes from the posix group from Issue 5 as documented here. Checking the other libcs suggest that all other mainstream Linux runtimes like glibc, musl and bionic support this and thus are we working around a limitation of AliOS here ?
   
   AliOS like some versions of Android, doesn't provide pthread_atfork. However, the key to solve this problem is doesn't restrict master thread bind to one specific core. So PR's solution could work fine on Linux / Android too. pthread_atfork is not a key or must.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576208993
 
 
   > This approach needs to be added as comments in the code explaining how this is expected to work. Further I suspect it is TVM_BIND_MASTER_THREAD that you refer to above ? What is the role of the TVM master thread ? Is it another thread that does compute or is it a "controlling" thread ?
   
   TVM_BIND_MASTER_THREAD is related with `exclude_worker0` , the default value of `exclude_worker0` is false. https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/threading_backend.h#L50. As the comment said, if it is false, we will make main thread compute worker0 and worker0 will not be launched in a new thread. Normally main thread doesn't have much to do. If TVM_BIND_MASTER_THREAD env var is set, we will bind the main thread be sored_order_[0] or sored_order_[-1] (if we are in the little mode)
   
   So previous solution is let us doesn't bind the master thread be one specific core. so that our master thread could run any cores. But previous solution omit one situation that ARM CPU has BIG.LITTLE, we can not add all cpus to CPU SET. It will result in main thread could run big core (even users specify tvm run little cores). Also could run little cores (even users specify tvm run big cores).
   
   This PR solves this by restrict tvm master thread's cpu set. That is to say: if we are in the little mode, we should restrict master thread run only little cores. if we are in the big mode, we should restrict master thread run only big cores. Note: This could work well on x86 cpu too. Because x86 cpu doesn't have BIG.Little. our code implementation will use big_count_ to calculate the cpu cores by default. So master thread could run any cores.
   
   I will add comment to explain a bit in code.
   
   > pthread_atfork IIRC originally comes from the posix group from Issue 5 as documented here. Checking the other libcs suggest that all other mainstream Linux runtimes like glibc, musl and bionic support this and thus are we working around a limitation of AliOS here ?
   
   AliOS like some versions of Android, doesn't provide pthread_atfork. However, the key to solve this problem is doesn't restrict master thread bind to one specific core. So PR's solution could work fine on Linux / Android too. pthread_atfork is not a key or must.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen merged pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
tqchen merged pull request #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-578389093
 
 
   Thanks for clarification. I will do it together in next code update.------------------ Original ------------------From: Yida Wang <no...@github.com>Date: Sat,Jan 25,2020 4:44 PMTo: apache/incubator-tvm <in...@noreply.github.com>Cc: Zhao Wu <zh...@apache.org>, Author <au...@noreply.github.com>Subject: Re: [apache/incubator-tvm] [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores (#4747)
   To clarify your meaning, as hyper thread will make the big_count_ is twice times compared with physical cores. For example, phisical cores are 4, hyper thread will make the concurrency number be 8 (big_count_ is 8 too). So for hyper thread, we wish master thread could run the physical cores (id: 0-3), not cpu id 4-7. 
   …
   ------------------ Original ------------------ From: Yida Wang <no...@github.com> Date: Sat,Jan 25,2020 3:21 AM To: apache/incubator-tvm <in...@noreply.github.com> Cc: Zhao Wu <zh...@apache.org>, Author <au...@noreply.github.com> Subject: Re: [apache/incubator-tvm] [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores (#4747) FYI, we just notice that if we don't bind the master thread, which is the default behavior, the master thread may cause the contention issue as it is allocated to a seemingly idle logical thread under hyperthreading. So I would propose to at least have the master thread bound to the intended running cores by default. @anijain2305 — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.
   
   Yes
   
   —You are receiving this because you authored the thread.Reply to this email directly, view it on GitHub, or unsubscribe.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] FrozenGene edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
FrozenGene edited a comment on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-576208993
 
 
   > This approach needs to be added as comments in the code explaining how this is expected to work. Further I suspect it is TVM_BIND_MASTER_THREAD that you refer to above ? What is the role of the TVM master thread ? Is it another thread that does compute or is it a "controlling" thread ?
   
   TVM_BIND_MASTER_THREAD is related with `exclude_worker0` , the default value of `exclude_worker0` is false. https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/threading_backend.h#L50. As the comment said, if it is false, we will make main thread compute worker0 and worker0 will not be launched in a new thread. Normally main thread doesn't have much to do. If TVM_BIND_MASTER_THREAD env var is set, we will bind the main thread be cpu core of `sorted_order_[0]` or `sorted_order_[-1]` (if we are in the little mode)
   
   So previous solution is to let us not bind the master thread to be one specific core so that our master thread could run any cores. But previous solution omit one situation that ARM CPU has BIG.LITTLE, we can not add all cpus to CPU SET. It will result in main thread could run big core (even users specify tvm run little cores). Also could run little cores (even users specify tvm run big cores).
   
   This PR solves this by restrict tvm master thread's cpu set. That is to say: if we are in the little mode, we should restrict master thread run only little cores. if we are in the big mode, we should restrict master thread run only big cores. Note: This could work well on x86 cpu too. Because x86 cpu doesn't have BIG.Little. our code implementation will use big_count_ to calculate the cpu cores by default. So master thread could run any cores.
   
   I will add comment to explain a bit in code.
   
   > pthread_atfork IIRC originally comes from the posix group from Issue 5 as documented here. Checking the other libcs suggest that all other mainstream Linux runtimes like glibc, musl and bionic support this and thus are we working around a limitation of AliOS here ?
   
   AliOS like some versions of Android, doesn't provide pthread_atfork. However, the key to solve this problem is doesn't restrict master thread bind to one specific core. So PR's solution could work fine on Linux / Android too. pthread_atfork is not a key or must.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-tvm] tqchen commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores

Posted by GitBox <gi...@apache.org>.
tqchen commented on issue #4747: [ThreadPool] Solve ARM BIG.LITTLE heterogeneous multicores
URL: https://github.com/apache/incubator-tvm/pull/4747#issuecomment-581536397
 
 
   Thanks @FrozenGene @yidawang 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services