Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/03 02:31:06 UTC

[GitHub] [arrow] wesm edited a comment on pull request #7337: WIP: ARROW-4633: [Python] importing pyarrow spawns threads at import time

wesm edited a comment on pull request #7337:
URL: https://github.com/apache/arrow/pull/7337#issuecomment-637915510


   @pitrou I'm a bit out of my depth here. I tried a ton of things to figure out where and when the threads are being created.
   
   Loading the C++ shared libraries seems to create one thread:
   
   ```
   >>> import ctypes
   >>> ctypes.CDLL('/home/wesm/local/lib/libarrow.so')
   [New Thread 0x7ffff27ff700 (LWP 16292)]
   <CDLL '/home/wesm/local/lib/libarrow.so', handle 5555559a24c0 at 0x7ffff66e5610>
   >>> ctypes.CDLL('/home/wesm/local/lib/libarrow_python.so')
   <CDLL '/home/wesm/local/lib/libarrow_python.so', handle 5555559baa80 at 0x7ffff66e55d0>
   ```
   
   but then importing pyarrow spawns a bunch of other threads (this machine has 16 cores with HT enabled):
   
   ```
   >>> import pyarrow
   [New Thread 0x7ffff0c37700 (LWP 16310)]
   [New Thread 0x7fffea446700 (LWP 16311)]
   [New Thread 0x7fffe7c45700 (LWP 16312)]
   [New Thread 0x7fffe5444700 (LWP 16313)]
   [New Thread 0x7fffe0c43700 (LWP 16314)]
   [New Thread 0x7fffe0442700 (LWP 16315)]
   [New Thread 0x7fffddc41700 (LWP 16316)]
   [New Thread 0x7fffdb440700 (LWP 16317)]
   [New Thread 0x7fffd6c3f700 (LWP 16318)]
   [New Thread 0x7fffd443e700 (LWP 16319)]
   [New Thread 0x7fffd1c3d700 (LWP 16320)]
   [New Thread 0x7fffcf43c700 (LWP 16321)]
   [New Thread 0x7fffccc3b700 (LWP 16322)]
   [New Thread 0x7fffca43a700 (LWP 16323)]
   [New Thread 0x7fffc7c39700 (LWP 16324)]
   >>> import psutil
   >>> psutil.Process().num_threads()
   17
   ```
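
The same measurement can be sketched without a debugger. This is a minimal sketch, assuming Linux, where each entry under `/proc/self/task` is one OS thread. The helper names `count_os_threads` and `threads_added_by` are hypothetical, not part of pyarrow or psutil. On Linux the count should agree with `psutil.Process().num_threads()`; in the session above that is the main thread, the one created when `libarrow.so` was loaded, and the 15 reported during `import pyarrow`, for 17 in total.

```python
import importlib
import os


def count_os_threads():
    """Return the number of OS-level threads in this process (Linux only)."""
    # Assumption: /proc is mounted. Each entry under /proc/self/task is one
    # kernel thread (LWP), so native threads started by C++ code are counted
    # too, unlike threading.active_count(), which only sees Python threads.
    return len(os.listdir("/proc/self/task"))


def threads_added_by(action):
    """Call action() and return how many more OS threads exist afterwards."""
    before = count_os_threads()
    action()
    return count_os_threads() - before


if __name__ == "__main__":
    try:
        delta = threads_added_by(lambda: importlib.import_module("pyarrow"))
        print("threads added by 'import pyarrow':", delta)
    except ImportError:
        print("pyarrow is not installed")
```

Run it in a fresh interpreter: a module that is already imported will not be initialized again, so the reported difference would be zero.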


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org