Posted to issues@arrow.apache.org by "Alessandro Molina (Jira)" <ji...@apache.org> on 2021/04/22 13:53:00 UTC

[jira] [Created] (ARROW-12506) [Python] Improve modularity of pyarrow codebase to speedup compile time

Alessandro Molina created ARROW-12506:
-----------------------------------------

             Summary: [Python] Improve modularity of pyarrow codebase to speedup compile time
                 Key: ARROW-12506
                 URL: https://issues.apache.org/jira/browse/ARROW-12506
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Alessandro Molina


Some modules in pyarrow are fairly expensive to compile because they are built mostly by including other `pxi` / `pxd` files.

As a result, any change to one of those files forces a large module to be recompiled, which slows down development when experimenting. It is not uncommon for recompiling `pyarrow` after a change to take longer than recompiling `libarrow`.

It would be convenient to split those files into separate modules, each producing its own object file, so that the compiler can recompile smaller chunks at a time. A change would then no longer require recompiling the whole `lib.pyx`, only the module that contains the change.
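
As a rough illustration of the idea above, the following Python sketch models which modules would need recompiling after a file changes, under a monolithic layout and under a split layout. The file names, the `modules_to_rebuild` helper and both layouts are hypothetical and only meant to show the effect; they do not describe the actual pyarrow build.

```python
# Illustrative only: the file names and layouts below are hypothetical,
# not the real pyarrow source tree or build configuration.

def modules_to_rebuild(layout, changed_file):
    """Return the sorted names of modules whose inputs contain changed_file."""
    return sorted(
        module for module, inputs in layout.items() if changed_file in inputs
    )


# Monolithic layout: a single module textually includes every .pxi file,
# so touching any of them invalidates the whole module.
monolithic = {
    "lib": {"lib.pyx", "array.pxi", "table.pxi", "io.pxi"},
}

# Split layout: each piece is its own module with its own object file.
# Here "table" is assumed to depend on the declarations in "array.pxd".
split = {
    "array": {"array.pyx", "array.pxd"},
    "table": {"table.pyx", "table.pxd", "array.pxd"},
    "io": {"io.pyx", "io.pxd"},
}

print(modules_to_rebuild(monolithic, "table.pxi"))  # ['lib']
print(modules_to_rebuild(split, "table.pyx"))       # ['table']
print(modules_to_rebuild(split, "array.pxd"))       # ['array', 'table']
```

In this model, an implementation-only change stays confined to one small module, while a change to shared declarations still triggers a rebuild of every module that depends on them.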

The goal is to allow faster iteration on pyarrow by reducing the time spent waiting for Cython compilation after each change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)