Posted to dev@arrow.apache.org by Yibo Cai <yi...@arm.com> on 2020/01/02 05:55:19 UTC

[C++]: cmake: about parallel build of third party modules

I noticed that a fresh build always gets stuck compiling protobuf for a long time. We decided to build each third-party module with a single job [1], partly because different third-party modules are already built concurrently (protobuf is built concurrently with jemalloc, but protobuf itself is built with only one job).

The problem is that protobuf takes much longer to finish than the other modules, which blocks the whole build. I tried adding "-j4" manually when compiling protobuf [2], and it significantly reduced the time for a fresh build. My test is below.

test setting
------------
cpu: Intel E5-2650, 20 logical cores
memory: 64G

test command
------------
cmake -DCMAKE_BUILD_TYPE=Release -DARROW_BUILD_TESTS=ON -DARROW_COMPUTE=ON -DARROW_GANDIVA=ON ..
make -j8

test result
-----------
build protobuf with a single job (default): 10min 32sec
build protobuf with four jobs (add -j4):     6min 23sec

Build time dropped by about 40%, from 632s to 383s. An even bigger gap is observed on the Arm platform.
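
As a quick check of that figure, the arithmetic can be reproduced with a small shell sketch. The helper names to_seconds and reduction_percent are made up for illustration; only the two timings come from the test result.

```shell
# Convert "minutes seconds" into total seconds.
to_seconds() {
  echo $(( $1 * 60 + $2 ))
}

# Percentage reduction from a baseline time to a new time, one decimal place.
reduction_percent() {
  awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (base - new) * 100 / base }'
}

single=$(to_seconds 10 32)   # 632 seconds with a single protobuf job
four=$(to_seconds 6 23)      # 383 seconds with four protobuf jobs

echo "reduction: $(reduction_percent "$single" "$four")%"   # prints "reduction: 39.4%"
```

So the exact reduction is 39.4%, which rounds to the 40% quoted.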

I would suggest enabling a multi-job build for protobuf, maybe with half of the total jobs. Say, if we launch the Arrow build with "make -j8", we compile protobuf with "-j4". The code may be kind of ugly and inconsistent, but it deserves the effort IMHO. Comments?
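
A minimal sketch of the "half of the total jobs" rule, written as a shell helper. The name half_jobs is hypothetical, and how the top-level -j value would actually reach the protobuf build step inside ThirdpartyToolchain.cmake is not addressed here.

```shell
# Hypothetical helper: half of the top-level job count, but never less than 1.
half_jobs() {
  hj_total=$1
  hj_half=$(( hj_total / 2 ))
  if [ "$hj_half" -lt 1 ]; then
    hj_half=1
  fi
  echo "$hj_half"
}

half_jobs 8   # prints 4, matching the "make -j8" / "-j4" example
half_jobs 1   # prints 1, so a serial build stays serial
```

Clamping to 1 keeps the rule valid when the top-level build is itself run with a single job.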

[1] https://github.com/apache/arrow/pull/2779
[2] https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L1163

Re: [C++]: cmake: about parallel build of third party modules

Posted by Wes McKinney <we...@gmail.com>.
Hi Yibo

We’ve found flakiness with protobuf_ep when building in parallel with GNU
Make. I think there are comments about this in ThirdpartyToolchain.cmake,
and there are prior JIRA issues related to it.

For faster builds we are generally focused on using -GNinja, so I recommend
using the ninja build tool instead of make.
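
For reference, a Ninja-based equivalent of the commands from the original
post might look like the sketch below. It assumes ninja is installed and
reuses the same CMake options; only the generator and the build command
change.

```shell
# Configure with the Ninja generator instead of the default Makefile generator.
cmake -GNinja -DCMAKE_BUILD_TYPE=Release -DARROW_BUILD_TESTS=ON -DARROW_COMPUTE=ON -DARROW_GANDIVA=ON ..

# Build; ninja chooses a parallel job count on its own unless -j is given.
ninja
```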

Wes

On Fri, Jan 3, 2020 at 8:28 AM Yibo Cai <yi...@arm.com> wrote:

> Maybe just install protobuf dev package is the simplest solution.

Re: [C++]: cmake: about parallel build of third party modules

Posted by Yibo Cai <yi...@arm.com>.
Maybe just installing the protobuf dev package is the simplest solution.
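
A rough sketch of what that could look like on a Debian or Ubuntu machine. The package names are an assumption for those distributions, and Protobuf_SOURCE=SYSTEM is, as far as I know, Arrow's per-dependency switch for using an installed library instead of building the bundled one; please check the Arrow build documentation for your version before relying on either.

```shell
# Assumed Debian/Ubuntu package names; other distributions will differ.
sudo apt-get install -y libprotobuf-dev libprotoc-dev protobuf-compiler

# Ask Arrow's CMake to use the system protobuf rather than building protobuf_ep
# (Protobuf_SOURCE=SYSTEM is an assumption to verify against the Arrow docs).
cmake -DProtobuf_SOURCE=SYSTEM -DCMAKE_BUILD_TYPE=Release -DARROW_BUILD_TESTS=ON -DARROW_COMPUTE=ON -DARROW_GANDIVA=ON ..
make -j8
```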
