Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2022/08/17 08:23:59 UTC

Re: [C++] Moving from -O3 to -O2 optimization level in release builds

For the record, https://github.com/apache/arrow/pull/13661 was finally 
merged. It switches to -O2 by default and selectively re-enables 
auto-vectorization on gcc.
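
As a rough illustration of what re-enabling auto-vectorization at -O2 can look like on gcc, here is a minimal sketch. It is not taken from the PR: the function name AddArrays is hypothetical, and the pragma form is just one way to request the vectorizer per function. The build-level equivalent is passing -ftree-vectorize alongside -O2. The gcc documentation advises against relying on the optimize pragma/attribute in production code, so treat this as a way to experiment rather than a recommendation.

```cpp
#include <cstddef>
#include <vector>

// GCC only: request the tree vectorizer for the functions defined below,
// even if this translation unit is compiled with -O2.
// Other compilers skip this block and use their own defaults.
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize("tree-vectorize")
#endif

// Hypothetical kernel (not from Arrow): a simple element-wise add,
// the kind of loop that benefits from auto-vectorization.
void AddArrays(const float* a, const float* b, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

Compiling with "g++ -O2 -fopt-info-vec" should report whether the loop was vectorized, which makes it easy to compare against a plain -O2 build without the pragma.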

Regards

Antoine.



On 21/07/2022 at 17:11, Antoine Pitrou wrote:
> 
> 
> On 21/07/2022 at 16:34, Wes McKinney wrote:
>> Based on the discussion in https://github.com/apache/arrow/pull/13661,
>> it seems that one major issue with switching to -O2 is that
>> auto-vectorization (which we rely on in places) and perhaps some other
>> optimization passes would have to be manually enabled in gcc.
>>
>> This benchmark run just completed, but it does not have
>> auto-vectorization enabled, so the benchmark differences appear to be
>> caused by that:
>>
>> https://conbench.ursa.dev/compare/runs/e938638743e84794ad829524fae04fbd...20727b1b390e4b30be10f49db7f06f3f/
>>
>> My inclination is that we should leave things as is and keep an eye on
>> symbol sizes (which we can easily do using the
>> cpp/tools/binary_symbol_explore.py tool that I wrote -- it computes
>> diffs right now but can be easily modified/extended to also print the
>> largest symbols in a shared library) in case we have other instances
>> of runaway code size.
> 
> The benchmark results above are a mixed bag, but not strongly in favour
> of -O3.
> 
> I would like to receive more opinions, but from the feedback so far it
> seems that switching to -O2 would be more robust?  If so, then I would
> be in favour of doing it.
> 
> Regards
> 
> Antoine.
> 
> 
>>
>> On Thu, Jul 21, 2022 at 8:11 AM Yaron Gvili <rt...@hotmail.com> wrote:
>>>
>>>> only enable -O3 on source files selectively that can be demonstrated to benefit from it
>>>
>>> Unfortunately, actual benefits from -O3 are application dependent. As https://www.linuxjournal.com/article/7269 explains:
>>>
>>> "Although -O3 can produce fast code, the increase in the size of the image can have adverse effects on its speed. For example, if the size of the image exceeds the size of the available instruction cache, severe performance penalties can be observed. Therefore, it may be better simply to compile at -O2 to increase the chances that the image fits in the instruction cache."
>>>
>>> A hot spot in a specific application that uses Arrow code compiled with -O3 could exceed the instruction cache size because of that code, even if the same Arrow code performed better with -O3 than with -O2 in Arrow's own benchmarks.
>>>
>>> In the short term, I second Wes' suggestion of trying to compile everything with -O2 and checking that no existing benchmark suffers too much. Hopefully none would, and that would justify a switch to -O2. In the longer term, I'd suggest making a bisection tool for selecting the best optimization flags for Arrow modules in the context of application-specific benchmarks.
>>>
>>>
>>> Yaron.
>>> ________________________________
>>> From: Sasha Krassovsky <kr...@gmail.com>
>>> Sent: Wednesday, July 20, 2022 5:55 PM
>>> To: dev@arrow.apache.org <de...@arrow.apache.org>
>>> Subject: Re: [C++] Moving from -O3 to -O2 optimization level in release builds
>>>
>>> I’d +1 on this - in my past experience I’ve mostly seen -O2. It would make sense to default to -O2 and only enable -O3 on source files selectively that can be demonstrated to benefit from it (if anyone actually spends the time to look into it).
>>>
>>> Sasha
>>>
>>>> On Jul 20, 2022, at 2:10 PM, Wes McKinney <we...@gmail.com> wrote:
>>>>
>>>> hi all,
>>>>
>>>> Antoine and I were digging into a weird issue where gcc in -O3
>>>> generated ~40KB of optimized code for a function which was less than
>>>> 2KB in -O2, and where a "leaner" implementation (in PR 13654) was yet
>>>> faster and smaller. You can see some of the discussion at
>>>>
>>>> https://github.com/apache/arrow/pull/13654
>>>>
>>>> -O3 is known to have some other issues in addition to occasional
>>>> runaway code size -- I opened
>>>>
>>>> https://github.com/apache/arrow/pull/13661
>>>>
>>>> to explore changing our release optimization level to -O2 and see what
>>>> the performance implications are in our benchmarks (and likely make
>>>> builds meaningfully faster). If anyone has any thoughts about this let
>>>> us know!
>>>>
>>>> Thanks,
>>>> WES
>>>