Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/06/16 14:48:00 UTC

[jira] [Comment Edited] (ARROW-8970) [C++] Reduce shared library / binary code size (umbrella issue)

    [ https://issues.apache.org/jira/browse/ARROW-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136714#comment-17136714 ] 

Wes McKinney edited comment on ARROW-8970 at 6/16/20, 2:47 PM:
---------------------------------------------------------------

I looked at some of the large files, and in many cases LTO won't help them, since many of the inline functions generating the code bloat are used only in those files.


was (Author: wesmckinn):
I looked at some of the large files and LTO won't help them since many of the inline functions that are generating the code bloat are only used in those files. 

> [C++] Reduce shared library / binary code size (umbrella issue)
> ---------------------------------------------------------------
>
>                 Key: ARROW-8970
>                 URL: https://issues.apache.org/jira/browse/ARROW-8970
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>
> We're reaching a point where we may need to be careful about decisions that increase code size:
> * Instantiating too many templates for code that isn't performance sensitive, or where some templates may do the same thing (e.g. an Int32Type kernel may do the same thing as a Date32Type kernel)
> * Inlining functions that don't need to be inline
> Code size also tends to correlate with compilation times, but not always.
> I'll use this umbrella issue to organize issues related to reducing compiled code size.
> At this moment (2020-05-27), here are the 25 largest object files in a -O2 build:
> {code}
> 524896	src/arrow/CMakeFiles/arrow_objlib.dir/array/builder_dict.cc.o
> 531920	src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o
> 552000	src/arrow/CMakeFiles/arrow_objlib.dir/json/converter.cc.o
> 575920	src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o
> 595112	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_string.cc.o
> 645728	src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o
> 683040	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_set_lookup.cc.o
> 702232	src/arrow/CMakeFiles/arrow_objlib.dir/ipc/reader.cc.o
> 729912	src/arrow/CMakeFiles/arrow_objlib.dir/tensor/coo_converter.cc.o
> 752776	src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csc_converter.cc.o
> 752776	src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csr_converter.cc.o
> 877680	src/arrow/CMakeFiles/arrow_objlib.dir/array/dict_internal.cc.o
> 885624	src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o
> 919072	src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
> 941776	src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o
> 1055248	src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o
> 1233304	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_compare.cc.o
> 1265160	src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
> 1343480	src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csf_converter.cc.o
> 1346928	src/arrow/CMakeFiles/arrow_objlib.dir/array.cc.o
> 1502568	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_hash.cc.o
> 1609760	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o
> 1794416	src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
> 2759552	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_filter.cc.o
> 7609432	src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_take.cc.o
> {code}
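
To illustrate the first bullet in the description above, here is a minimal sketch of sharing one template instantiation between logical types that have the same physical representation. The names below (Int32Type, Date32Type and Int64Type as bare structs, CountEqualImpl, CountEqual) are hypothetical stand-ins written for this example; they are not Arrow's actual classes or kernels, and this is not necessarily how the Arrow codebase addresses the problem.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for logical types; these are NOT Arrow's real classes.
struct Int32Type  { using c_type = std::int32_t; };
struct Date32Type { using c_type = std::int32_t; };  // assumed: a date stored as int32
struct Int64Type  { using c_type = std::int64_t; };

static_assert(std::is_same<Int32Type::c_type, Date32Type::c_type>::value,
              "Int32Type and Date32Type share a physical type");

// The loop is templated on the physical C type only, so every logical type
// with the same c_type reuses a single instantiation of this function.
template <typename CType>
std::size_t CountEqualImpl(const CType* values, std::size_t length, CType target) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < length; ++i) {
    if (values[i] == target) ++count;
  }
  return count;
}

// Thin per-logical-type wrapper that only forwards to the shared implementation.
template <typename LogicalType>
std::size_t CountEqual(const std::vector<typename LogicalType::c_type>& values,
                       typename LogicalType::c_type target) {
  return CountEqualImpl<typename LogicalType::c_type>(values.data(), values.size(),
                                                      target);
}
```

With this layout, CountEqual<Int32Type> and CountEqual<Date32Type> both call CountEqualImpl<std::int32_t>, so the loop body is compiled once for the two logical types rather than once per type. Whether this pays off for a given kernel depends on how much of its code is actually independent of the logical type.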


