Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/06/16 14:16:00 UTC
[jira] [Commented] (ARROW-8970) [C++] Reduce shared library / binary code size (umbrella issue)
[ https://issues.apache.org/jira/browse/ARROW-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136686#comment-17136686 ]
Wes McKinney commented on ARROW-8970:
-------------------------------------
After the ARROW-7784, ARROW-5760, and ARROW-9075 patches, libarrow.so is down to 18.44 MB from 23.09 MB in an -O3 build with clang-8.
Here are the 20 largest object files in the build now:
{code}
$ find src -type f -printf '%s %p\n' | sort -nr | head -20
1421728 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o
1284672 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_compare.cc.o
1203344 src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
1145640 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_hash.cc.o
905088 src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
828072 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o
811544 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csf_converter.cc.o
727448 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o
676576 src/arrow/CMakeFiles/arrow_objlib.dir/array/array_dict.cc.o
668904 src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o
632680 src/arrow/CMakeFiles/arrow_objlib.dir/array/array_base.cc.o
619968 src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o
617392 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_selection.cc.o
583160 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csr_converter.cc.o
583160 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csc_converter.cc.o
554792 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/reader.cc.o
554144 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/coo_converter.cc.o
540912 src/arrow/CMakeFiles/arrow_objlib.dir/array/util.cc.o
500088 src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
473096 src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o
{code}
> [C++] Reduce shared library / binary code size (umbrella issue)
> ---------------------------------------------------------------
>
> Key: ARROW-8970
> URL: https://issues.apache.org/jira/browse/ARROW-8970
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Wes McKinney
> Priority: Major
>
> We're reaching a point where we may need to be careful about decisions that increase code size:
> * Instantiating too many templates for code that isn't performance sensitive, or where some templates may do the same thing (e.g. Int32Type kernels may do the same thing as a Date32Type kernel)
> * Inlining functions that don't need to be inline
> Code size also tends to correlate with compilation times, but not always.
> I'll use this umbrella issue to organize issues related to reducing compiled code size.
> At this moment (2020-05-27), here are the 25 largest object files in a -O2 build:
> {code}
> 524896 src/arrow/CMakeFiles/arrow_objlib.dir/array/builder_dict.cc.o
> 531920 src/arrow/CMakeFiles/arrow_objlib.dir/filesystem/s3fs.cc.o
> 552000 src/arrow/CMakeFiles/arrow_objlib.dir/json/converter.cc.o
> 575920 src/arrow/CMakeFiles/arrow_objlib.dir/csv/converter.cc.o
> 595112 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_string.cc.o
> 645728 src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o
> 683040 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_set_lookup.cc.o
> 702232 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/reader.cc.o
> 729912 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/coo_converter.cc.o
> 752776 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csc_converter.cc.o
> 752776 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csr_converter.cc.o
> 877680 src/arrow/CMakeFiles/arrow_objlib.dir/array/dict_internal.cc.o
> 885624 src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o
> 919072 src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
> 941776 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o
> 1055248 src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o
> 1233304 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_compare.cc.o
> 1265160 src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
> 1343480 src/arrow/CMakeFiles/arrow_objlib.dir/tensor/csf_converter.cc.o
> 1346928 src/arrow/CMakeFiles/arrow_objlib.dir/array.cc.o
> 1502568 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_hash.cc.o
> 1609760 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/scalar_cast_numeric.cc.o
> 1794416 src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
> 2759552 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_filter.cc.o
> 7609432 src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/vector_take.cc.o
> {code}
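The first bullet in the description above (an Int32Type kernel doing the same work as a Date32Type kernel) can be illustrated with a minimal sketch. The type structs and the CountEqual functions below are hypothetical stand-ins invented for this example, not Arrow's actual classes or kernels. The assumption is that both logical types store their values as int32, so the loop can be templated on the physical C type and instantiated once for both.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for logical type classes (NOT the real arrow:: types).
struct Int32Type { using c_type = std::int32_t; };
struct Date32Type { using c_type = std::int32_t; };  // assumed: days stored as int32
struct Int64Type { using c_type = std::int64_t; };

// The loop is templated on the physical C type only, so every logical
// type with the same representation reuses a single instantiation.
template <typename CType>
std::size_t CountEqualPhysical(const CType* values, std::size_t length, CType target) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < length; ++i) {
    if (values[i] == target) ++count;
  }
  return count;
}

// Thin per-logical-type wrapper: it only forwards to the shared body,
// so it adds little or no code of its own once inlined.
template <typename ArrowType>
std::size_t CountEqual(const std::vector<typename ArrowType::c_type>& values,
                       typename ArrowType::c_type target) {
  return CountEqualPhysical<typename ArrowType::c_type>(values.data(), values.size(),
                                                        target);
}

// Int32Type and Date32Type resolve to the same CountEqualPhysical<int32_t>.
static_assert(std::is_same<Int32Type::c_type, Date32Type::c_type>::value,
              "both logical types share one physical representation");
```

With this layout, CountEqual<Int32Type> and CountEqual<Date32Type> both call CountEqualPhysical<std::int32_t>, while CountEqual<Int64Type> gets its own instantiation. Whether this actually shrinks a given object file depends on the compiler's inlining decisions, so any saving should be confirmed by measuring object sizes as in the listings above.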