Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2022/09/29 19:45:00 UTC

[jira] [Commented] (IMPALA-11623) Put *-ir.cc files into their own libraries to avoid extra recompilation

    [ https://issues.apache.org/jira/browse/IMPALA-11623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17611216#comment-17611216 ] 

Joe McDonnell commented on IMPALA-11623:
----------------------------------------

To elaborate a bit, the LLVM IR doesn't actually depend on the built libraries; it depends on the *-ir.cc sources. We want to depend on the *-ir.cc files AND any headers they include (with the list of headers updated automatically when a new include is added, etc.), which is why we don't depend on the *-ir.cc files alone. In theory, there should be a way to depend on those files and their headers without waiting for them to be compiled as a library. Maybe CMake's INTERFACE library could express this (https://cmake.org/cmake/help/latest/command/add_library.html#interface-libraries). Either way, putting the *-ir.cc files in their own libraries is an improvement on our current setup.
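
One possible way to track the *-ir.cc sources plus their headers without building a library first, sketched here as an assumption and not tried against the Impala build, is to have clang write a depfile and pass it to add_custom_command via DEPFILE. This is a different mechanism from the INTERFACE library idea. IR_DEPFILE is a hypothetical variable name; DEPFILE needs CMake 3.7+ with Ninja or 3.20+ with Makefile generators, and any generated headers would still need their own target-level dependency.

```cmake
# Sketch only. IR_DEPFILE is a hypothetical name, not part of the Impala build.
set(IR_DEPFILE ${IR_OUTPUT_FILE}.d)

add_custom_command(
  OUTPUT ${IR_OUTPUT_FILE}
  # -MD/-MF make clang record every header it actually included;
  # -MT names the final output so the recorded rule matches OUTPUT.
  COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} ${PLATFORM_SPECIFIC_FLAGS}
          ${CLANG_INCLUDE_FLAGS} -MD -MF ${IR_DEPFILE} -MT ${IR_OUTPUT_FILE}
          ${IR_INPUT_FILES} -o ${IR_TMP_OUTPUT_FILE}
  COMMAND ${LLVM_OPT_EXECUTABLE} ${LLVM_OPT_IR_FLAGS} < ${IR_TMP_OUTPUT_FILE} > ${IR_OUTPUT_FILE}
  COMMAND rm ${IR_TMP_OUTPUT_FILE}
  # No library targets listed: header dependencies come from the depfile.
  DEPENDS ${IR_INPUT_FILES}
  DEPFILE ${IR_DEPFILE}
)
```

With this shape the header list would update itself whenever an include is added or removed, at the cost of requiring a generator that supports DEPFILE.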

> Put *-ir.cc files into their own libraries to avoid extra recompilation
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-11623
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11623
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 4.2.0
>            Reporter: Joe McDonnell
>            Assignee: Daniel Becker
>            Priority: Major
>
> It is desirable to be able to iterate quickly by running "make -j impalad" while modifying a file. Currently, modifying most files incurs a rebuild of the LLVM IR, which is a slow serial step. For example:
>  
> {noformat}
> $ touch be/src/runtime/coordinator.cc
> $ make -j impalad
> ...
> [ 98%] Generating ../../../llvm-ir/impala.bc
> [ 98%] Generating ../../../llvm-ir/impala-legacy-avx.bc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir.cc
> [ 98%] Generating ../../generated-sources/impala-ir/impala-ir-legacy-avx.cc
> ...{noformat}
> This can add several seconds to an incremental build. The step runs even for files that do not actually affect the LLVM IR, so in those cases it can be avoided.
> The LLVM IR is rebuilt because it has dependencies on the Exec, Exprs, Runtime, Udf, Util, and other libraries:
>  
> {noformat}
> add_custom_command(
>   OUTPUT ${IR_OUTPUT_FILE}
>   COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} ${PLATFORM_SPECIFIC_FLAGS}
>           ${CLANG_INCLUDE_FLAGS} ${IR_INPUT_FILES} -o ${IR_TMP_OUTPUT_FILE}
>   COMMAND ${LLVM_OPT_EXECUTABLE} ${LLVM_OPT_IR_FLAGS} < ${IR_TMP_OUTPUT_FILE} > ${IR_OUTPUT_FILE}
>   COMMAND rm ${IR_TMP_OUTPUT_FILE}
>   DEPENDS Exec ExecAvro ExecKudu Exprs Runtime Udf Util ${IR_INPUT_FILES}
> ){noformat}
> From a correctness perspective, the LLVM IR only cares about things that impact the content of the *-ir.cc files, because impala-ir.cc includes every *-ir.cc file. That list of libraries is a superset of what is needed.
> If the *-ir.cc files were split off into their own libraries (i.e. ExecIr rather than Exec), then this target would depend only on ExecIr rather than on the larger Exec. This would reduce the number of files whose modification causes the LLVM IR to be rebuilt, which should shorten an incremental "make -j impalad" for quite a few C++ files.
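>
> As a rough sketch of the split described above, and only under stated assumptions: ExecIr is the name used in this description, while the other *Ir library names and the source file names are hypothetical placeholders chosen by analogy. How the new libraries get linked into impalad is left out.
>
> ```cmake
> # Hypothetical split for one directory; file names are placeholders.
> add_library(ExecIr
>   example-node-ir.cc    # stands in for the real *-ir.cc sources
> )
> add_library(Exec
>   example-node.cc       # stands in for the remaining sources
> )
>
> # The IR step then depends only on the small *Ir libraries.
> # Every name ending in "Ir" other than ExecIr is an assumed name.
> add_custom_command(
>   OUTPUT ${IR_OUTPUT_FILE}
>   COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} ${PLATFORM_SPECIFIC_FLAGS}
>           ${CLANG_INCLUDE_FLAGS} ${IR_INPUT_FILES} -o ${IR_TMP_OUTPUT_FILE}
>   COMMAND ${LLVM_OPT_EXECUTABLE} ${LLVM_OPT_IR_FLAGS} < ${IR_TMP_OUTPUT_FILE} > ${IR_OUTPUT_FILE}
>   COMMAND rm ${IR_TMP_OUTPUT_FILE}
>   DEPENDS ExecIr ExecAvroIr ExecKuduIr ExprsIr RuntimeIr UdfIr UtilIr ${IR_INPUT_FILES}
> )
> ```
>
> Under this layout, touching a file such as be/src/runtime/coordinator.cc would rebuild only the larger Runtime library and would no longer invalidate the IR outputs.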
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
