Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/04/07 19:46:00 UTC

[jira] [Commented] (IMPALA-8101) Thrift 11 compilation and Thrift ext-data-source compilation are always run

    [ https://issues.apache.org/jira/browse/IMPALA-8101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811953#comment-16811953 ] 

ASF subversion and git services commented on IMPALA-8101:
---------------------------------------------------------

Commit daa1bf9883e65adb82b11576b5ada4273bc9dd7f in impala's branch refs/heads/master from stakiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=daa1bf9 ]

IMPALA-8101: Thrift 11 and ext-data-source compilation are always run

Compilation of Thrift 11 Python code (IMPALA-7924) and of ext-data-source
Thrift files (ErrorCodes.thrift, ExternalDataSource.thrift, Data.thrift,
Status.thrift, Types.thrift) is run during every build, regardless of
whether or not the .thrift files have changed. The issue is that the
CMake custom command for compilation of these files points to a
non-existent OUTPUT_FILE.

This patch fixes Thrift 11 compilation by pointing the OUTPUT_FILE of
each .thrift file to its corresponding __init__.py file. For
compilation of ext-data-source, things are a bit tricky as we only run
Java gen and it is difficult to map Java generated code to the
corresponding .thrift files purely based on file names. Instead, for
ext-data-source, this patch adds a dummy file under
ext-data-source/api/target/tmp/generated-sources/ to track if a .thrift
file has been compiled or not. A `mvn clean` of ext-data-source will
delete all of these files and trigger re-compilation of the
ext-data-source files.

Change-Id: I52520e4b099c7bac5d088b4ba5d8a335495f727d
Reviewed-on: http://gerrit.cloudera.org:8080/12290
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
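
The dummy-file approach described in the commit message above can be modelled outside CMake. The following is a minimal Python sketch, not the code from the patch: the function name `compile_if_needed`, the `.stamp` suffix, and the callback standing in for the Thrift compiler are all assumptions made for illustration. It shows a marker file recording that a .thrift file has been compiled, and that deleting the marker directory (as a `mvn clean` would) forces recompilation.

```python
import os
import shutil
import tempfile
import time


def compile_if_needed(thrift_path, stamp_dir, run_compiler):
    """Run run_compiler(thrift_path) unless an up-to-date stamp file exists.

    Returns True if the compiler was run. The ".stamp" suffix is a
    hypothetical naming choice, not taken from the Impala build.
    """
    stamp = os.path.join(stamp_dir, os.path.basename(thrift_path) + ".stamp")
    if os.path.exists(stamp) and os.path.getmtime(stamp) >= os.path.getmtime(thrift_path):
        return False  # stamp is at least as new as the source: nothing to do
    run_compiler(thrift_path)
    os.makedirs(stamp_dir, exist_ok=True)
    with open(stamp, "w"):
        pass  # an empty file is enough; only its existence and mtime matter
    return True


def demo():
    compiled = []  # records each (simulated) compiler invocation
    with tempfile.TemporaryDirectory() as tmp:
        thrift_path = os.path.join(tmp, "Types.thrift")
        with open(thrift_path, "w") as f:
            f.write("// placeholder\n")
        os.utime(thrift_path, (1000, 1000))  # pin the source to an old mtime
        stamp_dir = os.path.join(tmp, "target", "tmp", "generated-sources")

        results = [
            compile_if_needed(thrift_path, stamp_dir, compiled.append),  # first build
            compile_if_needed(thrift_path, stamp_dir, compiled.append),  # no change
        ]
        shutil.rmtree(os.path.join(tmp, "target"))  # stands in for `mvn clean`
        results.append(compile_if_needed(thrift_path, stamp_dir, compiled.append))
        results.append(compile_if_needed(thrift_path, stamp_dir, compiled.append))

        future = time.time() + 3600
        os.utime(thrift_path, (future, future))  # simulate editing the .thrift file
        results.append(compile_if_needed(thrift_path, stamp_dir, compiled.append))
        return results, len(compiled)


if __name__ == "__main__":
    print(demo())
```

In this model the compiler runs on the first build, after the marker directory is removed, and after the source is touched, and is skipped otherwise.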


> Thrift 11 compilation and Thrift ext-data-source compilation are always run
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-8101
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8101
>             Project: IMPALA
>          Issue Type: Task
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> [~tarmstrong] pointed out that after IMPALA-7924 the build output started displaying lines such as: "Running thrift 11 compiler on..." even during builds when Thrift files were not modified.
> I dug a bit deeper and found the following:
>  * This seems to be happening for Thrift compilation of {{ext-data-source}} files as well (e.g. ExternalDataSource.thrift, Types.thrift, etc.); "Running thrift compiler for ext-data-source on..." is always printed
>  * The issue is that the [custom command|https://cmake.org/cmake/help/v3.8/command/add_custom_command.html] for ext-data-source and Thrift 11 compilation specifies an {{OUTPUT}} file that does not exist (and is not generated by Thrift)
>  * According to the CMake docs "if the command does not actually create the {{OUTPUT}} then the rule will always run" - so Thrift compilation will run during every build
>  * The underlying difficulty is that you cannot know which files Thrift is going to generate without actually looking into the Thrift file and understanding Thrift internals
>  * For C++ and Python there is a workaround: for C++, Thrift always generates a file {THRIFT_FILE_NAME}_types.h (similar situation for Python); however, for Java no such file necessarily exists (ext-data-source only does Java gen)
>  ** This is how regular Thrift compilation works (e.g. compilation of beeswax.thrift, ImpalaService.thrift, etc.), which is why we don't see the issue for regular Thrift compilation
> A solution for Thrift 11 compilation is to just add generated Python files to the {{OUTPUT}} for the custom_command.
> A solution for Thrift compilation of ext-data-source seems trickier, so open to suggestions.
> Ideally, Thrift would provide a way to return the list of files generated from a .thrift file, without actually generating the files, but I don't see a way to do that.
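
The "always run" behaviour described in the issue can be illustrated with a small model of a make-style staleness check. This is a hedged Python sketch, not how CMake or its generators are implemented: `is_stale`, `simulate_builds`, and the file name `never_written.out` are hypothetical. It assumes a rule is out of date when its declared output is missing or older than a dependency, and compares a rule whose declared output is never written with one whose declared output is a file the compiler does produce.

```python
import os
import tempfile


def is_stale(output_path, dependency_paths):
    """Return True if a make-style rule producing output_path must run."""
    if not os.path.exists(output_path):
        # A declared output that is missing always makes the rule out of date.
        return True
    output_mtime = os.path.getmtime(output_path)
    return any(os.path.getmtime(dep) > output_mtime for dep in dependency_paths)


def simulate_builds(declared_output, generated_output, source, builds=3):
    """Run several consecutive builds; return whether each one recompiled."""
    runs = []
    for _ in range(builds):
        stale = is_stale(declared_output, [source])
        if stale:
            # Stand-in for the Thrift compiler: it writes generated_output only.
            with open(generated_output, "w") as f:
                f.write("# generated\n")
            os.utime(generated_output, (2000, 2000))  # newer than the source
        runs.append(stale)
    return runs


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "Types.thrift")
        with open(source, "w") as f:
            f.write("// placeholder\n")
        os.utime(source, (1000, 1000))  # pin the source to an old mtime

        generated = os.path.join(tmp, "__init__.py")
        missing = os.path.join(tmp, "never_written.out")  # hypothetical name

        # Declared output is never created: every build recompiles.
        broken = simulate_builds(missing, generated, source)
        os.remove(generated)
        # Declared output is the file the compiler writes: only the first build runs.
        fixed = simulate_builds(generated, generated, source)
        return broken, fixed


if __name__ == "__main__":
    print(demo())
```

Under these assumptions the first configuration recompiles on all three builds, while the second recompiles only once.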



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
