Posted to issues@impala.apache.org by "Riza Suminto (Jira)" <ji...@apache.org> on 2022/07/05 07:21:00 UTC

[jira] [Created] (IMPALA-11415) Add run-step-wait-all after loading Kudu data

Riza Suminto created IMPALA-11415:
-------------------------------------

             Summary: Add run-step-wait-all after loading Kudu data
                 Key: IMPALA-11415
                 URL: https://issues.apache.org/jira/browse/IMPALA-11415
             Project: IMPALA
          Issue Type: Bug
            Reporter: Riza Suminto


IMPALA-11384 reveals an issue in testdata/bin/create-load-data.sh.
{code:java}
if [[ $SKIP_METADATA_LOAD -eq 1 ]]; then
  # Tests depend on the kudu data being clean, so load the data from scratch.
  # This is only necessary if this is not a full dataload, because a full dataload
  # already loads Kudu functional and TPC-H tables from scratch.
  run-step-backgroundable "Loading Kudu functional" load-kudu.log \
        load-data "functional-query" "core" "kudu/none/none" force
  run-step-backgroundable "Loading Kudu TPCH" load-kudu-tpch.log \
        load-data "tpch" "core" "kudu/none/none" force
fi
run-step-backgroundable "Loading Hive UDFs" build-and-copy-hive-udfs.log \
    build-and-copy-hive-udfs {code}
If $SKIP_METADATA_LOAD is 1, all three steps ("Loading Kudu functional", "Loading Kudu TPCH", and "Loading Hive UDFs") run in parallel in the background. The last of these seemingly overwrites the Thrift-generated Python code under shell/gen-py/hive_metastore/ and shell/gen-py/beeswaxd/. This in turn causes sporadic Python errors when the two Kudu background steps invoke bin/load-data.py. Adding run-step-wait-all after the Kudu data loading seems to fix the issue.
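
The ordering that the proposed fix enforces can be illustrated with a small standalone sketch. This is not the actual patch: run_step_bg and run_step_wait_all are hypothetical, simplified stand-ins for run-step-backgroundable and run-step-wait-all, the sleep commands stand in for the real load-data and build-and-copy-hive-udfs steps, and the events file exists only to make the ordering visible.

```shell
#!/bin/sh
# Hypothetical stand-ins for run-step-backgroundable and run-step-wait-all.
# The real helpers in the Impala scripts also handle logging and errors.
BG_PIDS=""
EVENTS="$(mktemp)"

run_step_bg() {
  # $1 = step name, remaining arguments = command to run in the background.
  name="$1"; shift
  (
    echo "start $name" >> "$EVENTS"
    "$@"
    echo "end $name" >> "$EVENTS"
  ) &
  BG_PIDS="$BG_PIDS $!"
}

run_step_wait_all() {
  # Block until every background step started so far has finished.
  for pid in $BG_PIDS; do
    wait "$pid"
  done
  BG_PIDS=""
}

SKIP_METADATA_LOAD=1
if [ "$SKIP_METADATA_LOAD" -eq 1 ]; then
  run_step_bg "Loading Kudu functional" sleep 1
  run_step_bg "Loading Kudu TPCH" sleep 1
  # Proposed addition: let both Kudu loads finish before the next step starts.
  run_step_wait_all
fi
run_step_bg "Loading Hive UDFs" sleep 1
run_step_wait_all

cat "$EVENTS"
```

With the wait in place, the "start Loading Hive UDFs" line is written only after both Kudu "end" lines, so the UDF step can no longer touch shared generated files while the Kudu loads are still running. The cost is that the UDF build no longer overlaps with the Kudu loads.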



--
This message was sent by Atlassian Jira
(v8.20.10#820010)