Posted to issues-all@impala.apache.org by "Michael Smith (Jira)" <ji...@apache.org> on 2024/04/29 20:42:00 UTC
[jira] [Work started] (IMPALA-13046) Rework Iceberg mixed format delete test for Hive optimization
[ https://issues.apache.org/jira/browse/IMPALA-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on IMPALA-13046 started by Michael Smith.
----------------------------------------------
> Rework Iceberg mixed format delete test for Hive optimization
> -------------------------------------------------------------
>
> Key: IMPALA-13046
> URL: https://issues.apache.org/jira/browse/IMPALA-13046
> Project: IMPALA
> Issue Type: Task
> Reporter: Michael Smith
> Assignee: Michael Smith
> Priority: Major
>
> A Hive improvement (TBD) to Iceberg support changed how Hive handles deletes. Hive used to always add a delete file; now, if a delete would negate all the contents of a data file, it removes the data file instead. This breaks iceberg-mixed-format-position-deletes.test:
> {code}
> query_test/test_iceberg.py:1472: in test_read_mixed_format_position_deletes
> vector, unique_database)
> common/impala_test_suite.py:820: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:627: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:520: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:290: in verify_query_result_is_subset
> unicode(expected_row), unicode(actual_results))
> E AssertionError: Could not find expected row row_regex:'hdfs://localhost:20500/test-warehouse/test_read_mixed_format_position_deletes_6fb8ae98.db/ice_mixed_formats/data/.*-data-.*.parquet','.*B','','.*' in actual rows:
> E 'hdfs://localhost:20500/test-warehouse/test_read_mixed_format_position_deletes_6fb8ae98.db/ice_mixed_formats/data/00000-0-data-jenkins_20240427231939_2457cdfb-2e04-471a-9661-4551626f60ee-job_17142740646310_0071-4-00001.orc','306B','','NONE'
> {code}
> as the only resulting data file is a .orc file containing the only valid row, so no .parquet data file remains for the expected row to match.
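The mismatch in the assertion above can be reproduced in isolation. The following is a minimal sketch, not the logic in common/test_result_verifier.py: the helper name find_unmatched is hypothetical, only the first column (the file path) is compared, and full-string regex matching is an assumption about how row_regex entries are applied. The pattern and the path are copied from the error output.

```python
import re

# Path pattern from the expected row in the failing assertion (first column only).
EXPECTED_PATH_REGEX = (
    r"hdfs://localhost:20500/test-warehouse/"
    r"test_read_mixed_format_position_deletes_6fb8ae98.db/"
    r"ice_mixed_formats/data/.*-data-.*.parquet"
)

# Path of the only data file listed in the actual rows.
ACTUAL_PATH = (
    "hdfs://localhost:20500/test-warehouse/"
    "test_read_mixed_format_position_deletes_6fb8ae98.db/"
    "ice_mixed_formats/data/00000-0-data-jenkins_20240427231939_"
    "2457cdfb-2e04-471a-9661-4551626f60ee-job_17142740646310_0071-4-00001.orc"
)


def find_unmatched(expected_regexes, actual_paths):
    """Hypothetical helper: return expected patterns that match no actual path."""
    return [
        pattern
        for pattern in expected_regexes
        if not any(re.fullmatch(pattern, path) for path in actual_paths)
    ]


# The .parquet pattern has nothing to match once only the .orc file remains.
missing = find_unmatched([EXPECTED_PATH_REGEX], [ACTUAL_PATH])
print(missing)
```

Under these assumptions the .parquet pattern comes back as unmatched, which corresponds to the "Could not find expected row" failure: the expected rows still assume a Parquet data file that Hive no longer leaves behind.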