Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/10/04 09:34:00 UTC

[jira] [Commented] (IMPALA-10627) Use standard Iceberg table properties

    [ https://issues.apache.org/jira/browse/IMPALA-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17423859#comment-17423859 ] 

ASF subversion and git services commented on IMPALA-10627:
----------------------------------------------------------

Commit d2f866f9a17c2d71fb3e3e731a2dfcce68d336d9 in impala's branch refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d2f866f ]

IMPALA-10935: Impala crashes on old Iceberg table property

With IMPALA-10627 we switched to using the standard Iceberg table
properties: https://iceberg.apache.org/configuration/

E.g. we switched from 'iceberg.file_format' to 'write.format.default'.
For backward compatibility we also support 'iceberg.file_format', but
that support is imperfect and causes a crash in some cases.
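
A minimal sketch of the property fallback described above, written in
Python for illustration and not taken from Impala's code. The function
name resolve_file_format, the precedence of the standard property over
the legacy one, and the PARQUET default are all assumptions.

```python
from typing import Mapping

# Property names as they appear in the commit message.
STANDARD_KEY = "write.format.default"
LEGACY_KEY = "iceberg.file_format"


def resolve_file_format(props: Mapping[str, str], default: str = "PARQUET") -> str:
    """Return the table's data file format in upper case.

    Assumptions (hypothetical, for illustration only): the standard
    property wins when both are set, and `default` is returned when
    neither property is present.
    """
    for key in (STANDARD_KEY, LEGACY_KEY):
        value = props.get(key)
        if value:
            return value.strip().upper()
    return default
```

With this lookup, a table that only sets 'iceberg.file_format' to ORC
resolves to ORC instead of silently falling through to the default.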

Impala crashes when all of the following conditions are met:
* local catalog mode is being used
* Iceberg table is being queried
* the data file format is ORC
* 'iceberg.file_format' is set instead of 'write.format.default' table
  property
* Query is "select count(*) from t;"

Impala wrongly assumes that PARQUET is being used and tries to apply the
count star optimization. It is not implemented for the ORC scanner and
causes it to crash.

This patch fixes the wrong assumption. It also fixes the HdfsOrcScanner
so that in release mode it raises an error instead of crashing.
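
The two fixes can be pictured with the following Python sketch. It is
an illustration under stated assumptions, not Impala's planner or
scanner code: can_use_count_star_optimization, open_orc_scan and
UnsupportedOptimizationError are hypothetical names.

```python
class UnsupportedOptimizationError(Exception):
    """Hypothetical error type standing in for a query-level error."""


def can_use_count_star_optimization(file_format: str) -> bool:
    # Planner-side guard: per the commit message, the count star
    # optimization is not implemented for the ORC scanner, so it is
    # only allowed when the resolved format is PARQUET.
    return file_format.upper() == "PARQUET"


def open_orc_scan(count_star_requested: bool) -> str:
    # Scanner-side guard: if the optimization is requested anyway,
    # fail the query with an error instead of crashing.
    if count_star_requested:
        raise UnsupportedOptimizationError(
            "count star optimization is not supported by the ORC scanner")
    return "orc scan opened"
```

The first guard prevents the bad plan from being generated; the second
is a defensive check in case such a plan reaches the scanner anyway.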

This patch also enables UNSETting the file format table property for
Iceberg tables. This table property was already enabled for
modifications (changing the value via SET TBLPROPERTIES).

Testing:
 * added e2e test for the above conditions

Change-Id: Iafd9baef1c124d7356a14ba24c571567629a5e50
Reviewed-on: http://gerrit.cloudera.org:8080/17877
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Use standard Iceberg table properties
> -------------------------------------
>
>                 Key: IMPALA-10627
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10627
>             Project: IMPALA
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Attila Jeges
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.1.0
>
>
> Iceberg lists the following properties:
> [https://iceberg.apache.org/configuration/]
> We should also use these properties if possible, e.g. write.format.default, write.<fileformat>.compression-codec
> Currently Impala uses the table property 'iceberg.file_format' to determine the data file format for reads/writes. In the future, read operations should automatically detect the file formats (IMPALA-10610), but for writes we should use 'write.format.default'.


