Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/12 23:38:00 UTC

[jira] [Commented] (IMPALA-10893) Use old schema during time travel

    [ https://issues.apache.org/jira/browse/IMPALA-10893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17676374#comment-17676374 ] 

ASF subversion and git services commented on IMPALA-10893:
----------------------------------------------------------

Commit 66efe50d15f63696debd1e19a53482e5b0f013ab in impala's branch refs/heads/master from Andrew Sherman
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=66efe50d1 ]

IMPALA-10893: Use old schema during iceberg time travel.

Before this change, the schema used during Iceberg Time Travel was the
current schema of the table. With this change, we use the schema in
effect at the point specified by the Time Travel parameters.

The parameters used by an Iceberg Time Travel query are part of the FROM
clause of the query. Previously, analysis of the Time Travel parameters
took place after the table Path was resolved, at which point some
schema information is already cached. To use the old schema during
Iceberg Time Travel, however, we need to ensure that the version of the
Table that is used is always the version specified by the Time Travel
parameters. To do this, we have to move the analysis of the Time Travel
parameters inside the code that resolves the Path.
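
The ordering issue described above can be illustrated with a small
sketch. This is not Impala's code: it is a simplified Python model, and
every name in it (Snapshot, TimeTravelSpec, Table, resolve_path) is
hypothetical. It assumes each snapshot records the schema that was
current when it was written, and it shows the Time Travel parameters
being consulted inside path resolution, before any schema is handed out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical model: a snapshot carries the schema valid when it was written.
    snapshot_id: int
    timestamp_ms: int
    schema: Tuple[str, ...]


@dataclass(frozen=True)
class TimeTravelSpec:
    # Hypothetical model of the FROM-clause parameters; set exactly one field.
    as_of_timestamp_ms: Optional[int] = None
    as_of_snapshot_id: Optional[int] = None


class Table:
    def __init__(self, name: str, snapshots: Sequence[Snapshot]):
        self.name = name
        # Keep snapshots ordered by time so the last one is the current one.
        self.snapshots = sorted(snapshots, key=lambda s: s.timestamp_ms)

    def current_schema(self) -> Tuple[str, ...]:
        return self.snapshots[-1].schema

    def schema_for(self, spec: TimeTravelSpec) -> Tuple[str, ...]:
        if spec.as_of_snapshot_id is not None:
            for snap in self.snapshots:
                if snap.snapshot_id == spec.as_of_snapshot_id:
                    return snap.schema
            raise ValueError("unknown snapshot id")
        if spec.as_of_timestamp_ms is None:
            raise ValueError("time travel spec is empty")
        # Pick the latest snapshot written at or before the requested time.
        older = [s for s in self.snapshots
                 if s.timestamp_ms <= spec.as_of_timestamp_ms]
        if not older:
            raise ValueError("no snapshot at or before the requested time")
        return older[-1].schema


def resolve_path(table: Table, spec: Optional[TimeTravelSpec] = None) -> dict:
    # The spec is analyzed here, during resolution, so the resolved columns
    # always match the requested table version rather than the current one.
    schema = table.current_schema() if spec is None else table.schema_for(spec)
    return {"table": table.name, "columns": schema}
```

With a table whose second snapshot added a column, resolving with a
timestamp between the two snapshots yields only the original columns,
while resolving without a spec yields the current schema.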

Add a new implementation of FeIcebergTable that represents an Iceberg
table involved in Time Travel. This is implemented by embedding a
reference to the base Iceberg Table. All methods that are not Time
Travel related are delegated to the base table. The Time Travel related
methods use the historic Iceberg schema.
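
The delegation approach described above can be sketched as follows.
This is a Python illustration only, not the FeIcebergTable
implementation from the commit: the class and method names
(BaseIcebergTable, TimeTravelIcebergTable, get_schema, get_name,
get_location) are hypothetical stand-ins. The wrapper holds a reference
to the base table, overrides the schema accessor, and forwards
everything else.

```python
class BaseIcebergTable:
    """Hypothetical stand-in for the ordinary (current-version) table."""

    def __init__(self, name, current_schema, location):
        self._name = name
        self._schema = current_schema
        self._location = location

    def get_name(self):
        return self._name

    def get_location(self):
        return self._location

    def get_schema(self):
        return self._schema


class TimeTravelIcebergTable:
    """Hypothetical wrapper for a table referenced with Time Travel."""

    def __init__(self, base, historic_schema):
        self._base = base
        self._historic_schema = historic_schema

    def get_schema(self):
        # Time Travel related: answer with the historic schema.
        return self._historic_schema

    def __getattr__(self, attr):
        # Invoked only when normal attribute lookup fails, so get_schema
        # above is never forwarded; everything else goes to the base table.
        return getattr(self._base, attr)
```

In a statically typed language the forwarding would typically be
written out as one explicit delegating method per interface method; the
__getattr__ hook is used here only to keep the sketch short.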

TESTING:
- Add a new file iceberg_util.py to hold the snapshot utility code that
  was developed for the in-progress IMPALA-11482.
- Extend the existing Iceberg Time Travel tests to check the schema.
- Add a test that shows time travel working with column masking.
  The column masking configuration is not tightly coupled to the schema
  so it is possible to mask historical columns.

Change-Id: I7cbef6e20bbb567e517744fb1f34d880970399ab
Reviewed-on: http://gerrit.cloudera.org:8080/19380
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Use old schema during time travel
> ---------------------------------
>
>                 Key: IMPALA-10893
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10893
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Andrew Sherman
>            Priority: Major
>              Labels: impala-iceberg
>
> With Iceberg V1 we cannot retrieve the old schema of a snapshot. But V2 will have this functionality:
> [https://github.com/apache/iceberg/issues/1029]
> So it would make sense to use the old snapshot's table schema during a time travel query.
> Interestingly, MS SQL Server also queries temporal tables with the current table schema.



