Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/11/23 05:23:00 UTC

[jira] [Commented] (IMPALA-10923) Fine grained table refreshing at partition level events for transactional tables

    [ https://issues.apache.org/jira/browse/IMPALA-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447779#comment-17447779 ] 

ASF subversion and git services commented on IMPALA-10923:
----------------------------------------------------------

Commit 097b10104f23e0927d5b21b43a79f6cc10425f59 in impala's branch refs/heads/master from Yu-Wen Lai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=097b101 ]

IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables

To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
  will keep track of write id changes for partitioned tables by
  AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables'
  addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
  hms_event_incremental_refresh_transactional_table, which can switch
  on/off the fine-grained table refreshing.
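
As a rough illustration of change 1, the sketch below shows one way a
per-table record of write ids could be kept up to date from the three event
types. It is a simplified model, not Impala's Catalogd code: the class
WriteIdTracker, its method names, and the rule that only write ids seen as
committed are readable are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class WriteIdStatus(Enum):
    OPEN = "open"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class WriteIdTracker:
    """Hypothetical, simplified stand-in for a per-table valid write id list."""
    high_watermark: int = 0
    status: dict = field(default_factory=dict)           # write id -> WriteIdStatus
    txn_to_write_id: dict = field(default_factory=dict)  # open txn id -> write id

    def on_alloc_write_id(self, txn_id: int, write_id: int) -> None:
        # An alloc event binds a write id to an open transaction.
        self.txn_to_write_id[txn_id] = write_id
        self.status[write_id] = WriteIdStatus.OPEN
        self.high_watermark = max(self.high_watermark, write_id)

    def on_commit_txn(self, txn_id: int) -> None:
        # A commit makes the transaction's write id readable.
        write_id = self.txn_to_write_id.pop(txn_id, None)
        if write_id is not None:
            self.status[write_id] = WriteIdStatus.COMMITTED

    def on_abort_txn(self, txn_id: int) -> None:
        # An abort marks the write id so its data is never read.
        write_id = self.txn_to_write_id.pop(txn_id, None)
        if write_id is not None:
            self.status[write_id] = WriteIdStatus.ABORTED

    def is_valid(self, write_id: int) -> bool:
        # Assumption for this sketch: only ids seen as committed are readable.
        return (write_id <= self.high_watermark
                and self.status.get(write_id) is WriteIdStatus.COMMITTED)
```

With a tracker like this, an insert into one partition only changes the
state of a single write id, so the catalog does not need to re-fetch the
whole table to decide which data is visible.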

Performance Tests:
A simple test was performed by running an insert into one partition of
a partitioned ACID table (50,000 partitions). Below is the time taken
to refresh this table in response to the event.

Storage                Before              After
=============================================================
S3                     50 secs             50 msecs
local                  3 secs              3 msecs

Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Reviewed-on: http://gerrit.cloudera.org:8080/17858
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Sourabh Goyal <so...@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>


> Fine grained table refreshing at partition level events for transactional tables
> --------------------------------------------------------------------------------
>
>                 Key: IMPALA-10923
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10923
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Yu-Wen Lai
>            Assignee: Yu-Wen Lai
>            Priority: Major
>
> To keep transactional tables consistent, we currently refresh the whole table even when a change affects only a single partition. That is too expensive and can delay event processing.
> To enable fine-grained table refreshing, there are three main changes in this proposal.
>  # Maintain validWriteIdList in Catalogd for transactional tables. We will track write id changes by AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
>  # Trigger partition-level refreshing for addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
>  # Introduce a config *incremental_refresh_acid*, which can switch on/off the fine-grained table refreshing.
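
The second and third points of the proposal can be sketched as a small
dispatch routine: when the switch is on, a partition event reloads only the
affected partition; when it is off, the whole table is reloaded as before.
This is a toy model, not Impala's event processor. CatalogTable,
handle_partition_event, the incremental_enabled parameter, and the event
type strings are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

PARTITION_EVENTS = ("ADD_PARTITION", "DROP_PARTITION", "ALTER_PARTITION")


@dataclass
class CatalogTable:
    """Toy cached table: maps partition name -> number of metadata reloads."""
    name: str
    partitions: dict = field(default_factory=dict)
    full_refreshes: int = 0

    def refresh_all(self) -> None:
        # Whole-table refresh: every cached partition is reloaded.
        self.full_refreshes += 1
        for part in self.partitions:
            self.partitions[part] += 1


def handle_partition_event(table, event_type, partition, incremental_enabled):
    """Apply one partition event; return which kind of refresh was done."""
    if event_type not in PARTITION_EVENTS:
        raise ValueError(f"unsupported event type: {event_type}")

    if event_type == "DROP_PARTITION":
        # A dropped partition just leaves the cache.
        table.partitions.pop(partition, None)
    elif incremental_enabled:
        # Fine-grained path: reload only the partition named by the event.
        table.partitions[partition] = table.partitions.get(partition, 0) + 1
    else:
        # Make sure the partition is known before the full reload below.
        table.partitions.setdefault(partition, 0)

    if not incremental_enabled:
        # Switch is off: fall back to refreshing the whole table.
        table.refresh_all()
        return "full"
    return "partition"
```

In this model the cost of the fine-grained path does not depend on the
number of partitions in the table, while the fallback path touches all of
them, which is the difference the proposal is aiming at.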



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
