Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/11/23 05:23:00 UTC
[jira] [Commented] (IMPALA-10923) Fine grained table refreshing at partition level events for transactional tables
[ https://issues.apache.org/jira/browse/IMPALA-10923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17447779#comment-17447779 ]
ASF subversion and git services commented on IMPALA-10923:
----------------------------------------------------------
Commit 097b10104f23e0927d5b21b43a79f6cc10425f59 in impala's branch refs/heads/master from Yu-Wen Lai
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=097b101 ]
IMPALA-10923: Fine grained table refreshing at partition level events
for transactional tables
To enable fine-grained table refreshing, there are three main changes
in this commit.
1. Maintain validWriteIdList in Catalogd for transactional tables. We
will keep track of write id changes for partitioned tables by
AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
2. Conduct partition level refreshing for transactional tables'
addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
3. Introduce a config
hms_event_incremental_refresh_transactional_table, which can switch
on/off the fine-grained table refreshing.
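
As a rough illustration of change 1, the sketch below models how a per-table
write id list could be kept current from the three event types. It is a
simplified stand-in and not Impala's or Hive's actual code: the class
WriteIdTracker, its method names, and the transaction/write ids used are all
hypothetical, and real valid write id lists carry more state than shown here.

```python
from dataclasses import dataclass, field


@dataclass
class WriteIdTracker:
    """Hypothetical, simplified model of a per-table valid write id list."""
    high_watermark: int = 0
    open_ids: set = field(default_factory=set)
    aborted_ids: set = field(default_factory=set)
    txn_to_write_id: dict = field(default_factory=dict)

    def on_alloc_write_id(self, txn_id, write_id):
        # A transaction was given a write id for this table: it starts as open.
        self.txn_to_write_id[txn_id] = write_id
        self.open_ids.add(write_id)
        self.high_watermark = max(self.high_watermark, write_id)

    def on_commit_txn(self, txn_id):
        # Committed write ids leave the open set and become readable.
        write_id = self.txn_to_write_id.pop(txn_id, None)
        if write_id is not None:
            self.open_ids.discard(write_id)

    def on_abort_txn(self, txn_id):
        # Aborted write ids leave the open set but must stay invisible.
        write_id = self.txn_to_write_id.pop(txn_id, None)
        if write_id is not None:
            self.open_ids.discard(write_id)
            self.aborted_ids.add(write_id)

    def is_valid(self, write_id):
        # Readable only if allocated, no longer open, and not aborted.
        return (write_id <= self.high_watermark
                and write_id not in self.open_ids
                and write_id not in self.aborted_ids)


tracker = WriteIdTracker()
tracker.on_alloc_write_id(txn_id=100, write_id=1)
tracker.on_alloc_write_id(txn_id=101, write_id=2)
tracker.on_commit_txn(100)
tracker.on_abort_txn(101)
```

With state like this kept per table, an event that touches a single partition
can be checked against the tracked write ids and only that partition reloaded,
which is the idea behind change 2.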
Performance Tests:
A simple test ran an insert into one partition of a partitioned ACID
table (50,000 partitions). Below is the time taken to refresh this
table when the resulting event was processed.
Storage    Before     After
=============================================================
S3         50 secs    50 msecs
local      3 secs     3 msecs
Change-Id: I6ba07c9a338a25614690e314335ee4b801486da9
Reviewed-on: http://gerrit.cloudera.org:8080/17858
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Sourabh Goyal <so...@cloudera.com>
Reviewed-by: Vihang Karajgaonkar <vi...@cloudera.com>
> Fine grained table refreshing at partition level events for transactional tables
> --------------------------------------------------------------------------------
>
> Key: IMPALA-10923
> URL: https://issues.apache.org/jira/browse/IMPALA-10923
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Reporter: Yu-Wen Lai
> Assignee: Yu-Wen Lai
> Priority: Major
>
> To keep transactional tables consistent, we currently refresh the whole table even when a change affects only a single partition. That is too expensive and can delay event processing.
> To enable fine-grained table refreshing, there are three main changes in this proposal.
> # Maintain validWriteIdList in Catalogd for transactional tables. We will track write id changes through AllocWriteIdEvents, CommitTxnEvents, and AbortTxnEvents.
> # Trigger partition level refreshing for addPartitionEvents, dropPartitionEvents, and AlterPartitionEvents.
> # Introduce a config *incremental_refresh_acid*, which can switch the fine-grained table refreshing on or off.