Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2023/09/29 16:20:00 UTC

[jira] [Assigned] (IMPALA-12463) Allow batching of non consecutive metastore events

     [ https://issues.apache.org/jira/browse/IMPALA-12463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell reassigned IMPALA-12463:
--------------------------------------

    Assignee: Joe McDonnell

> Allow batching of non consecutive metastore events
> --------------------------------------------------
>
>                 Key: IMPALA-12463
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12463
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Csaba Ringhofer
>            Assignee: Joe McDonnell
>            Priority: Major
>         Attachments: concurrent_metadata_load.py
>
>
> Currently Impala batches events such as partition inserts and partition creations only if all of the following hold:
> 1. the next event is for the same table as the previous one
> 2. the next event's id is the previous one's id + 1
> 3. the next event has the same type as the previous one
> (Condition 2 can be stricter than condition 1 if some events were filtered out between the two.)
> See https://github.com/apache/impala/blob/94f4f1d82461d8f71fbd0d2e9082aa29b5f53a89/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java#L315
> Another limitation is that only events fetched from HMS in the same poll can be merged. Currently 1000 events are polled at a time: https://github.com/apache/impala/blob/94f4f1d82461d8f71fbd0d2e9082aa29b5f53a89/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L218
> Making this limit configurable could also be useful.
> Event batching could be improved by merging into the current event all later events that modify the same table, unless they are "cut" off by:
> a. an event on the same table but with a different type
> b. a rename table event whose original or new table name is the table of the current event
> If such an event occurs, the events after it can only be merged into a newer event.
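
The two rules described above can be compared with a minimal Python sketch. It is an illustration only, not Impala's implementation: the Event class, the BATCHABLE_TYPES set, the event type strings, and both function names are hypothetical. It also assumes that only some event types are batchable, and it does not address the order in which the resulting batches would have to be applied.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical set of event types that may be merged; an assumption of this sketch.
BATCHABLE_TYPES = {"INSERT", "ADD_PARTITION"}


@dataclass
class Event:
    """Hypothetical stand-in for a metastore event; not Impala's class."""
    event_id: int
    table: str                        # fully qualified name, e.g. "db.t"
    event_type: str                   # e.g. "INSERT", "ADD_PARTITION"
    renamed_to: Optional[str] = None  # set only for rename table events


def can_batch_consecutive(prev: Event, nxt: Event) -> bool:
    """Current rule: same table, consecutive ids, same type."""
    return (prev.table == nxt.table
            and nxt.event_id == prev.event_id + 1
            and prev.event_type == nxt.event_type)


def batch_non_consecutive(events: List[Event]) -> List[List[Event]]:
    """Proposed rule: merge same-table, same-type events until a cut occurs."""
    batches: List[List[Event]] = []
    # Table name -> the most recent batch for that table that may still grow.
    open_batch: Dict[str, List[Event]] = {}
    for ev in events:
        if ev.renamed_to is not None:
            # Cut b: a rename closes batches on both the old and the new name.
            open_batch.pop(ev.table, None)
            open_batch.pop(ev.renamed_to, None)
            batches.append([ev])
            continue
        current = open_batch.get(ev.table)
        if (current is not None
                and ev.event_type in BATCHABLE_TYPES
                and current[0].event_type == ev.event_type):
            # Same table and type, no cut in between: ids need not be adjacent.
            current.append(ev)
            continue
        # Cut a (different type on the same table) or first event for the
        # table: later events can only merge into this newer batch.
        current = [ev]
        batches.append(current)
        open_batch[ev.table] = current
    return batches
```

For example, with INSERT events 1 and 3 on db.a and an INSERT event 2 on db.b in between, the consecutive rule keeps events 1 and 3 separate, while the sketch of the proposed rule merges them into one batch.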



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
