Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/05/05 05:42:00 UTC

[jira] [Commented] (IMPALA-10692) Inserting to ACID tables are broken in local_catalog mode with hms event polling

    [ https://issues.apache.org/jira/browse/IMPALA-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339429#comment-17339429 ] 

ASF subversion and git services commented on IMPALA-10692:
----------------------------------------------------------

Commit 603091ed772f3f82511fd8fec355fe9b0126933b in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=603091e ]

IMPALA-10692: Fix acid insert when event polling is enabled

IMPALA-10656 broke inserts to ACID tables when HMS event polling
is enabled. The new partitions created during the insert had not
yet been added to the catalog table when createInsertEvents was
called, because the table is reloaded only after firing the events
and committing the transaction.

The fix is to create the INSERT event based on the partition name
and the fileset alone for new partitions. Already existing partitions
need the Partition object as we add the event to the list of the
partition's in-flight events to detect self-events, but luckily new
partitions don't need self-event handling because:
- new partitions fire events only if the table is ACID
- ACID inserts don't fire any INSERT event visible to Impala, so
  it cannot cause an unnecessary metadata reload
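
The approach described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: the class
InsertEventSketch, its nested types and all field and method names are
hypothetical stand-ins, not Impala's actual CatalogOpExecutor or
HdfsTable code. It shows the partition lookup tolerating a miss, with
in-flight tracking applied only to partitions that are already loaded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: every type and method here is hypothetical, not Impala's code.
class InsertEventSketch {
  // Stand-in for a partition that is already loaded in the catalog table.
  static final class LoadedPartition {
    final String name;
    final List<Long> inFlightEventIds = new ArrayList<>();
    LoadedPartition(String name) { this.name = name; }
  }

  // Stand-in for an INSERT event: built from the partition name and file set alone.
  static final class InsertEvent {
    final long id;
    final String partitionName;
    final List<String> newFiles;
    InsertEvent(long id, String partitionName, List<String> newFiles) {
      this.id = id;
      this.partitionName = partitionName;
      this.newFiles = newFiles;
    }
  }

  final Map<String, LoadedPartition> loaded = new HashMap<>();
  long nextEventId = 1;

  void addLoadedPartition(String name) {
    loaded.put(name, new LoadedPartition(name));
  }

  List<InsertEvent> createInsertEvents(
      Map<String, List<String>> filesByPartition, boolean isAcidTable) {
    List<InsertEvent> events = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : filesByPartition.entrySet()) {
      // A missing entry means the partition was created by this insert and the
      // table has not been reloaded yet, so the lookup must not assert non-null.
      LoadedPartition partition = loaded.get(entry.getKey());
      if (partition == null && !isAcidTable) {
        continue;  // new partitions fire events only if the table is ACID
      }
      InsertEvent event =
          new InsertEvent(nextEventId++, entry.getKey(), entry.getValue());
      if (partition != null) {
        // Existing partitions record the event so it can later be
        // recognized as a self-event.
        partition.inFlightEventIds.add(event.id);
      }
      events.add(event);
    }
    return events;
  }

  public static void main(String[] args) {
    InsertEventSketch catalog = new InsertEventSketch();
    catalog.addLoadedPartition("p=0");
    Map<String, List<String>> files = new LinkedHashMap<>();
    files.put("p=0", Arrays.asList("file_a"));
    files.put("p=1", Arrays.asList("file_b"));  // p=1 is not in the catalog yet
    for (InsertEvent event : catalog.createInsertEvents(files, true)) {
      System.out.println(event.id + " " + event.partitionName + " " + event.newFiles);
    }
  }
}
```

In this sketch the event for the new partition carries no in-flight
bookkeeping, which matches the reasoning in the list above: no
self-event handling is needed for it.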

ACID inserts from Hive work differently: they always cause an
ALTER_TABLE or ALTER_PARTITION event, which is detected by Impala
and leads to a metadata reload. I think that this situation is hacky
at best because these events come before the COMMIT event (which is
currently ignored by Impala), so Impala may reload the table too
early (before the commit is finished).

Testing:
- added acid tables to TestEventProcessing.test_self_events

Change-Id: I8c2d0702232538a746410539ad55f87b7fde57e7
Reviewed-on: http://gerrit.cloudera.org:8080/17380
Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
Tested-by: Csaba Ringhofer <cs...@cloudera.com>


> Inserting to ACID tables are broken in local_catalog mode with hms event polling
> --------------------------------------------------------------------------------
>
>                 Key: IMPALA-10692
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10692
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.0
>            Reporter: Csaba Ringhofer
>            Priority: Critical
>             Fix For: Impala 4.0
>
>
> https://gerrit.cloudera.org/#/c/17313/ broke the following simple workflow:
> {code}
> bin/start-impala-cluster.py --catalogd_args="--hms_event_polling_interval_s=1 --catalog_topic_mode=minimal" --impalad_args="--use_local_catalog=1"
> set default_transactional_type=insert_only;
> create table tpa (i int) partitioned by (p int);
> insert into tpa partition (p=1) values(1);
> ERROR: NullPointerException: Invalid partition name: p=1
> {code}
> The issue only occurs when inserting to a new partition in a partitioned table.
> From the catalogd log:
> {code}
> I0430 19:18:03.091575 11521 jni-util.cc:286] java.lang.NullPointerException: Invalid partition name: p=1
>         at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
>         at org.apache.impala.catalog.HdfsTable.getPartitionsForNames(HdfsTable.java:1657)
>         at org.apache.impala.service.CatalogOpExecutor.createInsertEvents(CatalogOpExecutor.java:4921)
>         at org.apache.impala.service.CatalogOpExecutor.updateCatalog(CatalogOpExecutor.java:4830)
>         at org.apache.impala.service.JniCatalog.updateCatalog(JniCatalog.java:327)
> {code}


