You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2019/03/06 18:47:00 UTC
[jira] [Commented] (IMPALA-8278) Fix MetastoreEventsProcessorTest flakiness

    [ https://issues.apache.org/jira/browse/IMPALA-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786005#comment-16786005 ] 

ASF subversion and git services commented on IMPALA-8278:
---------------------------------------------------------

Commit eba08aa749211522631b16f87bf0f113d2470f29 in impala's branch refs/heads/master from Vihang Karajgaonkar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=eba08aa ]

IMPALA-8278 : Fix MetastoreEventsProcessorTest flakiness

MetastoreEventsProcessorTest#testEventProcessorFetchAfterHMSRestart
causes test flakiness because the event processor instantiated is not
stopped. This causes the test to have 2 instances of EventsProcessor
running concurrently during the test execution. Depending on the timing
of the poll operations all the events are processed twice and they
modify the state of test catalog concurrently causing race conditions
which lead to intermittent test failures.

The patch also fixes a race condition in testEventSyncFlagTurnedOnErrorCase
where the test assumes and all the tables will be available in catalog
immediately after reset() completes. This assumption is not true and
hence removed the flaky assertNull check in there.

Also, changes the MetastoreEventsProcessorTest to make it easier to
debug through logs. Each individual test case uses a different table
name so that logs pertaining to a test case can be easily grepped.

Testing done:
1. Confirmed that two event processors are running by manually adding
identifiers to each event processing and log them when they receive the
events before this patch. After patch same test does not reveal multiple
event processors running.
2. Ran the test multiple times locally to confirm that test works
everytime.
3. Applied patch for IMPALA-8273 along with this patch and ran
full-tests using jenkins job and see no failure

Change-Id: Ia12b1d048d8c8c59fa4c587bf65772ff194d314d
Reviewed-on: http://gerrit.cloudera.org:8080/12656
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Fix MetastoreEventsProcessorTest flakiness
> ------------------------------------------
>
>                 Key: IMPALA-8278
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8278
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>
> {{testEventProcessorFetchAfterHMSRestart}} test case in {{MetastoreEventsProcessorTest}} causes flakiness because it creates a new event processor pointing to the same the catalog instance. This means that all the events generated are now being processed by two events processor instances and they both try to modify the state of catalogd causing race conditions. The failures vary and depend a lot on the timing. I see the following exception which is related to this issue.
> Easiest way to figure out if this is a problem is to look into FeSupport logs of the test to confirm if a event ID is being processed twice (i.e you see to exactly similar logs for a given event id).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org