Posted to dev@atlas.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2016/05/10 13:00:15 UTC

[jira] [Updated] (ATLAS-629) Kafka messages in ATLAS_HOOK might be lost in HA mode at the instant of failover.

     [ https://issues.apache.org/jira/browse/ATLAS-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated ATLAS-629:
-----------------------------------
    Attachment: ATLAS-629.patch

The attached patch makes the following changes:

* Disables auto commit in Kafka consumer properties.
* Modifies the creation of Kafka consumer objects such as {{ConsumerConnector}}, {{KafkaConsumer}} and {{HookConsumer}} to enable manual commit.
* Modifies {{HookConsumer}} to commit after a message is processed.
* Adds/modifies unit tests to accommodate changes.
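
As a rough illustration of the first change (not the patch itself), disabling auto commit could look like the sketch below. It assumes the property names of Kafka's older high-level consumer, since the patch touches {{ConsumerConnector}}; the newer consumer API spells the key {{enable.auto.commit}}. The class name, method name, host and group id are hypothetical placeholders.

```java
import java.util.Properties;

class ManualCommitConfigSketch {

    // Builds consumer properties with auto commit turned off.
    // Key names are assumed to be those of the old high-level consumer.
    static Properties consumerProperties(String zookeeperConnect, String groupId) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zookeeperConnect);
        props.setProperty("group.id", groupId);
        // No periodic background commits: offsets advance only when the
        // application commits explicitly.
        props.setProperty("auto.commit.enable", "false");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder connection values, for illustration only.
        Properties props = consumerProperties("localhost:2181", "atlas");
        System.out.println(props.getProperty("auto.commit.enable"));
    }
}
```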

One thing to note: I am calling commit even if an Exception is handled while processing the message. This is so that we don't get stuck on one bad message and are able to make progress. This choice is debatable and I am open to discussion. The trade-off is between making progress on the other imported objects and rigidly favoring correctness. In other places, like ATLAS-602, we have made similar choices, favoring progress and letting the server logic figure out what to do.
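
To make the commit-even-on-exception behavior concrete, here is a minimal, self-contained sketch. It does not use the Kafka API or the actual {{HookConsumer}} code; {{InMemoryConsumer}} and {{consumeAll}} are hypothetical stand-ins for a consumer with auto commit disabled and for the processing loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class CommitAfterProcessSketch {

    // Hypothetical stand-in for a Kafka consumer with auto commit disabled.
    static class InMemoryConsumer {
        private final List<String> messages;
        private int position = 0;        // next message to hand out
        private int committedOffset = 0; // where a restarted consumer would resume

        InMemoryConsumer(List<String> messages) {
            this.messages = messages;
        }

        boolean hasNext() {
            return position < messages.size();
        }

        String next() {
            return messages.get(position++);
        }

        void commit() {
            committedOffset = position;
        }

        int committedOffset() {
            return committedOffset;
        }
    }

    // Processes every message and commits after each one, whether or not
    // the handler threw. Returns the messages whose processing failed.
    static List<String> consumeAll(InMemoryConsumer consumer, Consumer<String> handler) {
        List<String> failed = new ArrayList<>();
        while (consumer.hasNext()) {
            String message = consumer.next();
            try {
                handler.accept(message);
            } catch (RuntimeException e) {
                // A bad message is recorded but does not block later messages.
                failed.add(message);
            } finally {
                // Commit only after the message has been handled.
                consumer.commit();
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        InMemoryConsumer consumer =
                new InMemoryConsumer(Arrays.asList("table_1", "bad", "table_2"));
        List<String> failed = consumeAll(consumer, message -> {
            if (message.equals("bad")) {
                throw new IllegalArgumentException("cannot process " + message);
            }
        });
        System.out.println("failed=" + failed
                + " committedOffset=" + consumer.committedOffset());
    }
}
```

Because the commit happens only after the handler returns or throws, an instance killed mid-processing leaves the offset uncommitted, so the next active instance would re-read that message rather than skip it. The cost is that a message may be processed more than once.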



> Kafka messages in ATLAS_HOOK might be lost in HA mode at the instant of failover.
> ---------------------------------------------------------------------------------
>
>                 Key: ATLAS-629
>                 URL: https://issues.apache.org/jira/browse/ATLAS-629
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.7-incubating
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>            Priority: Critical
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-629.patch
>
>
> Write data to Kafka continuously from the Hive hook, for example with a script that constantly creates tables. Bring down the active instance with kill -9 and ensure that writes continue after the passive instance becomes active. The expectation is that the number of tables created matches the number of tables in Atlas.
> In one test, 180 tables were written while switching over 6 times from one instance to another. One table of the lot was lost, i.e. 179 tables were created in Atlas and 1 did not get in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)