Posted to dev@atlas.apache.org by "Ashutosh Mestry (Jira)" <ji...@apache.org> on 2021/03/15 17:53:00 UTC

[jira] [Created] (ATLAS-4204) Hive Hook: Improve HS2 Message Sending

Ashutosh Mestry created ATLAS-4204:
--------------------------------------

             Summary: Hive Hook: Improve HS2 Message Sending
                 Key: ATLAS-4204
                 URL: https://issues.apache.org/jira/browse/ATLAS-4204
             Project: Atlas
          Issue Type: Improvement
            Reporter: Ashutosh Mestry
            Assignee: Ashutosh Mestry


*Background*

The HiveServer2 (HS2) hook for Atlas sends notification messages for both metadata (DDL operations) and lineage (DML operations).

The Hive Metastore (HMS) hook already sends metadata information to Atlas. All of its messages describe DDL operations.

So duplicate messages about object updates are sent to Atlas.

Atlas processes these messages like any other.

This adds processing time and increases message volume. Atlas may also be updated with incorrect data if the messages from HMS and HS2 arrive in a different order than they were produced.

*Solution*

With this improvement, the HS2 hook will send only lineage messages. All DDL (schema definition) messages will continue to be sent from the HMS hook (no change here).

This will also reduce the volume of messages sent to Atlas from HiveServer2 and will help improve performance by avoiding the processing of duplicate messages.

The improvement is enabled through a configuration parameter, so existing behavior continues as is unless the parameter is set.
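
The filtering described above could look roughly like the sketch below. It is only an illustration of the idea, not the Atlas hook code: the property name example.hs2.hook.lineage.only, the Message class and the Kind enum are all hypothetical names made up for this example. It assumes the flag is off unless set, so that the HS2 hook keeps sending both kinds of messages as it does today.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Illustrative sketch only; this is not the Atlas Hive hook implementation. */
final class Hs2MessageFilterSketch {
    // Hypothetical property name, invented for this example.
    static final String LINEAGE_ONLY_PROPERTY = "example.hs2.hook.lineage.only";

    // Hypothetical classification of a notification message.
    enum Kind { DDL, LINEAGE }

    // Hypothetical stand-in for a notification message produced by the HS2 hook.
    static final class Message {
        final Kind kind;
        final String description;

        Message(Kind kind, String description) {
            this.kind = kind;
            this.description = description;
        }
    }

    // The flag is treated as false when absent, so existing behavior is kept.
    static boolean isLineageOnly(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(LINEAGE_ONLY_PROPERTY, "false"));
    }

    // Returns the messages the HS2 hook would send under the given configuration.
    static List<Message> selectMessagesToSend(List<Message> candidates, Properties conf) {
        if (!isLineageOnly(conf)) {
            return new ArrayList<>(candidates); // existing behavior: send everything
        }
        List<Message> selected = new ArrayList<>();
        for (Message m : candidates) {
            if (m.kind == Kind.LINEAGE) {
                selected.add(m); // DDL is left to the HMS hook
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Message> candidates = Arrays.asList(
                new Message(Kind.DDL, "create table"),
                new Message(Kind.LINEAGE, "insert into ... select"));

        Properties conf = new Properties();
        System.out.println("flag unset: " + selectMessagesToSend(candidates, conf).size());

        conf.setProperty(LINEAGE_ONLY_PROPERTY, "true");
        System.out.println("flag set:   " + selectMessagesToSend(candidates, conf).size());
    }
}
```

With the flag unset both messages are selected; with it set to true only the lineage message remains, and the DDL message is expected to reach Atlas through the HMS hook alone.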



--
This message was sent by Atlassian Jira
(v8.3.4#803005)