Posted to dev@atlas.apache.org by "Ashutosh Mestry (Jira)" <ji...@apache.org> on 2021/03/15 18:13:00 UTC
[jira] [Updated] (ATLAS-4204) Hive Hook: Improve HS2 Message Sending
[ https://issues.apache.org/jira/browse/ATLAS-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ashutosh Mestry updated ATLAS-4204:
-----------------------------------
Component/s: hive-integration
> Hive Hook: Improve HS2 Message Sending
> --------------------------------------
>
> Key: ATLAS-4204
> URL: https://issues.apache.org/jira/browse/ATLAS-4204
> Project: Atlas
> Issue Type: Improvement
> Components: hive-integration
> Reporter: Ashutosh Mestry
> Assignee: Ashutosh Mestry
> Priority: Major
>
> *Background*
> The HiveServer2 (HS2) hook for Atlas sends notification messages for both metadata (DDL operations) and lineage (DML operations).
> The Hive Metastore (HMS) hook already sends metadata information to Atlas. These messages all describe DDL operations.
> As a result, duplicate messages about object updates are sent to Atlas.
> Atlas processes these messages like any other.
> This adds processing time and increases message volume. Incorrect data could also be written to Atlas if the messages from HMS and HS2 arrive out of sequence.
> *Solution*
> With this improvement, the HS2 hook will send only lineage messages. All DDL (schema definition) messages will continue to be sent from the HMS hook (no change here).
> This will also reduce the volume of messages sent to Atlas from HiveServer2 and will help improve performance by avoiding the processing of duplicate messages.
> The improvement is enabled via a configuration parameter. That way, existing behavior continues as is.
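
The filtering described above can be illustrated with a minimal sketch. This is not the Atlas hook code: the `HookMessage` type, the "DDL"/"DML" labels, and the `lineage_only` flag are hypothetical stand-ins, since the issue does not name the actual configuration parameter or message classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookMessage:
    # Hypothetical message type; "DDL" marks metadata, "DML" marks lineage.
    kind: str
    payload: str


def filter_hs2_messages(messages, lineage_only=False):
    """Return the messages an HS2 hook would send under this sketch.

    lineage_only is a hypothetical stand-in for the configuration
    parameter. The default (False) keeps the existing behavior and
    sends every message.
    """
    if not lineage_only:
        return list(messages)
    # With the flag set, drop DDL messages; the HMS hook already sends them.
    return [m for m in messages if m.kind == "DML"]
```

With the flag left at its default, both DDL and DML messages pass through unchanged. With it set, only the DML (lineage) messages remain.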
--
This message was sent by Atlassian Jira
(v8.3.4#803005)