Posted to issues@impala.apache.org by "Fredy Wijaya (JIRA)" <ji...@apache.org> on 2019/05/29 21:55:00 UTC

[jira] [Resolved] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers

     [ https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fredy Wijaya resolved IMPALA-8473.
----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.3.0

> Refactor lineage publication mechanism to allow for different consumers
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-8473
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8473
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend, Frontend
>            Reporter: radford nguyen
>            Assignee: radford nguyen
>            Priority: Critical
>             Fix For: Impala 3.3.0
>
>         Attachments: ImpalaPostExecHook-infra.patch
>
>
> The impetus for this change is to allow lineage to be consumed by Atlas via Kafka.
> h3. Design Proposal
> Implement a plugin approach (similar to {{authorization_provider}}) for consuming query event hooks, where downstream users can provide their own hook implementations as runtime dependencies.
> Keep, but deprecate, the existing lineage event file writing.
> [~madhan@apache.org] has provided a frontend (fe) patch (attached) with a suggested mechanism for registering multiple hooks with the fe.  The backend (be) would invoke the hooks at appropriate places, e.g. [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466].
> The hooks should all be executed asynchronously. The current thinking is that this execution should happen in the fe, since the be does not know which hooks are registered.  In other words, the {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably use a thread-pool executor service (or something similar) to execute all hooks in parallel and without blocking, returning to the be as soon as possible.
>  
> h3. Code Review
> [https://gerrit.cloudera.org/#/c/13352/]
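
The asynchronous dispatch described in the design proposal could look roughly like the sketch below. It is not taken from the attached patch or from the Impala code base: the names PostExecHook, AsyncHookRunner, and postQueryExecute are hypothetical, and the lineage payload is assumed to be a plain string. It only illustrates the idea of handing each registered hook to a thread pool so that the calling thread returns immediately.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical hook contract, for illustration only; not Impala's actual interface. */
interface PostExecHook {
  void postQueryExecute(String lineagePayload) throws Exception;
}

/** Hypothetical runner that executes every registered hook on a shared thread pool. */
class AsyncHookRunner implements AutoCloseable {
  private final List<PostExecHook> hooks = new CopyOnWriteArrayList<>();
  private final ExecutorService pool;

  AsyncHookRunner(int threads) {
    this.pool = Executors.newFixedThreadPool(threads);
  }

  void register(PostExecHook hook) {
    hooks.add(hook);
  }

  /** Submits one task per hook and returns without waiting for any of them. */
  void executeHooks(String lineagePayload) {
    for (PostExecHook hook : hooks) {
      pool.execute(() -> {
        try {
          hook.postQueryExecute(lineagePayload);
        } catch (Exception e) {
          // A failing hook is logged and isolated so it cannot affect other hooks.
          System.err.println("hook failed: " + e);
        }
      });
    }
  }

  /** Stops accepting new work and waits briefly for in-flight hooks to finish. */
  @Override
  public void close() throws InterruptedException {
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
```

In this sketch a caller would register its hooks once at startup and then call executeHooks with the lineage payload after each query; the call only enqueues work, so the caller is not blocked by slow or failing consumers.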



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)