Posted to issues@nifi.apache.org by "Matt Burgess (JIRA)" <ji...@apache.org> on 2017/12/14 22:21:00 UTC

[jira] [Assigned] (NIFI-4696) Support concurrent tasks in PutHiveStreaming

     [ https://issues.apache.org/jira/browse/NIFI-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Burgess reassigned NIFI-4696:
----------------------------------

    Assignee: Matt Burgess

> Support concurrent tasks in PutHiveStreaming
> --------------------------------------------
>
>                 Key: NIFI-4696
>                 URL: https://issues.apache.org/jira/browse/NIFI-4696
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>
> Currently, PutHiveStreaming (PHS) can only support a single task at a time. Before NIFI-4342, that meant each target table would need its own PHS instance, which can be cumbersome with large numbers of tables. After NIFI-4342, Expression Language could be used for SDLC purposes (e.g., database/table changes between development and production).
> However, it would be nice to be able to support at least database/table names using flow file attributes, and also to support multiple tasks to handle them concurrently. Due to the nature of PHS and the Streaming Ingest APIs (and implementation), it is likely not prudent to allow two tasks to write to the same table and partition at the same time.
> I propose adding flow file attribute EL evaluation where prudent, and allowing per-table concurrency in PHS. A thread will attempt to get a lock on a table, and if it cannot, will roll back and return.
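
The per-table locking proposed above could be sketched roughly as follows. This is only an illustration of the "try to lock the table, otherwise back off" idea, not the processor's actual implementation: the class name TableLocks and the method tryRun are hypothetical, and the key format (database + "." + table) is an assumption.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of NiFi: tracks which tables are currently being written.
final class TableLocks {
    // Thread-safe set of "database.table" keys that some task is writing to right now.
    private final Set<String> busyTables = ConcurrentHashMap.newKeySet();

    /**
     * Runs the given work only if no other task currently holds the table.
     * Returns false without running the work when the table is busy; in a
     * processor, the caller would then roll back its session and return.
     */
    boolean tryRun(String database, String table, Runnable work) {
        String key = database + "." + table; // assumed key format
        if (!busyTables.add(key)) {
            return false; // add() returns false when the key is already present
        }
        try {
            work.run();
            return true;
        } finally {
            busyTables.remove(key); // always release, even if the work throws
        }
    }
}
```

Because the check and the claim happen in a single atomic add() call, two tasks cannot both acquire the same table, while tasks targeting different tables proceed independently. A sketch like this locks at table granularity only; it does not distinguish partitions.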


