Posted to issues@nifi.apache.org by "Joe Witt (Jira)" <ji...@apache.org> on 2020/09/04 18:39:00 UTC

[jira] [Commented] (NIFI-7791) Add PutClickHouse Processor for Writing Large Streams

    [ https://issues.apache.org/jira/browse/NIFI-7791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190881#comment-17190881 ] 

Joe Witt commented on NIFI-7791:
--------------------------------

Ricky - strongly recommend you approach this using the record reader/writer construct rather than specific formats.  You could have just a RecordReader and then let the user indicate how to serialize for writes to ClickHouse.  To the extent possible, we'd like to avoid adding more processors that are aware of, or dependent on, specific formats.

> Add PutClickHouse Processor for Writing Large Streams
> -----------------------------------------------------
>
>                 Key: NIFI-7791
>                 URL: https://issues.apache.org/jira/browse/NIFI-7791
>             Project: Apache NiFi
>          Issue Type: New Feature
>            Reporter: Ricky Saltzer
>            Assignee: Ricky Saltzer
>            Priority: Minor
>
> ClickHouse supports streaming a number of file formats directly using their JDBC (superset) library. Oftentimes it's much more convenient to stream the contents of a file directly to ClickHouse, rather than bothering to process the data in NiFi and then using the native JDBC processor.
> One workaround is to just use PutHTTP to stream the file directly to ClickHouse using its HTTP endpoint. However, this can get a bit tedious, especially if you need to pass credentials as part of the HTTP method call.
> I'm creating this Jira to support creating a simple PutClickHouse processor that can stream a FlowFile directly to ClickHouse with the following features:
>  * CSV, CSVWithNames, TSV and JSONEachRow
>  * Ability to modify column name ordering
>  * Custom delimiters for CSV and TSV
>  * SSL support (with and without strict mode)
>  * Multiple hosts (comma separated) to utilize the {{BalancedClickhouseDataSource}}
>  * Username and Password
> I'm currently wrapping up a PR for this. I wrote it in Kotlin, which uses a processor-scope Maven plugin. If there's enough objection, it can be rewritten in native Java.
> +[~joewitt] since I spoke with him regarding this a while back.
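
For context on the HTTP workaround mentioned in the description, here is a minimal sketch of what such a request might look like when built by hand with the JDK's java.net.http client (Java 11+). It is not the code from the PR. The host, table name, and credentials are hypothetical placeholders. It assumes ClickHouse's HTTP interface on its default port 8123, with the INSERT statement passed in the "query" URL parameter and credentials passed in the X-ClickHouse-User and X-ClickHouse-Key headers.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class ClickHouseHttpInsert {

    // Builds the endpoint URI. The INSERT statement travels in the "query"
    // parameter; the row data itself goes in the POST body.
    static URI insertUri(String baseUrl, String table, String format) {
        String query = "INSERT INTO " + table + " FORMAT " + format;
        // URLEncoder encodes spaces as '+'; switch to %20 for a plain URL encoding.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return URI.create(baseUrl + "/?query=" + encoded);
    }

    // Builds (but does not send) a POST request carrying the credentials as headers.
    static HttpRequest insertRequest(URI uri, String user, String password,
                                     HttpRequest.BodyPublisher body) {
        return HttpRequest.newBuilder(uri)
                .header("X-ClickHouse-User", user)
                .header("X-ClickHouse-Key", password)
                .POST(body)
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical host, table, and credentials, for illustration only.
        URI uri = insertUri("http://localhost:8123", "default.events", "CSVWithNames");
        HttpRequest request = insertRequest(uri, "nifi", "secret",
                HttpRequest.BodyPublishers.ofString("id,name\n1,alpha\n"));
        System.out.println(request.method() + " " + request.uri());
        // To actually send it against a running server:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

For a large file, HttpRequest.BodyPublishers.ofFile(path) could replace the in-memory string so the content is streamed rather than loaded up front. The tedium the description refers to shows up here: the format, table, and credentials all have to be assembled per request.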



--
This message was sent by Atlassian Jira
(v8.3.4#803005)