Posted to issues@nifi.apache.org by "Joseph Witt (Jira)" <ji...@apache.org> on 2019/09/20 03:13:00 UTC

[jira] [Updated] (NIFI-6313) PutGCSObject performance seems slow

     [ https://issues.apache.org/jira/browse/NIFI-6313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Witt updated NIFI-6313:
------------------------------
    Fix Version/s:     (was: 1.10.0)

> PutGCSObject performance seems slow
> -----------------------------------
>
>                 Key: NIFI-6313
>                 URL: https://issues.apache.org/jira/browse/NIFI-6313
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework, Extensions
>    Affects Versions: 1.9.2
>            Reporter: Jasper Knulst
>            Priority: Major
>
> The PutGCSObject processor, which transfers files to a Google Cloud Platform bucket, has poor transfer speeds.
> It is hard to put firm figures on the throughput, as it seems to depend on:
> - Network location of the NiFi node (inside Google Cloud or not)
> - Network bandwidth
> - NiFi node specs
>  
> After performing benchmarks on multiple NiFi clusters (ranging from test setups to production sites), the throughput can range from 8 MB/s to 800 MB/s.
> "Slow" here means slow in comparison to gsutil. If you run gsutil directly from the NiFi node, throughput is 5 to 8 times higher without 'parallel_composite_upload' and up to 16 times higher with 'parallel_composite_upload'.
>  
> The Google Cloud Java API on which NiFi's GCS processors are built does not have the same optimizations as gsutil and may no longer be actively supported or maintained. The Storage.create method is even deprecated.
> Still, there must be ways to speed up transfers to GCS by implementing parallel composite uploads in chunks and adding configuration options to the GCS processors.
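
The chunking step behind a parallel composite upload could be sketched roughly as follows. This is only an illustration, not NiFi's or gsutil's actual implementation: the class CompositeUploadPlan, its Chunk type and the plan method are hypothetical names, and the sketch only computes byte ranges. It assumes the Cloud Storage limit of 32 source objects per compose request, so the chunk size grows when the preferred size would produce more components than that. Uploading each range as a temporary object in parallel and composing the results (for example through the Java client's Storage.compose) is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plans the byte ranges for a parallel composite upload. */
final class CompositeUploadPlan {

    /** Assumed limit: a single compose request accepts at most 32 source objects. */
    static final int MAX_COMPONENTS = 32;

    /** One component: the byte range [offset, offset + length) of the source file. */
    static final class Chunk {
        final long offset;
        final long length;

        Chunk(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Splits a file of fileSize bytes into consecutive chunks of roughly
     * preferredChunkSize bytes, enlarging the chunk size when needed so that
     * the number of chunks never exceeds MAX_COMPONENTS.
     */
    static List<Chunk> plan(long fileSize, long preferredChunkSize) {
        if (fileSize < 0 || preferredChunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and preferredChunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        if (fileSize == 0) {
            return chunks; // nothing to split
        }
        // Smallest chunk size that still fits the file into MAX_COMPONENTS pieces (ceiling division).
        long minChunkSize = (fileSize + MAX_COMPONENTS - 1) / MAX_COMPONENTS;
        long chunkSize = Math.max(preferredChunkSize, minChunkSize);
        for (long offset = 0; offset < fileSize; offset += chunkSize) {
            // The last chunk may be shorter than chunkSize.
            chunks.add(new Chunk(offset, Math.min(chunkSize, fileSize - offset)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // Example: a 1000 MiB file with a preferred chunk size of 150 MiB.
        for (Chunk c : plan(1000 * mib, 150 * mib)) {
            System.out.println("offset=" + c.offset + " length=" + c.length);
        }
    }
}
```

A processor property for the preferred chunk size, similar in spirit to gsutil's composite upload settings, would be one possible way to expose this as a configuration option.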



--
This message was sent by Atlassian Jira
(v8.3.4#803005)