Posted to users@nifi.apache.org by Gino Lisignoli <gl...@gmail.com> on 2017/09/07 03:15:11 UTC

Slow FTP and SFTP nifi transfer rates

I have this weird issue with PutFTP and PutSFTP transfer rates.

What I am seeing is that no matter what files I transfer from one server to
another over a single connection, the maximum rates I can send are 300Mbps
for PutFTP and 1Gbps for PutSFTP.

The sending NiFi is installed on CentOS 7, running on a Dell R730 with 190GB
RAM, 16 cores @ 2.4GHz and 4x10Gb NICs bonded. The sending NiFi has its
content repository on a ramdisk, and the receiving server is receiving to a
ramdisk (for testing, to take disk I/O out of the equation).

When I do an ftp send manually (without NiFi) with mput I get ftp rates of
~8Gbps and sftp rates of 2.2Gbps (which seems slow anyway).

I would have expected similar transfer rates with NiFi.

Is there any way to work out why these rates are so much slower, but also
so consistent? I'm using NiFi 1.3.0.

Re: Slow FTP and SFTP nifi transfer rates

Posted by Koji Kawamura <ij...@gmail.com>.
Thanks Gino for confirming that.

I've submitted a JIRA and PR.
https://issues.apache.org/jira/browse/NIFI-4375

I tried to find something that could improve PutSFTP, but to no avail so far.
NIFI-4375 only addresses the PutFTP processor.


Re: Slow FTP and SFTP nifi transfer rates

Posted by Gino Lisignoli <gl...@gmail.com>.
Just built 1.4.0-SNAPSHOT and added in client.setBufferSize(16 * 1024);
This fixed my problem straight away! Hope it makes it into 1.4.0.


Re: Slow FTP and SFTP nifi transfer rates

Posted by Joe Witt <jo...@gmail.com>.
Nope. That would be specific to the processors using commons-net.

Nice work, Koji and Gino!



Re: Slow FTP and SFTP nifi transfer rates

Posted by Gino Lisignoli <gl...@gmail.com>.
Wow, that sounds promising! Would that also be the same for any other
get/put processors?


Re: Slow FTP and SFTP nifi transfer rates

Posted by Koji Kawamura <ij...@gmail.com>.
Hi,

Just a quick update. I've tested
commons-net-3.3::org.apache.commons.net.ftp.FTPClient without NiFi
code.
Here is the test code I used.
https://gist.github.com/ijokarumawak/f5a329e53901bf2be7c19aa531094abd

NiFi doesn't currently set the FTPClient buffer size, and the default is only 1KB.
Time to send a 10MB file:

# BufferSize = 1KB (default)
about 8 sec

# BufferSize = 16KB
about 300 ms

I'm going to create a JIRA to add a processor property to specify the buffer size.
I will also test SFTP.
Thanks again for highlighting the issue!

Koji
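
A minimal sketch of why the buffer size matters, using only the JDK (no
commons-net, no FTP server). It copies an in-memory payload in chunks of a
given size and counts the write calls. The class and method names are
hypothetical, and the assumption that each chunk becomes one write on the data
connection is a simplification, not a description of the actual commons-net
copy loop. In the thread the real change was client.setBufferSize(16 * 1024)
on the commons-net FTPClient.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration: counts the write() calls a chunked copy makes.
final class BufferSizeDemo {

    // Sink that discards the data but remembers how often it was written to.
    static final class CountingSink extends OutputStream {
        long writes = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            writes++;
            bytes++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            writes++;
            bytes += len;
        }
    }

    // Copies 'in' to 'out' through a buffer of bufferSize bytes; returns bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Number of write calls needed to push payloadBytes through the given buffer size.
    static long countWrites(int payloadBytes, int bufferSize) {
        CountingSink sink = new CountingSink();
        try {
            copy(new ByteArrayInputStream(new byte[payloadBytes]), sink, bufferSize);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.writes;
    }

    public static void main(String[] args) {
        int tenMb = 10 * 1024 * 1024;
        System.out.println("1KB buffer:  " + countWrites(tenMb, 1024) + " writes");
        System.out.println("16KB buffer: " + countWrites(tenMb, 16 * 1024) + " writes");
    }
}
```

For a 10MB payload this comes to 10240 writes with a 1KB buffer and 640 with a
16KB buffer. The fixed per-write cost is paid 16 times less often, which is
consistent with the direction of the timings above, though it does not by
itself predict them.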


Re: Slow FTP and SFTP nifi transfer rates

Posted by Koji Kawamura <ij...@gmail.com>.
Hi,

Thanks for clarifying that the number of files is not significant.
I looked at the PutFTP and FTPTransfer source code, and found that it
makes a few calls to the FTP server in addition to sending a file:

1. Send the file as a temporary file
2. Update the modification time, if 'Last Modified Time' is set
3. chmod, if 'Permissions' is set
4. Rename the temporary file
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/util/FTPTransfer.java#L379

PutSFTP and SFTPTransfer additionally do the following:
5. chown if 'Remote Owner' is set
6. chgrp if 'Remote Group' is set

I wonder if those additional invocations add more latency.
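
The sequence of calls listed above can be sketched against the local file
system with java.nio.file. This is only an analogue, not the FTPTransfer code:
the class name, the dot-prefixed temporary name and the choice to overwrite an
existing target are assumptions made for the illustration. Over (S)FTP each
optional step would presumably be a separate request to the server, which is
where the extra latency would come from.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.FileTime;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical local-file analogue of "send as temporary file, then rename".
final class TempThenRenameDemo {

    // Writes content under a temporary name, applies optional metadata, then renames.
    // Returns the number of file-system operations performed.
    static int deliver(Path dir, String name, byte[] content,
                       FileTime lastModified, String permissions) {
        int operations = 0;
        try {
            Path temp = dir.resolve("." + name);   // assumed temporary naming scheme
            Path target = dir.resolve(name);

            Files.write(temp, content);                        // 1. send as temporary file
            operations++;
            if (lastModified != null) {                        // 2. modification time
                Files.setLastModifiedTime(temp, lastModified);
                operations++;
            }
            if (permissions != null                            // 3. chmod (POSIX only)
                    && Files.getFileStore(dir).supportsFileAttributeView("posix")) {
                Files.setPosixFilePermissions(temp, PosixFilePermissions.fromString(permissions));
                operations++;
            }
            // 4. rename; overwriting an existing target is a choice made for this sketch
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
            operations++;
            return operations;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Delivers one file into a fresh temporary directory and lists what is left there.
    static List<String> runDemo() {
        try {
            Path dir = Files.createTempDirectory("put-demo");
            deliver(dir, "report.txt", "hello".getBytes(StandardCharsets.UTF_8),
                    FileTime.fromMillis(System.currentTimeMillis()), "rw-r--r--");
            try (Stream<Path> files = Files.list(dir)) {
                return files.map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

After runDemo() only report.txt remains in the directory; the temporary name is
gone. A reader of the target directory therefore never sees a half-written
file, at the price of up to three extra operations per file.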

Also, it'd be helpful if you could write simple Java code using the
underlying (S)FTP client libraries without the NiFi layer, to investigate
whether the NiFi implementation can be improved or the performance difference
comes from the library implementation.

commons-net-3.3::org.apache.commons.net.ftp.FTPClient for FTP
and
jsch-0.1.54::com.jcraft.jsch.ChannelSftp for SFTP


I will try to do that on my end when I have time, but it'd be very
helpful if you could do that since you already have a testing environment
and baseline metrics.

Thanks!
Koji



Re: Slow FTP and SFTP nifi transfer rates

Posted by Gino Lisignoli <gl...@gmail.com>.
Hi

I monitor the send rates using collectd and grafana. It doesn't seem to
matter if I send 10,000 10MB files or 100 1GB files, the maximum throughput
rate of NiFi PutFTP and PutSFTP remains the same: 300Mbps and 1Gbps
respectively.

As mentioned above, the weird thing is that when I send files through ftp and
sftp (without NiFi) the rates are much better.

It's really odd that the rates are significantly slower in NiFi.
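
To put the rates in this thread side by side, here is a small worked
calculation. The helper is hypothetical and assumes decimal units (1MB = 10^6
bytes, 1Gbps = 10^9 bits per second) and a perfectly sustained rate. Both test
workloads add up to the same 100GB, so only the rate changes the result.

```java
// Hypothetical helper: how long a fixed volume takes at a given sustained rate.
final class TransferTimeDemo {

    // Seconds needed to move totalBytes at rateBitsPerSecond (decimal units).
    static double seconds(long totalBytes, double rateBitsPerSecond) {
        return totalBytes * 8.0 / rateBitsPerSecond;
    }

    public static void main(String[] args) {
        long manySmall = 10_000L * 10_000_000L;   // 10,000 files of 10MB
        long fewLarge = 100L * 1_000_000_000L;    // 100 files of 1GB
        System.out.println("same volume: " + (manySmall == fewLarge));

        // Rates reported in the thread: PutFTP, PutSFTP, manual sftp, manual ftp.
        double[] ratesBps = {300e6, 1e9, 2.2e9, 8e9};
        for (double rate : ratesBps) {
            System.out.printf("%.1f Gbps -> %.0f s%n", rate / 1e9, seconds(manySmall, rate));
        }
    }
}
```

Under these assumptions 100GB takes roughly 2667 s (about 44 minutes) at
300Mbps, 800 s at 1Gbps and 100 s at 8Gbps.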


Re: Slow FTP and SFTP nifi transfer rates

Posted by Koji Kawamura <ij...@gmail.com>.
Hello Gino,

Thanks for sharing your findings on FTP performance.

How did you measure the send rate from NiFi to your FTP server?

Sending multiple FlowFiles would provide less throughput compared to
sending one big FlowFile, as PutFTP and PutSFTP make a connection for
each incoming FlowFile. The overhead of establishing a connection each
time might explain the performance difference you see compared with the
mput command.

Those processors can decide which FTP server to use based on incoming
FlowFiles' attributes when NiFi Expression Language is used.

If that's the case, there is some room for performance improvement by
keeping the underlying FTP(S) client instance so that it can be reused
across multiple onTrigger() calls.
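
The reuse idea can be sketched as a small cache of clients keyed by host. This
is a hypothetical illustration, not NiFi code: FakeClient stands in for a real
(S)FTP client, and the cache is not thread-safe, which a real processor running
concurrent tasks would have to address.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: keep one client per host instead of reconnecting per file.
final class ClientReuseDemo {

    // Stand-in for an (S)FTP client; a real one would open a socket and log in.
    static final class FakeClient {
        final String host;
        int filesSent = 0;

        FakeClient(String host) {
            this.host = host;
        }

        void send(String fileName) {
            filesSent++;
        }
    }

    private final Map<String, FakeClient> clientsByHost = new HashMap<>();
    private int connectionsOpened = 0;

    // Returns the cached client for the host, "connecting" only on first use.
    FakeClient clientFor(String host) {
        return clientsByHost.computeIfAbsent(host, h -> {
            connectionsOpened++;
            return new FakeClient(h);
        });
    }

    int connectionsOpened() {
        return connectionsOpened;
    }

    public static void main(String[] args) {
        ClientReuseDemo pool = new ClientReuseDemo();
        for (int i = 0; i < 5; i++) {
            pool.clientFor("ftp-a.example").send("file-" + i);
        }
        pool.clientFor("ftp-b.example").send("other");
        System.out.println("connections opened: " + pool.connectionsOpened());
    }
}
```

Six files go to two hosts over two connections instead of six. Keying by host
is the simplest choice here; since the target can come from FlowFile
attributes, a real key would likely also need the port and credentials.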

A possible work-around would be to use MergeContent beforehand and send
the result as a single file, if your use-case allows that.

Thanks,
Koji
