Posted to users@nifi.apache.org by Pat White <pa...@verizonmedia.com> on 2019/02/01 22:13:35 UTC

Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Hi Folks,

I'm trying to track down a very odd performance issue, on 1.6.0 using
S2S, and would like to ask if there are any known issues like this or if
my flow configuration is broken. From the point of view of the RPG,
receiving the same 1.5GB file takes ~15x longer than sending it from that
RPG. I've set up two simple flows and see this behavior consistently. I
also duplicated the flows between two single-node instances to verify
that the behavior follows the direction of transfer rather than the node;
it does, i.e. a receive on both nodes is much slower than a send.

Flows are:

FlowA:  GetFile_nodeA > OutputPort_nodeA > RPG_nodeB > PutFile_nodeB
FlowB:  GetFile_nodeB > RPG_nodeB > InputPort_nodeA > PutFile_nodeA

For the same 1.5GB file, FlowA consistently transfers at ~3.5MB/s while
FlowB transfers at ~52.0MB/s. This is with default values for all
processors, connections, and the RPG, except that the RPG uses https
transport (instead of raw); the nodes are running secure. The same policy
values were applied to both flows on both nodes.

Aside from the throughput difference, the transfers appear to work fine
with no anomalies that I can find; the file transfers correctly in both
directions. The one anomaly I do see is that in the slow case, the
destination node's CPU goes to 100% for the majority of the 6 to 7
minutes it takes to transfer the file. From a jstack on the thread that's
using 99%+ of the CPU, it looks like that thread is spending a lot of
time in nifi.remote.util.SiteToSiteRestApiClient.read doing
LazyDecompressingInputStream/InflaterInputStream work, which puzzles me
quite a bit because all of the ports have compression turned off; there
should be no compress/decompress activity, as far as I can tell.

Example stack for that thread:
"Timer-Driven Process Thread-6" #90 prio=5 os_prio=0 tid=0x00007f4c48002000
nid=0xdb38 runnable [0x00007f4c734f5000]
   java.lang.Thread.State: RUNNABLE
        at java.util.zip.Inflater.inflateBytes(Native Method)
        at java.util.zip.Inflater.inflate(Inflater.java:259)
        - locked <0x00007f55d891cf50> (a java.util.zip.ZStreamRef)
        at
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
        at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
        at
java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
        at
org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:58)
        at
org.apache.nifi.remote.util.SiteToSiteRestApiClient$3.read(SiteToSiteRestApiClient.java:722)
        at java.io.InputStream.read(InputStream.java:179)
        at
org.apache.nifi.remote.io.InterruptableInputStream.read(InterruptableInputStream.java:57)
        at
org.apache.nifi.stream.io.ByteCountingInputStream.read(ByteCountingInputStream.java:51)
        at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:82)
        at
org.apache.nifi.stream.io.LimitingInputStream.read(LimitingInputStream.java:88)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at
org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:57)
        at
org.apache.nifi.stream.io.MinimumLengthInputStream.read(MinimumLengthInputStream.java:53)
        at
org.apache.nifi.controller.repository.io.TaskTerminationInputStream.read(TaskTerminationInputStream.java:62)
        at org.apache.nifi.stream.io.StreamUtils.copy(StreamUtils.java:35)
        at
org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:744)
        at
org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2990)
        at
org.apache.nifi.remote.StandardRemoteGroupPort.receiveFlowFiles(StandardRemoteGroupPort.java:419)
        at
org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:286)
        at
org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
        at
org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
        at
org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

Has anyone seen this behavior or symptoms like this?

patw

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Sounds great Koji, thank you for looking into that.

I'm trying some tests with changes to the GzipHandler included methods,
and will update if I have any useful info from that.
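
For anyone following along, this is roughly the change I'm experimenting
with in JettyServer's gzip() method. It's only a sketch against the
Jetty 9 GzipHandler API (where the method is setExcludedPaths(), plural),
and the /nifi-api/data-transfer and /nifi-api/site-to-site path patterns
are my assumption about where the S2S endpoints live:

private Handler gzip(final Handler handler) {
    // org.eclipse.jetty.server.handler.gzip.GzipHandler
    final GzipHandler gzip = new GzipHandler();
    gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
    // Sketch: skip compression for site-to-site transfers so jetty does
    // not gzip S2S payloads behind the client's back. The path patterns
    // are assumptions, not verified against the REST API.
    gzip.setExcludedPaths("/nifi-api/data-transfer/*", "/nifi-api/site-to-site/*");
    gzip.setHandler(handler);
    return gzip;
}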

patw


On Fri, Feb 15, 2019 at 3:39 AM Koji Kawamura <ij...@gmail.com>
wrote:

> Hi Pat,
>
> Thanks for sharing your insights.
> I will try benchmarking before and after the "gzip.setExcludedPath()"
> change that Mark suggested, to see if it helps improve S2S HTTP throughput.
>
> Koji
>
> On Fri, Feb 15, 2019 at 9:31 AM Pat White <pa...@verizonmedia.com>
> wrote:
> >
> > Hi Andy,
> >
> > My requirement is to use https with minimum TLS v1.2, https being an
> approved protocol.
> > I haven't looked at websockets though; I need to do that. Thank you for
> the suggestion.
> >
> > patw
> >
> >
> >
> > On Thu, Feb 14, 2019 at 12:24 PM Andy LoPresto <al...@apache.org>
> wrote:
> >>
> >> Pat,
> >>
> >> Just to clarify, your connection must be HTTPS or it just must be
> secure? What about Websockets over TLS (wss://)?
> >>
> >> Andy LoPresto
> >> alopresto@apache.org
> >> alopresto.apache@gmail.com
> >> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >>
> >> On Feb 14, 2019, at 9:56 AM, Pat White <pa...@verizonmedia.com>
> wrote:
> >>
> >> Thanks very much folks, definitely appreciate the feedback.
> >>
> >> Right, required to use tls/https connections for s2s, so raw is not an
> option for me.
> >>
> >> Will look further at JettyServer and setIncludedMethods, thanks again.
> >>
> >> patw
> >>
> >> On Thu, Feb 14, 2019 at 11:07 AM Mark Payne <ma...@hotmail.com>
> wrote:
> >>>
> >>> Pat,
> >>>
> >>> It appears to be hard-coded in JettyServer (full path is
> >>>
> nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
> )
> >>>
> >>> Line 294 calls the gzip method, which looks like:
> >>>
> >>> private Handler gzip(final Handler handler) {
> >>>     final GzipHandler gzip = new GzipHandler();
> >>>     gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
> >>>     gzip.setHandler(handler);
> >>>     return gzip;
> >>> }
> >>>
> >>>
> >>> We probably would want to add a "gzip.setExcludedPath()" call to
> exclude anything that goes to the site-to-site path.
> >>>
> >>> Thanks
> >>> -Mark
> >>>
> >>>
> >>> On Feb 14, 2019, at 11:46 AM, Joe Witt <jo...@gmail.com> wrote:
> >>>
> >>> ...interesting.  I don't have an answer but will initiate some
> research.  Hopefully someone else replies if they know off-hand.
> >>>
> >>> Thanks
> >>>
> >>> On Thu, Feb 14, 2019 at 11:43 AM Pat White <pa...@verizonmedia.com>
> wrote:
> >>>>
> >>>> Hi Folks,
> >>>>
> >>>> Could someone point me at the correct way to modify Nifi's embedded
> jetty configuration settings? Specifically i'd like to turn off jetty's
> automatic compression of payload.
> >>>>
> >>>> Reason for asking: I think I've found my performance issue.
> Uncompressed input to jetty is getting automatically compressed, by jetty,
> causing very small, fragmented packets to be sent, which pegs the CPU on
> the receive thread recombining and uncompressing the incoming packets. I'd
> like to verify by turning off auto compression.
> >>>>
> >>>> This is what I'm seeing: app-layer compressed data (nifi output port
> compression=on) is accepted by jetty as-is and sent over as large, complete
> tcp packets, which the receiver is able to keep up with (I do not see rcv
> net buffers fill up). With app-layer uncompressed data (nifi output port
> compression=off), jetty automatically compresses and sends the payload as
> many small, fragmented packets; this causes high cpu load on the receiver
> and fills up the net buffers, causing a great deal of throttling and
> backoff to the sender. This is consistent in wireshark traces: the good
> case shows no throttling, the bad case shows constant throttling with backoff.
> >>>>
> >>>> I've checked the User and Admin guides, as well as
> JettyServer and web/webdefault.xml, for such controls, but I'm clearly
> missing something; my changes have no effect on the server behavior. I'd
> appreciate any help on how to set jetty configs properly, thank you.
> >>>>
> >>>> patw
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Feb 5, 2019 at 9:07 AM Pat White <pa...@verizonmedia.com>
> wrote:
> >>>>>
> >>>>> Hi Mark, thank you very much for the feedback, and the JettyServer
> reference, will take a look at that code.
> >>>>>
> >>>>> I'll update the thread if I get any more info. Very strange issue,
> and hard to see what's going on in the stream due to https encryption.
> >>>>> Our use case is fairly basic, get/put flows using https over s2s,
> and I'd expect folks would have hit this if it is indeed an issue, so I
> tend to suspect my install or config. However, the behavior is very
> consistent across multiple clean installs, with small files as well as
> larger files (10s of MB vs GB sized files).
> >>>>>
> >>>>> Thanks again.
> >>>>>
> >>>>> patw
> >>>>>
> >>>>>
> >>>>> On Mon, Feb 4, 2019 at 5:18 PM Mark Payne <ma...@hotmail.com>
> wrote:
> >>>>>>
> >>>>>> Hey Pat,
> >>>>>>
> >>>>>> I saw this thread but have not yet had a chance to look into it. So
> thanks for following up!
> >>>>>>
> >>>>>> The embedded server is handled in the JettyServer class [1]. I can
> imagine that it may automatically turn on
> >>>>>> GZIP. When pushing data, though, the client would be the one
> supplying the stream of data, so the client is not
> >>>>>> GZIP'ing the data. But when requesting from Jetty, it may well be
> that Jetty is compressing the data. If that is the
> >>>>>> case, I would imagine that we could easily update the Site-to-Site
> client to add an Accept-Encoding header of None.
> >>>>>> I can't say for sure, off the top of my head, though, that it will
> be as simple of a fix as I'm hoping :)
> >>>>>>
> >>>>>> Thanks
> >>>>>> -Mark
> >>>>>>
> >>>>>> [1]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
> >>>>>>
> >>>>>>
> >>>>>> On Feb 4, 2019, at 5:58 PM, Pat White <pa...@verizonmedia.com>
> wrote:
> >>>>>>
> >>>>>> This looks like thrashing behavior in compress/decompress. I found
> that if I enable compression on the output port for the receiver's RPG,
> the issue goes away; throughput becomes just as good as for the sender's
> flow. Again though, I believe I have compression off for all flows and
> components. The only thing I can think of is that jetty is enforcing
> compression and has an issue with an uncompressed stream, but I'm not
> sure why only in one direction.
> >>>>>>
> >>>>>> Could someone point me to where Nifi's embedded jetty configuration
> code is, or equivalent controls?
> >>>>>>
> >>>>>> patw
> >>>>>>

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Koji Kawamura <ij...@gmail.com>.
Hi Pat,

Thanks for sharing your insights.
I will try benchmarking before and after the "gzip.setExcludedPath()"
change that Mark suggested, to see if it helps improve S2S HTTP throughput.
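
For the client-side idea Mark floated (having the S2S client opt out of
compressed responses), a minimal sketch with the Apache HttpClient 4.x
API that SiteToSiteRestApiClient builds on might look like the following.
The endpoint URL is a placeholder, and this is only the shape of the
change, not the actual NiFi patch:

import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

// Build a client that neither advertises gzip support nor lazily
// decompresses responses; disableContentCompression() removes both of
// HttpClient's content-coding interceptors.
CloseableHttpClient client = HttpClientBuilder.create()
        .disableContentCompression()
        .build();

// Alternatively, keep the default client but ask for an uncompressed
// response explicitly. Mark suggested "None"; the standard token for
// "no encoding" in RFC 7231 is "identity".
HttpGet get = new HttpGet("https://nifi-host:8443/nifi-api/site-to-site/peers"); // placeholder URL
get.setHeader("Accept-Encoding", "identity");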

Koji

On Fri, Feb 15, 2019 at 9:31 AM Pat White <pa...@verizonmedia.com> wrote:
>
> Hi Andy,
>
> My requirement is to use https with minimum tls v1.2, https being an approved protocol.
> I haven't looked at websockets though, i need to do that, thank you for the suggestion.
>
> patw

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Hi Andy,

My requirement is to use https with minimum TLS v1.2, https being an
approved protocol.
I haven't looked at websockets though; I need to do that. Thank you for
the suggestion.

patw



On Thu, Feb 14, 2019 at 12:24 PM Andy LoPresto <al...@apache.org> wrote:

> Pat,
>
> Just to clarify, your connection must be HTTPS or it just must be secure?
> What about Websockets over TLS (wss://)?
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Andy LoPresto <al...@apache.org>.
Pat, 

Just to clarify, your connection must be HTTPS or it just must be secure? What about Websockets over TLS (wss://)?

Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Feb 14, 2019, at 9:56 AM, Pat White <pa...@verizonmedia.com> wrote:
> 
> Thanks very much folks, definitely appreciate the feedback.
> 
> Right, required to use tls/https connections for s2s, so raw is not an option for me.
> 
> Will look further at JettyServer and setIncludedMethods, thanks again.
> 
> patw


Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Thank you Bryan, I have a lot to look at using the raw protocol, and will
be sure to keep this setting in mind.
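
For my own notes, my understanding of the relevant nifi.properties
entries for secure raw S2S is sketched below; the host and port values
are just examples, and the usual keystore/truststore settings still apply:

# Require TLS on the raw-socket site-to-site listener
nifi.remote.input.secure=true
# Environment-specific placeholders; set to the node's S2S hostname and port
nifi.remote.input.host=nifi-host.example.com
nifi.remote.input.socket.port=10443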

patw

On Thu, Feb 14, 2019 at 12:05 PM Bryan Bende <bb...@gmail.com> wrote:

> You can use TLS with raw s2s by setting nifi.remote.input.secure=true
>
> >>>>>>         at java.util.zip.Inflater.inflate(Inflater.java:259)
> >>>>>>         - locked <0x00007f55d891cf50> (a java.util.zip.ZStreamRef)
> >>>>>>         at
> java.util.zip.InflaterInputStream.read(InflaterInputStream.java:152)
> >>>>>>         at
> java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
> >>>>>>         at
> java.util.zip.InflaterInputStream.read(InflaterInputStream.java:122)
> >>>>>>         at
> org.apache.http.client.entity.LazyDecompressingInputStream.read(LazyDecompressingInputStream.java:58)
> >>>>>>         at
> org.apache.nifi.remote.util.SiteToSiteRestApiClient$3.read(SiteToSiteRestApiClient.java:722)
> >>>>>>         at java.io.InputStream.read(InputStream.java:179)
> >>>>>>         at org.apache.nifi.remote.io
> .InterruptableInputStream.read(InterruptableInputStream.java:57)
> >>>>>>         at org.apache.nifi.stream.io
> .ByteCountingInputStream.read(ByteCountingInputStream.java:51)
> >>>>>>         at
> java.util.zip.CheckedInputStream.read(CheckedInputStream.java:82)
> >>>>>>         at org.apache.nifi.stream.io
> .LimitingInputStream.read(LimitingInputStream.java:88)
> >>>>>>         at
> java.io.FilterInputStream.read(FilterInputStream.java:133)
> >>>>>>         at org.apache.nifi.stream.io
> .MinimumLengthInputStream.read(MinimumLengthInputStream.java:57)
> >>>>>>         at org.apache.nifi.stream.io
> .MinimumLengthInputStream.read(MinimumLengthInputStream.java:53)
> >>>>>>         at org.apache.nifi.controller.repository.io
> .TaskTerminationInputStream.read(TaskTerminationInputStream.java:62)
> >>>>>>         at org.apache.nifi.stream.io
> .StreamUtils.copy(StreamUtils.java:35)
> >>>>>>         at
> org.apache.nifi.controller.repository.FileSystemRepository.importFrom(FileSystemRepository.java:744)
> >>>>>>         at
> org.apache.nifi.controller.repository.StandardProcessSession.importFrom(StandardProcessSession.java:2990)
> >>>>>>         at
> org.apache.nifi.remote.StandardRemoteGroupPort.receiveFlowFiles(StandardRemoteGroupPort.java:419)
> >>>>>>         at
> org.apache.nifi.remote.StandardRemoteGroupPort.onTrigger(StandardRemoteGroupPort.java:286)
> >>>>>>         at
> org.apache.nifi.controller.AbstractPort.onTrigger(AbstractPort.java:250)
> >>>>>>         at
> org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:175)
> >>>>>>         at
> org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> >>>>>>         at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> >>>>>>         at
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> >>>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> >>>>>>         at
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> >>>>>>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >>>>>>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >>>>>>         at java.lang.Thread.run(Thread.java:748)
> >>>>>>
> >>>>>> Has anyone seen this behavior or symptoms like this?
> >>>>>>
> >>>>>> patw
> >>>>>
> >>>>>
> >>
>

Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Bryan Bende <bb...@gmail.com>.
You can use TLS with raw s2s by setting nifi.remote.input.secure=true
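
For example, in nifi.properties (a sketch only; the socket port value is just
an example, and disabling HTTP s2s via nifi.remote.input.http.enabled is
optional):

nifi.remote.input.secure=true
# raw s2s listens on its own socket, separate from the HTTPS web port
nifi.remote.input.socket.port=10443
# optionally disable HTTP s2s so clients negotiate the raw protocol
nifi.remote.input.http.enabled=false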


Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Thanks very much folks, definitely appreciate the feedback.

Right, we're required to use tls/https connections for s2s, so raw is not
an option for me.

Will look further at JettyServer and setIncludedMethods, thanks again.

patw


Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Mark Payne <ma...@hotmail.com>.
Pat,

It appears to be hard-coded in JettyServer (full path is
nifi/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java).

Line 294 calls the gzip method, which looks like:


private Handler gzip(final Handler handler) {
    final GzipHandler gzip = new GzipHandler();
    gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
    gzip.setHandler(handler);
    return gzip;
}

We probably would want to add a "gzip.setExcludedPaths()" call to exclude anything that goes to the site-to-site path.
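
For instance, a minimal sketch (assuming Jetty 9's GzipHandler, which exposes
setExcludedPaths(), and assuming HTTP site-to-site traffic is served under
/nifi-api/data-transfer/*; the exact pathspec would need verifying):

private Handler gzip(final Handler handler) {
    final GzipHandler gzip = new GzipHandler();
    gzip.setIncludedMethods("GET", "POST", "PUT", "DELETE");
    // Leave site-to-site transfers uncompressed; their payloads are already
    // handled at the application layer, and gzip'ing them burns CPU on the
    // receiving side.
    gzip.setExcludedPaths("/nifi-api/data-transfer/*");
    gzip.setHandler(handler);
    return gzip;
}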

Thanks
-Mark



Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Bryan Bende <bb...@gmail.com>.
I admit I haven't read this thread in detail, but is it a requirement
for you to use HTTP site-to-site?

I would think you could avoid this issue by using traditional raw
site-to-site, which goes over a direct socket and does not hit Jetty.

If you do want to change Jetty's configuration, you would have to modify
this part of the code and create a custom build of NiFi:

https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java#L294

Probably just remove that call to the gzip method that wraps all the handlers.
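
Roughly, the change would look like this (a hypothetical sketch; the actual
statement at line 294 may differ):

// before: every handler is wrapped in the compressing GzipHandler
server.setHandler(gzip(allHandlers));

// after: register the handlers directly so Jetty never gzip's responses
server.setHandler(allHandlers);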


Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Joe Witt <jo...@gmail.com>.
...interesting. I don't have an answer but will initiate some research.
Hopefully someone else replies if they know off-hand.

Thanks


Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Hi Folks,

Could someone point me at the correct way to modify NiFi's embedded Jetty
configuration? Specifically, I'd like to turn off Jetty's automatic
compression of the response payload.

Reason for asking: I think I've found my performance issue. Uncompressed
input to Jetty is getting automatically compressed by Jetty, causing very
small, fragmented packets to be sent, which pegs the CPU on the receive
thread as it recombines and decompresses the incoming packets. I'd like to
verify this by turning off the automatic compression.

This is what I'm seeing: app-layer compressed data (NiFi output port
compression=on) is accepted by Jetty as-is and sent over as large, complete
TCP packets, which the receiver is able to keep up with (the receive-side
network buffers do not fill up). With app-layer uncompressed data (NiFi
output port compression=off), Jetty compresses automatically and sends the
payload as many small, fragmented packets; this causes high CPU load on the
receiver and fills up the network buffers, causing a great deal of
throttling and backoff to the sender. This is consistent in Wireshark
traces: the good case shows no throttling, the bad case shows constant
throttling with backoff.

I've checked the User and Admin guides, and looked at JettyServer and
web/webdefault.xml for such controls, but I'm clearly missing something;
my changes have no effect on the server's behavior. I'd appreciate any help
on how to set the Jetty configs properly, thank you.
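
For concreteness, this is the kind of change I've been experimenting with
in JettyServer. It's only a minimal sketch against the standard Jetty 9.x
GzipHandler API; whether NiFi actually wraps its handlers in a GzipHandler,
and whether the data-transfer path spec below is the right one, are
assumptions on my part:

    import org.eclipse.jetty.server.Handler;
    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.handler.gzip.GzipHandler;

    public class DisableS2SGzip {

        // Assumes the server's root handler is (or wraps) a GzipHandler;
        // NiFi's actual handler tree may differ.
        public static void excludeSiteToSitePaths(final Server server) {
            final Handler root = server.getHandler();
            if (root instanceof GzipHandler) {
                final GzipHandler gzip = (GzipHandler) root;
                // Keep gzip for ordinary UI/REST responses, but leave the
                // site-to-site data-transfer payload uncompressed.
                gzip.addExcludedPaths("/nifi-api/data-transfer/*");
            }
        }
    }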

patw





Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
Hi Mark, thank you very much for the feedback and the JettyServer
reference; I'll take a look at that code.

I'll update the thread if I get any more info. It's a very strange issue,
and hard to see what's going on in the stream due to the HTTPS encryption.
Our use case is fairly basic, get/put flows using HTTPS over S2S; I'd
expect folks to have hit this already if it were indeed a bug, so I tend to
suspect my install or config. However, the behavior is very consistent
across multiple clean installs, with small files as well as larger ones
(tens of MB up to GB-sized files).

Thanks again.

patw



Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Mark Payne <ma...@hotmail.com>.
Hey Pat,

I saw this thread but have not yet had a chance to look into it, so thanks
for following up!

The embedded server is handled in the JettyServer class [1]. I can imagine
that it may automatically turn on GZIP. When pushing data, though, the
client is the one supplying the stream of data, so the client is not
GZIP'ing it. But when requesting data from Jetty, it may well be that Jetty
is compressing it. If that is the case, I would imagine that we could
easily update the Site-to-Site client to add an Accept-Encoding header of
None. I can't say for sure, off the top of my head, that it will be as
simple a fix as I'm hoping :)

Thanks
-Mark

[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-web/nifi-jetty/src/main/java/org/apache/nifi/web/server/JettyServer.java
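
If it does turn out to be Jetty compressing the response, a minimal sketch
of that client-side change, assuming the Site-to-Site client is built on
the Apache HttpClient 4.x builder API (the LazyDecompressingInputStream in
the jstack suggests HttpClient is in play), might look like this; note that
the standard header token for "no encoding" is "identity" rather than
"None":

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class NoCompressionS2SClient {

        public static CloseableHttpClient build() {
            return HttpClients.custom()
                    // Stops HttpClient from advertising gzip/deflate in
                    // Accept-Encoding and from wrapping responses in
                    // LazyDecompressingInputStream.
                    .disableContentCompression()
                    .build();
        }
    }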




Re: Asymmetric push/pull throughput with S2S, possibly related to openConnectionForReceive compression?

Posted by Pat White <pa...@verizonmedia.com>.
This looks like thrashing behavior in compress/decompress: I found that if
I enable compression on the output port of the receiver's RPG, the issue
goes away, and throughput becomes just as good as for the sender's flow.
Again, though, I believe I have compression off for all flows and
components. The only thing I can think of is that Jetty is enforcing
compression and has an issue with an uncompressed stream, though I'm not
sure why that would happen in only one direction.

Could someone point me to where NiFi's embedded Jetty configuration code
is, or the equivalent controls?
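
In the meantime, a quick probe like the following should show whether the
server gzips when a client advertises support for it. This is plain JDK
code against a placeholder node URL; a secure cluster would additionally
need keystore/truststore system properties set, and Jetty may only compress
responses above a certain size or of certain MIME types:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class GzipProbe {

        public static void main(final String[] args) throws Exception {
            // Placeholder host and port; substitute an actual node.
            final URL url = new URL("https://nodeA.example.com:8443/nifi-api/site-to-site");
            final HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept-Encoding", "gzip");
            conn.connect();
            // A "gzip" value here means the server compressed the response.
            System.out.println("Content-Encoding: " + conn.getHeaderField("Content-Encoding"));
        }
    }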

patw

