Posted to users@nifi.apache.org by Jean-Sebastien Vachon <js...@brizodata.com> on 2019/03/22 14:28:49 UTC

Problem with load balancing option

Hi all,

I've configured one of my connections to use the "partition by attribute" load balancing option.
It was not working as expected, and after a few tests I realized I was missing some dependencies on the cluster nodes (not related to the load balancing or NiFi at all), so I stopped everything.

I stopped everything before fixing my dependency issues, and now the UI shows 1906 items in the queue for that connection, but I can't list them or empty the queue.
NiFi tells me that there are no FlowFiles in the queue when I try to list them, and that 0 FlowFiles out of 1906 were removed from the queue.

I tried connecting the destination to another processor, such as LogMessage, but nothing happens. The 1906 items are stuck and I cannot delete the connection because it's not empty.

Any recommendations to fix this?

thanks


Re: Problem with load balancing option

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
I've recently upgraded to 1.9.1
________________________________
From: Vos, Walter <wa...@ns.nl>
Sent: Tuesday, April 2, 2019 3:41 AM
To: users@nifi.apache.org
Subject: RE: Problem with load balancing option


On which version of NiFi are you, Jean-Sebastien?



From: Jean-Sebastien Vachon <js...@brizodata.com>
Sent: Friday, March 22, 2019 16:15
To: Jean-Sebastien Vachon <js...@brizodata.com>; users@nifi.apache.org
Subject: Re: Problem with load balancing option



Hi,



FYI, I managed to get my node back by removing the node from the cluster, deleting the local flow and restarting NiFi.



Hope this helps identify the issue

________________________________

From: Jean-Sebastien Vachon <js...@brizodata.com>
Sent: Friday, March 22, 2019 10:56 AM
To: users@nifi.apache.org
Subject: Re: Problem with load balancing option



Hi again,



I thought everything was fine, but one of my nodes cannot start.



2019-03-22 14:51:27,811 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Successfully recovered 10396 records in 367 milliseconds. Now checkpointing to ensure that Write-Ahead Log is in a consistent state
2019-03-22 14:51:28,046 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 10396 Records and 0 Swap Files in 235 milliseconds (Stop-the-world time = 6 milliseconds), max Transaction ID 24370
2019-03-22 14:51:28,065 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1009)
        at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:539)
        at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:939)
        at org.apache.nifi.NiFi.<init>(NiFi.java:157)
        at org.apache.nifi.NiFi.<init>(NiFi.java:71)
        at org.apache.nifi.NiFi.main(NiFi.java:296)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.queue.clustered.partition.CorrelationAttributePartitioner.getPartition(CorrelationAttributePartitioner.java:44)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.getPartition(SocketLoadBalancedFlowFileQueue.java:611)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.putAndGetPartition(SocketLoadBalancedFlowFileQueue.java:749)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.put(SocketLoadBalancedFlowFileQueue.java:739)
        at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.loadFlowFiles(WriteAheadFlowFileRepository.java:587)
        at org.apache.nifi.controller.FlowController.initializeFlow(FlowController.java:818)
        at org.apache.nifi.controller.StandardFlowService.initializeController(StandardFlowService.java:1019)
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:991)
        ... 5 common frames omitted



Any idea?
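
A note on the exception: the trace shows that the partitioner used an array index of -1, but it does not say why. One way a negative index can arise in hash-based partitioning in Java (an assumption, not something confirmed in this thread) is the `%` operator, which keeps the sign of the dividend. The sketch below uses hypothetical names (`PartitionIndexSketch`, `naiveIndex`, `safeIndex`) and is not NiFi's code; it only illustrates the pitfall and the `Math.floorMod` alternative.

```java
// Hypothetical illustration only; this is NOT the NiFi implementation.
class PartitionIndexSketch {

    // Naive mapping from a hash to a partition index.
    // In Java, % keeps the sign of the dividend, so a negative hash
    // produces a negative index.
    static int naiveIndex(int hash, int partitionCount) {
        return hash % partitionCount;
    }

    // Math.floorMod returns a value in [0, partitionCount) whenever
    // partitionCount is positive, even for a negative hash.
    static int safeIndex(int hash, int partitionCount) {
        return Math.floorMod(hash, partitionCount);
    }

    public static void main(String[] args) {
        int hash = -7;          // stand-in for a negative attribute hash
        int partitionCount = 2; // stand-in for a two-node cluster

        System.out.println(naiveIndex(hash, partitionCount)); // -1
        System.out.println(safeIndex(hash, partitionCount));  // 1
    }
}
```

Using -1 to index an array throws `ArrayIndexOutOfBoundsException: -1`, which matches the message in the log, although the actual cause in NiFi may be different.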

________________________________

From: Jean-Sebastien Vachon
Sent: Friday, March 22, 2019 10:34 AM
To: Jean-Sebastien Vachon; users@nifi.apache.org
Subject: Re: Problem with load balancing option



Hi,



I stopped each node one by one and the queue is now empty. I'm not sure whether this is a bug or intended, but it does look strange from a user's point of view.



Thanks


RE: Problem with load balancing option

Posted by "Vos, Walter" <wa...@ns.nl>.
On which version of NiFi are you, Jean-Sebastien?


Re: Problem with load balancing option

Posted by Koji Kawamura <ij...@gmail.com>.
Glad to hear you got load balancing working correctly!

Thanks for pointing out that the new properties were missing from the migration guide.
I've added a note about the new load balancing port.
https://cwiki.apache.org/confluence/display/NIFI/Migration+Guidance

On Mon, Mar 25, 2019 at 8:06 PM Jean-Sebastien Vachon <
jsvachon@brizodata.com> wrote:

> Hi,
>
> I saw that bug report and I will upgrade to the latest version ASAP. But
> my main problem was the lack of the section to configure the load balancer
> correctly. Once I added the section and opened the required ports in my
> infrastructure, everything started to work as expected and it is a life
> changer 😉
>
> The load is now properly balanced between all nodes and the performance
> boost I got is outstanding
>
> One note however, I've checked the migration guide from 1.8 to 1.9 and
> didn't see any mention of this new section within nifi.properties. It might
> be a good idea to add a section about this so that people upgrading their
> cluster have all the information at hand. This might save them some time.
>
> Thanks all for your outstanding work
> ------------------------------
> *From:* Koji Kawamura <ij...@gmail.com>
> *Sent:* Sunday, March 24, 2019 10:39 PM
> *To:* users@nifi.apache.org
> *Cc:* Jean-Sebastien Vachon
> *Subject:* Re: Problem with load balancing option
>
> Hi,
>
> That looks similar to this one:
> Occasionally FlowFiles appear to get "stuck" in a Load-Balanced Connection
> https://issues.apache.org/jira/browse/NIFI-5919
>
> If you're using NiFi 1.8.0, I recommend trying the latest 1.9.1 which
> has the fix for the above issue.
>
> Hope this helps.
>
> Koji

Re: Problem with load balancing option

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi,

I saw that bug report and I will upgrade to the latest version ASAP. But my main problem was that the section for configuring the load balancer was missing from my nifi.properties. Once I added the section and opened the required ports in my infrastructure, everything started to work as expected, and it is a life changer 😉

The load is now properly balanced between all nodes and the performance boost I got is outstanding.

One note, however: I checked the migration guide from 1.8 to 1.9 and didn't see any mention of this new section in nifi.properties. It might be a good idea to add a section about this so that people upgrading their cluster have all the information at hand. This might save them some time.

Thanks all for your outstanding work
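
For readers upgrading a cluster, here is a sketch of the nifi.properties section discussed in this message. The property names are the cluster load-balancing entries as documented in the NiFi Administration Guide; the values shown are, as far as I know, the documented defaults, so treat them as assumptions and compare them with the nifi.properties shipped with your release. The host is left blank here and would normally be set per node.

```properties
# cluster load balancing properties (values are assumed defaults; verify for your version)
nifi.cluster.load.balance.host=
nifi.cluster.load.balance.port=6342
nifi.cluster.load.balance.connections.per.node=4
nifi.cluster.load.balance.max.thread.count=8
nifi.cluster.load.balance.comms.timeout=30 sec
```

The load-balance port has to be reachable between all nodes of the cluster, which corresponds to the "required ports" opened in the infrastructure above.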

Re: Problem with load balancing option

Posted by Koji Kawamura <ij...@gmail.com>.
Hi,

That looks similar to this one:
Occasionally FlowFiles appear to get "stuck" in a Load-Balanced Connection
https://issues.apache.org/jira/browse/NIFI-5919

If you're using NiFi 1.8.0, I recommend trying the latest 1.9.1 which
has the fix for the above issue.

Hope this helps.

Koji


Re: Problem with load balancing option

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi,

FYI, I managed to get my node back by removing the node from the cluster, deleting the local flow and restarting NiFi.

Hope this helps identify the issue
________________________________
From: Jean-Sebastien Vachon <js...@brizodata.com>
Sent: Friday, March 22, 2019 10:56 AM
To: users@nifi.apache.org
Subject: Re: Problem with load balancing option

Hi again,

I thought everything was fine but one of my node can not start..

2019-03-22 14:51:27,811 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Successfully recovered 10396 records in 367 milliseconds. Now checkpointing to ensure that Write-Ahead Log is in a consistent state
2019-03-22 14:51:28,046 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 10396 Records and 0 Swap Files in 235 milliseconds (Stop-the-world time = 6 milliseconds), max Transaction ID 24370
2019-03-22 14:51:28,065 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.cluster.ConnectionExcepti
on: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1009)
        at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:539)
        at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:939)
        at org.apache.nifi.NiFi.<init>(NiFi.java:157)
        at org.apache.nifi.NiFi.<init>(NiFi.java:71)
        at org.apache.nifi.NiFi.main(NiFi.java:296)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.queue.clustered.partition.CorrelationAttributePartitioner.getPartition(CorrelationAttributePartitioner.java:44)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.getPartition(SocketLoadBalancedFlowFileQueue.java:611)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.putAndGetPartition(SocketLoadBalancedFlowFileQueue.java:749)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.put(SocketLoadBalancedFlowFileQueue.java:739)
        at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.loadFlowFiles(WriteAheadFlowFileRepository.java:587)
        at org.apache.nifi.controller.FlowController.initializeFlow(FlowController.java:818)
        at org.apache.nifi.controller.StandardFlowService.initializeController(StandardFlowService.java:1019)
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:991)
        ... 5 common frames omitted

Any idea?
________________________________
From: Jean-Sebastien Vachon
Sent: Friday, March 22, 2019 10:34 AM
To: Jean-Sebastien Vachon; users@nifi.apache.org
Subject: Re: Problem with load balancing option

Hi,

I stopped each node one by one and the queue is now empty. Not sure if this is a bug or intended but it does look strange from a user point of view

Thanks


Re: Problem with load balancing option

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi again,

I thought everything was fine, but one of my nodes cannot start.

2019-03-22 14:51:27,811 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Successfully recovered 10396 records in 367 milliseconds. Now checkpointing to ensure that Write-Ahead Log is in a consistent state
2019-03-22 14:51:28,046 INFO [main] o.a.n.wali.SequentialAccessWriteAheadLog Checkpointed Write-Ahead Log with 10396 Records and 0 Swap Files in 235 milliseconds (Stop-the-world time = 6 milliseconds), max Transaction ID 24370
2019-03-22 14:51:28,065 ERROR [main] o.a.nifi.controller.StandardFlowService Failed to load flow from cluster due to: org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
org.apache.nifi.cluster.ConnectionException: Failed to connect node to cluster due to: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:1009)
        at org.apache.nifi.controller.StandardFlowService.load(StandardFlowService.java:539)
        at org.apache.nifi.web.server.JettyServer.start(JettyServer.java:939)
        at org.apache.nifi.NiFi.<init>(NiFi.java:157)
        at org.apache.nifi.NiFi.<init>(NiFi.java:71)
        at org.apache.nifi.NiFi.main(NiFi.java:296)
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.apache.nifi.controller.queue.clustered.partition.CorrelationAttributePartitioner.getPartition(CorrelationAttributePartitioner.java:44)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.getPartition(SocketLoadBalancedFlowFileQueue.java:611)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.putAndGetPartition(SocketLoadBalancedFlowFileQueue.java:749)
        at org.apache.nifi.controller.queue.clustered.SocketLoadBalancedFlowFileQueue.put(SocketLoadBalancedFlowFileQueue.java:739)
        at org.apache.nifi.controller.repository.WriteAheadFlowFileRepository.loadFlowFiles(WriteAheadFlowFileRepository.java:587)
        at org.apache.nifi.controller.FlowController.initializeFlow(FlowController.java:818)
        at org.apache.nifi.controller.StandardFlowService.initializeController(StandardFlowService.java:1019)
        at org.apache.nifi.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:991)
        ... 5 common frames omitted

Any idea?
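
For readers hitting the same trace: one plausible way for a partitioner to end up with an array index of -1 is a negative hash value reduced with Java's % operator, which keeps the sign of the dividend. The following is only a minimal sketch of that failure mode, not NiFi's actual CorrelationAttributePartitioner code; the class name, the two helper methods and the node names are hypothetical and exist purely for illustration.

```java
public class PartitionIndexSketch {

    // Hypothetical helper: Java's % keeps the sign of the dividend,
    // so a negative hash produces a negative (invalid) array index.
    public static int naiveIndex(int hash, int partitionCount) {
        return hash % partitionCount;
    }

    // Hypothetical helper: Math.floorMod returns a value in
    // [0, partitionCount) whenever partitionCount is positive.
    public static int safeIndex(int hash, int partitionCount) {
        return Math.floorMod(hash, partitionCount);
    }

    public static void main(String[] args) {
        // Assumed three-node cluster; the names are made up.
        String[] partitions = {"node-1", "node-2", "node-3"};
        int hash = -7; // stands in for a negative attribute hash

        // Using this value as an index would throw
        // ArrayIndexOutOfBoundsException, as in the trace above.
        System.out.println(naiveIndex(hash, partitions.length));

        // The floorMod variant always selects a valid partition.
        System.out.println(partitions[safeIndex(hash, partitions.length)]);
    }
}
```

Whether this is what actually happens inside NiFi 1.9.x would have to be confirmed against the source at CorrelationAttributePartitioner.java:44.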


Re: Problem with load balancing option

Posted by Jean-Sebastien Vachon <js...@brizodata.com>.
Hi,

I stopped each node one by one and the queue is now empty. Not sure if this is a bug or intended, but it does look strange from a user's point of view.

Thanks