Posted to dev@nifi.apache.org by Paresh Shah <Pa...@symantec.com> on 2018/06/04 17:24:26 UTC

Re: [EXT] Re: URL configuration for the remote process group in Nifi 1.3

Mark

I'd like some more clarity; let me see if I understand this. Just to be clear, we are using the RPG purely for load balancing within the same cluster.

Step 1: When it initially connects to Node1, it fetches all the cluster details, i.e. it learns all the nodes that exist in my cluster, which are:
Node1
Node2
Node3

Question: Where is this information persisted, and is it resolved again every time flow files are sent to the RPG (remote process group)?

Step 2: When Node1 goes down, it tries to establish communication with one of the following nodes, which it retrieved and stored either initially or via a background task:
Node2
Node3

Question: Does this update the information persisted in Step 1? Is there any way to update the actual URL for the RPG? Basically, we do not want the target node to be selected anew for every incoming flow file on the RPG.

Thanks
Paresh

On 6/3/18, 9:09 AM, "Mark Payne" <ma...@hotmail.com> wrote:

    Paresh,
    
    When NiFi establishes a connection to the remote instance, it will request information from the remote instance about all nodes in the cluster. It then persists this information in case NiFi is restarted. So whichever node you use in your URL is only important for the initial connection. Additionally, NiFi will periodically reach out to the remote NiFi instances to determine which nodes are in the cluster, in case nodes are added to or removed from the cluster.
    
    Does that all make sense?
    
    Thanks
    -Mark
    
    > On Jun 3, 2018, at 11:15 AM, Paresh Shah <Pa...@symantec.com> wrote:
    > 
    > I have a cluster with 3 nodes. We are using RPG for load balancing
    > 
    > Node1 ( primary and cluster coordinator ).
    > Node2
    > Node3
    > 
    > When configuring the RPG, I use Node1 as the target URL. My question is what happens to this RPG when Node1 goes down or is offline. At that point, how does the RPG keep functioning, since we cannot update the URL once it's created?
    > 
    > Thanks
    > Paresh
    > 
    


Re: [EXT] Re: URL configuration for the remote process group in Nifi 1.3

Posted by Bryan Bende <bb...@gmail.com>.
Paresh,

Mark can correct me if I'm wrong, but I believe the information
fetched in step 1 is persisted in-memory on each node where the RPG is
running. This information is then periodically refreshed in a
background thread.

While data is flowing, the RPG distributes it to the nodes in a
round-robin manner, in batches sized according to the batch size
configuration. If it knows a node is down, I believe it will not send
any data to that node until it is back up, and if it thinks a node is
up but fails to send data to it, it will try another node.
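
As an illustration of the behavior described above, here is a minimal
Python sketch of round-robin batch distribution with failover. It is a
toy model under stated assumptions, not NiFi's actual implementation:
the class RoundRobinSender, the send callback, and the refresh method
are all hypothetical names introduced for this example.

```python
from typing import Callable, Iterable, List, Set


class RoundRobinSender:
    """Toy model of round-robin batch distribution (not NiFi code)."""

    def __init__(self, peers: List[str], send: Callable[[str, list], bool]):
        self.peers = list(peers)          # nodes learned from the cluster
        self.known_down: Set[str] = set() # nodes currently believed down
        self.send = send                  # hypothetical transport: True on success
        self._next = 0                    # index of the next peer to try

    def refresh(self, reachable: Iterable[str]) -> None:
        """Stand-in for the periodic background refresh of peer status."""
        self.known_down = set(self.peers) - set(reachable)

    def send_batch(self, batch: list) -> str:
        """Send one batch to the next usable peer and return that peer."""
        for _ in range(len(self.peers)):
            peer = self.peers[self._next]
            self._next = (self._next + 1) % len(self.peers)
            if peer in self.known_down:
                continue                  # skip nodes believed to be down
            if self.send(peer, batch):
                return peer
            self.known_down.add(peer)     # send failed: try another node
        raise ConnectionError("no reachable peer")
```

With three peers and node1 unreachable, successive batches in this
sketch go to node2, node3, node2, and so on, until a refresh reports
node1 as reachable again.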

The URL in the RPG should accept a comma-separated list of multiple
URLs, but as Mark mentioned this would only be used the first time you
start the RPG, or if a node restarted. For example, say you entered
the URL as "node1,node2" and then node3 restarts while node1 is down:
it would try node1 to get cluster info and fail, then try node2 and
succeed.
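
The fallback over a comma-separated URL list can be sketched roughly as
follows. This is only an illustration in Python, not how NiFi does it
internally: fetch_cluster_peers and the fetch callback are hypothetical
names, and the callback stands in for whatever request returns the
cluster's node list.

```python
from typing import Callable, List


def fetch_cluster_peers(url_property: str,
                        fetch: Callable[[str], List[str]]) -> List[str]:
    """Try each configured URL in order and return the node list
    from the first one that answers."""
    # Split the comma-separated property, ignoring stray whitespace.
    urls = [u.strip() for u in url_property.split(",") if u.strip()]
    last_error = None
    for url in urls:
        try:
            return fetch(url)             # first reachable node wins
        except OSError as err:            # includes ConnectionError
            last_error = err              # remember and try the next URL
    raise ConnectionError(f"none of {urls} reachable") from last_error
```

In the "node1,node2" example, a fetch that fails for node1 is followed
by an attempt against node2, whose answer is then used as the peer list.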

-Bryan



Re: [EXT] Re: URL configuration for the remote process group in Nifi 1.3

Posted by Paresh Shah <Pa...@symantec.com>.
Mark,

I wanted to make sure you saw my response. Looking forward to yours.

Thanks
Paresh
