Posted to dev@nifi.apache.org by Ricky Saltzer <ri...@cloudera.com> on 2015/04/03 23:23:25 UTC

Site to Site not working within process groups

I just want to make sure this is a supported feature before I open a JIRA.
It appears as if I can't create a Site-to-Site connection within a process
group. It's easier to explain visually (see below). Any help would be
appreciated, thanks!

*Top Level *(works):

[image: Inline image 1]


*Inside Process Group *(remote doesn't work)

[image: Inline image 1]


-- 
Ricky Saltzer
http://www.cloudera.com

Re: Site to Site not working within process groups

Posted by Adam Taft <ad...@adamtaft.com>.
One point of clarification.  When I wrote, "the solution here is simply a
new UI element", I didn't mean to imply the work itself was trivial.  I was
mainly voting for UI elements that are already familiar to the
DFM, like a processor box that explicitly says, "load balance here."

I think the principle of making this work without touching the site-to-site
code should be close to achievable. That would probably be ideal from the
KISS perspective.

Adam


On Fri, Apr 10, 2015 at 9:07 AM, Adam Taft <ad...@adamtaft.com> wrote:

> Why does this have to be on a "connection?"  In my mind, the solution here
> is simply a new UI element that behaves like a site-to-site remote process
> group, but automates/hides all the configuration parameters.  From the
> backend's point of view, nothing would have to change, since site-to-site
> already works. The UI element should look basically like a processor box.
>
> The only possible change to the backend might be to the node selection
> algorithm, if you wanted to exclude the current node from receiving the
> flowfile in question.  In my mind, though, this might be a misfeature.  If
> the current node is loaded more lightly than the other nodes, it's better
> to keep the flowfile on it and continue processing.
>
> For efficiency's sake, it might be nice to have a configuration threshold
> for this element that won't attempt cluster redistribution if the current
> node is not overly loaded. Let the DFM decide at what point to start
> pushing files to other nodes, since the overhead for doing so is heavier
> than keeping the file local.
>
> Two cents,
>
> Adam
>
>
>
>
> On Fri, Apr 10, 2015 at 8:18 AM, Mark Payne <ma...@hotmail.com> wrote:
>
>> Joe,
>>
>> So there are a few pieces to the puzzle:
>>
>> * Deciding when to push data around - this is not trivial. Likely
>> implemented on NCM
>> * Deciding where to push the data - not trivial either but possibly easier
>> * Building the components to push the data and the component to receive
>> the data
>> * Modifying the site-to-site protocol to allow pushing data to a
>> connection rather than a port... this is reasonably easy but requires very
>> significant testing.
>> * Updating Node to receive a "rebalance" command from NCM and initiate it
>> to happen
>> * Updating UI, data model, Connection objects to support setting the flag
>>
>> I'd estimate at least a week of full-time work to get this done.
>>
>>
>> ------ Original Message ------
>> From: "Joe Witt" <jo...@gmail.com>
>> To: "dev@nifi.incubator.apache.org" <de...@nifi.incubator.apache.org>
>> Sent: 4/9/2015 9:00:56 PM
>> Subject: Re: Site to Site not working within process groups
>>
>>  mark
>>>
>>> just spitballin' here. Why would it be such a large amount of work?
>>> My initial thought here is that you allow the user to indicate whether
>>> a given connection should auto-load-balance at which point nifi would
>>> simply create an implicit site-to-site connection. It would need to
>>> be smart enough to not transfer data to the same node data is coming
>>> from and to avoid too much rebalancing and such.
>>>
>>> Maybe what i mean to say is how much time do you think it would take
>>> roughly?
>>>
>>> Thanks
>>> Joe
>>>
>>> On Thu, Apr 9, 2015 at 5:16 PM, Mark Payne <ma...@hotmail.com> wrote:
>>>
>>>>  Ryan,
>>>>
>>>>  We definitely want to get to the point that we have the ability to
>>>>  load-balance the data in a connection across a cluster. But yes, that
>>>> is a
>>>>  fairly large undertaking.
>>>>
>>>>  I can definitely appreciate that it would be more convenient to add a
>>>> port
>>>>  in the middle of a group, but I would shy away from implementing something
>>>> like
>>>>  that as a stop-gap when the load-balanced connection is the true
>>>> desire.
>>>>  This stop-gap really would be quite a lot of work, as well, and I think
>>>>  would introduce more confusion.
>>>>
>>>>  Thanks
>>>>  -Mark
>>>>
>>>>
>>>>
>>>>  ------ Original Message ------
>>>>  From: "Ryan Blue" <bl...@cloudera.com>
>>>>  To: dev@nifi.incubator.apache.org
>>>>  Sent: 4/6/2015 11:32:12 AM
>>>>  Subject: Re: Site to Site not working within process groups
>>>>
>>>>   On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>>>>
>>>>>>
>>>>>>  Ricky,
>>>>>>
>>>>>>  What you're seeing is by design. While the approach can be limiting,
>>>>>>  especially if you're looking to expose an [in|out]put port remotely in
>>>>>> a sub
>>>>>>  group, it was done for consistency and simplicity. Groups can have
>>>>>> input and
>>>>>>  output ports. This facilitates data flow into and out of the groups.
>>>>>> When
>>>>>>  this is done at the root level, it allows us to abstract a NiFi
>>>>>> instance as
>>>>>>  a group to a remote NiFi.
>>>>>>
>>>>>>  Matt Gilman
>>>>>>
>>>>>
>>>>>
>>>>>  I think the problem is that sometimes you want a process group to be
>>>>> able
>>>>>  to use the trick where you send to a local "remote" input port to load
>>>>>  balance. It would be great to be able to hide that detail within a
>>>>> process
>>>>>  group, but the reuse of ports for both purposes prevents it.
>>>>>
>>>>>  Could we add an option to select whether the port is for a process
>>>>> group
>>>>>  or should listen for remote connections? That seems like an easy way
>>>>> to
>>>>>  solve the problem, though I think adding an option to load balance a
>>>>>  connection in cluster mode would solve the problem more cleanly. But
>>>>> that
>>>>>  would be more work, right?
>>>>>
>>>>>  rb
>>>>>
>>>>>
>>>>>  -- Ryan Blue
>>>>>  Software Engineer
>>>>>  Cloudera, Inc.
>>>>>
>>>>
>

Re: Site to Site not working within process groups

Posted by Mark Payne <ma...@hotmail.com>.
  Adam,

Interesting - I had not thought about having a new "Load Balance"
component. I'd always imagined doing this from a connection.
It's worth thinking about, though I think adding it to a connection is
a lot simpler and cleaner. With a component, if you have a big
backlog in one connection, you'd have to add the component in and move
your graph around to accommodate it, and then potentially
pull it back out if it's only intended to be temporary. On a
connection, we can mark the connection to auto-rebalance, or allow a
user to simply say "Don't do it automatically, but rebalance right now."

Definitely, though, it has to be smart about moving data around, because
we don't want to push the data elsewhere when the node that already
has it can handle it. A simple algorithm would be something like "Node
has at least 1 GB of data and more than double the average of all
nodes."
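
As a rough illustration of the rule of thumb above, here is a minimal
Python sketch. The function name, the per-node byte counts it takes as
input, reading "1 GB" as 1 GiB, and counting the local node in the
cluster average are all assumptions made for illustration; none of this
is NiFi code.

```python
GIB = 1024 ** 3  # assumption: "1 GB" is read as 1 GiB


def should_rebalance(local_bytes, all_node_bytes, min_bytes=GIB, factor=2.0):
    """Return True when the local queue is both large in absolute terms
    and more than `factor` times the average queue size across all nodes.

    all_node_bytes is a list of queued byte counts, one per node,
    including the local node (an assumption of this sketch).
    """
    if not all_node_bytes:
        return False
    average = sum(all_node_bytes) / len(all_node_bytes)
    return local_bytes >= min_bytes and local_bytes > factor * average
```

With queues of 5, 1, 1 and 1 GiB the average is 2 GiB, so the node
holding 5 GiB would qualify; three nodes holding 3 GiB each would not,
because no node is above double the average.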



------ Original Message ------
From: "Adam Taft" <ad...@adamtaft.com>
To: dev@nifi.incubator.apache.org
Sent: 4/10/2015 9:07:19 AM
Subject: Re: Site to Site not working within process groups

>Why does this have to be on a "connection?" In my mind, the solution 
>here
>is simply a new UI element that behaves like a site-to-site remote 
>process
>group, but automates/hides all the configuration parameters. From the
>backend's point of view, nothing would have to change, since 
>site-to-site
>already works. The UI element should look basically like a processor 
>box.
>
>The only possible change to the backend might be to the node selection
>algorithm, if you wanted to exclude the current node from receiving the
>flowfile in question. In my mind, though, this might be a misfeature. 
>If
>the current node is loaded more lightly than the other nodes, it's 
>better
>to keep the flowfile on it and continue processing.
>
>For efficiency's sake, it might be nice to have a configuration 
>threshold
>for this element that won't attempt cluster redistribution if the 
>current
>node is not overly loaded. Let the DFM decide at what point to start
>pushing files to other nodes, since the overhead for doing so is 
>heavier
>than keeping the file local.
>
>Two cents,
>
>Adam
>
>
>
>
>On Fri, Apr 10, 2015 at 8:18 AM, Mark Payne <ma...@hotmail.com> 
>wrote:
>
>>  Joe,
>>
>>  So there are a few pieces to the puzzle:
>>
>>  * Deciding when to push data around - this is not trivial. Likely
>>  implemented on NCM
>>  * Deciding where to push the data - not trivial either but possibly 
>>easier
>>  * Building the components to push the data and the component to 
>>receive
>>  the data
>>  * Modifying the site-to-site protocol to allow pushing data to a
>>  connection rather than a port... this is reasonably easy but requires 
>>very
>>  significant testing.
>>  * Updating Node to receive a "rebalance" command from NCM and 
>>initiate it
>>  to happen
>>  * Updating UI, data model, Connection objects to support setting the 
>>flag
>>
>>  I'd estimate at least a week of full-time work to get this done.
>>
>>
>>  ------ Original Message ------
>>  From: "Joe Witt" <jo...@gmail.com>
>>  To: "dev@nifi.incubator.apache.org" <de...@nifi.incubator.apache.org>
>>  Sent: 4/9/2015 9:00:56 PM
>>  Subject: Re: Site to Site not working within process groups
>>
>>   mark
>>>
>>>  just spitballin' here. Why would it be such a large amount of work?
>>>  My initial thought here is that you allow the user to indicate 
>>>whether
>>>  a given connection should auto-load-balance at which point nifi 
>>>would
>>>  simply create an implicit site-to-site connection. It would need to
>>>  be smart enough to not transfer data to the same node data is coming
>>>  from and to avoid too much rebalancing and such.
>>>
>>>  Maybe what i mean to say is how much time do you think it would take
>>>  roughly?
>>>
>>>  Thanks
>>>  Joe
>>>
>>>  On Thu, Apr 9, 2015 at 5:16 PM, Mark Payne <ma...@hotmail.com> 
>>>wrote:
>>>
>>>>   Ryan,
>>>>
>>>>   We definitely want to get to the point that we have the ability to
>>>>   load-balance the data in a connection across a cluster. But yes, 
>>>>that
>>>>  is a
>>>>   fairly large undertaking.
>>>>
>>>>   I can definitely appreciate that it would be more convenient to 
>>>>add a
>>>>  port
>>>>   in the middle of a group, but I would shy away from implementing 
>>>>something
>>>>  like
>>>>   that as a stop-gap when the load-balanced connection is the true 
>>>>desire.
>>>>   This stop-gap really would be quite a lot of work, as well, and I 
>>>>think
>>>>   would introduce more confusion.
>>>>
>>>>   Thanks
>>>>   -Mark
>>>>
>>>>
>>>>
>>>>   ------ Original Message ------
>>>>   From: "Ryan Blue" <bl...@cloudera.com>
>>>>   To: dev@nifi.incubator.apache.org
>>>>   Sent: 4/6/2015 11:32:12 AM
>>>>   Subject: Re: Site to Site not working within process groups
>>>>
>>>>    On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>>>>
>>>>>>
>>>>>>   Ricky,
>>>>>>
>>>>>>   What you're seeing is by design. While the approach can be 
>>>>>>limiting,
>>>>>>   especially if you're looking to expose an [in|out]put port 
>>>>>>remotely in
>>>>>>  a sub
>>>>>>   group, it was done for consistency and simplicity. Groups can 
>>>>>>have
>>>>>>  input and
>>>>>>   output ports. This facilitates data flow into and out of the 
>>>>>>groups.
>>>>>>  When
>>>>>>   this is done at the root level, it allows us to abstract a NiFi
>>>>>>  instance as
>>>>>>   a group to a remote NiFi.
>>>>>>
>>>>>>   Matt Gilman
>>>>>>
>>>>>
>>>>>
>>>>>   I think the problem is that sometimes you want a process group to 
>>>>>be
>>>>>  able
>>>>>   to use the trick where you send to a local "remote" input port to 
>>>>>load
>>>>>   balance. It would be great to be able to hide that detail within 
>>>>>a
>>>>>  process
>>>>>   group, but the reuse of ports for both purposes prevents it.
>>>>>
>>>>>   Could we add an option to select whether the port is for a 
>>>>>process
>>>>>  group
>>>>>   or should listen for remote connections? That seems like an easy 
>>>>>way to
>>>>>   solve the problem, though I think adding an option to load 
>>>>>balance a
>>>>>   connection in cluster mode would solve the problem more cleanly. 
>>>>>But
>>>>>  that
>>>>>   would be more work, right?
>>>>>
>>>>>   rb
>>>>>
>>>>>
>>>>>   -- Ryan Blue
>>>>>   Software Engineer
>>>>>   Cloudera, Inc.
>>>>>
>>>>

Re: Site to Site not working within process groups

Posted by Adam Taft <ad...@adamtaft.com>.
Why does this have to be on a "connection?"  In my mind, the solution here
is simply a new UI element that behaves like a site-to-site remote process
group, but automates/hides all the configuration parameters.  From the
backend's point of view, nothing would have to change, since site-to-site
already works. The UI element should look basically like a processor box.

The only possible change to the backend might be to the node selection
algorithm, if you wanted to exclude the current node from receiving the
flowfile in question.  In my mind, though, this might be a misfeature.  If
the current node is loaded more lightly than the other nodes, it's better
to keep the flowfile on it and continue processing.

For efficiency's sake, it might be nice to have a configuration threshold
for this element that won't attempt cluster redistribution if the current
node is not overly loaded. Let the DFM decide at what point to start
pushing files to other nodes, since the overhead for doing so is heavier
than keeping the file local.
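
The two ideas above (keep the current node as a candidate, and skip
redistribution entirely below a DFM-configured threshold) could be
sketched roughly as follows. This is an illustrative Python sketch only:
the function name, the use of queued flowfile counts as the load metric,
and the tie-breaking rule are assumptions, not part of site-to-site.

```python
def choose_destination(local_node, queued_by_node, redistribute_threshold):
    """Pick the node that should process the next flowfile.

    queued_by_node maps node name -> queued flowfile count
    (a hypothetical load metric chosen for this sketch).
    """
    local_load = queued_by_node[local_node]
    # Below the DFM-configured threshold, never pay the transfer overhead.
    if local_load < redistribute_threshold:
        return local_node
    # Otherwise pick the least-loaded node. The local node stays a
    # candidate and wins ties, since keeping the file local is cheaper.
    return min(
        queued_by_node,
        key=lambda node: (queued_by_node[node], node != local_node),
    )
```

For example, with loads of 10, 500 and 400 on nodes a, b and c and a
threshold of 100, node a keeps its own flowfiles, while node b would
hand the next flowfile to node a.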

Two cents,

Adam




On Fri, Apr 10, 2015 at 8:18 AM, Mark Payne <ma...@hotmail.com> wrote:

> Joe,
>
> So there are a few pieces to the puzzle:
>
> * Deciding when to push data around - this is not trivial. Likely
> implemented on NCM
> * Deciding where to push the data - not trivial either but possibly easier
> * Building the components to push the data and the component to receive
> the data
> * Modifying the site-to-site protocol to allow pushing data to a
> connection rather than a port... this is reasonably easy but requires very
> significant testing.
> * Updating Node to receive a "rebalance" command from NCM and initiate it
> to happen
> * Updating UI, data model, Connection objects to support setting the flag
>
> I'd estimate at least a week of full-time work to get this done.
>
>
> ------ Original Message ------
> From: "Joe Witt" <jo...@gmail.com>
> To: "dev@nifi.incubator.apache.org" <de...@nifi.incubator.apache.org>
> Sent: 4/9/2015 9:00:56 PM
> Subject: Re: Site to Site not working within process groups
>
>  mark
>>
>> just spitballin' here. Why would it be such a large amount of work?
>> My initial thought here is that you allow the user to indicate whether
>> a given connection should auto-load-balance at which point nifi would
>> simply create an implicit site-to-site connection. It would need to
>> be smart enough to not transfer data to the same node data is coming
>> from and to avoid too much rebalancing and such.
>>
>> Maybe what i mean to say is how much time do you think it would take
>> roughly?
>>
>> Thanks
>> Joe
>>
>> On Thu, Apr 9, 2015 at 5:16 PM, Mark Payne <ma...@hotmail.com> wrote:
>>
>>>  Ryan,
>>>
>>>  We definitely want to get to the point that we have the ability to
>>>  load-balance the data in a connection across a cluster. But yes, that
>>> is a
>>>  fairly large undertaking.
>>>
>>>  I can definitely appreciate that it would be more convenient to add a
>>> port
>>>  in the middle of a group, but I would shy away from implementing something
>>> like
>>>  that as a stop-gap when the load-balanced connection is the true desire.
>>>  This stop-gap really would be quite a lot of work, as well, and I think
>>>  would introduce more confusion.
>>>
>>>  Thanks
>>>  -Mark
>>>
>>>
>>>
>>>  ------ Original Message ------
>>>  From: "Ryan Blue" <bl...@cloudera.com>
>>>  To: dev@nifi.incubator.apache.org
>>>  Sent: 4/6/2015 11:32:12 AM
>>>  Subject: Re: Site to Site not working within process groups
>>>
>>>   On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>>>
>>>>>
>>>>>  Ricky,
>>>>>
>>>>>  What you're seeing is by design. While the approach can be limiting,
>>>>>  especially if you're looking to expose an [in|out]put port remotely in
>>>>> a sub
>>>>>  group, it was done for consistency and simplicity. Groups can have
>>>>> input and
>>>>>  output ports. This facilitates data flow into and out of the groups.
>>>>> When
>>>>>  this is done at the root level, it allows us to abstract a NiFi
>>>>> instance as
>>>>>  a group to a remote NiFi.
>>>>>
>>>>>  Matt Gilman
>>>>>
>>>>
>>>>
>>>>  I think the problem is that sometimes you want a process group to be
>>>> able
>>>>  to use the trick where you send to a local "remote" input port to load
>>>>  balance. It would be great to be able to hide that detail within a
>>>> process
>>>>  group, but the reuse of ports for both purposes prevents it.
>>>>
>>>>  Could we add an option to select whether the port is for a process
>>>> group
>>>>  or should listen for remote connections? That seems like an easy way to
>>>>  solve the problem, though I think adding an option to load balance a
>>>>  connection in cluster mode would solve the problem more cleanly. But
>>>> that
>>>>  would be more work, right?
>>>>
>>>>  rb
>>>>
>>>>
>>>>  -- Ryan Blue
>>>>  Software Engineer
>>>>  Cloudera, Inc.
>>>>
>>>

Re: Site to Site not working within process groups

Posted by Mark Payne <ma...@hotmail.com>.
Joe,

So there are a few pieces to the puzzle:

* Deciding when to push data around - this is not trivial. Likely 
implemented on NCM
* Deciding where to push the data - not trivial either but possibly 
easier
* Building the components to push the data and the component to receive 
the data
* Modifying the site-to-site protocol to allow pushing data to a 
connection rather than a port... this is reasonably easy but requires 
very significant testing.
* Updating Node to receive a "rebalance" command from NCM and carry 
it out
* Updating UI, data model, Connection objects to support setting the 
flag

I'd estimate at least a week of full-time work to get this done.

------ Original Message ------
From: "Joe Witt" <jo...@gmail.com>
To: "dev@nifi.incubator.apache.org" <de...@nifi.incubator.apache.org>
Sent: 4/9/2015 9:00:56 PM
Subject: Re: Site to Site not working within process groups

>mark
>
>just spitballin' here. Why would it be such a large amount of work?
>My initial thought here is that you allow the user to indicate whether
>a given connection should auto-load-balance at which point nifi would
>simply create an implicit site-to-site connection. It would need to
>be smart enough to not transfer data to the same node data is coming
>from and to avoid too much rebalancing and such.
>
>Maybe what i mean to say is how much time do you think it would take 
>roughly?
>
>Thanks
>Joe
>
>On Thu, Apr 9, 2015 at 5:16 PM, Mark Payne <ma...@hotmail.com> 
>wrote:
>>  Ryan,
>>
>>  We definitely want to get to the point that we have the ability to
>>  load-balance the data in a connection across a cluster. But yes, that 
>>is a
>>  fairly large undertaking.
>>
>>  I can definitely appreciate that it would be more convenient to add a 
>>port
>>  in the middle of a group, but I would shy away from implementing something 
>>like
>>  that as a stop-gap when the load-balanced connection is the true 
>>desire.
>>  This stop-gap really would be quite a lot of work, as well, and I 
>>think
>>  would introduce more confusion.
>>
>>  Thanks
>>  -Mark
>>
>>
>>
>>  ------ Original Message ------
>>  From: "Ryan Blue" <bl...@cloudera.com>
>>  To: dev@nifi.incubator.apache.org
>>  Sent: 4/6/2015 11:32:12 AM
>>  Subject: Re: Site to Site not working within process groups
>>
>>>  On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>>>
>>>>  Ricky,
>>>>
>>>>  What you're seeing is by design. While the approach can be limiting,
>>>>  especially if you're looking to expose an [in|out]put port remotely 
>>>>in a sub
>>>>  group, it was done for consistency and simplicity. Groups can have 
>>>>input and
>>>>  output ports. This facilitates data flow into and out of the 
>>>>groups. When
>>>>  this is done at the root level, it allows us to abstract a NiFi 
>>>>instance as
>>>>  a group to a remote NiFi.
>>>>
>>>>  Matt Gilman
>>>
>>>
>>>  I think the problem is that sometimes you want a process group to be 
>>>able
>>>  to use the trick where you send to a local "remote" input port to 
>>>load
>>>  balance. It would be great to be able to hide that detail within a 
>>>process
>>>  group, but the reuse of ports for both purposes prevents it.
>>>
>>>  Could we add an option to select whether the port is for a process 
>>>group
>>>  or should listen for remote connections? That seems like an easy way 
>>>to
>>>  solve the problem, though I think adding an option to load balance a
>>>  connection in cluster mode would solve the problem more cleanly. But 
>>>that
>>>  would be more work, right?
>>>
>>>  rb
>>>
>>>
>>>  -- Ryan Blue
>>>  Software Engineer
>>>  Cloudera, Inc.

Re: Site to Site not working within process groups

Posted by Joe Witt <jo...@gmail.com>.
mark

just spitballin' here.  Why would it be such a large amount of work?
My initial thought here is that you allow the user to indicate whether
a given connection should auto-load-balance at which point nifi would
simply create an implicit site-to-site connection.  It would need to
be smart enough to not transfer data to the same node data is coming
from and to avoid too much rebalancing and such.
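
To make the "don't transfer data to the same node" part concrete, here
is a small Python sketch of how queued flowfiles might be spread over
the other cluster nodes. The round-robin policy, the function name, and
the plain lists standing in for flowfiles and nodes are assumptions for
illustration; the sketch does not address avoiding excessive
rebalancing, and it is not how site-to-site picks peers.

```python
from itertools import cycle


def plan_transfers(flowfiles, cluster_nodes, local_node):
    """Assign each flowfile to a peer node, never to the local node.

    Returns a dict mapping peer node -> list of flowfiles, filled
    round-robin (a policy assumed for this sketch).
    """
    peers = [node for node in cluster_nodes if node != local_node]
    if not peers:
        return {}  # single-node cluster: nothing to rebalance to
    plan = {node: [] for node in peers}
    targets = cycle(peers)
    for flowfile in flowfiles:
        plan[next(targets)].append(flowfile)
    return plan
```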

Maybe what I mean to say is: roughly how much time do you think it would take?

Thanks
Joe

On Thu, Apr 9, 2015 at 5:16 PM, Mark Payne <ma...@hotmail.com> wrote:
> Ryan,
>
> We definitely want to get to the point that we have the ability to
> load-balance the data in a connection across a cluster. But yes, that is a
> fairly large undertaking.
>
> I can definitely appreciate that it would be more convenient to add a port
> in the middle of a group, but I would shy away from implementing something like
> that as a stop-gap when the load-balanced connection is the true desire.
> This stop-gap really would be quite a lot of work, as well, and I think
> would introduce more confusion.
>
> Thanks
> -Mark
>
>
>
> ------ Original Message ------
> From: "Ryan Blue" <bl...@cloudera.com>
> To: dev@nifi.incubator.apache.org
> Sent: 4/6/2015 11:32:12 AM
> Subject: Re: Site to Site not working within process groups
>
>> On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>>
>>> Ricky,
>>>
>>> What you're seeing is by design. While the approach can be limiting,
>>> especially if you're looking to expose an [in|out]put port remotely in a sub
>>> group, it was done for consistency and simplicity. Groups can have input and
>>> output ports. This facilitates data flow into and out of the groups. When
>>> this is done at the root level, it allows us to abstract a NiFi instance as
>>> a group to a remote NiFi.
>>>
>>> Matt Gilman
>>
>>
>> I think the problem is that sometimes you want a process group to be able
>> to use the trick where you send to a local "remote" input port to load
>> balance. It would be great to be able to hide that detail within a process
>> group, but the reuse of ports for both purposes prevents it.
>>
>> Could we add an option to select whether the port is for a process group
>> or should listen for remote connections? That seems like an easy way to
>> solve the problem, though I think adding an option to load balance a
>> connection in cluster mode would solve the problem more cleanly. But that
>> would be more work, right?
>>
>> rb
>>
>>
>> -- Ryan Blue
>> Software Engineer
>> Cloudera, Inc.

Re: Site to Site not working within process groups

Posted by Mark Payne <ma...@hotmail.com>.
Ryan,

We definitely want to get to the point that we have the ability to 
load-balance the data in a connection across a cluster. But yes, that is 
a fairly large undertaking.

I can definitely appreciate that it would be more convenient to add a 
port in the middle of a group, but I would shy away from implementing 
something like that as a stop-gap when the load-balanced connection is 
the true desire. This stop-gap really would be quite a lot of work, as 
well, and I think it would introduce more confusion.

Thanks
-Mark


------ Original Message ------
From: "Ryan Blue" <bl...@cloudera.com>
To: dev@nifi.incubator.apache.org
Sent: 4/6/2015 11:32:12 AM
Subject: Re: Site to Site not working within process groups

>On 04/03/2015 03:36 PM, Matt Gilman wrote:
>>Ricky,
>>
>>What you're seeing is by design. While the approach can be limiting, 
>>especially if you're looking to expose an [in|out]put port remotely in a 
>>sub group, it was done for consistency and simplicity. Groups can have 
>>input and output ports. This facilitates data flow into and out of the 
>>groups. When this is done at the root level, it allows us to abstract 
>>a NiFi instance as a group to a remote NiFi.
>>
>>Matt Gilman
>
>I think the problem is that sometimes you want a process group to be 
>able to use the trick where you send to a local "remote" input port to 
>load balance. It would be great to be able to hide that detail within a 
>process group, but the reuse of ports for both purposes prevents it.
>
>Could we add an option to select whether the port is for a process 
>group or should listen for remote connections? That seems like an easy 
>way to solve the problem, though I think adding an option to load 
>balance a connection in cluster mode would solve the problem more 
>cleanly. But that would be more work, right?
>
>rb
>
>
>-- Ryan Blue
>Software Engineer
>Cloudera, Inc.

Re: Site to Site not working within process groups

Posted by Ryan Blue <bl...@cloudera.com>.
On 04/03/2015 03:36 PM, Matt Gilman wrote:
> Ricky,
>
> What you're seeing is by design. While the approach can be limiting, especially if you're looking to expose an [in|out]put port remotely in a sub group, it was done for consistency and simplicity. Groups can have input and output ports. This facilitates data flow into and out of the groups. When this is done at the root level, it allows us to abstract a NiFi instance as a group to a remote NiFi.
>
> Matt Gilman

I think the problem is that sometimes you want a process group to be 
able to use the trick where you send to a local "remote" input port to 
load balance. It would be great to be able to hide that detail within a 
process group, but the reuse of ports for both purposes prevents it.

Could we add an option to select whether the port is for a process group 
or should listen for remote connections? That seems like an easy way to 
solve the problem, though I think adding an option to load balance a 
connection in cluster mode would solve the problem more cleanly. But 
that would be more work, right?

rb


-- 
Ryan Blue
Software Engineer
Cloudera, Inc.

Re: Site to Site not working within process groups

Posted by Matt Gilman <ma...@gmail.com>.
Ricky,

What you're seeing is by design. While the approach can be limiting, especially if you're looking to expose an [in|out]put port remotely in a sub group, it was done for consistency and simplicity. Groups can have input and output ports. This facilitates data flow into and out of the groups. When this is done at the root level, it allows us to abstract a NiFi instance as a group to a remote NiFi.
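
The rule described above can be pictured with a toy model. This is a
Python sketch for illustration only: the class and method names are
hypothetical and do not correspond to NiFi's internal classes. It simply
encodes "a port is visible to a remote NiFi only when it sits on the
root group."

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessGroup:
    name: str
    parent: Optional["ProcessGroup"] = None  # None marks the root group

    @property
    def is_root(self) -> bool:
        return self.parent is None


@dataclass
class Port:
    name: str
    group: ProcessGroup

    def reachable_via_site_to_site(self) -> bool:
        # Ports on the root group are what a remote NiFi sees; ports in
        # a sub group only move data between that group and its parent.
        return self.group.is_root
```

In this model, a port named "in" on the root group is reachable
remotely, while a port with the same name inside a nested group is not,
which matches the behaviour reported at the start of the thread.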

Matt Gilman

Sent from my iPhone

> On Apr 3, 2015, at 5:41 PM, Ricky Saltzer <ri...@cloudera.com> wrote:
> 
> Thanks for the heads up!
> 
> *Top Level *(works):
> https://s3.amazonaws.com/uploads.hipchat.com/108018/846877/Q5x6A0VikUraEpS/upload.png
> 
> *Inside Process Group *(remote doesn't work):
> https://s3.amazonaws.com/uploads.hipchat.com/108018/846877/iSpaYoUmGRq7dIz/upload.png
> 
> 
>> On Fri, Apr 3, 2015 at 5:31 PM, Sean Busbey <bu...@cloudera.com> wrote:
>> 
>> ASF mailing lists strip attachments. Can you post the images in a
>> pastebin?
>> 
>>> On Fri, Apr 3, 2015 at 4:23 PM, Ricky Saltzer <ri...@cloudera.com> wrote:
>>> 
>>> I just want to make sure this is a supported feature before I open a
>> JIRA.
>>> It appears as if I can't create a Site-to-Site connection within a
>> process
>>> group. It's easier to explain visually (see below). Any help would be
>>> appreciated, thanks!
>>> 
>>> *Top Level *(works):
>>> 
>>> [image: Inline image 1]
>>> 
>>> 
>>> *Inside Process Group *(remote doesn't work)
>>> 
>>> [image: Inline image 1]
>>> 
>>> 
>>> --
>>> Ricky Saltzer
>>> http://www.cloudera.com
>> 
>> 
>> --
>> Sean
> 
> 
> 
> -- 
> Ricky Saltzer
> http://www.cloudera.com

Re: Site to Site not working within process groups

Posted by Ricky Saltzer <ri...@cloudera.com>.
Thanks for the heads up!

*Top Level *(works):
https://s3.amazonaws.com/uploads.hipchat.com/108018/846877/Q5x6A0VikUraEpS/upload.png

*Inside Process Group *(remote doesn't work):
https://s3.amazonaws.com/uploads.hipchat.com/108018/846877/iSpaYoUmGRq7dIz/upload.png


On Fri, Apr 3, 2015 at 5:31 PM, Sean Busbey <bu...@cloudera.com> wrote:

> ASF mailing lists strip attachments. Can you post the images in a
> pastebin?
>
> On Fri, Apr 3, 2015 at 4:23 PM, Ricky Saltzer <ri...@cloudera.com> wrote:
>
> > I just want to make sure this is a supported feature before I open a
> JIRA.
> > It appears as if I can't create a Site-to-Site connection within a
> process
> > group. It's easier to explain visually (see below). Any help would be
> > appreciated, thanks!
> >
> > *Top Level *(works):
> >
> > [image: Inline image 1]
> >
> >
> > *Inside Process Group *(remote doesn't work)
> >
> > [image: Inline image 1]
> >
> >
> > --
> > Ricky Saltzer
> > http://www.cloudera.com
> >
> >
>
>
> --
> Sean
>



-- 
Ricky Saltzer
http://www.cloudera.com

Re: Site to Site not working within process groups

Posted by Sean Busbey <bu...@cloudera.com>.
ASF mailing lists strip attachments. Can you post the images in a pastebin?

On Fri, Apr 3, 2015 at 4:23 PM, Ricky Saltzer <ri...@cloudera.com> wrote:

> I just want to make sure this is a supported feature before I open a JIRA.
> It appears as if I can't create a Site-to-Site connection within a process
> group. It's easier to explain visually (see below). Any help would be
> appreciated, thanks!
>
> *Top Level *(works):
>
> [image: Inline image 1]
>
>
> *Inside Process Group *(remote doesn't work)
>
> [image: Inline image 1]
>
>
> --
> Ricky Saltzer
> http://www.cloudera.com
>
>


-- 
Sean