Posted to dev@geode.apache.org by Jens Deppe <je...@apache.org> on 2018/01/23 22:36:36 UTC

Addition of rmi-io library

Apologies that this was not raised earlier in discussion but I'm happy to
describe it now.

*Background:*

When deploying jars into Geode they are moved through the system as simple
byte[] blobs. This obviously consumes memory. The various affected areas
are:

- gfsh reads the jars into memory
- the jars are pushed to the locator (via a jmx call) - again creating a
byte[] blob on the locator
- from the locator, the jars are pushed to all servers via a function call
(also sending the jars as byte[] blobs).

If the jar is small this is not a problem. However, in memory-constrained
systems or with large jars this puts pressure on the heap and can result in
OOM situations. In fact, the reason this came up was that some folks were
unable to deploy a 40MB jar to a locator with a 512MB heap.
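
To illustrate the difference, here is a minimal sketch (plain JDK I/O, not
Geode code) of moving a file through one small, fixed-size buffer instead of
reading it into a single byte[]. The class name, method name and buffer size
are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamingCopy {
    // Hypothetical buffer size chosen for the example; not a Geode setting.
    static final int BUFFER_SIZE = 8 * 1024;

    /** Copies src to dst through one fixed-size buffer and returns the byte count. */
    public static long copyStreaming(Path src, Path dst) throws IOException {
        long total = 0;
        byte[] buffer = new byte[BUFFER_SIZE];
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = Files.newOutputStream(dst)) {
            int n;
            // Only BUFFER_SIZE bytes are held in memory at any time,
            // regardless of how large the file is.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("example", ".jar");
        Path dst = Files.createTempFile("copy", ".jar");
        try {
            // A 1 MB stand-in for a jar (allocated in one piece only to set up the demo).
            Files.write(src, new byte[1024 * 1024]);
            long copied = copyStreaming(src, dst);
            System.out.println("copied " + copied + " bytes using a "
                + BUFFER_SIZE + "-byte buffer");
        } finally {
            Files.deleteIfExists(src);
            Files.deleteIfExists(dst);
        }
    }
}
```

With the byte[] approach, each hop (gfsh, locator, server) needs heap
proportional to the jar size; with a streaming copy the per-hop cost is
roughly the buffer size.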

*rmi-io:*

After doing some research, it seemed that the ideal solution would be
something that allows for serializing Input/OutputStreams. Java doesn't
provide anything natively.

One library that stood out as being robust and feature complete was rmi-io
[1]. It allows a remote Input/OutputStream object to be serialized, which
lets us avoid pulling the jars being deployed into memory anywhere along
the way. Under the covers it uses RMI and allows for either 'pulling' or
'pushing' data. The reference page [2] has nice sequence diagrams.
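
The 'pulling' idea can be sketched with plain java.rmi. This is NOT the
rmi-io API and not the Geode change; ChunkSource, StreamChunkSource and
roundTrip are hypothetical names invented for the example, and the real
library presumably does considerably more. The receiver holds a remote
reference and asks for one chunk at a time, so neither side ever needs the
whole payload in a single array for the transfer itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;
import java.util.Arrays;

public class PullStreamSketch {

    /** Hypothetical remote interface; not part of rmi-io. */
    public interface ChunkSource extends Remote {
        /** Returns up to max bytes, or an empty array at end of stream. */
        byte[] nextChunk(int max) throws RemoteException;
    }

    /** Sender side: wraps a local InputStream and serves it chunk by chunk. */
    static class StreamChunkSource implements ChunkSource {
        private final InputStream in;

        StreamChunkSource(InputStream in) {
            this.in = in;
        }

        @Override
        public synchronized byte[] nextChunk(int max) throws RemoteException {
            try {
                byte[] buf = new byte[max];
                int n = in.read(buf);
                return n <= 0 ? new byte[0] : Arrays.copyOf(buf, n);
            } catch (IOException e) {
                throw new RemoteException("read failed", e);
            }
        }
    }

    /** Receiver side: pulls chunks through the stub until the source is drained. */
    static byte[] pullAll(ChunkSource remote, int chunkSize) throws RemoteException {
        // Stands in for a file on disk; a real receiver would write to a FileOutputStream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] chunk;
        while ((chunk = remote.nextChunk(chunkSize)).length > 0) {
            sink.write(chunk, 0, chunk.length);
        }
        return sink.toByteArray();
    }

    /** Exports a source, pulls everything back through its stub, then unexports it. */
    public static byte[] roundTrip(byte[] payload, int chunkSize) throws RemoteException {
        // Keep the demo on loopback (only takes effect if set before RMI initializes).
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        StreamChunkSource source = new StreamChunkSource(new ByteArrayInputStream(payload));
        // Port 0 asks RMI for an anonymous port.
        ChunkSource stub = (ChunkSource) UnicastRemoteObject.exportObject(source, 0);
        try {
            return pullAll(stub, chunkSize);
        } finally {
            // Stop accepting calls for this object once the transfer is done.
            UnicastRemoteObject.unexportObject(source, true);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[100_000];
        Arrays.fill(payload, (byte) 7);
        byte[] received = roundTrip(payload, 4096);
        System.out.println("received " + received.length + " bytes, equal="
            + Arrays.equals(payload, received));
    }
}
```

A 'push' variant would invert the roles: the receiver exports a sink object
and the sender calls a write method on it.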

If anyone sees any issues with this, please do raise them. The current
usage has not changed any user-facing interaction, so if we needed to
replace this fix with a different implementation, the change would have no
external effect.

Thanks
--Jens

[1] http://openhms.sourceforge.net/rmiio
[2] http://openhms.sourceforge.net/rmiio/class_reference.html

Re: Addition of rmi-io library

Posted by Jens Deppe <jd...@pivotal.io>.
Right - it needs to piggy-back on the existing JMX/RMI port, in which case
no new ports will be required.

We're testing this now.

--Jens

On Wed, Jan 24, 2018 at 9:58 AM, Jinmei Liao <ji...@pivotal.io> wrote:

> yeah, Jens just found that out too. It's opening up a new port in both the
> server/server and gfsh/jmxManager cases. I think he has a solution to it and
> we will get it in soon.
>

Re: Addition of rmi-io library

Posted by Jens Deppe <jd...@pivotal.io>.
To address this we have two Jiras tracking the issue:

GEODE-4370 - jar deployment should not require additional ports to be opened
GEODE-4379 - gfsh deploy should push jars not have the server pull them

Essentially these will result in no new RMI ports being used/opened.

Code for 4370 is already checked in and 4379 is currently being tested.

--Jens

On Wed, Jan 24, 2018 at 11:42 AM, Anthony Baker <ab...@pivotal.io> wrote:

> Mike,
>
> ??  I think you meant 1.4.0 ??
>
> Let’s keep the release discussion on the VOTE thread.  IMO, adding another
> port has implications for security, firewalls, containers, bind addresses,
> etc.  I haven’t cast my release vote yet since I’m still considering these
> things.
>
> Anthony
>
>

Re: Addition of rmi-io library

Posted by Anthony Baker <ab...@pivotal.io>.
Mike,

??  I think you meant 1.4.0 ??

Let’s keep the release discussion on the VOTE thread.  IMO, adding another port has implications for security, firewalls, containers, bind addresses, etc.  I haven’t cast my release vote yet since I’m still considering these things.

Anthony


> On Jan 24, 2018, at 11:24 AM, Michael Stolz <ms...@pivotal.io> wrote:
> 
> There is going to be a 9.3.1 shortly after 9.3.0. Let's not hold 9.3.0 for
> this.
> 
> --
> Mike Stolz
> Principal Engineer - Gemfire Product Manager
> Mobile: 631-835-4771
> 


Re: Addition of rmi-io library

Posted by Michael Stolz <ms...@pivotal.io>.
There is going to be a 9.3.1 shortly after 9.3.0. Let's not hold 9.3.0 for
this.

--
Mike Stolz
Principal Engineer - Gemfire Product Manager
Mobile: 631-835-4771

On Jan 24, 2018 10:58 AM, "Jinmei Liao" <ji...@pivotal.io> wrote:

yeah, Jens just found that out too. It's opening up a new port in both the
server/server and gfsh/jmxManager cases. I think he has a solution to it and
we will get it in soon.


Re: Addition of rmi-io library

Posted by Jinmei Liao <ji...@pivotal.io>.
yeah, Jens just found that out too. It's opening up a new port in both the
server/server and gfsh/jmxManager cases. I think he has a solution to it and
we will get it in soon.

On Wed, Jan 24, 2018 at 9:47 AM, Dan Smith <ds...@pivotal.io> wrote:

> >
> > the content is going over the wire on whatever port was in use before.
>
>
> From what I see, DownloadJarFunction is calling
> SimpleRemoteInputStream.export(), which will call
> UnicastRemoteObject.exportObject. That's an RMI call that starts a TCP
> server socket listening for connections to interact with that object.
>
> -Dan
>



-- 
Cheers

Jinmei

Re: Addition of rmi-io library

Posted by Dan Smith <ds...@pivotal.io>.
>
> the content is going over the wire on whatever port was in use before.


From what I see, DownloadJarFunction is calling
SimpleRemoteInputStream.export(), which will call
UnicastRemoteObject.exportObject. That's an RMI call that starts a TCP
server socket listening for connections to interact with that object.
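
One way to observe this with only the JDK is to hand RMI a server socket
factory that records what it is asked to create. The sketch below is an
illustration, not the Geode or rmi-io code; Ping, PingImpl and
RecordingFactory are hypothetical names. It uses the four-argument
UnicastRemoteObject.exportObject overload, where a null client socket
factory means "use the default".

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.RMIServerSocketFactory;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;
import java.util.List;

public class ExportOpensListener {

    /** Hypothetical remote interface used only for this demo. */
    public interface Ping extends Remote {
        String ping() throws RemoteException;
    }

    static class PingImpl implements Ping {
        @Override
        public String ping() {
            return "pong";
        }
    }

    /** Records the local port of every ServerSocket that RMI asks for. */
    static class RecordingFactory implements RMIServerSocketFactory {
        final List<Integer> boundPorts = new ArrayList<>();

        @Override
        public synchronized ServerSocket createServerSocket(int port) throws IOException {
            ServerSocket ss = new ServerSocket(port);
            boundPorts.add(ss.getLocalPort());
            return ss;
        }
    }

    /** Exports an object on an anonymous port and returns the ports RMI bound. */
    public static List<Integer> exportAndReport() throws RemoteException {
        RecordingFactory factory = new RecordingFactory();
        PingImpl impl = new PingImpl();
        // Port 0 = let the OS pick; this is where the listening socket is created.
        UnicastRemoteObject.exportObject(impl, 0, null, factory);
        try {
            return new ArrayList<>(factory.boundPorts);
        } finally {
            // Unexport so the object no longer accepts remote calls.
            UnicastRemoteObject.unexportObject(impl, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("ports bound by export: " + exportAndReport());
    }
}
```

The same factory hook is where a TLS-capable server socket factory (for
example javax.rmi.ssl.SslRMIServerSocketFactory from the JDK) could be
supplied, and passing an explicit port instead of 0 pins the listener to a
known port.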

-Dan

On Tue, Jan 23, 2018 at 6:15 PM, Jinmei Liao <ji...@pivotal.io> wrote:

> As far as I can see, we are utilizing the streaming capability provided by
> rmi-io, and the content is going over the wire on whatever port was in use
> before. When streaming content from gfsh to the jmxManager, it's using the
> JMX port; when getting jars between the locator and servers, it's using the
> FunctionService, so it's whatever communication channel the
> FunctionService is using.
>
> All the FileContent are saved in a temp folder and get cleaned up after each
> deployment.
>

Re: Addition of rmi-io library

Posted by Jinmei Liao <ji...@pivotal.io>.
As far as I can see, we are utilizing the streaming capability provided by
rmi-io, and the content is going over the wire on whatever port was in use
before. When streaming content from gfsh to the jmxManager, it's using the
JMX port; when getting jars between the locator and servers, it's using the
FunctionService, so it's whatever communication channel the
FunctionService is using.

All the FileContent are saved in a temp folder and get cleaned up after each
deployment.

On Tue, Jan 23, 2018 at 3:17 PM, Dan Smith <ds...@pivotal.io> wrote:

> I don't have an issue with the dependency. But if we are opening up new
> ports for RMI connections, that seems like a potential security risk. If
> someone has enabled cluster SSL we shouldn't be opening up an insecure port
> for RMI connections.
>
> We should also make sure this is not leaking open sockets/file descriptors.
> How does this SimpleRemoteInputStream we are creating get shut down and
> cleaned up?
>
> -Dan
>



-- 
Cheers

Jinmei

Re: Addition of rmi-io library

Posted by Dan Smith <ds...@pivotal.io>.
I don't have an issue with the dependency. But if we are opening up new
ports for RMI connections, that seems like a potential security risk. If
someone has enabled cluster SSL we shouldn't be opening up an insecure port
for RMI connections.

We should also make sure this is not leaking open sockets/file descriptors.
How does this SimpleRemoteInputStream we are creating get shut down and
cleaned up?

-Dan
