Posted to dev@shindig.apache.org by "Merrill, Matt" <mm...@mitre.org> on 2014/09/04 17:04:58 UTC

Re: Shindig rpc calls to itself

Hi all,

I haven’t heard back on this, so I thought I’d provide some more
information in the hopes that someone has some ideas as to what could be
causing the issues we’re seeing with Shindig’s “loopback” HTTP calls.

We have a situation where, under load, we hit a deadlock-like state
because of the HTTP calls Shindig makes to itself when pipelining gadget
data. Basically, the HTTP request thread pool inside our Shindig Tomcat
container gets maxed out: when Shindig makes an HTTP RPC call to itself
to render a gadget that pipelines data, the request is held up waiting
for the RPC call, which may itself be blocked waiting for the Tomcat
container to free a thread to handle it.  This only happens under load,
of course.
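
To make the failure mode concrete, here is a toy demo using nothing but
java.util.concurrent (nothing Shindig-specific, all names made up):
tasks on a fixed-size pool that issue blocking calls which can only be
served by that same pool deadlock as soon as the pool saturates.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class LoopbackExhaustionDemo {
    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the Tomcat HTTP thread pool.
        final ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                // Each "request" makes a "loopback call" that must be
                // served by the same pool...
                Future<String> loopback = pool.submit(() -> "rpc response");
                // ...and blocks waiting for it.  Once all four workers
                // are parked here, the queued inner tasks can never run:
                // the pool is deadlocked.
                return loopback.get();
            });
        }

        pool.shutdown();
        // Prints "drained: false" -- the pool never drains.
        System.out.println("drained: " + pool.awaitTermination(5, TimeUnit.SECONDS));
    }
}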

This is puzzling to me because when we were running Shindig 2.0.0 we had
the same size thread pool, and now that we’ve upgraded to Shindig
2.5.0-update1 the thread pools are getting maxed out.  I took some
timings inside our various Shindig SPI implementations (PersonService,
AppDataService) and I didn’t see anything alarming.  There are also no
spikes in user traffic.

As I see it, there are a few options I could explore:

1) The “nuclear” option would be to simply increase our Tomcat HTTP
thread pools, but that doesn’t seem prudent since the old version of
Shindig worked just fine with that thread pool setting.  I feel like a
greater problem is being masked. Is there anything that changed between
Shindig 2.0.0 and 2.5.0-update1 that could have caused some kind of
increase in traffic to Shindig?  I tried looking at the release notes in
Jira, but that honestly wasn’t very helpful at all.

2) Reconfigure Shindig to use the implemented SPI methods (Java method
calls) instead of making HTTP calls to itself through the RPC API
Shindig exposes.  Based on Stanton’s note below, it seems there are some
configuration options for the RPC calls, but they’re mostly related to
how the client-side JavaScript makes the calls.  Is there anything
server-side I can configure?  Perhaps with Guice modules?

3) Explore whether there are hooks in the code that would let us write
custom code to do this. I see that the javadoc for
PipelinedDataPreloader.executeSocialRequest mentions:
"Subclasses can override to provide special handling (e.g., directly
invoking a local API)."  However, I’m missing something, because I can’t
find where the preloader gets instantiated.  I see that PipelineExecutor
takes a Guice-injected instance of PipelinedDataPreloader, but I don’t
see it getting created anywhere.  Where is this being configured?  (A
rough sketch of the kind of override I have in mind follows below.)
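
For concreteness, this is the kind of override I have in mind.  It is
only a sketch: the executeSocialRequest signature is my guess based on
the javadoc and on the HttpRequest/HttpResponse types used elsewhere in
Shindig (I have not checked it against the 2.5.0-update1 source), and
LocalApiDataPreloader is just a name I made up.

import org.apache.shindig.gadgets.GadgetException;
import org.apache.shindig.gadgets.http.HttpRequest;
import org.apache.shindig.gadgets.http.HttpResponse;
import org.apache.shindig.gadgets.preload.PipelinedDataPreloader;

// Sketch only: the method signature below is assumed, not verified, and
// a constructor matching the superclass's @Inject constructor would
// also be needed for this to compile.
public class LocalApiDataPreloader extends PipelinedDataPreloader {
    @Override
    protected HttpResponse executeSocialRequest(HttpRequest request)
            throws GadgetException {
        // The goal: dispatch to the local social API handlers here
        // instead of letting the default implementation make a loopback
        // HTTP call to /rpc.  Until I find where the preloader is
        // instantiated and wired up, this just delegates to the default
        // behavior.
        return super.executeSocialRequest(request);
    }
}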

Any help is appreciated!

Thanks!
-Matt

On 8/25/14, 4:55 PM, "Merrill, Matt" <mm...@mitre.org> wrote:

>Thanks Stanton!
>
>I'm assuming that you mean the javascript calls will call listMethods then
>make any necessary RPC calls, is that correct?  Is there any other
>documentation on the introspection part?
>
>The reason I ask is that we're having problems server side when Shindig is
>pipelining data.  For example, when you do the following in a gadget:
><os:ViewerRequest key="viewer" />
>    <os:DataRequest key="appData" method="appdata.get" userId="@viewer"
>appId="@app"/>
>
>
>Shindig appears to make HTTP requests to the rpc endpoint to itself in the
>process of rendering the gadget.  I could be missing something fundamental
>here, but is there any way to configure this differently so that shindig
>simply uses its SPI methods to retrieve this data instead?  Is this really
>just more of a convenience for the gadget developer than anything else?
>
>-Matt
>
>On 8/20/14, 4:14 PM, "Stanton Sievers" <ss...@apache.org> wrote:
>
>>Hi Matt,
>>
>>This behavior is configured in container.js in the "gadgets.features"
>>object.  If you look for "osapi" and "osapi.services", you'll see some
>>comments about this configuration and the behavior.
>>features/container/service.js is where this configuration is used and
>>where
>>the osapi services are instantiated.  As you've seen, Shindig introspects
>>to find available services by default.
>>
>>If I knew at one point why this behaves this way, I've since forgotten.
>>There is a system.listMethods API[1] defined in the Core API Server spec
>>that this might simply be re-using to discover the available services.
>>
>>I hope that helps.
>>
>>-Stanton
>>
>>[1]
>>http://opensocial.github.io/spec/trunk/Core-API-Server.xml#System-Service-ListMethods
>>
>>
>>On Tue, Aug 19, 2014 at 8:13 AM, Merrill, Matt <mm...@mitre.org>
>>wrote:
>>
>>> Good morning,
>>>
>>> I'm hoping some Shindig veterans can help shed some light on the
>>> reason that Shindig makes HTTP rpc calls to itself as part of the
>>> gadget rendering process.  Why is this done as opposed to retrieving
>>> information via internal Java method calls?  We are having lots of
>>> issues where this approach seems to be causing a cascading failure
>>> when calls get hung up in the HttpFetcher class.
>>>
>>> Also, I'm curious what calls are made in this manner and how can
>>> they be configured?  I have seen retrieval of viewer data done this
>>> way, as well as application data.
>>>
>>> I've looked for documentation on this topic before and have not
>>> seen any.  Any help is much appreciated.
>>>
>>> Thanks!
>>> -Matt Merrill
>>>
>


Re: Shindig rpc calls to itself

Posted by "Merrill, Matt" <mm...@mitre.org>.
Ok thanks, I’m probably reiterating more about the problem itself because
it took me so long to figure out, ha :)

I’ve rendered gadgets and watched internal/external calls with the logging
I’ve instrumented, but I don’t have the same instrumentation in the old
version of the code we have.  I could put that in, though we are about to
pull the plug on this effort for a while (other priorities) and continue
using Shindig 2.0.0.

Our container is almost exactly the same with the exception of the
BlobEncrypter stuff that changed between Shindig 2.0.0 and 2.5.0.  We are
also now retrieving the container js this way:
{shindig host}/gmodules/gadgets/js/container?c=1&container=ourContainerName

Instead of this way:
{shindig host}/gmodules/gadgets/js/rpc.js?container=ourContainerName&c=1

As best I can tell from the wiki, the new container endpoint is the way
we should be doing it now.

-Matt

On 9/5/14, 1:34 PM, "Ryan Baxter" <rb...@gmail.com> wrote:



Re: Shindig rpc calls to itself

Posted by Ryan Baxter <rb...@gmail.com>.
I understand the issue; I am just trying to understand the root cause, like you ;)

I guess what I am really wondering is whether you have done any analysis
of additional calls being made between 2.0 and 2.5.0-update1.  Have you
taken the same gadget, rendered it using 2.0, observed how many requests
go to /rpc, and then done the same thing with 2.5.0-update1?  It would
be nice to know whether there is really more traffic going to the
servlet and, if so, where it is coming from.  I think we need to work
backwards by identifying the source of the additional requests Shindig
is making to itself to understand the cause of the problem.  Off the top
of my head I personally can't think of anything that would cause this,
nor of any way to stop the requests from happening.

What about the container code you are using to render the gadget?  Are
you using the common container in 2.5.0-update1 or is the container
code the same and the only difference is the server code?
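
If it helps with the measuring, a dirt-simple servlet filter along these
lines (a sketch, not anything that exists in Shindig) mapped in front of
the Shindig servlets would give you per-endpoint request counts that you
could diff between the 2.0 and 2.5.0-update1 deployments:

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

// Counts requests per URI so the two versions can be compared under the
// same load.  Dump or log the map however is convenient.
public class RequestCountFilter implements Filter {
    private final ConcurrentHashMap<String, AtomicLong> counts =
            new ConcurrentHashMap<String, AtomicLong>();

    public void doFilter(ServletRequest req, ServletResponse res,
            FilterChain chain) throws IOException, ServletException {
        String uri = ((HttpServletRequest) req).getRequestURI();
        AtomicLong count = counts.get(uri);
        if (count == null) {
            AtomicLong fresh = new AtomicLong();
            count = counts.putIfAbsent(uri, fresh);
            if (count == null) {
                count = fresh;
            }
        }
        count.incrementAndGet();
        chain.doFilter(req, res);
    }

    public void init(FilterConfig config) {}

    public void destroy() {}

    public Map<String, AtomicLong> getCounts() {
        return counts;
    }
}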

On Fri, Sep 5, 2014 at 12:48 PM, Merrill, Matt <mm...@mitre.org> wrote:

Re: Shindig rpc calls to itself

Posted by "Merrill, Matt" <mm...@mitre.org>.
Yes, I added logging for every call out the door, and the majority of
the calls that are holding up incoming threads are HTTP calls back to
Shindig itself, most notably to the /rpc servlet endpoint.

Basically, because Shindig makes a loopback HTTP call to itself and
that’s on the same HTTP thread pool, the pool starts to get exhausted
and there’s a cascading failure.  However, that thread pool is the same
size as we had on Shindig 2.0.0, so I really can’t explain the
difference.

I’m really wondering about any differences between 2.0.0 and
2.5.0-update1 that might cause additional HTTP calls to be made, and
whether you can configure or code Shindig not to make these loopbacks.

-Matt

On 9/5/14, 12:42 PM, "Ryan Baxter" <rb...@apache.org> wrote:



Re: Shindig rpc calls to itself

Posted by Ryan Baxter <rb...@apache.org>.
So have you looked at what resources the fetcher is fetching?

On Fri, Sep 5, 2014 at 12:17 PM, Merrill, Matt <mm...@mitre.org> wrote:

Re: Shindig rpc calls to itself

Posted by "Merrill, Matt" <mm...@mitre.org>.
Yes, we have.  During a couple of the outages we did a thread dump and
saw that all (or almost all) of the threads were blocked in the
BasicHttpFetcher fetch method. We also saw the number of threads jump up
to around the same number of threads we have in our Tomcat HTTP thread
pool (300).

As best I can tell, the issue is that there are now MORE calls being
made to the various Shindig servlets, which is causing all of the HTTP
threads to be consumed, but we can’t explain why, as the load is the
same. Once we roll back to the version of the application which uses
Shindig 2.0.0, everything is absolutely fine.

I’m very hesitant to just increase the thread pool without a good
understanding of what could cause this.  If someone knows of something
that changed between the 2.0.0 and 2.5.0-update1 versions that may have
caused more calls to be made, whether through the OpenSocial Java API or
internally inside Shindig, that would be great to know.

Or perhaps a configuration parameter was introduced that we have set
wrong, and that is causing all these extra calls?

We have already made sure our HTTP responses are cached at a very high
level, per your excellent advice. However, the majority of the calls
that seem to be taking a long time are RPC calls, and those don’t appear
to get cached anyway, so caching wouldn’t affect this problem.

And if someone knows the answers to the configuration/extension questions
about pipelining, that would be great.

Thanks!

-Matt

On 9/5/14, 11:35 AM, "Ryan Baxter" <rb...@apache.org> wrote:



Re: Shindig rpc calls to itself

Posted by Ryan Baxter <rb...@apache.org>.
So Matt, have you looked into what those threads are doing?  I agree
that it seems odd that with 2.5.0-update1 you are running out of
threads, but it is hard to pinpoint the reason without knowing what all
those extra threads might be doing.


On Thu, Sep 4, 2014 at 11:04 AM, Merrill, Matt <mm...@mitre.org> wrote:
> 3) Explore whether there are hooks in the code that would let us write
> custom code to do this.  I see that the javadoc for
> PipelinedDataPreloader.executeSocialRequest mentions: "Subclasses can
> override to provide special handling (e.g., directly invoking a local
> API)."  However, I'm missing something, because I can't find where the
> preloader gets instantiated.  I see that PipelineExecutor takes a
> Guice-injected instance of PipelinedDataPreloader, but I don't see it
> getting created anywhere.  Where is this being configured?

The intention was probably to make this possible via Guice, but there is
no interface you can bind an implementation to.  You would have to
replace the classes where PipelinedDataPreloader is used and then keep
going up the chain until you get to a class where you can inject
something via Guice.  It looks like a messy situation right now with the
way the code is written.
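
That said, Guice will bind a concrete class to a subtype, so if
PipelinedDataPreloader itself is what gets injected you might get away
with a module along these lines.  Untested sketch: LocalApiDataPreloader
stands for whatever subclass you write, and whether the binding actually
takes effect depends on how the surrounding classes obtain their
preloader instance.

import com.google.inject.AbstractModule;

import org.apache.shindig.gadgets.preload.PipelinedDataPreloader;

// Untested: redirects injection points that ask for the concrete
// PipelinedDataPreloader class to a local-API subclass.
public class LocalPreloadModule extends AbstractModule {
    @Override
    protected void configure() {
        bind(PipelinedDataPreloader.class).to(LocalApiDataPreloader.class);
    }
}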
