Posted to dev@shindig.apache.org by Paul Lindner <li...@inuus.com> on 2009/06/11 22:24:21 UTC

Re: [round trip compatibility] Re: rpc.js wire compatibility

There's actually a larger problem here, and it affects more than just the rpc
calls.
One problem area is upgrading a cluster of Shindig machines while ensuring that
iframe content matches up with container JS and forced-libs JS content.

If you upgrade the cluster in place you'll end up with requests for
JavaScript going to hosts still running the old build, whose output is then
cached by the browser.  This is especially problematic because Shindig always
responds 'not-modified' to an If-Modified-Since (IMS) request when a v= param
is present.

If you have session affinity or other load-balancer tricks you might not
have this problem; however, with a CDN in place the request for the JS
content comes from the CDN host, not the user's browser.

The solution used at hi5 was to add a 'generation' param to all versioned
URLs.  To do the rollout you then did:

  Rolling upgrade of hosts from v to v+1
  Bump generation number
  Rolling restart of all hosts
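
For illustration, here's a rough sketch of what tacking a generation param
onto versioned URLs might look like.  The class, method, and property names
are made up for this sketch, not actual Shindig code:

    // Hypothetical helper; the property name is illustrative only.
    public final class GenerationParam {
      // Bumped (followed by a rolling restart) only after every host
      // is already running the new build.
      private static final String GENERATION =
          System.getProperty("shindig.generation", "1");

      /** Appends gen= so browsers/CDNs re-fetch JS after a rollout. */
      public static String withGeneration(String versionedUrl) {
        String sep = versionedUrl.contains("?") ? "&" : "?";
        return versionedUrl + sep + "gen=" + GENERATION;
      }
    }

e.g. withGeneration("/gadgets/js/core.js?v=abc") returns
"/gadgets/js/core.js?v=abc&gen=1".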

One possible solution is to compare the v= param the browser sends with the
internal hash code as calculated by the server.  If they mismatch you can
set a low expiration value.  (Although this still doesn't help with IMS
requests.)
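
Roughly like this (hypothetical names, not the actual JsServlet code;
serverComputedHash stands in for however the server derives its JS hash):

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Compare the v= the client sent against the hash this server would
    // emit, and shorten the cache lifetime on a mismatch so stale copies
    // age out quickly.
    final class VersionAwareCaching {
      static void setCacheHeaders(HttpServletRequest req,
                                  HttpServletResponse resp,
                                  String serverComputedHash) {
        String clientVersion = req.getParameter("v");
        if (clientVersion != null && !clientVersion.equals(serverComputedHash)) {
          // Version skew (e.g. mid-rollout): let caches drop this quickly.
          resp.setHeader("Cache-Control", "public, max-age=60");
        } else {
          resp.setHeader("Cache-Control", "public, max-age=31536000");
        }
      }
    }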

The other idea is to have separate instances and change the parent path on a
per-build basis for all gadgets/js requests.

  /opensocial-v1/gadgets/js
  /opensocial-v2/gadgets/js

This has the nice side effect that you can deploy both versions side by side
and drain down the old for the new.
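
The point is that the iframe URL and the JS URLs both hang off the same
per-build prefix, so a rendered page can never mix builds.  A made-up helper
to illustrate (not actual Shindig code):

    // Hypothetical: derive the gadget iframe path and the JS path from the
    // same per-build parent path so they always refer to the same build.
    public final class BuildPaths {
      private final String buildPrefix;   // e.g. "/opensocial-v2"

      public BuildPaths(String buildPrefix) {
        this.buildPrefix = buildPrefix;
      }

      public String iframePath() {
        return buildPrefix + "/gadgets/ifr";
      }

      public String jsPath(String features) {
        return buildPrefix + "/gadgets/js/" + features + ".js";
      }
    }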



On Thu, Jun 11, 2009 at 1:07 PM, John Hjelmstad <fa...@google.com> wrote:

> On Thu, Jun 11, 2009 at 1:46 AM, Kevin Brown <et...@google.com> wrote:
>
> > Realistically speaking, 'new' channels aren't going to be an issue. All
> new
> > browsers (and new browser versions) will use postMessage. We have a
> channel
> > that is 'fast enough' for all legacy browsers, and over time we will
> remove
> > libraries rather than add them.
> >
> > The reasons why we might add a new channel:
> >
> > 1. Some big security problem with an existing channel. Most likely we
> will
> > just switch back to IFPC for the browser(s) that are affected if this
> > happens. IE 6 is really the only browser where this is a significant risk
> > --
> > all other browsers (including IE7) are on an auto update path that will
> > make
> > the other legacy channels irrelevant by the end of the year.
>
>
> Agreed.
>
>
> >
> >
> > 2. I can't think of any other good reason. Vanity?
>
>
> I shudder at the prospect of adding yet more rpc code for vanity :)
>
>
> >
> >
> > The real issue is going to be code compatibility itself. Your proposed
> > solution wouldn't make any difference if the code isn't compatible.
> >
> > I stand by what I've said for nearly 2 years on this issue, which is that
> > the
> > only viable option for the rpc feature is for containers to source the
> file
> > directly from the gadget server. Every other approach has been full of
> > compatibility bugs.
>
>
> +1, bottom line.
>
> --John
>
>
> >
> >
> > On Wed, Jun 10, 2009 at 10:03 PM, Brian Eaton <be...@google.com> wrote:
> >
> > > On Wed, Jun 10, 2009 at 6:57 PM, John Hjelmstad<fa...@google.com>
> wrote:
> > > > I don't know of
> > > > any way that one transport would ever talk to another, so the best we
> > can
> > > do
> > > > in such failure cases is to fall back to some common transport that
> all
> > > > browsers support. So it's critically important that integrations
> happen
> > > > properly. It just doesn't work for containers to cache some stale old
> > > > version of rpc.js if the library is changing.
> > >
> > > Hmm.  This feels wrong.  What if the container passed acceptable
> > > transport types on the gadget render URL instead, then the gadget
> > > picked from that list?
> > >
> > > That way there would be no problem if the container didn't support
> > > RMR, but the gadget did.
> > >
> >
>

RE: [round trip compatibility] Re: rpc.js wire compatibility

Posted by "Weygandt, Jon" <jw...@ebay.com>.
Paul,

Did any of your solutions make it back to the code branch?

We have a similar clustered deployment, and would like to use what
exists or take part in creating and testing the solution.

How do you inform the container of the version/generation number of the
server?

Should we start to introduce a version number for some of the features?
Fortunately only a very few features require this (rpc, pubsub
and core are the only ones with code spanning gadget and container).

Jon 


Re: [round trip compatibility] Re: rpc.js wire compatibility

Posted by John Hjelmstad <fa...@google.com>.
On Thu, Jun 11, 2009 at 2:49 PM, Paul Lindner <li...@inuus.com> wrote:

> On Thu, Jun 11, 2009 at 1:35 PM, Kevin Brown <et...@google.com> wrote:
>
> > On Thu, Jun 11, 2009 at 1:24 PM, Paul Lindner <li...@inuus.com> wrote:
> >
> > > There's actually a larger problem here and it affects more than just
> the
> > > rpc
> > > calls.
> > > One problem area is upgrading a cluster of shindig machines to ensure
> > that
> > > iframe content matches up with container js and forced-libs js content.
> > >
> > > If you upgrade the cluster in-place you'll end up with requests for
> > > javascript going to the old build which is then cached by the browser.
> > >  This
> > > is especially problematic because shindig always responds
> 'not-modified'
> > to
> > > an IMS request when a v= param is present.
> >
> >
> > We send not-modified when we get an If-Modified-Since, not when there's a
> v
> > param present.
> >
>
> Actually:
>
>    // If an If-Modified-Since header is ever provided, we always say
>    // not modified. This is because when there actually is a change,
>    // cache busting should occur.
>    if (req.getHeader("If-Modified-Since") != null &&
>        req.getParameter("v") != null) {
>      resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
>      return;
>     }
>
>
>
>
> >
> >
> > > If you have session affinity or other load balancer tricks you might
> not
> > > have this problem, however with a CDN in place the request for the JS
> > > content comes from the CDN host, not the user's browser.
> >
> >
> > If you're using a CDN, or a caching reverse proxy, it's best to have all
> > relevant versions available there for as long as is necessary.
> >
> > We use a caching reverse proxy, which ensures that the 'right' file gets
> > served 99.9% of the time.
> >
>
> Yes those caches get an amazing hit rate, but how do you ensure that a user
> that has the following Iframe:
>
>
>
> http://lfkq9vbe9u4sg98ip8rfvf00l7atcn3d.ig.ig.sandbox.gmodules.com/gadgets/ifr?url=http://www.google.com/ig/modules/fv.xml&libs=core:core.io:core.iglegacy
>
>
> gets the correct JS content here:
>
>
> http://www.sandbox.gmodules.com/gadgets/js/core:core.iglegacy:core.io.js?v=7565e07cd2ecc6d7e363f7e55e79fbc&container=ig&debug=0


Leaving aside the fact that rpc doesn't participate in this ;), the v= param
is computed as a hash of the JS to be emitted. If it (e.g. rpc) changes, v
changes. At that point you need some kind of frontend affinity to match
versions.

Still, that may not happen, and worse yet a CDN might pick up a stale
version of rpc in response to a request carrying the new v= param.

The problem is that v= values are assumed to be properly generated and
consistently served. Rolling server startups (esp. w/o affinity, which IMO
is a high bar to demand of any Shindig deployment) introduce consistency
errors. We could consider v= verification to mitigate this, in JsServlet at
least to start. Efficiently computing v= for gadget IFRAMEs seems more
difficult.
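
As a rough sketch of the JsServlet part (hypothetical; serverComputedHash
stands in for however the server derives its hash), the not-modified shortcut
quoted above would only fire when the client's v= actually matches:

    // Sketch only, not the actual JsServlet code.
    String clientV = req.getParameter("v");
    boolean versionMatches =
        clientV != null && clientV.equals(serverComputedHash);
    if (req.getHeader("If-Modified-Since") != null && versionMatches) {
      resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
      return;
    }
    // On a mismatch, fall through and re-emit the JS, ideally with a
    // short TTL so caches recover quickly.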

--John


>
>
> when you're in the midst of upgrading?
>

Re: [round trip compatibility] Re: rpc.js wire compatibility

Posted by Brian Eaton <be...@google.com>.
On Thu, Jun 11, 2009 at 3:30 PM, Kevin Brown<et...@google.com> wrote:
> Currently, the latter is cached by our reverse proxy so the version
> requested always comes from there. You can still bump into problems if the
> 'new' version winds up getting pulled from an 'old' server, though. We can
> mitigate that by adding an actual check on the v param instead of just using
> it for cache busting. This works if you have a load balancer that allows you
> to bounce the request off of another server on failure, but I doubt many
> organizations have a setup like that.

Leaving aside caching questions for a minute, what about a viable
experiment framework for this code?

If we have wire compatibility (with protocol options chosen by the
container page), we can run experiments where we opt certain users in
to transport changes.
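
For example, something like this, where the rpctx parameter name and the
bucketing are entirely made up, just to show the shape of such an experiment:

    // Hypothetical sketch: the container decides which rpc transports a
    // given user may use and passes them on the gadget render URL, so the
    // gadget-side rpc code picks from that list.
    final class RpcTransportExperiment {
      private static final String DEFAULT_TRANSPORTS = "wpm,ifpc";
      private static final String EXPERIMENT_TRANSPORTS = "wpm,rmr,ifpc";

      static String renderUrlFor(String baseIfrUrl, long userId) {
        // Opt roughly 1% of users into the new transport ordering.
        String transports = (userId % 100 == 0) ? EXPERIMENT_TRANSPORTS
                                                : DEFAULT_TRANSPORTS;
        return baseIfrUrl + "&rpctx=" + transports;
      }
    }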

Cheers,
Brian

Re: [round trip compatibility] Re: rpc.js wire compatibility

Posted by Kevin Brown <et...@google.com>.
On Thu, Jun 11, 2009 at 2:49 PM, Paul Lindner <li...@inuus.com> wrote:

> On Thu, Jun 11, 2009 at 1:35 PM, Kevin Brown <et...@google.com> wrote:
>
> > On Thu, Jun 11, 2009 at 1:24 PM, Paul Lindner <li...@inuus.com> wrote:
> >
> > > There's actually a larger problem here and it affects more than just
> the
> > > rpc
> > > calls.
> > > One problem area is upgrading a cluster of shindig machines to ensure
> > that
> > > iframe content matches up with container js and forced-libs js content.
> > >
> > > If you upgrade the cluster in-place you'll end up with requests for
> > > javascript going to the old build which is then cached by the browser.
> > >  This
> > > is especially problematic because shindig always responds
> 'not-modified'
> > to
> > > an IMS request when a v= param is present.
> >
> >
> > We send not-modified when we get an If-Modified-Since, not when there's a
> v
> > param present.
> >
>
> Actually:
>
>    // If an If-Modified-Since header is ever provided, we always say
>    // not modified. This is because when there actually is a change,
>    // cache busting should occur.
>    if (req.getHeader("If-Modified-Since") != null &&
>        req.getParameter("v") != null) {
>      resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
>      return;
>     }
>
>
>
>
> >
> >
> > > If you have session affinity or other load balancer tricks you might
> not
> > > have this problem, however with a CDN in place the request for the JS
> > > content comes from the CDN host, not the user's browser.
> >
> >
> > If you're using a CDN, or a caching reverse proxy, it's best to have all
> > relevant versions available there for as long as is necessary.
> >
> > We use a caching reverse proxy, which ensures that the 'right' file gets
> > served 99.9% of the time.
> >
>
> Yes those caches get an amazing hit rate, but how do you ensure that a user
> that has the following Iframe:
>
>
>
> http://lfkq9vbe9u4sg98ip8rfvf00l7atcn3d.ig.ig.sandbox.gmodules.com/gadgets/ifr?url=http://www.google.com/ig/modules/fv.xml&libs=core:core.io:core.iglegacy
>
>
> gets the correct JS content here:
>
>
> http://www.sandbox.gmodules.com/gadgets/js/core:core.iglegacy:core.io.js?v=7565e07cd2ecc6d7e363f7e55e79fbc&container=ig&debug=0


Currently, the latter is cached by our reverse proxy so the version
requested always comes from there. You can still bump into problems if the
'new' version winds up getting pulled from an 'old' server, though. We can
mitigate that by adding an actual check on the v param instead of just using
it for cache busting. This works if you have a load balancer that allows you
to bounce the request off of another server on failure, but I doubt many
organizations have a setup like that.

If we wind up using Cache-Control: private, though, this doesn't work any
longer.


>
>
> when you're in the midst of upgrading?
>

Re: [round trip compatibility] Re: rpc.js wire compatibility

Posted by Paul Lindner <li...@inuus.com>.
On Thu, Jun 11, 2009 at 1:35 PM, Kevin Brown <et...@google.com> wrote:

> On Thu, Jun 11, 2009 at 1:24 PM, Paul Lindner <li...@inuus.com> wrote:
>
> > There's actually a larger problem here and it affects more than just the
> > rpc
> > calls.
> > One problem area is upgrading a cluster of shindig machines to ensure
> that
> > iframe content matches up with container js and forced-libs js content.
> >
> > If you upgrade the cluster in-place you'll end up with requests for
> > javascript going to the old build which is then cached by the browser.
> >  This
> > is especially problematic because shindig always responds 'not-modified'
> to
> > an IMS request when a v= param is present.
>
>
> We send not-modified when we get an If-Modified-Since, not when there's a v
> param present.
>

Actually:

    // If an If-Modified-Since header is ever provided, we always say
    // not modified. This is because when there actually is a change,
    // cache busting should occur.
    if (req.getHeader("If-Modified-Since") != null &&
        req.getParameter("v") != null) {
      resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
      return;
    }




>
>
> > If you have session affinity or other load balancer tricks you might not
> > have this problem, however with a CDN in place the request for the JS
> > content comes from the CDN host, not the user's browser.
>
>
> If you're using a CDN, or a caching reverse proxy, it's best to have all
> relevant versions available there for as long as is necessary.
>
> We use a caching reverse proxy, which ensures that the 'right' file gets
> served 99.9% of the time.
>

Yes, those caches get an amazing hit rate, but how do you ensure that a user
that has the following iframe:


http://lfkq9vbe9u4sg98ip8rfvf00l7atcn3d.ig.ig.sandbox.gmodules.com/gadgets/ifr?url=http://www.google.com/ig/modules/fv.xml&libs=core:core.io:core.iglegacy


gets the correct JS content here:

http://www.sandbox.gmodules.com/gadgets/js/core:core.iglegacy:core.io.js?v=7565e07cd2ecc6d7e363f7e55e79fbc&container=ig&debug=0

when you're in the midst of upgrading?

Re: [round trip compatibility] Re: rpc.js wire compatibility

Posted by Kevin Brown <et...@google.com>.
On Thu, Jun 11, 2009 at 1:24 PM, Paul Lindner <li...@inuus.com> wrote:

> There's actually a larger problem here and it affects more than just the
> rpc
> calls.
> One problem area is upgrading a cluster of shindig machines to ensure that
> iframe content matches up with container js and forced-libs js content.
>
> If you upgrade the cluster in-place you'll end up with requests for
> javascript going to the old build which is then cached by the browser.
>  This
> is especially problematic because shindig always responds 'not-modified' to
> an IMS request when a v= param is present.


We send not-modified when we get an If-Modified-Since, not when there's a v
param present.


> If you have session affinity or other load balancer tricks you might not
> have this problem, however with a CDN in place the request for the JS
> content comes from the CDN host, not the user's browser.


If you're using a CDN, or a caching reverse proxy, it's best to have all
relevant versions available there for as long as is necessary.

We use a caching reverse proxy, which ensures that the 'right' file gets
served 99.9% of the time.


>
>
> The solution used at hi5 was to add a 'generation' param to all versioned
> URLs.  To do the rollout you then did:
>
>  Rolling upgrade of hosts from v to v+1
>  Bump generation number
>  Rolling restart of all hosts
>
> One possible solution is to compare the v= param the browser sends with the
> internal hash code as calculated by the server.  If they mismatch you can
> set a low expiration value.  (Although this still doesn't help with IMS
> requests.)
>
> The other idea is to have separate instances and change the parent path on
> a
> per-build basis for all gadgets/js requests.
>
>  /opensocial-v1/gadgets/js
>  /opensocial-v2/gadgets/js
>
> This has a nice side effect that you can deploy both versions side-by-side
> and drain down old for new.