Posted to adffaces-dev@incubator.apache.org by Arjuna Wijeyekoon <ar...@gmail.com> on 2006/05/09 18:09:47 UTC

caching of rowkeys in the table

Hi Devs,

In order to identify which row was updated or clicked on (by the user in the
browser) we need a string row identifier. RowKeys in the framework are
Objects (UIXCollection.getRowKey() returns Object). So we need a way to go
from Object to String and back.

In the current UIXCollection class we maintain a cache between RowKeys and
string tokens.
During the encode phase, any new RowKey (one that was not encountered before)
is assigned a new string token (which is just a counter). Then, during the
subsequent decode phase, each submitted string token is used to look up the
corresponding RowKey so that updates are certain to happen to the correct
row.

In order to prevent this cache from growing indefinitely, we clear the cache
at the start of each encode phase.

This approach has the following problems:

   - During a PPR (partial page rendering) request, some string tokens might
   still be "active" in the browser. However, we clear the token cache at the
   start of the encode phase, so on the next submit those old tokens could
   conflict and cause errors or updates to the wrong rows.
   - Increases the size of the component-state-saving tree.
   - Sometimes the same row is displayed by different components, and
   each component has its own cache which is wasteful.
   - The string token has no "meaning" so it makes things harder to debug
   on the client-side.
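
A minimal Java sketch of the token cache described above, to make the first
problem concrete. The class and method names (RowKeyTokenCache, tokenFor,
rowKeyFor) are hypothetical and this is not the actual UIXCollection code. It
also assumes the counter restarts whenever the cache is cleared, which the
description above does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the RowKey <-> token cache; not the real UIXCollection code.
final class RowKeyTokenCache {
    private final Map<Object, String> tokenByKey = new HashMap<>();
    private final Map<String, Object> keyByToken = new HashMap<>();
    private int counter = 0;

    // Called at the start of each encode phase so the cache cannot grow forever.
    // Assumption: the counter restarts too, so token values get reused.
    void clear() {
        tokenByKey.clear();
        keyByToken.clear();
        counter = 0;
    }

    // Encode phase: reuse the token for a known row key, or assign the next counter value.
    String tokenFor(Object rowKey) {
        String token = tokenByKey.get(rowKey);
        if (token == null) {
            token = Integer.toString(counter++);
            tokenByKey.put(rowKey, token);
            keyByToken.put(token, rowKey);
        }
        return token;
    }

    // Decode phase: map a submitted token back to its row key (null if unknown).
    Object rowKeyFor(String token) {
        return keyByToken.get(token);
    }
}
```

Under those assumptions, a token issued for one row before clear() resolves to
whichever row is assigned the same counter value afterwards, which is the
wrong-row update described in the first bullet.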

Possible solution
So if we get out of the business of caching the RowKey-string map, then we
don't have to worry about being consistent with the state on the
client-side. I think we should put the burden of producing a String rowkey
on the CollectionModel implementer. The CollectionModel can have two new
methods:
public String getRowKeyAsString(Object rowkey);
public Object getRowKeyFromString(String rowkey);

The default implementation of CollectionModel can use standard Java
serialization (followed by Base64 encoding) to go between Object and String
(in the event that the RowKey is not already a String).
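
As a rough illustration of such a default, here is a sketch using only the
standard library. The class name DefaultRowKeyCodec and the "s:" / "o:"
prefixes (used to tell plain String keys from serialized ones) are assumptions
made for this example and are not part of the proposal. java.util.Base64 is a
later JDK addition, used here for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

// Hypothetical default conversion; the prefixes are an assumption of this sketch.
final class DefaultRowKeyCodec {
    private static final String STRING_PREFIX = "s:";
    private static final String OBJECT_PREFIX = "o:";

    static String getRowKeyAsString(Object rowKey) {
        if (rowKey instanceof String) {
            return STRING_PREFIX + rowKey; // already a String: no serialization needed
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(rowKey); // rowKey must be Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // URL-safe alphabet so the result can be used in ids and request parameters.
        return OBJECT_PREFIX
            + Base64.getUrlEncoder().withoutPadding().encodeToString(bytes.toByteArray());
    }

    static Object getRowKeyFromString(String rowKey) {
        if (rowKey.startsWith(STRING_PREFIX)) {
            return rowKey.substring(STRING_PREFIX.length());
        }
        if (!rowKey.startsWith(OBJECT_PREFIX)) {
            throw new IllegalArgumentException("Unrecognized row key string: " + rowKey);
        }
        byte[] raw = Base64.getUrlDecoder().decode(rowKey.substring(OBJECT_PREFIX.length()));
        // Note: deserializing client-supplied bytes is unsafe without further validation.
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Even a small key such as an Integer becomes a string of dozens of characters
this way, since the serialized form carries class metadata.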

And the corresponding methods on UIXCollection:
getCurrencyString()
setCurrencyString(String)
will also change to
getRowKeyString()
setRowKeyString(String)

Optimization for TreeModel

TreeModel will typically create long RowKey Strings, e.g.:
/foo/bar
/foo/bar/baz
/foo/bar/boo

However, if the component renderer detects that the rowkey String of a child
starts with the rowkey String of its parent, then the renderer can write the
parent's rowkey string only once, and write only the discriminating part of
each child key, e.g.:
/foo/bar
./baz
./boo
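
The prefix trick can be sketched in Java roughly as follows. The class and
method names are hypothetical, and the sketch assumes that full keys always
start with "/" so that a written key beginning with "./" is unambiguous. It
also matches on the parent key plus "/" rather than a bare startsWith, so that
/foo/barn is not treated as a child of /foo/bar.

```java
// Hypothetical helper for writing child row keys relative to a parent key.
final class TreeRowKeyPrefix {
    // Marker meaning "same prefix as the parent key", as in the "./baz" example.
    private static final String RELATIVE = ".";

    // Renderer side: shorten the child key if it lives under the parent key.
    static String relativize(String parentKey, String childKey) {
        if (childKey.startsWith(parentKey + "/")) {
            return RELATIVE + childKey.substring(parentKey.length());
        }
        return childKey; // not under the parent: write the full key
    }

    // Decode side: rebuild the full key from the parent key and the written key.
    static String resolve(String parentKey, String writtenKey) {
        if (writtenKey.startsWith(RELATIVE + "/")) {
            return parentKey + writtenKey.substring(RELATIVE.length());
        }
        return writtenKey; // was written in full
    }
}
```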


thoughts? Can I start on this implementation?
--arjuna

Re: caching of rowkeys in the table

Posted by John Fallows <jo...@gmail.com>.
On 5/13/06, Adam Winer <aw...@gmail.com> wrote:
>
> On 5/13/06, John Fallows <jo...@gmail.com> wrote:
> > Do we have any performance analysis of the impact of these long keys, or do
> > we just intuitively think it's a big deal?  For example, what happens when
> > GZIP compression is used to deliver both the initial page and partial
> > responses?  We should have a real justification before making things too
> > complex.
>
> I know from past measurements that a table containing long
> keys can have overall HTML size dominated by model keys.
> (Gzip'ing is just too buggy in Internet Explorer to rely on.)


I remember an issue where DHTML Behaviors in IE were not decompressing their
GZIP'd JavaScript payload, but then more recently discovered that IE gets
upset about DHTML Behaviors being served without the text/x-component
content type anyway.

Are you referring to some other bugs?

> Also, I'd like support for non-String keys.  Serialization
> is dog-slow, and should be minimized as much as possible - and
> serializing + Base64'ing lots of separate objects into Strings is
> really brutal performance-wise.


How about using a Converter (or similar)?
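
For illustration only, a converter-style hook might look like the sketch
below. RowKeyConverter and LongRowKeyConverter are hypothetical names; this
does not use the JSF Converter API, it only borrows the idea of a pluggable
Object-to-String conversion so that common key types can skip serialization.

```java
// Hypothetical pluggable conversion between a typed row key and its String form.
interface RowKeyConverter<K> {
    String asString(K rowKey);
    K fromString(String value);
}

// Example for numeric keys: base 36 keeps the rendered key short.
final class LongRowKeyConverter implements RowKeyConverter<Long> {
    @Override
    public String asString(Long rowKey) {
        return Long.toString(rowKey, Character.MAX_RADIX);
    }

    @Override
    public Long fromString(String value) {
        return Long.valueOf(value, Character.MAX_RADIX);
    }
}
```

A model with primary-key style row keys could then emit a handful of
characters per row rather than a serialized object.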

> > We haven't yet addressed Arjuna's assertion that the current design doesn't
> > seem to cover all the usecases, specifically regarding the cache stability
> > over multiple postbacks, when previously observed keys have been cached at
> > the client.
> >
> > I'm -1 on a special flag with pseudo-strict behavior in certain cases.
> >
> > Suppose we took the simplest design, where CollectionModel returns Object,
> > doesn't need to worry about Strings at all, and a default conversion is
> > applied to encode and decode these keys as (probably long) Strings during
> > RenderResponse and ApplyRequestValues.
>
> If "encoding" and "decoding" is serializing, that's slooooow and
> generates very long keys.


Didn't mean to imply any particular implementation.

> > Can someone enumerate the actual issues with that approach that we are
> > trying to address with the current design?
>
> By "current design", I guess you mean the token cache?
>
> 1. HTML size


I'm curious to learn why GZIP might not be sufficient.  What do Web caches
use for IE?

> 2. Performant support for non-Object keys


Did you mean "non-String" keys?

> 3. Improved support for client-side Javascript (predictable
>    row indices).  (This is a relatively minor benefit.)


The web-tier instance of the CollectionModel is per-session.  Could the
CollectionModel be made smart enough to deliver stable row indices across
requests?  (thus making the key serialization discussion moot)

tc,
-john.




-- 
http://apress.com/book/bookDisplay.html?bID=10044
Author: Pro JSF and Ajax: Building Rich Internet Components, Apress

Re: caching of rowkeys in the table

Posted by Adam Winer <aw...@gmail.com>.
On 5/13/06, John Fallows <jo...@gmail.com> wrote:
> Do we have any performance analysis of the impact of these long keys, or do
> we just intuitively think it's a big deal?  For example, what happens when
> GZIP compression is used to deliver both the initial page and partial
> responses?  We should have a real justification before making things too
> complex.

I know from past measurements that a table containing long
keys can have overall HTML size dominated by model keys.
(Gzip'ing is just too buggy in Internet Explorer to rely on.)

Also, I'd like support for non-String keys.  Serialization
is dog-slow, and should be minimized as much as possible - and
serializing + Base64'ing lots of separate objects into Strings is
really brutal performance-wise.

> We haven't yet addressed Arjuna's assertion that the current design doesn't
> seem to cover all the usecases, specifically regarding the cache stability
> over multiple postbacks, when previously observed keys have been cached at
> the client.
>
> I'm -1 on a special flag with pseudo-strict behavior in certain cases.
>
> Suppose we took the simplest design, where CollectionModel returns Object,
> doesn't need to worry about Strings at all, and a default conversion is
> applied to encode and decode these keys as (probably long) Strings during
> RenderResponse and ApplyRequestValues.

If "encoding" and "decoding" is serializing, that's slooooow and
generates very long keys.

> Can someone enumerate the actual issues with that approach that we are
> trying to address with the current design?

By "current design", I guess you mean the token cache?

1. HTML size
2. Performant support for non-Object keys
3. Improved support for client-side Javascript (predictable
   row indices).  (This is a relatively minor benefit.)

I'm reluctant to discard these.

-- Adam



Re: caching of rowkeys in the table

Posted by John Fallows <jo...@gmail.com>.
Do we have any performance analysis of the impact of these long keys, or do
we just intuitively think it's a big deal?  For example, what happens when
GZIP compression is used to deliver both the initial page and partial
responses?  We should have a real justification before making things too
complex.

We haven't yet addressed Arjuna's assertion that the current design doesn't
seem to cover all the usecases, specifically regarding the cache stability
over multiple postbacks, when previously observed keys have been cached at
the client.

I'm -1 on a special flag with pseudo-strict behavior in certain cases.

Suppose we took the simplest design, where CollectionModel returns Object,
doesn't need to worry about Strings at all, and a default conversion is
applied to encode and decode these keys as (probably long) Strings during
RenderResponse and ApplyRequestValues.

Can someone enumerate the actual issues with that approach that we are
trying to address with the current design?

tc,
-john.





Re: caching of rowkeys in the table

Posted by Adam Winer <aw...@gmail.com>.
Arjuna,

+1 for supporting collection models that can return efficient strings.

But I'm currently -1 on adding new methods to CollectionModel,
and -1 on removing the key cache altogether.  It's really handy
to support collection models that return potentially long keys,
especially once you start stamping out a lot of rows, since you only
store the really long key once.

Any thoughts about adding a flag to turn the key cache on and
off?  And when the key cache is off, the model had better return
Strings or else...?
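
One way to picture the flag, as a sketch only: the class name
SwitchableRowKeyEncoder is hypothetical, and throwing an exception is just one
guess at what "or else" could mean when the cache is off and the model returns
a non-String key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical encoder with a switch for the key cache.
final class SwitchableRowKeyEncoder {
    private final boolean keyCacheEnabled;
    private final Map<Object, String> tokenByKey = new HashMap<>();
    private int counter = 0;

    SwitchableRowKeyEncoder(boolean keyCacheEnabled) {
        this.keyCacheEnabled = keyCacheEnabled;
    }

    String encode(Object rowKey) {
        if (keyCacheEnabled) {
            // Cache on: a long key is stored once and replaced by a short counter token.
            return tokenByKey.computeIfAbsent(rowKey, k -> Integer.toString(counter++));
        }
        if (rowKey instanceof String) {
            return (String) rowKey; // cache off: the model's own String is written as-is
        }
        // Assumption: a non-String key with the cache off is treated as an error.
        throw new IllegalStateException(
            "Key cache is off, so row keys must be Strings: " + rowKey);
    }
}
```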

-- Adam
