Posted to dev@wicket.apache.org by Igor Vaynberg <ig...@gmail.com> on 2007/01/30 08:14:32 UTC

optimizing serialized state: model serialization codecs idea

another idea to optimize serialization state from jon and i

allow an easy way to override model serialization

a simple example:

class EntityModel extends LoadableDetachableModel {
   private long id;
   //standard junk
}

now when this is serialized you get a bunch of junk like the class header,
etc, but all you really care about is that single long id field. so what if
you have

interface ModelSerializationCodec {
   boolean supportsModel(Class<? extends IModel> c);
   void writeModel(ObjectOutputStream s, IModel model) throws IOException;
   IModel readModel(ObjectInputStream s) throws IOException;
}

class ModelSerializationCodecRegistry {
   private ModelSerializationCodec[] codecs=new ModelSerializationCodec[255];
   void registerCodec(ModelSerializationCodec codec) {...}
   ModelSerializationCodec codecForId(byte id) {...}
   byte codecIdForClass(Class<? extends IModel> c) {...}
}

Component.writeObject(ObjectOutputStream oos) {
// or instead of overriding component.writeobject we can put model
// into a model holder object that has this logic
// or even wrap the model in the model holder conditionally when the model
// is set only if there is a codec

// i suppose this can also be in a special outputstream we use so it doesnt
// have to be in the component
// we already have a few of these
  ...
// write out model
byte codecId=registry.codecIdForClass(model.getClass());
oos.writeByte(0);

if (codecId!=0) {
   registry.codecForId(codecId).writeModel(oos, model);
}
}

class EntityModelCodec implements ModelSerializationCodec {
   boolean supportsModel(Class<? extends IModel> c) {
      return EntityModel.class.equals(c);
   }
   void writeModel(ObjectOutputStream s, IModel model) throws IOException {
      s.writeLong(((EntityModel)model).id);
   }
   IModel readModel(ObjectInputStream ois) throws IOException {
      EntityModel em=new EntityModel();
      em.id=ois.readLong();
      return em;
   }
}

so now instead of serializing the entire model instance you can just write
out the fields you need, and because this ability is outside the model you
can write out simple primitives. the downside is that for any model that
doesn't have a codec the size of the output takes an extra hit - which is the
codec id byte. the question is how many bytes are in the extra junk like the
class header - this ratio will determine if this is generally worth doing.
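
One way to get a feel for that ratio is to measure it. The following is only a rough sketch: EntityRef is a hypothetical stand-in for EntityModel (a serializable class holding one long), and codec id 1 is an arbitrary choice. It compares the stream size of default serialization with the size of a stream that carries just a codec id byte plus the long.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SerializedSizeDemo {
    // Hypothetical stand-in for EntityModel: one long field and nothing else.
    static class EntityRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        EntityRef(long id) { this.id = id; }
    }

    // Bytes produced when the object goes through default serialization
    // (stream header, class descriptor, field descriptor, field data).
    static int defaultSize(EntityRef ref) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(ref);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Bytes produced when only a codec id byte and the long are written
    // (stream header plus one small block of primitive data).
    static int codecSize(EntityRef ref) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeByte(1);       // assumed codec id
            oos.writeLong(ref.id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        EntityRef ref = new EntityRef(42L);
        System.out.println("default: " + defaultSize(ref) + " bytes");
        System.out.println("codec:   " + codecSize(ref) + " bytes");
    }
}
```

Both numbers include the fixed ObjectOutputStream header, so inside a page with many models the relative saving per model would be larger than this single-object measurement suggests.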

the only trick here is to keep the codecid byte consistent across the
cluster. we can probably use initializers to do that for components that
come in jars, just add a registerModelCodecs to the initializer and also
allow the same from application.init()

does this make sense? what do you guys think?

-igor

Re: optimizing serialized state: model serialization codecs idea

Posted by Eelco Hillenius <ee...@gmail.com>.

On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
> thats kinda like Externalizable

My thought as well. Small changes in models and components can have a
pretty big impact on memory consumption overall, so investigating
optimization is definitely worth it. However, past optimizations we
built have taught us that they grow the API, making things harder to
understand and maintain. If the gain is big enough, I'm ok with it;
if not, I'd prefer us to stay with the mechanisms like Externalizable
that Java provides out of the box.
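
For comparison, here is a minimal sketch of what the Externalizable route looks like. EntityRef is again a hypothetical stand-in for a model with one long field, not a Wicket class. With Externalizable the class controls its own field data, but the stream still carries the class descriptor (class name and serialVersionUID), which is the part the codec idea tries to avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableDemo {
    // Hypothetical model stand-in. Externalizable requires a public class
    // with a public no-arg constructor.
    public static class EntityRef implements Externalizable {
        private long id;

        public EntityRef() {}
        public EntityRef(long id) { this.id = id; }
        public long getId() { return id; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeLong(id); // only the field data is written by the class
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readLong();
        }
    }

    // Serializes through the standard ObjectOutputStream machinery.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back; the class name in the stream selects the class.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```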

Eelco

Re: optimizing serialized state: model serialization codecs idea

Posted by Igor Vaynberg <ig...@gmail.com>.

i dont think we can do that automatically, but manually you can already do
it via IPageMapEntry Page.getPageMapEntry()

-igor


On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
>
> >
> > > Can't we go one step further? :)
> > > Don't create the whole new tree but adjust the current one... that
> would
> > > be
> > > cool.
> >
> >
> > you mean what we discussed a long long time ago about automatic
> > versioning?
> > compare two trees and build a property map that is the delta?
>
>
>
> i think delta is a bit tricky (what happens if we want from to go to
> version
> X instead of Z).
> And i don't know how big the property map would be so maybe this is
> nothing
> But if we could extract only the state from the page
> and then if we need page version we could do this
>
> page.load(xxxx)
>
> or if we didn't have the page anymore
>
> new Page().load(xxx)
>
> johan
>
>

Re: optimizing serialized state: model serialization codecs idea

Posted by Johan Compagner <jc...@gmail.com>.

>
> > Can't we go one step further? :)
> > Don't create the whole new tree but adjust the current one... that would
> > be
> > cool.
>
>
> you mean what we discussed a long long time ago about automatic
> versioning?
> compare two trees and build a property map that is the delta?



i think delta is a bit tricky (what happens if we want to go to version
X instead of Z).
And i don't know how big the property map would be so maybe this is nothing.
But if we could extract only the state from the page
and then if we need page version we could do this

page.load(xxxx)

or if we didn't have the page anymore

new Page().load(xxx)

johan

Re: optimizing serialized state: model serialization codecs idea

Posted by Igor Vaynberg <ig...@gmail.com>.

yep

-igor


On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
>
> ohh it should then be this:
>
> byte codecId=registry.codecIdForClass(model.getClass());
> oos.writeByte(codecId);
>
> and that byte can be 0
>
> thats why i was confused.
>
> johan
>
>
> On 1/30/07, Igor Vaynberg <ig...@gmail.com> wrote:
> >
> > On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
> > >
> > > thats kinda like Externalizable We could try it and see what it gains
> us
> > > (speed and size)
> >
> >
> > yeah pretty close, but we dont even want the id written out, who knows
> how
> > big that is? the whole idea is that some models can be replaced with
> just
> > a
> > few primitives. and especially with our file store we win size, but we
> > also
> > win speed because we have to read/write less.
> >
> > > Why do we need to write that 0 byte?
> >
> >
> > the 0 byte tells the system that no codec has been used, so it should
> just
> > load the model instance using default serialization.
> >
> >
> > > Can't we go one step further? :)
> > > Don't create the whole new tree but adjust the current one... that
> would
> > > be
> > > cool.
> >
> >
> > you mean what we discussed a long long time ago about automatic
> > versioning?
> > compare two trees and build a property map that is the delta?
> >
> > -igor
> >

Re: optimizing serialized state: model serialization codecs idea

Posted by Johan Compagner <jc...@gmail.com>.

ohh it should then be this:

byte codecId=registry.codecIdForClass(model.getClass());
oos.writeByte(codecId);

and that byte can be 0

thats why i was confused.

johan


On 1/30/07, Igor Vaynberg <ig...@gmail.com> wrote:
>
> On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
> >
> > thats kinda like Externalizable We could try it and see what it gains us
> > (speed and size)
>
>
> yeah pretty close, but we dont even want the id written out, who knows how
> big that is? the whole idea is that some models can be replaced with just
> a
> few primitives. and especially with our file store we win size, but we
> also
> win speed because we have to read/write less.
>
> > Why do we need to write that 0 byte?
>
>
> the 0 byte tells the system that no codec has been used, so it should just
> load the model instance using default serialization.
>
>
> > Can't we go one step further? :)
> > Don't create the whole new tree but adjust the current one... that would
> > be
> > cool.
>
>
> you mean what we discussed a long long time ago about automatic
> versioning?
> compare two trees and build a property map that is the delta?
>
> -igor
>

Re: optimizing serialized state: model serialization codecs idea

Posted by Igor Vaynberg <ig...@gmail.com>.

On 1/30/07, Johan Compagner <jc...@gmail.com> wrote:
>
> thats kinda like Externalizable We could try it and see what it gains us
> (speed and size)


yeah pretty close, but we dont even want the id written out, who knows how
big that is? the whole idea is that some models can be replaced with just a
few primitives. and especially with our file store we win size, but we also
win speed because we have to read/write less.

> Why do we need to write that 0 byte?


the 0 byte tells the system that no codec has been used, so it should just
load the model instance using default serialization.
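
A rough sketch of that dispatch, covering both the write and the read side. Everything here is illustrative: Codec, EntityRef and the CODECS list are hypothetical names, not Wicket API, and the registry is reduced to a list whose slot 0 is reserved for "no codec".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class CodecDispatchDemo {
    // Hypothetical codec contract, modeled on the interface proposed in this thread.
    interface Codec {
        boolean supports(Class<?> type);
        void write(ObjectOutputStream out, Object value) throws IOException;
        Object read(ObjectInputStream in) throws IOException;
    }

    // Hypothetical stand-in for EntityModel.
    static class EntityRef implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        EntityRef(long id) { this.id = id; }
    }

    static class EntityRefCodec implements Codec {
        public boolean supports(Class<?> type) { return EntityRef.class.equals(type); }
        public void write(ObjectOutputStream out, Object value) throws IOException {
            out.writeLong(((EntityRef) value).id);
        }
        public Object read(ObjectInputStream in) throws IOException {
            return new EntityRef(in.readLong());
        }
    }

    static final List<Codec> CODECS = new ArrayList<>();
    static {
        CODECS.add(null);                 // id 0 is reserved: no codec
        CODECS.add(new EntityRefCodec()); // id 1
    }

    // Returns the id of the first codec that supports the type, or 0 if none does.
    static int codecIdFor(Class<?> type) {
        for (int id = 1; id < CODECS.size(); id++) {
            if (CODECS.get(id).supports(type)) return id;
        }
        return 0;
    }

    static byte[] write(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            int id = codecIdFor(value.getClass());
            out.writeByte(id);                      // always write the real id
            if (id == 0) out.writeObject(value);    // no codec: default serialization
            else CODECS.get(id).write(out, value);  // codec: primitives only
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static Object read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int id = in.readUnsignedByte();
            return id == 0 ? in.readObject() : CODECS.get(id).read(in);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The reader only works if both sides agree on which codec sits behind which id, which is the cluster-consistency concern raised in the original post.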


> Can't we go one step further? :)
> Don't create the whole new tree but adjust the current one... that would
> be
> cool.


you mean what we discussed a long long time ago about automatic versioning?
compare two trees and build a property map that is the delta?

-igor


Re: optimizing serialized state: model serialization codecs idea

Posted by Johan Compagner <jc...@gmail.com>.

thats kinda like Externalizable. We could try it and see what it gains us
(speed and size).
Why do we need to write that 0 byte?



Can't we go one step further? :)
Don't create the whole new tree but adjust the current one... that would be
cool.
(and fast) But that would mean quite a lot of hacking if it is possible..





Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

On 2/2/07, Johan Compagner <jc...@gmail.com> wrote:
> please read the thread.
> There is no value in it because they both point to each other somewhere in
> the code
> We CAN'T separate it because we can't kill all the lines to each other.

Well, only if you think the parenting thing is unsolvable. Are you
convinced it isn't?

> ofcourse what we could do is really kill all version info we keep and just
> up the version number
> but then we need to save the page and one back button press will always have
> to read/serialize the page back in.

Yeah, that sounds like a good second idea. Not optimal but probably good enough.

Eelco

Re: [OT] pagemap entries and versioning

Posted by Johan Compagner <jc...@gmail.com>.

please read the thread.
There is no value in it because they both point to each other somewhere in
the code
We CAN'T separate it because we can't kill all the lines to each other.

of course what we could do is really kill all version info we keep and just
up the version number,
but then we need to save the page and one back button press will always have
to read/deserialize the page back in.

johan

On 2/3/07, Eelco Hillenius <ee...@gmail.com> wrote:
>
> > so i don't see any real value of separating the changes from a page.
> Because
> > what would you do with that?
>
> You must be kidding! Do you have any idea how big pages can get when
> you do lots of component replacement? Furthermore, we now have two
> very different ways of managing how many page instances and versions
> there are, while this should be the same really (as they both relate
> to the browser history).
>
> Eelco
>

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

> so i don't see any real value of separating the changes from a page. Because
> what would you do with that?

You must be kidding! Do you have any idea how big pages can get when
you do lots of component replacement? Furthermore, we now have two
very different ways of managing how many page instances and versions
there are, while this should be the same really (as they both relate
to the browser history).

Eelco

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

On 2/2/07, Eelco Hillenius <ee...@gmail.com> wrote:
> On 2/2/07, Johan Compagner <jc...@gmail.com> wrote:
> > ofcourse in the end everything is the page
> > But what to do for example with the component that is in both places?
> > (the component that is removed from the parent but still kept in the page
> > for later attachement)
>
> The first pass would serialize the whole page and index it so that we
> can easily find it back later. If we do custom serialization, we could
> take some heuristics into account like the component path/ ids and
> such.

So, this would work the opposite of what we're doing now. Rather than
rolling back, we would start with the first version of the page (which
is fully serialized) and then apply the changes up to the version we
need. This may or may not be more expensive depending on how many
versions there would be and what version is rolled back to, but in
terms of serializing and saving state (including disk or network
access) it would for certain be cheaper.
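
A toy sketch of that forward-replay idea, under heavy assumptions: page state is reduced to a map of component path to value, and each change is a function applied to that map. None of these names exist in Wicket; the point is only that the base version is stored once and version N is rebuilt by replaying the first N changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class ForwardVersionsDemo {
    // The first version of the page, stored in full (here: a simple map).
    private final Map<String, String> base;
    // Changes recorded after the first version, in the order they happened.
    private final List<Consumer<Map<String, String>>> changes = new ArrayList<>();

    ForwardVersionsDemo(Map<String, String> firstVersion) {
        this.base = new HashMap<>(firstVersion);
    }

    // Records one change; only the change is stored, not a new full copy.
    void record(Consumer<Map<String, String>> change) {
        changes.add(change);
    }

    // Rebuilds version n by copying the base and replaying the first n changes.
    Map<String, String> version(int n) {
        if (n < 0 || n > changes.size()) {
            throw new IllegalArgumentException("unknown version: " + n);
        }
        Map<String, String> state = new HashMap<>(base);
        for (int i = 0; i < n; i++) {
            changes.get(i).accept(state);
        }
        return state;
    }
}
```

Restoring a late version costs more replay work than restoring an early one, which matches the trade-off described above.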

Eelco

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

On 2/2/07, Johan Compagner <jc...@gmail.com> wrote:
> ofcourse in the end everything is the page
> But what to do for example with the component that is in both places?
> (the component that is removed from the parent but still kept in the page
> for later attachement)

The first pass would serialize the whole page and index it so that we
can easily find it back later. If we do custom serialization, we could
take some heuristics into account like the component path/ ids and
such.

> would be very hard to decide where it belongs and at what place to cut it
> loose.

Yeah, I'm not saying it would be easy. But if we could pull it off the
gain would be pretty big.

Eelco

Re: [OT] pagemap entries and versioning

Posted by Johan Compagner <jc...@gmail.com>.

of course in the end everything is the page.
But what to do for example with the component that is in both places?
(the component that is removed from the parent but still kept in the page
for later attachment)

would be very hard to decide where it belongs and at what place to cut it
loose.

johan


On 2/3/07, Eelco Hillenius <ee...@gmail.com> wrote:
>
> On 2/2/07, Johan Compagner <jc...@gmail.com> wrote:
> > No really forget it to really separate the undo changes from the page.
> > This is really impossible to guarantee. Anon classes, models pointing to
> the
> > page or component
> > parents that keep the child as a reference so that they can do attach()
> (to
> > parent) again.
> > there are sooo many things that connect the 2 things
>
> Well, ultimately you would have a finite set of possible parents, with
> the ultimate parent being the page (or else you would have a problem,
> in which case it would actually be an advantage if we could detect
> that). I'm just wondering whether if we take the serialization in our
> own hands, we can keep track of the parents and resolve them from the
> other versions up to the serialized page when we need it.
>
> Eelco
>

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

On 2/2/07, Johan Compagner <jc...@gmail.com> wrote:
> No really forget it to really separate the undo changes from the page.
> This is really impossible to guarantee. Anon classes, models pointing to the
> page or component
> parents that keep the child as a reference so that they can do attach() (to
> parent) again.
> there are sooo many things that connect the 2 things

Well, ultimately you would have a finite set of possible parents, with
the ultimate parent being the page (or else you would have a problem,
in which case it would actually be an advantage if we could detect
that). I'm just wondering whether if we take the serialization in our
own hands, we can keep track of the parents and resolve them from the
other versions up to the serialized page when we need it.

Eelco

Re: [OT] pagemap entries and versioning

Posted by Johan Compagner <jc...@gmail.com>.

No, really: forget about truly separating the undo changes from the page.
This is really impossible to guarantee. Anon classes, models pointing to the
page or component
parents that keep the child as a reference so that they can do attach() (to
parent) again.
there are sooo many things that connect the 2 things

johan

On 2/3/07, Eelco Hillenius <ee...@gmail.com> wrote:
>
> > But with the second level cache if
> > we implement it right
> > we only really have to keep 1 page version (maybe not even that but that
> is
> > what i would do)
>
> That still wouldn't be as efficient, but ok, that would probably be good
> enough.
>
> Also: there has been talks about doing custom serialization. If we
> would have the serialization bit in our own hands, we could take care
> of the parenting couldn't we (though I'm not saying it would be easy)?
>
> Eelco
>

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

> But with the second level cache if
> we implement it right
> we only really have to keep 1 page version (maybe not even that but that is
> what i would do)

That still wouldn't be as efficient, but ok, that would probably be good enough.

Also: there have been talks about doing custom serialization. If we
had the serialization bit in our own hands, we could take care
of the parenting, couldn't we (though I'm not saying it would be easy)?

Eelco

Re: [OT] pagemap entries and versioning

Posted by Eelco Hillenius <ee...@gmail.com>.

> yeah, i agree that it's best to focus on making our serialization
> more efficient as a first priority.  then see if there's anything else
> worth doing.  i'm not 100% convinced that it's actually impossible to
> separate
> out the versioned components because we could record some kind of virtual
> object pointer to substitute for the reference that could be fixed
> up later (basically implement a very custom serialization mechanism that
> allows object references to span files), but at that point, i think we'd be
> writing our own highly wicket-specific serialization code from scratch and
> it
> seems like just doing efficient serialization with something like xstream
> as a starting point might be good enough to really improve things
> dramatically.

Agreed, let's start there.

Eelco

Re: [OT] pagemap entries and versioning

Posted by Jonathan Locke <jo...@gmail.com>.


yeah, i agree that it's best to focus on making our serialization
more efficient as a first priority.  then see if there's anything else
worth doing.  i'm not 100% convinced that it's actually impossible to
separate out the versioned components because we could record some kind of
virtual object pointer to substitute for the reference that could be fixed
up later (basically implement a very custom serialization mechanism that
allows object references to span files), but at that point, i think we'd be
writing our own highly wicket-specific serialization code from scratch and
it seems like just doing efficient serialization with something like xstream
as a starting point might be good enough to really improve things
dramatically.

obviously we should be able to make a binary serialization of wicket
components radically more efficient than the default java version,
particularly in terms of the size of things.  one off-the-cuff idea of how
to do this is to maintain a complete map of the classes and fields that the
serializer serializes separately.  then, using that map, we should be able
to get rid of object headers and if we're willing to say we don't care about
serialization version compatibility issues (an incompatible class or field
change would cause the system to drop all your pagemaps) we could even drop
field names and types and just get down to the raw data.  and i'd bet
there's actually not so much of that.  probably a tiny fraction of the size.
since this doesn't affect clustering, the only issue would be keeping a
persistent version of this class/field map so that restarts would only dump
your backbutton data if it really needs to be dumped.

i'm actually very curious how xstream works, assuming it's really compatible
with java serialization.  i mean, how do they create objects to initialize
with field data without actually constructing them with new?  i always
figured that was unique to the built-in serialization.  maybe xstream is not
completely compatible that way?


Johan Compagner wrote:
> 
> On 2/3/07, Jonathan Locke <jo...@gmail.com> wrote:
>>
>>
>>
>> of course.  right you are.  you could fix parenting.  but anon classes
>> that
>> reference the
> 
> 
> this would be a problem in 2.0!
> components need to always keep a reference to the parent
> Else the developer can't say reattach() to the component. Because then the
> component
> can't add itself again to the parent. So really getting rid of it is hard.
> 
> It is not just anon classes i can do this:
> 
> MyTabPanel
> 
> Component child1 = MyChild1(MyTabPanel.this,"child");
> Component child2= MyChild2(MyTabPanel.this,"child");
> 
> and keep the child as a reference
> then later on the other tab must be shown
> 
> child1.attach();
> 
> This will cause the child2 to be removed (and in the undo map)
> but the page still has the reference to the child itself (so it can say
> attach() ) again
> 
> So separating changes is just not possible.
> 
> 
>> page... that would be a killer for this whole idea.  shucks, it was too good
>> to be true...
>>
>> so you are suggesting getting rid of versioning entirely when the second
>> level cache
>> is running and just save the whole page each time? (sounded like it)
> 
> 
> We already do that now.
> We save  all page versions to disk: pageid:pageversion
> 
> so my plan was to only have 1 extra version (besides the page itself)
> in the undo manager so that the page in the session is pretty light.
> And one backbutton is very quick (no disk read) and that is the most used
> behavior i guess.
> So it looks like the best trade off
> 
> We should really focus on getting the serializable as quick as possible
> and
> the resulting size as small as possible
> 
> johan
> 
> 
> Johan Compagner wrote:
>> >
>> > just one thing.
>> > It is not possible to really seperate changes from its page.
>> > Because changes (mostly components) always have there parent (so you
>> can
>> > re
>> > attach them)
>> >
>> > And we have no idea what the components or models also by itself have
>> > references to the page. (anon classes)
>> >
>> > so i don't see any real value of separating the changes from a page.
>> > Because
>> > what would you do with that?
>> > save only changes? But then you will save the page anyway.
>> > The page itself will be smaller ofcourse. But with the second level
>> cache
>> > if
>> > we implement it right
>> > we only really have to keep 1 page version (maybe not even that but
>> that
>> > is
>> > what i would do)
>> >
>> > johan
>> >
>> >
>> > On 2/2/07, Jonathan Locke <jo...@gmail.com> wrote:
>> >>
>> >>
>> >>
>> >> before johan complains, i just realized there's a flaw in my little
>> >> plan.  you still have to undo changes to pages that are not
>> reconstructed
>> >> by custom IPageMapEntry implementations because the page is in the
>> >> state of the highest ordinal in the page map because that was the last
>> >> one accessed.  even so, a little more logic here should correct for
>> that.
>> >>
>> >>
>> >> Jonathan Locke wrote:
>> >> >
>> >> >
>> >> > A part of this whole discussion of serializing page map entries is
>> >> > also the current open bug that Eelco submitted that we should make
>> >> > page versions separate from pages.  This came up over at Diva
>> espresso
>> >> > a few minutes ago when Eelco and I were chatting and I had an
>> >> > interesting and very elegant little idea that could sort this out
>> >> > quite nicely.  Not only would it definitely make things more
>> efficient,
>> >> > it would also be a more elegant solution that fixes an unreported
>> bug.
>> >> >
>> >> > first, i think that IPageMapEntry.getNumericId is really more like
>> >> > getOrdinal since page ids will always increment in a page map.
>> >> > but regardless, to implement a page version based on another page
>> >> > in the page map (which might also be a page version), we can use a
>> >> > simple little container implementing IPageMapEntry to hold the
>> >> > base pagemap entry id and the changes to apply to that page.
>> >> > this container might be just an anonymous class, but let's give it
>> >> > a name here for clarity:
>> >> >
>> >> > class PageVersion implements IPageMapEntry
>> >> > {
>> >> >    // Identifier of page map entry to apply these changes to
>> >> >    // If we're using ordinals, we don't need this field at all
>> >> >    // because the page this version is based on will be our own
>> >> >    // ordinal - 1.
>> >> >    int basePageMapEntryNumericId;
>> >> >
>> >> >    // Don't recall the exact base class name for change entries
>> >> >    // in the versioning code, but this is the list of changes to
>> apply
>> >> >    List<Change> changes;
>> >> >
>> >> >    Page getPage()
>> >> >    {
>> >> >       // Get previous page (possibly recursing)
>> >> >       final Page page = pageMap.get(basePageMapEntryNumericId);
>> >> >
>> >> >       // Apply changes to page
>> >> >       page.applyChanges(changes);
>> >> >       return page;
>> >> >    }
>> >> > }
>> >> >
>> >> > this should work nicely and actually this fixes an existing bug
>> because
>> >> > right
>> >> > now if you provide a custom implementation of IPageMapEntry to
>> >> reconstruct
>> >> > a page, that page is probably not versionable.  in this case it
>> would
>> >> be
>> >> > because
>> >> > the recursion would bottom out and reconstruct that page.
>> >> >
>> >> >
>> >> > igor.vaynberg wrote:
>> >> >>
>> >> >> another idea to optimize serialization state from jon and i
>> >> >>
>> >> >> allow an easy way to override model serialization
>> >> >>
>> >> >> a simple example:
>> >> >>
>> >> >> class EntityModel extends LoadableDetachableModel {
>> >> >>    private long id;
>> >> >>    //standard junk
>> >> >> }
>> >> >>
>> >> >> now when this is serialized you get a bunch of junk like the class
>> >> >> header,
>> >> >> etc, but all you really care about is that single long id field. so
>> >> what
>> >> >> if
>> >> >> you have
>> >> >>
>> >> >> interface ModelSerializationCodec {
>> >> >> boolean supportsModel(Class<? extends IModel>);
>> >> >> writeModel(ObjectOutputStream s, IModel model);
>> >> >> IModel readModel(ObjectInputStream);
>> >> >> }
>> >> >>
>> >> >> class ModelSerializationCodecRegistry {
>> >> >>    private ModelSerializationCodec[]=new
>> ModelSerializationCodec[255];
>> >> >>    void registerCodec(ModelSerializationCodec codec) {...}
>> >> >>    void codecForId(byte id) {...}
>> >> >>    int codecIdForClass(Class<? extends IModel>){...}
>> >> >> }
>> >> >>
>> >> >> Component.writeObject(ObjectOutputStream oos) {
>> >> >> // or instead of overriding component.writeobject we can put model
>> >> >> // into a model holder object that has this logic
>> >> >> // or even wrap the model in the model holder conditionally when
>> the
>> >> >> model
>> >> >> is set only if there is a codec
>> >> >>
>> >> >> // i suppose this can also be in a special outputstream we use so
>> it
>> >> >> doesnt
>> >> >> have to be in the component
>> >> >> // we already have a few of these
>> >> >>   ...
>> >> >> // write out model
>> >> >> byte codecId=registry.codecIdForClass(model.getClass());
>> >> >> oos.writebyte(0);
>> >> >>
>> >> >> if (codecId!=0) {
>> >> >>    registry.codecForId(codecId).writeModel(oos, model);
>> >> >> }
>> >> >> }
>> >> >>
>> >> >> class EntityModelCodec implements ModelSerializationCodec {
>> >> >> boolean supportsModel(Class<? extends IModel> c) { return
>> >> >> EntityModel.class.equals(c); }
>> >> >> writeModel(ObjectOutputStream s, IModel model) {
>> >> s.writeLong(model.id);
>> >> }
>> >> >> IModel readModel(ObjectInputStream ois) { EntityModel em=new
>> >> >> EntityModel();
>> >> >> em.id=ois.readLong(); }}
>> >> >>
>> >> >> so now instead of serializing the entire model instance you can
>> just
>> >> >> write
>> >> >> out the fields you need and because this ability is outside the
>> model
>> >> you
>> >> >> can write out simple primitives. the downside is that for any model
>> >> that
>> >> >> doesnt have a codec the size of output takes an extra hit - which
>> is
>> >> the
>> >> >> codec id byte. the question is how many bytes are in the extra junk
>> >> like
>> >> >> class header - this ratio will determine if this is generally worth
>> >> >> doing.
>> >> >>
>> >> >> the only trick here is to keep the codecid byte consistent across
>> the
>> >> >> cluster. we can probably use initializers to do that for components
>> >> that
>> >> >> come in jars, just add a registerModelCodecs to the initializer and
>> >> also
>> >> >> allow the same from application.init()
>> >> >>
>> >> >> does this make sense? what do you guys think?
>> >> >>
>> >> >> -igor
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
>> >> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>> >>
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778010
>> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778788
Sent from the Wicket - Dev mailing list archive at Nabble.com.

Re: [OT] pagemap entries and versioning

Posted by Johan Compagner <jc...@gmail.com>.

On 2/3/07, Jonathan Locke <jo...@gmail.com> wrote:
>
>
>
> of course.  right you are.  you could fix parenting.  but anon classes
> that
> reference the


this would be a problem in 2.0!
components always need to keep a reference to their parent.
Otherwise the developer can't call reattach() on the component, because the
component can't add itself to the parent again. So really getting rid of it is hard.

It is not just anon classes. i can do this in MyTabPanel:

Component child1 = new MyChild1(MyTabPanel.this,"child");
Component child2 = new MyChild2(MyTabPanel.this,"child");

and keep the children as references.
then later on, when the other tab must be shown:

child1.attach();

This will cause child2 to be removed (and put in the undo map)
but the page still has the reference to the child itself (so it can call
attach() again).

So separating changes is just not possible.


> page... that would be a killer for this whole idea.  shucks, it was too good
> to be true...
>
> so you are suggesting getting rid of versioning entirely when the second
> level cache
> is running and just save the whole page each time? (sounded like it)


We already do that now.
We save  all page versions to disk: pageid:pageversion

so my plan was to only have 1 extra version (besides the page itself)
in the undo manager so that the page in the session is pretty light.
And one backbutton is very quick (no disk read) and that is the most used
behavior i guess.
So it looks like the best trade off
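
A rough sketch of that trade off, under loud assumptions: PageVersionStore is a hypothetical class, not Wicket's actual page store or undo manager, and a HashMap stands in for the disk. Every version is written to the "disk" under pageid:pageversion, while only the current and the previous version stay in memory, so one back button needs no disk read.

```java
import java.util.HashMap;
import java.util.Map;

// hypothetical sketch, not Wicket's actual second level cache
class PageVersionStore {
    // stand-in for the disk store, keyed "pageid:pageversion"
    private final Map<String, byte[]> disk = new HashMap<>();

    // only the current and the immediately previous version are kept in memory
    private String currentKey;
    private byte[] current;
    private String previousKey;
    private byte[] previous;

    // counts how often a load had to fall back to the disk store
    int diskReads = 0;

    static String key(int pageId, int version) {
        return pageId + ":" + version;
    }

    void save(int pageId, int version, byte[] serializedPage) {
        String k = key(pageId, version);
        disk.put(k, serializedPage);     // every version goes to disk
        previousKey = currentKey;        // the old current becomes the one cheap undo step
        previous = current;
        currentKey = k;
        current = serializedPage;
    }

    byte[] load(int pageId, int version) {
        String k = key(pageId, version);
        if (k.equals(currentKey)) {
            return current;
        }
        if (k.equals(previousKey)) {
            return previous;             // one back button: no disk read
        }
        diskReads++;
        return disk.get(k);              // older versions come from disk
    }
}
```

With versions 0, 1 and 2 of a page saved, loading version 1 is served from memory, while loading version 0 costs one disk read.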

We should really focus on getting serialization as quick as possible and
the resulting size as small as possible

johan


Johan Compagner wrote:
> >
> > just one thing.
> > It is not possible to really seperate changes from its page.
> > Because changes (mostly components) always have there parent (so you can
> > re
> > attach them)
> >
> > And we have no idea what the components or models also by itself have
> > references to the page. (anon classes)
> >
> > so i don't see any real value of separating the changes from a page.
> > Because
> > what would you do with that?
> > save only changes? But then you will save the page anyway.
> > The page itself will be smaller ofcourse. But with the second level
> cache
> > if
> > we implement it right
> > we only really have to keep 1 page version (maybe not even that but that
> > is
> > what i would do)
> >
> > johan
> >
> >
> > On 2/2/07, Jonathan Locke <jo...@gmail.com> wrote:
> >>
> >>
> >>
> >> before johan complains, i just realized there's a flaw in my little
> >> plan.  you still have to undo changes to pages that are not
> reconstructed
> >> by custom IPageMapEntry implementations because the page is in the
> >> state of the highest ordinal in the page map because that was the last
> >> one accessed.  even so, a little more logic here should correct for
> that.
> >>
> >>
> >> Jonathan Locke wrote:
> >> >
> >> >
> >> > A part of this whole discussion of serializing page map entries is
> >> > also the current open bug that Eelco submitted that we should make
> >> > page versions separate from pages.  This came up over at Diva
> espresso
> >> > a few minutes ago when Eelco and I were chatting and I had an
> >> > interesting and very elegant little idea that could sort this out
> >> > quite nicely.  Not only would it definitely make things more
> efficient,
> >> > it would also be a more elegant solution that fixes an unreported
> bug.
> >> >
> >> > first, i think that IPageMapEntry.getNumericId is really more like
> >> > getOrdinal since page ids will always increment in a page map.
> >> > but regardless, to implement a page version based on another page
> >> > in the page map (which might also be a page version), we can use a
> >> > simple little container implementing IPageMapEntry to hold the
> >> > base pagemap entry id and the changes to apply to that page.
> >> > this container might be just an anonymous class, but let's give it
> >> > a name here for clarity:
> >> >
> >> > class PageVersion implements IPageMapEntry
> >> > {
> >> >    // Identifier of page map entry to apply these changes to
> >> >    // If we're using ordinals, we don't need this field at all
> >> >    // because the page this version is based on will be our own
> >> >    // ordinal - 1.
> >> >    int basePageMapEntryNumericId;
> >> >
> >> >    // Don't recall the exact base class name for change entries
> >> >    // in the versioning code, but this is the list of changes to
> apply
> >> >    List<Change> changes;
> >> >
> >> >    Page getPage()
> >> >    {
> >> >       // Get previous page (possibly recursing)
> >> >       final Page page = pageMap.get(basePageMapEntryNumericId);
> >> >
> >> >       // Apply changes to page
> >> >       page.applyChanges(changes);
> >> >       return page;
> >> >    }
> >> > }
> >> >
> >> > this should work nicely and actually this fixes an existing bug
> because
> >> > right
> >> > now if you provide a custom implementation of IPageMapEntry to
> >> reconstruct
> >> > a page, that page is probably not versionable.  in this case it would
> >> be
> >> > because
> >> > the recursion would bottom out and reconstruct that page.
> >> >
> >> >
> >> > igor.vaynberg wrote:
> >> >>
> >> >> another idea to optimize serialization state from jon and i
> >> >>
> >> >> allow an easy way to override model serialization
> >> >>
> >> >> a simple example:
> >> >>
> >> >> class EntityModel extends LoadableDetachableModel {
> >> >>    private long id;
> >> >>    //standard junk
> >> >> }
> >> >>
> >> >> now when this is serialized you get a bunch of junk like the class
> >> >> header,
> >> >> etc, but all you really care about is that single long id field. so
> >> what
> >> >> if
> >> >> you have
> >> >>
> >> >> interface ModelSerializationCodec {
> >> >> boolean supportsModel(Class<? extends IModel>);
> >> >> writeModel(ObjectOutputStream s, IModel model);
> >> >> IModel readModel(ObjectInputStream);
> >> >> }
> >> >>
> >> >> class ModelSerializationCodecRegistry {
> >> >>    private ModelSerializationCodec[]=new
> ModelSerializationCodec[255];
> >> >>    void registerCodec(ModelSerializationCodec codec) {...}
> >> >>    void codecForId(byte id) {...}
> >> >>    int codecIdForClass(Class<? extends IModel>){...}
> >> >> }
> >> >>
> >> >> Component.writeObject(ObjectOutputStream oos) {
> >> >> // or instead of overriding component.writeobject we can put model
> >> >> // into a model holder object that has this logic
> >> >> // or even wrap the model in the model holder conditionally when the
> >> >> model
> >> >> is set only if there is a codec
> >> >>
> >> >> // i suppose this can also be in a special outputstream we use so it
> >> >> doesnt
> >> >> have to be in the component
> >> >> // we already have a few of these
> >> >>   ...
> >> >> // write out model
> >> >> byte codecId=registry.codecIdForClass(model.getClass());
> >> >> oos.writebyte(0);
> >> >>
> >> >> if (codecId!=0) {
> >> >>    registry.codecForId(codecId).writeModel(oos, model);
> >> >> }
> >> >> }
> >> >>
> >> >> class EntityModelCodec implements ModelSerializationCodec {
> >> >> boolean supportsModel(Class<? extends IModel> c) { return
> >> >> EntityModel.class.equals(c); }
> >> >> writeModel(ObjectOutputStream s, IModel model) {
> >> s.writeLong(model.id);
> >> }
> >> >> IModel readModel(ObjectInputStream ois) { EntityModel em=new
> >> >> EntityModel();
> >> >> em.id=ois.readLong(); }}
> >> >>
> >> >> so now instead of serializing the entire model instance you can just
> >> >> write
> >> >> out the fields you need and because this ability is outside the
> model
> >> you
> >> >> can write out simple primitives. the downside is that for any model
> >> that
> >> >> doesnt have a codec the size of output takes an extra hit - which is
> >> the
> >> >> codec id byte. the question is how many bytes are in the extra junk
> >> like
> >> >> class header - this ratio will determine if this is generally worth
> >> >> doing.
> >> >>
> >> >> the only trick here is to keep the codecid byte consistent across
> the
> >> >> cluster. we can probably use initializers to do that for components
> >> that
> >> >> come in jars, just add a registerModelCodecs to the initializer and
> >> also
> >> >> allow the same from application.init()
> >> >>
> >> >> does this make sense? what do you guys think?
> >> >>
> >> >> -igor
> >> >>
> >> >>
> >> >
> >> >
> >>
> >> --
> >> View this message in context:
> >>
> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
> >> Sent from the Wicket - Dev mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778010
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: [OT] pagemap entries and versioning

Posted by Jonathan Locke <jo...@gmail.com>.


of course.  right you are.  you could fix parenting.  but anon classes that
reference the page... that would be a killer for this whole idea.  shucks, it
was too good to be true...

so you are suggesting getting rid of versioning entirely when the second level
cache is running and just save the whole page each time? (sounded like it)


Johan Compagner wrote:
> 
> just one thing.
> It is not possible to really seperate changes from its page.
> Because changes (mostly components) always have there parent (so you can
> re
> attach them)
> 
> And we have no idea what the components or models also by itself have
> references to the page. (anon classes)
> 
> so i don't see any real value of separating the changes from a page.
> Because
> what would you do with that?
> save only changes? But then you will save the page anyway.
> The page itself will be smaller ofcourse. But with the second level cache
> if
> we implement it right
> we only really have to keep 1 page version (maybe not even that but that
> is
> what i would do)
> 
> johan
> 
> 
> On 2/2/07, Jonathan Locke <jo...@gmail.com> wrote:
>>
>>
>>
>> before johan complains, i just realized there's a flaw in my little
>> plan.  you still have to undo changes to pages that are not reconstructed
>> by custom IPageMapEntry implementations because the page is in the
>> state of the highest ordinal in the page map because that was the last
>> one accessed.  even so, a little more logic here should correct for that.
>>
>>
>> Jonathan Locke wrote:
>> >
>> >
>> > A part of this whole discussion of serializing page map entries is
>> > also the current open bug that Eelco submitted that we should make
>> > page versions separate from pages.  This came up over at Diva espresso
>> > a few minutes ago when Eelco and I were chatting and I had an
>> > interesting and very elegant little idea that could sort this out
>> > quite nicely.  Not only would it definitely make things more efficient,
>> > it would also be a more elegant solution that fixes an unreported bug.
>> >
>> > first, i think that IPageMapEntry.getNumericId is really more like
>> > getOrdinal since page ids will always increment in a page map.
>> > but regardless, to implement a page version based on another page
>> > in the page map (which might also be a page version), we can use a
>> > simple little container implementing IPageMapEntry to hold the
>> > base pagemap entry id and the changes to apply to that page.
>> > this container might be just an anonymous class, but let's give it
>> > a name here for clarity:
>> >
>> > class PageVersion implements IPageMapEntry
>> > {
>> >    // Identifier of page map entry to apply these changes to
>> >    // If we're using ordinals, we don't need this field at all
>> >    // because the page this version is based on will be our own
>> >    // ordinal - 1.
>> >    int basePageMapEntryNumericId;
>> >
>> >    // Don't recall the exact base class name for change entries
>> >    // in the versioning code, but this is the list of changes to apply
>> >    List<Change> changes;
>> >
>> >    Page getPage()
>> >    {
>> >       // Get previous page (possibly recursing)
>> >       final Page page = pageMap.get(basePageMapEntryNumericId);
>> >
>> >       // Apply changes to page
>> >       page.applyChanges(changes);
>> >       return page;
>> >    }
>> > }
>> >
>> > this should work nicely and actually this fixes an existing bug because
>> > right
>> > now if you provide a custom implementation of IPageMapEntry to
>> reconstruct
>> > a page, that page is probably not versionable.  in this case it would
>> be
>> > because
>> > the recursion would bottom out and reconstruct that page.
>> >
>> >
>> > igor.vaynberg wrote:
>> >>
>> >> another idea to optimize serialization state from jon and i
>> >>
>> >> allow an easy way to override model serialization
>> >>
>> >> a simple example:
>> >>
>> >> class EntityModel extends LoadableDetachableModel {
>> >>    private long id;
>> >>    //standard junk
>> >> }
>> >>
>> >> now when this is serialized you get a bunch of junk like the class
>> >> header,
>> >> etc, but all you really care about is that single long id field. so
>> what
>> >> if
>> >> you have
>> >>
>> >> interface ModelSerializationCodec {
>> >> boolean supportsModel(Class<? extends IModel>);
>> >> writeModel(ObjectOutputStream s, IModel model);
>> >> IModel readModel(ObjectInputStream);
>> >> }
>> >>
>> >> class ModelSerializationCodecRegistry {
>> >>    private ModelSerializationCodec[]=new ModelSerializationCodec[255];
>> >>    void registerCodec(ModelSerializationCodec codec) {...}
>> >>    void codecForId(byte id) {...}
>> >>    int codecIdForClass(Class<? extends IModel>){...}
>> >> }
>> >>
>> >> Component.writeObject(ObjectOutputStream oos) {
>> >> // or instead of overriding component.writeobject we can put model
>> >> // into a model holder object that has this logic
>> >> // or even wrap the model in the model holder conditionally when the
>> >> model
>> >> is set only if there is a codec
>> >>
>> >> // i suppose this can also be in a special outputstream we use so it
>> >> doesnt
>> >> have to be in the component
>> >> // we already have a few of these
>> >>   ...
>> >> // write out model
>> >> byte codecId=registry.codecIdForClass(model.getClass());
>> >> oos.writebyte(0);
>> >>
>> >> if (codecId!=0) {
>> >>    registry.codecForId(codecId).writeModel(oos, model);
>> >> }
>> >> }
>> >>
>> >> class EntityModelCodec implements ModelSerializationCodec {
>> >> boolean supportsModel(Class<? extends IModel> c) { return
>> >> EntityModel.class.equals(c); }
>> >> writeModel(ObjectOutputStream s, IModel model) {
>> s.writeLong(model.id);
>> }
>> >> IModel readModel(ObjectInputStream ois) { EntityModel em=new
>> >> EntityModel();
>> >> em.id=ois.readLong(); }}
>> >>
>> >> so now instead of serializing the entire model instance you can just
>> >> write
>> >> out the fields you need and because this ability is outside the model
>> you
>> >> can write out simple primitives. the downside is that for any model
>> that
>> >> doesnt have a codec the size of output takes an extra hit - which is
>> the
>> >> codec id byte. the question is how many bytes are in the extra junk
>> like
>> >> class header - this ratio will determine if this is generally worth
>> >> doing.
>> >>
>> >> the only trick here is to keep the codecid byte consistent across the
>> >> cluster. we can probably use initializers to do that for components
>> that
>> >> come in jars, just add a registerModelCodecs to the initializer and
>> also
>> >> allow the same from application.init()
>> >>
>> >> does this make sense? what do you guys think?
>> >>
>> >> -igor
>> >>
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
>> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778010
Sent from the Wicket - Dev mailing list archive at Nabble.com.

Re: [OT] pagemap entries and versioning

Posted by Johan Compagner <jc...@gmail.com>.

just one thing.
It is not possible to really separate changes from their page,
because changes (mostly components) always have their parent (so you can
reattach them).

And we have no idea which components or models also hold references
to the page themselves. (anon classes)
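
A small sketch of why this matters for serialization: any object that holds a reference to its parent drags the whole parent into the stream. ParentPage, ChildComponent and ParentReferenceDemo are hypothetical names for this illustration, not Wicket classes, and the 10,000 byte array is just a stand-in for page state.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// hypothetical page carrying a noticeable amount of state
class ParentPage implements Serializable {
    private static final long serialVersionUID = 1L;
    final byte[] state = new byte[10_000];
}

// hypothetical component (a "change") that keeps a reference to its parent
class ChildComponent implements Serializable {
    private static final long serialVersionUID = 1L;
    final ParentPage parent;

    ChildComponent(ParentPage parent) {
        this.parent = parent;
    }
}

class ParentReferenceDemo {
    // serialized size of an object graph using default java serialization
    static int serializedSize(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        int withParent = serializedSize(new ChildComponent(new ParentPage()));
        int withoutParent = serializedSize(new ChildComponent(null));
        System.out.println("child with parent:    " + withParent + " bytes");
        System.out.println("child without parent: " + withoutParent + " bytes");
    }
}
```

Serializing the child alone still writes the parent's 10,000 bytes of state, so a "change only" stream is not actually smaller than the page unless the parent reference is cut or substituted.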

so i don't see any real value in separating the changes from a page. Because
what would you do with that?
save only changes? But then you will save the page anyway.
The page itself will be smaller of course. But with the second level cache, if
we implement it right,
we only really have to keep 1 page version (maybe not even that but that is
what i would do)

johan


On 2/2/07, Jonathan Locke <jo...@gmail.com> wrote:
>
>
>
> before johan complains, i just realized there's a flaw in my little
> plan.  you still have to undo changes to pages that are not reconstructed
> by custom IPageMapEntry implementations because the page is in the
> state of the highest ordinal in the page map because that was the last
> one accessed.  even so, a little more logic here should correct for that.
>
>
> Jonathan Locke wrote:
> >
> >
> > A part of this whole discussion of serializing page map entries is
> > also the current open bug that Eelco submitted that we should make
> > page versions separate from pages.  This came up over at Diva espresso
> > a few minutes ago when Eelco and I were chatting and I had an
> > interesting and very elegant little idea that could sort this out
> > quite nicely.  Not only would it definitely make things more efficient,
> > it would also be a more elegant solution that fixes an unreported bug.
> >
> > first, i think that IPageMapEntry.getNumericId is really more like
> > getOrdinal since page ids will always increment in a page map.
> > but regardless, to implement a page version based on another page
> > in the page map (which might also be a page version), we can use a
> > simple little container implementing IPageMapEntry to hold the
> > base pagemap entry id and the changes to apply to that page.
> > this container might be just an anonymous class, but let's give it
> > a name here for clarity:
> >
> > class PageVersion implements IPageMapEntry
> > {
> >    // Identifier of page map entry to apply these changes to
> >    // If we're using ordinals, we don't need this field at all
> >    // because the page this version is based on will be our own
> >    // ordinal - 1.
> >    int basePageMapEntryNumericId;
> >
> >    // Don't recall the exact base class name for change entries
> >    // in the versioning code, but this is the list of changes to apply
> >    List<Change> changes;
> >
> >    Page getPage()
> >    {
> >       // Get previous page (possibly recursing)
> >       final Page page = pageMap.get(basePageMapEntryNumericId);
> >
> >       // Apply changes to page
> >       page.applyChanges(changes);
> >       return page;
> >    }
> > }
> >
> > this should work nicely and actually this fixes an existing bug because
> > right
> > now if you provide a custom implementation of IPageMapEntry to
> reconstruct
> > a page, that page is probably not versionable.  in this case it would be
> > because
> > the recursion would bottom out and reconstruct that page.
> >
> >
> > igor.vaynberg wrote:
> >>
> >> another idea to optimize serialization state from jon and i
> >>
> >> allow an easy way to override model serialization
> >>
> >> a simple example:
> >>
> >> class EntityModel extends LoadableDetachableModel {
> >>    private long id;
> >>    //standard junk
> >> }
> >>
> >> now when this is serialized you get a bunch of junk like the class
> >> header,
> >> etc, but all you really care about is that single long id field. so
> what
> >> if
> >> you have
> >>
> >> interface ModelSerializationCodec {
> >> boolean supportsModel(Class<? extends IModel>);
> >> writeModel(ObjectOutputStream s, IModel model);
> >> IModel readModel(ObjectInputStream);
> >> }
> >>
> >> class ModelSerializationCodecRegistry {
> >>    private ModelSerializationCodec[]=new ModelSerializationCodec[255];
> >>    void registerCodec(ModelSerializationCodec codec) {...}
> >>    void codecForId(byte id) {...}
> >>    int codecIdForClass(Class<? extends IModel>){...}
> >> }
> >>
> >> Component.writeObject(ObjectOutputStream oos) {
> >> // or instead of overriding component.writeobject we can put model
> >> // into a model holder object that has this logic
> >> // or even wrap the model in the model holder conditionally when the
> >> model
> >> is set only if there is a codec
> >>
> >> // i suppose this can also be in a special outputstream we use so it
> >> doesnt
> >> have to be in the component
> >> // we already have a few of these
> >>   ...
> >> // write out model
> >> byte codecId=registry.codecIdForClass(model.getClass());
> >> oos.writebyte(0);
> >>
> >> if (codecId!=0) {
> >>    registry.codecForId(codecId).writeModel(oos, model);
> >> }
> >> }
> >>
> >> class EntityModelCodec implements ModelSerializationCodec {
> >> boolean supportsModel(Class<? extends IModel> c) { return
> >> EntityModel.class.equals(c); }
> >> writeModel(ObjectOutputStream s, IModel model) { s.writeLong(model.id);
> }
> >> IModel readModel(ObjectInputStream ois) { EntityModel em=new
> >> EntityModel();
> >> em.id=ois.readLong(); }}
> >>
> >> so now instead of serializing the entire model instance you can just
> >> write
> >> out the fields you need and because this ability is outside the model
> you
> >> can write out simple primitives. the downside is that for any model
> that
> >> doesnt have a codec the size of output takes an extra hit - which is
> the
> >> codec id byte. the question is how many bytes are in the extra junk
> like
> >> class header - this ratio will determine if this is generally worth
> >> doing.
> >>
> >> the only trick here is to keep the codecid byte consistent across the
> >> cluster. we can probably use initializers to do that for components
> that
> >> come in jars, just add a registerModelCodecs to the initializer and
> also
> >> allow the same from application.init()
> >>
> >> does this make sense? what do you guys think?
> >>
> >> -igor
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>
>

Re: [OT] pagemap entries and versioning

Posted by Jonathan Locke <jo...@gmail.com>.


before johan complains, i just realized there's a flaw in my little
plan.  pages that are not reconstructed by custom IPageMapEntry
implementations are left in the state of the highest ordinal in the
page map, because that was the last version accessed, so you still have
to undo changes to get back to an earlier version.  even so, a little
more logic here should correct for that.


Jonathan Locke wrote:
> 
> 
> A part of this whole discussion of serializing page map entries is
> also the current open bug that Eelco submitted that we should make
> page versions separate from pages.  This came up over at Diva espresso
> a few minutes ago when Eelco and I were chatting and I had an 
> interesting and very elegant little idea that could sort this out
> quite nicely.  Not only would it definitely make things more efficient, 
> it would also be a more elegant solution that fixes an unreported bug.
> 
> first, i think that IPageMapEntry.getNumericId is really more like 
> getOrdinal since page ids will always increment in a page map.
> but regardless, to implement a page version based on another page
> in the page map (which might also be a page version), we can use a 
> simple little container implementing IPageMapEntry to hold the 
> base pagemap entry id and the changes to apply to that page.
> this container might be just an anonymous class, but let's give it 
> a name here for clarity:
> 
> class PageVersion implements IPageMapEntry
> {
>    // Identifier of page map entry to apply these changes to
>    // If we're using ordinals, we don't need this field at all
>    // because the page this version is based on will be our own
>    // ordinal - 1.
>    int basePageMapEntryNumericId; 
> 
>    // Don't recall the exact base class name for change entries
>    // in the versioning code, but this is the list of changes to apply
>    List<Change> changes;  
> 
>    Page getPage()
>    {
>       // Get previous page (possibly recursing)
>       final Page page = pageMap.get(basePageMapEntryNumericId);
> 
>       // Apply changes to page
>       page.applyChanges(changes);
>       return page;
>    }
> }
> 
> this should work nicely, and it actually fixes an existing bug: right
> now, if you provide a custom implementation of IPageMapEntry to
> reconstruct a page, that page is probably not versionable.  with this
> scheme it would be, because the recursion would bottom out at the custom
> entry and reconstruct that page.
> 
> 
> igor.vaynberg wrote:
>> 
>> another idea to optimize serialization state from jon and i
>> 
>> allow an easy way to override model serialization
>> 
>> a simple example:
>> 
>> class EntityModel extends LoadableDetachableModel {
>>    private long id;
>>    //standard junk
>> }
>> 
>> now when this is serialized you get a bunch of junk like the class
>> header, etc, but all you really care about is that single long id field.
>> so what if you have
>> 
>> interface ModelSerializationCodec {
>> boolean supportsModel(Class<? extends IModel>);
>> writeModel(ObjectOutputStream s, IModel model);
>> IModel readModel(ObjectInputStream);
>> }
>> 
>> class ModelSerializationCodecRegistry {
>>    private ModelSerializationCodec[] codecs=new ModelSerializationCodec[255];
>>    void registerCodec(ModelSerializationCodec codec) {...}
>>    ModelSerializationCodec codecForId(byte id) {...}
>>    byte codecIdForClass(Class<? extends IModel> c){...}
>> }
>> 
>> Component.writeObject(ObjectOutputStream oos) {
>> // or instead of overriding component.writeobject we can put model
>> // into a model holder object that has this logic
>> // or even wrap the model in the model holder conditionally when the
>> // model is set only if there is a codec
>> 
>> // i suppose this can also be in a special outputstream we use so it
>> // doesnt have to be in the component
>> // we already have a few of these
>>   ...
>> // write out model
>> byte codecId=registry.codecIdForClass(model.getClass());
>> oos.writeByte(codecId);
>> 
>> if (codecId!=0) {
>>    registry.codecForId(codecId).writeModel(oos, model);
>> }
>> }
>> 
>> class EntityModelCodec implements ModelSerializationCodec {
>> boolean supportsModel(Class<? extends IModel> c) { return
>> EntityModel.class.equals(c); }
>> writeModel(ObjectOutputStream s, IModel model) { s.writeLong(model.id); }
>> IModel readModel(ObjectInputStream ois) { EntityModel em=new
>> EntityModel();
>> em.id=ois.readLong(); }}
>> 
>> so now instead of serializing the entire model instance you can just
>> write out the fields you need and because this ability is outside the
>> model you can write out simple primitives. the downside is that for any
>> model that doesnt have a codec the size of output takes an extra hit -
>> which is the codec id byte. the question is how many bytes are in the
>> extra junk like class header - this ratio will determine if this is
>> generally worth doing.
>> 
>> the only trick here is to keep the codecid byte consistent across the
>> cluster. we can probably use initializers to do that for components that
>> come in jars, just add a registerModelCodecs to the initializer and also
>> allow the same from application.init()
>> 
>> does this make sense? what do you guys think?
>> 
>> -igor
>> 
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
Sent from the Wicket - Dev mailing list archive at Nabble.com.

[OT] pagemap entries and versioning

Posted by Jonathan Locke <jo...@gmail.com>.

A part of this whole discussion of serializing page map entries is
also the current open bug that Eelco submitted that we should make
page versions separate from pages.  This came up over at Diva espresso
a few minutes ago when Eelco and I were chatting and I had an 
interesting and very elegant little idea that could sort this out
quite nicely.  Not only would it definitely make things more efficient, 
it would also be a more elegant solution that fixes an unreported bug.

first, i think that IPageMapEntry.getNumericId is really more like 
getOrdinal since page ids will always increment in a page map.
but regardless, to implement a page version based on another page
in the page map (which might also be a page version), we can use a 
simple little container implementing IPageMapEntry to hold the 
base pagemap entry id and the changes to apply to that page.
this container might be just an anonymous class, but let's give it 
a name here for clarity:

class PageVersion implements IPageMapEntry
{
   // Identifier of page map entry to apply these changes to
   // If we're using ordinals, we don't need this field at all
   // because the page this version is based on will be our own
   // ordinal - 1.
   int basePageMapEntryNumericId; 

   // Don't recall the exact base class name for change entries
   // in the versioning code, but this is the list of changes to apply
   List<Change> changes;  

   Page getPage()
   {
      // Get previous page (possibly recursing)
      final Page page = pageMap.get(basePageMapEntryNumericId);

      // Apply changes to page
      page.applyChanges(changes);
      return page;
   }
}

this should work nicely, and it actually fixes an existing bug: right
now, if you provide a custom implementation of IPageMapEntry to
reconstruct a page, that page is probably not versionable.  with this
scheme it would be, because the recursion would bottom out at the custom
entry and reconstruct that page.
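
to make the recursion concrete, here is a minimal sketch in plain java.
everything in it (Page as a property map, Change, FreshPage, PageMap) is a
made-up stand-in rather than wicket's actual api, and it assumes the base
entry rebuilds a fresh page on every call, which sidesteps having to undo
changes on a live page:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// hypothetical stand-ins throughout; none of these are wicket classes
final class PageVersionSketch {

    /** a page reduced to a sorted bag of named properties */
    static final class Page {
        final Map<String, String> state = new TreeMap<>();
    }

    /** one change: set a property to a value */
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }

        void apply(Page page) {
            page.state.put(key, value);
        }
    }

    interface PageMapEntry {
        Page getPage(PageMap pageMap);
    }

    /** base entry: reconstructs a fresh page each time (an assumption) */
    static final class FreshPage implements PageMapEntry {
        public Page getPage(PageMap pageMap) {
            Page page = new Page();
            page.state.put("title", "home");
            return page;
        }
    }

    /** a version = id of the entry it is based on + changes to apply */
    static final class PageVersion implements PageMapEntry {
        private final int baseEntryId;
        private final List<Change> changes;

        PageVersion(int baseEntryId, Change... changes) {
            this.baseEntryId = baseEntryId;
            this.changes = Arrays.asList(changes);
        }

        public Page getPage(PageMap pageMap) {
            // get the previous page (recursing if it is itself a version)
            Page page = pageMap.get(baseEntryId);
            for (Change change : changes) {
                change.apply(page);
            }
            return page;
        }
    }

    /** entries are numbered by insertion order, i.e. by ordinal */
    static final class PageMap {
        private final List<PageMapEntry> entries = new ArrayList<>();

        int add(PageMapEntry entry) {
            entries.add(entry);
            return entries.size() - 1;
        }

        Page get(int id) {
            return entries.get(id).getPage(this);
        }
    }

    public static void main(String[] args) {
        PageMap map = new PageMap();
        int base = map.add(new FreshPage());
        int v1 = map.add(new PageVersion(base, new Change("title", "edited")));
        int v2 = map.add(new PageVersion(v1, new Change("tab", "2")));
        System.out.println(map.get(base).state);
        System.out.println(map.get(v2).state);
    }
}
```

asking for the last entry walks back through v1 to the base entry, which
rebuilds the page, and then the changes are replayed in order on the way
back up.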

igor.vaynberg wrote:
> 
> another idea to optimize serialization state from jon and i
> 
> allow an easy way to override model serialization
> 
> a simple example:
> 
> class EntityModel extends LoadableDetachableModel {
>    private long id;
>    //standard junk
> }
> 
> now when this is serialized you get a bunch of junk like the class header,
> etc, but all you really care about is that single long id field. so what
> if you have
> 
> interface ModelSerializationCodec {
> boolean supportsModel(Class<? extends IModel>);
> writeModel(ObjectOutputStream s, IModel model);
> IModel readModel(ObjectInputStream);
> }
> 
> class ModelSerializationCodecRegistry {
>    private ModelSerializationCodec[] codecs=new ModelSerializationCodec[255];
>    void registerCodec(ModelSerializationCodec codec) {...}
>    ModelSerializationCodec codecForId(byte id) {...}
>    byte codecIdForClass(Class<? extends IModel> c){...}
> }
> 
> Component.writeObject(ObjectOutputStream oos) {
> // or instead of overriding component.writeobject we can put model
> // into a model holder object that has this logic
> // or even wrap the model in the model holder conditionally when the model
> // is set only if there is a codec
> 
> // i suppose this can also be in a special outputstream we use so it
> // doesnt have to be in the component
> // we already have a few of these
>   ...
> // write out model
> byte codecId=registry.codecIdForClass(model.getClass());
> oos.writeByte(codecId);
> 
> if (codecId!=0) {
>    registry.codecForId(codecId).writeModel(oos, model);
> }
> }
> 
> class EntityModelCodec implements ModelSerializationCodec {
> boolean supportsModel(Class<? extends IModel> c) { return
> EntityModel.class.equals(c); }
> void writeModel(ObjectOutputStream s, IModel model) {
> s.writeLong(((EntityModel)model).id); }
> IModel readModel(ObjectInputStream ois) { EntityModel em=new
> EntityModel();
> em.id=ois.readLong(); return em; }}
> 
> so now instead of serializing the entire model instance you can just write
> out the fields you need and because this ability is outside the model you
> can write out simple primitives. the downside is that for any model that
> doesnt have a codec the size of output takes an extra hit - which is the
> codec id byte. the question is how many bytes are in the extra junk like
> class header - this ratio will determine if this is generally worth doing.
> 
> the only trick here is to keep the codecid byte consistent across the
> cluster. we can probably use initializers to do that for components that
> come in jars, just add a registerModelCodecs to the initializer and also
> allow the same from application.init()
> 
> does this make sense? what do you guys think?
> 
> -igor
> 
> 
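
for reference, a rough sketch of the codec idea quoted above, as i read it.
Model, EntityModel, Registry and the method names are hypothetical stand-ins
rather than wicket's api, and id 0 is assumed to mean "no codec", in which
case the model falls back to plain java serialization:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// hypothetical stand-ins throughout; none of this is real wicket api
final class ModelCodecSketch {

    interface Model extends Serializable {}   // stands in for IModel

    static final class EntityModel implements Model {
        long id;
        EntityModel(long id) { this.id = id; }
    }

    interface ModelCodec {
        boolean supports(Class<? extends Model> type);
        void write(DataOutput out, Model model) throws IOException;
        Model read(DataInput in) throws IOException;
    }

    /** writes only the id instead of the whole serialized object */
    static final class EntityModelCodec implements ModelCodec {
        public boolean supports(Class<? extends Model> type) {
            return EntityModel.class.equals(type);
        }
        public void write(DataOutput out, Model model) throws IOException {
            out.writeLong(((EntityModel) model).id);
        }
        public Model read(DataInput in) throws IOException {
            return new EntityModel(in.readLong());
        }
    }

    static final class Registry {
        private final ModelCodec[] codecs = new ModelCodec[256]; // slot 0 = "no codec"
        private int count = 0;

        int register(ModelCodec codec) {
            if (count == 255) throw new IllegalStateException("codec ids exhausted");
            codecs[++count] = codec;
            return count;
        }
        int idFor(Class<? extends Model> type) {
            for (int id = 1; id <= count; id++) {
                if (codecs[id].supports(type)) return id;
            }
            return 0;
        }
        ModelCodec forId(int id) { return codecs[id]; }
    }

    static byte[] toBytes(Registry registry, Model model) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            int id = registry.idFor(model.getClass());
            oos.writeByte(id);                      // the one-byte overhead
            if (id != 0) registry.forId(id).write(oos, model);
            else oos.writeObject(model);            // fallback: normal serialization
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Model fromBytes(Registry registry, byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int id = ois.readUnsignedByte();
            return id != 0 ? registry.forId(id).read(ois) : (Model) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry withCodec = new Registry();
        withCodec.register(new EntityModelCodec());
        Registry empty = new Registry();
        EntityModel model = new EntityModel(42L);
        System.out.println("with codec: " + toBytes(withCodec, model).length + " bytes");
        System.out.println("without:    " + toBytes(empty, model).length + " bytes");
        Model back = fromBytes(withCodec, toBytes(withCodec, model));
        System.out.println("round trip id: " + ((EntityModel) back).id);
    }
}
```

running main prints both sizes side by side, which is a quick way to
measure the class header overhead igor asks about. the ids here depend on
registration order, so every node in a cluster would have to register
codecs in the same order.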

-- 
View this message in context: http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775808
Sent from the Wicket - Dev mailing list archive at Nabble.com.