You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@cocoon.apache.org by Antonio Gallardo <ag...@agssa.net> on 2005/05/27 04:10:41 UTC

[RT] Cforms selection-lists

Hi:

Yesterday, I posted a simple RT on how to improve cforms dynamic selection
list performance. After getting a huge responses and a lot interest from
the cocoon dev list. :-)

I am not sure if this was approved by a lazy consensus or simply people is
happy with the current performance. I prefer to interpret that as it was
approved by a lazy consensus. ;-)


Seriously, before continuing I must admit I feel like a newbie inside this
code. Yesterday, I was reviewing the current implementation of selections
lists. And I will try to describe what I saw there:

Current situation
=================

In short, The world is black and white, there are no greys between.

Why? Because, we have only 2 choices:

1- Static selection list (SSL): generated only once over the all life
cycle of the servlet. Once it is generated it is served from the cache
forever. There is no chance to change it.

2- Dynamic selection list (DSL), this is generated every time we need it.
Over and over.

This polarized selections list are missing the real nature of forms. Or
perhaps we need only to improve the current implementation of them. I see
the need of 1 additional selections lists:

3-Semi-static selection list (SSSL): generated every time the user request
the form. Then it is served from the cache. This will allow us to change
on-line a static selection list without the need to restart the servlet. I
don't know if this should be called better semi-dynamic SL. Please
comment.

Also, in the previous mail, I tried to explain the current situation of
DSL inside a repeater:

http://marc.theaimsgroup.com/?t=111706780900003&r=1&w=2

IMHO, it should improved too. The question is:

Need it to be a user defined optional feature?

Architecture
============

If the above is approved -> this mostly, why I am writing this mail. ;-)

In order to improve that I see 2 posibilities:

1. The selection list widget (SLW) should know more about the status of
his parent to know how to manage the current generation.  SLW need to
know:

A. If his parent is a repeater or not. If yes, then need to know the
repeater-row index to. Useful for DSL.
B. Need to know if his parent is just initializating. Useful for SSSL. I
know, this is posible by the initialization of the form.

2. The repeater-row widget, the repeater widget know what to do and call
corresponding SLW methods to do the job. In this case the job is managed
from the parent and the parent call the required methods. I see in this
implementation an overhead, since the parent will need to know wich kind
of widget it needs to generate.

I am willing to implement this and I will be glad to hear opinions before
starting to implement this.

Also, need we a vote to do this?

WDYT?

Best Regards,

Antonio Gallardo.

Re: [RT] Cforms selection-lists

Posted by Antonio Gallardo <ag...@agssa.net>.

On Dom, 29 de Mayo de 2005, 3:21, Sylvain Wallez dijo:
> Antonio Gallardo wrote:
>>On Sab, 28 de Mayo de 2005, 7:45, Sylvain Wallez dijo:
>>>Antonio Gallardo wrote:

> No problem. You have a need, you explain it and start implementing, and
> we discuss :-)

Hehe! :-S

>>>- each field in a repeater can programmatically be given a distinct
>>>selection-list (e.g. with event handlers), in which case triggering the
>>>test on the first repeater row will fail.
>>
>>Yes. This is what I already saw. This address my second concern related
>> to
>>setting programatically a new SelectionList inside an event. Somehow, we
>>need to cache this too, because inside a repeater there is a big chance
>> we
>>are using the same SelectionList over and over. Again, this is place for
>> a
>>performance improvement.
>>
>>
>
> My proposal also handles this case. If the DSL object manages the
> caching by itself, caching also happens if the same DSL object is set
> programmatically on various objects.
>
> And this applies to more uses case than a repeater, but also to more
> complicated use cases such as recursive forms, e.g. a sitemap editor
> where the list of generators will be attached to all widgets
> representing a <map:generate>
>
>>Given this new facts, I started to think if is posible to have a cache
>>Collection of all involved DSL list (perhaps implemented as a handler, as
>>suggested by Joerg). The handler will check first inside the cache
>>Collection is the SelectionList was already generated.... Bah! Your
>>solution suggested below is far better than this one. Dropping this
>>idea...
>>
>>
>
> :-)
>
>>>So this must be handled entirely either in the pipeline or in the
>>>selection-list.
>>>
>>>
>>
>>I don't like the pipeline idea. Better at the selection-list.
>>
>>But I see again teh selection-list needs to know more info, to create a
>>smarter cache. We need to cache only if needed. If not, then we will
>>slowdown the other selection-list that don't need an attribute after all.
>>
>>
>
> So let's add a "cache" attribute on <fd:selection-list> that, if true,
> will enable the caching.

For sure we need something like that, I think the best is if cocoon will
decide when to use cache. Giving to the end user the choice to set or not
@cache in the SelectionList will end up in mistakes (ie: unnecesary cache
setup or missing @cache when is really needed). I can even see the mails
on the user list. Hmm... I also understand this is a 1st step. And we can
left this for the future, but I don't like this idea,.... but.... if there
is not another option, then I can live with a new attribute. We don't have
a choice. :-( Perhaps the builder can take care of this? Is this posible?
I know it is posible, but I mean in a way to not break the code again! You
know what I mean. ;-)

Well, lets start by the new optional @cache. In fact this was one of my
first ideas, before I realized that cforms should do that for us. ;-)

<snip>

>>This last suggestion, is really great! As you told, we need to find a way
>>to generate a real unique attribute name to avoid conflict with whatever
>>other attributes inside the request.
>>
> What about "new java.rmi.server.UID().toString()"?

It's OK to me.

>>How we are going to implement this? Can you do this?
>
> Better do it yourself if you need it quicly :-)

It's OK (again!) :-S

> Don't hesitate to ask if you need advices however!

Thanks! :-)

Best Regards,

Antonio Gallardo.

Re: [RT] Cforms selection-lists

Posted by Sylvain Wallez <sy...@apache.org>.

Antonio Gallardo wrote:

>On Sab, 28 de Mayo de 2005, 7:45, Sylvain Wallez dijo:
>  
>
>>Antonio Gallardo wrote:
>>
>>    
>>
>>>Hi:
>>>
>>>Yesterday, I posted a simple RT on how to improve cforms dynamic
>>>selection
>>>list performance. After getting a huge responses and a lot interest from
>>>the cocoon dev list. :-)
>>>      
>>>
>>Sorry for the late answer.
>>
>>Your use case is perfectly valid and must be taken into consideration.
>>However, just as Joerg I have some reservations on the implementation ;-)
>>
>>Caching the of the list data must not be triggered by the Field object,
>>for several reasons:
>>- the repeater containing the field may not be its direct parent, but
>>one of its ancestors if you have some enclosing group or union.
>>    
>>
>
>Opss. I forgot this! Hence, the current implementation is buggy and needs
>to be changed. I will revert the changes now. Sorry for the disturb.
>  
>

No problem. You have a need, you explain it and start implementing, and 
we discuss :-)

>>- each field in a repeater can programmatically be given a distinct
>>selection-list (e.g. with event handlers), in which case triggering the
>>test on the first repeater row will fail.
>>    
>>
>
>Yes. This is what I already saw. This address my second concern related to
>setting programatically a new SelectionList inside an event. Somehow, we
>need to cache this too, because inside a repeater there is a big chance we
>are using the same SelectionList over and over. Again, this is place for a
>performance improvement.
>  
>

My proposal also handles this case. If the DSL object manages the 
caching by itself, caching also happens if the same DSL object is set 
programmatically on various objects.

And this applies to more uses case than a repeater, but also to more 
complicated use cases such as recursive forms, e.g. a sitemap editor 
where the list of generators will be attached to all widgets 
representing a <map:generate>

>Given this new facts, I started to think if is posible to have a cache
>Collection of all involved DSL list (perhaps implemented as a handler, as
>suggested by Joerg). The handler will check first inside the cache
>Collection is the SelectionList was already generated.... Bah! Your
>solution suggested below is far better than this one. Dropping this
>idea...
>  
>

:-)

>>So this must be handled entirely either in the pipeline or in the
>>selection-list.
>>    
>>
>
>I don't like the pipeline idea. Better at the selection-list.
>
>But I see again teh selection-list needs to know more info, to create a
>smarter cache. We need to cache only if needed. If not, then we will
>slowdown the other selection-list that don't need an attribute after all.
>  
>

So let's add a "cache" attribute on <fd:selection-list> that, if true, 
will enable the caching.

>For example, If we have only a SelectionList in a form without a repeater
>or in a form with repeater but the SelectionList is placed outside the
>repeater, then there is totally not sense to cache the info. It is only a
>waste of time and memory.
>  
>

Yep.

>>Caching in the selection-list itself can be done on a per-request basis
>>(thus ensuring the pipeline will be called only once per request),
>>by storing the selection list in a request attribute. The DSL has to
>>generate a unique attribute name (to avoid conflicts). When generating
>>the SAX fragment, either the attribute is empty and the pipeline is
>>called, caching on the fly, or the attribute exists and can be used
>>immediately. The cached value is then automatically trashed at the end
>>of the request processing.
>>    
>>
>
>I think this is the best. And this is what I will be glad to see this
>working! :-)
>
>  
>
>>How does that sound?
>>    
>>
>
>This last suggestion, is really great! As you told, we need to find a way
>to generate a real unique attribute name to avoid conflict with whatever
>other attributes inside the request.
>  
>

What about "new java.rmi.server.UID().toString()"?

>How we are going to implement this? Can you do this?
>  
>

Better do it yourself if you need it quicly :-)

Don't hesitate to ask if you need advices however!

Sylvain

-- 
Sylvain Wallez                        Anyware Technologies
http://apache.org/~sylvain            http://anyware-tech.com
Apache Software Foundation Member     Research & Technology Director

Re: [RT] Cforms selection-lists

Posted by Antonio Gallardo <ag...@agssa.net>.

On Sab, 28 de Mayo de 2005, 7:45, Sylvain Wallez dijo:
> Antonio Gallardo wrote:
>
>>Hi:
>>
>>Yesterday, I posted a simple RT on how to improve cforms dynamic
>> selection
>>list performance. After getting a huge responses and a lot interest from
>>the cocoon dev list. :-)
>>
>>
>
> Sorry for the late answer.
>
> Your use case is perfectly valid and must be taken into consideration.
> However, just as Joerg I have some reservations on the implementation ;-)
>
> Caching the of the list data must not be triggered by the Field object,
> for several reasons:
> - the repeater containing the field may not be its direct parent, but
> one of its ancestors if you have some enclosing group or union.

Opss. I forgot this! Hence, the current implementation is buggy and needs
to be changed. I will revert the changes now. Sorry for the disturb.

> - each field in a repeater can programmatically be given a distinct
> selection-list (e.g. with event handlers), in which case triggering the
> test on the first repeater row will fail.

Yes. This is what I already saw. This address my second concern related to
setting programatically a new SelectionList inside an event. Somehow, we
need to cache this too, because inside a repeater there is a big chance we
are using the same SelectionList over and over. Again, this is place for a
performance improvement.

Given this new facts, I started to think if is posible to have a cache
Collection of all involved DSL list (perhaps implemented as a handler, as
suggested by Joerg). The handler will check first inside the cache
Collection is the SelectionList was already generated.... Bah! Your
solution suggested below is far better than this one. Dropping this
idea...

> So this must be handled entirely either in the pipeline or in the
> selection-list.

I don't like the pipeline idea. Better at the selection-list.

But I see again teh selection-list needs to know more info, to create a
smarter cache. We need to cache only if needed. If not, then we will
slowdown the other selection-list that don't need an attribute after all.

For example, If we have only a SelectionList in a form without a repeater
or in a form with repeater but the SelectionList is placed outside the
repeater, then there is totally not sense to cache the info. It is only a
waste of time and memory.

> You're right in saying that a pipeline querying a database cannot easily
> be cached. Now we have in the scratchpad (and this should IMO be
> promoted to core) a nice "cached:" protocol, which allows to cache any
> URI for a given period of time. So your DSL can use the
> "cached:cocoon://database-query?cocoon:cache-expires=1" to have your
> expensive pipeline called only once in a single second.

Hmm. I think is this not a good solution. There are some forms that can
take more than 1 second. And we see this often, there are not too much
special cases. IMHO, this is not also a solution to suggest 2 or X
seconds. Since we never will know how much time in fact we will need.
Customizing the time for every forms sounds like a not good idea.

> Caching in the selection-list itself can be done on a per-request basis
> (thus ensuring the pipeline will be called only once per request),
> by storing the selection list in a request attribute. The DSL has to
> generate a unique attribute name (to avoid conflicts). When generating
> the SAX fragment, either the attribute is empty and the pipeline is
> called, caching on the fly, or the attribute exists and can be used
> immediately. The cached value is then automatically trashed at the end
> of the request processing.

I think this is the best. And this is what I will be glad to see this
working! :-)

> How does that sound?

This last suggestion, is really great! As you told, we need to find a way
to generate a real unique attribute name to avoid conflict with whatever
other attributes inside the request.

How we are going to implement this? Can you do this?

Best Regards,

Antonio Gallardo.

Re: [RT] Cforms selection-lists

Posted by Joerg Heinicke <jo...@gmx.de>.

On 29.05.2005 04:03, Antonio Gallardo wrote:

> Per form is not a good option, IMHO because I suppose the life of the form
> can have more than one request. Please correct me if I am wrong here.

You are correct. I meant only a part of the form lifecycle, the form 
processing during a request.

Joerg

Re: [RT] Cforms selection-lists

Posted by Antonio Gallardo <ag...@agssa.net>.

On Sab, 28 de Mayo de 2005, 11:44, Joerg Heinicke dijo:
>> Caching in the selection-list itself can be done on a per-request basis
>> (thus ensuring the pipeline will be called only once per request), by
>> storing the selection list in a request attribute. The DSL has to
>> generate a unique attribute name (to avoid conflicts). When generating
>> the SAX fragment, either the attribute is empty and the pipeline is
>> called, caching on the fly, or the attribute exists and can be used
>> immediately. The cached value is then automatically trashed at the end
>> of the request processing.
>
> "Polluting" the request is always something to be avoided as far as
> possible IMO. Would it not be possible to have just one buffering
> SelectionListHandler per source and request (or form) with an internal
> SAXBuffer?

A SelectionListsHandler per request is OK. We can store the cache
elsewhere. That is OK to me.

The handler must contains all the selection lists that needs to be cached.
And I said "needs", because we don't want to cache all of them. We can
detect wich of thems are good candidates to cached.

Per form is not a good option, IMHO because I suppose the life of the form
can have more than one request. Please correct me if I am wrong here.

Here is some tipical sample where we need to cache a programatically
changed selection list:

We have an invoice form. In the detail repeater there can be 2
selection-lists: family of products and product. Said, the identity is
conformed by both: the family and the product.

Given this products:

family     product
======     =======
Food       Rice
Food       Beans
Food       Bread
Food       ...
Drink      Water
Drink      Juice
Drink      ...
...        ...

The interface will set the product selection list once the user select a
family. So we will use here a programatically selection-list. Perhaps we
use for that:

cocoon:/products.data?family=Food

If the user choose to buy said: Rice, Beans, Bread Water and Juice then
while we select the "Food" family we are triggering the pipeline for
products of Food. We can see easily this pipeline must be cached since we
will need it when the user select again the same family for Beans and
Bread.

(I hope this clearly explains the need of caching programatically changed
selection-list).

I know the sample is very easy and we can use Ids, etc. But this is not
the point.

Best Regards,

Antonio Gallardo.

Re: [RT] Cforms selection-lists

Posted by Joerg Heinicke <jo...@gmx.de>.

On 29.05.2005 10:10, Sylvain Wallez wrote:

> I don't see why you consider it as pollution. Request attributes do 
> exist, and their purpose is to provide temporary storage for the various 
> things that participate in processing a request and are isolated either 
> in space (different classes) or time (same class called several times -- 
> our case here).

But the request lifecycle does not necessarily match the form processing 
lifecycle.

>> Would it not be possible to have just one buffering 
>> SelectionListHandler per source and request (or form) with an internal 
>> SAXBuffer?
> 
> What would this handler do? Associate a buffered selection-list content 
> to a request? Then why not using request attributes?

This handler already exists: 
org.apache.cocoon.forms.datatype.DynamicSelectionList.SelectionListHandler. 
My idea was to use it also for buffering.
The usage of it probably also has to be changed then in 
org.apache.cocoon.forms.datatype.DynamicSelectionList.generateSaxFragment(ContentHandler, 
Locale, Source).

Joerg

Re: [RT] Cforms selection-lists

Posted by Sylvain Wallez <sy...@apache.org>.

Joerg Heinicke wrote:

> On 28.05.2005 14:45, Sylvain Wallez wrote:
>
>> So this must be handled entirely either in the pipeline or in the 
>> selection-list.
>
>
> +1
>
>> You're right in saying that a pipeline querying a database cannot 
>> easily be cached. Now we have in the scratchpad (and this should IMO 
>> be promoted to core) a nice "cached:" protocol, which allows to cache 
>> any URI for a given period of time. So your DSL can use the 
>> "cached:cocoon://database-query?cocoon:cache-expires=1" to have your 
>> expensive pipeline called only once in a single second.
>
>
> This might lead to inconsistent forms as the form request processing 
> is not synchronized with this caching.


That's true. But that was the "instant, no additinal code required" 
solution :-)

>> Caching in the selection-list itself can be done on a per-request 
>> basis (thus ensuring the pipeline will be called only once per 
>> request), by storing the selection list in a request attribute. The 
>> DSL has to generate a unique attribute name (to avoid conflicts). 
>> When generating the SAX fragment, either the attribute is empty and 
>> the pipeline is called, caching on the fly, or the attribute exists 
>> and can be used immediately. The cached value is then automatically 
>> trashed at the end of the request processing.
>
>
> "Polluting" the request is always something to be avoided as far as 
> possible IMO.


I don't see why you consider it as pollution. Request attributes do 
exist, and their purpose is to provide temporary storage for the various 
things that participate in processing a request and are isolated either 
in space (different classes) or time (same class called several times -- 
our case here).

> Would it not be possible to have just one buffering 
> SelectionListHandler per source and request (or form) with an internal 
> SAXBuffer?


What would this handler do? Associate a buffered selection-list content 
to a request? Then why not using request attributes?

Sylvain

-- 
Sylvain Wallez                        Anyware Technologies
http://apache.org/~sylvain            http://anyware-tech.com
Apache Software Foundation Member     Research & Technology Director

Re: [RT] Cforms selection-lists

Posted by Joerg Heinicke <jo...@gmx.de>.

On 28.05.2005 14:45, Sylvain Wallez wrote:

> So this must be handled entirely either in the pipeline or in the 
> selection-list.

+1

> You're right in saying that a pipeline querying a database cannot easily 
> be cached. Now we have in the scratchpad (and this should IMO be 
> promoted to core) a nice "cached:" protocol, which allows to cache any 
> URI for a given period of time. So your DSL can use the 
> "cached:cocoon://database-query?cocoon:cache-expires=1" to have your 
> expensive pipeline called only once in a single second.

This might lead to inconsistent forms as the form request processing is 
not synchronized with this caching.

> Caching in the selection-list itself can be done on a per-request basis 
> (thus ensuring the pipeline will be called only once per request), by 
> storing the selection list in a request attribute. The DSL has to 
> generate a unique attribute name (to avoid conflicts). When generating 
> the SAX fragment, either the attribute is empty and the pipeline is 
> called, caching on the fly, or the attribute exists and can be used 
> immediately. The cached value is then automatically trashed at the end 
> of the request processing.

"Polluting" the request is always something to be avoided as far as 
possible IMO. Would it not be possible to have just one buffering 
SelectionListHandler per source and request (or form) with an internal 
SAXBuffer?

Joerg

Re: [RT] Cforms selection-lists

Posted by Sylvain Wallez <sy...@apache.org>.

Antonio Gallardo wrote:

>Hi:
>
>Yesterday, I posted a simple RT on how to improve cforms dynamic selection
>list performance. After getting a huge responses and a lot interest from
>the cocoon dev list. :-)
>  
>

Sorry for the late answer.

Your use case is perfectly valid and must be taken into consideration. 
However, just as Joerg I have some reservations on the implementation ;-)

Caching the of the list data must not be triggered by the Field object, 
for several reasons:
- the repeater containing the field may not be its direct parent, but 
one of its ancestors if you have some enclosing group or union.
- each field in a repeater can programmatically be given a distinct 
selection-list (e.g. with event handlers), in which case triggering the 
test on the first repeater row will fail.

So this must be handled entirely either in the pipeline or in the 
selection-list.

You're right in saying that a pipeline querying a database cannot easily 
be cached. Now we have in the scratchpad (and this should IMO be 
promoted to core) a nice "cached:" protocol, which allows to cache any 
URI for a given period of time. So your DSL can use the 
"cached:cocoon://database-query?cocoon:cache-expires=1" to have your 
expensive pipeline called only once in a single second.

Caching in the selection-list itself can be done on a per-request basis 
(thus ensuring the pipeline will be called only once per request), by 
storing the selection list in a request attribute. The DSL has to 
generate a unique attribute name (to avoid conflicts). When generating 
the SAX fragment, either the attribute is empty and the pipeline is 
called, caching on the fly, or the attribute exists and can be used 
immediately. The cached value is then automatically trashed at the end 
of the request processing.

How does that sound?

Sylvain

-- 
Sylvain Wallez                        Anyware Technologies
http://apache.org/~sylvain            http://anyware-tech.com
Apache Software Foundation Member     Research & Technology Director