Posted to dev@jspwiki.apache.org by Janne Jalkanen <Ja...@ecyrd.com> on 2009/04/12 11:52:59 UTC

Workflow fail

Heya!

I just realized something - the way that workflows currently work does  
not scale. The problem is that preSaveTask stashes all of the  
attributes of the page into memory (and assumes it can serialize them).

Now, in 3.0 the attribute map cannot exist, since e.g. page content  
will be an attribute. This means that attributes can span gigabytes  
(like with attachments), and you probably don't want those in memory.

Another problem is that copying all of the attributes and restoring  
them will probably cause all the attributes to be versioned again  
since we overwrite all attributes with the ones from memory.  That is,  
every single versioning will create complete copies of all attributes  
since we rewrite all of them each time...  Though this could very well  
be something that the repository does anyway, but at least we're not  
helping it.

This means that the whole workflow storage will need to be rethought a  
bit.  My current idea is that we just simply add a new Node in the  
repository:

/wiki:workflows/

and we add the workflow information into that repo as a series of  
Nodes. For example:

/wiki:workflows/<workflow-id>/<node-uuid>/<modified attributes>

This allows a couple of things to happen:

1) workflows are clustered automatically
2) we don't need serialization anymore, since we just copy JCR  
Properties back-n-forth
3) workflows are persisted automatically
4) workflows could in the future contain multiple objects
5) workflows can be exported and backed up together with the repo  
contents
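
To make the proposed layout a bit more concrete, here is a minimal Java sketch of the two ideas above: keep only the attributes that actually changed, and address each one under /wiki:workflows/<workflow-id>/<node-uuid>/. It deliberately uses plain maps and strings rather than the JCR API, so the class name, the method names and the String-valued attributes are all assumptions made for illustration, not existing JSPWiki code.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical illustration only; real attributes would be JCR Properties.
final class WorkflowLayoutSketch {
    static final String WORKFLOW_ROOT = "/wiki:workflows";

    /**
     * Returns the attributes of 'edited' that are new or differ from 'stored'.
     * Attributes that were removed in 'edited' are not tracked in this sketch.
     */
    static Map<String, String> modifiedAttributes(Map<String, String> stored,
                                                  Map<String, String> edited) {
        Map<String, String> changed = new TreeMap<>();
        for (Map.Entry<String, String> e : edited.entrySet()) {
            if (!Objects.equals(stored.get(e.getKey()), e.getValue())) {
                changed.put(e.getKey(), e.getValue());
            }
        }
        return changed;
    }

    /** Builds the path under which one modified attribute would be kept. */
    static String propertyPath(String workflowId, String nodeUuid, String attribute) {
        return WORKFLOW_ROOT + "/" + workflowId + "/" + nodeUuid + "/" + attribute;
    }

    public static void main(String[] args) {
        Map<String, String> stored = Map.of("title", "Main", "author", "janne");
        Map<String, String> edited = Map.of("title", "Main", "author", "andrew");
        // Only the changed attribute ends up under the workflow node.
        for (String name : modifiedAttributes(stored, edited).keySet()) {
            System.out.println(propertyPath("wf-42", "node-1", name));
        }
    }
}
```

Since only the changed attributes are written, the untouched ones (including large ones such as page content) never have to be loaded into memory or rewritten.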

How does this sound?  We do also want to create a canonical  
representation of a workflow object, and this might need a bit of  
design.  I'm not *that* familiar with the way it works, so some help  
might be needed here.

/Janne

Re: Workflow fail

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

Agreed. Any suggestions as to how the workflows should be stored as a set
of JCR Properties?

/Janne

On Apr 13, 2009, at 00:56 , Andrew Jaquith wrote:

> This makes sense. Using the repository for storing workflows is a  
> very good idea. In general, all workflows should be stored in the  
> repo... so that they are always persisted.
>
> We should also take a look at the two workflows we have now (page  
> saves and profile creation). Today, we create a workflow for every  
> page save and every profile creation. Even those that don't need  
> approvals have a "straight-through" (i.e., no decision) workflow.  
> This was easy to code, but pretty wasteful. For those no-decision  
> cases, just saving the page or creating the profile, without setting  
> up a workflow, would be better.
>
> Andrew

Re: Workflow fail

Posted by Andrew Jaquith <an...@gmail.com>.

This makes sense. Using the repository for storing workflows is a very  
good idea. In general, all workflows should be stored in the repo...  
so that they are always persisted.

We should also take a look at the two workflows we have now (page  
saves and profile creation). Today, we create a workflow for every  
page save and every profile creation. Even those that don't need  
approvals have a "straight-through" (i.e., no decision) workflow. This  
was easy to code, but pretty wasteful. For those no-decision cases,  
just saving the page or creating the profile, without setting up a  
workflow, would be better.
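
As a rough sketch of that shortcut (not the current JSPWiki code; the class, the method and the approvalRequired flag are all hypothetical names), the save path could branch before any workflow object is created:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of skipping the workflow in the no-decision case.
final class SaveDispatchSketch {
    /** Records what happened, standing in for real persistence and workflow code. */
    final List<String> log = new ArrayList<>();

    void requestSave(String pageName, boolean approvalRequired) {
        if (approvalRequired) {
            // Only this branch would create and persist a workflow.
            log.add("workflow queued for " + pageName);
        } else {
            // Straight-through case: save directly, no workflow at all.
            log.add("saved " + pageName);
        }
    }
}
```

The same check would apply to profile creation; how "approval required" is actually determined is left open here.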

Andrew
