Posted to dev@lenya.apache.org by Josias Thoeny <jo...@wyona.com> on 2005/08/26 16:35:38 UTC

Multiple document instantiation

Hi,

Just noticed that for each request for a document in lenya-trunk, the
DefaultDocument class gets instantiated about 300 times (put some debug
output into the constructor to see it).
Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
multiple document instantiation?
Or might that lead to problems, e.g. with synchronization if multiple
threads access the same document simultaneously?

Something similar happens with the metadata. The meta data file is
parsed about 20 times for each request. But probably if there was only
one document instance, this would not happen anymore.

Any comments?

Josias



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org


Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Andreas Hartmann wrote:
> Josias Thoeny wrote:
> 
>> Hi,
>>
>> Just noticed that for each request for a document in lenya-trunk, the
>> DefaultDocument class gets instantiated about 300 times (put some debug
>> output into the constructor to see it).
>> Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
>> multiple document instantiation?
>> Or might that lead to problems, e.g. with synchronization if multiple
>> threads access the same document simultaneously?
> 
> 
> I guess that the main reason for this behaviour is that the preconditions
> are verified for all usecases which are available in the menubar.
> 
> If we allow the menu to be shown during usecase execution (which is possible
> at the moment, for instance in the site area), each usecase has to create
> its own distinct identity map. If they used a shared identity map
> which is attached e.g. to the request, all usecases would see the changes
> which were made by the active usecase. This way, the precondition check
> could be skewed.

I just noticed that we have a major problem in the current codebase.

With the source access architecture, it is not possible to introduce this
separation. The node factory doesn't know which usecase it is accessed from
(active or inactive). So we can't use lenya:// to access sources, we have
to access the repository nodes of the appropriate identity map directly.

I'll try to find a way to solve this problem ...

-- Andreas




Re: Multiple document instantiation

Posted by Josias Thoeny <jo...@wyona.com>.
> > 
> > I think quite a large number of document instances are created in input
> > modules (and maybe other sitemap components).
> > Many input modules use the following code:
> > 
> > Document document = getEnvelope(objectModel, name).getDocument();
> > 
> > which 'often' creates a new instance of the document, because the
> > InputModules don't seem to be re-used.
> > 
> > My question is:
> > How is this related to repository sessions?
> 
> Actually the identity map is attached to the repository session, which means
> that already built documents should be taken from there.
> 
> > Should the input modules (or other sitemap components) use the
> > repository session attached to the request to access the document?
> 
> At the moment they do (because the LenyaSourceFactory uses this session).
> But in the future this probably has to be changed (see thread "Isolation of
> active / inactive usecases").

Yeah, since your below-mentioned commit they really use the session (I'm
talking about OperationModule.getDocumentIdentityMap() ->
RepositoryUtil.getSession()). 
I was just a little bit too quick with my mail :-)

> 
> BTW, I just committed some changes to the publication classes which seem
> to have increased the performance (from about 7 to about 4 seconds per
> authoring request on my machine). I hope this has removed some overhead.

Wow, I just had a look at it and it has drastically improved!
Thanks!
BTW, I didn't start this thread because of performance concerns, but
rather because I didn't understand the idea behind the identity map.
I think I learned something...

Josias

> 
> -- Andreas
> 
> 




Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Josias Thoeny wrote:
>>>>Just noticed that for each request for a document in lenya-trunk, the
>>>>DefaultDocument class gets instantiated about 300 times (put some debug
>>>>output into the constructor to see it).
>>>>Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
>>>>multiple document instantiation?
>>>>Or might that lead to problems, e.g. with synchronization if multiple
>>>>threads access the same document simultaneously?
>>>
>>>
>>>I guess that the main reason for this behaviour is that the preconditions
>>>are verified for all usecases which are available in the menubar.
>>
>>I found out that this is not true. There is a bug somewhere: documents seem
>>to be instantiated multiple times for the same repository session.
> 
> 
> I think quite a large number of document instances are created in input
> modules (and maybe other sitemap components).
> Many input modules use the following code:
> 
> Document document = getEnvelope(objectModel, name).getDocument();
> 
> which 'often' creates a new instance of the document, because the
> InputModules don't seem to be re-used.
> 
> My question is:
> How is this related to repository sessions?

Actually the identity map is attached to the repository session, which means
that already built documents should be taken from there.

> Should the input modules (or other sitemap components) use the
> repository session attached to the request to access the document?

At the moment they do (because the LenyaSourceFactory uses this session).
But in the future this probably has to be changed (see thread "Isolation of
active / inactive usecases").

BTW, I just committed some changes to the publication classes which seem
to have increased the performance (from about 7 to about 4 seconds per
authoring request on my machine). I hope this has removed some overhead.

-- Andreas




Re: Multiple document instantiation

Posted by Josias Thoeny <jo...@wyona.com>.
> >> Just noticed that for each request for a document in lenya-trunk, the
> >> DefaultDocument class gets instantiated about 300 times (put some debug
> >> output into the constructor to see it).
> >> Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
> >> multiple document instantiation?
> >> Or might that lead to problems, e.g. with synchronization if multiple
> >> threads access the same document simultaneously?
> > 
> > 
> > I guess that the main reason for this behaviour is that the preconditions
> > are verified for all usecases which are available in the menubar.
> 
> I found out that this is not true. There is a bug somewhere: documents seem
> to be instantiated multiple times for the same repository session.

I think quite a large number of document instances are created in input
modules (and maybe other sitemap components).
Many input modules use the following code:

Document document = getEnvelope(objectModel, name).getDocument();

which 'often' creates a new instance of the document, because the
InputModules don't seem to be re-used.

My question is:
How is this related to repository sessions?
Should the input modules (or other sitemap components) use the
repository session attached to the request to access the document?

thanks,
Josias


> 
> -- Andreas
> 
> 




Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Andreas Hartmann wrote:
> Josias Thoeny wrote:
> 
>> Hi,
>>
>> Just noticed that for each request for a document in lenya-trunk, the
>> DefaultDocument class gets instantiated about 300 times (put some debug
>> output into the constructor to see it).
>> Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
>> multiple document instantiation?
>> Or might that lead to problems, e.g. with synchronization if multiple
>> threads access the same document simultaneously?
> 
> 
> I guess that the main reason for this behaviour is that the preconditions
> are verified for all usecases which are available in the menubar.

I found out that this is not true. There is a bug somewhere: documents seem
to be instantiated multiple times for the same repository session.

-- Andreas




Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Josias Thoeny wrote:

[...]

> So each usecase has its own identity map in order to have a view of the
> content which is independent from the other usecases?

Yes. Actually the purpose is to isolate the documents in different
user sessions. The problem at hand occurs because we also have to
isolate active / non-active usecases of a single user. But maybe this
is too complex and we just shouldn't allow "inactive usecases" when a
usecase is active.


> Does that mean that two instances of the same document stored in two
> different identity maps may have different meta data, workflow, or
> actual document content? 

Yes. This is necessary to support transactions.


> Another question: Where in the code are documents added to the identity
> map? I expected this to happen in DefaultDocumentBuilder or
> DefaultDocument, but I wasn't able to figure it out.

IdentityMapImpl.get(IdentifiableFactory factory, String key)


         if (object == null) {
             try {
                 object = factory.build(this, key);
             } catch (Exception e) {
                 throw new RuntimeException(e);
             }
             map.put(key, object);
         }

This should be the single entry point for creating new Identifiables.
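
The excerpt above shows only the cache-miss branch. A minimal, self-contained sketch of the same pattern might look like the following. All names here (SketchIdentityMap, SketchFactory, SketchDocument) are hypothetical stand-ins, not Lenya's actual classes, and the sketch assumes that objects are keyed by a plain string and that each map is used by a single thread (it uses an unsynchronized HashMap).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a factory that builds objects on demand. */
interface SketchFactory<T> {
    T build(SketchIdentityMap map, String key) throws Exception;
}

/** Minimal identity map: at most one object per key within this map. */
class SketchIdentityMap {
    // Assumption: keyed by string only; not safe for concurrent use.
    private final Map<String, Object> map = new HashMap<>();

    @SuppressWarnings("unchecked")
    <T> T get(SketchFactory<T> factory, String key) {
        Object object = map.get(key);
        if (object == null) {
            // Cache miss: build once, then remember the instance.
            try {
                object = factory.build(this, key);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            map.put(key, object);
        }
        return (T) object;
    }
}

/** Hypothetical document that counts how often it is constructed. */
class SketchDocument {
    static int instances = 0;
    final String id;

    SketchDocument(String id) {
        this.id = id;
        instances++;
    }
}
```

With a factory such as `(m, k) -> new SketchDocument(k)`, two lookups of the same key on one map return the same instance, while a second map builds its own instance. That second case is the isolation between sessions described above.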


> Thanks in advance for an explanation. (I'm just trying to get a better
> understanding of the matter.)

I'm looking forward to your comments :)

BTW, hopefully we can get rid of this home-grown code by moving to
a JCR-only approach. But for me it is a good opportunity to learn.

-- Andreas




Re: Multiple document instantiation

Posted by Josias Thoeny <jo...@wyona.com>.
On Fri, 2005-08-26 at 16:48 +0200, Andreas Hartmann wrote:
> Josias Thoeny wrote:
> > Hi,
> > 
> > Just noticed that for each request for a document in lenya-trunk, the
> > DefaultDocument class gets instantiated about 300 times (put some debug
> > output into the constructor to see it).
> > Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
> > multiple document instantiation?
> > Or might that lead to problems, e.g. with synchronization if multiple
> > threads access the same document simultaneously?
> 
> I guess that the main reason for this behaviour is that the preconditions
> are verified for all usecases which are available in the menubar.
> 
> If we allow the menu to be shown during usecase execution (which is possible
> at the moment, for instance in the site area), each usecase has to create
> its own distinct identity map. If they used a shared identity map
> which is attached e.g. to the request, all usecases would see the changes
> which were made by the active usecase. This way, the precondition check
> could be skewed.
> 
> What we could try is to use a distinct identity map for the active usecase
> and a shared identity map for all inactive usecases (which should have no
> write access to avoid interference).

Looks like I do not completely understand the idea behind the
DocumentIdentityMap (yet).
So each usecase has its own identity map in order to have a view of the
content which is independent from the other usecases? 
Does that mean that two instances of the same document stored in two
different identity maps may have different meta data, workflow, or
actual document content? 

Another question: Where in the code are documents added to the identity
map? I expected this to happen in DefaultDocumentBuilder or
DefaultDocument, but I wasn't able to figure it out.

Thanks in advance for an explanation. (I'm just trying to get a better
understanding of the matter.)

  Josias


> 
> 
> > Something similar happens with the metadata. The meta data file is
> > parsed about 20 times for each request. But probably if there was only
> > one document instance, this would not happen anymore.
> 
> This is the same issue as above.
> 
> -- Andreas
> 
> 




Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Josias Thoeny wrote:
> Hi,
> 
> Just noticed that for each request for a document in lenya-trunk, the
> DefaultDocument class gets instantiated about 300 times (put some debug
> output into the constructor to see it).
> Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
> multiple document instantiation?
> Or might that lead to problems, e.g. with synchronization if multiple
> threads access the same document simultaneously?

I guess that the main reason for this behaviour is that the preconditions
are verified for all usecases which are available in the menubar.

If we allow the menu to be shown during usecase execution (which is possible
at the moment, for instance in the site area), each usecase has to create
its own distinct identity map. If they used a shared identity map
which is attached e.g. to the request, all usecases would see the changes
which were made by the active usecase. This way, the precondition check
could be skewed.

What we could try is to use a distinct identity map for the active usecase
and a shared identity map for all inactive usecases (which should have no
write access to avoid interference).
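
The proposal above could be sketched roughly as follows. This is only an illustration of the isolation idea, not Lenya code: UsecaseView and its factory methods are hypothetical names, and content is simplified to a map from string keys to string values. The active usecase gets a writable view with its own pending changes; inactive usecases share a read-only view that only ever sees the committed state.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical view of repository content as seen by one or more usecases. */
class UsecaseView {
    // Committed content; this sketch never modifies it.
    private final Map<String, String> repository;
    // Uncommitted changes, visible only through this view.
    private final Map<String, String> pending = new HashMap<>();
    private final boolean writable;

    private UsecaseView(Map<String, String> repository, boolean writable) {
        this.repository = repository;
        this.writable = writable;
    }

    /** Distinct, writable view for the active usecase. */
    static UsecaseView forActiveUsecase(Map<String, String> repository) {
        return new UsecaseView(repository, true);
    }

    /** Read-only view that all inactive usecases can share. */
    static UsecaseView sharedForInactiveUsecases(Map<String, String> repository) {
        return new UsecaseView(repository, false);
    }

    String read(String key) {
        // Pending changes of this view win over the committed state.
        return pending.containsKey(key) ? pending.get(key) : repository.get(key);
    }

    void write(String key, String value) {
        if (!writable) {
            throw new UnsupportedOperationException("inactive usecases have no write access");
        }
        pending.put(key, value);
    }
}
```

In this sketch a change written through the active view stays invisible to the shared view, so precondition checks of inactive usecases are evaluated against the committed content only.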


> Something similar happens with the metadata. The meta data file is
> parsed about 20 times for each request. But probably if there was only
> one document instance, this would not happen anymore.

This is the same issue as above.

-- Andreas




Re: Multiple document instantiation

Posted by Andreas Hartmann <an...@apache.org>.
Josias Thoeny wrote:
> Hi,
> 
> Just noticed that for each request for a document in lenya-trunk, the
> DefaultDocument class gets instantiated about 300 times (put some debug
> output into the constructor to see it).
> Could the DocumentBuilder somehow use the DocumentIdentityMap to prevent
> multiple document instantiation?
> Or might that lead to problems, e.g. with synchronization if multiple
> threads access the same document simultaneously?

The reason is that documents are built in

   DocumentIdentityMap.getKey(String webappUrl)

This is done because the DocumentBuilder is the only entity which knows
how a URL is mapped to a document. Maybe this can be optimized.
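
One conceivable optimization, sketched here under assumptions, is to remember the key computed for each URL so that the expensive URL-to-document mapping runs only once per URL. UrlKeyCache is a hypothetical class, and the `Function<String, String>` passed to it merely stands in for whatever the DocumentBuilder does; neither is part of Lenya's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache that maps webapp URLs to identity-map keys. */
class UrlKeyCache {
    // Stand-in for the builder's URL-to-key logic (assumption).
    private final Function<String, String> urlToKey;
    private final Map<String, String> cache = new HashMap<>();
    private int builderCalls = 0;

    UrlKeyCache(Function<String, String> urlToKey) {
        this.urlToKey = urlToKey;
    }

    String getKey(String webappUrl) {
        String key = cache.get(webappUrl);
        if (key == null) {
            // First request for this URL: ask the builder and remember the answer.
            builderCalls++;
            key = urlToKey.apply(webappUrl);
            cache.put(webappUrl, key);
        }
        return key;
    }

    /** Number of times the builder logic was actually invoked. */
    int builderCalls() {
        return builderCalls;
    }
}
```

Whether such a cache is safe depends on how long a URL-to-document mapping stays valid, which this sketch does not address.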

-- Andreas

