You are viewing a plain text version of this content. The canonical link for it is here.
Posted to fop-dev@xmlgraphics.apache.org by Andreas L Delmelle <a_...@pandora.be> on 2007/07/25 00:51:11 UTC

More about the effects of the PropertyCache

Hi all


Been reading up a bit on concurrency and synchronization and the  
likes, and I'm in the meantime quite convinced that the strategy  
should indeed be altered to move away from static final Maps. Don't  
know if Richard is still listening in and has any opinions/ideas?

Without doing any measurements, one can't say for certain, but I'm  
guessing the risk of thread contention is reasonably high. OTOH, I  
also haven't measured how much time each thread would actually spend  
parsing properties. If that percentage is only marginal, then the  
risk would be not be immediately catastrophic, unless in case of a  
very large amount of concurrent threads where chances increase that  
two threads need the same Map at the exact same instance.

In the meantime, I'll see if I can think up some tests/timings and  
make them part of the build. Jeremias wants to measure this for the  
past evolution, and I'd like to make it easier for us to track it in  
the future. In the longer term, we could maybe set up a page with  
statistics for this on the website (?)

That said, Java 1.5 has added a neat util.concurrent package with,  
besides the famous Semaphore, for example, a ConcurrentHashMap  
implementation. Supposed to perform far better in case of heavy  
contention, since it never locks the entire underlying table. Tricky  
implementation to simulate under 1.4 though, AFAIU. :-(

The benefit remains, though, since in the above example, given the  
results Jeremias provided, the save on those different Property  
instances together can easily mean a couple MB of extra heap per thread.

I've been playing with trying to make the cache thread-local, but I  
did not immediately find a solution for situations like:
- NumberProperty.getLength() has no context at all, but requires  
access to the cache of FixedLengths. Seems like NumberProperty should  
get a reference to the document it belongs to --a direct reference to  
the Root?-- from where it could gain access to a document-global  
cache of properties... The increase due to the reference is still  
insignificant compared to the save on distinct instances, and it  
would make the caches guaranteed thread-safe (since FOP is sequential  
internally) without the need for synchronization.

On the other hand, as mentioned in Bugzilla 41044, I've been thinking  
about using the PropertyMakers to our advantage here. Make all  
Property constructors protected, so only the Makers can access them  
in their make() methods. Nowhere outside the properties package  
should any class ever instantiate a NumberProperty (note: not even  
the PropertyParser, that should use a NumberPropertyMaker to get to  
an instance)

FlyWeightFactories? Good idea! I think we have them already, only,  
they are not exactly configured yet to flyweight anything.

I'm going to make this a priority, together with doing something  
similar for CommonHyphenation and CommonFont (which could become a  
candidate if it were only decomposed in font-size + rest of the  
properties...).


Just so you know :)

Cheers

Andreas

Re: More about the effects of the PropertyCache

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Andreas L Delmelle wrote:
> The CommonBorderPaddingBackground is still a memory eater, but I'll have 
> to take a look at LengthRangeProperty and CondLengthProperty first for 
> that.

The trick is probably to split it BPBs in a set of
"specified data" which can be cached and the "resolved
data".

Another idea I pursued (without success) in 0.20: some
properties can only have a very limited set of values,
in particular text-decoration, keeps, and a few others,
which can probably implemented as singletons instead of
using a cache. For properties like hyphenation-ladder-count
which are likely to have only a few values from a potentially
unlimited domain, a mix of pre-fabricated singletons for the
most likely used values and a list in case another value
is specified, without pruning during rendering, might provide
the best balance between memory efficacy and performance.

J.Pietschmann

Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 28, 2007, at 17:21, Andreas L Delmelle wrote:

<snip what="proposal of a concurrent PropertyCache" />

FWIW: I just couldn't resist extending further upon the cache, and in  
the meantime I have locally implemented similar caching for  
CommonHyphenation (already proposed last week), CommonFont and  
KeepProperty.

For CommonFont, since font-size and font-size-adjust may require  
dynamic evaluation context, I split up the class internally, so  
CommonFont itself gets only three references: fontSize,  
fontSizeAdjust and a CachedCommonFont instance containing the rest of  
the properties.
If font-size and font-size-adjust are absolute numerics, then the  
whole CommonFont instance is also cached in turn. If not, then a  
separate instance will be created, but the six member variables for  
the cacheable props are traded for one member reference to a cached  
bundle of those props, so the non-cached instances that do get  
created should already be a lot smaller than they are now.

For KeepProperty, the trick was to use the cache, not at creation  
time (since it's a compound it is currently not possible to do so:  
see bugzilla 41044 for the explanation), but in the getKeep() method  
which is used by FObj.bind() of the different FO subclasses. The  
scope of the non-cached KeepProperty instances is thereby limited to  
the PropertyList. The FOs will always get references to cached Keeps.

Those few changes should already significantly alter the looks of a  
dump like the one we saw last week. More Blocks, TableCells etc.

The CommonBorderPaddingBackground is still a memory eater, but I'll  
have to take a look at LengthRangeProperty and CondLengthProperty  
first for that.

Unless any objections arise, I will commit the proposed changes  
tomorrow evening.

Now, it's time for me to start that Wiki-page with the results of the  
user-poll...



Later

Andreas

Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 25, 2007, at 00:51, Andreas L Delmelle wrote:

> Without doing any measurements, one can't say for certain, but I'm  
> guessing the risk of thread contention is reasonably high.

Already did a few measurements, albeit no real FOP-related tests, but  
purely focused on the concurrency behavior of PropertyCache itself.  
The results seem somewhat disconcerting, IIC.

The more threads you spawn, and run simultaneously, the higher the  
penalty per thread. For maximum effect, try spawning 100 threads on a  
multi-CPU machine, and compare the average duration of a single  
thread performing 100000-200000 fetches with performing the same task  
100 times in sequence.

But even for as little as 10 threads, there already seems to be a  
severe performance penalty.

As to that number, I must admit that I have no real idea on realistic  
figures for testing. Maybe 100 threads is a bit over the top...
OTOH, if you look at the original profile Jeremias posted last week,  
we do see that PropertyCache.fetch() would be called roughly 100.000  
times for NumberProperty alone for the same 10 instances, for such a  
document. Very right to be concerned.

So, I got to thinking, and for that matter, more than only thinking...

<snip />

> That said, Java 1.5 has added a neat util.concurrent package with,  
> besides the famous Semaphore, for example, a ConcurrentHashMap  
> implementation. Supposed to perform far better in case of heavy  
> contention, since it never locks the entire underlying table.  
> Tricky implementation to simulate under 1.4 though, AFAIU. :-(

... and here's the catch:
Tricky, but certainly not impossible, so I found out. Cost me a bit  
of my time, but I hijacked some code from ConcurrentHashMap and  
HashMap to rewrite the PropertyCache as a 'WeakConcurrentHashMap',  
only, it's not really map. More a hashtable dedicated to storing only  
Property instances as keys. No need to store them also as (weak  
referents of) values, indeed!

Instead of making it a sort of proxy to a synchronizedMap, I went all  
the way down to the roots, gave the PropertyCache its own table,  
implemented get() and put() methods but made those only accessible to  
the fetch() method, etc.

Most tricky were in fact the cleaner thread and the rehash, which are  
implemented as follows:
If any put() operation pushes the number of entries in a segment  
above double the amount of segments, it spawns a cleaner thread that  
tries to clean up the WeakReferences in that segment first.
Obviously, since many of these cleaner threads can be spawned  
simultaneously, each one does a re-check under synch first to see if  
another already carried the load... if that is so, the thread  
immediately dies.
If that has no effect, rehash() is called, which blocks the entire  
cache. It first checks to see if more than half the amount of  
segments exceed this threshold. If that is the case, the amount of  
segments is doubled and the entries are redistributed. The larger the  
cache grows, the less likely it becomes that it is ever completely  
blocked, and at the same time, the cleaner threads always makes sure  
that obsolete references will be removed before increasing the size,  
so the cache should never grow unnecessarily large.

Almost didn't believe it myself at first, but if you compare the  
timings...
I almost suspect that I overlooked something, but everything I tested  
so far seems to be working. Definitely works for FOP.

Yes, it's an increased overhead in maintenance to 'roll your own',  
but still, if you look at the speed difference, that may be  
convincing enough.

No more words! The code: see attachments for the revised  
PropertyCache + a tester class. Try it out, compare with the minimal  
PropertyCache in current FOP trunk, and let me know if there's any  
objection to committing it before the release (even if only as a  
temporary measure, so we can at least offer the reduced footprint  
without too much of a penalty in concurrent runs)



Re: More about the effects of the PropertyCache

Posted by Manuel Mall <mm...@arcus.com.au>.
On Friday 03 August 2007 07:01, Andreas L Delmelle wrote:
> On Aug 1, 2007, at 10:06, Manuel Mall wrote:
> > On Wednesday 01 August 2007 15:45,
> > admin@rswheeldon.abelalways.co.uk
> >
> > What am I missing?
>
> You are not missing anything, AFAICT except that:
> ... FOP still aims for 1.4 compliance
> ... java.util.concurrent and like classes have not been implemented
> under 1.4 yet...
>

Andreas, you didn't quite answer my question. Why do we need a shared 
property cache and therefore synchronisation, etc.? Why can't we have a 
simple cache per FOP rendering run?

>
> Cheers
>
> Andreas

Manuel

Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Aug 1, 2007, at 10:06, Manuel Mall wrote:

> On Wednesday 01 August 2007 15:45, admin@rswheeldon.abelalways.co.uk
>
> What am I missing?

You are not missing anything, AFAICT except that:
... FOP still aims for 1.4 compliance
... java.util.concurrent and like classes have not been implemented  
under 1.4 yet...


Cheers

Andreas


Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Aug 9, 2007, at 19:00, Andreas L Delmelle wrote:

> On Aug 8, 2007, at 08:30, Manuel Mall wrote:
>
>> <snip />
>> Not quite my point. When you put your private implementation of an  
>> int
>> HashMap forward there were concerns raised about this as hard to
>> maintain and not really required. Now we have a custom implementation
>> of a concurrent hash map. Same concerns apply IMO.
>
> Concerns as Simple and Stupid as the cache IMO.

Before anyone (Manuel or Vincent, who opposed the idea of the IntMap  
in the first place) feels offended, let me clarify:
Why did you think I posted a proposal first, this time?

I do understand the concerns, and I agreed for the IntMap, since  
ultimately it was only used for a very small portion of FOP. The  
PropertyCache on the other hand, is a relatively crucial component as  
it heavily used (indirectly by the PropertyParser, for instance). As  
such, IMO, the added maintenance here is peanuts compared to the  
speed and memory benefits --provided that there is a heavy,  
noticeable performance penalty in real-time, and not only in the  
simulations I did.

That said, I did not read a veto anywhere... :o)


Cheers

Andreas

Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Aug 8, 2007, at 08:30, Manuel Mall wrote:

> <snip />
> Not quite my point. When you put your private implementation of an int
> HashMap forward there were concerns raised about this as hard to
> maintain and not really required. Now we have a custom implementation
> of a concurrent hash map. Same concerns apply IMO.

Concerns as Simple and Stupid as the cache IMO.

Cheers

Andreas

Re: More about the effects of the PropertyCache

Posted by Manuel Mall <mm...@arcus.com.au>.
On Wednesday 08 August 2007 04:02, Andreas L Delmelle wrote:
> On Aug 4, 2007, at 13:22, Vincent Hennebert wrote:
> > Manuel Mall a écrit :
> > <snip/>
> >
> >> I have been following this discussion with very little attempt to
> >> understand the intricate technical details of concurrent maps etc,
> >> but
> >> I am wondering why we don't apply the KISS principle here?
>
> Oh, but it is Simple and Stupid. :-)
> Much simpler and much more stupid than our layoutengine or hyphenator
> or property resolution code 

Not quite my point. When you put your private implementation of an int 
HashMap forward there were concerns raised about this as hard to 
maintain and not really required. Now we have a custom implementation 
of a concurrent hash map. Same concerns apply IMO.

<snip/>
>
> Cheers
>
> Andreas

Manuel

Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Aug 4, 2007, at 13:22, Vincent Hennebert wrote:

> Manuel Mall a écrit :
> <snip/>
>> I have been following this discussion with very little attempt to
>> understand the intricate technical details of concurrent maps etc,  
>> but
>> I am wondering why we don't apply the KISS principle here?

Oh, but it is Simple and Stupid. :-)
Much simpler and much more stupid than our layoutengine or hyphenator  
or property resolution code or...

The initial version I posted earlier was in fact so stupid it still  
contained a huge design flaw.
Lucky for me, nobody actually took a look at the code... Either that,  
or I should assume that a whole lot of people looked at it, but no  
one understood it. But that doesn't make it 'not simple'!

>> IIRC the original problem was that the FOP memory footprint for
>> rendering large documents was causing issues. One set of culprits  
>> that
>> were identified were the properties. Given that a FOP rendering  
>> run is
>> single threaded, i.e. there are no threads created within FOP, why
>> don't we start with a property cache per run?

We have already started (a couple of months ago)... with static final  
caches. An idea that seems to do the job nicely, although there are  
some reservations on the performance penalty because the cache needs  
to be synchronized in that case.

OTOH, Richard also raised the question, and to be honest, I don't  
have a clue for the moment:
What are realistic numbers to measure that penalty with?

I tried spawning 10 threads, and the penalty was already significant,  
but... This is supposing that all those 10 threads would need to  
access the /same/ cache at the exact /same/ instant.

If each thread spends less than 10% of the time parsing properties,  
the chances of that happening become very little.

>> No threading issues, no performance issues, and large gains in  
>> memory footprint reduction.

Indeed, but I think that implementing that would be much more  
difficult than the current solution. Much less adhering to the  
beloved KISS-principle than a Simple and Stupid home-made  
Hashtable... :-)

>> That will even benefit the memory footprint of concurrent FOP runs.
>> Admittedly not to the extend as globally shared cache would do,  
>> but it
>> would be much simpler and we can use the standard Java collection
>> classes to implement it.
>
> I’m afraid I must agree with Manuel here; although I’m not very  
> familiar
> with that whole area and I may well be also missing something.
>
> It seems to me that the main problem of FOP is that it isn’t able to
> render big documents, and that properties only play a part in that
> problem. It might be more useful to try and optimize the whole  
> rendering
> process, from which everyone will benefit, those running FOP on a  
> server
> as well as all others. That’s not the same kind of effort but it’s as
> much important IMHO.
>
> A cache per rendering run would do the thing, wouldn’t it? Coupled  
> with
> a flyweight factory for those properties with a small number of  
> possible
> values, which themselves could be shared among the different threads.

Yes, but see my comments above: to implement this, we would be  
introducing yet another set of classes, making the property  
resolution code even more complicated than it already is. It only / 
seems/ simpler...

> Also, maybe it’s worth keeping in mind that while that’s not currently
> the case, we want to eventually make the rendering process
> multi-threaded. Although the two issues might actually not interfere.

Well, if anything, the big PRO for a concurrent, thread-safe cache  
should precisely be the prospect of FOP multithreading *internally*.  
In that case, the current solution --possibly over time again backed  
by a standard 1.5 ConcurrentHashMap-- at least offers something which  
a rendering-run-local cache would lack.


Cheers

Andreas


Re: More about the effects of the PropertyCache

Posted by Vincent Hennebert <vi...@anyware-tech.com>.
Hi,

Manuel Mall a écrit :
<snip/>
> I have been following this discussion with very little attempt to
> understand the intricate technical details of concurrent maps etc, but
> I am wondering why we don't apply the KISS principle here?
>
> IIRC the original problem was that the FOP memory footprint for
> rendering large documents was causing issues. One set of culprits that
> were identified were the properties. Given that a FOP rendering run is
> single threaded, i.e. there are no threads created within FOP, why
> don't we start with a property cache per run? No threading issues, no
> performance issues, and large gains in memory footprint reduction.
>
> That will even benefit the memory footprint of concurrent FOP runs.
> Admittedly not to the extend as globally shared cache would do, but it
> would be much simpler and we can use the standard Java collection
> classes to implement it.

I’m afraid I must agree with Manuel here; although I’m not very familiar
with that whole area and I may well be also missing something.

It seems to me that the main problem of FOP is that it isn’t able to
render big documents, and that properties only play a part in that
problem. It might be more useful to try and optimize the whole rendering
process, from which everyone will benefit, those running FOP on a server
as well as all others. That’s not the same kind of effort but it’s as
much important IMHO.

A cache per rendering run would do the thing, wouldn’t it? Coupled with
a flyweight factory for those properties with a small number of possible
values, which themselves could be shared among the different threads.

Also, maybe it’s worth keeping in mind that while that’s not currently
the case, we want to eventually make the rendering process
multi-threaded. Although the two issues might actually not interfere.

WDYT?
Vincent


Re: More about the effects of the PropertyCache

Posted by Manuel Mall <mm...@arcus.com.au>.
On Wednesday 01 August 2007 15:45, admin@rswheeldon.abelalways.co.uk 
wrote:
> Andreas L Delmelle writes:
<snip/>
>  > I've been playing with trying to make the cache thread-local
>
> If in doubt, you could always make the caching multi-level.
> L1 is the current global cache. L2 is a thread-local equivalent
> of the same - all pointing to the same properties (if required)
> but with different access keys. The L2 cache in this case need
> have no contention at all,
>

I have been following this discussion with very little attempt to 
understand the intricate technical details of concurrent maps etc, but 
I am wondering why we don't apply the KISS principle here?

IIRC the original problem was that the FOP memory footprint for 
rendering large documents was causing issues. One set of culprits that 
were identified were the properties. Given that a FOP rendering run is 
single threaded, i.e. there are no threads created within FOP, why 
don't we start with a property cache per run? No threading issues, no 
performance issues, and large gains in memory footprint reduction.

That will even benefit the memory footprint of concurrent FOP runs.  
Admittedly not to the extend as globally shared cache would do, but it 
would be much simpler and we can use the standard Java collection 
classes to implement it.

What am I missing?

> Richard

Manuel

Re: More about the effects of the PropertyCache

Posted by ad...@rswheeldon.abelalways.co.uk.
Andreas L Delmelle writes:
 > Been reading up a bit on concurrency and synchronization and the  
 > likes, and I'm in the meantime quite convinced that the strategy  
 > should indeed be altered to move away from static final Maps. Don't  
 > know if Richard is still listening in and has any opinions/ideas?

I'm listening again. Been very busy for the last few weeks - moving
etc. I've missed most of the recent conversations so if there is
anything important, relevant and not covered on the 41044 messages
please let me know.

Back to the concurrency, I'm not sure entirely why this is an major
issue. As I understand it the generation of a single fop document is
mostly linear - i.e. single threaded.

Assuming we're dealing with the case of multiple documents
simultaneously (e.g. in a webapp), does anyone have usable numbers
on the number of possible threads?

 > I've been playing with trying to make the cache thread-local

If in doubt, you could always make the caching multi-level.
L1 is the current global cache. L2 is a thread-local equivalent
of the same - all pointing to the same properties (if required)
but with different access keys. The L2 cache in this case need
have no contention at all,

Richard


Re: More about the effects of the PropertyCache

Posted by Andreas L Delmelle <a_...@pandora.be>.
On Jul 25, 2007, at 00:51, Andreas L Delmelle wrote:

>
> Hi all

<snip />

One more idea before I go to bed:

> In the meantime, I'll see if I can think up some tests/timings and  
> make them part of the build. Jeremias wants to measure this for the  
> past evolution, and I'd like to make it easier for us to track it  
> in the future. In the longer term, we could maybe set up a page  
> with statistics for this on the website (?)

Would it make sense to build functionality into FOP to attach a  
chronometer to the process? An object, attached to ... the  
FOEventHandler, and can from there also be passed up the layout-tree.  
One that each type of object we consider to be critical can register  
timings on of its most important methods or something...?


Cheers

Andreas