Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2008/10/01 21:21:12 UTC

Another interesting potential speedup

Profiling certainly shows unusual places you'd never think to look :-)

This may be a bit of an anomaly - but we have a scaleout test for
uima-as, sending large numbers of CASes over the wire (but the test is
running in multiple JVMs on one machine - so there are no network
delays).  We're running this with essentially empty CASes - just to see
where other overhead is.

We expected that things like deserialization would not show up - because
the CASes were empty.  However, deserialization was the biggest time
consumer.  Looking into this, it turns out that (in our particular case)
90% of the time in deserialization was due to creating a new XML Reader
(the call: XMLReaderFactory.createXMLReader).  A quick search on the
internet turned up this link:
http://www.ibm.com/developerworks/xml/library/x-perfap2.html which
suggested this could indeed be a bottleneck, which could be avoided by
reusing the same XMLReader object, instead of throwing it away and
getting a new one on every call.
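
A minimal sketch of the reuse idea, using only the JDK's SAX classes. The
class name ReaderReuseSketch and the element-counting handler are
hypothetical stand-ins (not UIMA code), and the timing loop is only a
rough way to compare the two approaches on your own machine, not a
benchmark:

```java
import java.io.StringReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLReaderFactory;

class ReaderReuseSketch {

    /** Hypothetical stand-in for a real CAS content handler: counts elements. */
    static final class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    /** The expensive call discussed above. */
    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated on Java 9+
    static XMLReader newReader() {
        try {
            return XMLReaderFactory.createXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses one document with the given reader; the reader may be reused afterwards. */
    static int parseWith(XMLReader reader, String xml) {
        CountingHandler handler = new CountingHandler();
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.elements;
    }

    public static void main(String[] args) {
        String tinyDoc = "<cas/>"; // stands in for an essentially empty CAS
        int iterations = 2000;

        // Variant 1: a new XMLReader for every parse.
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            parseWith(newReader(), tinyDoc);
        }
        long fresh = System.nanoTime() - t0;

        // Variant 2: one XMLReader reused sequentially by a single thread.
        XMLReader shared = newReader();
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            parseWith(shared, tinyDoc);
        }
        long reused = System.nanoTime() - t1;

        System.out.printf("new reader per parse: %d ms%n", fresh / 1_000_000);
        System.out.printf("reused reader:        %d ms%n", reused / 1_000_000);
    }
}
```

Note that the reused reader is only touched by one thread here; sharing
it across threads needs extra care.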

This would take some work (pooling, etc.) to make things thread-safe,
but might be a good thing to do -- unless small but non-empty CASes turn
out to bottleneck in some other way that swamps this measurement.
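
One way the pooling could look, as a rough sketch only. XmlReaderPool is
a hypothetical name, the pool size is an assumption, and a blocking
queue is just one of several ways to hand a reader to exactly one thread
at a time:

```java
import java.io.StringReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class XmlReaderPool {

    // Idle readers; a reader is owned by one thread between take() and add().
    private final BlockingQueue<XMLReader> idle;

    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated on Java 9+
    XmlReaderPool(int size) {
        idle = new ArrayBlockingQueue<>(size);
        try {
            for (int i = 0; i < size; i++) {
                idle.add(XMLReaderFactory.createXMLReader()); // pay the cost once, up front
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Borrows a reader (blocking if none is idle), parses, and always returns it. */
    void parse(String xml, ContentHandler handler) {
        XMLReader reader;
        try {
            reader = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        try {
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            idle.add(reader); // hand the reader back even if parsing failed
        }
    }

    /** Number of readers currently not in use. */
    int idleCount() {
        return idle.size();
    }
}
```

Callers pass a fresh ContentHandler per document, so no parse state is
shared between threads other than the pooled readers themselves.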

This only applies to transports that use XML-style
serialization/deserialization, of course.

-Marshall

Re: Another interesting potential speedup

Posted by Adam Lally <al...@alum.rpi.edu>.

That sounds like a good find!  I think pooling might not actually be
necessary to use this in UIMA-AS.  If there are a fixed number of
listener threads that do deserialization, each can just create its own
instance of the XMLReader object once during initialization and then
reuse that one object multiple times.

I don't think the uima core has to change at all.  Just don't use the
static XmiCasDeserializer.deserialize methods, which internally create
XMLReaders.  If you look inside XmiCasDeserializer.deserialize you'll
see:

    XMLReader xmlReader = XMLReaderFactory.createXMLReader();
    XmiCasDeserializer deser = new XmiCasDeserializer(aCAS.getTypeSystem());
    ContentHandler handler = deser.getXmiCasHandler(aCAS, aLenient,
        aSharedData, aMergePoint);
    xmlReader.setContentHandler(handler);
    xmlReader.parse(new InputSource(aStream));

You can do the same thing yourself in UIMA-AS, just moving the call to
XMLReaderFactory.createXMLReader to an initialization step.
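
As a rough sketch of the one-reader-per-thread idea, using only JDK
classes: PerThreadXmlReader is a hypothetical name, and using a
ThreadLocal is just one assumed way to give each listener thread its own
reader (an explicit field set during the listener's initialization would
work too). In UIMA-AS the handler argument would be the one returned by
deser.getXmiCasHandler(...) as in the snippet above:

```java
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicInteger;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

class PerThreadXmlReader {

    // Counts how many readers were actually created (for illustration only).
    private static final AtomicInteger CREATED = new AtomicInteger();

    // Each thread lazily gets its own reader on first use and keeps it.
    private static final ThreadLocal<XMLReader> READER =
            ThreadLocal.withInitial(PerThreadXmlReader::create);

    @SuppressWarnings("deprecation") // XMLReaderFactory is deprecated on Java 9+
    private static XMLReader create() {
        try {
            CREATED.incrementAndGet();
            return XMLReaderFactory.createXMLReader();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the stream with the calling thread's own reader. */
    static void parse(InputStream in, ContentHandler handler) {
        XMLReader reader = READER.get();
        reader.setContentHandler(handler); // new handler per document
        try {
            reader.parse(new InputSource(in));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static int readersCreated() {
        return CREATED.get();
    }
}
```

With a fixed number of listener threads, the number of readers created
stays bounded by the number of threads, and no locking is needed because
a reader is never shared.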

  -Adam