You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@chemistry.apache.org by Bindu Wavell <bw...@google.com> on 2014/03/19 20:14:35 UTC

Infinite loop in AtomEntryParser

Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is in
turn using openchemistry 0.8.0.

For the last few weeks we have run into a situation where the Alfresco
server becomes unresponsive. Specifically it doesn't crash it simply
becomes so overloaded that we can't login via Workdesk/CMIS. Using JConsole
and subsequently YourKit we can see that the server runs fine and uses
between 3-6GB of RAM with nice slow growth and cleanup. Then at some point
we see a huge amount of memory allocation and deallocation between 3GB and
15GB every couple of seconds and this occurs forever until we restart the
server. The parallel GC cleaning up this memory consumes so much CPU that
folks can't actually use the system.

To begin with we took a stackshot every 10 seconds for a minute. We saw
that there were 5 opencmis threads "frozen" while parsing an AtomPub
document. They were not deadlocked. Subsequently we found the same thing
using YourKit, it calls these frozen threads. Following is an example stack
trace from YourKit:

http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
com.ctc.wstx.sr.StreamScanner.getNext()
com.ctc.wstx.sr.BasicStreamReader.skipToken()
com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
com.ctc.wstx.sr.BasicStreamReader.next()
org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
CmisService, String, HttpServletRequest, HttpServletResponse)
sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
java.lang.reflect.Method.invoke(Object, Object[])
org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
String, CallContext, CmisService, String, HttpServletRequest,
HttpServletResponse)
org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
HttpServletRequest, HttpServletResponse)
org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
HttpServletResponse)
javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
ServletResponse)
org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
ServletResponse, FilterChain)
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
ServletResponse)
org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
ServletResponse)
org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
Response)
org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
org.apache.coyote.http11.Http11Processor.process(Socket)
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
java.lang.Thread.run()

By inspecting garbage collected objects we can see that this is almost all
string[] tied back to AtomEntryParser (and the woodstox parser stuff under
it.) If we look at one of these arrays we see the same stuff over and over
again... see a screenshot here of the first few elements in such an array:
http://i.imgur.com/RzNQBYG.png . There are a bunch of these VERY large
string arrays and they all look like they are looping over the front of an
AtomPub document.

I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser that a
CappedInputStream was added while parsing the AtomPub stream. This appears
to be limited to 10MB. Not sure if that is related to this issue. Or if
others have run into this issue. Any thought/guidance would be greatly
appreciated.

I'm bcc'ing Florian and Gab (the spy) but figured it was worth having this
public as I've been trying to track this issue down for a while and have
not found an answer.

Thanks all!

Re: Infinite loop in AtomEntryParser

Posted by Bindu Wavell <bw...@google.com>.

I'm relatively certain there is a "bug" in chemistry 0.7, I'm relatively
certain it is fixed in 0.10. The reason I put bug in double quotes is that
the actual bug is really a problem in either xerces or woodstox or
something else that interacts with xml under chemistry. But, this bug is
triggered by something chemistry is doing that I suspect is naughty.

In the skip() method in AtomEntryParser in 0.7 you'll see a while loop
calling next(). There is a try catch wrapping parser.next() and an
IOException wrapped in an XMLStreamException (yeah next in
BasicStreamReader in woodstox does that for read timed out, reaching end of
file before we are done parsing, etc) hits the catch block and is eaten and
breaks us out of the while loop in skip(). Then an extra next() is called
in skip(). This call to next() after the stream has thrown the IO exception
seems quite dubious. And with the xerces/woodstox included with Alfresco
4.1.3 calling next after this exception causes the stream to reset and we
start re-parsing the atom entry all over again.

In chemistry 0.10 (maybe earlier) the try catch is removed and the
exception bubbles up and the additional next() is not called meaning we
don't (at least in this case) end up in an infinite loop reading the same
stuff over and over again.

For what it's worth, we have a createDocument atompub document you can send
to alfresco 4.1.3 using netcat and trigger this issue. If this is
interesting to folks on the chemistry team I can provide the file and
instructions. As the issue appears to be resolved in newer versions of
chemistry, I'm guessing it's not that important, but do let me know if you
want the file. On the other hand, I think this can be used to trigger a
denial of service attack against any code using the chemistry 0.7 server
stuff, so might be worth groking and possibly notifying some folks...





On Thu, Mar 20, 2014 at 11:38 AM, Bindu Wavell <bw...@google.com> wrote:

> Thanks for your reply Florian. I too would like to be able to reproduce
> this issue, it would make things much easier to debug. Even having a live
> system experiencing the issue, it is quite difficult to get enough
> information from a profiler (can't attach a debugger to our live systems)
> to really understand in detail what the problem is.
>
> It seems to me that woodstox has an issue where next() is looping
> infinitely or we are producing an atompub document with a hell of a lot of
> atom:entry elements. I'm leaning towards the former. Partly because I am
> able to reproduce a different issue that seems to be related.
>
> I have a document where the document name contains some strange
> characters, when I upload this document via Alfresco Workdesk using their
> html5 drag-and-drop uploader the create atom pub document ends with
> <chemistry:filename>  i.e. the document is incomplete. This results in an
> invalid space exception from woodstox when alfresco/opencmis-server tries
> to parse the document.
>
> Here is the incomplete creation atompub document that is sent to alfresco:
>
> <?xml version='1.0' encoding='UTF-8'?><atom:entry xmlns:atom="
> http://www.w3.org/2005/Atom" xmlns:cmis="
> http://docs.oasis-open.org/ns/cmis/core/200908/" xmlns:cmisra="
> http://docs.oasis-open.org/ns/cmis/restatom/200908/" xmlns:chemistry="
> http://chemistry.apache.org/"><atom:id>urn:uuid:00000000-0000-0000-0000-00000000000</atom:id><atom:title>Exhibit
> 2.2(c) to Series F Disclosure Schedule
> </atom:title><atom:updated>2014-03-20T16:56:51Z</atom:updated><cmisra:content><cmisra:mediatype>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</cmisra:mediatype><chemistry:filename>
>
> Here is a zip containing the document with the strange characters in the
> name, note the zip may also have the strange characters in the name. I made
> the zip to protect the original filename as certain things like email
> clients will "fix" the name of an attachment...
>
>
> https://dl.dropboxusercontent.com/u/54062/forum%20attachments/Exhibit%202.2(c)%20to%20Series%20F%20Disclosure%20Schedule%20%E2%80%93%20Capitalization%20Table.xlsx.zip
>
> For what it's worth, I don't think that this particular use case triggers
> the memory explosion, but it is indicative of some problems with
> serializing and deserializing in
> wordkesk/alfresco/opencmis/woodstox/xerces...
>
> We are using Tomcat, I'll see if I can find an open network connection
> that is spewing a lot of data the next time the issue occurs (for what it's
> worth I looked in my YourKit dump and did not see that, but I'm not
> confident in that answer.)
>
> I'm not sure which timeouts you are referring to, we have timeouts
> configured for opencmis-client in workdesk, not sure if alfresco configures
> timeouts (I assume they do)... We have increased the timeouts for workdesk.
> We have the connection timeout set to 10000 and the readtimeout set to
> 60000. I'm not sure if there is a write timeout, but I don't see one in the
> config, so I assume it would use the default.
>
>
>
> On Thu, Mar 20, 2014 at 10:51 AM, Florian Müller <fm...@apache.org> wrote:
>
>> Hi Bindu,
>>
>> I've scanned the OpenCMIS and Woodstox code again, but I can't find the
>> cause of this issue. It's difficult without being able to reproduce this.
>> I can exclude at least that OpenCMIS parses the entry over and over
>> again. The incoming stream is directly read and parsed. There is (at least
>> for OpenCMIS) no way to jump back and parse it again.
>>
>> There might be some kind of IO issue (see the exceptions). What servlet
>> container do you use? Is there a proxy or firewall between the client and
>> the server?
>> In one of your emails I found this: "Frozen for at least 47m 15 sec".
>> Usually a connection times out when no bytes are transferred. Did you
>> disable the timeout on the client and the server? If not, the client must
>> send more and more data, but I don't know how. The OpenCMIS client doesn't
>> do that. Could you check if there is steady incoming network traffic when
>> the server is in that state?
>>
>> - Florian
>>
>>
>>
>>  I should add that I'm seeing 3.6 million of the following three
>>> exceptions
>>> (with associate stacks resulting in the exception):
>>>
>>> com.ctc.wstx.exc.WstxParsingException
>>>   BasicStreamReader.readPIPrimary()
>>>     BasicStreamReader.nextFromTree()
>>>       BasicStreamReader.next()
>>>         AtomEntryParser.next()
>>>           AtomEntryParser.skip()
>>>             AtomEntryParser.parseCmisContent()
>>>               AtomEntryParser.parseEntry()
>>>                 AtomEntryParser.parse()
>>>                   ObjectService.create()
>>>
>>> java.io.IOException
>>>   ChunkedInputFilter.doRead()
>>>     InternalInputBuffer.doRead()
>>>       Request.doRead()
>>>         InputBuffer.realReadBytes()
>>>           ByteChunk.substract()
>>>             InputBuffer.read()
>>>               CoyoteInputStream.read()
>>>                 UTF8Reader.loadMore()
>>>                   UTF8Reader.read()
>>>                     ReaderSource.readInto()
>>>                       BranchingReaderSource.readInto()
>>>                         StreamScanner.loadMore()
>>>
>>> com.ctc.wstx.exc.WstxIOException
>>>   BasicStreamReader.next()
>>>     AtomEntryParser.next()
>>>       AtomEntryParser.skip()
>>>         AtomEntryParser.parseCmisContent()
>>>           AtomEntryParser.parseEntry()
>>>             AtomEntryParser.parse()
>>>               ObjectService.create()
>>>
>>>
>>> Looking at the 3.2.4 version of com.ctc.wstx.sr.BasicStreamReader I see
>>> the
>>> following in readPIPrimary()
>>>
>>>
>>> if (target.equalsIgnoreCase
>>>
>>> <http://grepcode.com/file/repository.grepcode.com/java/
>>> root/jdk/openjdk/6-b27/java/lang/String.java#String.
>>> equalsIgnoreCase%28java.lang.String%29>("xml"))
>>>
>>> {
>>>
>>>     // 07-Oct-2005, TSa: Still legal in multi-doc mode...
>>>
>>>     if (!mConfig
>>>
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#
>>> StreamScanner.0mConfig>.inputParsingModeDocuments
>>>
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/api/ReaderConfig.java#
>>> ReaderConfig.inputParsingModeDocuments%28%29>())
>>> {
>>>
>>>         throwParseError
>>>
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#
>>> StreamScanner.throwParseError%28java.lang.String%2Cjava.lang.Object%29
>>> >(ErrorConsts
>>>
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/
>>> ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>.ERR_WF_PI_XML_TARGET
>>>
>>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/
>>> ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>,
>>>
>>> target);
>>>
>>>     }
>>>
>>>
>>> I also see next() catching IOException and throwing
>>> com.ctc.wstx.exc.WstxIOException.
>>>
>>> Does this mean that cmisra:content somehow contains another xml document
>>> with an xml root processing instruction? Since I'm not sure how to
>>> capture
>>> the source AtomPub document that is triggering this I find that I'm
>>> really
>>> grasping at straws, I hope someone with more experience in the code base
>>> can shed some light :)
>>>
>>> Thanks for reading through my ramblings...
>>>
>>>
>>>
>>> On Wed, Mar 19, 2014 at 8:33 PM, Bindu Wavell <bw...@google.com>
>>> wrote:
>>>
>>>  As I said we had a bunch of very large string arrays allocated and
>>>> collected... These huge arrays have the following list of strings over
>>>> and
>>>> again (as indicated in the linked picture):
>>>>
>>>> atom
>>>> entry
>>>> http://www.w3.org/2005/Atom
>>>> cmisra
>>>> content
>>>> http://docs.oasis-open.org/ns/cmis/restatom/200908/
>>>> chemistry
>>>> filename
>>>> http://chemistry.apache.org/
>>>>
>>>> Since Alfresco Workdesk 4.1.1.1 is using the openchemistry 0.8.0 AtomPub
>>>> client the chemistry:filename element is included in the cmisra:content
>>>> element. The parser in openchemistry 0.7.0 is trying to skip this
>>>> (because
>>>> it doesn't know about the chemistry:filename extension.) Notice that we
>>>> aren't seeing cmisra:mediatype or cmisra:base64 here.
>>>>
>>>> For some reason this is introducing a parsing loop where we re-parse the
>>>> atom:entry over and over again. I guess it's possible the client is
>>>> producing this repetition too...
>>>>
>>>>
>>>>
>>>> On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <bw...@google.com>
>>>> wrote:
>>>>
>>>>  I believe they are using 3.2.4, assuming this is the right
>>>>> jar: wstx-asl-3.2.4.jar
>>>>>
>>>>>
>>>>> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org>
>>>>> wrote:
>>>>>
>>>>>  Hi Bindu,
>>>>>>
>>>>>> I've never seen something like this. Do you know which Woodstox
>>>>>> version
>>>>>> is used?
>>>>>>
>>>>>> The CappedInputStream is something different. It prevents malicious
>>>>>> clients from sending an endless XML and consuming server resources.
>>>>>>
>>>>>>
>>>>>> - Florian
>>>>>>
>>>>>>
>>>>>>
>>>>>> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
>>>>>> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which
>>>>>> is
>>>>>> in
>>>>>> > turn using openchemistry 0.8.0.
>>>>>> >
>>>>>> > For the last few weeks we have run into a situation where the
>>>>>> Alfresco
>>>>>> > server becomes unresponsive. Specifically it doesn't crash it simply
>>>>>> > becomes so overloaded that we can't login via Workdesk/CMIS. Using
>>>>>> > JConsole and subsequently YourKit we can see that the server runs
>>>>>> fine
>>>>>> > and uses between 3-6GB of RAM with nice slow growth and cleanup.
>>>>>> Then
>>>>>> at
>>>>>> > some point we see a huge amount of memory allocation and
>>>>>> deallocation
>>>>>> > between 3GB and 15GB every couple of seconds and this occurs forever
>>>>>> > until we restart the server. The parallel GC cleaning up this memory
>>>>>> > consumes so much CPU that folks can't actually use the system.
>>>>>> >
>>>>>> > To begin with we took a stackshot every 10 seconds for a minute. We
>>>>>> saw
>>>>>> > that there were 5 opencmis threads "frozen" while parsing an AtomPub
>>>>>> > document. They were not deadlocked. Subsequently we found the same
>>>>>> thing
>>>>>> > using YourKit, it calls these frozen threads. Following is an
>>>>>> example
>>>>>> > stack trace from YourKit:
>>>>>> >
>>>>>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>>>>>> > com.ctc.wstx.sr.StreamScanner.getNext()
>>>>>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>>>>>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>>>>>> > com.ctc.wstx.sr.BasicStreamReader.next()
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> AtomEntryParser.next(XMLStreamReader)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> AtomEntryParser.skip(XMLStreamReader)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.
>>>>>> parseCmisContent(XMLStreamReader)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> AtomEntryParser.parseEntry(XMLStreamReader)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> AtomEntryParser.parse(InputStream)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> ObjectService.create(CallContext,
>>>>>> > CmisService, String, HttpServletRequest, HttpServletResponse)
>>>>>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object,
>>>>>> Object[])
>>>>>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>>>>>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
>>>>>> > java.lang.reflect.Method.invoke(Object, Object[])
>>>>>> > org.apache.chemistry.opencmis.server.shared.Dispatcher.
>>>>>> dispatch(String,
>>>>>> > String, CallContext, CmisService, String, HttpServletRequest,
>>>>>> > HttpServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> CmisAtomPubServlet.dispatch(CallContext,
>>>>>> > HttpServletRequest, HttpServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>>> CmisAtomPubServlet.service(HttpServletRequest,
>>>>>> > HttpServletResponse)
>>>>>> > javax.servlet.http.HttpServlet.service(ServletRequest,
>>>>>> ServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
>>>>>> ServletRequest,
>>>>>> > ServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.apache.catalina.core.ApplicationFilterChain.
>>>>>> doFilter(ServletRequest,
>>>>>> > ServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.alfresco.web.app.servlet.GlobalLocalizationFilter.
>>>>>> doFilter(ServletRequest,
>>>>>> > ServletResponse, FilterChain)
>>>>>> >
>>>>>>
>>>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
>>>>>> ServletRequest,
>>>>>> > ServletResponse)
>>>>>> >
>>>>>>
>>>>>> org.apache.catalina.core.ApplicationFilterChain.
>>>>>> doFilter(ServletRequest,
>>>>>> > ServletResponse)
>>>>>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request,
>>>>>> Response)
>>>>>> > org.apache.catalina.core.StandardContextValve.invoke(Request,
>>>>>> Response)
>>>>>> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
>>>>>> > Response)
>>>>>> > org.apache.catalina.core.StandardHostValve.invoke(Request,
>>>>>> Response)
>>>>>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request,
>>>>>> Response)
>>>>>> > org.apache.catalina.core.StandardEngineValve.invoke(Request,
>>>>>> Response)
>>>>>> > org.apache.catalina.connector.CoyoteAdapter.service(Request,
>>>>>> Response)
>>>>>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>>>>>> >
>>>>>>
>>>>>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.
>>>>>> process(Socket)
>>>>>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>>>>>> > java.lang.Thread.run()
>>>>>> >
>>>>>> > By inspecting garbage collected objects we can see that this is
>>>>>> almost
>>>>>> > all string[] tied back to AtomEntryParser (and the woodstox parser
>>>>>> stuff
>>>>>> > under it.) If we look at one of these arrays we see the same stuff
>>>>>> over
>>>>>> > and over again... see a screenshot here of the first few elements in
>>>>>> > such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch
>>>>>> of
>>>>>> > these VERY large string arrays and they all look like they are
>>>>>> looping
>>>>>> > over the front of an AtomPub document.
>>>>>> >
>>>>>> > I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
>>>>>> > that a CappedInputStream was added while parsing the AtomPub stream.
>>>>>> > This appears to be limited to 10MB. Not sure if that is related to
>>>>>> this
>>>>>> > issue. Or if others have run into this issue. Any thought/guidance
>>>>>> would
>>>>>> > be greatly appreciated.
>>>>>> >
>>>>>> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth
>>>>>> having
>>>>>> > this public as I've been trying to track this issue down for a while
>>>>>> and
>>>>>> > have not found an answer.
>>>>>> >
>>>>>> > Thanks all!
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>
>

Re: Infinite loop in AtomEntryParser

Posted by Bindu Wavell <bw...@google.com>.

Thanks for your reply Florian. I too would like to be able to reproduce
this issue, it would make things much easier to debug. Even having a live
system experiencing the issue, it is quite difficult to get enough
information from a profiler (can't attach a debugger to our live systems)
to really understand in detail what the problem is.

It seems to me that woodstox has an issue where next() is looping
infinitely or we are producing an atompub document with a hell of a lot of
atom:entry elements. I'm leaning towards the former. Partly because I am
able to reproduce a different issue that seems to be related.

I have a document where the document name contains some strange characters,
when I upload this document via Alfresco Workdesk using their html5
drag-and-drop uploader the create atom pub document ends with
<chemistry:filename>  i.e. the document is incomplete. This results in an
invalid space exception from woodstox when alfresco/opencmis-server tries
to parse the document.

Here is the incomplete creation atompub document that is sent to alfresco:

<?xml version='1.0' encoding='UTF-8'?><atom:entry xmlns:atom="
http://www.w3.org/2005/Atom" xmlns:cmis="
http://docs.oasis-open.org/ns/cmis/core/200908/" xmlns:cmisra="
http://docs.oasis-open.org/ns/cmis/restatom/200908/" xmlns:chemistry="
http://chemistry.apache.org/"><atom:id>urn:uuid:00000000-0000-0000-0000-00000000000</atom:id><atom:title>Exhibit
2.2(c) to Series F Disclosure Schedule
</atom:title><atom:updated>2014-03-20T16:56:51Z</atom:updated><cmisra:content><cmisra:mediatype>application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</cmisra:mediatype><chemistry:filename>

Here is a zip containing the document with the strange characters in the
name, note the zip may also have the strange characters in the name. I made
the zip to protect the original filename as certain things like email
clients will "fix" the name of an attachment...

https://dl.dropboxusercontent.com/u/54062/forum%20attachments/Exhibit%202.2(c)%20to%20Series%20F%20Disclosure%20Schedule%20%E2%80%93%20Capitalization%20Table.xlsx.zip

For what it's worth, I don't think that this particular use case triggers
the memory explosion, but it is indicative of some problems with
serializing and deserializing in
wordkesk/alfresco/opencmis/woodstox/xerces...

We are using Tomcat, I'll see if I can find an open network connection that
is spewing a lot of data the next time the issue occurs (for what it's
worth I looked in my YourKit dump and did not see that, but I'm not
confident in that answer.)

I'm not sure which timeouts you are referring to, we have timeouts
configured for opencmis-client in workdesk, not sure if alfresco configures
timeouts (I assume they do)... We have increased the timeouts for workdesk.
We have the connection timeout set to 10000 and the readtimeout set to
60000. I'm not sure if there is a write timeout, but I don't see one in the
config, so I assume it would use the default.



On Thu, Mar 20, 2014 at 10:51 AM, Florian Müller <fm...@apache.org> wrote:

> Hi Bindu,
>
> I've scanned the OpenCMIS and Woodstox code again, but I can't find the
> cause of this issue. It's difficult without being able to reproduce this.
> I can exclude at least that OpenCMIS parses the entry over and over again.
> The incoming stream is directly read and parsed. There is (at least for
> OpenCMIS) no way to jump back and parse it again.
>
> There might be some kind of IO issue (see the exceptions). What servlet
> container do you use? Is there a proxy or firewall between the client and
> the server?
> In one of your emails I found this: "Frozen for at least 47m 15 sec".
> Usually a connection times out when no bytes are transferred. Did you
> disable the timeout on the client and the server? If not, the client must
> send more and more data, but I don't know how. The OpenCMIS client doesn't
> do that. Could you check if there is steady incoming network traffic when
> the server is in that state?
>
> - Florian
>
>
>
>  I should add that I'm seeing 3.6 million of the following three exceptions
>> (with associate stacks resulting in the exception):
>>
>> com.ctc.wstx.exc.WstxParsingException
>>   BasicStreamReader.readPIPrimary()
>>     BasicStreamReader.nextFromTree()
>>       BasicStreamReader.next()
>>         AtomEntryParser.next()
>>           AtomEntryParser.skip()
>>             AtomEntryParser.parseCmisContent()
>>               AtomEntryParser.parseEntry()
>>                 AtomEntryParser.parse()
>>                   ObjectService.create()
>>
>> java.io.IOException
>>   ChunkedInputFilter.doRead()
>>     InternalInputBuffer.doRead()
>>       Request.doRead()
>>         InputBuffer.realReadBytes()
>>           ByteChunk.substract()
>>             InputBuffer.read()
>>               CoyoteInputStream.read()
>>                 UTF8Reader.loadMore()
>>                   UTF8Reader.read()
>>                     ReaderSource.readInto()
>>                       BranchingReaderSource.readInto()
>>                         StreamScanner.loadMore()
>>
>> com.ctc.wstx.exc.WstxIOException
>>   BasicStreamReader.next()
>>     AtomEntryParser.next()
>>       AtomEntryParser.skip()
>>         AtomEntryParser.parseCmisContent()
>>           AtomEntryParser.parseEntry()
>>             AtomEntryParser.parse()
>>               ObjectService.create()
>>
>>
>> Looking at the 3.2.4 version of com.ctc.wstx.sr.BasicStreamReader I see
>> the
>> following in readPIPrimary()
>>
>>
>> if (target.equalsIgnoreCase
>>
>> <http://grepcode.com/file/repository.grepcode.com/java/
>> root/jdk/openjdk/6-b27/java/lang/String.java#String.
>> equalsIgnoreCase%28java.lang.String%29>("xml"))
>>
>> {
>>
>>     // 07-Oct-2005, TSa: Still legal in multi-doc mode...
>>
>>     if (!mConfig
>>
>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#
>> StreamScanner.0mConfig>.inputParsingModeDocuments
>>
>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/api/ReaderConfig.java#
>> ReaderConfig.inputParsingModeDocuments%28%29>())
>> {
>>
>>         throwParseError
>>
>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#
>> StreamScanner.throwParseError%28java.lang.String%2Cjava.lang.Object%29
>> >(ErrorConsts
>>
>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/
>> ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>.ERR_WF_PI_XML_TARGET
>>
>> <http://grepcode.com/file/repo1.maven.org/maven2/org.
>> codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/
>> ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>,
>>
>> target);
>>
>>     }
>>
>>
>> I also see next() catching IOException and throwing
>> com.ctc.wstx.exc.WstxIOException.
>>
>> Does this mean that cmisra:content somehow contains another xml document
>> with an xml root processing instruction? Since I'm not sure how to capture
>> the source AtomPub document that is triggering this I find that I'm really
>> grasping at straws, I hope someone with more experience in the code base
>> can shed some light :)
>>
>> Thanks for reading through my ramblings...
>>
>>
>>
>> On Wed, Mar 19, 2014 at 8:33 PM, Bindu Wavell <bw...@google.com> wrote:
>>
>>  As I said we had a bunch of very large string arrays allocated and
>>> collected... These huge arrays have the following list of strings over
>>> and
>>> again (as indicated in the linked picture):
>>>
>>> atom
>>> entry
>>> http://www.w3.org/2005/Atom
>>> cmisra
>>> content
>>> http://docs.oasis-open.org/ns/cmis/restatom/200908/
>>> chemistry
>>> filename
>>> http://chemistry.apache.org/
>>>
>>> Since Alfresco Workdesk 4.1.1.1 is using the openchemistry 0.8.0 AtomPub
>>> client the chemistry:filename element is included in the cmisra:content
>>> element. The parser in openchemistry 0.7.0 is trying to skip this
>>> (because
>>> it doesn't know about the chemistry:filename extension.) Notice that we
>>> aren't seeing cmisra:mediatype or cmisra:base64 here.
>>>
>>> For some reason this is introducing a parsing loop where we re-parse the
>>> atom:entry over and over again. I guess it's possible the client is
>>> producing this repetition too...
>>>
>>>
>>>
>>> On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <bw...@google.com>
>>> wrote:
>>>
>>>  I believe they are using 3.2.4, assuming this is the right
>>>> jar: wstx-asl-3.2.4.jar
>>>>
>>>>
>>>> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org>
>>>> wrote:
>>>>
>>>>  Hi Bindu,
>>>>>
>>>>> I've never seen something like this. Do you know which Woodstox version
>>>>> is used?
>>>>>
>>>>> The CappedInputStream is something different. It prevents malicious
>>>>> clients from sending an endless XML and consuming server resources.
>>>>>
>>>>>
>>>>> - Florian
>>>>>
>>>>>
>>>>>
>>>>> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
>>>>> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is
>>>>> in
>>>>> > turn using openchemistry 0.8.0.
>>>>> >
>>>>> > For the last few weeks we have run into a situation where the
>>>>> Alfresco
>>>>> > server becomes unresponsive. Specifically it doesn't crash it simply
>>>>> > becomes so overloaded that we can't login via Workdesk/CMIS. Using
>>>>> > JConsole and subsequently YourKit we can see that the server runs
>>>>> fine
>>>>> > and uses between 3-6GB of RAM with nice slow growth and cleanup. Then
>>>>> at
>>>>> > some point we see a huge amount of memory allocation and deallocation
>>>>> > between 3GB and 15GB every couple of seconds and this occurs forever
>>>>> > until we restart the server. The parallel GC cleaning up this memory
>>>>> > consumes so much CPU that folks can't actually use the system.
>>>>> >
>>>>> > To begin with we took a stackshot every 10 seconds for a minute. We
>>>>> saw
>>>>> > that there were 5 opencmis threads "frozen" while parsing an AtomPub
>>>>> > document. They were not deadlocked. Subsequently we found the same
>>>>> thing
>>>>> > using YourKit, it calls these frozen threads. Following is an example
>>>>> > stack trace from YourKit:
>>>>> >
>>>>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>>>>> > com.ctc.wstx.sr.StreamScanner.getNext()
>>>>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>>>>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>>>>> > com.ctc.wstx.sr.BasicStreamReader.next()
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> AtomEntryParser.next(XMLStreamReader)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> AtomEntryParser.skip(XMLStreamReader)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.
>>>>> parseCmisContent(XMLStreamReader)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> AtomEntryParser.parseEntry(XMLStreamReader)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> AtomEntryParser.parse(InputStream)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> ObjectService.create(CallContext,
>>>>> > CmisService, String, HttpServletRequest, HttpServletResponse)
>>>>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object,
>>>>> Object[])
>>>>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>>>>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
>>>>> > java.lang.reflect.Method.invoke(Object, Object[])
>>>>> > org.apache.chemistry.opencmis.server.shared.Dispatcher.
>>>>> dispatch(String,
>>>>> > String, CallContext, CmisService, String, HttpServletRequest,
>>>>> > HttpServletResponse)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> CmisAtomPubServlet.dispatch(CallContext,
>>>>> > HttpServletRequest, HttpServletResponse)
>>>>> >
>>>>>
>>>>> org.apache.chemistry.opencmis.server.impl.atompub.
>>>>> CmisAtomPubServlet.service(HttpServletRequest,
>>>>> > HttpServletResponse)
>>>>> > javax.servlet.http.HttpServlet.service(ServletRequest,
>>>>> ServletResponse)
>>>>> >
>>>>>
>>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
>>>>> ServletRequest,
>>>>> > ServletResponse)
>>>>> >
>>>>>
>>>>> org.apache.catalina.core.ApplicationFilterChain.
>>>>> doFilter(ServletRequest,
>>>>> > ServletResponse)
>>>>> >
>>>>>
>>>>> org.alfresco.web.app.servlet.GlobalLocalizationFilter.
>>>>> doFilter(ServletRequest,
>>>>> > ServletResponse, FilterChain)
>>>>> >
>>>>>
>>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
>>>>> ServletRequest,
>>>>> > ServletResponse)
>>>>> >
>>>>>
>>>>> org.apache.catalina.core.ApplicationFilterChain.
>>>>> doFilter(ServletRequest,
>>>>> > ServletResponse)
>>>>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request,
>>>>> Response)
>>>>> > org.apache.catalina.core.StandardContextValve.invoke(Request,
>>>>> Response)
>>>>> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
>>>>> > Response)
>>>>> > org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>>>>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request,
>>>>> Response)
>>>>> > org.apache.catalina.core.StandardEngineValve.invoke(Request,
>>>>> Response)
>>>>> > org.apache.catalina.connector.CoyoteAdapter.service(Request,
>>>>> Response)
>>>>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>>>>> >
>>>>>
>>>>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.
>>>>> process(Socket)
>>>>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>>>>> > java.lang.Thread.run()
>>>>> >
>>>>> > By inspecting garbage collected objects we can see that this is
>>>>> almost
>>>>> > all string[] tied back to AtomEntryParser (and the woodstox parser
>>>>> stuff
>>>>> > under it.) If we look at one of these arrays we see the same stuff
>>>>> over
>>>>> > and over again... see a screenshot here of the first few elements in
>>>>> > such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
>>>>> > these VERY large string arrays and they all look like they are
>>>>> looping
>>>>> > over the front of an AtomPub document.
>>>>> >
>>>>> > I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
>>>>> > that a CappedInputStream was added while parsing the AtomPub stream.
>>>>> > This appears to be limited to 10MB. Not sure if that is related to
>>>>> this
>>>>> > issue. Or if others have run into this issue. Any thought/guidance
>>>>> would
>>>>> > be greatly appreciated.
>>>>> >
>>>>> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
>>>>> > this public as I've been trying to track this issue down for a while
>>>>> and
>>>>> > have not found an answer.
>>>>> >
>>>>> > Thanks all!
>>>>>
>>>>>
>>>>
>>>>
>>>
>

Re: Infinite loop in AtomEntryParser

Posted by Florian Müller <fm...@apache.org>.

 Hi Bindu,

 I've scanned the OpenCMIS and Woodstox code again, but I can't find the 
 cause of this issue. It's difficult without being able to reproduce 
 this.
 I can exclude at least that OpenCMIS parses the entry over and over 
 again. The incoming stream is directly read and parsed. There is (at 
 least for OpenCMIS) no way to jump back and parse it again.

 There might be some kind of IO issue (see the exceptions). What servlet 
 container do you use? Is there a proxy or firewall between the client 
 and the server?
 In one of your emails I found this: "Frozen for at least 47m 15 sec". 
 Usually a connection times out when no bytes are transferred. Did you 
 disable the timeout on the client and the server? If not, the client 
 must send more and more data, but I don't know how. The OpenCMIS client 
 doesn't do that. Could you check if there is steady incoming network 
 traffic when the server is in that state?

 - Florian



> I should add that I'm seeing 3.6 million of the following three 
> exceptions
> (with associate stacks resulting in the exception):
>
> com.ctc.wstx.exc.WstxParsingException
>   BasicStreamReader.readPIPrimary()
>     BasicStreamReader.nextFromTree()
>       BasicStreamReader.next()
>         AtomEntryParser.next()
>           AtomEntryParser.skip()
>             AtomEntryParser.parseCmisContent()
>               AtomEntryParser.parseEntry()
>                 AtomEntryParser.parse()
>                   ObjectService.create()
>
> java.io.IOException
>   ChunkedInputFilter.doRead()
>     InternalInputBuffer.doRead()
>       Request.doRead()
>         InputBuffer.realReadBytes()
>           ByteChunk.substract()
>             InputBuffer.read()
>               CoyoteInputStream.read()
>                 UTF8Reader.loadMore()
>                   UTF8Reader.read()
>                     ReaderSource.readInto()
>                       BranchingReaderSource.readInto()
>                         StreamScanner.loadMore()
>
> com.ctc.wstx.exc.WstxIOException
>   BasicStreamReader.next()
>     AtomEntryParser.next()
>       AtomEntryParser.skip()
>         AtomEntryParser.parseCmisContent()
>           AtomEntryParser.parseEntry()
>             AtomEntryParser.parse()
>               ObjectService.create()
>
>
> Looking at the 3.2.4 version of com.ctc.wstx.sr.BasicStreamReader I 
> see the
> following in readPIPrimary()
>
>
> if (target.equalsIgnoreCase
> 
> <http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/lang/String.java#String.equalsIgnoreCase%28java.lang.String%29>("xml"))
> {
>
>     // 07-Oct-2005, TSa: Still legal in multi-doc mode...
>
>     if (!mConfig
> 
> <http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#StreamScanner.0mConfig>.inputParsingModeDocuments
> 
> <http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/api/ReaderConfig.java#ReaderConfig.inputParsingModeDocuments%28%29>())
> {
>
>         throwParseError
> 
> <http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#StreamScanner.throwParseError%28java.lang.String%2Cjava.lang.Object%29>(ErrorConsts
> 
> <http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>.ERR_WF_PI_XML_TARGET
> 
> <http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>,
> target);
>
>     }
>
>
> I also see next() catching IOException and throwing
> com.ctc.wstx.exc.WstxIOException.
>
> Does this mean that cmisra:content somehow contains another xml 
> document
> with an xml root processing instruction? Since I'm not sure how to 
> capture
> the source AtomPub document that is triggering this I find that I'm 
> really
> grasping at straws, I hope someone with more experience in the code 
> base
> can shed some light :)
>
> Thanks for reading through my ramblings...
>
>
>
> On Wed, Mar 19, 2014 at 8:33 PM, Bindu Wavell <bw...@google.com> 
> wrote:
>
>> As I said we had a bunch of very large string arrays allocated and
>> collected... These huge arrays have the following list of strings 
>> over and
>> again (as indicated in the linked picture):
>>
>> atom
>> entry
>> http://www.w3.org/2005/Atom
>> cmisra
>> content
>> http://docs.oasis-open.org/ns/cmis/restatom/200908/
>> chemistry
>> filename
>> http://chemistry.apache.org/
>>
>> Since Alfresco Workdesk 4.1.1.1 is using the openchemistry 0.8.0 
>> AtomPub
>> client the chemistry:filename element is included in the 
>> cmisra:content
>> element. The parser in openchemistry 0.7.0 is trying to skip this 
>> (because
>> it doesn't know about the chemistry:filename extension.) Notice that 
>> we
>> aren't seeing cmisra:mediatype or cmisra:base64 here.
>>
>> For some reason this is introducing a parsing loop where we re-parse 
>> the
>> atom:entry over and over again. I guess it's possible the client is
>> producing this repetition too...
>>
>>
>>
>> On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <bw...@google.com> 
>> wrote:
>>
>>> I believe they are using 3.2.4, assuming this is the right
>>> jar: wstx-asl-3.2.4.jar
>>>
>>>
>>> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org> 
>>> wrote:
>>>
>>>> Hi Bindu,
>>>>
>>>> I've never seen something like this. Do you know which Woodstox 
>>>> version
>>>> is used?
>>>>
>>>> The CappedInputStream is something different. It prevents 
>>>> malicious
>>>> clients from sending an endless XML and consuming server 
>>>> resources.
>>>>
>>>>
>>>> - Florian
>>>>
>>>>
>>>>
>>>> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
>>>> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 
>>>> which is
>>>> in
>>>> > turn using openchemistry 0.8.0.
>>>> >
>>>> > For the last few weeks we have run into a situation where the 
>>>> Alfresco
>>>> > server becomes unresponsive. Specifically it doesn't crash it 
>>>> simply
>>>> > becomes so overloaded that we can't login via Workdesk/CMIS. 
>>>> Using
>>>> > JConsole and subsequently YourKit we can see that the server 
>>>> runs fine
>>>> > and uses between 3-6GB of RAM with nice slow growth and cleanup. 
>>>> Then
>>>> at
>>>> > some point we see a huge amount of memory allocation and 
>>>> deallocation
>>>> > between 3GB and 15GB every couple of seconds and this occurs 
>>>> forever
>>>> > until we restart the server. The parallel GC cleaning up this 
>>>> memory
>>>> > consumes so much CPU that folks can't actually use the system.
>>>> >
>>>> > To begin with we took a stackshot every 10 seconds for a minute. 
>>>> We saw
>>>> > that there were 5 opencmis threads "frozen" while parsing an 
>>>> AtomPub
>>>> > document. They were not deadlocked. Subsequently we found the 
>>>> same
>>>> thing
>>>> > using YourKit, it calls these frozen threads. Following is an 
>>>> example
>>>> > stack trace from YourKit:
>>>> >
>>>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>>>> > com.ctc.wstx.sr.StreamScanner.getNext()
>>>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>>>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>>>> > com.ctc.wstx.sr.BasicStreamReader.next()
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
>>>> > CmisService, String, HttpServletRequest, HttpServletResponse)
>>>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, 
>>>> Object[])
>>>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>>>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, 
>>>> Object[])
>>>> > java.lang.reflect.Method.invoke(Object, Object[])
>>>> > 
>>>> org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
>>>> > String, CallContext, CmisService, String, HttpServletRequest,
>>>> > HttpServletResponse)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
>>>> > HttpServletRequest, HttpServletResponse)
>>>> >
>>>> 
>>>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
>>>> > HttpServletResponse)
>>>> > javax.servlet.http.HttpServlet.service(ServletRequest, 
>>>> ServletResponse)
>>>> >
>>>> 
>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>>>> > ServletResponse)
>>>> >
>>>> 
>>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>>>> > ServletResponse)
>>>> >
>>>> 
>>>> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
>>>> > ServletResponse, FilterChain)
>>>> >
>>>> 
>>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>>>> > ServletResponse)
>>>> >
>>>> 
>>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>>>> > ServletResponse)
>>>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request, 
>>>> Response)
>>>> > org.apache.catalina.core.StandardContextValve.invoke(Request, 
>>>> Response)
>>>> > 
>>>> org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
>>>> > Response)
>>>> > org.apache.catalina.core.StandardHostValve.invoke(Request, 
>>>> Response)
>>>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request, 
>>>> Response)
>>>> > org.apache.catalina.core.StandardEngineValve.invoke(Request, 
>>>> Response)
>>>> > org.apache.catalina.connector.CoyoteAdapter.service(Request, 
>>>> Response)
>>>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>>>> >
>>>> 
>>>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
>>>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>>>> > java.lang.Thread.run()
>>>> >
>>>> > By inspecting garbage collected objects we can see that this is 
>>>> almost
>>>> > all string[] tied back to AtomEntryParser (and the woodstox 
>>>> parser
>>>> stuff
>>>> > under it.) If we look at one of these arrays we see the same 
>>>> stuff over
>>>> > and over again... see a screenshot here of the first few 
>>>> elements in
>>>> > such an array: http://i.imgur.com/RzNQBYG.png . There are a 
>>>> bunch of
>>>> > these VERY large string arrays and they all look like they are 
>>>> looping
>>>> > over the front of an AtomPub document.
>>>> >
>>>> > I noticed in opencmis 0.9.0 in the implementation of 
>>>> AtomEntryParser
>>>> > that a CappedInputStream was added while parsing the AtomPub 
>>>> stream.
>>>> > This appears to be limited to 10MB. Not sure if that is related 
>>>> to this
>>>> > issue. Or if others have run into this issue. Any 
>>>> thought/guidance
>>>> would
>>>> > be greatly appreciated.
>>>> >
>>>> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth 
>>>> having
>>>> > this public as I've been trying to track this issue down for a 
>>>> while
>>>> and
>>>> > have not found an answer.
>>>> >
>>>> > Thanks all!
>>>>
>>>
>>>
>>

Re: Infinite loop in AtomEntryParser

Posted by Bindu Wavell <bw...@google.com>.

I should add that I'm seeing 3.6 million of the following three exceptions
(with associate stacks resulting in the exception):

com.ctc.wstx.exc.WstxParsingException
  BasicStreamReader.readPIPrimary()
    BasicStreamReader.nextFromTree()
      BasicStreamReader.next()
        AtomEntryParser.next()
          AtomEntryParser.skip()
            AtomEntryParser.parseCmisContent()
              AtomEntryParser.parseEntry()
                AtomEntryParser.parse()
                  ObjectService.create()

java.io.IOException
  ChunkedInputFilter.doRead()
    InternalInputBuffer.doRead()
      Request.doRead()
        InputBuffer.realReadBytes()
          ByteChunk.substract()
            InputBuffer.read()
              CoyoteInputStream.read()
                UTF8Reader.loadMore()
                  UTF8Reader.read()
                    ReaderSource.readInto()
                      BranchingReaderSource.readInto()
                        StreamScanner.loadMore()

com.ctc.wstx.exc.WstxIOException
  BasicStreamReader.next()
    AtomEntryParser.next()
      AtomEntryParser.skip()
        AtomEntryParser.parseCmisContent()
          AtomEntryParser.parseEntry()
            AtomEntryParser.parse()
              ObjectService.create()


Looking at the 3.2.4 version of com.ctc.wstx.sr.BasicStreamReader I see the
following in readPIPrimary()


if (target.equalsIgnoreCase
<http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/lang/String.java#String.equalsIgnoreCase%28java.lang.String%29>("xml"))
{

    // 07-Oct-2005, TSa: Still legal in multi-doc mode...

    if (!mConfig
<http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#StreamScanner.0mConfig>.inputParsingModeDocuments
<http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/api/ReaderConfig.java#ReaderConfig.inputParsingModeDocuments%28%29>())
{

        throwParseError
<http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/sr/StreamScanner.java#StreamScanner.throwParseError%28java.lang.String%2Cjava.lang.Object%29>(ErrorConsts
<http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>.ERR_WF_PI_XML_TARGET
<http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.woodstox/wstx-asl/3.2.4/com/ctc/wstx/cfg/ErrorConsts.java#ErrorConsts.0ERR_WF_PI_XML_TARGET>,
target);

    }


I also see next() catching IOException and throwing
com.ctc.wstx.exc.WstxIOException.

Does this mean that cmisra:content somehow contains another xml document
with an xml root processing instruction? Since I'm not sure how to capture
the source AtomPub document that is triggering this I find that I'm really
grasping at straws, I hope someone with more experience in the code base
can shed some light :)

Thanks for reading through my ramblings...



On Wed, Mar 19, 2014 at 8:33 PM, Bindu Wavell <bw...@google.com> wrote:

> As I said we had a bunch of very large string arrays allocated and
> collected... These huge arrays have the following list of strings over and
> again (as indicated in the linked picture):
>
> atom
> entry
> http://www.w3.org/2005/Atom
> cmisra
> content
> http://docs.oasis-open.org/ns/cmis/restatom/200908/
> chemistry
> filename
> http://chemistry.apache.org/
>
> Since Alfresco Workdesk 4.1.1.1 is using the openchemistry 0.8.0 AtomPub
> client the chemistry:filename element is included in the cmisra:content
> element. The parser in openchemistry 0.7.0 is trying to skip this (because
> it doesn't know about the chemistry:filename extension.) Notice that we
> aren't seeing cmisra:mediatype or cmisra:base64 here.
>
> For some reason this is introducing a parsing loop where we re-parse the
> atom:entry over and over again. I guess it's possible the client is
> producing this repetition too...
>
>
>
> On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <bw...@google.com> wrote:
>
>> I believe they are using 3.2.4, assuming this is the right
>> jar: wstx-asl-3.2.4.jar
>>
>>
>> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org> wrote:
>>
>>> Hi Bindu,
>>>
>>> I've never seen something like this. Do you know which Woodstox version
>>> is used?
>>>
>>> The CappedInputStream is something different. It prevents malicious
>>> clients from sending an endless XML and consuming server resources.
>>>
>>>
>>> - Florian
>>>
>>>
>>>
>>> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
>>> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is
>>> in
>>> > turn using openchemistry 0.8.0.
>>> >
>>> > For the last few weeks we have run into a situation where the Alfresco
>>> > server becomes unresponsive. Specifically it doesn't crash it simply
>>> > becomes so overloaded that we can't login via Workdesk/CMIS. Using
>>> > JConsole and subsequently YourKit we can see that the server runs fine
>>> > and uses between 3-6GB of RAM with nice slow growth and cleanup. Then
>>> at
>>> > some point we see a huge amount of memory allocation and deallocation
>>> > between 3GB and 15GB every couple of seconds and this occurs forever
>>> > until we restart the server. The parallel GC cleaning up this memory
>>> > consumes so much CPU that folks can't actually use the system.
>>> >
>>> > To begin with we took a stackshot every 10 seconds for a minute. We saw
>>> > that there were 5 opencmis threads "frozen" while parsing an AtomPub
>>> > document. They were not deadlocked. Subsequently we found the same
>>> thing
>>> > using YourKit, it calls these frozen threads. Following is an example
>>> > stack trace from YourKit:
>>> >
>>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>>> > com.ctc.wstx.sr.StreamScanner.getNext()
>>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>>> > com.ctc.wstx.sr.BasicStreamReader.next()
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
>>> > CmisService, String, HttpServletRequest, HttpServletResponse)
>>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
>>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
>>> > java.lang.reflect.Method.invoke(Object, Object[])
>>> > org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
>>> > String, CallContext, CmisService, String, HttpServletRequest,
>>> > HttpServletResponse)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
>>> > HttpServletRequest, HttpServletResponse)
>>> >
>>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
>>> > HttpServletResponse)
>>> > javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
>>> >
>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>>> > ServletResponse)
>>> >
>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>>> > ServletResponse)
>>> >
>>> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
>>> > ServletResponse, FilterChain)
>>> >
>>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>>> > ServletResponse)
>>> >
>>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>>> > ServletResponse)
>>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>>> > org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>>> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
>>> > Response)
>>> > org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>>> > org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>>> > org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>>> >
>>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
>>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>>> > java.lang.Thread.run()
>>> >
>>> > By inspecting garbage collected objects we can see that this is almost
>>> > all string[] tied back to AtomEntryParser (and the woodstox parser
>>> stuff
>>> > under it.) If we look at one of these arrays we see the same stuff over
>>> > and over again... see a screenshot here of the first few elements in
>>> > such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
>>> > these VERY large string arrays and they all look like they are looping
>>> > over the front of an AtomPub document.
>>> >
>>> > I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
>>> > that a CappedInputStream was added while parsing the AtomPub stream.
>>> > This appears to be limited to 10MB. Not sure if that is related to this
>>> > issue. Or if others have run into this issue. Any thought/guidance
>>> would
>>> > be greatly appreciated.
>>> >
>>> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
>>> > this public as I've been trying to track this issue down for a while
>>> and
>>> > have not found an answer.
>>> >
>>> > Thanks all!
>>>
>>
>>
>

Re: Infinite loop in AtomEntryParser

Posted by Bindu Wavell <bw...@google.com>.

As I said we had a bunch of very large string arrays allocated and
collected... These huge arrays have the following list of strings over and
again (as indicated in the linked picture):

atom
entry
http://www.w3.org/2005/Atom
cmisra
content
http://docs.oasis-open.org/ns/cmis/restatom/200908/
chemistry
filename
http://chemistry.apache.org/

Since Alfresco Workdesk 4.1.1.1 is using the openchemistry 0.8.0 AtomPub
client the chemistry:filename element is included in the cmisra:content
element. The parser in openchemistry 0.7.0 is trying to skip this (because
it doesn't know about the chemistry:filename extension.) Notice that we
aren't seeing cmisra:mediatype or cmisra:base64 here.

For some reason this is introducing a parsing loop where we re-parse the
atom:entry over and over again. I guess it's possible the client is
producing this repetition too...



On Wed, Mar 19, 2014 at 3:18 PM, Bindu Wavell <bw...@google.com> wrote:

> I believe they are using 3.2.4, assuming this is the right
> jar: wstx-asl-3.2.4.jar
>
>
> On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org> wrote:
>
>> Hi Bindu,
>>
>> I've never seen something like this. Do you know which Woodstox version
>> is used?
>>
>> The CappedInputStream is something different. It prevents malicious
>> clients from sending an endless XML and consuming server resources.
>>
>>
>> - Florian
>>
>>
>>
>> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
>> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is in
>> > turn using openchemistry 0.8.0.
>> >
>> > For the last few weeks we have run into a situation where the Alfresco
>> > server becomes unresponsive. Specifically it doesn't crash it simply
>> > becomes so overloaded that we can't login via Workdesk/CMIS. Using
>> > JConsole and subsequently YourKit we can see that the server runs fine
>> > and uses between 3-6GB of RAM with nice slow growth and cleanup. Then at
>> > some point we see a huge amount of memory allocation and deallocation
>> > between 3GB and 15GB every couple of seconds and this occurs forever
>> > until we restart the server. The parallel GC cleaning up this memory
>> > consumes so much CPU that folks can't actually use the system.
>> >
>> > To begin with we took a stackshot every 10 seconds for a minute. We saw
>> > that there were 5 opencmis threads "frozen" while parsing an AtomPub
>> > document. They were not deadlocked. Subsequently we found the same thing
>> > using YourKit, it calls these frozen threads. Following is an example
>> > stack trace from YourKit:
>> >
>> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
>> > com.ctc.wstx.sr.StreamScanner.getNext()
>> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
>> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
>> > com.ctc.wstx.sr.BasicStreamReader.next()
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
>> > CmisService, String, HttpServletRequest, HttpServletResponse)
>> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
>> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
>> > java.lang.reflect.Method.invoke(Object, Object[])
>> > org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
>> > String, CallContext, CmisService, String, HttpServletRequest,
>> > HttpServletResponse)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
>> > HttpServletRequest, HttpServletResponse)
>> >
>> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
>> > HttpServletResponse)
>> > javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
>> >
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>> > ServletResponse)
>> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>> > ServletResponse)
>> >
>> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
>> > ServletResponse, FilterChain)
>> >
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
>> > ServletResponse)
>> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
>> > ServletResponse)
>> > org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
>> > org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
>> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
>> > Response)
>> > org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
>> > org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
>> > org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
>> > org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
>> > org.apache.coyote.http11.Http11Processor.process(Socket)
>> >
>> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
>> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
>> > java.lang.Thread.run()
>> >
>> > By inspecting garbage collected objects we can see that this is almost
>> > all string[] tied back to AtomEntryParser (and the woodstox parser stuff
>> > under it.) If we look at one of these arrays we see the same stuff over
>> > and over again... see a screenshot here of the first few elements in
>> > such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
>> > these VERY large string arrays and they all look like they are looping
>> > over the front of an AtomPub document.
>> >
>> > I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
>> > that a CappedInputStream was added while parsing the AtomPub stream.
>> > This appears to be limited to 10MB. Not sure if that is related to this
>> > issue. Or if others have run into this issue. Any thought/guidance would
>> > be greatly appreciated.
>> >
>> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
>> > this public as I've been trying to track this issue down for a while and
>> > have not found an answer.
>> >
>> > Thanks all!
>>
>
>

Re: Infinite loop in AtomEntryParser

Posted by Bindu Wavell <bw...@google.com>.

I believe they are using 3.2.4, assuming this is the right
jar: wstx-asl-3.2.4.jar


On Wed, Mar 19, 2014 at 2:11 PM, Florian Müller <fm...@apache.org> wrote:

> Hi Bindu,
>
> I've never seen something like this. Do you know which Woodstox version
> is used?
>
> The CappedInputStream is something different. It prevents malicious
> clients from sending an endless XML and consuming server resources.
>
>
> - Florian
>
>
>
> > Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
> > openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is in
> > turn using openchemistry 0.8.0.
> >
> > For the last few weeks we have run into a situation where the Alfresco
> > server becomes unresponsive. Specifically it doesn't crash it simply
> > becomes so overloaded that we can't login via Workdesk/CMIS. Using
> > JConsole and subsequently YourKit we can see that the server runs fine
> > and uses between 3-6GB of RAM with nice slow growth and cleanup. Then at
> > some point we see a huge amount of memory allocation and deallocation
> > between 3GB and 15GB every couple of seconds and this occurs forever
> > until we restart the server. The parallel GC cleaning up this memory
> > consumes so much CPU that folks can't actually use the system.
> >
> > To begin with we took a stackshot every 10 seconds for a minute. We saw
> > that there were 5 opencmis threads "frozen" while parsing an AtomPub
> > document. They were not deadlocked. Subsequently we found the same thing
> > using YourKit, it calls these frozen threads. Following is an example
> > stack trace from YourKit:
> >
> > http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec
> > com.ctc.wstx.sr.StreamScanner.getNext()
> > com.ctc.wstx.sr.BasicStreamReader.skipToken()
> > com.ctc.wstx.sr.BasicStreamReader.nextFromTree()
> > com.ctc.wstx.sr.BasicStreamReader.next()
> >
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
> > CmisService, String, HttpServletRequest, HttpServletResponse)
> > sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[])
> > sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[])
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[])
> > java.lang.reflect.Method.invoke(Object, Object[])
> > org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
> > String, CallContext, CmisService, String, HttpServletRequest,
> > HttpServletResponse)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
> > HttpServletRequest, HttpServletResponse)
> >
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
> > HttpServletResponse)
> > javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse)
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
> > ServletResponse)
> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
> > ServletResponse)
> >
> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
> > ServletResponse, FilterChain)
> >
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
> > ServletResponse)
> > org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
> > ServletResponse)
> > org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response)
> > org.apache.catalina.core.StandardContextValve.invoke(Request, Response)
> > org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
> > Response)
> > org.apache.catalina.core.StandardHostValve.invoke(Request, Response)
> > org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response)
> > org.apache.catalina.core.StandardEngineValve.invoke(Request, Response)
> > org.apache.catalina.connector.CoyoteAdapter.service(Request, Response)
> > org.apache.coyote.http11.Http11Processor.process(Socket)
> >
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket)
> > org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
> > java.lang.Thread.run()
> >
> > By inspecting garbage collected objects we can see that this is almost
> > all string[] tied back to AtomEntryParser (and the woodstox parser stuff
> > under it.) If we look at one of these arrays we see the same stuff over
> > and over again... see a screenshot here of the first few elements in
> > such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
> > these VERY large string arrays and they all look like they are looping
> > over the front of an AtomPub document.
> >
> > I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
> > that a CappedInputStream was added while parsing the AtomPub stream.
> > This appears to be limited to 10MB. Not sure if that is related to this
> > issue. Or if others have run into this issue. Any thought/guidance would
> > be greatly appreciated.
> >
> > I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
> > this public as I've been trying to track this issue down for a while and
> > have not found an answer.
> >
> > Thanks all!
>

Re: Infinite loop in AtomEntryParser

Posted by Florian Müller <fm...@apache.org>.

Hi Bindu,

I've never seen something like this. Do you know which Woodstox version
is used?

The CappedInputStream is something different. It prevents malicious
clients from sending an endless XML and consuming server resources.


- Florian



> Our server is Alfresco Enterprise 4.1.3.0 which is in turn using
> openchemistry 0.7.0. Our client is Alfresco Workdesk 4.1.1.1 which is in
> turn using openchemistry 0.8.0.
> 
> For the last few weeks we have run into a situation where the Alfresco
> server becomes unresponsive. Specifically it doesn't crash it simply
> becomes so overloaded that we can't login via Workdesk/CMIS. Using
> JConsole and subsequently YourKit we can see that the server runs fine
> and uses between 3-6GB of RAM with nice slow growth and cleanup. Then at
> some point we see a huge amount of memory allocation and deallocation
> between 3GB and 15GB every couple of seconds and this occurs forever
> until we restart the server. The parallel GC cleaning up this memory
> consumes so much CPU that folks can't actually use the system.
> 
> To begin with we took a stackshot every 10 seconds for a minute. We saw
> that there were 5 opencmis threads "frozen" while parsing an AtomPub
> document. They were not deadlocked. Subsequently we found the same thing
> using YourKit, it calls these frozen threads. Following is an example
> stack trace from YourKit:
> 
> http-127.0.0.1-8080-2 <--- Frozen for at least 47m 15 sec 
> com.ctc.wstx.sr.StreamScanner.getNext() 
> com.ctc.wstx.sr.BasicStreamReader.skipToken() 
> com.ctc.wstx.sr.BasicStreamReader.nextFromTree() 
> com.ctc.wstx.sr.BasicStreamReader.next() 
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.next(XMLStreamReader) 
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.skip(XMLStreamReader) 
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseCmisContent(XMLStreamReader) 
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parseEntry(XMLStreamReader) 
> org.apache.chemistry.opencmis.server.impl.atompub.AtomEntryParser.parse(InputStream) 
> org.apache.chemistry.opencmis.server.impl.atompub.ObjectService.create(CallContext,
> CmisService, String, HttpServletRequest, HttpServletResponse) 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) 
> sun.reflect.NativeMethodAccessorImpl.invoke(Object, Object[]) 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(Object, Object[]) 
> java.lang.reflect.Method.invoke(Object, Object[]) 
> org.apache.chemistry.opencmis.server.shared.Dispatcher.dispatch(String,
> String, CallContext, CmisService, String, HttpServletRequest,
> HttpServletResponse) 
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.dispatch(CallContext,
> HttpServletRequest, HttpServletResponse) 
> org.apache.chemistry.opencmis.server.impl.atompub.CmisAtomPubServlet.service(HttpServletRequest,
> HttpServletResponse) 
> javax.servlet.http.HttpServlet.service(ServletRequest, ServletResponse) 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
> ServletResponse) 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
> ServletResponse) 
> org.alfresco.web.app.servlet.GlobalLocalizationFilter.doFilter(ServletRequest,
> ServletResponse, FilterChain) 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ServletRequest,
> ServletResponse) 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ServletRequest,
> ServletResponse) 
> org.apache.catalina.core.StandardWrapperValve.invoke(Request, Response) 
> org.apache.catalina.core.StandardContextValve.invoke(Request, Response) 
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(Request,
> Response) 
> org.apache.catalina.core.StandardHostValve.invoke(Request, Response) 
> org.apache.catalina.valves.ErrorReportValve.invoke(Request, Response) 
> org.apache.catalina.core.StandardEngineValve.invoke(Request, Response) 
> org.apache.catalina.connector.CoyoteAdapter.service(Request, Response) 
> org.apache.coyote.http11.Http11Processor.process(Socket) 
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Socket) 
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run() 
> java.lang.Thread.run()
> 
> By inspecting garbage collected objects we can see that this is almost
> all string[] tied back to AtomEntryParser (and the woodstox parser stuff
> under it.) If we look at one of these arrays we see the same stuff over
> and over again... see a screenshot here of the first few elements in
> such an array: http://i.imgur.com/RzNQBYG.png . There are a bunch of
> these VERY large string arrays and they all look like they are looping
> over the front of an AtomPub document.
> 
> I noticed in opencmis 0.9.0 in the implementation of AtomEntryParser
> that a CappedInputStream was added while parsing the AtomPub stream.
> This appears to be limited to 10MB. Not sure if that is related to this
> issue. Or if others have run into this issue. Any thought/guidance would
> be greatly appreciated.
> 
> I'm bcc'ing Florian and Gab (the spy) but figured it was worth having
> this public as I've been trying to track this issue down for a while and
> have not found an answer.
> 
> Thanks all!