You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@synapse.apache.org by kimhorn <ki...@icsglobal.net> on 2009/03/05 03:34:14 UTC

VFS - Synapse Memory Leak

Have most basic VFS script (see below) that triggers when file written to
'IN' directory.
The file is 140,000K.
Heap set at Maximum for XP 1.5GB in Warpper.conf

I can drop the file in the 'IN' directory 3 times but then on 4th run Out of
Memory.

So java system Memory used is:
Waiting:     35,000K
File 1  :     850,000K
File 2  :    1450,000K
File 3  :    1480,000K  - must have hit max.
File 4  :    Out of Memory

It appears Synapse or VFS accumulates this data and won't give up the old
<text> data <text/>, in the payload. Not sure what is keeping this reference
and how to get around this.
Why does 140 MB file end up consuming 10 times as much memory ?


<definitions xmlns="http://ws.apache.org/ns/synapse">
  <proxy name="DoStuff" transports="vfs">
    <parameter name="transport.vfs.FileURI">file:///C:/test/in</parameter>
    <parameter name="transport.vfs.ContentType">text/plain</parameter>
    <parameter name="transport.vfs.FileNamePattern">.*\.edi</parameter>
    <parameter name="transport.PollInterval">60</parameter>
    <parameter
name="transport.vfs.MoveAfterProcess">file:///C:/test/original</parameter>
    <parameter
name="transport.vfs.MoveAfterFailure">file:///C:/test/original</parameter>
    <parameter name="transport.vfs.ActionAfterProcess">MOVE</parameter>
    <parameter name="transport.vfs.ActionAfterFailure">MOVE</parameter>
    <target>
      <inSequence>
        <send>
          <endpoint>
            <address uri="vfs:file:///C:/test/out"/>
          </endpoint>
        </send>
      </inSequence>
      <outSequence>
        <drop/>
      </outSequence>
    </target>
  </proxy>
</definitions>



2009-03-05 12:50:18,144 [192.168.0.204-icsws-kh] [WrapperSimpleAppMain] 
INFO ServerManager Ready for processing
2009-03-05 12:51:22,987 [-] [vfs-Worker-1]  INFO TimeoutHandler This engine
will expire all callbacks after : 86400 seconds, irrespective o
ut action, after the specified or optional timeout
Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
space
        at
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
        at
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
        at java.lang.StringBuffer.append(StringBuffer.java:307)
        at java.io.StringWriter.write(StringWriter.java:72)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
        at
org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
        at
org.apache.axis2.transport.TransportUtils.createDocumentElement(TransportUtils.java:164)
        at
org.apache.axis2.transport.TransportUtils.createSOAPMessage(TransportUtils.java:112)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.processFile(VFSTransportListener.java:433)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:241)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.onPoll(VFSTransportListener.java:145)
        at
org.apache.synapse.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:94)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Exception in thread "vfs-Worker-5" java.lang.OutOfMemoryError: Java heap
space
        at
java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
        at
java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
        at java.lang.StringBuffer.append(StringBuffer.java:307)
        at java.io.StringWriter.write(StringWriter.java:72)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
        at
org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
        at
org.apache.axis2.transport.TransportUtils.createDocumentElement(TransportUtils.java:164)
        at
org.apache.axis2.transport.TransportUtils.createSOAPMessage(TransportUtils.java:112)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.processFile(VFSTransportListener.java:433)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.scanFileOrDirectory(VFSTransportListener.java:241)
        at
org.apache.synapse.transport.vfs.VFSTransportListener.onPoll(VFSTransportListener.java:145)
        at
org.apache.synapse.transport.base.AbstractPollingTransportListener$1$1.run(AbstractPollingTransportListener.java:94)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)






-- 
View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22344176.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
I outlined the solution for this issue in [1]. It will take a few days
to get this implemented properly.

Andreas

[1] http://markmail.org/message/s7i5zewk5mudyxyl

On Thu, Mar 5, 2009 at 03:42, Asankha C. Perera <as...@apache.org> wrote:
>
>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
>> space
>>        at
>>
>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>        at
>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>        at java.lang.StringBuffer.append(StringBuffer.java:307)
>>        at java.io.StringWriter.write(StringWriter.java:72)
>>        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>        at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>        at
>>
>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>
>
> Since the content type is text, the plain text formatter is trying to use a
> String to parse as I see.. which is a problem for large content..
>
> A definite bug we need to fix ..
>
> cheers
> asankha
>
> --
> Asankha C. Perera
> AdroitLogic, http://adroitlogic.org
>
> http://esbmagic.blogspot.com
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Ruwan Linton <ru...@gmail.com>.
Can you report a JIRA issue on this in synapse JIRA?

https://issues.apache.org/jira/browse/SYNAPSE

Thanks,
Ruwan

On Thu, Mar 5, 2009 at 11:51 AM, kimhorn <ki...@icsglobal.net> wrote:

>
> Thats good; as this stops us using Synapse.
>
>
>
> Asankha C. Perera wrote:
> >
> >
> >> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
> >> space
> >>         at
> >>
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
> >>         at
> >> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
> >>         at java.lang.StringBuffer.append(StringBuffer.java:307)
> >>         at java.io.StringWriter.write(StringWriter.java:72)
> >>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
> >>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
> >>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
> >>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
> >>         at
> >>
> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
> >>
> > Since the content type is text, the plain text formatter is trying to
> > use a String to parse as I see.. which is a problem for large content..
> >
> > A definite bug we need to fix ..
> >
> > cheers
> > asankha
> >
> > --
> > Asankha C. Perera
> > AdroitLogic, http://adroitlogic.org
> >
> > http://esbmagic.blogspot.com
> >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> > For additional commands, e-mail: dev-help@synapse.apache.org
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>


-- 
Ruwan Linton
http://wso2.org - "Oxygenating the Web Services Platform"
http://ruwansblog.blogspot.com/

RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
Thanks; this totally stopped the bit of Synapse we actually got working :-)
Is this in todays trunk ?
Kim

-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Wednesday, 1 April 2009 6:29 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

I temporarily reverted the change that does the "doesn't support
synchronous responses" check in the VFS transport. I noticed that this
check also breaks sample 254. I will need to do some further
investigations on this.

Andreas

On Wed, Apr 1, 2009 at 04:14, Kim Horn <ki...@icsglobal.net> wrote:
>
> However now I cannot get any of our "real" scripts to work using Build 529. They all throw  VFS doesn;t support syncronous responces exceptions !! When I put in the <property> for OUT_ONLY they all stop half way.
>
> I just can win with Synapse. It seems impossible to get a real script to work.
>
> So looked through latest doc but could not see where is best to set this property. I placed it
> in various places below but gave up ...just doesn't work anymore ???
> Is this nested <out> approach to call 2 Web services broken by the OUT_ONLY ?
>
> These scripts have basic pattern:
>  <proxy name="FileCheckProxyMRS" transports="vfs">
>          ........parameters....
>    <target inSequence="inSequence" outSequence="outSequence"/>
>  </proxy>
>  <sequence name="inSequence">
>          ... create Web Service call-1 - get ID to submit with claim.
>    <send>
>      <endpoint>
>        <address uri="https://stage.thelma-us.com/WS/SOAP/WebService-1"/>
>      </endpoint>
>    </send>
>  </sequence>
>  <sequence name="outSequence">
>    <filter xmlns:ns="http://ws1.thelma.icsglobal.net" xpath="fn:count(//ns:submitHealthCareClaimBatchResponse) = 1">
>      <then sequence="processClaimResponse"/>
>      <else sequence="sendClaimToThelma"/>
>    </filter>
>  </sequence>
>  <sequence name="sendClaimToThelma">
>          ...creat Web Service call-2 to get claim
>    <send>
>      <endpoint>
>        <address uri="https://stage.thelma-us.com/WS/SOAP/GetClaimService"/>
>      </endpoint>
>    </send>
>  </sequence>
>  <sequence name="processClaimResponse">
>    <property ..name spaces.. name="claim997" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:response"/>
>    <property ..name spaces.. name="fName" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:timestamp"/>
>    <property name="transport.vfs.ReplyFileName" expression="fn:translate(fn:concat(get-property('fName'),'.997'),':+','--')" scope="transport"/>
>    <script language="js"><![CDATA[var claim997 = mc.getProperty("claim997").toString();
>               mc.setPayloadXML(<axis2ns1:text xmlns:axis2ns1="http://ws.apache.org/commons/ns/payload">{claim997}</axis2ns1:text>);
>      ]]></script>
>    <send>
>      <endpoint>
>        <address uri="vfs:file:///C:/synapse-prod/mrs/997"/>
>      </endpoint>
>    </send>
>  </sequence>
> </definitions>
>
>
>
>
>
>
>
>
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Thu 26/03/2009 21:45
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Exactly as expected... :-)
>
> Andreas
>
> On Wed, Mar 25, 2009 at 05:02, Kim Horn <ki...@icsglobal.net> wrote:
>> Hello Andreas,
>>
>> This all works really well. Streaming uses no memory at all.
>> Got a java mediator also streaming payloads and with massive files never
>> uses much more than 40K for Synapse.
>>
>> Using just the OUT_ONLY property set; uses much more memory but it stabilises
>> and does not grow.
>> Thanks.
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Friday, 20 March 2009 8:05 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Of course the memory allocated to a message will be freed once the
>> message has been processed. That is why it's important to set the
>> OUT_ONLY property: if it is not set correctly, Synapse will keep the
>> message context (with the payload) in a callback table to correlate it
>> with a future response (which in your case never comes in). Probably
>> there is something to improve here in Synapse:
>> - The VFS transport should trigger an error if there is a mismatch
>> between the message exchange pattern and the transport configuration
>> of the service (the transport.vfs.* parameters).
>> - Synapse should start issuing warnings when the number of entries in
>> the callback table reaches a certain threshold.
>>
>> Andreas
>>
>> On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
>>> Not really; I cannot see why memory should permanently grow when I pass the same file
>>> repeatedly through VFS. In theory this means VFS will always consume all the available memory
>>> given enough time and file iterations. Therefore VFS cannot be used in a production system.
>>> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
>>> I would assume the memory no longer required would be re-claimed. I would also assume
>>> The overhead was not 10 times the file size; seems excessive.
>>>
>>> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
>>> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
>>> but at a smaller rate aswell.
>>>
>>> Thanks
>>> Kim
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Friday, 20 March 2009 10:52 AM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> If N is the size of the file, the memory consumption caused by the
>>> transport is O(N) with transport.vfs.Streaming=false and O(1) with
>>> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
>>> methods in org.apache.axis2.format.ElementHelper are there to allow
>>> you to implement your mediator with O(1) memory usage, so that the
>>> overall memory consumption remains O(1). Does that answer your
>>> question?
>>>
>>> Andreas
>>>
>>> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>>>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>>>
>>>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>>>
>>>> I will try your extra parameters today though.
>>>>
>>>> Thanks
>>>> Kim
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>> Sent: Thursday, 19 March 2009 5:48 PM
>>>> To: dev@synapse.apache.org
>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>
>>>> Kim,
>>>>
>>>> Can you post your current synapse.xml as well as the stack trace you get now?
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>>>
>>>>> Using the last stable build from 15 March 2009 I still get exactly same
>>>>> behaviour as originally
>>>>> described with the above script. VFS still just dies. Would your fixes be in
>>>>> this ?
>>>>>
>>>>> Using the last st
>>>>>
>>>>> Andreas Veithen-2 wrote:
>>>>>>
>>>>>> I committed the code and it will be available in the next WS-Commons
>>>>>> transport build. The methods are located in
>>>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>>>> module.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>>>> Hello Andreas,
>>>>>>> This is great and really helps, have not had time to try it out but will
>>>>>>> soon.
>>>>>>>
>>>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>>>
>>>>>>> In the short term I am going to use a brute force approach that is now
>>>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>>>> destination may still be an issue ?
>>>>>>>
>>>>>>> Kim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>>>> To: dev@synapse.apache.org
>>>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>>>
>>>>>>> The changes I did in the VFS transport and the message builders for
>>>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>>>> out-of-the-box solution for your use case, but they are the
>>>>>>> prerequisite.
>>>>>>>
>>>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>>>> to a temporary file), I don't like this because it would create a
>>>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>>>> goal should be that the solution will still work if the file comes
>>>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>>>
>>>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>>>> this will require development of a custom mediator. This mediator
>>>>>>> would read the content, split it up (and store the chunks in memory or
>>>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>>>> the destination service.
>>>>>>>
>>>>>>> Note that it is probably not possible to implemented the mediator
>>>>>>> using a script because of the problematic String handling. Also,
>>>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>>>> I can contribute the required code to get the text content in the form
>>>>>>> of a java.io.Reader.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>>>
>>>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>>>> The main first issue on my list was the memory leak.
>>>>>>>> However, the real problem is once I get this massive files I �have to
>>>>>>>> send
>>>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>>>> It
>>>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>>>> as
>>>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>>>
>>>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>>>
>>>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>>>> through
>>>>>>>> input file and outputs smaller
>>>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>>>
>>>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>>>
>>>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>>>> process
>>>>>>>> the file by splitting it into many smaller files. These files then
>>>>>>>> trigger
>>>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>>>> Problem is that this loads whole message into memory.
>>>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>>>> (to
>>>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>>>> file
>>>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>>>> chunk
>>>>>>>> 23 chunks into the data.
>>>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>>>> (stream
>>>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>>>> document
>>>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>>>> Similar
>>>>>>>> to (2) but using another service rather than Java to split document.
>>>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>>>> (stream
>>>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>>>> with
>>>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>>>> Service. So
>>>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>>>
>>>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>>>> the
>>>>>>>> front end.
>>>>>>>>
>>>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>>>> result in time outs... So need to work in throttling.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ruwan Linton wrote:
>>>>>>>>>
>>>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>>>> stuff
>>>>>>>>> than trying to invent the wheel again :-)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ruwan
>>>>>>>>>
>>>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>>>> <an...@gmail.com>wrote:
>>>>>>>>>
>>>>>>>>>> Ruwan,
>>>>>>>>>>
>>>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>>>> :-)
>>>>>>>>>>
>>>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>>>
>>>>>>>>>> Andreas
>>>>>>>>>>
>>>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> > Andreas,
>>>>>>>>>> >
>>>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>>>> the
>>>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>>>> to
>>>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>>>> touched
>>>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>>>> lesser
>>>>>>>>>> > files.
>>>>>>>>>> >
>>>>>>>>>> > Thanks,
>>>>>>>>>> > Ruwan
>>>>>>>>>> >
>>>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>>>> > wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>>>> available
>>>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>>>> to
>>>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>>>> >> property to the proxy:
>>>>>>>>>> >>
>>>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>>>> >>
>>>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>>>> >> mediator:
>>>>>>>>>> >>
>>>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>>>> scope="axis2"/>
>>>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>>>> >>
>>>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>>>> the
>>>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>>>> in
>>>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>>>> only
>>>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>>>> the
>>>>>>>>>> >> <send> mediator.
>>>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>>>> several
>>>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>>>> >>
>>>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>>>> the
>>>>>>>>>> >> memory consumption is constant.
>>>>>>>>>> >>
>>>>>>>>>> >> Some additional comments:
>>>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>>>> SOAP
>>>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>>>> policies
>>>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>>>> no
>>>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>>>> file
>>>>>>>>>> >> system caching + streaming.
>>>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>>>> mediator
>>>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>>>> incoming
>>>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>>>> determine
>>>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>>>> >>
>>>>>>>>>> >> Andreas
>>>>>>>>>> >>
>>>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>>>> wrote:
>>>>>>>>>> >> >
>>>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>>>> Java
>>>>>>>>>> >> >>> heap
>>>>>>>>>> >> >>> space
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>>>> >> >>> � � � � at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>>>> >> >>> � � � � at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>>>> trying
>>>>>>>>>> to
>>>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>>>> content..
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> cheers
>>>>>>>>>> >> >> asankha
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> --
>>>>>>>>>> >> >> Asankha C. Perera
>>>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >
>>>>>>>>>> >> > --
>>>>>>>>>> >> > View this message in context:
>>>>>>>>>> >> >
>>>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >>
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > Ruwan Linton
>>>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Ruwan Linton
>>>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
I temporarily reverted the change that does the "doesn't support
synchronous responses" check in the VFS transport. I noticed that this
check also breaks sample 254. I will need to do some further
investigations on this.

Andreas

On Wed, Apr 1, 2009 at 04:14, Kim Horn <ki...@icsglobal.net> wrote:
>
> However now I cannot get any of our "real" scripts to work using Build 529. They all throw  VFS doesn;t support syncronous responces exceptions !! When I put in the <property> for OUT_ONLY they all stop half way.
>
> I just can win with Synapse. It seems impossible to get a real script to work.
>
> So looked through latest doc but could not see where is best to set this property. I placed it
> in various places below but gave up ...just doesn't work anymore ???
> Is this nested <out> approach to call 2 Web services broken by the OUT_ONLY ?
>
> These scripts have basic pattern:
>  <proxy name="FileCheckProxyMRS" transports="vfs">
>          ........parameters....
>    <target inSequence="inSequence" outSequence="outSequence"/>
>  </proxy>
>  <sequence name="inSequence">
>          ... create Web Service call-1 - get ID to submit with claim.
>    <send>
>      <endpoint>
>        <address uri="https://stage.thelma-us.com/WS/SOAP/WebService-1"/>
>      </endpoint>
>    </send>
>  </sequence>
>  <sequence name="outSequence">
>    <filter xmlns:ns="http://ws1.thelma.icsglobal.net" xpath="fn:count(//ns:submitHealthCareClaimBatchResponse) = 1">
>      <then sequence="processClaimResponse"/>
>      <else sequence="sendClaimToThelma"/>
>    </filter>
>  </sequence>
>  <sequence name="sendClaimToThelma">
>          ...creat Web Service call-2 to get claim
>    <send>
>      <endpoint>
>        <address uri="https://stage.thelma-us.com/WS/SOAP/GetClaimService"/>
>      </endpoint>
>    </send>
>  </sequence>
>  <sequence name="processClaimResponse">
>    <property ..name spaces.. name="claim997" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:response"/>
>    <property ..name spaces.. name="fName" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:timestamp"/>
>    <property name="transport.vfs.ReplyFileName" expression="fn:translate(fn:concat(get-property('fName'),'.997'),':+','--')" scope="transport"/>
>    <script language="js"><![CDATA[var claim997 = mc.getProperty("claim997").toString();
>               mc.setPayloadXML(<axis2ns1:text xmlns:axis2ns1="http://ws.apache.org/commons/ns/payload">{claim997}</axis2ns1:text>);
>      ]]></script>
>    <send>
>      <endpoint>
>        <address uri="vfs:file:///C:/synapse-prod/mrs/997"/>
>      </endpoint>
>    </send>
>  </sequence>
> </definitions>
>
>
>
>
>
>
>
>
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Thu 26/03/2009 21:45
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Exactly as expected... :-)
>
> Andreas
>
> On Wed, Mar 25, 2009 at 05:02, Kim Horn <ki...@icsglobal.net> wrote:
>> Hello Andreas,
>>
>> This all works really well. Streaming uses no memory at all.
>> Got a java mediator also streaming payloads and with massive files never
>> uses much more than 40K for Synapse.
>>
>> Using just the OUT_ONLY property set; uses much more memory but it stabilises
>> and does not grow.
>> Thanks.
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Friday, 20 March 2009 8:05 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Of course the memory allocated to a message will be freed once the
>> message has been processed. That is why it's important to set the
>> OUT_ONLY property: if it is not set correctly, Synapse will keep the
>> message context (with the payload) in a callback table to correlate it
>> with a future response (which in your case never comes in). Probably
>> there is something to improve here in Synapse:
>> - The VFS transport should trigger an error if there is a mismatch
>> between the message exchange pattern and the transport configuration
>> of the service (the transport.vfs.* parameters).
>> - Synapse should start issuing warnings when the number of entries in
>> the callback table reaches a certain threshold.
>>
>> Andreas
>>
>> On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
>>> Not really; I cannot see why memory should permanently grow when I pass the same file
>>> repeatedly through VFS. In theory this means VFS will always consume all the available memory
>>> given enough time and file iterations. Therefore VFS cannot be used in a production system.
>>> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
>>> I would assume the memory no longer required would be re-claimed. I would also assume
>>> The overhead was not 10 times the file size; seems excessive.
>>>
>>> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
>>> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
>>> but at a smaller rate aswell.
>>>
>>> Thanks
>>> Kim
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Friday, 20 March 2009 10:52 AM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> If N is the size of the file, the memory consumption caused by the
>>> transport is O(N) with transport.vfs.Streaming=false and O(1) with
>>> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
>>> methods in org.apache.axis2.format.ElementHelper are there to allow
>>> you to implement your mediator with O(1) memory usage, so that the
>>> overall memory consumption remains O(1). Does that answer your
>>> question?
>>>
>>> Andreas
>>>
>>> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>>>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>>>
>>>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>>>
>>>> I will try your extra parameters today though.
>>>>
>>>> Thanks
>>>> Kim
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>> Sent: Thursday, 19 March 2009 5:48 PM
>>>> To: dev@synapse.apache.org
>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>
>>>> Kim,
>>>>
>>>> Can you post your current synapse.xml as well as the stack trace you get now?
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>>>
>>>>> Using the last stable build from 15 March 2009 I still get exactly same
>>>>> behaviour as originally
>>>>> described with the above script. VFS still just dies. Would your fixes be in
>>>>> this ?
>>>>>
>>>>> Using the last st
>>>>>
>>>>> Andreas Veithen-2 wrote:
>>>>>>
>>>>>> I committed the code and it will be available in the next WS-Commons
>>>>>> transport build. The methods are located in
>>>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>>>> module.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>>>> Hello Andreas,
>>>>>>> This is great and really helps, have not had time to try it out but will
>>>>>>> soon.
>>>>>>>
>>>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>>>
>>>>>>> In the short term I am going to use a brute force approach that is now
>>>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>>>> destination may still be an issue ?
>>>>>>>
>>>>>>> Kim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>>>> To: dev@synapse.apache.org
>>>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>>>
>>>>>>> The changes I did in the VFS transport and the message builders for
>>>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>>>> out-of-the-box solution for your use case, but they are the
>>>>>>> prerequisite.
>>>>>>>
>>>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>>>> to a temporary file), I don't like this because it would create a
>>>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>>>> goal should be that the solution will still work if the file comes
>>>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>>>
>>>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>>>> this will require development of a custom mediator. This mediator
>>>>>>> would read the content, split it up (and store the chunks in memory or
>>>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>>>> the destination service.
>>>>>>>
>>>>>>> Note that it is probably not possible to implemented the mediator
>>>>>>> using a script because of the problematic String handling. Also,
>>>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>>>> I can contribute the required code to get the text content in the form
>>>>>>> of a java.io.Reader.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>>>
>>>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>>>> The main first issue on my list was the memory leak.
>>>>>>>> However, the real problem is once I get this massive files I �have to
>>>>>>>> send
>>>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>>>> It
>>>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>>>> as
>>>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>>>
>>>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>>>
>>>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>>>> through
>>>>>>>> input file and outputs smaller
>>>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>>>
>>>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>>>
>>>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>>>> process
>>>>>>>> the file by splitting it into many smaller files. These files then
>>>>>>>> trigger
>>>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>>>> Problem is that this loads whole message into memory.
>>>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>>>> (to
>>>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>>>> file
>>>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>>>> chunk
>>>>>>>> 23 chunks into the data.
>>>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>>>> (stream
>>>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>>>> document
>>>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>>>> Similar
>>>>>>>> to (2) but using another service rather than Java to split document.
>>>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>>>> (stream
>>>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>>>> with
>>>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>>>> Service. So
>>>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>>>
>>>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>>>> the
>>>>>>>> front end.
>>>>>>>>
>>>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>>>> result in time outs... So need to work in throttling.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ruwan Linton wrote:
>>>>>>>>>
>>>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>>>> stuff
>>>>>>>>> than trying to invent the wheel again :-)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ruwan
>>>>>>>>>
>>>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>>>> <an...@gmail.com>wrote:
>>>>>>>>>
>>>>>>>>>> Ruwan,
>>>>>>>>>>
>>>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>>>> :-)
>>>>>>>>>>
>>>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>>>
>>>>>>>>>> Andreas
>>>>>>>>>>
>>>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> > Andreas,
>>>>>>>>>> >
>>>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>>>> the
>>>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>>>> to
>>>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>>>> touched
>>>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>>>> lesser
>>>>>>>>>> > files.
>>>>>>>>>> >
>>>>>>>>>> > Thanks,
>>>>>>>>>> > Ruwan
>>>>>>>>>> >
>>>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>>>> > wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>>>> available
>>>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>>>> to
>>>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>>>> >> property to the proxy:
>>>>>>>>>> >>
>>>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>>>> >>
>>>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>>>> >> mediator:
>>>>>>>>>> >>
>>>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>>>> scope="axis2"/>
>>>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>>>> >>
>>>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>>>> the
>>>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>>>> in
>>>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>>>> only
>>>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>>>> the
>>>>>>>>>> >> <send> mediator.
>>>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>>>> several
>>>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>>>> >>
>>>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>>>> the
>>>>>>>>>> >> memory consumption is constant.
>>>>>>>>>> >>
>>>>>>>>>> >> Some additional comments:
>>>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>>>> SOAP
>>>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>>>> policies
>>>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>>>> no
>>>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>>>> file
>>>>>>>>>> >> system caching + streaming.
>>>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>>>> mediator
>>>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>>>> incoming
>>>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>>>> determine
>>>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>>>> >>
>>>>>>>>>> >> Andreas
>>>>>>>>>> >>
>>>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>>>> wrote:
>>>>>>>>>> >> >
>>>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>>>> Java
>>>>>>>>>> >> >>> heap
>>>>>>>>>> >> >>> space
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>>>> >> >>> � � � � at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>>>> >> >>> � � � � at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>>>> >> >>> � � � � at
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>>>> trying
>>>>>>>>>> to
>>>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>>>> content..
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> cheers
>>>>>>>>>> >> >> asankha
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> --
>>>>>>>>>> >> >> Asankha C. Perera
>>>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >
>>>>>>>>>> >> > --
>>>>>>>>>> >> > View this message in context:
>>>>>>>>>> >> >
>>>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>> >>
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > Ruwan Linton
>>>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Ruwan Linton
>>>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> View this message in context:
>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
However now I cannot get any of our "real" scripts to work using Build 529. They all throw  VFS doesn;t support syncronous responces exceptions !! When I put in the <property> for OUT_ONLY they all stop half way.

I just can win with Synapse. It seems impossible to get a real script to work.

So looked through latest doc but could not see where is best to set this property. I placed it
in various places below but gave up ...just doesn't work anymore ???
Is this nested <out> approach to call 2 Web services broken by the OUT_ONLY ?

These scripts have basic pattern:
  <proxy name="FileCheckProxyMRS" transports="vfs">
	  ........parameters....
    <target inSequence="inSequence" outSequence="outSequence"/>
  </proxy>
  <sequence name="inSequence">
	  ... create Web Service call-1 - get ID to submit with claim.
    <send>
      <endpoint>
        <address uri="https://stage.thelma-us.com/WS/SOAP/WebService-1"/>
      </endpoint>
    </send>
  </sequence>
  <sequence name="outSequence">
    <filter xmlns:ns="http://ws1.thelma.icsglobal.net" xpath="fn:count(//ns:submitHealthCareClaimBatchResponse) = 1">
      <then sequence="processClaimResponse"/>
      <else sequence="sendClaimToThelma"/>
    </filter>
  </sequence>
  <sequence name="sendClaimToThelma">
	  ...creat Web Service call-2 to get claim 
    <send>
      <endpoint>
        <address uri="https://stage.thelma-us.com/WS/SOAP/GetClaimService"/>
      </endpoint>
    </send>
  </sequence>
  <sequence name="processClaimResponse">
    <property ..name spaces.. name="claim997" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:response"/>
    <property ..name spaces.. name="fName" expression="//ns1:submitHealthCareClaimBatchResponse/ns1:out/ns2:timestamp"/>
    <property name="transport.vfs.ReplyFileName" expression="fn:translate(fn:concat(get-property('fName'),'.997'),':+','--')" scope="transport"/>
    <script language="js"><![CDATA[var claim997 = mc.getProperty("claim997").toString();
	       mc.setPayloadXML(<axis2ns1:text xmlns:axis2ns1="http://ws.apache.org/commons/ns/payload">{claim997}</axis2ns1:text>);
      ]]></script>
    <send>
      <endpoint>
        <address uri="vfs:file:///C:/synapse-prod/mrs/997"/>
      </endpoint>
    </send>
  </sequence>
</definitions>









-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
Sent: Thu 26/03/2009 21:45
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak
 
Exactly as expected... :-)

Andreas

On Wed, Mar 25, 2009 at 05:02, Kim Horn <ki...@icsglobal.net> wrote:
> Hello Andreas,
>
> This all works really well. Streaming uses no memory at all.
> Got a java mediator also streaming payloads and with massive files never
> uses much more than 40K for Synapse.
>
> Using just the OUT_ONLY property set; uses much more memory but it stabilises
> and does not grow.
> Thanks.
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 8:05 PM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Of course the memory allocated to a message will be freed once the
> message has been processed. That is why it's important to set the
> OUT_ONLY property: if it is not set correctly, Synapse will keep the
> message context (with the payload) in a callback table to correlate it
> with a future response (which in your case never comes in). Probably
> there is something to improve here in Synapse:
> - The VFS transport should trigger an error if there is a mismatch
> between the message exchange pattern and the transport configuration
> of the service (the transport.vfs.* parameters).
> - Synapse should start issuing warnings when the number of entries in
> the callback table reaches a certain threshold.
>
> Andreas
>
> On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
>> Not really; I cannot see why memory should permanently grow when I pass the same file
>> repeatedly through VFS. In theory this means VFS will always consume all the available memory
>> given enough time and file iterations. Therefore VFS cannot be used in a production system.
>> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
>> I would assume the memory no longer required would be re-claimed. I would also assume
>> The overhead was not 10 times the file size; seems excessive.
>>
>> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
>> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
>> but at a smaller rate aswell.
>>
>> Thanks
>> Kim
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Friday, 20 March 2009 10:52 AM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> If N is the size of the file, the memory consumption caused by the
>> transport is O(N) with transport.vfs.Streaming=false and O(1) with
>> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
>> methods in org.apache.axis2.format.ElementHelper are there to allow
>> you to implement your mediator with O(1) memory usage, so that the
>> overall memory consumption remains O(1). Does that answer your
>> question?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>>
>>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>>
>>> I will try your extra parameters today though.
>>>
>>> Thanks
>>> Kim
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Thursday, 19 March 2009 5:48 PM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> Kim,
>>>
>>> Can you post your current synapse.xml as well as the stack trace you get now?
>>>
>>> Andreas
>>>
>>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>>
>>>> Using the last stable build from 15 March 2009 I still get exactly same
>>>> behaviour as originally
>>>> described with the above script. VFS still just dies. Would your fixes be in
>>>> this ?
>>>>
>>>> Using the last st
>>>>
>>>> Andreas Veithen-2 wrote:
>>>>>
>>>>> I committed the code and it will be available in the next WS-Commons
>>>>> transport build. The methods are located in
>>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>>> module.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>>> Hello Andreas,
>>>>>> This is great and really helps, have not had time to try it out but will
>>>>>> soon.
>>>>>>
>>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>>
>>>>>> In the short term I am going to use a brute force approach that is now
>>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>>> destination may still be an issue ?
>>>>>>
>>>>>> Kim
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>>> To: dev@synapse.apache.org
>>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>>
>>>>>> The changes I did in the VFS transport and the message builders for
>>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>>> out-of-the-box solution for your use case, but they are the
>>>>>> prerequisite.
>>>>>>
>>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>>> to a temporary file), I don't like this because it would create a
>>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>>> goal should be that the solution will still work if the file comes
>>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>>
>>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>>> this will require development of a custom mediator. This mediator
>>>>>> would read the content, split it up (and store the chunks in memory or
>>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>>> the destination service.
>>>>>>
>>>>>> Note that it is probably not possible to implemented the mediator
>>>>>> using a script because of the problematic String handling. Also,
>>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>>> I can contribute the required code to get the text content in the form
>>>>>> of a java.io.Reader.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>>
>>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>>> The main first issue on my list was the memory leak.
>>>>>>> However, the real problem is once I get this massive files I �have to
>>>>>>> send
>>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>>> It
>>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>>> as
>>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>>
>>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>>
>>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>>> through
>>>>>>> input file and outputs smaller
>>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>>
>>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>>
>>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>>> process
>>>>>>> the file by splitting it into many smaller files. These files then
>>>>>>> trigger
>>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>>> Problem is that this loads whole message into memory.
>>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>>> (to
>>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>>> file
>>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>>> chunk
>>>>>>> 23 chunks into the data.
>>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>>> (stream
>>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>>> document
>>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>>> Similar
>>>>>>> to (2) but using another service rather than Java to split document.
>>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>>> (stream
>>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>>> with
>>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>>> Service. So
>>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>>
>>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>>> the
>>>>>>> front end.
>>>>>>>
>>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>>> result in time outs... So need to work in throttling.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ruwan Linton wrote:
>>>>>>>>
>>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>>> stuff
>>>>>>>> than trying to invent the wheel again :-)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ruwan
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>>> <an...@gmail.com>wrote:
>>>>>>>>
>>>>>>>>> Ruwan,
>>>>>>>>>
>>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>>> :-)
>>>>>>>>>
>>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>>
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> > Andreas,
>>>>>>>>> >
>>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>>> the
>>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>>> to
>>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>>> touched
>>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>>> lesser
>>>>>>>>> > files.
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > Ruwan
>>>>>>>>> >
>>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>>> > wrote:
>>>>>>>>> >>
>>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>>> available
>>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>>> to
>>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>>> >> property to the proxy:
>>>>>>>>> >>
>>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>>> >>
>>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>>> >> mediator:
>>>>>>>>> >>
>>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>>> scope="axis2"/>
>>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>>> >>
>>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>>> the
>>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>>> in
>>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>>> only
>>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>>> the
>>>>>>>>> >> <send> mediator.
>>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>>> several
>>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>>> >>
>>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>>> the
>>>>>>>>> >> memory consumption is constant.
>>>>>>>>> >>
>>>>>>>>> >> Some additional comments:
>>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>>> SOAP
>>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>>> policies
>>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>>> no
>>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>>> file
>>>>>>>>> >> system caching + streaming.
>>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>>> mediator
>>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>>> incoming
>>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>>> determine
>>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>>> >>
>>>>>>>>> >> Andreas
>>>>>>>>> >>
>>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>>> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>>> Java
>>>>>>>>> >> >>> heap
>>>>>>>>> >> >>> space
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> >> >>>
>>>>>>>>> >> >>>
>>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> >> >>>
>>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>>> >> >>> � � � � at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>>> >> >>> � � � � at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>>> >> >>> � � � � at
>>>>>>>>> >> >>>
>>>>>>>>> >> >>>
>>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>>> >> >>>
>>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>>> trying
>>>>>>>>> to
>>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>>> content..
>>>>>>>>> >> >>
>>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>>> >> >>
>>>>>>>>> >> >> cheers
>>>>>>>>> >> >> asankha
>>>>>>>>> >> >>
>>>>>>>>> >> >> --
>>>>>>>>> >> >> Asankha C. Perera
>>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>>> >> >>
>>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >
>>>>>>>>> >> > --
>>>>>>>>> >> > View this message in context:
>>>>>>>>> >> >
>>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > Ruwan Linton
>>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ruwan Linton
>>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org



Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
Exactly as expected... :-)

Andreas

On Wed, Mar 25, 2009 at 05:02, Kim Horn <ki...@icsglobal.net> wrote:
> Hello Andreas,
>
> This all works really well. Streaming uses no memory at all.
> Got a java mediator also streaming payloads and with massive files never
> uses much more than 40K for Synapse.
>
> Using just the OUT_ONLY property set; uses much more memory but it stabilises
> and does not grow.
> Thanks.
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 8:05 PM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Of course the memory allocated to a message will be freed once the
> message has been processed. That is why it's important to set the
> OUT_ONLY property: if it is not set correctly, Synapse will keep the
> message context (with the payload) in a callback table to correlate it
> with a future response (which in your case never comes in). Probably
> there is something to improve here in Synapse:
> - The VFS transport should trigger an error if there is a mismatch
> between the message exchange pattern and the transport configuration
> of the service (the transport.vfs.* parameters).
> - Synapse should start issuing warnings when the number of entries in
> the callback table reaches a certain threshold.
>
> Andreas
>
> On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
>> Not really; I cannot see why memory should permanently grow when I pass the same file
>> repeatedly through VFS. In theory this means VFS will always consume all the available memory
>> given enough time and file iterations. Therefore VFS cannot be used in a production system.
>> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
>> I would assume the memory no longer required would be re-claimed. I would also assume
>> The overhead was not 10 times the file size; seems excessive.
>>
>> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
>> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
>> but at a smaller rate aswell.
>>
>> Thanks
>> Kim
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Friday, 20 March 2009 10:52 AM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> If N is the size of the file, the memory consumption caused by the
>> transport is O(N) with transport.vfs.Streaming=false and O(1) with
>> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
>> methods in org.apache.axis2.format.ElementHelper are there to allow
>> you to implement your mediator with O(1) memory usage, so that the
>> overall memory consumption remains O(1). Does that answer your
>> question?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>>
>>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>>
>>> I will try your extra parameters today though.
>>>
>>> Thanks
>>> Kim
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Thursday, 19 March 2009 5:48 PM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> Kim,
>>>
>>> Can you post your current synapse.xml as well as the stack trace you get now?
>>>
>>> Andreas
>>>
>>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>>
>>>> Using the last stable build from 15 March 2009 I still get exactly same
>>>> behaviour as originally
>>>> described with the above script. VFS still just dies. Would your fixes be in
>>>> this ?
>>>>
>>>> Using the last st
>>>>
>>>> Andreas Veithen-2 wrote:
>>>>>
>>>>> I committed the code and it will be available in the next WS-Commons
>>>>> transport build. The methods are located in
>>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>>> module.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>>> Hello Andreas,
>>>>>> This is great and really helps, have not had time to try it out but will
>>>>>> soon.
>>>>>>
>>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>>
>>>>>> In the short term I am going to use a brute force approach that is now
>>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>>> destination may still be an issue ?
>>>>>>
>>>>>> Kim
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>>> To: dev@synapse.apache.org
>>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>>
>>>>>> The changes I did in the VFS transport and the message builders for
>>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>>> out-of-the-box solution for your use case, but they are the
>>>>>> prerequisite.
>>>>>>
>>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>>> to a temporary file), I don't like this because it would create a
>>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>>> goal should be that the solution will still work if the file comes
>>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>>
>>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>>> this will require development of a custom mediator. This mediator
>>>>>> would read the content, split it up (and store the chunks in memory or
>>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>>> the destination service.
>>>>>>
>>>>>> Note that it is probably not possible to implemented the mediator
>>>>>> using a script because of the problematic String handling. Also,
>>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>>> I can contribute the required code to get the text content in the form
>>>>>> of a java.io.Reader.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>>
>>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>>> The main first issue on my list was the memory leak.
>>>>>>> However, the real problem is once I get this massive files I  have to
>>>>>>> send
>>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>>> It
>>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>>> as
>>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>>
>>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>>
>>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>>> through
>>>>>>> input file and outputs smaller
>>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>>
>>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>>
>>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>>> process
>>>>>>> the file by splitting it into many smaller files. These files then
>>>>>>> trigger
>>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>>> Problem is that this loads whole message into memory.
>>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>>> (to
>>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>>> file
>>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>>> chunk
>>>>>>> 23 chunks into the data.
>>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>>> (stream
>>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>>> document
>>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>>> Similar
>>>>>>> to (2) but using another service rather than Java to split document.
>>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>>> (stream
>>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>>> with
>>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>>> Service. So
>>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>>
>>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>>> the
>>>>>>> front end.
>>>>>>>
>>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>>> result in time outs... So need to work in throttling.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ruwan Linton wrote:
>>>>>>>>
>>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>>> stuff
>>>>>>>> than trying to invent the wheel again :-)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ruwan
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>>> <an...@gmail.com>wrote:
>>>>>>>>
>>>>>>>>> Ruwan,
>>>>>>>>>
>>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>>> :-)
>>>>>>>>>
>>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>>
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> > Andreas,
>>>>>>>>> >
>>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>>> the
>>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>>> to
>>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>>> touched
>>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>>> lesser
>>>>>>>>> > files.
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > Ruwan
>>>>>>>>> >
>>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>>> > wrote:
>>>>>>>>> >>
>>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>>> available
>>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>>> to
>>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>>> >> property to the proxy:
>>>>>>>>> >>
>>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>>> >>
>>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>>> >> mediator:
>>>>>>>>> >>
>>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>>> scope="axis2"/>
>>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>>> >>
>>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>>> the
>>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>>> in
>>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>>> only
>>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>>> the
>>>>>>>>> >> <send> mediator.
>>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>>> several
>>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>>> >>
>>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>>> the
>>>>>>>>> >> memory consumption is constant.
>>>>>>>>> >>
>>>>>>>>> >> Some additional comments:
>>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>>> SOAP
>>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>>> policies
>>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>>> no
>>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>>> file
>>>>>>>>> >> system caching + streaming.
>>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>>> mediator
>>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>>> incoming
>>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>>> determine
>>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>>> >>
>>>>>>>>> >> Andreas
>>>>>>>>> >>
>>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>>> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>>> Java
>>>>>>>>> >> >>> heap
>>>>>>>>> >> >>> space
>>>>>>>>> >> >>>         at
>>>>>>>>> >> >>>
>>>>>>>>> >> >>>
>>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>>> >> >>>         at
>>>>>>>>> >> >>>
>>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>>> >> >>>         at
>>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>>> >> >>>         at
>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>>> >> >>>         at
>>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>>> >> >>>         at
>>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>>> >> >>>         at
>>>>>>>>> >> >>>
>>>>>>>>> >> >>>
>>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>>> >> >>>
>>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>>> trying
>>>>>>>>> to
>>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>>> content..
>>>>>>>>> >> >>
>>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>>> >> >>
>>>>>>>>> >> >> cheers
>>>>>>>>> >> >> asankha
>>>>>>>>> >> >>
>>>>>>>>> >> >> --
>>>>>>>>> >> >> Asankha C. Perera
>>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>>> >> >>
>>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >>
>>>>>>>>> >> >
>>>>>>>>> >> > --
>>>>>>>>> >> > View this message in context:
>>>>>>>>> >> >
>>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >> >
>>>>>>>>> >> >
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > --
>>>>>>>>> > Ruwan Linton
>>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ruwan Linton
>>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
Great, now I understand. Solves the problem. I never knew about the "callback Table", another one of these hidden features ? Also Need a way to flush this table manually. How do you how to access the table ?

Thanks
Kim

-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Friday, 20 March 2009 8:05 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

Of course the memory allocated to a message will be freed once the
message has been processed. That is why it's important to set the
OUT_ONLY property: if it is not set correctly, Synapse will keep the
message context (with the payload) in a callback table to correlate it
with a future response (which in your case never comes in). Probably
there is something to improve here in Synapse:
- The VFS transport should trigger an error if there is a mismatch
between the message exchange pattern and the transport configuration
of the service (the transport.vfs.* parameters).
- Synapse should start issuing warnings when the number of entries in
the callback table reaches a certain threshold.

Andreas

On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
> Not really; I cannot see why memory should permanently grow when I pass the same file
> repeatedly through VFS. In theory this means VFS will always consume all the available memory
> given enough time and file iterations. Therefore VFS cannot be used in a production system.
> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
> I would assume the memory no longer required would be re-claimed. I would also assume
> The overhead was not 10 times the file size; seems excessive.
>
> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
> but at a smaller rate aswell.
>
> Thanks
> Kim
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 10:52 AM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> If N is the size of the file, the memory consumption caused by the
> transport is O(N) with transport.vfs.Streaming=false and O(1) with
> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
> methods in org.apache.axis2.format.ElementHelper are there to allow
> you to implement your mediator with O(1) memory usage, so that the
> overall memory consumption remains O(1). Does that answer your
> question?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>
>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>
>> I will try your extra parameters today though.
>>
>> Thanks
>> Kim
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Thursday, 19 March 2009 5:48 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Kim,
>>
>> Can you post your current synapse.xml as well as the stack trace you get now?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>
>>> Using the last stable build from 15 March 2009 I still get exactly same
>>> behaviour as originally
>>> described with the above script. VFS still just dies. Would your fixes be in
>>> this ?
>>>
>>> Using the last st
>>>
>>> Andreas Veithen-2 wrote:
>>>>
>>>> I committed the code and it will be available in the next WS-Commons
>>>> transport build. The methods are located in
>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>> module.
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>> Hello Andreas,
>>>>> This is great and really helps, have not had time to try it out but will
>>>>> soon.
>>>>>
>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>
>>>>> In the short term I am going to use a brute force approach that is now
>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>> destination may still be an issue ?
>>>>>
>>>>> Kim
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>> To: dev@synapse.apache.org
>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>
>>>>> The changes I did in the VFS transport and the message builders for
>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>> out-of-the-box solution for your use case, but they are the
>>>>> prerequisite.
>>>>>
>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>> to a temporary file), I don't like this because it would create a
>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>> goal should be that the solution will still work if the file comes
>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>
>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>> this will require development of a custom mediator. This mediator
>>>>> would read the content, split it up (and store the chunks in memory or
>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>> the destination service.
>>>>>
>>>>> Note that it is probably not possible to implemented the mediator
>>>>> using a script because of the problematic String handling. Also,
>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>> I can contribute the required code to get the text content in the form
>>>>> of a java.io.Reader.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>
>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>> The main first issue on my list was the memory leak.
>>>>>> However, the real problem is once I get this massive files I  have to
>>>>>> send
>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>> It
>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>> as
>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>
>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>
>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>> through
>>>>>> input file and outputs smaller
>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>
>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>
>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>> process
>>>>>> the file by splitting it into many smaller files. These files then
>>>>>> trigger
>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>> Problem is that this loads whole message into memory.
>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>> (to
>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>> file
>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>> chunk
>>>>>> 23 chunks into the data.
>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>> document
>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>> Similar
>>>>>> to (2) but using another service rather than Java to split document.
>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>> with
>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>> Service. So
>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>
>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>> the
>>>>>> front end.
>>>>>>
>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>> result in time outs... So need to work in throttling.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ruwan Linton wrote:
>>>>>>>
>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>> stuff
>>>>>>> than trying to invent the wheel again :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ruwan
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>> <an...@gmail.com>wrote:
>>>>>>>
>>>>>>>> Ruwan,
>>>>>>>>
>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>> :-)
>>>>>>>>
>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Andreas,
>>>>>>>> >
>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>> the
>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>> to
>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>> touched
>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>> lesser
>>>>>>>> > files.
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Ruwan
>>>>>>>> >
>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>> > wrote:
>>>>>>>> >>
>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>> available
>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>> to
>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>> >> property to the proxy:
>>>>>>>> >>
>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>> >>
>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>> >> mediator:
>>>>>>>> >>
>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>> scope="axis2"/>
>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>> >>
>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>> the
>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>> in
>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>> only
>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>> the
>>>>>>>> >> <send> mediator.
>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>> several
>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>> >>
>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>> the
>>>>>>>> >> memory consumption is constant.
>>>>>>>> >>
>>>>>>>> >> Some additional comments:
>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>> SOAP
>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>> policies
>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>> no
>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>> file
>>>>>>>> >> system caching + streaming.
>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>> mediator
>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>> incoming
>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>> determine
>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>> >>
>>>>>>>> >> Andreas
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>> wrote:
>>>>>>>> >> >
>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>> Java
>>>>>>>> >> >>> heap
>>>>>>>> >> >>> space
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>> >> >>>
>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>> trying
>>>>>>>> to
>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>> content..
>>>>>>>> >> >>
>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>> >> >>
>>>>>>>> >> >> cheers
>>>>>>>> >> >> asankha
>>>>>>>> >> >>
>>>>>>>> >> >> --
>>>>>>>> >> >> Asankha C. Perera
>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>> >> >>
>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >> > --
>>>>>>>> >> > View this message in context:
>>>>>>>> >> >
>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Ruwan Linton
>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>> >
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ruwan Linton
>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>>
>>>
>>> --
>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
Hello Andreas,

This all works really well. Streaming uses no memory at all.
Got a java mediator also streaming payloads and with massive files never 
uses much more than 40K for Synapse.

Using just the OUT_ONLY property set; uses much more memory but it stabilises
and does not grow.
Thanks.

-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Friday, 20 March 2009 8:05 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

Of course the memory allocated to a message will be freed once the
message has been processed. That is why it's important to set the
OUT_ONLY property: if it is not set correctly, Synapse will keep the
message context (with the payload) in a callback table to correlate it
with a future response (which in your case never comes in). Probably
there is something to improve here in Synapse:
- The VFS transport should trigger an error if there is a mismatch
between the message exchange pattern and the transport configuration
of the service (the transport.vfs.* parameters).
- Synapse should start issuing warnings when the number of entries in
the callback table reaches a certain threshold.

Andreas

On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
> Not really; I cannot see why memory should permanently grow when I pass the same file
> repeatedly through VFS. In theory this means VFS will always consume all the available memory
> given enough time and file iterations. Therefore VFS cannot be used in a production system.
> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
> I would assume the memory no longer required would be re-claimed. I would also assume
> The overhead was not 10 times the file size; seems excessive.
>
> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
> but at a smaller rate aswell.
>
> Thanks
> Kim
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 10:52 AM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> If N is the size of the file, the memory consumption caused by the
> transport is O(N) with transport.vfs.Streaming=false and O(1) with
> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
> methods in org.apache.axis2.format.ElementHelper are there to allow
> you to implement your mediator with O(1) memory usage, so that the
> overall memory consumption remains O(1). Does that answer your
> question?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>
>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>
>> I will try your extra parameters today though.
>>
>> Thanks
>> Kim
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Thursday, 19 March 2009 5:48 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Kim,
>>
>> Can you post your current synapse.xml as well as the stack trace you get now?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>
>>> Using the last stable build from 15 March 2009 I still get exactly same
>>> behaviour as originally
>>> described with the above script. VFS still just dies. Would your fixes be in
>>> this ?
>>>
>>> Using the last st
>>>
>>> Andreas Veithen-2 wrote:
>>>>
>>>> I committed the code and it will be available in the next WS-Commons
>>>> transport build. The methods are located in
>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>> module.
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>> Hello Andreas,
>>>>> This is great and really helps, have not had time to try it out but will
>>>>> soon.
>>>>>
>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>
>>>>> In the short term I am going to use a brute force approach that is now
>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>> destination may still be an issue ?
>>>>>
>>>>> Kim
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>> To: dev@synapse.apache.org
>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>
>>>>> The changes I did in the VFS transport and the message builders for
>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>> out-of-the-box solution for your use case, but they are the
>>>>> prerequisite.
>>>>>
>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>> to a temporary file), I don't like this because it would create a
>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>> goal should be that the solution will still work if the file comes
>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>
>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>> this will require development of a custom mediator. This mediator
>>>>> would read the content, split it up (and store the chunks in memory or
>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>> the destination service.
>>>>>
>>>>> Note that it is probably not possible to implemented the mediator
>>>>> using a script because of the problematic String handling. Also,
>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>> I can contribute the required code to get the text content in the form
>>>>> of a java.io.Reader.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>
>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>> The main first issue on my list was the memory leak.
>>>>>> However, the real problem is once I get this massive files I  have to
>>>>>> send
>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>> It
>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>> as
>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>
>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>
>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>> through
>>>>>> input file and outputs smaller
>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>
>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>
>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>> process
>>>>>> the file by splitting it into many smaller files. These files then
>>>>>> trigger
>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>> Problem is that this loads whole message into memory.
>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>> (to
>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>> file
>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>> chunk
>>>>>> 23 chunks into the data.
>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>> document
>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>> Similar
>>>>>> to (2) but using another service rather than Java to split document.
>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>> with
>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>> Service. So
>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>
>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>> the
>>>>>> front end.
>>>>>>
>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>> result in time outs... So need to work in throttling.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ruwan Linton wrote:
>>>>>>>
>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>> stuff
>>>>>>> than trying to invent the wheel again :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ruwan
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>> <an...@gmail.com>wrote:
>>>>>>>
>>>>>>>> Ruwan,
>>>>>>>>
>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>> :-)
>>>>>>>>
>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Andreas,
>>>>>>>> >
>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>> the
>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>> to
>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>> touched
>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>> lesser
>>>>>>>> > files.
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Ruwan
>>>>>>>> >
>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>> > wrote:
>>>>>>>> >>
>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>> available
>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>> to
>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>> >> property to the proxy:
>>>>>>>> >>
>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>> >>
>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>> >> mediator:
>>>>>>>> >>
>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>> scope="axis2"/>
>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>> >>
>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>> the
>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>> in
>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>> only
>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>> the
>>>>>>>> >> <send> mediator.
>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>> several
>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>> >>
>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>> the
>>>>>>>> >> memory consumption is constant.
>>>>>>>> >>
>>>>>>>> >> Some additional comments:
>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>> SOAP
>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>> policies
>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>> no
>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>> file
>>>>>>>> >> system caching + streaming.
>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>> mediator
>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>> incoming
>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>> determine
>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>> >>
>>>>>>>> >> Andreas
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>> wrote:
>>>>>>>> >> >
>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>> Java
>>>>>>>> >> >>> heap
>>>>>>>> >> >>> space
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>> >> >>>
>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>> trying
>>>>>>>> to
>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>> content..
>>>>>>>> >> >>
>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>> >> >>
>>>>>>>> >> >> cheers
>>>>>>>> >> >> asankha
>>>>>>>> >> >>
>>>>>>>> >> >> --
>>>>>>>> >> >> Asankha C. Perera
>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>> >> >>
>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >> > --
>>>>>>>> >> > View this message in context:
>>>>>>>> >> >
>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Ruwan Linton
>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>> >
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ruwan Linton
>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>>
>>>
>>> --
>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
Of course the memory allocated to a message will be freed once the
message has been processed. That is why it's important to set the
OUT_ONLY property: if it is not set correctly, Synapse will keep the
message context (with the payload) in a callback table to correlate it
with a future response (which in your case never comes in). Probably
there is something to improve here in Synapse:
- The VFS transport should trigger an error if there is a mismatch
between the message exchange pattern and the transport configuration
of the service (the transport.vfs.* parameters).
- Synapse should start issuing warnings when the number of entries in
the callback table reaches a certain threshold.

Andreas

On Fri, Mar 20, 2009 at 01:41, Kim Horn <ki...@icsglobal.net> wrote:
> Not really; I cannot see why memory should permanently grow when I pass the same file
> repeatedly through VFS. In theory this means VFS will always consume all the available memory
> given enough time and file iterations. Therefore VFS cannot be used in a production system.
> This is definition of Memory Leak. I would expect SOME overhead on top of file size but
> I would assume the memory no longer required would be re-claimed. I would also assume
> The overhead was not 10 times the file size; seems excessive.
>
> Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
> but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
> but at a smaller rate aswell.
>
> Thanks
> Kim
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Friday, 20 March 2009 10:52 AM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> If N is the size of the file, the memory consumption caused by the
> transport is O(N) with transport.vfs.Streaming=false and O(1) with
> transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
> methods in org.apache.axis2.format.ElementHelper are there to allow
> you to implement your mediator with O(1) memory usage, so that the
> overall memory consumption remains O(1). Does that answer your
> question?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
>> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>>
>> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>>
>> I will try your extra parameters today though.
>>
>> Thanks
>> Kim
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Thursday, 19 March 2009 5:48 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> Kim,
>>
>> Can you post your current synapse.xml as well as the stack trace you get now?
>>
>> Andreas
>>
>> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>>
>>> Using the last stable build from 15 March 2009 I still get exactly same
>>> behaviour as originally
>>> described with the above script. VFS still just dies. Would your fixes be in
>>> this ?
>>>
>>> Using the last st
>>>
>>> Andreas Veithen-2 wrote:
>>>>
>>>> I committed the code and it will be available in the next WS-Commons
>>>> transport build. The methods are located in
>>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>>> module.
>>>>
>>>> Andreas
>>>>
>>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>>> Hello Andreas,
>>>>> This is great and really helps, have not had time to try it out but will
>>>>> soon.
>>>>>
>>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>>> a while to get up to speed to do the Synapse iterator.
>>>>>
>>>>> In the short term I am going to use a brute force approach that is now
>>>>> feasible given the memory issue is resolved. Just thought of this one
>>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>>> out. Another independent VFS proxy watches that directory and submits
>>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>>> destination may still be an issue ?
>>>>>
>>>>> Kim
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>>> To: dev@synapse.apache.org
>>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>>
>>>>> The changes I did in the VFS transport and the message builders for
>>>>> text/plain and application/octet-stream certainly don't provide an
>>>>> out-of-the-box solution for your use case, but they are the
>>>>> prerequisite.
>>>>>
>>>>> Concerning your first proposed solution (let the VFS write the content
>>>>> to a temporary file), I don't like this because it would create a
>>>>> tight coupling between the VFS transport and the mediator. A design
>>>>> goal should be that the solution will still work if the file comes
>>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>>
>>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>>> this will require development of a custom mediator. This mediator
>>>>> would read the content, split it up (and store the chunks in memory or
>>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>>> the destination service.
>>>>>
>>>>> Note that it is probably not possible to implemented the mediator
>>>>> using a script because of the problematic String handling. Also,
>>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>>> think). Therefore it should be implemented as a full-featured Java
>>>>> mediator, probably taking the existing iterate mediator as a template.
>>>>> I can contribute the required code to get the text content in the form
>>>>> of a java.io.Reader.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>>
>>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>>> The main first issue on my list was the memory leak.
>>>>>> However, the real problem is once I get this massive files I  have to
>>>>>> send
>>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>>> Streaming it straight out would just kill the destination Web service.
>>>>>> It
>>>>>> would get the memory error. The text document can be split apart easily,
>>>>>> as
>>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>>
>>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>>
>>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>>> through
>>>>>> input file and outputs smaller
>>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>>
>>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>>
>>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>>> process
>>>>>> the file by splitting it into many smaller files. These files then
>>>>>> trigger
>>>>>> another VFS proxy that submits these to the final web Service.
>>>>>> The problem is is that is uses the file system (not so bad).
>>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>>> Problem is that this loads whole message into memory.
>>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>>> (to
>>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>>> file
>>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>>> chunk
>>>>>> 23 chunks into the data.
>>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up. It may return an XML
>>>>>> document
>>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>>> Similar
>>>>>> to (2) but using another service rather than Java to split document.
>>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>>> (stream
>>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>>> with
>>>>>> each small packet of data that then forwards it to the final WEb
>>>>>> Service. So
>>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>>
>>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>>> the
>>>>>> front end.
>>>>>>
>>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>>> submitting 100's of calls in parralel to the destination service would
>>>>>> result in time outs... So need to work in throttling.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ruwan Linton wrote:
>>>>>>>
>>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>>> stuff
>>>>>>> than trying to invent the wheel again :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ruwan
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>>> <an...@gmail.com>wrote:
>>>>>>>
>>>>>>>> Ruwan,
>>>>>>>>
>>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>>> :-)
>>>>>>>>
>>>>>>>> Also note that some of the features that we might want to implement
>>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>>> only realistic for the next release after 1.3.
>>>>>>>>
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Andreas,
>>>>>>>> >
>>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>>> the
>>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>>> to
>>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>>> touched
>>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>>> lesser
>>>>>>>> > files.
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > Ruwan
>>>>>>>> >
>>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>>> andreas.veithen@gmail.com>
>>>>>>>> > wrote:
>>>>>>>> >>
>>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>>> available
>>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>>> to
>>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>>> >> property to the proxy:
>>>>>>>> >>
>>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>>> >>
>>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>>> >> mediator:
>>>>>>>> >>
>>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>>> scope="axis2"/>
>>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>>> >>
>>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>>> the
>>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>>> in
>>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>>> only
>>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>>> the
>>>>>>>> >> <send> mediator.
>>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>>> several
>>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>>> >>
>>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>>> the
>>>>>>>> >> memory consumption is constant.
>>>>>>>> >>
>>>>>>>> >> Some additional comments:
>>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>>> SOAP
>>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>>> >> * With the changes described here, we have now two different
>>>>>>>> policies
>>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>>> no
>>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>>> file
>>>>>>>> >> system caching + streaming.
>>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>>> mediator
>>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>>> incoming
>>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>>> >> why the transport that handles the incoming request should
>>>>>>>> determine
>>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>>> >>
>>>>>>>> >> Andreas
>>>>>>>> >>
>>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>>> wrote:
>>>>>>>> >> >
>>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>>> Java
>>>>>>>> >> >>> heap
>>>>>>>> >> >>> space
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>>> >> >>>         at
>>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>>> >> >>>         at
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>>> >> >>>
>>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>>> trying
>>>>>>>> to
>>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>>> content..
>>>>>>>> >> >>
>>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>>> >> >>
>>>>>>>> >> >> cheers
>>>>>>>> >> >> asankha
>>>>>>>> >> >>
>>>>>>>> >> >> --
>>>>>>>> >> >> Asankha C. Perera
>>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>>> >> >>
>>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >> > --
>>>>>>>> >> > View this message in context:
>>>>>>>> >> >
>>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>> >>
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > Ruwan Linton
>>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>>> >
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ruwan Linton
>>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> http://ruwansblog.blogspot.com/
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>>
>>>
>>> --
>>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
Not really; I cannot see why memory should permanently grow when I pass the same file
repeatedly through VFS. In theory this means VFS will always consume all the available memory
given enough time and file iterations. Therefore VFS cannot be used in a production system.
This is definition of Memory Leak. I would expect SOME overhead on top of file size but
I would assume the memory no longer required would be re-claimed. I would also assume
The overhead was not 10 times the file size; seems excessive.

Yes I understand the streaming approach should in theory use a fixed and much smaller amount of memory;
but haven't tested that yet either. No reason given above memory leak that it should not permanently grow
but at a smaller rate aswell.

Thanks 
Kim

-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Friday, 20 March 2009 10:52 AM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

If N is the size of the file, the memory consumption caused by the
transport is O(N) with transport.vfs.Streaming=false and O(1) with
transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
methods in org.apache.axis2.format.ElementHelper are there to allow
you to implement your mediator with O(1) memory usage, so that the
overall memory consumption remains O(1). Does that answer your
question?

Andreas

On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>
> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>
> I will try your extra parameters today though.
>
> Thanks
> Kim
>
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Thursday, 19 March 2009 5:48 PM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Kim,
>
> Can you post your current synapse.xml as well as the stack trace you get now?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>
>> Using the last stable build from 15 March 2009 I still get exactly same
>> behaviour as originally
>> described with the above script. VFS still just dies. Would your fixes be in
>> this ?
>>
>> Using the last st
>>
>> Andreas Veithen-2 wrote:
>>>
>>> I committed the code and it will be available in the next WS-Commons
>>> transport build. The methods are located in
>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>> module.
>>>
>>> Andreas
>>>
>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>> Hello Andreas,
>>>> This is great and really helps, have not had time to try it out but will
>>>> soon.
>>>>
>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>> a while to get up to speed to do the Synapse iterator.
>>>>
>>>> In the short term I am going to use a brute force approach that is now
>>>> feasible given the memory issue is resolved. Just thought of this one
>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>> out. Another independent VFS proxy watches that directory and submits
>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>> destination may still be an issue ?
>>>>
>>>> Kim
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>> To: dev@synapse.apache.org
>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>
>>>> The changes I did in the VFS transport and the message builders for
>>>> text/plain and application/octet-stream certainly don't provide an
>>>> out-of-the-box solution for your use case, but they are the
>>>> prerequisite.
>>>>
>>>> Concerning your first proposed solution (let the VFS write the content
>>>> to a temporary file), I don't like this because it would create a
>>>> tight coupling between the VFS transport and the mediator. A design
>>>> goal should be that the solution will still work if the file comes
>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>
>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>> this will require development of a custom mediator. This mediator
>>>> would read the content, split it up (and store the chunks in memory or
>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>> the destination service.
>>>>
>>>> Note that it is probably not possible to implemented the mediator
>>>> using a script because of the problematic String handling. Also,
>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>> think). Therefore it should be implemented as a full-featured Java
>>>> mediator, probably taking the existing iterate mediator as a template.
>>>> I can contribute the required code to get the text content in the form
>>>> of a java.io.Reader.
>>>>
>>>> Regards,
>>>>
>>>> Andreas
>>>>
>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>
>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>> The main first issue on my list was the memory leak.
>>>>> However, the real problem is once I get this massive files I  have to
>>>>> send
>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>> Streaming it straight out would just kill the destination Web service.
>>>>> It
>>>>> would get the memory error. The text document can be split apart easily,
>>>>> as
>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>
>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>
>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>> through
>>>>> input file and outputs smaller
>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>
>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>
>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>> process
>>>>> the file by splitting it into many smaller files. These files then
>>>>> trigger
>>>>> another VFS proxy that submits these to the final web Service.
>>>>> The problem is is that is uses the file system (not so bad).
>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>> Problem is that this loads whole message into memory.
>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>> (to
>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>> file
>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>> chunk
>>>>> 23 chunks into the data.
>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>> (stream
>>>>> it) to another web service that chops it up. It may return an XML
>>>>> document
>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>> Similar
>>>>> to (2) but using another service rather than Java to split document.
>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>> (stream
>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>> with
>>>>> each small packet of data that then forwards it to the final WEb
>>>>> Service. So
>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>
>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>> the
>>>>> front end.
>>>>>
>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>> submitting 100's of calls in parralel to the destination service would
>>>>> result in time outs... So need to work in throttling.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Ruwan Linton wrote:
>>>>>>
>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>> stuff
>>>>>> than trying to invent the wheel again :-)
>>>>>>
>>>>>> Thanks,
>>>>>> Ruwan
>>>>>>
>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>> <an...@gmail.com>wrote:
>>>>>>
>>>>>>> Ruwan,
>>>>>>>
>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>> :-)
>>>>>>>
>>>>>>> Also note that some of the features that we might want to implement
>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>> only realistic for the next release after 1.3.
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>> wrote:
>>>>>>> > Andreas,
>>>>>>> >
>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>> the
>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>> to
>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>> touched
>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>> lesser
>>>>>>> > files.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Ruwan
>>>>>>> >
>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>> andreas.veithen@gmail.com>
>>>>>>> > wrote:
>>>>>>> >>
>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>> available
>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>> to
>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>> >> property to the proxy:
>>>>>>> >>
>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>> >>
>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>> >> mediator:
>>>>>>> >>
>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>> scope="axis2"/>
>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>> >>
>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>> the
>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>> in
>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>> only
>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>> the
>>>>>>> >> <send> mediator.
>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>> several
>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>> >>
>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>> the
>>>>>>> >> memory consumption is constant.
>>>>>>> >>
>>>>>>> >> Some additional comments:
>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>> SOAP
>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>> >> * With the changes described here, we have now two different
>>>>>>> policies
>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>> no
>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>> file
>>>>>>> >> system caching + streaming.
>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>> mediator
>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>> incoming
>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>> >> why the transport that handles the incoming request should
>>>>>>> determine
>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>> >>
>>>>>>> >> Andreas
>>>>>>> >>
>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>> wrote:
>>>>>>> >> >
>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>> Java
>>>>>>> >> >>> heap
>>>>>>> >> >>> space
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> >> >>>
>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> >> >>>
>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>> >> >>>
>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>> trying
>>>>>>> to
>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>> content..
>>>>>>> >> >>
>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>> >> >>
>>>>>>> >> >> cheers
>>>>>>> >> >> asankha
>>>>>>> >> >>
>>>>>>> >> >> --
>>>>>>> >> >> Asankha C. Perera
>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>> >> >>
>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >
>>>>>>> >> > --
>>>>>>> >> > View this message in context:
>>>>>>> >> >
>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >>
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > Ruwan Linton
>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>> >
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ruwan Linton
>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>> http://ruwansblog.blogspot.com/
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>>
>>
>> --
>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
If N is the size of the file, the memory consumption caused by the
transport is O(N) with transport.vfs.Streaming=false and O(1) with
transport.vfs.Streaming=true. The getTextAsStream and writeTextTo
methods in org.apache.axis2.format.ElementHelper are there to allow
you to implement your mediator with O(1) memory usage, so that the
overall memory consumption remains O(1). Does that answer your
question?

Andreas

On Thu, Mar 19, 2009 at 23:33, Kim Horn <ki...@icsglobal.net> wrote:
> It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.
>
> I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.
>
> I will try your extra parameters today though.
>
> Thanks
> Kim
>
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Thursday, 19 March 2009 5:48 PM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> Kim,
>
> Can you post your current synapse.xml as well as the stack trace you get now?
>
> Andreas
>
> On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>>
>> Using the last stable build from 15 March 2009 I still get exactly same
>> behaviour as originally
>> described with the above script. VFS still just dies. Would your fixes be in
>> this ?
>>
>> Using the last st
>>
>> Andreas Veithen-2 wrote:
>>>
>>> I committed the code and it will be available in the next WS-Commons
>>> transport build. The methods are located in
>>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>>> module.
>>>
>>> Andreas
>>>
>>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>>> Hello Andreas,
>>>> This is great and really helps, have not had time to try it out but will
>>>> soon.
>>>>
>>>> Contributing the java.io.Reader would be a great help but it will take me
>>>> a while to get up to speed to do the Synapse iterator.
>>>>
>>>> In the short term I am going to use a brute force approach that is now
>>>> feasible given the memory issue is resolved. Just thought of this one
>>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>>> POJOCommand on <out> to split file into another directory, stream in and
>>>> out. Another independent VFS proxy watches that directory and submits
>>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>>> destination may still be an issue ?
>>>>
>>>> Kim
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>>> Sent: Monday, 9 March 2009 10:55 PM
>>>> To: dev@synapse.apache.org
>>>> Subject: Re: VFS - Synapse Memory Leak
>>>>
>>>> The changes I did in the VFS transport and the message builders for
>>>> text/plain and application/octet-stream certainly don't provide an
>>>> out-of-the-box solution for your use case, but they are the
>>>> prerequisite.
>>>>
>>>> Concerning your first proposed solution (let the VFS write the content
>>>> to a temporary file), I don't like this because it would create a
>>>> tight coupling between the VFS transport and the mediator. A design
>>>> goal should be that the solution will still work if the file comes
>>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>>
>>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>>> this will require development of a custom mediator. This mediator
>>>> would read the content, split it up (and store the chunks in memory or
>>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>>> the sub-sequence would happen synchronously to limit the memory/disk
>>>> space consumption (to the maximum chunk size) and to avoid flooding
>>>> the destination service.
>>>>
>>>> Note that it is probably not possible to implemented the mediator
>>>> using a script because of the problematic String handling. Also,
>>>> Spring, POJO and class mediators don't support sub-sequences (I
>>>> think). Therefore it should be implemented as a full-featured Java
>>>> mediator, probably taking the existing iterate mediator as a template.
>>>> I can contribute the required code to get the text content in the form
>>>> of a java.io.Reader.
>>>>
>>>> Regards,
>>>>
>>>> Andreas
>>>>
>>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>>
>>>>> Although this is a good feature it may not solve the actual problem ?
>>>>> The main first issue on my list was the memory leak.
>>>>> However, the real problem is once I get this massive files I  have to
>>>>> send
>>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>>> Streaming it straight out would just kill the destination Web service.
>>>>> It
>>>>> would get the memory error. The text document can be split apart easily,
>>>>> as
>>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>>
>>>>> In an earlier post; that was not responded too, I mentioned:
>>>>>
>>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>>> through
>>>>> input file and outputs smaller
>>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>>
>>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>>
>>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>>> process
>>>>> the file by splitting it into many smaller files. These files then
>>>>> trigger
>>>>> another VFS proxy that submits these to the final web Service.
>>>>> The problem is is that is uses the file system (not so bad).
>>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>>> Iterator. So replace the text message with many smaller XML elements.
>>>>> Problem is that this loads whole message into memory.
>>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>>> (to
>>>>> split the text data) or actually uses a for loop approach to chop the
>>>>> file
>>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>>> chunk
>>>>> 23 chunks into the data.
>>>>> 4) Using the approach proposed now - just submit the file straight
>>>>> (stream
>>>>> it) to another web service that chops it up. It may return an XML
>>>>> document
>>>>> with many sub elelements that allows the standard Iterator to work.
>>>>> Similar
>>>>> to (2) but using another service rather than Java to split document.
>>>>> 5) Using the approach proposed now - just submit the file straight
>>>>> (stream
>>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>>> with
>>>>> each small packet of data that then forwards it to the final WEb
>>>>> Service. So
>>>>> the Web Service iterates across the data; and not Synapse.
>>>>>
>>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>>> the
>>>>> front end.
>>>>>
>>>>> Another issue here is throttling: Splitting the file is one issues but
>>>>> submitting 100's of calls in parralel to the destination service would
>>>>> result in time outs... So need to work in throttling.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Ruwan Linton wrote:
>>>>>>
>>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>>> stuff
>>>>>> than trying to invent the wheel again :-)
>>>>>>
>>>>>> Thanks,
>>>>>> Ruwan
>>>>>>
>>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>>> <an...@gmail.com>wrote:
>>>>>>
>>>>>>> Ruwan,
>>>>>>>
>>>>>>> It's not a question of possibility, it is a question of available time
>>>>>>> :-)
>>>>>>>
>>>>>>> Also note that some of the features that we might want to implement
>>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>>> (except that an attachment is only available once, while a file over
>>>>>>> VFS can be read several times). I think there is also some existing
>>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>>> things but try to make the existing code reusable. This however is
>>>>>>> only realistic for the next release after 1.3.
>>>>>>>
>>>>>>> Andreas
>>>>>>>
>>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>>> wrote:
>>>>>>> > Andreas,
>>>>>>> >
>>>>>>> > Can we have the caching at the file system as a property to support
>>>>>>> the
>>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>>> to
>>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>>> touched
>>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>>> lesser
>>>>>>> > files.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Ruwan
>>>>>>> >
>>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>>> andreas.veithen@gmail.com>
>>>>>>> > wrote:
>>>>>>> >>
>>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>>> available
>>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>>> to
>>>>>>> >> enable this in your configuration, you need to add the following
>>>>>>> >> property to the proxy:
>>>>>>> >>
>>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>>> >>
>>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>>> >> mediator:
>>>>>>> >>
>>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>>> scope="axis2"/>
>>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>>> >>
>>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>>> the
>>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>>> in
>>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>>> only
>>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>>> the
>>>>>>> >> <send> mediator.
>>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>>> >> time (which is not the case in your example), it will be read
>>>>>>> several
>>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>>> >>
>>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>>> the
>>>>>>> >> memory consumption is constant.
>>>>>>> >>
>>>>>>> >> Some additional comments:
>>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>>> SOAP
>>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>>> >> * With the changes described here, we have now two different
>>>>>>> policies
>>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>>> no
>>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>>> >> should define a wider range of policies in the future, including
>>>>>>> file
>>>>>>> >> system caching + streaming.
>>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>>> mediator
>>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>>> >> transport in a separate thread. This property is set by the
>>>>>>> incoming
>>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>>> >> why the transport that handles the incoming request should
>>>>>>> determine
>>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>>> >>
>>>>>>> >> Andreas
>>>>>>> >>
>>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>>> wrote:
>>>>>>> >> >
>>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> > Asankha C. Perera wrote:
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>>> Java
>>>>>>> >> >>> heap
>>>>>>> >> >>> space
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> >> >>>
>>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>>> >> >>>         at
>>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>>> >> >>>         at
>>>>>>> >> >>>
>>>>>>> >> >>>
>>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>>> >> >>>
>>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>>> trying
>>>>>>> to
>>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>>> content..
>>>>>>> >> >>
>>>>>>> >> >> A definite bug we need to fix ..
>>>>>>> >> >>
>>>>>>> >> >> cheers
>>>>>>> >> >> asankha
>>>>>>> >> >>
>>>>>>> >> >> --
>>>>>>> >> >> Asankha C. Perera
>>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>>> >> >>
>>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >
>>>>>>> >> > --
>>>>>>> >> > View this message in context:
>>>>>>> >> >
>>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >> >
>>>>>>> >> >
>>>>>>> >>
>>>>>>> >>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>> >>
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > --
>>>>>>> > Ruwan Linton
>>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>>> > http://ruwansblog.blogspot.com/
>>>>>>> >
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ruwan Linton
>>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>> http://ruwansblog.blogspot.com/
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> View this message in context:
>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>>
>>
>> --
>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
It's the same Synapse.xml as specified originally and same trace. If you are using Nabble you can see this, in case you lost the prior emails I can post them again.

I must admit I did not set those extra parameters, you mentioned, but I don't see why you should set parameter to Stop a memory leak. I guessed these parameter would just reduce the large amounts of memory it appears to be using, e.g. 10 times the file size, via streaming ? Why is their 10 copies of the data floating around ? Lots of buffering. This issue suggests to me that any use of VFS will eventually kill the Server. Even with smaller files it will eventually use all available memory. I guess I did not understand the actual reason for this issue from prior discussion.

I will try your extra parameters today though.

Thanks
Kim


-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Thursday, 19 March 2009 5:48 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

Kim,

Can you post your current synapse.xml as well as the stack trace you get now?

Andreas

On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>
> Using the last stable build from 15 March 2009 I still get exactly same
> behaviour as originally
> described with the above script. VFS still just dies. Would your fixes be in
> this ?
>
> Using the last st
>
> Andreas Veithen-2 wrote:
>>
>> I committed the code and it will be available in the next WS-Commons
>> transport build. The methods are located in
>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>> module.
>>
>> Andreas
>>
>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>> Hello Andreas,
>>> This is great and really helps, have not had time to try it out but will
>>> soon.
>>>
>>> Contributing the java.io.Reader would be a great help but it will take me
>>> a while to get up to speed to do the Synapse iterator.
>>>
>>> In the short term I am going to use a brute force approach that is now
>>> feasible given the memory issue is resolved. Just thought of this one
>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>> POJOCommand on <out> to split file into another directory, stream in and
>>> out. Another independent VFS proxy watches that directory and submits
>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>> destination may still be an issue ?
>>>
>>> Kim
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Monday, 9 March 2009 10:55 PM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> The changes I did in the VFS transport and the message builders for
>>> text/plain and application/octet-stream certainly don't provide an
>>> out-of-the-box solution for your use case, but they are the
>>> prerequisite.
>>>
>>> Concerning your first proposed solution (let the VFS write the content
>>> to a temporary file), I don't like this because it would create a
>>> tight coupling between the VFS transport and the mediator. A design
>>> goal should be that the solution will still work if the file comes
>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>
>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>> this will require development of a custom mediator. This mediator
>>> would read the content, split it up (and store the chunks in memory or
>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>> the sub-sequence would happen synchronously to limit the memory/disk
>>> space consumption (to the maximum chunk size) and to avoid flooding
>>> the destination service.
>>>
>>> Note that it is probably not possible to implemented the mediator
>>> using a script because of the problematic String handling. Also,
>>> Spring, POJO and class mediators don't support sub-sequences (I
>>> think). Therefore it should be implemented as a full-featured Java
>>> mediator, probably taking the existing iterate mediator as a template.
>>> I can contribute the required code to get the text content in the form
>>> of a java.io.Reader.
>>>
>>> Regards,
>>>
>>> Andreas
>>>
>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>
>>>> Although this is a good feature it may not solve the actual problem ?
>>>> The main first issue on my list was the memory leak.
>>>> However, the real problem is once I get this massive files I  have to
>>>> send
>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>> Streaming it straight out would just kill the destination Web service.
>>>> It
>>>> would get the memory error. The text document can be split apart easily,
>>>> as
>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>
>>>> In an earlier post; that was not responded too, I mentioned:
>>>>
>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>> through
>>>> input file and outputs smaller
>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>
>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>
>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>> process
>>>> the file by splitting it into many smaller files. These files then
>>>> trigger
>>>> another VFS proxy that submits these to the final web Service.
>>>> The problem is is that is uses the file system (not so bad).
>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>> Iterator. So replace the text message with many smaller XML elements.
>>>> Problem is that this loads whole message into memory.
>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>> (to
>>>> split the text data) or actually uses a for loop approach to chop the
>>>> file
>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>> chunk
>>>> 23 chunks into the data.
>>>> 4) Using the approach proposed now - just submit the file straight
>>>> (stream
>>>> it) to another web service that chops it up. It may return an XML
>>>> document
>>>> with many sub elelements that allows the standard Iterator to work.
>>>> Similar
>>>> to (2) but using another service rather than Java to split document.
>>>> 5) Using the approach proposed now - just submit the file straight
>>>> (stream
>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>> with
>>>> each small packet of data that then forwards it to the final WEb
>>>> Service. So
>>>> the Web Service iterates across the data; and not Synapse.
>>>>
>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>> the
>>>> front end.
>>>>
>>>> Another issue here is throttling: Splitting the file is one issues but
>>>> submitting 100's of calls in parralel to the destination service would
>>>> result in time outs... So need to work in throttling.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Ruwan Linton wrote:
>>>>>
>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>> stuff
>>>>> than trying to invent the wheel again :-)
>>>>>
>>>>> Thanks,
>>>>> Ruwan
>>>>>
>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>> <an...@gmail.com>wrote:
>>>>>
>>>>>> Ruwan,
>>>>>>
>>>>>> It's not a question of possibility, it is a question of available time
>>>>>> :-)
>>>>>>
>>>>>> Also note that some of the features that we might want to implement
>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>> (except that an attachment is only available once, while a file over
>>>>>> VFS can be read several times). I think there is also some existing
>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>> things but try to make the existing code reusable. This however is
>>>>>> only realistic for the next release after 1.3.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>> wrote:
>>>>>> > Andreas,
>>>>>> >
>>>>>> > Can we have the caching at the file system as a property to support
>>>>>> the
>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>> to
>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>> touched
>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>> lesser
>>>>>> > files.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Ruwan
>>>>>> >
>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>> andreas.veithen@gmail.com>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>> available
>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>> to
>>>>>> >> enable this in your configuration, you need to add the following
>>>>>> >> property to the proxy:
>>>>>> >>
>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>> >>
>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>> >> mediator:
>>>>>> >>
>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>> scope="axis2"/>
>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>> >>
>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>> the
>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>> in
>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>> only
>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>> the
>>>>>> >> <send> mediator.
>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>> >> time (which is not the case in your example), it will be read
>>>>>> several
>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>> >>
>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>> the
>>>>>> >> memory consumption is constant.
>>>>>> >>
>>>>>> >> Some additional comments:
>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>> SOAP
>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>> >> * With the changes described here, we have now two different
>>>>>> policies
>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>> no
>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>> >> should define a wider range of policies in the future, including
>>>>>> file
>>>>>> >> system caching + streaming.
>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>> mediator
>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>> >> transport in a separate thread. This property is set by the
>>>>>> incoming
>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>> >> why the transport that handles the incoming request should
>>>>>> determine
>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>> >>
>>>>>> >> Andreas
>>>>>> >>
>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>> wrote:
>>>>>> >> >
>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> > Asankha C. Perera wrote:
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>> Java
>>>>>> >> >>> heap
>>>>>> >> >>> space
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> >> >>>
>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> >> >>>
>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>> >> >>>
>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>> trying
>>>>>> to
>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>> content..
>>>>>> >> >>
>>>>>> >> >> A definite bug we need to fix ..
>>>>>> >> >>
>>>>>> >> >> cheers
>>>>>> >> >> asankha
>>>>>> >> >>
>>>>>> >> >> --
>>>>>> >> >> Asankha C. Perera
>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>> >> >>
>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >
>>>>>> >> > --
>>>>>> >> > View this message in context:
>>>>>> >> >
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> ---------------------------------------------------------------------
>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >> >
>>>>>> >> >
>>>>>> >>
>>>>>> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Ruwan Linton
>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>> > http://ruwansblog.blogspot.com/
>>>>>> >
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ruwan Linton
>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> http://ruwansblog.blogspot.com/
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
Kim,

Can you post your current synapse.xml as well as the stack trace you get now?

Andreas

On Thu, Mar 19, 2009 at 07:20, kimhorn <ki...@icsglobal.net> wrote:
>
> Using the last stable build from 15 March 2009 I still get exactly same
> behaviour as originally
> described with the above script. VFS still just dies. Would your fixes be in
> this ?
>
> Using the last st
>
> Andreas Veithen-2 wrote:
>>
>> I committed the code and it will be available in the next WS-Commons
>> transport build. The methods are located in
>> org.apache.axis2.format.ElementHelper in the axis2-transport-base
>> module.
>>
>> Andreas
>>
>> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>>> Hello Andreas,
>>> This is great and really helps, have not had time to try it out but will
>>> soon.
>>>
>>> Contributing the java.io.Reader would be a great help but it will take me
>>> a while to get up to speed to do the Synapse iterator.
>>>
>>> In the short term I am going to use a brute force approach that is now
>>> feasible given the memory issue is resolved. Just thought of this one
>>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>>> POJOCommand on <out> to split file into another directory, stream in and
>>> out. Another independent VFS proxy watches that directory and submits
>>> each file to Web service. Hopefully memory will be fine. Overloading the
>>> destination may still be an issue ?
>>>
>>> Kim
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>>> Sent: Monday, 9 March 2009 10:55 PM
>>> To: dev@synapse.apache.org
>>> Subject: Re: VFS - Synapse Memory Leak
>>>
>>> The changes I did in the VFS transport and the message builders for
>>> text/plain and application/octet-stream certainly don't provide an
>>> out-of-the-box solution for your use case, but they are the
>>> prerequisite.
>>>
>>> Concerning your first proposed solution (let the VFS write the content
>>> to a temporary file), I don't like this because it would create a
>>> tight coupling between the VFS transport and the mediator. A design
>>> goal should be that the solution will still work if the file comes
>>> from another source, e.g. an attachment in an MTOM or SwA message.
>>>
>>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>>> this will require development of a custom mediator. This mediator
>>> would read the content, split it up (and store the chunks in memory or
>>> an disk) and executes a sub-sequence for each chunk. The execution of
>>> the sub-sequence would happen synchronously to limit the memory/disk
>>> space consumption (to the maximum chunk size) and to avoid flooding
>>> the destination service.
>>>
>>> Note that it is probably not possible to implemented the mediator
>>> using a script because of the problematic String handling. Also,
>>> Spring, POJO and class mediators don't support sub-sequences (I
>>> think). Therefore it should be implemented as a full-featured Java
>>> mediator, probably taking the existing iterate mediator as a template.
>>> I can contribute the required code to get the text content in the form
>>> of a java.io.Reader.
>>>
>>> Regards,
>>>
>>> Andreas
>>>
>>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>>
>>>> Although this is a good feature it may not solve the actual problem ?
>>>> The main first issue on my list was the memory leak.
>>>> However, the real problem is once I get this massive files I  have to
>>>> send
>>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>>> Streaming it straight out would just kill the destination Web service.
>>>> It
>>>> would get the memory error. The text document can be split apart easily,
>>>> as
>>>> it has independant records on each line seperated by <CR> <LF>.
>>>>
>>>> In an earlier post; that was not responded too, I mentioned:
>>>>
>>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>>> through
>>>> input file and outputs smaller
>>>> chunks for processing, in Synapse, may be a solution ? "
>>>>
>>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>>
>>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>>> process
>>>> the file by splitting it into many smaller files. These files then
>>>> trigger
>>>> another VFS proxy that submits these to the final web Service.
>>>> The problem is is that is uses the file system (not so bad).
>>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>>> into many XML <data> elements that can then be acted on by a Synapse
>>>> Iterator. So replace the text message with many smaller XML elements.
>>>> Problem is that this loads whole message into memory.
>>>> 3) Create another Iterator in Synapse that works on Regular expression
>>>> (to
>>>> split the text data) or actually uses a for loop approach to chop the
>>>> file
>>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>>> chunk
>>>> 23 chunks into the data.
>>>> 4) Using the approach proposed now - just submit the file straight
>>>> (stream
>>>> it) to another web service that chops it up. It may return an XML
>>>> document
>>>> with many sub elelements that allows the standard Iterator to work.
>>>> Similar
>>>> to (2) but using another service rather than Java to split document.
>>>> 5) Using the approach proposed now - just submit the file straight
>>>> (stream
>>>> it) to another web service that chops it up but calls a Synapse proxy
>>>> with
>>>> each small packet of data that then forwards it to the final WEb
>>>> Service. So
>>>> the Web Service iterates across the data; and not Synapse.
>>>>
>>>> Then other solutions replace Synapse with a stand alone Java program at
>>>> the
>>>> front end.
>>>>
>>>> Another issue here is throttling: Splitting the file is one issues but
>>>> submitting 100's of calls in parralel to the destination service would
>>>> result in time outs... So need to work in throttling.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Ruwan Linton wrote:
>>>>>
>>>>> I agree and can understand the time factor and also +1 for reusing
>>>>> stuff
>>>>> than trying to invent the wheel again :-)
>>>>>
>>>>> Thanks,
>>>>> Ruwan
>>>>>
>>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>>> <an...@gmail.com>wrote:
>>>>>
>>>>>> Ruwan,
>>>>>>
>>>>>> It's not a question of possibility, it is a question of available time
>>>>>> :-)
>>>>>>
>>>>>> Also note that some of the features that we might want to implement
>>>>>> have some similarities with what is done for attachments in Axiom
>>>>>> (except that an attachment is only available once, while a file over
>>>>>> VFS can be read several times). I think there is also some existing
>>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>>> things but try to make the existing code reusable. This however is
>>>>>> only realistic for the next release after 1.3.
>>>>>>
>>>>>> Andreas
>>>>>>
>>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>>> wrote:
>>>>>> > Andreas,
>>>>>> >
>>>>>> > Can we have the caching at the file system as a property to support
>>>>>> the
>>>>>> > multiple layers touching the full message and is it possible make it
>>>>>> to
>>>>>> > specify a threshold for streaming? For example if the message is
>>>>>> touched
>>>>>> > several time we might still need streaming but not for the 100KB or
>>>>>> lesser
>>>>>> > files.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Ruwan
>>>>>> >
>>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>>> andreas.veithen@gmail.com>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> I've done an initial implementation of this feature. It is
>>>>>> available
>>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>>> to
>>>>>> >> enable this in your configuration, you need to add the following
>>>>>> >> property to the proxy:
>>>>>> >>
>>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>>> >>
>>>>>> >> You also need to add the following mediators just before the <send>
>>>>>> >> mediator:
>>>>>> >>
>>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>>> scope="axis2"/>
>>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>>> >>
>>>>>> >> With this configuration Synapse will stream the data directly from
>>>>>> the
>>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>>> in
>>>>>> >> a temporary file. Note that this has two other side effects:
>>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>>> only
>>>>>> >> be opened on demand. In this case this happens during execution of
>>>>>> the
>>>>>> >> <send> mediator.
>>>>>> >> * If during the mediation the content of the file is needed several
>>>>>> >> time (which is not the case in your example), it will be read
>>>>>> several
>>>>>> >> times. The reason is of course that the content is not cached.
>>>>>> >>
>>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>>> the
>>>>>> >> memory consumption is constant.
>>>>>> >>
>>>>>> >> Some additional comments:
>>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>>> SOAP
>>>>>> >> processing: this type of content is processed exactly as before.
>>>>>> >> * With the changes described here, we have now two different
>>>>>> policies
>>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>>> no
>>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>>> >> should define a wider range of policies in the future, including
>>>>>> file
>>>>>> >> system caching + streaming.
>>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>>> mediator
>>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>>> >> transport in a separate thread. This property is set by the
>>>>>> incoming
>>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>>> >> why the transport that handles the incoming request should
>>>>>> determine
>>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>>> >>
>>>>>> >> Andreas
>>>>>> >>
>>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>>> wrote:
>>>>>> >> >
>>>>>> >> > Thats good; as this stops us using Synapse.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> >> > Asankha C. Perera wrote:
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>>> Java
>>>>>> >> >>> heap
>>>>>> >> >>> space
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> >> >>>
>>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>>> >> >>>         at
>>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>>> >> >>>         at
>>>>>> >> >>>
>>>>>> >> >>>
>>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>>> >> >>>
>>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>>> trying
>>>>>> to
>>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>>> content..
>>>>>> >> >>
>>>>>> >> >> A definite bug we need to fix ..
>>>>>> >> >>
>>>>>> >> >> cheers
>>>>>> >> >> asankha
>>>>>> >> >>
>>>>>> >> >> --
>>>>>> >> >> Asankha C. Perera
>>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>>> >> >>
>>>>>> >> >> http://esbmagic.blogspot.com
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >>
>>>>>> >> >
>>>>>> >> > --
>>>>>> >> > View this message in context:
>>>>>> >> >
>>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>>> >> >
>>>>>> >> >
>>>>>> >> >
>>>>>> ---------------------------------------------------------------------
>>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >> >
>>>>>> >> >
>>>>>> >>
>>>>>> >>
>>>>>> ---------------------------------------------------------------------
>>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Ruwan Linton
>>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>>> > http://ruwansblog.blogspot.com/
>>>>>> >
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ruwan Linton
>>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> http://ruwansblog.blogspot.com/
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by kimhorn <ki...@icsglobal.net>.
Using the last stable build from 15 March 2009 I still get exactly same
behaviour as originally
described with the above script. VFS still just dies. Would your fixes be in
this ?

Using the last st

Andreas Veithen-2 wrote:
> 
> I committed the code and it will be available in the next WS-Commons
> transport build. The methods are located in
> org.apache.axis2.format.ElementHelper in the axis2-transport-base
> module.
> 
> Andreas
> 
> On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
>> Hello Andreas,
>> This is great and really helps, have not had time to try it out but will
>> soon.
>>
>> Contributing the java.io.Reader would be a great help but it will take me
>> a while to get up to speed to do the Synapse iterator.
>>
>> In the short term I am going to use a brute force approach that is now
>> feasible given the memory issue is resolved. Just thought of this one
>> today. Use VFS proxy to FTP file locally; so streaming helps here. A
>> POJOCommand on <out> to split file into another directory, stream in and
>> out. Another independent VFS proxy watches that directory and submits
>> each file to Web service. Hopefully memory will be fine. Overloading the
>> destination may still be an issue ?
>>
>> Kim
>>
>>
>>
>> -----Original Message-----
>> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
>> Sent: Monday, 9 March 2009 10:55 PM
>> To: dev@synapse.apache.org
>> Subject: Re: VFS - Synapse Memory Leak
>>
>> The changes I did in the VFS transport and the message builders for
>> text/plain and application/octet-stream certainly don't provide an
>> out-of-the-box solution for your use case, but they are the
>> prerequisite.
>>
>> Concerning your first proposed solution (let the VFS write the content
>> to a temporary file), I don't like this because it would create a
>> tight coupling between the VFS transport and the mediator. A design
>> goal should be that the solution will still work if the file comes
>> from another source, e.g. an attachment in an MTOM or SwA message.
>>
>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>> this will require development of a custom mediator. This mediator
>> would read the content, split it up (and store the chunks in memory or
>> an disk) and executes a sub-sequence for each chunk. The execution of
>> the sub-sequence would happen synchronously to limit the memory/disk
>> space consumption (to the maximum chunk size) and to avoid flooding
>> the destination service.
>>
>> Note that it is probably not possible to implemented the mediator
>> using a script because of the problematic String handling. Also,
>> Spring, POJO and class mediators don't support sub-sequences (I
>> think). Therefore it should be implemented as a full-featured Java
>> mediator, probably taking the existing iterate mediator as a template.
>> I can contribute the required code to get the text content in the form
>> of a java.io.Reader.
>>
>> Regards,
>>
>> Andreas
>>
>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>>
>>> Although this is a good feature it may not solve the actual problem ?
>>> The main first issue on my list was the memory leak.
>>> However, the real problem is once I get this massive files I  have to
>>> send
>>> it to a web Service that can only take it in small chunks (about 14MB) .
>>> Streaming it straight out would just kill the destination Web service.
>>> It
>>> would get the memory error. The text document can be split apart easily,
>>> as
>>> it has independant records on each line seperated by <CR> <LF>.
>>>
>>> In an earlier post; that was not responded too, I mentioned:
>>>
>>> "Otherwise; for large EDI files a VFS iterator Mediator that streams
>>> through
>>> input file and outputs smaller
>>> chunks for processing, in Synapse, may be a solution ? "
>>>
>>> So I had mentioned a few solutions, in prior posts, solution now are:
>>>
>>> 1) VFS writes straight to temporary file, then a Java mediator can
>>> process
>>> the file by splitting it into many smaller files. These files then
>>> trigger
>>> another VFS proxy that submits these to the final web Service.
>>> The problem is is that is uses the file system (not so bad).
>>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>>> into many XML <data> elements that can then be acted on by a Synapse
>>> Iterator. So replace the text message with many smaller XML elements.
>>> Problem is that this loads whole message into memory.
>>> 3) Create another Iterator in Synapse that works on Regular expression
>>> (to
>>> split the text data) or actually uses a for loop approach to chop the
>>> file
>>> into chunks based on the loop index value. E.g. Index = 23 means a 14K
>>> chunk
>>> 23 chunks into the data.
>>> 4) Using the approach proposed now - just submit the file straight
>>> (stream
>>> it) to another web service that chops it up. It may return an XML
>>> document
>>> with many sub elelements that allows the standard Iterator to work.
>>> Similar
>>> to (2) but using another service rather than Java to split document.
>>> 5) Using the approach proposed now - just submit the file straight
>>> (stream
>>> it) to another web service that chops it up but calls a Synapse proxy
>>> with
>>> each small packet of data that then forwards it to the final WEb
>>> Service. So
>>> the Web Service iterates across the data; and not Synapse.
>>>
>>> Then other solutions replace Synapse with a stand alone Java program at
>>> the
>>> front end.
>>>
>>> Another issue here is throttling: Splitting the file is one issues but
>>> submitting 100's of calls in parralel to the destination service would
>>> result in time outs... So need to work in throttling.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> Ruwan Linton wrote:
>>>>
>>>> I agree and can understand the time factor and also +1 for reusing
>>>> stuff
>>>> than trying to invent the wheel again :-)
>>>>
>>>> Thanks,
>>>> Ruwan
>>>>
>>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>>> <an...@gmail.com>wrote:
>>>>
>>>>> Ruwan,
>>>>>
>>>>> It's not a question of possibility, it is a question of available time
>>>>> :-)
>>>>>
>>>>> Also note that some of the features that we might want to implement
>>>>> have some similarities with what is done for attachments in Axiom
>>>>> (except that an attachment is only available once, while a file over
>>>>> VFS can be read several times). I think there is also some existing
>>>>> code in Axis2 that might be useful. We should not reimplement these
>>>>> things but try to make the existing code reusable. This however is
>>>>> only realistic for the next release after 1.3.
>>>>>
>>>>> Andreas
>>>>>
>>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>>> wrote:
>>>>> > Andreas,
>>>>> >
>>>>> > Can we have the caching at the file system as a property to support
>>>>> the
>>>>> > multiple layers touching the full message and is it possible make it
>>>>> to
>>>>> > specify a threshold for streaming? For example if the message is
>>>>> touched
>>>>> > several time we might still need streaming but not for the 100KB or
>>>>> lesser
>>>>> > files.
>>>>> >
>>>>> > Thanks,
>>>>> > Ruwan
>>>>> >
>>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>>> andreas.veithen@gmail.com>
>>>>> > wrote:
>>>>> >>
>>>>> >> I've done an initial implementation of this feature. It is
>>>>> available
>>>>> >> in trunk and should be included in the next nightly build. In order
>>>>> to
>>>>> >> enable this in your configuration, you need to add the following
>>>>> >> property to the proxy:
>>>>> >>
>>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>>> >>
>>>>> >> You also need to add the following mediators just before the <send>
>>>>> >> mediator:
>>>>> >>
>>>>> >> <property action="remove" name="transportNonBlocking"
>>>>> scope="axis2"/>
>>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>>> >>
>>>>> >> With this configuration Synapse will stream the data directly from
>>>>> the
>>>>> >> incoming to the outgoing transport without storing it in memory or
>>>>> in
>>>>> >> a temporary file. Note that this has two other side effects:
>>>>> >> * The incoming file (or connection in case of a remote file) will
>>>>> only
>>>>> >> be opened on demand. In this case this happens during execution of
>>>>> the
>>>>> >> <send> mediator.
>>>>> >> * If during the mediation the content of the file is needed several
>>>>> >> time (which is not the case in your example), it will be read
>>>>> several
>>>>> >> times. The reason is of course that the content is not cached.
>>>>> >>
>>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>>> >> performance of the implementation is not yet optimal, but at least
>>>>> the
>>>>> >> memory consumption is constant.
>>>>> >>
>>>>> >> Some additional comments:
>>>>> >> * The transport.vfs.Streaming property has no impact on XML and
>>>>> SOAP
>>>>> >> processing: this type of content is processed exactly as before.
>>>>> >> * With the changes described here, we have now two different
>>>>> policies
>>>>> >> for plain text and binary content processing: in-memory caching +
>>>>> no
>>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>>> >> should define a wider range of policies in the future, including
>>>>> file
>>>>> >> system caching + streaming.
>>>>> >> * It is necessary to remove the transportNonBlocking property
>>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>>>>> mediator
>>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>>> >> transport in a separate thread. This property is set by the
>>>>> incoming
>>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>>> >> why the transport that handles the incoming request should
>>>>> determine
>>>>> >> the threading behavior of the transport that sends the outgoing
>>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>>> >>
>>>>> >> Andreas
>>>>> >>
>>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>>>>> wrote:
>>>>> >> >
>>>>> >> > Thats good; as this stops us using Synapse.
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> >> > Asankha C. Perera wrote:
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>>> Java
>>>>> >> >>> heap
>>>>> >> >>> space
>>>>> >> >>>         at
>>>>> >> >>>
>>>>> >> >>>
>>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>>> >> >>>         at
>>>>> >> >>>
>>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>>> >> >>>         at
>>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>>> >> >>>         at
>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>>> >> >>>         at
>>>>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>>> >> >>>         at
>>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>>> >> >>>         at
>>>>> >> >>>
>>>>> >> >>>
>>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>>> >> >>>
>>>>> >> >> Since the content type is text, the plain text formatter is
>>>>> trying
>>>>> to
>>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>>> content..
>>>>> >> >>
>>>>> >> >> A definite bug we need to fix ..
>>>>> >> >>
>>>>> >> >> cheers
>>>>> >> >> asankha
>>>>> >> >>
>>>>> >> >> --
>>>>> >> >> Asankha C. Perera
>>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>>> >> >>
>>>>> >> >> http://esbmagic.blogspot.com
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> ---------------------------------------------------------------------
>>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>> >> >>
>>>>> >> >>
>>>>> >> >>
>>>>> >> >
>>>>> >> > --
>>>>> >> > View this message in context:
>>>>> >> >
>>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>>> >> >
>>>>> >> >
>>>>> >> >
>>>>> ---------------------------------------------------------------------
>>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>>> >> >
>>>>> >> >
>>>>> >>
>>>>> >>
>>>>> ---------------------------------------------------------------------
>>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>> >>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Ruwan Linton
>>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>>> > http://ruwansblog.blogspot.com/
>>>>> >
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Ruwan Linton
>>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>>> http://ruwansblog.blogspot.com/
>>>>
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22594321.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
I committed the code and it will be available in the next WS-Commons
transport build. The methods are located in
org.apache.axis2.format.ElementHelper in the axis2-transport-base
module.

Andreas

On Thu, Mar 12, 2009 at 00:06, Kim Horn <ki...@icsglobal.net> wrote:
> Hello Andreas,
> This is great and really helps, have not had time to try it out but will soon.
>
> Contributing the java.io.Reader would be a great help but it will take me a while to get up to speed to do the Synapse iterator.
>
> In the short term I am going to use a brute force approach that is now feasible given the memory issue is resolved. Just thought of this one today. Use VFS proxy to FTP file locally; so streaming helps here. A POJOCommand on <out> to split file into another directory, stream in and out. Another independent VFS proxy watches that directory and submits each file to Web service. Hopefully memory will be fine. Overloading the destination may still be an issue ?
>
> Kim
>
>
>
> -----Original Message-----
> From: Andreas Veithen [mailto:andreas.veithen@gmail.com]
> Sent: Monday, 9 March 2009 10:55 PM
> To: dev@synapse.apache.org
> Subject: Re: VFS - Synapse Memory Leak
>
> The changes I did in the VFS transport and the message builders for
> text/plain and application/octet-stream certainly don't provide an
> out-of-the-box solution for your use case, but they are the
> prerequisite.
>
> Concerning your first proposed solution (let the VFS write the content
> to a temporary file), I don't like this because it would create a
> tight coupling between the VFS transport and the mediator. A design
> goal should be that the solution will still work if the file comes
> from another source, e.g. an attachment in an MTOM or SwA message.
>
> I thing that an all-Synapse solution (2 or 3) should be possible, but
> this will require development of a custom mediator. This mediator
> would read the content, split it up (and store the chunks in memory or
> an disk) and executes a sub-sequence for each chunk. The execution of
> the sub-sequence would happen synchronously to limit the memory/disk
> space consumption (to the maximum chunk size) and to avoid flooding
> the destination service.
>
> Note that it is probably not possible to implemented the mediator
> using a script because of the problematic String handling. Also,
> Spring, POJO and class mediators don't support sub-sequences (I
> think). Therefore it should be implemented as a full-featured Java
> mediator, probably taking the existing iterate mediator as a template.
> I can contribute the required code to get the text content in the form
> of a java.io.Reader.
>
> Regards,
>
> Andreas
>
> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>>
>> Although this is a good feature it may not solve the actual problem ?
>> The main first issue on my list was the memory leak.
>> However, the real problem is once I get this massive files I  have to send
>> it to a web Service that can only take it in small chunks (about 14MB) .
>> Streaming it straight out would just kill the destination Web service. It
>> would get the memory error. The text document can be split apart easily, as
>> it has independant records on each line seperated by <CR> <LF>.
>>
>> In an earlier post; that was not responded too, I mentioned:
>>
>> "Otherwise; for large EDI files a VFS iterator Mediator that streams through
>> input file and outputs smaller
>> chunks for processing, in Synapse, may be a solution ? "
>>
>> So I had mentioned a few solutions, in prior posts, solution now are:
>>
>> 1) VFS writes straight to temporary file, then a Java mediator can process
>> the file by splitting it into many smaller files. These files then trigger
>> another VFS proxy that submits these to the final web Service.
>> The problem is is that is uses the file system (not so bad).
>> 2) A Java Mediator takes the <text> package and splits it up by wrapping
>> into many XML <data> elements that can then be acted on by a Synapse
>> Iterator. So replace the text message with many smaller XML elements.
>> Problem is that this loads whole message into memory.
>> 3) Create another Iterator in Synapse that works on Regular expression (to
>> split the text data) or actually uses a for loop approach to chop the file
>> into chunks based on the loop index value. E.g. Index = 23 means a 14K chunk
>> 23 chunks into the data.
>> 4) Using the approach proposed now - just submit the file straight (stream
>> it) to another web service that chops it up. It may return an XML document
>> with many sub elelements that allows the standard Iterator to work. Similar
>> to (2) but using another service rather than Java to split document.
>> 5) Using the approach proposed now - just submit the file straight (stream
>> it) to another web service that chops it up but calls a Synapse proxy with
>> each small packet of data that then forwards it to the final WEb Service. So
>> the Web Service iterates across the data; and not Synapse.
>>
>> Then other solutions replace Synapse with a stand alone Java program at the
>> front end.
>>
>> Another issue here is throttling: Splitting the file is one issues but
>> submitting 100's of calls in parralel to the destination service would
>> result in time outs... So need to work in throttling.
>>
>>
>>
>>
>>
>>
>>
>>
>> Ruwan Linton wrote:
>>>
>>> I agree and can understand the time factor and also +1 for reusing stuff
>>> than trying to invent the wheel again :-)
>>>
>>> Thanks,
>>> Ruwan
>>>
>>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>>> <an...@gmail.com>wrote:
>>>
>>>> Ruwan,
>>>>
>>>> It's not a question of possibility, it is a question of available time
>>>> :-)
>>>>
>>>> Also note that some of the features that we might want to implement
>>>> have some similarities with what is done for attachments in Axiom
>>>> (except that an attachment is only available once, while a file over
>>>> VFS can be read several times). I think there is also some existing
>>>> code in Axis2 that might be useful. We should not reimplement these
>>>> things but try to make the existing code reusable. This however is
>>>> only realistic for the next release after 1.3.
>>>>
>>>> Andreas
>>>>
>>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>>> wrote:
>>>> > Andreas,
>>>> >
>>>> > Can we have the caching at the file system as a property to support the
>>>> > multiple layers touching the full message and is it possible make it to
>>>> > specify a threshold for streaming? For example if the message is
>>>> touched
>>>> > several time we might still need streaming but not for the 100KB or
>>>> lesser
>>>> > files.
>>>> >
>>>> > Thanks,
>>>> > Ruwan
>>>> >
>>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>>> andreas.veithen@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> I've done an initial implementation of this feature. It is available
>>>> >> in trunk and should be included in the next nightly build. In order to
>>>> >> enable this in your configuration, you need to add the following
>>>> >> property to the proxy:
>>>> >>
>>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>>> >>
>>>> >> You also need to add the following mediators just before the <send>
>>>> >> mediator:
>>>> >>
>>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>>> >>
>>>> >> With this configuration Synapse will stream the data directly from the
>>>> >> incoming to the outgoing transport without storing it in memory or in
>>>> >> a temporary file. Note that this has two other side effects:
>>>> >> * The incoming file (or connection in case of a remote file) will only
>>>> >> be opened on demand. In this case this happens during execution of the
>>>> >> <send> mediator.
>>>> >> * If during the mediation the content of the file is needed several
>>>> >> time (which is not the case in your example), it will be read several
>>>> >> times. The reason is of course that the content is not cached.
>>>> >>
>>>> >> I tested the solution with a 2GB file and it worked fine. The
>>>> >> performance of the implementation is not yet optimal, but at least the
>>>> >> memory consumption is constant.
>>>> >>
>>>> >> Some additional comments:
>>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>>> >> processing: this type of content is processed exactly as before.
>>>> >> * With the changes described here, we have now two different policies
>>>> >> for plain text and binary content processing: in-memory caching + no
>>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>>> >> should define a wider range of policies in the future, including file
>>>> >> system caching + streaming.
>>>> >> * It is necessary to remove the transportNonBlocking property
>>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>>> >> (more precisely the OperationClient) from executing the outgoing
>>>> >> transport in a separate thread. This property is set by the incoming
>>>> >> transport. I think this is a bug since I don't see any valid reason
>>>> >> why the transport that handles the incoming request should determine
>>>> >> the threading behavior of the transport that sends the outgoing
>>>> >> request to the target service. Maybe Asankha can comment on this?
>>>> >>
>>>> >> Andreas
>>>> >>
>>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>>>> >> >
>>>> >> > Thats good; as this stops us using Synapse.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > Asankha C. Perera wrote:
>>>> >> >>
>>>> >> >>
>>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>>> Java
>>>> >> >>> heap
>>>> >> >>> space
>>>> >> >>>         at
>>>> >> >>>
>>>> >> >>>
>>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>> >> >>>         at
>>>> >> >>>
>>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>> >> >>>         at
>>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>> >> >>>         at
>>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>> >> >>>         at
>>>> >> >>>
>>>> >> >>>
>>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>> >> >>>
>>>> >> >> Since the content type is text, the plain text formatter is trying
>>>> to
>>>> >> >> use a String to parse as I see.. which is a problem for large
>>>> content..
>>>> >> >>
>>>> >> >> A definite bug we need to fix ..
>>>> >> >>
>>>> >> >> cheers
>>>> >> >> asankha
>>>> >> >>
>>>> >> >> --
>>>> >> >> Asankha C. Perera
>>>> >> >> AdroitLogic, http://adroitlogic.org
>>>> >> >>
>>>> >> >> http://esbmagic.blogspot.com
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> ---------------------------------------------------------------------
>>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>> >> >>
>>>> >> >>
>>>> >> >>
>>>> >> >
>>>> >> > --
>>>> >> > View this message in context:
>>>> >> >
>>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> ---------------------------------------------------------------------
>>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>>> >> >
>>>> >> >
>>>> >>
>>>> >> ---------------------------------------------------------------------
>>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Ruwan Linton
>>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>>> > http://ruwansblog.blogspot.com/
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>>
>>>>
>>>
>>>
>>> --
>>> Ruwan Linton
>>> http://wso2.org - "Oxygenating the Web Services Platform"
>>> http://ruwansblog.blogspot.com/
>>>
>>>
>>
>> --
>> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by Kim Horn <ki...@icsglobal.net>.
Hello Andreas,
This is great and really helps, have not had time to try it out but will soon.

Contributing the java.io.Reader would be a great help but it will take me a while to get up to speed to do the Synapse iterator.

In the short term I am going to use a brute force approach that is now feasible given the memory issue is resolved. Just thought of this one today. Use VFS proxy to FTP file locally; so streaming helps here. A POJOCommand on <out> to split file into another directory, stream in and out. Another independent VFS proxy watches that directory and submits each file to Web service. Hopefully memory will be fine. Overloading the destination may still be an issue ?

Kim



-----Original Message-----
From: Andreas Veithen [mailto:andreas.veithen@gmail.com] 
Sent: Monday, 9 March 2009 10:55 PM
To: dev@synapse.apache.org
Subject: Re: VFS - Synapse Memory Leak

The changes I did in the VFS transport and the message builders for
text/plain and application/octet-stream certainly don't provide an
out-of-the-box solution for your use case, but they are the
prerequisite.

Concerning your first proposed solution (let the VFS write the content
to a temporary file), I don't like this because it would create a
tight coupling between the VFS transport and the mediator. A design
goal should be that the solution will still work if the file comes
from another source, e.g. an attachment in an MTOM or SwA message.

I thing that an all-Synapse solution (2 or 3) should be possible, but
this will require development of a custom mediator. This mediator
would read the content, split it up (and store the chunks in memory or
an disk) and executes a sub-sequence for each chunk. The execution of
the sub-sequence would happen synchronously to limit the memory/disk
space consumption (to the maximum chunk size) and to avoid flooding
the destination service.

Note that it is probably not possible to implemented the mediator
using a script because of the problematic String handling. Also,
Spring, POJO and class mediators don't support sub-sequences (I
think). Therefore it should be implemented as a full-featured Java
mediator, probably taking the existing iterate mediator as a template.
I can contribute the required code to get the text content in the form
of a java.io.Reader.

Regards,

Andreas

On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>
> Although this is a good feature it may not solve the actual problem ?
> The main first issue on my list was the memory leak.
> However, the real problem is once I get this massive files I  have to send
> it to a web Service that can only take it in small chunks (about 14MB) .
> Streaming it straight out would just kill the destination Web service. It
> would get the memory error. The text document can be split apart easily, as
> it has independant records on each line seperated by <CR> <LF>.
>
> In an earlier post; that was not responded too, I mentioned:
>
> "Otherwise; for large EDI files a VFS iterator Mediator that streams through
> input file and outputs smaller
> chunks for processing, in Synapse, may be a solution ? "
>
> So I had mentioned a few solutions, in prior posts, solution now are:
>
> 1) VFS writes straight to temporary file, then a Java mediator can process
> the file by splitting it into many smaller files. These files then trigger
> another VFS proxy that submits these to the final web Service.
> The problem is is that is uses the file system (not so bad).
> 2) A Java Mediator takes the <text> package and splits it up by wrapping
> into many XML <data> elements that can then be acted on by a Synapse
> Iterator. So replace the text message with many smaller XML elements.
> Problem is that this loads whole message into memory.
> 3) Create another Iterator in Synapse that works on Regular expression (to
> split the text data) or actually uses a for loop approach to chop the file
> into chunks based on the loop index value. E.g. Index = 23 means a 14K chunk
> 23 chunks into the data.
> 4) Using the approach proposed now - just submit the file straight (stream
> it) to another web service that chops it up. It may return an XML document
> with many sub elelements that allows the standard Iterator to work. Similar
> to (2) but using another service rather than Java to split document.
> 5) Using the approach proposed now - just submit the file straight (stream
> it) to another web service that chops it up but calls a Synapse proxy with
> each small packet of data that then forwards it to the final WEb Service. So
> the Web Service iterates across the data; and not Synapse.
>
> Then other solutions replace Synapse with a stand alone Java program at the
> front end.
>
> Another issue here is throttling: Splitting the file is one issues but
> submitting 100's of calls in parralel to the destination service would
> result in time outs... So need to work in throttling.
>
>
>
>
>
>
>
>
> Ruwan Linton wrote:
>>
>> I agree and can understand the time factor and also +1 for reusing stuff
>> than trying to invent the wheel again :-)
>>
>> Thanks,
>> Ruwan
>>
>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>> <an...@gmail.com>wrote:
>>
>>> Ruwan,
>>>
>>> It's not a question of possibility, it is a question of available time
>>> :-)
>>>
>>> Also note that some of the features that we might want to implement
>>> have some similarities with what is done for attachments in Axiom
>>> (except that an attachment is only available once, while a file over
>>> VFS can be read several times). I think there is also some existing
>>> code in Axis2 that might be useful. We should not reimplement these
>>> things but try to make the existing code reusable. This however is
>>> only realistic for the next release after 1.3.
>>>
>>> Andreas
>>>
>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>> wrote:
>>> > Andreas,
>>> >
>>> > Can we have the caching at the file system as a property to support the
>>> > multiple layers touching the full message and is it possible make it to
>>> > specify a threshold for streaming? For example if the message is
>>> touched
>>> > several time we might still need streaming but not for the 100KB or
>>> lesser
>>> > files.
>>> >
>>> > Thanks,
>>> > Ruwan
>>> >
>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>> andreas.veithen@gmail.com>
>>> > wrote:
>>> >>
>>> >> I've done an initial implementation of this feature. It is available
>>> >> in trunk and should be included in the next nightly build. In order to
>>> >> enable this in your configuration, you need to add the following
>>> >> property to the proxy:
>>> >>
>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>> >>
>>> >> You also need to add the following mediators just before the <send>
>>> >> mediator:
>>> >>
>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>> >>
>>> >> With this configuration Synapse will stream the data directly from the
>>> >> incoming to the outgoing transport without storing it in memory or in
>>> >> a temporary file. Note that this has two other side effects:
>>> >> * The incoming file (or connection in case of a remote file) will only
>>> >> be opened on demand. In this case this happens during execution of the
>>> >> <send> mediator.
>>> >> * If during the mediation the content of the file is needed several
>>> >> time (which is not the case in your example), it will be read several
>>> >> times. The reason is of course that the content is not cached.
>>> >>
>>> >> I tested the solution with a 2GB file and it worked fine. The
>>> >> performance of the implementation is not yet optimal, but at least the
>>> >> memory consumption is constant.
>>> >>
>>> >> Some additional comments:
>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>> >> processing: this type of content is processed exactly as before.
>>> >> * With the changes described here, we have now two different policies
>>> >> for plain text and binary content processing: in-memory caching + no
>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>> >> should define a wider range of policies in the future, including file
>>> >> system caching + streaming.
>>> >> * It is necessary to remove the transportNonBlocking property
>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>> >> (more precisely the OperationClient) from executing the outgoing
>>> >> transport in a separate thread. This property is set by the incoming
>>> >> transport. I think this is a bug since I don't see any valid reason
>>> >> why the transport that handles the incoming request should determine
>>> >> the threading behavior of the transport that sends the outgoing
>>> >> request to the target service. Maybe Asankha can comment on this?
>>> >>
>>> >> Andreas
>>> >>
>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>>> >> >
>>> >> > Thats good; as this stops us using Synapse.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Asankha C. Perera wrote:
>>> >> >>
>>> >> >>
>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>> Java
>>> >> >>> heap
>>> >> >>> space
>>> >> >>>         at
>>> >> >>>
>>> >> >>>
>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>> >> >>>         at
>>> >> >>>
>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>> >> >>>         at
>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>> >> >>>         at
>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>> >> >>>         at
>>> >> >>>
>>> >> >>>
>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>> >> >>>
>>> >> >> Since the content type is text, the plain text formatter is trying
>>> to
>>> >> >> use a String to parse as I see.. which is a problem for large
>>> content..
>>> >> >>
>>> >> >> A definite bug we need to fix ..
>>> >> >>
>>> >> >> cheers
>>> >> >> asankha
>>> >> >>
>>> >> >> --
>>> >> >> Asankha C. Perera
>>> >> >> AdroitLogic, http://adroitlogic.org
>>> >> >>
>>> >> >> http://esbmagic.blogspot.com
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> ---------------------------------------------------------------------
>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >
>>> >> > --
>>> >> > View this message in context:
>>> >> >
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>> >> >
>>> >> >
>>> >> >
>>> ---------------------------------------------------------------------
>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>> >> >
>>> >> >
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Ruwan Linton
>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>> > http://ruwansblog.blogspot.com/
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>>
>> --
>> Ruwan Linton
>> http://wso2.org - "Oxygenating the Web Services Platform"
>> http://ruwansblog.blogspot.com/
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


RE: VFS - Synapse Memory Leak

Posted by "Hubert, Eric" <Er...@foxmobile.com>.
Hi Andreas,

> No, basically the code would take an OMElement and return a Reader
> that represents the text content of that element. It would take care
> of doing this in an optimal way (constant memory usage and minimal
> usage of intermediate buffers), i.e. it would provide the same
> functionality than new StringReader(omElement.getText()), but without
> loading the entire data into memory. We already do something like this
> in the PlainTextFormatter, but here the idea is to encapsulate that
> nicely behind a Reader implementation. Note that this is not at all
> transport specific and would work with any OMElement.

+1 sounds very useful

Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
No, basically the code would take an OMElement and return a Reader
that represents the text content of that element. It would take care
of doing this in an optimal way (constant memory usage and minimal
usage of intermediate buffers), i.e. it would provide the same
functionality than new StringReader(omElement.getText()), but without
loading the entire data into memory. We already do something like this
in the PlainTextFormatter, but here the idea is to encapsulate that
nicely behind a Reader implementation. Note that this is not at all
transport specific and would work with any OMElement.

Having this piece of code means that the streaming aspect is done and
that the problem is reduced to the implementation of a
split-iterate-callout mediation/mediator. (I'm trying to decompose the
original problem into smaller pieces that can be reused elsewhere)

Andreas

On Mon, Mar 9, 2009 at 16:53, Ruwan Linton <ru...@gmail.com> wrote:
> Andreas,
>
> On Mon, Mar 9, 2009 at 5:24 PM, Andreas Veithen <an...@gmail.com>
> wrote:
>>
>> The changes I did in the VFS transport and the message builders for
>> text/plain and application/octet-stream certainly don't provide an
>> out-of-the-box solution for your use case, but they are the
>> prerequisite.
>>
>> Concerning your first proposed solution (let the VFS write the content
>> to a temporary file), I don't like this because it would create a
>> tight coupling between the VFS transport and the mediator. A design
>> goal should be that the solution will still work if the file comes
>> from another source, e.g. an attachment in an MTOM or SwA message.
>>
>> I thing that an all-Synapse solution (2 or 3) should be possible, but
>> this will require development of a custom mediator. This mediator
>> would read the content, split it up (and store the chunks in memory or
>> an disk) and executes a sub-sequence for each chunk. The execution of
>> the sub-sequence would happen synchronously to limit the memory/disk
>> space consumption (to the maximum chunk size) and to avoid flooding
>> the destination service.
>>
>> Note that it is probably not possible to implemented the mediator
>> using a script because of the problematic String handling. Also,
>> Spring, POJO and class mediators don't support sub-sequences (I
>> think). Therefore it should be implemented as a full-featured Java
>> mediator, probably taking the existing iterate mediator as a template.
>> I can contribute the required code to get the text content in the form
>> of a java.io.Reader.
>
> Could you please explain this is bit? do you mean to implement the transport
> to give out text content as a java.io.Reader? If so what is the general
> usage of this except for this particular scenario?
>
> Thanks,
> Ruwan
>
>>
>> Regards,
>>
>> Andreas
>>
>> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>> >
>> > Although this is a good feature it may not solve the actual problem ?
>> > The main first issue on my list was the memory leak.
>> > However, the real problem is once I get this massive files I  have to
>> > send
>> > it to a web Service that can only take it in small chunks (about 14MB) .
>> > Streaming it straight out would just kill the destination Web service.
>> > It
>> > would get the memory error. The text document can be split apart easily,
>> > as
>> > it has independant records on each line seperated by <CR> <LF>.
>> >
>> > In an earlier post; that was not responded too, I mentioned:
>> >
>> > "Otherwise; for large EDI files a VFS iterator Mediator that streams
>> > through
>> > input file and outputs smaller
>> > chunks for processing, in Synapse, may be a solution ? "
>> >
>> > So I had mentioned a few solutions, in prior posts, solution now are:
>> >
>> > 1) VFS writes straight to temporary file, then a Java mediator can
>> > process
>> > the file by splitting it into many smaller files. These files then
>> > trigger
>> > another VFS proxy that submits these to the final web Service.
>> > The problem is is that is uses the file system (not so bad).
>> > 2) A Java Mediator takes the <text> package and splits it up by wrapping
>> > into many XML <data> elements that can then be acted on by a Synapse
>> > Iterator. So replace the text message with many smaller XML elements.
>> > Problem is that this loads whole message into memory.
>> > 3) Create another Iterator in Synapse that works on Regular expression
>> > (to
>> > split the text data) or actually uses a for loop approach to chop the
>> > file
>> > into chunks based on the loop index value. E.g. Index = 23 means a 14K
>> > chunk
>> > 23 chunks into the data.
>> > 4) Using the approach proposed now - just submit the file straight
>> > (stream
>> > it) to another web service that chops it up. It may return an XML
>> > document
>> > with many sub elelements that allows the standard Iterator to work.
>> > Similar
>> > to (2) but using another service rather than Java to split document.
>> > 5) Using the approach proposed now - just submit the file straight
>> > (stream
>> > it) to another web service that chops it up but calls a Synapse proxy
>> > with
>> > each small packet of data that then forwards it to the final WEb
>> > Service. So
>> > the Web Service iterates across the data; and not Synapse.
>> >
>> > Then other solutions replace Synapse with a stand alone Java program at
>> > the
>> > front end.
>> >
>> > Another issue here is throttling: Splitting the file is one issues but
>> > submitting 100's of calls in parralel to the destination service would
>> > result in time outs... So need to work in throttling.
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > Ruwan Linton wrote:
>> >>
>> >> I agree and can understand the time factor and also +1 for reusing
>> >> stuff
>> >> than trying to invent the wheel again :-)
>> >>
>> >> Thanks,
>> >> Ruwan
>> >>
>> >> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>> >> <an...@gmail.com>wrote:
>> >>
>> >>> Ruwan,
>> >>>
>> >>> It's not a question of possibility, it is a question of available time
>> >>> :-)
>> >>>
>> >>> Also note that some of the features that we might want to implement
>> >>> have some similarities with what is done for attachments in Axiom
>> >>> (except that an attachment is only available once, while a file over
>> >>> VFS can be read several times). I think there is also some existing
>> >>> code in Axis2 that might be useful. We should not reimplement these
>> >>> things but try to make the existing code reusable. This however is
>> >>> only realistic for the next release after 1.3.
>> >>>
>> >>> Andreas
>> >>>
>> >>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>> >>> wrote:
>> >>> > Andreas,
>> >>> >
>> >>> > Can we have the caching at the file system as a property to support
>> >>> > the
>> >>> > multiple layers touching the full message and is it possible make it
>> >>> > to
>> >>> > specify a threshold for streaming? For example if the message is
>> >>> touched
>> >>> > several time we might still need streaming but not for the 100KB or
>> >>> lesser
>> >>> > files.
>> >>> >
>> >>> > Thanks,
>> >>> > Ruwan
>> >>> >
>> >>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>> >>> andreas.veithen@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> I've done an initial implementation of this feature. It is
>> >>> >> available
>> >>> >> in trunk and should be included in the next nightly build. In order
>> >>> >> to
>> >>> >> enable this in your configuration, you need to add the following
>> >>> >> property to the proxy:
>> >>> >>
>> >>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>> >>> >>
>> >>> >> You also need to add the following mediators just before the <send>
>> >>> >> mediator:
>> >>> >>
>> >>> >> <property action="remove" name="transportNonBlocking"
>> >>> >> scope="axis2"/>
>> >>> >> <property action="set" name="OUT_ONLY" value="true"/>
>> >>> >>
>> >>> >> With this configuration Synapse will stream the data directly from
>> >>> >> the
>> >>> >> incoming to the outgoing transport without storing it in memory or
>> >>> >> in
>> >>> >> a temporary file. Note that this has two other side effects:
>> >>> >> * The incoming file (or connection in case of a remote file) will
>> >>> >> only
>> >>> >> be opened on demand. In this case this happens during execution of
>> >>> >> the
>> >>> >> <send> mediator.
>> >>> >> * If during the mediation the content of the file is needed several
>> >>> >> time (which is not the case in your example), it will be read
>> >>> >> several
>> >>> >> times. The reason is of course that the content is not cached.
>> >>> >>
>> >>> >> I tested the solution with a 2GB file and it worked fine. The
>> >>> >> performance of the implementation is not yet optimal, but at least
>> >>> >> the
>> >>> >> memory consumption is constant.
>> >>> >>
>> >>> >> Some additional comments:
>> >>> >> * The transport.vfs.Streaming property has no impact on XML and
>> >>> >> SOAP
>> >>> >> processing: this type of content is processed exactly as before.
>> >>> >> * With the changes described here, we have now two different
>> >>> >> policies
>> >>> >> for plain text and binary content processing: in-memory caching +
>> >>> >> no
>> >>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>> >>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>> >>> >> should define a wider range of policies in the future, including
>> >>> >> file
>> >>> >> system caching + streaming.
>> >>> >> * It is necessary to remove the transportNonBlocking property
>> >>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
>> >>> >> mediator
>> >>> >> (more precisely the OperationClient) from executing the outgoing
>> >>> >> transport in a separate thread. This property is set by the
>> >>> >> incoming
>> >>> >> transport. I think this is a bug since I don't see any valid reason
>> >>> >> why the transport that handles the incoming request should
>> >>> >> determine
>> >>> >> the threading behavior of the transport that sends the outgoing
>> >>> >> request to the target service. Maybe Asankha can comment on this?
>> >>> >>
>> >>> >> Andreas
>> >>> >>
>> >>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
>> >>> >> wrote:
>> >>> >> >
>> >>> >> > Thats good; as this stops us using Synapse.
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > Asankha C. Perera wrote:
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>> >>> Java
>> >>> >> >>> heap
>> >>> >> >>> space
>> >>> >> >>>         at
>> >>> >> >>>
>> >>> >> >>>
>> >>>
>> >>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>> >>> >> >>>         at
>> >>> >> >>>
>> >>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>> >>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>> >>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>> >>> >> >>>         at
>> >>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>> >>> >> >>>         at
>> >>> >> >>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>> >>> >> >>>         at
>> >>> >> >>> org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>> >>> >> >>>         at
>> >>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>> >>> >> >>>         at
>> >>> >> >>>
>> >>> >> >>>
>> >>>
>> >>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>> >>> >> >>>
>> >>> >> >> Since the content type is text, the plain text formatter is
>> >>> >> >> trying
>> >>> to
>> >>> >> >> use a String to parse as I see.. which is a problem for large
>> >>> content..
>> >>> >> >>
>> >>> >> >> A definite bug we need to fix ..
>> >>> >> >>
>> >>> >> >> cheers
>> >>> >> >> asankha
>> >>> >> >>
>> >>> >> >> --
>> >>> >> >> Asankha C. Perera
>> >>> >> >> AdroitLogic, http://adroitlogic.org
>> >>> >> >>
>> >>> >> >> http://esbmagic.blogspot.com
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> ---------------------------------------------------------------------
>> >>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>> >>> >> >>
>> >>> >> >>
>> >>> >> >>
>> >>> >> >
>> >>> >> > --
>> >>> >> > View this message in context:
>> >>> >> >
>> >>>
>> >>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>> >>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> ---------------------------------------------------------------------
>> >>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>> >>> >> >
>> >>> >> >
>> >>> >>
>> >>> >>
>> >>> >> ---------------------------------------------------------------------
>> >>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>> >>> >>
>> >>> >
>> >>> >
>> >>> >
>> >>> > --
>> >>> > Ruwan Linton
>> >>> > http://wso2.org - "Oxygenating the Web Services Platform"
>> >>> > http://ruwansblog.blogspot.com/
>> >>> >
>> >>>
>> >>> ---------------------------------------------------------------------
>> >>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >>> For additional commands, e-mail: dev-help@synapse.apache.org
>> >>>
>> >>>
>> >>
>> >>
>> >> --
>> >> Ruwan Linton
>> >> http://wso2.org - "Oxygenating the Web Services Platform"
>> >> http://ruwansblog.blogspot.com/
>> >>
>> >>
>> >
>> > --
>> > View this message in context:
>> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
>> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> > For additional commands, e-mail: dev-help@synapse.apache.org
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>
>
>
> --
> Ruwan Linton
> http://wso2.org - "Oxygenating the Web Services Platform"
> http://ruwansblog.blogspot.com/
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Ruwan Linton <ru...@gmail.com>.
Andreas,

On Mon, Mar 9, 2009 at 5:24 PM, Andreas Veithen
<an...@gmail.com>wrote:

> The changes I did in the VFS transport and the message builders for
> text/plain and application/octet-stream certainly don't provide an
> out-of-the-box solution for your use case, but they are the
> prerequisite.
>
> Concerning your first proposed solution (let the VFS write the content
> to a temporary file), I don't like this because it would create a
> tight coupling between the VFS transport and the mediator. A design
> goal should be that the solution will still work if the file comes
> from another source, e.g. an attachment in an MTOM or SwA message.
>
> I thing that an all-Synapse solution (2 or 3) should be possible, but
> this will require development of a custom mediator. This mediator
> would read the content, split it up (and store the chunks in memory or
> an disk) and executes a sub-sequence for each chunk. The execution of
> the sub-sequence would happen synchronously to limit the memory/disk
> space consumption (to the maximum chunk size) and to avoid flooding
> the destination service.
>
> Note that it is probably not possible to implemented the mediator
> using a script because of the problematic String handling. Also,
> Spring, POJO and class mediators don't support sub-sequences (I
> think). Therefore it should be implemented as a full-featured Java
> mediator, probably taking the existing iterate mediator as a template.
> I can contribute the required code to get the text content in the form
> of a java.io.Reader.


Could you please explain this is bit? do you mean to implement the transport
to give out text content as a java.io.Reader? If so what is the general
usage of this except for this particular scenario?

Thanks,
Ruwan


>
>
> Regards,
>
> Andreas
>
> On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
> >
> > Although this is a good feature it may not solve the actual problem ?
> > The main first issue on my list was the memory leak.
> > However, the real problem is once I get this massive files I  have to
> send
> > it to a web Service that can only take it in small chunks (about 14MB) .
> > Streaming it straight out would just kill the destination Web service. It
> > would get the memory error. The text document can be split apart easily,
> as
> > it has independant records on each line seperated by <CR> <LF>.
> >
> > In an earlier post; that was not responded too, I mentioned:
> >
> > "Otherwise; for large EDI files a VFS iterator Mediator that streams
> through
> > input file and outputs smaller
> > chunks for processing, in Synapse, may be a solution ? "
> >
> > So I had mentioned a few solutions, in prior posts, solution now are:
> >
> > 1) VFS writes straight to temporary file, then a Java mediator can
> process
> > the file by splitting it into many smaller files. These files then
> trigger
> > another VFS proxy that submits these to the final web Service.
> > The problem is is that is uses the file system (not so bad).
> > 2) A Java Mediator takes the <text> package and splits it up by wrapping
> > into many XML <data> elements that can then be acted on by a Synapse
> > Iterator. So replace the text message with many smaller XML elements.
> > Problem is that this loads whole message into memory.
> > 3) Create another Iterator in Synapse that works on Regular expression
> (to
> > split the text data) or actually uses a for loop approach to chop the
> file
> > into chunks based on the loop index value. E.g. Index = 23 means a 14K
> chunk
> > 23 chunks into the data.
> > 4) Using the approach proposed now - just submit the file straight
> (stream
> > it) to another web service that chops it up. It may return an XML
> document
> > with many sub elelements that allows the standard Iterator to work.
> Similar
> > to (2) but using another service rather than Java to split document.
> > 5) Using the approach proposed now - just submit the file straight
> (stream
> > it) to another web service that chops it up but calls a Synapse proxy
> with
> > each small packet of data that then forwards it to the final WEb Service.
> So
> > the Web Service iterates across the data; and not Synapse.
> >
> > Then other solutions replace Synapse with a stand alone Java program at
> the
> > front end.
> >
> > Another issue here is throttling: Splitting the file is one issues but
> > submitting 100's of calls in parralel to the destination service would
> > result in time outs... So need to work in throttling.
> >
> >
> >
> >
> >
> >
> >
> >
> > Ruwan Linton wrote:
> >>
> >> I agree and can understand the time factor and also +1 for reusing stuff
> >> than trying to invent the wheel again :-)
> >>
> >> Thanks,
> >> Ruwan
> >>
> >> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
> >> <an...@gmail.com>wrote:
> >>
> >>> Ruwan,
> >>>
> >>> It's not a question of possibility, it is a question of available time
> >>> :-)
> >>>
> >>> Also note that some of the features that we might want to implement
> >>> have some similarities with what is done for attachments in Axiom
> >>> (except that an attachment is only available once, while a file over
> >>> VFS can be read several times). I think there is also some existing
> >>> code in Axis2 that might be useful. We should not reimplement these
> >>> things but try to make the existing code reusable. This however is
> >>> only realistic for the next release after 1.3.
> >>>
> >>> Andreas
> >>>
> >>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
> >>> wrote:
> >>> > Andreas,
> >>> >
> >>> > Can we have the caching at the file system as a property to support
> the
> >>> > multiple layers touching the full message and is it possible make it
> to
> >>> > specify a threshold for streaming? For example if the message is
> >>> touched
> >>> > several time we might still need streaming but not for the 100KB or
> >>> lesser
> >>> > files.
> >>> >
> >>> > Thanks,
> >>> > Ruwan
> >>> >
> >>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
> >>> andreas.veithen@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> I've done an initial implementation of this feature. It is available
> >>> >> in trunk and should be included in the next nightly build. In order
> to
> >>> >> enable this in your configuration, you need to add the following
> >>> >> property to the proxy:
> >>> >>
> >>> >> <parameter name="transport.vfs.Streaming">true</parameter>
> >>> >>
> >>> >> You also need to add the following mediators just before the <send>
> >>> >> mediator:
> >>> >>
> >>> >> <property action="remove" name="transportNonBlocking"
> scope="axis2"/>
> >>> >> <property action="set" name="OUT_ONLY" value="true"/>
> >>> >>
> >>> >> With this configuration Synapse will stream the data directly from
> the
> >>> >> incoming to the outgoing transport without storing it in memory or
> in
> >>> >> a temporary file. Note that this has two other side effects:
> >>> >> * The incoming file (or connection in case of a remote file) will
> only
> >>> >> be opened on demand. In this case this happens during execution of
> the
> >>> >> <send> mediator.
> >>> >> * If during the mediation the content of the file is needed several
> >>> >> time (which is not the case in your example), it will be read
> several
> >>> >> times. The reason is of course that the content is not cached.
> >>> >>
> >>> >> I tested the solution with a 2GB file and it worked fine. The
> >>> >> performance of the implementation is not yet optimal, but at least
> the
> >>> >> memory consumption is constant.
> >>> >>
> >>> >> Some additional comments:
> >>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
> >>> >> processing: this type of content is processed exactly as before.
> >>> >> * With the changes described here, we have now two different
> policies
> >>> >> for plain text and binary content processing: in-memory caching + no
> >>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
> >>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
> >>> >> should define a wider range of policies in the future, including
> file
> >>> >> system caching + streaming.
> >>> >> * It is necessary to remove the transportNonBlocking property
> >>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send>
> mediator
> >>> >> (more precisely the OperationClient) from executing the outgoing
> >>> >> transport in a separate thread. This property is set by the incoming
> >>> >> transport. I think this is a bug since I don't see any valid reason
> >>> >> why the transport that handles the incoming request should determine
> >>> >> the threading behavior of the transport that sends the outgoing
> >>> >> request to the target service. Maybe Asankha can comment on this?
> >>> >>
> >>> >> Andreas
> >>> >>
> >>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net>
> wrote:
> >>> >> >
> >>> >> > Thats good; as this stops us using Synapse.
> >>> >> >
> >>> >> >
> >>> >> >
> >>> >> > Asankha C. Perera wrote:
> >>> >> >>
> >>> >> >>
> >>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
> >>> Java
> >>> >> >>> heap
> >>> >> >>> space
> >>> >> >>>         at
> >>> >> >>>
> >>> >> >>>
> >>>
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
> >>> >> >>>         at
> >>> >> >>>
> >>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
> >>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
> >>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
> >>> >> >>>         at
> >>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
> >>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
> >>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
> >>> >> >>>         at
> >>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
> >>> >> >>>         at
> >>> >> >>>
> >>> >> >>>
> >>>
> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
> >>> >> >>>
> >>> >> >> Since the content type is text, the plain text formatter is
> trying
> >>> to
> >>> >> >> use a String to parse as I see.. which is a problem for large
> >>> content..
> >>> >> >>
> >>> >> >> A definite bug we need to fix ..
> >>> >> >>
> >>> >> >> cheers
> >>> >> >> asankha
> >>> >> >>
> >>> >> >> --
> >>> >> >> Asankha C. Perera
> >>> >> >> AdroitLogic, http://adroitlogic.org
> >>> >> >>
> >>> >> >> http://esbmagic.blogspot.com
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> ---------------------------------------------------------------------
> >>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >
> >>> >> > --
> >>> >> > View this message in context:
> >>> >> >
> >>>
> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
> >>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
> >>> >> >
> >>> >> >
> >>> >> >
> >>> ---------------------------------------------------------------------
> >>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
> >>> >> >
> >>> >> >
> >>> >>
> >>> >>
> ---------------------------------------------------------------------
> >>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >>> >> For additional commands, e-mail: dev-help@synapse.apache.org
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Ruwan Linton
> >>> > http://wso2.org - "Oxygenating the Web Services Platform"
> >>> > http://ruwansblog.blogspot.com/
> >>> >
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >>> For additional commands, e-mail: dev-help@synapse.apache.org
> >>>
> >>>
> >>
> >>
> >> --
> >> Ruwan Linton
> >> http://wso2.org - "Oxygenating the Web Services Platform"
> >> http://ruwansblog.blogspot.com/
> >>
> >>
> >
> > --
> > View this message in context:
> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> > For additional commands, e-mail: dev-help@synapse.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>


-- 
Ruwan Linton
http://wso2.org - "Oxygenating the Web Services Platform"
http://ruwansblog.blogspot.com/

Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
The changes I did in the VFS transport and the message builders for
text/plain and application/octet-stream certainly don't provide an
out-of-the-box solution for your use case, but they are the
prerequisite.

Concerning your first proposed solution (let the VFS write the content
to a temporary file), I don't like this because it would create a
tight coupling between the VFS transport and the mediator. A design
goal should be that the solution will still work if the file comes
from another source, e.g. an attachment in an MTOM or SwA message.

I thing that an all-Synapse solution (2 or 3) should be possible, but
this will require development of a custom mediator. This mediator
would read the content, split it up (and store the chunks in memory or
an disk) and executes a sub-sequence for each chunk. The execution of
the sub-sequence would happen synchronously to limit the memory/disk
space consumption (to the maximum chunk size) and to avoid flooding
the destination service.

Note that it is probably not possible to implemented the mediator
using a script because of the problematic String handling. Also,
Spring, POJO and class mediators don't support sub-sequences (I
think). Therefore it should be implemented as a full-featured Java
mediator, probably taking the existing iterate mediator as a template.
I can contribute the required code to get the text content in the form
of a java.io.Reader.

Regards,

Andreas

On Mon, Mar 9, 2009 at 03:05, kimhorn <ki...@icsglobal.net> wrote:
>
> Although this is a good feature it may not solve the actual problem ?
> The main first issue on my list was the memory leak.
> However, the real problem is once I get this massive files I  have to send
> it to a web Service that can only take it in small chunks (about 14MB) .
> Streaming it straight out would just kill the destination Web service. It
> would get the memory error. The text document can be split apart easily, as
> it has independant records on each line seperated by <CR> <LF>.
>
> In an earlier post; that was not responded too, I mentioned:
>
> "Otherwise; for large EDI files a VFS iterator Mediator that streams through
> input file and outputs smaller
> chunks for processing, in Synapse, may be a solution ? "
>
> So I had mentioned a few solutions, in prior posts, solution now are:
>
> 1) VFS writes straight to temporary file, then a Java mediator can process
> the file by splitting it into many smaller files. These files then trigger
> another VFS proxy that submits these to the final web Service.
> The problem is is that is uses the file system (not so bad).
> 2) A Java Mediator takes the <text> package and splits it up by wrapping
> into many XML <data> elements that can then be acted on by a Synapse
> Iterator. So replace the text message with many smaller XML elements.
> Problem is that this loads whole message into memory.
> 3) Create another Iterator in Synapse that works on Regular expression (to
> split the text data) or actually uses a for loop approach to chop the file
> into chunks based on the loop index value. E.g. Index = 23 means a 14K chunk
> 23 chunks into the data.
> 4) Using the approach proposed now - just submit the file straight (stream
> it) to another web service that chops it up. It may return an XML document
> with many sub elelements that allows the standard Iterator to work. Similar
> to (2) but using another service rather than Java to split document.
> 5) Using the approach proposed now - just submit the file straight (stream
> it) to another web service that chops it up but calls a Synapse proxy with
> each small packet of data that then forwards it to the final WEb Service. So
> the Web Service iterates across the data; and not Synapse.
>
> Then other solutions replace Synapse with a stand alone Java program at the
> front end.
>
> Another issue here is throttling: Splitting the file is one issues but
> submitting 100's of calls in parralel to the destination service would
> result in time outs... So need to work in throttling.
>
>
>
>
>
>
>
>
> Ruwan Linton wrote:
>>
>> I agree and can understand the time factor and also +1 for reusing stuff
>> than trying to invent the wheel again :-)
>>
>> Thanks,
>> Ruwan
>>
>> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
>> <an...@gmail.com>wrote:
>>
>>> Ruwan,
>>>
>>> It's not a question of possibility, it is a question of available time
>>> :-)
>>>
>>> Also note that some of the features that we might want to implement
>>> have some similarities with what is done for attachments in Axiom
>>> (except that an attachment is only available once, while a file over
>>> VFS can be read several times). I think there is also some existing
>>> code in Axis2 that might be useful. We should not reimplement these
>>> things but try to make the existing code reusable. This however is
>>> only realistic for the next release after 1.3.
>>>
>>> Andreas
>>>
>>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>>> wrote:
>>> > Andreas,
>>> >
>>> > Can we have the caching at the file system as a property to support the
>>> > multiple layers touching the full message and is it possible make it to
>>> > specify a threshold for streaming? For example if the message is
>>> touched
>>> > several time we might still need streaming but not for the 100KB or
>>> lesser
>>> > files.
>>> >
>>> > Thanks,
>>> > Ruwan
>>> >
>>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>>> andreas.veithen@gmail.com>
>>> > wrote:
>>> >>
>>> >> I've done an initial implementation of this feature. It is available
>>> >> in trunk and should be included in the next nightly build. In order to
>>> >> enable this in your configuration, you need to add the following
>>> >> property to the proxy:
>>> >>
>>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>>> >>
>>> >> You also need to add the following mediators just before the <send>
>>> >> mediator:
>>> >>
>>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>>> >> <property action="set" name="OUT_ONLY" value="true"/>
>>> >>
>>> >> With this configuration Synapse will stream the data directly from the
>>> >> incoming to the outgoing transport without storing it in memory or in
>>> >> a temporary file. Note that this has two other side effects:
>>> >> * The incoming file (or connection in case of a remote file) will only
>>> >> be opened on demand. In this case this happens during execution of the
>>> >> <send> mediator.
>>> >> * If during the mediation the content of the file is needed several
>>> >> time (which is not the case in your example), it will be read several
>>> >> times. The reason is of course that the content is not cached.
>>> >>
>>> >> I tested the solution with a 2GB file and it worked fine. The
>>> >> performance of the implementation is not yet optimal, but at least the
>>> >> memory consumption is constant.
>>> >>
>>> >> Some additional comments:
>>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>>> >> processing: this type of content is processed exactly as before.
>>> >> * With the changes described here, we have now two different policies
>>> >> for plain text and binary content processing: in-memory caching + no
>>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>>> >> should define a wider range of policies in the future, including file
>>> >> system caching + streaming.
>>> >> * It is necessary to remove the transportNonBlocking property
>>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>>> >> (more precisely the OperationClient) from executing the outgoing
>>> >> transport in a separate thread. This property is set by the incoming
>>> >> transport. I think this is a bug since I don't see any valid reason
>>> >> why the transport that handles the incoming request should determine
>>> >> the threading behavior of the transport that sends the outgoing
>>> >> request to the target service. Maybe Asankha can comment on this?
>>> >>
>>> >> Andreas
>>> >>
>>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>>> >> >
>>> >> > Thats good; as this stops us using Synapse.
>>> >> >
>>> >> >
>>> >> >
>>> >> > Asankha C. Perera wrote:
>>> >> >>
>>> >> >>
>>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>>> Java
>>> >> >>> heap
>>> >> >>> space
>>> >> >>>         at
>>> >> >>>
>>> >> >>>
>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>> >> >>>         at
>>> >> >>>
>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>>> >> >>>         at
>>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>> >> >>>         at
>>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>> >> >>>         at
>>> >> >>>
>>> >> >>>
>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>> >> >>>
>>> >> >> Since the content type is text, the plain text formatter is trying
>>> to
>>> >> >> use a String to parse as I see.. which is a problem for large
>>> content..
>>> >> >>
>>> >> >> A definite bug we need to fix ..
>>> >> >>
>>> >> >> cheers
>>> >> >> asankha
>>> >> >>
>>> >> >> --
>>> >> >> Asankha C. Perera
>>> >> >> AdroitLogic, http://adroitlogic.org
>>> >> >>
>>> >> >> http://esbmagic.blogspot.com
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> ---------------------------------------------------------------------
>>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >
>>> >> > --
>>> >> > View this message in context:
>>> >> >
>>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>>> >> >
>>> >> >
>>> >> >
>>> ---------------------------------------------------------------------
>>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>>> >> >
>>> >> >
>>> >>
>>> >> ---------------------------------------------------------------------
>>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Ruwan Linton
>>> > http://wso2.org - "Oxygenating the Web Services Platform"
>>> > http://ruwansblog.blogspot.com/
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>>> For additional commands, e-mail: dev-help@synapse.apache.org
>>>
>>>
>>
>>
>> --
>> Ruwan Linton
>> http://wso2.org - "Oxygenating the Web Services Platform"
>> http://ruwansblog.blogspot.com/
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by kimhorn <ki...@icsglobal.net>.
Although this is a good feature it may not solve the actual problem ?
The main first issue on my list was the memory leak.
However, the real problem is once I get this massive files I  have to send
it to a web Service that can only take it in small chunks (about 14MB) .
Streaming it straight out would just kill the destination Web service. It
would get the memory error. The text document can be split apart easily, as
it has independant records on each line seperated by <CR> <LF>.

In an earlier post; that was not responded too, I mentioned:

"Otherwise; for large EDI files a VFS iterator Mediator that streams through
input file and outputs smaller
chunks for processing, in Synapse, may be a solution ? "

So I had mentioned a few solutions, in prior posts, solution now are:

1) VFS writes straight to temporary file, then a Java mediator can process
the file by splitting it into many smaller files. These files then trigger
another VFS proxy that submits these to the final web Service.
The problem is is that is uses the file system (not so bad).
2) A Java Mediator takes the <text> package and splits it up by wrapping
into many XML <data> elements that can then be acted on by a Synapse
Iterator. So replace the text message with many smaller XML elements.
Problem is that this loads whole message into memory.
3) Create another Iterator in Synapse that works on Regular expression (to
split the text data) or actually uses a for loop approach to chop the file
into chunks based on the loop index value. E.g. Index = 23 means a 14K chunk
23 chunks into the data.
4) Using the approach proposed now - just submit the file straight (stream
it) to another web service that chops it up. It may return an XML document
with many sub elelements that allows the standard Iterator to work. Similar
to (2) but using another service rather than Java to split document.
5) Using the approach proposed now - just submit the file straight (stream
it) to another web service that chops it up but calls a Synapse proxy with
each small packet of data that then forwards it to the final WEb Service. So
the Web Service iterates across the data; and not Synapse.

Then other solutions replace Synapse with a stand alone Java program at the
front end.

Another issue here is throttling: Splitting the file is one issues but
submitting 100's of calls in parralel to the destination service would
result in time outs... So need to work in throttling.








Ruwan Linton wrote:
> 
> I agree and can understand the time factor and also +1 for reusing stuff
> than trying to invent the wheel again :-)
> 
> Thanks,
> Ruwan
> 
> On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
> <an...@gmail.com>wrote:
> 
>> Ruwan,
>>
>> It's not a question of possibility, it is a question of available time
>> :-)
>>
>> Also note that some of the features that we might want to implement
>> have some similarities with what is done for attachments in Axiom
>> (except that an attachment is only available once, while a file over
>> VFS can be read several times). I think there is also some existing
>> code in Axis2 that might be useful. We should not reimplement these
>> things but try to make the existing code reusable. This however is
>> only realistic for the next release after 1.3.
>>
>> Andreas
>>
>> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com>
>> wrote:
>> > Andreas,
>> >
>> > Can we have the caching at the file system as a property to support the
>> > multiple layers touching the full message and is it possible make it to
>> > specify a threshold for streaming? For example if the message is
>> touched
>> > several time we might still need streaming but not for the 100KB or
>> lesser
>> > files.
>> >
>> > Thanks,
>> > Ruwan
>> >
>> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
>> andreas.veithen@gmail.com>
>> > wrote:
>> >>
>> >> I've done an initial implementation of this feature. It is available
>> >> in trunk and should be included in the next nightly build. In order to
>> >> enable this in your configuration, you need to add the following
>> >> property to the proxy:
>> >>
>> >> <parameter name="transport.vfs.Streaming">true</parameter>
>> >>
>> >> You also need to add the following mediators just before the <send>
>> >> mediator:
>> >>
>> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>> >> <property action="set" name="OUT_ONLY" value="true"/>
>> >>
>> >> With this configuration Synapse will stream the data directly from the
>> >> incoming to the outgoing transport without storing it in memory or in
>> >> a temporary file. Note that this has two other side effects:
>> >> * The incoming file (or connection in case of a remote file) will only
>> >> be opened on demand. In this case this happens during execution of the
>> >> <send> mediator.
>> >> * If during the mediation the content of the file is needed several
>> >> time (which is not the case in your example), it will be read several
>> >> times. The reason is of course that the content is not cached.
>> >>
>> >> I tested the solution with a 2GB file and it worked fine. The
>> >> performance of the implementation is not yet optimal, but at least the
>> >> memory consumption is constant.
>> >>
>> >> Some additional comments:
>> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
>> >> processing: this type of content is processed exactly as before.
>> >> * With the changes described here, we have now two different policies
>> >> for plain text and binary content processing: in-memory caching + no
>> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
>> >> connection + streaming (transport.vfs.Streaming=true). Probably we
>> >> should define a wider range of policies in the future, including file
>> >> system caching + streaming.
>> >> * It is necessary to remove the transportNonBlocking property
>> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>> >> (more precisely the OperationClient) from executing the outgoing
>> >> transport in a separate thread. This property is set by the incoming
>> >> transport. I think this is a bug since I don't see any valid reason
>> >> why the transport that handles the incoming request should determine
>> >> the threading behavior of the transport that sends the outgoing
>> >> request to the target service. Maybe Asankha can comment on this?
>> >>
>> >> Andreas
>> >>
>> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>> >> >
>> >> > Thats good; as this stops us using Synapse.
>> >> >
>> >> >
>> >> >
>> >> > Asankha C. Perera wrote:
>> >> >>
>> >> >>
>> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError:
>> Java
>> >> >>> heap
>> >> >>> space
>> >> >>>         at
>> >> >>>
>> >> >>>
>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>> >> >>>         at
>> >> >>>
>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>> >> >>>         at
>> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>> >> >>>         at
>> org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>> >> >>>         at
>> >> >>>
>> >> >>>
>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>> >> >>>
>> >> >> Since the content type is text, the plain text formatter is trying
>> to
>> >> >> use a String to parse as I see.. which is a problem for large
>> content..
>> >> >>
>> >> >> A definite bug we need to fix ..
>> >> >>
>> >> >> cheers
>> >> >> asankha
>> >> >>
>> >> >> --
>> >> >> Asankha C. Perera
>> >> >> AdroitLogic, http://adroitlogic.org
>> >> >>
>> >> >> http://esbmagic.blogspot.com
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> ---------------------------------------------------------------------
>> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >> > --
>> >> > View this message in context:
>> >> >
>> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>> >> >
>> >> >
>> >> >
>> ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >> > For additional commands, e-mail: dev-help@synapse.apache.org
>> >> >
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > Ruwan Linton
>> > http://wso2.org - "Oxygenating the Web Services Platform"
>> > http://ruwansblog.blogspot.com/
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
> 
> 
> -- 
> Ruwan Linton
> http://wso2.org - "Oxygenating the Web Services Platform"
> http://ruwansblog.blogspot.com/
> 
> 

-- 
View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22405973.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Ruwan Linton <ru...@gmail.com>.
I agree and can understand the time factor and also +1 for reusing stuff
than trying to invent the wheel again :-)

Thanks,
Ruwan

On Sun, Mar 8, 2009 at 4:08 PM, Andreas Veithen
<an...@gmail.com>wrote:

> Ruwan,
>
> It's not a question of possibility, it is a question of available time :-)
>
> Also note that some of the features that we might want to implement
> have some similarities with what is done for attachments in Axiom
> (except that an attachment is only available once, while a file over
> VFS can be read several times). I think there is also some existing
> code in Axis2 that might be useful. We should not reimplement these
> things but try to make the existing code reusable. This however is
> only realistic for the next release after 1.3.
>
> Andreas
>
> On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com> wrote:
> > Andreas,
> >
> > Can we have the caching at the file system as a property to support the
> > multiple layers touching the full message and is it possible make it to
> > specify a threshold for streaming? For example if the message is touched
> > several time we might still need streaming but not for the 100KB or
> lesser
> > files.
> >
> > Thanks,
> > Ruwan
> >
> > On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <
> andreas.veithen@gmail.com>
> > wrote:
> >>
> >> I've done an initial implementation of this feature. It is available
> >> in trunk and should be included in the next nightly build. In order to
> >> enable this in your configuration, you need to add the following
> >> property to the proxy:
> >>
> >> <parameter name="transport.vfs.Streaming">true</parameter>
> >>
> >> You also need to add the following mediators just before the <send>
> >> mediator:
> >>
> >> <property action="remove" name="transportNonBlocking" scope="axis2"/>
> >> <property action="set" name="OUT_ONLY" value="true"/>
> >>
> >> With this configuration Synapse will stream the data directly from the
> >> incoming to the outgoing transport without storing it in memory or in
> >> a temporary file. Note that this has two other side effects:
> >> * The incoming file (or connection in case of a remote file) will only
> >> be opened on demand. In this case this happens during execution of the
> >> <send> mediator.
> >> * If during the mediation the content of the file is needed several
> >> time (which is not the case in your example), it will be read several
> >> times. The reason is of course that the content is not cached.
> >>
> >> I tested the solution with a 2GB file and it worked fine. The
> >> performance of the implementation is not yet optimal, but at least the
> >> memory consumption is constant.
> >>
> >> Some additional comments:
> >> * The transport.vfs.Streaming property has no impact on XML and SOAP
> >> processing: this type of content is processed exactly as before.
> >> * With the changes described here, we have now two different policies
> >> for plain text and binary content processing: in-memory caching + no
> >> streaming (transport.vfs.Streaming=false) and no caching + deferred
> >> connection + streaming (transport.vfs.Streaming=true). Probably we
> >> should define a wider range of policies in the future, including file
> >> system caching + streaming.
> >> * It is necessary to remove the transportNonBlocking property
> >> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
> >> (more precisely the OperationClient) from executing the outgoing
> >> transport in a separate thread. This property is set by the incoming
> >> transport. I think this is a bug since I don't see any valid reason
> >> why the transport that handles the incoming request should determine
> >> the threading behavior of the transport that sends the outgoing
> >> request to the target service. Maybe Asankha can comment on this?
> >>
> >> Andreas
> >>
> >> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
> >> >
> >> > Thats good; as this stops us using Synapse.
> >> >
> >> >
> >> >
> >> > Asankha C. Perera wrote:
> >> >>
> >> >>
> >> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java
> >> >>> heap
> >> >>> space
> >> >>>         at
> >> >>>
> >> >>>
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
> >> >>>         at
> >> >>>
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
> >> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
> >> >>>         at java.io.StringWriter.write(StringWriter.java:72)
> >> >>>         at
> org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
> >> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
> >> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
> >> >>>         at
> >> >>>
> >> >>>
> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
> >> >>>
> >> >> Since the content type is text, the plain text formatter is trying to
> >> >> use a String to parse as I see.. which is a problem for large
> content..
> >> >>
> >> >> A definite bug we need to fix ..
> >> >>
> >> >> cheers
> >> >> asankha
> >> >>
> >> >> --
> >> >> Asankha C. Perera
> >> >> AdroitLogic, http://adroitlogic.org
> >> >>
> >> >> http://esbmagic.blogspot.com
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >> >> For additional commands, e-mail: dev-help@synapse.apache.org
> >> >>
> >> >>
> >> >>
> >> >
> >> > --
> >> > View this message in context:
> >> >
> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
> >> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >> > For additional commands, e-mail: dev-help@synapse.apache.org
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >> For additional commands, e-mail: dev-help@synapse.apache.org
> >>
> >
> >
> >
> > --
> > Ruwan Linton
> > http://wso2.org - "Oxygenating the Web Services Platform"
> > http://ruwansblog.blogspot.com/
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>


-- 
Ruwan Linton
http://wso2.org - "Oxygenating the Web Services Platform"
http://ruwansblog.blogspot.com/

Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
Ruwan,

It's not a question of possibility, it is a question of available time :-)

Also note that some of the features that we might want to implement
have some similarities with what is done for attachments in Axiom
(except that an attachment is only available once, while a file over
VFS can be read several times). I think there is also some existing
code in Axis2 that might be useful. We should not reimplement these
things but try to make the existing code reusable. This however is
only realistic for the next release after 1.3.

Andreas

On Sun, Mar 8, 2009 at 03:47, Ruwan Linton <ru...@gmail.com> wrote:
> Andreas,
>
> Can we have the caching at the file system as a property to support the
> multiple layers touching the full message and is it possible make it to
> specify a threshold for streaming? For example if the message is touched
> several time we might still need streaming but not for the 100KB or lesser
> files.
>
> Thanks,
> Ruwan
>
> On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen <an...@gmail.com>
> wrote:
>>
>> I've done an initial implementation of this feature. It is available
>> in trunk and should be included in the next nightly build. In order to
>> enable this in your configuration, you need to add the following
>> property to the proxy:
>>
>> <parameter name="transport.vfs.Streaming">true</parameter>
>>
>> You also need to add the following mediators just before the <send>
>> mediator:
>>
>> <property action="remove" name="transportNonBlocking" scope="axis2"/>
>> <property action="set" name="OUT_ONLY" value="true"/>
>>
>> With this configuration Synapse will stream the data directly from the
>> incoming to the outgoing transport without storing it in memory or in
>> a temporary file. Note that this has two other side effects:
>> * The incoming file (or connection in case of a remote file) will only
>> be opened on demand. In this case this happens during execution of the
>> <send> mediator.
>> * If during the mediation the content of the file is needed several
>> time (which is not the case in your example), it will be read several
>> times. The reason is of course that the content is not cached.
>>
>> I tested the solution with a 2GB file and it worked fine. The
>> performance of the implementation is not yet optimal, but at least the
>> memory consumption is constant.
>>
>> Some additional comments:
>> * The transport.vfs.Streaming property has no impact on XML and SOAP
>> processing: this type of content is processed exactly as before.
>> * With the changes described here, we have now two different policies
>> for plain text and binary content processing: in-memory caching + no
>> streaming (transport.vfs.Streaming=false) and no caching + deferred
>> connection + streaming (transport.vfs.Streaming=true). Probably we
>> should define a wider range of policies in the future, including file
>> system caching + streaming.
>> * It is necessary to remove the transportNonBlocking property
>> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
>> (more precisely the OperationClient) from executing the outgoing
>> transport in a separate thread. This property is set by the incoming
>> transport. I think this is a bug since I don't see any valid reason
>> why the transport that handles the incoming request should determine
>> the threading behavior of the transport that sends the outgoing
>> request to the target service. Maybe Asankha can comment on this?
>>
>> Andreas
>>
>> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>> >
>> > Thats good; as this stops us using Synapse.
>> >
>> >
>> >
>> > Asankha C. Perera wrote:
>> >>
>> >>
>> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java
>> >>> heap
>> >>> space
>> >>>         at
>> >>>
>> >>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>> >>>         at
>> >>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>> >>>         at java.io.StringWriter.write(StringWriter.java:72)
>> >>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>> >>>         at
>> >>>
>> >>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>> >>>
>> >> Since the content type is text, the plain text formatter is trying to
>> >> use a String to parse as I see.. which is a problem for large content..
>> >>
>> >> A definite bug we need to fix ..
>> >>
>> >> cheers
>> >> asankha
>> >>
>> >> --
>> >> Asankha C. Perera
>> >> AdroitLogic, http://adroitlogic.org
>> >>
>> >> http://esbmagic.blogspot.com
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> >> For additional commands, e-mail: dev-help@synapse.apache.org
>> >>
>> >>
>> >>
>> >
>> > --
>> > View this message in context:
>> > http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
>> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> > For additional commands, e-mail: dev-help@synapse.apache.org
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>
>
>
> --
> Ruwan Linton
> http://wso2.org - "Oxygenating the Web Services Platform"
> http://ruwansblog.blogspot.com/
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by Ruwan Linton <ru...@gmail.com>.
Andreas,

Can we have the caching at the file system as a property to support the
multiple layers touching the full message and is it possible make it to
specify a threshold for streaming? For example if the message is touched
several time we might still need streaming but not for the 100KB or lesser
files.

Thanks,
Ruwan

On Sun, Mar 8, 2009 at 1:12 AM, Andreas Veithen
<an...@gmail.com>wrote:

> I've done an initial implementation of this feature. It is available
> in trunk and should be included in the next nightly build. In order to
> enable this in your configuration, you need to add the following
> property to the proxy:
>
> <parameter name="transport.vfs.Streaming">true</parameter>
>
> You also need to add the following mediators just before the <send>
> mediator:
>
> <property action="remove" name="transportNonBlocking" scope="axis2"/>
> <property action="set" name="OUT_ONLY" value="true"/>
>
> With this configuration Synapse will stream the data directly from the
> incoming to the outgoing transport without storing it in memory or in
> a temporary file. Note that this has two other side effects:
> * The incoming file (or connection in case of a remote file) will only
> be opened on demand. In this case this happens during execution of the
> <send> mediator.
> * If during the mediation the content of the file is needed several
> time (which is not the case in your example), it will be read several
> times. The reason is of course that the content is not cached.
>
> I tested the solution with a 2GB file and it worked fine. The
> performance of the implementation is not yet optimal, but at least the
> memory consumption is constant.
>
> Some additional comments:
> * The transport.vfs.Streaming property has no impact on XML and SOAP
> processing: this type of content is processed exactly as before.
> * With the changes described here, we have now two different policies
> for plain text and binary content processing: in-memory caching + no
> streaming (transport.vfs.Streaming=false) and no caching + deferred
> connection + streaming (transport.vfs.Streaming=true). Probably we
> should define a wider range of policies in the future, including file
> system caching + streaming.
> * It is necessary to remove the transportNonBlocking property
> (MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
> (more precisely the OperationClient) from executing the outgoing
> transport in a separate thread. This property is set by the incoming
> transport. I think this is a bug since I don't see any valid reason
> why the transport that handles the incoming request should determine
> the threading behavior of the transport that sends the outgoing
> request to the target service. Maybe Asankha can comment on this?
>
> Andreas
>
> On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
> >
> > Thats good; as this stops us using Synapse.
> >
> >
> >
> > Asankha C. Perera wrote:
> >>
> >>
> >>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java
> heap
> >>> space
> >>>         at
> >>>
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
> >>>         at
> >>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
> >>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
> >>>         at java.io.StringWriter.write(StringWriter.java:72)
> >>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
> >>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
> >>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
> >>>         at
> >>>
> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
> >>>
> >> Since the content type is text, the plain text formatter is trying to
> >> use a String to parse as I see.. which is a problem for large content..
> >>
> >> A definite bug we need to fix ..
> >>
> >> cheers
> >> asankha
> >>
> >> --
> >> Asankha C. Perera
> >> AdroitLogic, http://adroitlogic.org
> >>
> >> http://esbmagic.blogspot.com
> >>
> >>
> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> >> For additional commands, e-mail: dev-help@synapse.apache.org
> >>
> >>
> >>
> >
> > --
> > View this message in context:
> http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
> > Sent from the Synapse - Dev mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> > For additional commands, e-mail: dev-help@synapse.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>


-- 
Ruwan Linton
http://wso2.org - "Oxygenating the Web Services Platform"
http://ruwansblog.blogspot.com/

Re: VFS - Synapse Memory Leak

Posted by Andreas Veithen <an...@gmail.com>.
I've done an initial implementation of this feature. It is available
in trunk and should be included in the next nightly build. In order to
enable this in your configuration, you need to add the following
property to the proxy:

<parameter name="transport.vfs.Streaming">true</parameter>

You also need to add the following mediators just before the <send> mediator:

<property action="remove" name="transportNonBlocking" scope="axis2"/>
<property action="set" name="OUT_ONLY" value="true"/>

With this configuration Synapse will stream the data directly from the
incoming to the outgoing transport without storing it in memory or in
a temporary file. Note that this has two other side effects:
* The incoming file (or connection in case of a remote file) will only
be opened on demand. In this case this happens during execution of the
<send> mediator.
* If during the mediation the content of the file is needed several
time (which is not the case in your example), it will be read several
times. The reason is of course that the content is not cached.

I tested the solution with a 2GB file and it worked fine. The
performance of the implementation is not yet optimal, but at least the
memory consumption is constant.

Some additional comments:
* The transport.vfs.Streaming property has no impact on XML and SOAP
processing: this type of content is processed exactly as before.
* With the changes described here, we have now two different policies
for plain text and binary content processing: in-memory caching + no
streaming (transport.vfs.Streaming=false) and no caching + deferred
connection + streaming (transport.vfs.Streaming=true). Probably we
should define a wider range of policies in the future, including file
system caching + streaming.
* It is necessary to remove the transportNonBlocking property
(MessageContext.TRANSPORT_NON_BLOCKING) to prevent the <send> mediator
(more precisely the OperationClient) from executing the outgoing
transport in a separate thread. This property is set by the incoming
transport. I think this is a bug since I don't see any valid reason
why the transport that handles the incoming request should determine
the threading behavior of the transport that sends the outgoing
request to the target service. Maybe Asankha can comment on this?

Andreas

On Thu, Mar 5, 2009 at 07:21, kimhorn <ki...@icsglobal.net> wrote:
>
> Thats good; as this stops us using Synapse.
>
>
>
> Asankha C. Perera wrote:
>>
>>
>>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
>>> space
>>>         at
>>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>>         at
>>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>>         at java.io.StringWriter.write(StringWriter.java:72)
>>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>>         at
>>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>>
>> Since the content type is text, the plain text formatter is trying to
>> use a String to parse as I see.. which is a problem for large content..
>>
>> A definite bug we need to fix ..
>>
>> cheers
>> asankha
>>
>> --
>> Asankha C. Perera
>> AdroitLogic, http://adroitlogic.org
>>
>> http://esbmagic.blogspot.com
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
>> For additional commands, e-mail: dev-help@synapse.apache.org
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
> Sent from the Synapse - Dev mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by kimhorn <ki...@icsglobal.net>.
Thats good; as this stops us using Synapse. 



Asankha C. Perera wrote:
> 
> 
>> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
>> space
>>         at
>> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>>         at
>> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>>         at java.io.StringWriter.write(StringWriter.java:72)
>>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>>         at
>> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>>   
> Since the content type is text, the plain text formatter is trying to 
> use a String to parse as I see.. which is a problem for large content..
> 
> A definite bug we need to fix ..
> 
> cheers
> asankha
> 
> -- 
> Asankha C. Perera
> AdroitLogic, http://adroitlogic.org
> 
> http://esbmagic.blogspot.com
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
> For additional commands, e-mail: dev-help@synapse.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/VFS---Synapse-Memory-Leak-tp22344176p22345904.html
Sent from the Synapse - Dev mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org


Re: VFS - Synapse Memory Leak

Posted by "Asankha C. Perera" <as...@apache.org>.
> Exception in thread "vfs-Worker-4" java.lang.OutOfMemoryError: Java heap
> space
>         at
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:99)
>         at
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:518)
>         at java.lang.StringBuffer.append(StringBuffer.java:307)
>         at java.io.StringWriter.write(StringWriter.java:72)
>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1129)
>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1104)
>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1078)
>         at org.apache.commons.io.IOUtils.toString(IOUtils.java:382)
>         at
> org.apache.synapse.format.PlainTextBuilder.processDocument(PlainTextBuilder.java:68)
>   
Since the content type is text, the plain text formatter is trying to 
use a String to parse as I see.. which is a problem for large content..

A definite bug we need to fix ..

cheers
asankha

-- 
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com





---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@synapse.apache.org
For additional commands, e-mail: dev-help@synapse.apache.org