Posted to users@camel.apache.org by Aaron Sutula <aa...@noaa.gov> on 2011/07/14 23:08:43 UTC

Growing HashMap for File Endpoint

Hi,

I'm using Camel 2.7.2 with Spring 3.0 and running inside the Virgo (Equinox) OSGi container on OS X and on Linux. I'm using a file endpoint to watch a directory for new files and send the content of those files to a bean method defined in my Spring context. My application will run for some time, but then runs out of heap space. I've profiled the application until it runs out of memory and examined the heap dump. The most frequent class in the heap is HashMap$Entry and most of those entries seem to hold String objects representing the path to files that Camel has previously picked up. It seems like this growing HashMap is what is causing me to run out of memory.

Here is my Spring configuration for Camel:

<bean name="publisher" class="gov.noaa.nws.iris.textingest.Publisher">
  <property name="rabbitTemplate" ref="rabbitTemplate" />
</bean>

<camel:camelContext id="textIngestContext" autoStartup="true">
  <camel:endpoint id="fileinput" uri="file://${watchDirectory}?readLock=fileLock&amp;delete=true" />
  <camel:route autoStartup="true" startupOrder="2">
    <camel:from ref="fileinput" />
    <camel:setHeader headerName="filename">
      <camel:simple>${file:onlyname.noext}</camel:simple>
    </camel:setHeader>
    <camel:to uri="bean:publisher?method=publishFile" />
  </camel:route>
</camel:camelContext>

You can download the tar'd up heap dump at http://venus1.wrh.noaa.gov/scratch/sutula/heap_dump.hprof.tgz.

I've run a similar Camel endpoint in the past, but with much lower rates of files being picked up by the endpoint. I don't think I had memory issues then. This makes me think it's something related to the rate at which the endpoint is picking up data... Does the HashMap recording previous files get purged of entries that are n seconds old? In that case, high data rates would cause the HashMap to be larger.

Any help on this would be greatly appreciated.

Thanks,
Aaron





Re: Growing HashMap for File Endpoint

Posted by Claus Ibsen <cl...@gmail.com>.
On Thu, Jul 14, 2011 at 11:08 PM, Aaron Sutula <aa...@noaa.gov> wrote:
> Hi,
>
> I'm using Camel 2.7.2 with Spring 3.0 and running inside the Virgo (Equinox) OSGi container on OS X and on Linux. I'm using a file endpoint to watch a directory for new files and send the content of those files to a bean method defined in my Spring context. My application will run for some time, but then runs out of heap space. I've profiled the application until it runs out of memory and examined the heap dump. The most frequent class in the heap is HashMap$Entry and most of those entries seem to hold String objects representing the path to files that Camel has previously picked up. It seems like this growing HashMap is what is causing me to run out of memory.

How many entries are you seeing? And in which list/collection are
they stored?

By default, the file endpoint only has an internal "in progress"
repository, which remembers the names of the files currently being
processed:

protected IdempotentRepository<String> inProgressRepository =
    new MemoryIdempotentRepository();

And this repository is capped at 1000 entries, so it cannot grow
unbounded.

I assume it must be something else that keeps those references alive.
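The cap Claus describes can be sketched with plain JDK code: a LinkedHashMap in access order whose removeEldestEntry is overridden behaves as an LRU cache, which is roughly how a size-bounded in-progress repository stays small no matter how many files pass through. This is an illustrative sketch, not Camel's actual implementation; the class and method names below are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified analogue of a capped "in progress" repository. A LinkedHashMap
// in access order evicts the least-recently-used entry once the cap is hit.
public class CappedRepository {
    private final int maxEntries;
    private final Map<String, Boolean> cache;

    public CappedRepository(final int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true gives LRU ordering; removeEldestEntry enforces the cap
        this.cache = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // returns true if the key was not already present
    public boolean add(String key) {
        return cache.put(key, Boolean.TRUE) == null;
    }

    public boolean contains(String key) {
        return cache.containsKey(key);
    }

    public int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        CappedRepository repo = new CappedRepository(1000);
        for (int i = 0; i < 5000; i++) {
            repo.add("/watch/file-" + i + ".txt");
        }
        // despite 5000 adds, the repository never grows past its cap
        System.out.println(repo.size()); // prints 1000
    }
}
```

A structure like this would not explain an ever-growing HashMap, which is why the leak likely lies elsewhere.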


>
> Here is my Spring configuration for Camel:
>
> <bean name="publisher" class="gov.noaa.nws.iris.textingest.Publisher">
> <property name="rabbitTemplate" ref="rabbitTemplate" />
> </bean>
>
> <camel:camelContext id="textIngestContext" autoStartup="true">
> <camel:endpoint id="fileinput" uri="file://${watchDirectory}?readLock=fileLock&amp;delete=true" />
> <camel:route autoStartup="true" startupOrder="2">
> <camel:from ref="fileinput" />
> <camel:setHeader headerName="filename">
> <camel:simple>${file:onlyname.noext}</camel:simple>
> </camel:setHeader>
> <camel:to uri="bean:publisher?method=publishFile" />
> </camel:route>
> </camel:camelContext>
>
> You can download the tar'd up heap dump at http://venus1.wrh.noaa.gov/scratch/sutula/heap_dump.hprof.tgz.
>
> I've run a similar Camel endpoint in the past, but with much lower rates of files being picked up by the endpoint. I don't think I had memory issues then. This makes me think it's something related to the rate at which the endpoint is picking up data... Does the HashMap recording previous files get purged of entries that are n seconds old? In that case, high data rates would cause the HashMap to be larger.
>
> Any help on this would be greatly appreciated.
>
> Thanks,
> Aaron



-- 
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus, fusenews
Blog: http://davsclaus.blogspot.com/
Author of Camel in Action: http://www.manning.com/ibsen/