Posted to users@camel.apache.org by Rajith Muditha Attapattu <ra...@gmail.com> on 2017/05/17 15:29:02 UTC

Processing very large files with Camel

I have a very large flat file, say 50-100 GB (e.g. daily transactions).

I'm looking at the possibility of using Camel to process the flat file and
update a database. The Camel file and stream components come into play here.

My gut reaction is to have a route that simply reads 500 MB to 1 GB worth of
data at a time and writes it to a file in a folder (breaking the input down
into manageable chunks).
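
The chunking idea can be sketched in plain Java, independent of Camel. This is only an illustration under assumptions: the class name FlatFileChunker, the chunk file naming, and the choice to cut by line count (rather than by the 500 MB to 1 GB size mentioned above) are all hypothetical. It reads the input one line at a time, so memory use does not grow with file size.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Camel.
final class FlatFileChunker {

    /**
     * Streams the input line by line and writes chunk files holding at most
     * maxLinesPerChunk lines each. Returns the chunk files in creation order.
     */
    static List<Path> splitByLines(Path input, Path outDir, int maxLinesPerChunk)
            throws IOException {
        if (maxLinesPerChunk <= 0) {
            throw new IllegalArgumentException("maxLinesPerChunk must be positive");
        }
        Files.createDirectories(outDir);
        List<Path> chunks = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            int linesInChunk = 0;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null) {
                        // Open the next chunk lazily so no empty chunk file is created.
                        Path chunk = outDir.resolve(
                                String.format("chunk-%05d.txt", chunks.size()));
                        writer = Files.newBufferedWriter(chunk, StandardCharsets.UTF_8);
                        chunks.add(chunk);
                    }
                    writer.write(line);
                    writer.newLine();
                    if (++linesInChunk == maxLinesPerChunk) {
                        writer.close();
                        writer = null;
                        linesInChunk = 0;
                    }
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return chunks;
    }
}
```

Cutting on line boundaries keeps each record intact within a single chunk, which a cut at a fixed byte offset would not guarantee.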

Another route watches the folder, picks up a file, and hands it over to a
thread-pool-backed route for processing.
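
As a rough plain-Java analogue of that second route (again not Camel itself), the sketch below lists the files in a folder and hands each one to a fixed-size thread pool. The names ChunkPoolProcessor, processChunk and processFolder are hypothetical, and processChunk merely counts lines as a stand-in for the real work such as database updates.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Camel.
final class ChunkPoolProcessor {

    /** Placeholder for the real per-chunk work: here it only counts lines. */
    static long processChunk(Path chunk) {
        try (Stream<String> lines = Files.lines(chunk)) {
            return lines.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Processes every regular file in the folder on a fixed-size pool; returns total lines. */
    static long processFolder(Path folder, int threads)
            throws IOException, InterruptedException, ExecutionException {
        List<Path> chunks;
        try (Stream<Path> files = Files.list(folder)) {
            chunks = files.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Path chunk : chunks) {
                Callable<Long> task = () -> processChunk(chunk);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // rethrows any failure from a worker
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size bounds how many chunks are processed at once, which in turn bounds the number of concurrent database connections the real work would need.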

Has anybody attempted this kind of scenario?
If so, have you run into any challenges?

Are there any limitations with streaming or the file component? And which
component is better suited to this task?

Regards,

Rajith Muditha Attapattu <http://rajith.2rlabs.com/>

Re: Processing very large files with Camel

Posted by Rajith Muditha Attapattu <ra...@gmail.com>.
Thank you Stephan for the quick answer!

On Wed, May 17, 2017 at 11:39 AM, Burkard Stephan <Stephan.Burkard@visana.ch> wrote:

> Hi
>
> I read XML files of up to about 400 MB with the following route. It uses
> the file component AND streaming.
>
> Since it is XML, the splitting is done on a specific XML element. The
> chunks are then sent to a JMS queue. Before sending them to the queue you
> can of course apply transformations etc. to create the needed format.
>
> from(fileEndpointUri)
>         .routeId(localRouteId)
>         // splitElementName: name of the XML element used for splitting
>         .split().tokenizeXML(splitElementName).streaming()
>         .to(queueEndpointUri);
>
> Regards
> Stephan

-- 
Regards,

Rajith Muditha Attapattu <http://rajith.2rlabs.com/>

Re: Processing very large files with Camel

Posted by Burkard Stephan <St...@visana.ch>.
Hi 

I read XML files of up to about 400 MB with the following route. It uses the file component AND streaming.

Since it is XML, the splitting is done on a specific XML element. The chunks are then sent to a JMS queue. Before sending them to the queue you can of course apply transformations etc. to create the needed format.

from(fileEndpointUri)
	.routeId(localRouteId)
	// splitElementName: name of the XML element used for splitting
	.split().tokenizeXML(splitElementName).streaming()
	.to(queueEndpointUri);
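
To illustrate what splitting in streaming mode means, without claiming this is how Camel implements it, here is a minimal plain-Java sketch using the JDK's StAX parser. It visits one matching element at a time instead of loading the whole document. The class and method names are hypothetical, and it assumes the split element contains only text (getElementText() fails on nested child elements).

```java
import java.io.Reader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Camel.
final class XmlElementStreamer {

    /**
     * Calls handler with the text of each element named elementName, one at a
     * time, and returns how many such elements were seen.
     */
    static int forEachElementText(Reader xml, String elementName, Consumer<String> handler)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    // Assumes a text-only element; leaves the reader on its end tag.
                    handler.accept(reader.getElementText());
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }
}
```

In this sketch the handler plays the role of the queue endpoint: each element is passed on as soon as it is parsed, so memory use depends on the size of a single element rather than the size of the file.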

Regards
Stephan

