Posted to users@camel.apache.org by gramanero <gr...@gmail.com> on 2012/09/14 17:03:32 UTC

Processing large files in Camel

I am attempting to understand how the Camel File/File2 component works as I
have a need to use this type of capability in an upcoming project.

Goal: Write a Camel spring-based route that will detect files in a folder
structure, wait until the entire file has been written, and place an entry
onto an ActiveMQ queue that indicates the file name, path, and length, and
leave the file in place (i.e. do not let Camel move it to a .camel folder).

Camel Route:
    <route id="newFileReceived" startupOrder="1">
      <from uri="file:E:\NewDataReceived?recursive=true&amp;maxMessagesPerPoll=500&amp;noop=true&amp;delay=60000"/>
      <to uri="activemq:queue:data.received"/>
    </route>

Problem: For small files this route works as expected. For large files, I am
seeing a .camelLock file appear every N seconds (which corresponds to the
configured 'delay' setting) and then disappear. I am assuming that this is
Camel trying to get a read lock on the file, but I cannot find any docs on
this type of behavior. I am also guessing that it keeps appearing and
disappearing due to the file size? The files that I am dealing with are
7 GB and up. This may not be a realistic use case for
what we will be dealing with, but I want to verify that the route will
only ever put an entry onto the queue once the file has been completely
written and not before.

Camel version: Whatever is baked into apache-servicemix-4.4.1-fuse-03-06

Any suggestions/help is greatly appreciated.

Thank you.





--
View this message in context: http://camel.465427.n5.nabble.com/Processing-large-files-in-Camel-tp5719376.html
Sent from the Camel - Users mailing list archive at Nabble.com.

Re: Processing large files in Camel

Posted by gramanero <gr...@gmail.com>.
I found this in the log as it relates to the issue with the .camelLock file.
Looks like there is a java.lang.NullPointerException being thrown. No clue
as to why this is happening though.


Fri Sep 14 2012 12:43:21 GMT-0400 (Eastern Daylight Time)	ERROR	Failed
delivery for exchangeId: ID-WMORMANS-7972-1346359758983-12-77. Exhausted
after delivery attempt: 1 caught: java.lang.NullPointerException	
java.lang.NullPointerException: null

Fri Sep 14 2012 12:43:21 GMT-0400 (Eastern Daylight Time)	WARN	Rollback file
strategy:
org.apache.camel.component.file.strategy.GenericFileRenameProcessStrategy@612a3291
for file:
GenericFile[E:\Publishing\NewDataReceived\ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP]		

Fri Sep 14 2012 12:43:20 GMT-0400 (Eastern Daylight Time)	INFO
ID-WMORMANS-7972-1346359758983-12-77 >>>
(publishing.dataWatchDog.FileReceived)
from(file://E:\Publishing\NewDataReceived?delay=60000&maxMessagesPerPoll=500&recursive=true)
--> activemq://queue:data.received <<< Pattern:InOnly,
Headers:{CamelFileNameOnly=OEC.Publishing.Service.FileWatcher-7-19-2012.DMP,
breadcrumbId=ID-WMORMANS-7972-1346359758983-12-78,
CamelFileLength=7623493727,
CamelFileAbsolutePath=E:\Publishing\NewDataReceived\ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP,
CamelFileRelativePath=ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP,
CamelFilePath=E:\Publishing\NewDataReceived\ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP,
CamelFileAbsolute=true,
CamelFileName=ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP,
CamelFileLastModified=Thu Jul 19 16:02:52 EDT 2012,
CamelFileParent=E:\Publishing\NewDataReceived\ReferenceData\Ford},
BodyType:org.apache.camel.component.file.GenericFile, Body:[Body is file
based:
GenericFile[E:\Publishing\NewDataReceived\ReferenceData\Ford\OEC.Publishing.Service.FileWatcher-7-19-2012.DMP]]




RE: Data Transform to Bean

Posted by santoshjoshi <sa...@gmail.com>.
Hi Chandra,

I have created an example at

https://github.com/santoshjoshi/camel-bindy-example

Please have a look at it.

regards
Santosh Joshi

RE: Data Transform to Bean

Posted by Chandra Kalirasa <ck...@seatoncorp.com>.
Hi Santosh, 
Thanks for the prompt response.

Do I need to convert the text file (*.txt) into a String before I map it into a bean? I ask because these are not CSV files.

  context.addRoutes(new RouteBuilder() {
            public void configure() {
                // Read employee_feed.txt from inputdir (left in place by noop=true)
                // and convert the file body to a String.
                from("file://inputdir?noop=true&fileName=employee_feed.txt")
                    .convertBodyTo(String.class);
            }
        });

Thanks
Chandra


-----Original Message-----
From: santoshjoshi [mailto:santoshjoshi2003@gmail.com] 
Sent: Friday, September 14, 2012 12:37 PM
To: users@camel.apache.org
Subject: Re: Data Transform to Bean

Hi Chandra,

You can use Camel Bindy Component.  http://camel.apache.org/bindy.html

Regards
Santosh Joshi






Re: Data Transform to Bean

Posted by santoshjoshi <sa...@gmail.com>.
Hi Chandra,

You can use Camel Bindy Component.  http://camel.apache.org/bindy.html

Regards
Santosh Joshi



Data Transform to Bean

Posted by Chandra Kalirasa <ck...@seatoncorp.com>.
Hi:
I am new to Camel.

I have text files (.txt) which contain employee info. I need to map the employee info fields into beans. Is there any mapping/data transform to bean in Camel? Please advise.

Thank you,
Chandra

Sample text file:

EMPLOYEE_NUM|FIRST_NAM|LAST_NAM|EMAIL|WORK_PHONE_NUM|DEPT_CD|ACTIVE|ROLE|SSN|BUS_UNIT_CD                
112011|shiv|Pewee|shiv.pewee@yahoo.com||610000 Employee Relations|N|R|123-62-4874|IT           
123212|red|Nore|abby.norman@google.com||610000 Employee Relations|N|R|123-98-7953|HR         
357533|dade|kkekee|ada.keke@google.com|7077-359-6452|610 Employee Relations|N|R|308-25-7297|SHIPING        
134944|Penn|Russe|penn.russe@google.com||610 Employee Relations|Y|R|123-37-9350|IT  
111155|Alb|Lasillase|ad@yahoo.com|180-693-2428|530  Administration|N|H|150-97-1265|IT
1111661|ed|Xasillase|exandria.casils@google.com|180-693-2428|543 Administration|N|R|230-97-1265|IT
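
The reply in this thread points to Camel Bindy for this mapping. Independently of Bindy, the mapping itself can be sketched in plain Java, assuming each record sits on one line with exactly ten pipe-separated fields as in the header row above. The class name EmployeeLineParser and the Employee bean are hypothetical, not part of Camel or Bindy.

```java
public class EmployeeLineParser {

    /** Hypothetical bean; field names mirror the header row of the sample file. */
    public static final class Employee {
        public final String employeeNum, firstName, lastName, email, workPhone,
                deptCd, active, role, ssn, busUnitCd;

        Employee(String[] f) {
            employeeNum = f[0];
            firstName = f[1];
            lastName = f[2];
            email = f[3];
            workPhone = f[4];
            deptCd = f[5];
            active = f[6];
            role = f[7];
            ssn = f[8];
            busUnitCd = f[9];
        }
    }

    /** Parses one data line (not the header row) into an Employee. */
    public static Employee parse(String line) {
        // A limit of -1 keeps empty fields, such as a missing phone number.
        String[] fields = line.split("\\|", -1);
        if (fields.length != 10) {
            throw new IllegalArgumentException(
                    "Expected 10 fields but found " + fields.length);
        }
        // The sample data has trailing spaces, so trim every field.
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return new Employee(fields);
    }
}
```

The caller is expected to skip the header row and pass each remaining line to parse.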




Re: Processing large files in Camel

Posted by gramanero <gr...@gmail.com>.
Sorry, one more piece of information: the appearing and disappearing of the
.camelLock file is continuous. It never stops unless I stop the route. Also,
I am running on Windows 7.




Re: Processing large files in Camel

Posted by gramanero <gr...@gmail.com>.
Hi Claus,

Thank you for the response. After someone hijacked my thread I deleted my
postings as I figured no one would respond. Plus, I figured out the
problem. You actually touched on the issue I discovered.

When I sent the message to AMQ, you are correct that the entire file
content was being delivered to the queue. This is definitely not what I
wanted. Since the files are so large, at least for my test cases,
"something" was blowing up under the covers to cause the
java.lang.NullPointerException. My assumption is that due to the volume of
bytes being written to the queue either a timeout or some sort of buffer
overflow was happening, but the error was just manifesting itself in a
weird way (i.e. the java.lang.NullPointerException). I would love to know the
actual problem, but in the end all I wanted to write to the queue was some
basic information about the file and NOT the entire file itself. Performing
a transform did the trick, so at this point I have things working as
expected. Just had to hit my head against the desk a few times to figure
out what was going on.

Again, Thank you for your response. It is very much appreciated.

--Scott

On Sat, Sep 15, 2012 at 2:25 AM, Claus Ibsen-2 [via Camel] <
ml-node+s465427n5719403h69@n5.nabble.com> wrote:

> Hi
>
> I suggest taking a moment to look more closely at the docs
> http://camel.apache.org/file2
>
> For example, on that link there is a red box at the top that talks about
> the issue of reading files which are currently being written by
> another process. There are a number of options you can configure to
> choose among. And if those don't suit you, you can implement your own
> custom read lock strategy.
>
> And for sending the "message" to AMQ you would need to do a message
> transformation so you don't send the entire file content with it, but
> only the meta-data you want. And in the format you want.
>
> See the EIP for that
> http://camel.apache.org/message-translator.html
>
> There are more EIPs here
> http://camel.apache.org/enterprise-integration-patterns.html
>
> And since you have to leave the files in place, you would have to
> consider what to do in case the server is restarted.
> Then the previously scanned files may be picked up again. Is this
> okay, or do you need to keep track of that?
> If so check out the idempotent consumer option on the file component,
> as well as the EIP
> http://camel.apache.org/idempotent-consumer.html
>
>

Re: Processing large files in Camel

Posted by Claus Ibsen <cl...@gmail.com>.
Hi

I suggest taking a moment to look more closely at the docs
http://camel.apache.org/file2

For example, on that link there is a red box at the top that talks about
the issue of reading files which are currently being written by
another process. There are a number of options you can configure to
choose among. And if those don't suit you, you can implement your own
custom read lock strategy.
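
The File component documents a readLock option for this (for example readLock=changed). The idea behind a "has the file stopped changing" check can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Camel's implementation: the class name FileStability and the single check interval are hypothetical, and a writer that pauses for longer than the interval would still be misjudged as finished.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileStability {

    /**
     * Returns true if the file's size and last-modified time are both
     * unchanged after waiting one check interval.
     */
    public static boolean isStable(Path file, long checkIntervalMillis)
            throws IOException, InterruptedException {
        long sizeBefore = Files.size(file);
        long modifiedBefore = Files.getLastModifiedTime(file).toMillis();

        // Give a writer that is still active a chance to change the file.
        Thread.sleep(checkIntervalMillis);

        return Files.size(file) == sizeBefore
                && Files.getLastModifiedTime(file).toMillis() == modifiedBefore;
    }
}
```

For multi-gigabyte files the interval probably needs to be generous, since a slow copy can stall between writes.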

And for sending the "message" to AMQ you would need to do a message
transformation so you don't send the entire file content with it, but
only the meta-data you want. And in the format you want.

See the EIP for that
http://camel.apache.org/message-translator.html
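
As a rough illustration of such a transformation, the sketch below builds a small text message from a file's name, absolute path, and length instead of its content. It uses only the JDK; the class name FileMetadata and the name=...;path=...;length=... format are assumptions made for this example. In a Camel route the equivalent step would sit between the file endpoint and the queue endpoint.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileMetadata {

    /** Builds a one-line description of the file without reading its content. */
    public static String describe(Path file) throws IOException {
        Path absolute = file.toAbsolutePath();
        return "name=" + absolute.getFileName()
                + ";path=" + absolute
                + ";length=" + Files.size(file);
    }
}
```

Because only metadata is read, the cost of building the message does not grow with the file size.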

There are more EIPs here
http://camel.apache.org/enterprise-integration-patterns.html

And since you have to leave the files in place, you would have to
consider what to do in case the server is restarted.
Then the previously scanned files may be picked up again. Is this
okay, or do you need to keep track of that?
If so check out the idempotent consumer option on the file component,
as well as the EIP
http://camel.apache.org/idempotent-consumer.html
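
The bookkeeping idea can be sketched in plain Java: remember a key for every file already handled, and persist the keys so they survive a restart. This is a minimal sketch, not Camel's idempotent repository. The class name SeenFiles and the one-key-per-line store file are hypothetical, and what to use as the key (for example the absolute path, perhaps combined with the file length) is left to the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SeenFiles {
    private final Path store;
    private final Set<String> seen = new HashSet<>();

    /** Loads previously recorded keys from the store file, if it exists. */
    public SeenFiles(Path store) throws IOException {
        this.store = store;
        if (Files.exists(store)) {
            seen.addAll(Files.readAllLines(store, StandardCharsets.UTF_8));
        }
    }

    /** Returns true the first time a key is offered and false on every repeat. */
    public synchronized boolean markIfNew(String key) throws IOException {
        if (!seen.add(key)) {
            return false;
        }
        // Append the key so that it is still known after a restart.
        Files.write(store, Collections.singletonList(key), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }
}
```

A file would be announced on the queue only when markIfNew returns true for its key.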





-- 
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus, fusenews
Blog: http://davsclaus.com
Author of Camel in Action: http://www.manning.com/ibsen