Posted to users@camel.apache.org by Christian Mueller <ch...@gmail.com> on 2010/12/10 18:56:47 UTC

Splitting big XML files using xpath() and streaming()

Hello all!

I'm using Camel 2.2.0 with Java 1.6.20, but this problem also exists in
Camel 2.5.0.

It looks like the streaming() mode does not work (for me) in conjunction with
xpath(). I created a simple unit test:

{code}
import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.builder.xml.XPathBuilder;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class StreamingTest extends CamelTestSupport {

    @Test
    public void testStreamingBigXmlfiles() throws Exception {
        Thread.sleep(200000); // let Camel split the big file
    }

    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        return new RouteBuilder() {
            @Override
            public void configure() throws Exception {
                // split on every element matched by /a/b
                XPathBuilder xPath = xpath("/a/b");

                from("file://src/test/resources?noop=true")
                    .log("Starting to process big file: ${header.CamelFileName}")
                    .split(xPath).streaming()
                        .process(new Processor() {
                            private int counter = 0;

                            // print a running count of the split messages
                            public void process(Exchange arg0) throws Exception {
                                System.out.println("MSG: " + ++counter);
                            }
                        })
                    .end()
                    .log("Done processing big file: ${header.CamelFileName}");
            }
        };
    }
}
{code}

which consumes a 100MB file with the following format:

{code}
<?xml version="1.0" encoding="UTF-8"?>

	
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
	
	
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
	
	
...
	

{code}

and each time, I get the following exception:

{code}
SEVERE: Caused by: [org.apache.camel.CamelExecutionException - Exception occurred during execution on the exchange: Exchange[null]]
org.apache.camel.CamelExecutionException: Exception occurred during execution on the exchange: Exchange[null]
	at org.apache.camel.util.ObjectHelper.wrapCamelExecutionException(ObjectHelper.java:1156)
	at org.apache.camel.impl.DefaultExchange.setException(DefaultExchange.java:262)
	at org.apache.camel.processor.MulticastProcessor.process(MulticastProcessor.java:196)
	at org.apache.camel.processor.Splitter.process(Splitter.java:94)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
	at org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
	at org.apache.camel.processor.interceptor.TraceInterceptor.process(TraceInterceptor.java:99)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.RedeliveryErrorHandler.processErrorHandler(RedeliveryErrorHandler.java:299)
	at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:208)
	at org.apache.camel.processor.DefaultChannel.process(DefaultChannel.java:256)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.Pipeline.process(Pipeline.java:143)
	at org.apache.camel.processor.Pipeline.process(Pipeline.java:78)
	at org.apache.camel.processor.UnitOfWorkProcessor.process(UnitOfWorkProcessor.java:99)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
	at org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
	at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:68)
	at org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:322)
	at org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:155)
	at org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:121)
	at org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:97)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
	at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.getNodeObject(DeferredDocumentImpl.java:998)
	at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.synchronizeChildren(DeferredDocumentImpl.java:1741)
	at com.sun.org.apache.xerces.internal.dom.DeferredElementNSImpl.synchronizeChildren(DeferredElementNSImpl.java:141)
	at com.sun.org.apache.xerces.internal.dom.ParentNode.hasChildNodes(ParentNode.java:194)
	at com.sun.org.apache.xml.internal.dtm.ref.dom2dtm.DOM2DTM.nextNode(DOM2DTM.java:344)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase._firstch(DTMDefaultBase.java:531)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase.getFirstChild(DTMDefaultBase.java:971)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBaseTraversers$ChildTraverser.first(DTMDefaultBaseTraversers.java:409)
	at com.sun.org.apache.xpath.internal.axes.AxesWalker.getNextNode(AxesWalker.java:325)
	at com.sun.org.apache.xpath.internal.axes.AxesWalker.nextNode(AxesWalker.java:361)
	at com.sun.org.apache.xpath.internal.axes.WalkingIterator.nextNode(WalkingIterator.java:192)
	at com.sun.org.apache.xpath.internal.axes.NodeSequence.nextNode(NodeSequence.java:288)
	at com.sun.org.apache.xpath.internal.axes.NodeSequence.runTo(NodeSequence.java:442)
	at com.sun.org.apache.xml.internal.dtm.ref.DTMNodeList.<init>(DTMNodeList.java:79)
	at com.sun.org.apache.xpath.internal.objects.XNodeSet.nodelist(XNodeSet.java:339)
	at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.getResultAsType(XPathExpressionImpl.java:353)
	at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.eval(XPathExpressionImpl.java:99)
	at com.sun.org.apache.xpath.internal.jaxp.XPathExpressionImpl.evaluate(XPathExpressionImpl.java:180)
	at org.apache.camel.builder.xml.XPathBuilder.doInEvaluateAs(XPathBuilder.java:657)
	at org.apache.camel.builder.xml.XPathBuilder.evaluateAs(XPathBuilder.java:629)
	at org.apache.camel.builder.xml.XPathBuilder.evaluate(XPathBuilder.java:602)
	at org.apache.camel.builder.xml.XPathBuilder.evaluate(XPathBuilder.java:131)
	at org.apache.camel.processor.Splitter.createProcessorExchangePairs(Splitter.java:99)
	at org.apache.camel.processor.MulticastProcessor.process(MulticastProcessor.java:181)
	at org.apache.camel.processor.Splitter.process(Splitter.java:94)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.DelegateAsyncProcessor.processNext(DelegateAsyncProcessor.java:98)
	at org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:89)
	at org.apache.camel.processor.interceptor.TraceInterceptor.process(TraceInterceptor.java:99)
	at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:70)
	at org.apache.camel.processor.RedeliveryErrorHandler.processErrorHandler(RedeliveryErrorHandler.java:299)
	at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:208)
{code}

Any idea?

Thanks in advance,
Christian
-- 
View this message in context: http://camel.465427.n5.nabble.com/Splitting-big-XML-files-using-xpath-and-streaming-tp3300695p3300695.html
Sent from the Camel - Users mailing list archive at Nabble.com.

Re: Splitting big XML files using xpath() and streaming()

Posted by Christian Mueller <ch...@gmail.com>.
Hello jburkhardt!
I was not aware of VTD. I will have a look at it today.
Thanks,
Christian

Re: Splitting big XML files using xpath() and streaming()

Posted by jburkhardt <jb...@gmail.com>.
I'd really recommend using VTD or something similar to split large XML.  Much
faster and more memory efficient.
Define a bean that does the split and returns an iterator and supply it as
the method for split.
Simple example using VTD to split XML:
http://stackoverflow.com/questions/1640472/java-how-to-split-xml-stream-into-small-xml-documents-xpath-on-streaming-xml-pa/1641152#1641152
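
A minimal sketch of the "bean that returns an iterator" idea. It uses the StAX API that ships with the JDK instead of VTD (a swapped-in technique, chosen only so the example has no extra dependencies). The class name StaxElementSplitter is hypothetical, and the sketch assumes the elements to split on contain text only.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper (not part of Camel or VTD): lazily yields the text of each matching element. */
final class StaxElementSplitter implements Iterator<String> {
    private final XMLStreamReader reader;
    private final String elementName;
    private String next; // look-ahead buffer, null once the input is exhausted

    StaxElementSplitter(Reader in, String elementName) throws XMLStreamException {
        this.reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        this.elementName = elementName;
        advance();
    }

    // Move the parser forward to the next element with the requested name.
    private void advance() throws XMLStreamException {
        next = null;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && elementName.equals(reader.getLocalName())) {
                next = reader.getElementText(); // assumes text-only content
                return;
            }
        }
    }

    public boolean hasNext() {
        return next != null;
    }

    public String next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        String current = next;
        try {
            advance();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return current;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) throws Exception {
        Iterator<String> parts =
            new StaxElementSplitter(new StringReader("<a><b>one</b><b>two</b></a>"), "b");
        while (parts.hasNext()) {
            System.out.println(parts.next()); // prints "one", then "two"
        }
    }
}
```

Only one element is held in memory at a time. In a route, a bean method returning such an iterator could be handed to split() together with streaming(); the exact DSL call depends on the Camel version.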


Re: Splitting big XML files using xpath() and streaming()

Posted by Charles Moulliard <cm...@gmail.com>.
Hi Christian,

Unfortunately, I have not yet found the time to investigate this issue, but 
I should get to it soon for a customer in Germany who would like to parse big 
XML files (around 250 MB) using XPath. After a quick search last 
week, I found this interesting note on the Saxon web site about the 
strategies to adopt --> 
http://www.saxonica.com/documentation/sourcedocs/streaming.xml

Regards,

Charles

On 13/12/10 10:12, Christian Mueller wrote:
> Hello Richard!
> Thanks for your suggestion. I'm aware of the Smook integration, because I
> developed it together with the Smooks guys. :o)
> I thought if it's possible with the splitter EIP via XPath and the
> streaming() mode, it is the easiest solution (also for my junior
> co-workers). But it looks like it's not possible and I will make this more
> clear in the wiki (because there is example code which use xpath and the
> streaming mode).
> I also think about whether or not it's meaningful to add this improvement to
> the splitter/xpath EIP...
> Cheers,
> Christian

Re: Questions of how to simplify using Camel as mediator for WebService

Posted by William Tam <em...@gmail.com>.
Another suggestion is to leverage the Simple language's OGNL support
(http://camel.apache.org/simple.html), which is pretty cool. I haven't
tried it myself, though.

For example, in PAYLOAD mode you can get ...

the second SOAP message part by simple("${body.body[1]}")
the third SOAP header by simple("${body.headers[2]}")

In all cases, an Element is returned but you can easily convert it to a
string if you like.



On 12/14/2010 10:09 AM, William Tam wrote:
>
> On 12/14/2010 08:51 AM, ext2 wrote:
>> Hi:
>>
>> 	The cxf component (payload model)will wrapped the soap message in a
>> CXFPayLoad or a CXFMessage. Both are not advantage for camel's script
>> language to access headers and body xml。 
>> 	I cannot use a simple expression etc: simple language or xpath
>> language to access the soap header. That means I must always write a java
>> program to query the soap xml even for a very simple usage;
> You can try MESSAGE mode. You will get the entire SOAP message in one
> XML. CxfPayload gives you access to individual header or body part as a
> Element and you don't have to parse the SOAP message. In MESSAGE mode,
> you get to parse the SOAP message which sounds like what you want.
>
>> 	I am not sure why the camel's cxf component doesn't map soap header
>> to  message header,soap body to message body. 
> Camel message header is used by transport header and other Camel
> "internal" header. If you jam the SOAP header in there, not only it can
> create a mess. How can other components tell it is not a "transport"
> header but it is a header of your SOAP message? If you think about it,
> the SOAP header is really a part of the a SOAP message which is the
> payload of Camel message. Also, the Camel header would be propagated as
> protocol" header such as HTTP unless it is filtered out.
>
>> 	If so , the end user can  use camel as mediator  for Web Service
>> more simply;
>>
> That may be true if Camel has only one component - camel-cxf.
>> Thanks for suggestion
>>
>>
>>

Re: Questions of how to simplify using Camel as mediator for WebService

Posted by ext2 <xu...@tongtech.com>.
Hi, William Tam:
It's cool. 
Thanks a lot


----original -----
Sender: William Tam [mailto:email.wtam@gmail.com] 
Date: 2010/12/15 0:04
Receiver: users@camel.apache.org
Subject: Re: Questions of how to simplify using Camel as mediator for
WebService

I think you can give Simple Language's OGNL a try (on CxfPayload object)
to avoid writing Java code.

On 12/14/2010 10:40 AM, ext2 wrote:
>> Camel message header is used by transport header and other Camel
>> "internal" header.
> Is it the design principles of camel? Only transport header and Internal
> header?
>
>> If you jam the SOAP header in there, not only it can
>> create a mess. How can other components tell it is not a "transport"
>> header but it is a header of your SOAP message? If you think about it,
>> the SOAP header is really a part of the a SOAP message which is the
>> payload of Camel message.
>> Also, the Camel header would be propagated as
>> protocol" header such as HTTP unless it is filtered out.
> Maybe you are right, but how the resolve the question "facilitate to use
> camel as web service mediator"? 
> Or camel cannot act as a better mediator for webservice?
>
> ----original-----
> Sender: William Tam [mailto:email.wtam@gmail.com] 
> Date: 2010/12/14 23:10
> Receiver: users@camel.apache.org
> Subject: Re: Questions of how to simplify using Camel as mediator for
> WebService
>
>
>
> On 12/14/2010 08:51 AM, ext2 wrote:
>> Hi:
>>
>> 	The cxf component (payload model)will wrapped the soap message in a
>> CXFPayLoad or a CXFMessage. Both are not advantage for camel's script
>> language to access headers and body xml。 
>> 	I cannot use a simple expression etc: simple language or xpath
>> language to access the soap header. That means I must always write a java
>> program to query the soap xml even for a very simple usage;
> You can try MESSAGE mode. You will get the entire SOAP message in one
> XML. CxfPayload gives you access to individual header or body part as a
> Element and you don't have to parse the SOAP message. In MESSAGE mode,
> you get to parse the SOAP message which sounds like what you want.
>
>> 	I am not sure why the camel's cxf component doesn't map soap header
>> to  message header,soap body to message body. 
>> 	If so , the end user can  use camel as mediator  for Web Service
>> more simply;
>>
> That may be true if Camel has only one component - camel-cxf.
>> Thanks for suggestion
>>
>>
>>
>
>



Re: Questions of how to simplify using Camel as mediator for WebService

Posted by William Tam <em...@gmail.com>.
I think you can give Simple Language's OGNL a try (on CxfPayload object)
to avoid writing Java code.

On 12/14/2010 10:40 AM, ext2 wrote:
>> Camel message header is used by transport header and other Camel
>> "internal" header.
> Is it the design principles of camel? Only transport header and Internal
> header?
>
>> If you jam the SOAP header in there, not only it can
>> create a mess. How can other components tell it is not a "transport"
>> header but it is a header of your SOAP message? If you think about it,
>> the SOAP header is really a part of the a SOAP message which is the
>> payload of Camel message.
>> Also, the Camel header would be propagated as
>> protocol" header such as HTTP unless it is filtered out.
> Maybe you are right, but how the resolve the question "facilitate to use
> camel as web service mediator"? 
> Or camel cannot act as a better mediator for webservice?
>
> ----original-----
> Sender: William Tam [mailto:email.wtam@gmail.com] 
> Date: 2010/12/14 23:10
> Receiver: users@camel.apache.org
> Subject: Re: Questions of how to simplify using Camel as mediator for
> WebService
>
>
>
> On 12/14/2010 08:51 AM, ext2 wrote:
>> Hi:
>>
>> 	The cxf component (payload model)will wrapped the soap message in a
>> CXFPayLoad or a CXFMessage. Both are not advantage for camel's script
>> language to access headers and body xml。 
>> 	I cannot use a simple expression etc: simple language or xpath
>> language to access the soap header. That means I must always write a java
>> program to query the soap xml even for a very simple usage;
> You can try MESSAGE mode. You will get the entire SOAP message in one
> XML. CxfPayload gives you access to individual header or body part as a
> Element and you don't have to parse the SOAP message. In MESSAGE mode,
> you get to parse the SOAP message which sounds like what you want.
>
>> 	I am not sure why the camel's cxf component doesn't map soap header
>> to  message header,soap body to message body. 
>> 	If so , the end user can  use camel as mediator  for Web Service
>> more simply;
>>
> That may be true if Camel has only one component - camel-cxf.
>> Thanks for suggestion
>>
>>
>>
>
>

Re: Questions of how to simplify using Camel as mediator for WebService

Posted by ext2 <xu...@tongtech.com>.
>Camel message header is used by transport header and other Camel
>"internal" header.
Is this a design principle of Camel, that message headers carry only
transport headers and internal headers?

> If you jam the SOAP header in there, not only it can
>create a mess. How can other components tell it is not a "transport"
>header but it is a header of your SOAP message? If you think about it,
>the SOAP header is really a part of the a SOAP message which is the
>payload of Camel message.
>Also, the Camel header would be propagated as
>protocol" header such as HTTP unless it is filtered out.

Maybe you are right, but then how do we resolve the question of making
Camel easier to use as a web service mediator?
Or can Camel simply not act as a better mediator for web services?

----original-----
Sender: William Tam [mailto:email.wtam@gmail.com] 
Date: 2010/12/14 23:10
Receiver: users@camel.apache.org
Subject: Re: Questions of how to simplify using Camel as mediator for
WebService



On 12/14/2010 08:51 AM, ext2 wrote:
> Hi:
>
> 	The cxf component (payload model)will wrapped the soap message in a
> CXFPayLoad or a CXFMessage. Both are not advantage for camel's script
> language to access headers and body xml。 
> 	I cannot use a simple expression etc: simple language or xpath
> language to access the soap header. That means I must always write a java
> program to query the soap xml even for a very simple usage;
You can try MESSAGE mode. You will get the entire SOAP message in one
XML. CxfPayload gives you access to individual header or body part as a
Element and you don't have to parse the SOAP message. In MESSAGE mode,
you get to parse the SOAP message which sounds like what you want.

> 	I am not sure why the camel's cxf component doesn't map soap header
> to  message header,soap body to message body. 

> 	If so , the end user can  use camel as mediator  for Web Service
> more simply;
>
That may be true if Camel has only one component - camel-cxf.
> Thanks for suggestion
>
>
>



Re: Questions of how to simplify using Camel as mediator for WebService

Posted by William Tam <em...@gmail.com>.

On 12/14/2010 08:51 AM, ext2 wrote:
> Hi:
>
> 	The cxf component (payload model)will wrapped the soap message in a
> CXFPayLoad or a CXFMessage. Both are not advantage for camel's script
> language to access headers and body xml。 
> 	I cannot use a simple expression etc: simple language or xpath
> language to access the soap header. That means I must always write a java
> program to query the soap xml even for a very simple usage;
You can try MESSAGE mode. You will get the entire SOAP message as one
XML document. CxfPayload gives you access to each individual header or
body part as an Element, so you don't have to parse the SOAP message. In
MESSAGE mode, you get to parse the SOAP message yourself, which sounds
like what you want.

> 	I am not sure why the camel's cxf component doesn't map soap header
> to  message header,soap body to message body. 
Camel message headers are used for transport headers and other Camel
"internal" headers. If you jam the SOAP headers in there, it can
create a mess. How can other components tell that an entry is not a
"transport" header but a header of your SOAP message? If you think about
it, the SOAP header is really a part of the SOAP message, which is the
payload of the Camel message. Also, Camel headers are propagated as
"protocol" headers (such as HTTP headers) unless they are filtered out.

> 	If so , the end user can  use camel as mediator  for Web Service
> more simply;
>
That may be true if Camel has only one component - camel-cxf.
> Thanks for suggestion
>
>
>

Questions of how to simplify using Camel as mediator for WebService

Posted by ext2 <xu...@tongtech.com>.
Hi:

	The CXF component (in payload mode) wraps the SOAP message in a
CXFPayLoad or a CXFMessage. Neither is convenient for Camel's scripting
languages when accessing the headers and the body XML.
	I cannot use a simple expression (e.g. the simple language or the xpath
language) to access the SOAP header. That means I must always write a Java
program to query the SOAP XML, even for very simple usage.
	I am not sure why Camel's CXF component doesn't map the SOAP header
to the message header and the SOAP body to the message body.
	If it did, end users could use Camel as a mediator for web services
more simply.

Thanks for any suggestions



Re: Splitting big XML files using xpath() and streaming()

Posted by ext2 <xu...@tongtech.com>.
Hi, Christian Müller:
	Thank you. I know how to split CSV now.
First, please forgive my long e-mail, but this question has confused me
for a long time, and I would like to take this chance to state it clearly.

I am still unsure whether we should do "batch" operations in the
route or in the adaptor (Camel uses the words "component" or
"endpoint" instead of adaptor).
	Doing "batch" operations in the route is more flexible. But doing
them in the adaptors gives better performance: lower memory usage
and higher processing speed.
	CSV is not a good example; maybe the "SQL" component illustrates
my question better.
	First, let's assume the inbound and outbound "SQL" components support
"batch" operations.
	For the inbound adaptor, "batch" means: when the "SQL" inbound endpoint
selects billions of records, it does not output all the records at once.
Instead it outputs one batch of records at a time and continues until all
records have been output. The number of records in a batch is limited by
a "batch size". With the inbound adaptor's "batch" operation, it looks as
if a "split" operation is hidden inside the inbound endpoint.
	Because the records are never output all at once, memory usage is
reduced.
	For the outbound adaptor, "batch" means: it can receive a batch of
records and insert the whole batch into the database in a single insert
operation. That is to say, it can construct one long SQL statement that
inserts multiple records, so processing is much faster than inserting
records one by one.
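
To make the inbound "batch size" idea above concrete, here is a minimal sketch in plain Java. BatchingIterator is a hypothetical helper, not a Camel or JDBC class; it only shows how a lazily read record stream can be grouped into bounded batches so that at most one batch is in memory at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper: groups records from a lazy source into batches of at most batchSize. */
final class BatchingIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> source;
    private final int batchSize;

    BatchingIterator(Iterator<T> source, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    public boolean hasNext() {
        return source.hasNext();
    }

    public List<T> next() {
        if (!source.hasNext()) {
            throw new NoSuchElementException();
        }
        // Pull records from the source only until the current batch is full.
        List<T> batch = new ArrayList<T>(batchSize);
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    public static void main(String[] args) {
        Iterator<Integer> records = Arrays.asList(1, 2, 3, 4, 5).iterator();
        Iterator<List<Integer>> batches = new BatchingIterator<Integer>(records, 2);
        while (batches.hasNext()) {
            System.out.println(batches.next()); // prints [1, 2], then [3, 4], then [5]
        }
    }
}
```

An outbound side could then write each batch with a single multi-row insert. Whether this grouping belongs in the route or inside the endpoint is exactly the trade-off raised in this message.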

	Admittedly, batching in the adaptor is not as flexible as batching
in the route. But I have not thought of a good way to achieve both
flexibility and performance.
	How do you see this question?
	Does anyone have a good idea for it?

	Thanks a lot
-----original-----
Sender: Christian Müller [mailto:christian.mueller@gmail.com] 
Date: 2010/12/15 5:26
Receiver: users@camel.apache.org
Subject: Re: Splitting big XML files using xpath() and streaming()

Only for clarification, the streaming mode doesn't work with xpath but works
very well with the Scanner, which is used if you split like this:
split(body(String.class).tokenize("\n")).
If you have to split CSV files, you will probably use the Scanner.
You should also have a look at the camel smooks integration, provided by
Smooks out of the box...

Cheers,
Christian



Re: Splitting big XML files using xpath() and streaming()

Posted by Christian Müller <ch...@gmail.com>.
Just to clarify: the streaming mode doesn't work with xpath, but it works
very well with the Scanner, which is used if you split like this:
split(body(String.class).tokenize("\n")).
If you have to split CSV files, you will probably use the Scanner.
You should also have a look at the Camel Smooks integration, provided by
Smooks out of the box...
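
For illustration, this is roughly why a Scanner-based split can stream: java.util.Scanner is itself an Iterator<String> that reads only as far as the next delimiter. The sketch below is plain JDK code, not Camel's internal implementation, and the class name ScannerSplitDemo is hypothetical.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.Scanner;
import java.util.regex.Pattern;

/** Hypothetical demo class: tokenizes input lazily with java.util.Scanner. */
final class ScannerSplitDemo {

    // Scanner implements Iterator<String>; each call to next() reads one token.
    static Iterator<String> tokenize(Readable input, String delimiter) {
        // Pattern.quote treats the delimiter as a literal, not as a regex.
        return new Scanner(input).useDelimiter(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        Iterator<String> lines = tokenize(new StringReader("a,b\nc,d\n"), "\n");
        while (lines.hasNext()) {
            System.out.println(lines.next()); // prints "a,b", then "c,d"
        }
    }
}
```

With a file-backed Reader instead of the StringReader, each CSV line would be handed on before the next one is read.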

Cheers,
Christian

Re: Splitting big XML files using xpath() and streaming()

Posted by ext2 <xu...@tongtech.com>.
The real problem is: if the split expression cannot work in a streaming
fashion, the memory cost cannot be avoided.
XPath is one example that cannot stream. CSV would be another
example, which would mean we cannot process a very large CSV file (is that
so? I am not sure, but it seems so).
So how about supporting "batch" operations in Camel's components? I am
not sure whether this is suitable for Camel, but some commercial adapters
do support such a function.

Also, the batch option would apply not only to the file component but also
to other components, e.g. a SQL component that outputs its result in
batches.

----original -----
Sender: Charles Moulliard [mailto:cmoulliard@gmail.com] 
Date: 2010/12/14 19:22
Receiver: users@camel.apache.org
Subject: Re: Splitting big XML files using xpath() and streaming()

Hi Christian,

Interesting discussion that you have started where we reach the border / 
limit about what camel should do in a messaging approach instead of a 
batch process.

Regarding to your question about using a stylesheet to split the XML 
files into multiple small files, I'm not quite sure that this is the 
right solution as we have to check what happens between all the files 
individually before to answer to the question 'Is my file valid' ? We 
will probably speed up the process and reduce memory consumption but 
increase complexicity of the solution. If nevertheless, there is no 
other alternatives, we could use a SEDA processor to split the big file 
in small files, process them individually and use an aggregator to check 
if each file has been validated/processed correctly

Regards,

Charles


On 14/12/10 10:54, Christian Müller wrote:
> @Claus: As what I found, using XPath in a streaming mode is not possible out
> of the box in Java 5 or 6 (XPath 1.0 needs the DOM XML Document). JAXP 1.4
> (part of Java 6) includes the StAX API and can be used in Java 5. But than
> we have to parse the XPath expression by our self and use the Iterator API
> (XMLEventReader)... I think this is not what we want... I will update the
> wiki page with the information, that the streaming() mode can not be used in
> conjunction with xpath().
>
> @Charles: Thanks for the link. Do you plan to use a style sheet and split a
> big xml file into multilpe small files? Or is it possible with the pure
> Saxon Java API (without a style sheet) which returns an iterator? This could
> be also the solution for our requirements (pure Java API which we can use in
> a custom processor).
>
> Cheers,
> Christian
>



Re: Splitting big XML files using xpath() and streaming()

Posted by Charles Moulliard <cm...@gmail.com>.
Hi Christian,

You have started an interesting discussion; here we reach the limit of 
what Camel should do with a messaging approach as opposed to a 
batch process.

Regarding your question about using a stylesheet to split the XML 
file into multiple small files, I'm not quite sure that this is the 
right solution, as we would have to check the result for each of the files 
individually before we can answer the question 'Is my file valid?'. We 
would probably speed up the process and reduce memory consumption, but 
increase the complexity of the solution. If there is nevertheless no 
other alternative, we could use a SEDA processor to split the big file 
into small files, process them individually, and use an aggregator to check 
whether each file has been validated/processed correctly.

Regards,

Charles


On 14/12/10 10:54, Christian Müller wrote:
> @Claus: As what I found, using XPath in a streaming mode is not possible out
> of the box in Java 5 or 6 (XPath 1.0 needs the DOM XML Document). JAXP 1.4
> (part of Java 6) includes the StAX API and can be used in Java 5. But than
> we have to parse the XPath expression by our self and use the Iterator API
> (XMLEventReader)... I think this is not what we want... I will update the
> wiki page with the information, that the streaming() mode can not be used in
> conjunction with xpath().
>
> @Charles: Thanks for the link. Do you plan to use a style sheet and split a
> big xml file into multilpe small files? Or is it possible with the pure
> Saxon Java API (without a style sheet) which returns an iterator? This could
> be also the solution for our requirements (pure Java API which we can use in
> a custom processor).
>
> Cheers,
> Christian
>

Re: Splitting big XML files using xpath() and streaming()

Posted by Christian Müller <ch...@gmail.com>.
@Claus: As far as I can tell, using XPath in a streaming mode is not possible
out of the box in Java 5 or 6 (XPath 1.0 needs the DOM XML Document). JAXP 1.4
(part of Java 6) includes the StAX API and can also be used in Java 5. But then
we would have to parse the XPath expression ourselves and use the Iterator API
(XMLEventReader)... I think this is not what we want... I will update the
wiki page to state that the streaming() mode cannot be used in
conjunction with xpath().
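
To give a feel for what "parsing the XPath expression ourselves" on top of XMLEventReader would involve, here is a rough sketch. It is not an XPath implementation: it only matches plain absolute paths such as /a/b, with no predicates, axes, attributes or namespaces. SimplePathMatcher and its Handler interface are hypothetical names.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

/** Hypothetical helper: streams over XML and reports the text of elements at a fixed absolute path. */
final class SimplePathMatcher {

    /** Callback invoked once per matching element. */
    interface Handler {
        void onMatch(String text);
    }

    static void forEachMatch(Reader in, String path, Handler handler) throws XMLStreamException {
        XMLEventReader events = XMLInputFactory.newInstance().createXMLEventReader(in);
        Deque<String> openElements = new ArrayDeque<String>(); // root first, current element last
        StringBuilder text = null;                             // non-null while inside a match
        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();
            if (event.isStartElement()) {
                openElements.addLast(event.asStartElement().getName().getLocalPart());
                if (path.equals(currentPath(openElements))) {
                    text = new StringBuilder();
                }
            } else if (event.isCharacters() && text != null) {
                text.append(event.asCharacters().getData());
            } else if (event.isEndElement()) {
                if (text != null && path.equals(currentPath(openElements))) {
                    handler.onMatch(text.toString());
                    text = null;
                }
                openElements.removeLast();
            }
        }
    }

    // Builds "/a/b" style paths from the stack of currently open elements.
    private static String currentPath(Deque<String> openElements) {
        StringBuilder path = new StringBuilder();
        for (String name : openElements) {
            path.append('/').append(name);
        }
        return path.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b>one</b><c><b>skip</b></c><b>two</b></a>";
        forEachMatch(new StringReader(xml), "/a/b", new Handler() {
            public void onMatch(String text) {
                System.out.println(text); // prints "one", then "two"
            }
        });
    }
}
```

Even this toy version needs its own path bookkeeping, which supports the point that a general streaming xpath() is not something the JDK offers out of the box.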

@Charles: Thanks for the link. Do you plan to use a style sheet and split a
big XML file into multiple small files? Or is it possible with the pure
Saxon Java API (without a style sheet) to get back an iterator? This could
also be the solution for our requirements (a pure Java API which we can use in
a custom processor).

Cheers,
Christian

Re: Splitting big XML files using xpath() and streaming()

Posted by Claus Ibsen <cl...@gmail.com>.
On Mon, Dec 13, 2010 at 10:12 AM, Christian Mueller
<ch...@gmail.com> wrote:
>
> Hello Richard!
> Thanks for your suggestion. I'm aware of the Smooks integration, because I
> developed it together with the Smooks guys. :o)
> I thought that if it were possible with the splitter EIP via XPath and the
> streaming() mode, it would be the easiest solution (also for my junior
> co-workers). But it looks like it's not possible, and I will make this
> clearer in the wiki (because there is example code which uses xpath and the
> streaming mode).

We are at the mercy of whatever the XPath implementation in the JDK
can do out of the box.
If it cannot do XPath evaluation in a streaming fashion, then it can't.
Search a bit on Google to see what the JDK can do.
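
As a rough illustration of that limitation, the sketch below evaluates /a/b
with the plain JDK javax.xml.xpath API. The class name JdkXPathSketch is
made up for this example. The result comes back as a complete DOM NodeList,
so every matching node (and the document it belongs to) is in memory before
the first one can be processed.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper showing the non-streaming behaviour of the JDK XPath API.
public class JdkXPathSketch {

    // Evaluates the expression against the XML text and returns all matches at once.
    public static NodeList select(String xml, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        InputSource source = new InputSource(new StringReader(xml));
        // The input is parsed into a DOM tree before the expression is evaluated.
        return (NodeList) xpath.evaluate(expression, source, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList nodes = select("<a><b>1</b><b>2</b></a>", "/a/b");
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getTextContent());
        }
    }
}
```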


> I also think about whether or not it's meaningful to add this improvement to
> the splitter/xpath EIP...
> Cheers,
> Christian
> --
> View this message in context: http://camel.465427.n5.nabble.com/Splitting-big-XML-files-using-xpath-and-streaming-tp3300695p3302838.html
> Sent from the Camel - Users mailing list archive at Nabble.com.
>



-- 
Claus Ibsen
-----------------
FuseSource
Email: cibsen@fusesource.com
Web: http://fusesource.com
Twitter: davsclaus
Blog: http://davsclaus.blogspot.com/
Author of Camel in Action: http://www.manning.com/ibsen/

Re: Splitting big XML files using xpath() and streaming()

Posted by Charles Moulliard <cm...@gmail.com>.
Hi Christian,

Unfortunately, I have not yet found the time to investigate this issue, but
I should do so soon for a customer in Germany who would like to parse big
XML files (around 250 MB) using XPath. After a quick search last week, I
found this interesting note on the Saxon web site regarding strategies to
adopt -->
http://www.saxonica.com/documentation/sourcedocs/streaming.xml

Regards,

Charles

On 13/12/10 10:12, Christian Mueller wrote:
> Hello Richard!
> Thanks for your suggestion. I'm aware of the Smooks integration, because I
> developed it together with the Smooks guys. :o)
> I thought that if it were possible with the splitter EIP via XPath and the
> streaming() mode, it would be the easiest solution (also for my junior
> co-workers). But it looks like it's not possible, and I will make this
> clearer in the wiki (because there is example code which uses xpath and the
> streaming mode).
> I am also thinking about whether or not it's meaningful to add this
> improvement to the splitter/xpath EIP...
> Cheers,
> Christian

Re: Splitting big XML files using xpath() and streaming()

Posted by Christian Mueller <ch...@gmail.com>.
Hello Richard!
Thanks for your suggestion. I'm aware of the Smooks integration, because I
developed it together with the Smooks guys. :o)
I thought that if it were possible with the splitter EIP via XPath and the
streaming() mode, it would be the easiest solution (also for my junior
co-workers). But it looks like it's not possible, and I will make this
clearer in the wiki (because there is example code which uses xpath and the
streaming mode).
I am also thinking about whether or not it's meaningful to add this
improvement to the splitter/xpath EIP...
Cheers,
Christian
-- 
View this message in context: http://camel.465427.n5.nabble.com/Splitting-big-XML-files-using-xpath-and-streaming-tp3300695p3302838.html
Sent from the Camel - Users mailing list archive at Nabble.com.

Re: Splitting big XML files using xpath() and streaming()

Posted by Richard Kettelerij <ri...@gmail.com>.
@Christian. Perhaps Smooks can help in this scenario? The upcoming 1.4
release offers Camel support out of the box. Processing huge (gigabyte)
files is where Smooks is supposed to excel.

http://blog.smooks.org/2010/10/26/smooks-v1-4-beta1-available-for-download/
-- 
View this message in context: http://camel.465427.n5.nabble.com/Splitting-big-XML-files-using-xpath-and-streaming-tp3300695p3301059.html
Sent from the Camel - Users mailing list archive at Nabble.com.