Posted to users@camel.apache.org by gheidorn <gr...@gmail.com> on 2013/01/23 04:41:06 UTC
Tokenize Producing XML That is Not Well-Formed
I have the following XML structure:
*<c />*
*<c />*
Element c is optional. I'm using split tokenize on element b, which works
great when element c is present. I get tokens in the form of:
*<c />*
*<c />*
The issue is when c is not present, tokenize returns tokens in the form of:
* *
For some odd reason the closing tag for the root node is appearing in my
first token and this triggers an IOException caused by:
Caused by: javax.xml.bind.UnmarshalException
- with linked exception:
[org.xml.sax.SAXParseException: The markup in the document following the root element must be well-formed.]
For what it's worth, the original XML is passing an earlier XSD check with
no problems, so I know the XML that is getting split into tokens is
well-formed and validated.
Has anyone run into this issue before?
--
View this message in context: http://camel.465427.n5.nabble.com/Tokenize-Producing-XML-That-is-Not-Well-Formed-tp5726035.html
Sent from the Camel - Users mailing list archive at Nabble.com.
Re: Tokenize Producing XML That is Not Well-Formed
Posted by Henryk Konsek <he...@gmail.com>.
> If you agree, I'll submit a JIRA issue and can
> work on a patch.
Good catch, Greg :). I created the corresponding JIRA issue [1]. We
would appreciate it if you contributed a patch for the bug you found.
[1] https://issues.apache.org/jira/browse/CAMEL-6012
--
Henryk Konsek
http://henryk-konsek.blogspot.com
Re: Tokenize Producing XML That is Not Well-Formed
Posted by Christian Müller <ch...@gmail.com>.
+1
We love contributions ;-)
Have a look at http://camel.apache.org/contributing.html
Best,
Christian
Sent from a mobile device
On 23.01.2013 17:37, "gheidorn" <gr...@gmail.com> wrote:
> Christian, I created a JUnit that illustrates the issue (see attached). I
> believe we should enhance the TokenXMLPairExpressionIterator to account for
> self-closing XML tokens. If you agree, I'll submit a JIRA issue and can
> work on a patch.
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
Christian, I created a JUnit that illustrates the issue (see attached). I
believe we should enhance the TokenXMLPairExpressionIterator to account for
self-closing XML tokens. If you agree, I'll submit a JIRA issue and can
work on a patch.
Re: Tokenize Producing XML That is Not Well-Formed
Posted by Christian Müller <ch...@gmail.com>.
You may want to consider implementing your own splitter bean for this case.
Best,
Christian
Sent from a mobile device
On 23.01.2013 08:05, "gheidorn" <gr...@gmail.com> wrote:
> Alright that wasn't quite it. I continue to have problems with tokenizing
> elements missing optional children as originally stated.
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
Alright that wasn't quite it. I continue to have problems with tokenizing
elements missing optional children as originally stated.
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
In classic fashion, I've answered my own question after a night of debugging.
For posterity: I was converting the String representation of my XML into a
w3c Document object and then splitting that Document object using tokenize
xml. It was probably never intended to work like that! When I leave the
original XML as a String, tokenize xml works just fine in my scenario.
That being said, tokenizing a Document object seems "almost there" in terms
of functionality; only my edge case (missing optional child elements) isn't
working.
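
One possible explanation, which this thread does not confirm, is that an element written as an open/close pair in the String can come back in self-closing form once the XML has been parsed into a Document and serialized again. The sketch below uses only JDK classes (no Camel) to show that round trip; the class and method names are hypothetical, and whether Camel converts a Document to text this way is an assumption.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustration only: hypothetical helper, not part of Camel.
class DocumentRoundTripSketch {

    // Parses the XML text into a w3c Document, then serializes it back to text
    // with the JDK's default identity Transformer.
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> declaration so only the elements are compared.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // The empty <b></b> pair is the interesting part of this input.
        System.out.println(roundTrip("<a><b></b><b><c></c></b></a>"));
    }
}
```

With the JDK's built-in transformer, empty elements are typically written back in self-closing form, so a splitter that only looks for open/close tag pairs would see different text than it does for the original String.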
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
I tracked the code back to the TokenXMLPairExpressionIterator, which, as the
name indicates, works on tag pairs and doesn't check whether the token is
self-closing. I'm going to open a JIRA issue to see if we can build in
support for self-closing XML tokens. I'll see if I can submit a patch for
review today.
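
To illustrate the kind of failure described here, the following is a minimal sketch of a pair-only tokenizer next to one that also recognizes self-closing tags. It is not Camel's TokenXMLPairExpressionIterator and not the eventual patch; the class name, method names, and regex approach are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: this is NOT Camel's TokenXMLPairExpressionIterator.
class SelfClosingTokenSketch {

    // Naive pair matching: an opening <tag ...> up to the next </tag>.
    // A self-closing <tag .../> is mistaken for an opening tag.
    static List<String> pairOnly(String xml, String tag) {
        String t = Pattern.quote(tag);
        Pattern pattern = Pattern.compile(
                "<" + t + "\\b[^>]*>.*?</" + t + ">", Pattern.DOTALL);
        return collect(pattern, xml);
    }

    // Same idea, but a self-closing <tag .../> is tried first and becomes its own token.
    // Still a sketch: nested same-name elements and '>' inside attribute values
    // are not handled.
    static List<String> selfClosingAware(String xml, String tag) {
        String t = Pattern.quote(tag);
        Pattern pattern = Pattern.compile(
                "<" + t + "\\b[^>]*/>|<" + t + "\\b[^>]*>.*?</" + t + ">", Pattern.DOTALL);
        return collect(pattern, xml);
    }

    // Collects every match of the pattern as one token.
    private static List<String> collect(Pattern pattern, String xml) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = pattern.matcher(xml);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String xml = "<a><b id=\"1\"/><b id=\"2\"><c/></b></a>";
        System.out.println(pairOnly(xml, "b"));
        System.out.println(selfClosingAware(xml, "b"));
    }
}
```

For the sample input in main, pairOnly returns a single token that runs from the self-closing b through the end of the second b, which is not a well-formed document on its own because it has two root elements. selfClosingAware returns the two b elements as separate tokens.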
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
I wrote a short JUnit test and found that if the element has a separate
closing tag, tokenizeXML works correctly. If the element is self-closing,
tokenizeXML fails.
I have attached the JUnit test and am currently walking the code to pinpoint
the class that implements tokenizeXML, so I can patch it to accept
self-closing tags.
GenericTokenizeTest.java
<http://camel.465427.n5.nabble.com/file/n5726078/GenericTokenizeTest.java>
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
I attached my camel.xml configuration, but here is the route pseudocode where
the issue lies:
route
sftp
doTry
to validator:my.xsd
split strategyRef=myAggregationStrategy
tokenize token=ad xml=true
log message=in.body
camel.xml <http://camel.465427.n5.nabble.com/file/n5726070/camel.xml>
Re: Tokenize Producing XML That is Not Well-Formed
Posted by Henryk Konsek <he...@gmail.com>.
> I updated my XML to escape properly for viewing. Thanks in advance!
I don't quite follow the issue :). Could you send the routes you're using?
--
Henryk Konsek
http://henryk-konsek.blogspot.com
Re: Tokenize Producing XML That is Not Well-Formed
Posted by gheidorn <gr...@gmail.com>.
I updated my XML to escape properly for viewing. Thanks in advance!