Posted to users@jena.apache.org by Rick Liao <ri...@capsenta.com> on 2017/03/01 03:46:38 UTC

ResultSetFormatter outputs illegal XML

Hello,

I am using ResultSetFormatter to write the contents of a ResultSet to
an XML file. The data in my ResultSet contains a character that is illegal
in XML, \u0010. The formatter still writes the character to the XML file,
resulting in an XML file that is not well-formed.

Can I configure the formatter to ignore the illegal character? Or do I have
to clean the data before giving it to the formatter?

Code:
ResultSetFormatter.outputAsXML(outputStream, sparqlResultSet);

Thanks for your time!
Rick

Re: ResultSetFormatter outputs illegal XML

Posted by Andy Seaborne <an...@apache.org>.
Rick,

The data needs to be cleaned or treated as XML 1.1.

XML 1.0 does not have a way to encode or escape such a character. There
is no way to write the results legally in XML 1.0.

XML 1.1 does allow it, though only as a character reference (&#x10;),
not as a raw character.

See XML rule [2] (Char), which lists the only characters allowed in an
XML document. Rule [66] (CharRef) requires &# character references to
resolve to characters allowed by rule [2].
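
If cleaning the data is the route taken, a minimal sketch of the rule [2]
check in plain Java could look like the following. It uses only the JDK and
is not part of Jena; the class name Xml10Cleaner and both method names are
made up for illustration, and dropping the character (rather than replacing
it) is just one possible policy. Applying it to the literals in a ResultSet
before formatting is left out here.

```java
// Hypothetical helper, not a Jena API: filters text down to XML 1.0 rule [2].
final class Xml10Cleaner {
    private Xml10Cleaner() {}

    /** True if the code point is allowed by the Char production of XML 1.0. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every disallowed code point dropped. */
    static String stripIllegal(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        // Iterate by code point so supplementary characters stay intact
        // and unpaired surrogates are filtered out.
        s.codePoints()
         .filter(Xml10Cleaner::isXml10Char)
         .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String dirty = "abc\u0010def";
        System.out.println(stripIllegal(dirty)); // prints "abcdef"
    }
}
```

Tab, line feed and carriage return pass through unchanged, since rule [2]
allows them; every other control character below 0x20 is removed.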

Other formats like SPARQL Results in JSON allow the character.

     Andy
