Posted to solr-user@lucene.apache.org by zqzuk <zi...@hotmail.com> on 2008/01/21 22:24:04 UTC
illegal characters in xml file to be posted?
Hi, I am using the SimplePostTool to post files to Solr. I have encountered
a problem with the content of my XML files. I noticed that if an XML file
has fields whose values contain the character "&" or "<" or ">", the post
fails and I get the exception:
"javax.xml.stream.XMLStreamException: ParseError at [row, col]:[x,y]
Message: The entity name must immediately follow the '&' in the entity
reference"
It looks like these characters are illegal as literal content in XML, even
though I extracted them from XML in the first place. Is there a list of such
characters that I need to handle before I pass the files to SimplePostTool?
Thanks!
--
View this message in context: http://www.nabble.com/illegal-characters-in-xml-file-to-be-posted--tp15006748p15006748.html
Sent from the Solr - User mailing list archive at Nabble.com.
RE: illegal characters in xml file to be posted?
Posted by zqzuk <zi...@hotmail.com>.
Thanks for the quick advice!
pbinkley wrote:
>
> You should encode those three characters, and it doesn't hurt to encode
> the apostrophe and double-quote characters too:
> http://en.wikipedia.org/wiki/XML#Entity_references
>
> Peter
RE: illegal characters in xml file to be posted?
Posted by "Binkley, Peter" <Pe...@ualberta.ca>.
You should encode those three characters, and it doesn't hurt to encode
the apostrophe and double-quote characters too:
http://en.wikipedia.org/wiki/XML#Entity_references
Peter
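
To illustrate the advice above, here is a minimal sketch (not from the original reply) of escaping field values before writing the file that gets posted. It assumes the files are generated with Python and use the usual Solr <add><doc><field> update format. The helper names escape_field_value and build_add_doc and the sample field names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() already handles &, < and >; these two are the optional extras.
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_field_value(text):
    """Replace the five predefined XML special characters with entity references."""
    return escape(text, EXTRA_ENTITIES)


def build_add_doc(fields):
    """Build a Solr-style <add><doc> message with escaped names and values.

    `fields` is a hypothetical dict mapping field name to raw text value.
    """
    parts = ["<add><doc>"]
    for name, value in fields.items():
        parts.append(
            '<field name="%s">%s</field>'
            % (escape_field_value(name), escape_field_value(value))
        )
    parts.append("</doc></add>")
    return "".join(parts)


if __name__ == "__main__":
    xml_doc = build_add_doc({"id": "doc1", "title": 'AT&T <Q1> "report"'})
    # Sanity check: this raises ParseError if the output is not well-formed.
    ET.fromstring(xml_doc)
    print(xml_doc)
```

Writing the raw value AT&T into the file without this step would most likely reproduce the "entity name must immediately follow the '&'" error, because the parser treats a bare "&" as the start of an entity reference.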