Posted to users@groovy.apache.org by Andrew Myers <am...@gmail.com> on 2015/11/19 02:47:33 UTC

org.xml.sax.SAXParseException with XmlSlurper

Hi,

For a while I've been using Groovy to parse some badly formed HTML via
XmlSlurper in conjunction with TagSoup, something like this:

def slurper = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser())
def html = slurper.parseText(htmlText)

It works fine when I unit test it with Gradle. However, when I deploy it
inside another webapp that runs on Lucee (http://lucee.org/), I think I
run into some kind of "Jar hell". When I try to parse the htmlText, I get
an error like the one below, which makes me think it's not using the
TagSoup Parser.

The exception is org.xml.sax.SAXParseException, with a stack trace
starting like this:

The element type "meta" must be terminated by the matching end-tag "</meta>".
  at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source):-1
  at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source):-1
  at groovy.util.XmlSlurper.parse(XmlSlurper.java:205):205
  at groovy.util.XmlSlurper.parse(XmlSlurper.java:258):258
  at groovy.util.XmlSlurper.parseText(XmlSlurper.java:284):284
  at groovy.util.XmlSlurper$parseText.call(Unknown Source):-1
  at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:45):45
  at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108):108
  at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116):116
  at

I'm a bit lost as to what to look for to debug this.  Has anyone come 
across anything similar?
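
For anyone debugging a similar suspicion, one possible starting point is to
print which SAX parser the container actually hands out and which jar it
comes from. This is only a minimal sketch using standard JDK calls. The
helper name `describe` is made up for illustration, and it assumes the
script runs inside the same webapp class loader as the failing code. The
stack trace above goes through the JAXP default parser (Xerces here), which
is what XmlSlurper uses when it is constructed without an explicit reader.

```groovy
import javax.xml.parsers.SAXParserFactory

// Hypothetical helper: report a class name together with the jar (or
// directory) it was loaded from. Classes loaded by the JDK's bootstrap
// loader have no code source, so fall back to a placeholder for those.
def describe = { Class c ->
    def location = c.getProtectionDomain()?.getCodeSource()?.getLocation()
    "${c.getName()} (loaded from ${location ?: 'the JDK itself'})".toString()
}

// The JAXP default parser, i.e. what you get when no reader is supplied.
def defaultReader = SAXParserFactory.newInstance().newSAXParser().getXMLReader()
println "Default SAX reader: ${describe(defaultReader.getClass())}"

// Check whether TagSoup is visible at all from this class loader.
try {
    def tagSoup = Class.forName('org.ccil.cowan.tagsoup.Parser')
    println "TagSoup parser:     ${describe(tagSoup)}"
} catch (ClassNotFoundException e) {
    println 'TagSoup parser:     not found on this classpath'
}
```

If the first line reports a Xerces class while the failing call was supposed
to go through TagSoup, that points at the call site rather than the
classpath.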

Thanks!
Andrew.

Re: org.xml.sax.SAXParseException with XmlSlurper

Posted by Andrew Myers <am...@gmail.com>.
Thanks for the tip!

On Fri, 20 Nov 2015 2:43 am Owen Rubel <or...@gmail.com> wrote:

I have run into that multiple times. You have to disable HTML errors in
Tomcat/Jetty to avoid this.

Owen Rubel
415-971-0976
orubel@gmail.com

Re: org.xml.sax.SAXParseException with XmlSlurper

Posted by Owen Rubel <or...@gmail.com>.
I have run into that multiple times. You have to disable HTML errors in
Tomcat/Jetty to avoid this.

Owen Rubel
415-971-0976
orubel@gmail.com

On Wed, Nov 18, 2015 at 6:33 PM, Andrew Myers <am...@gmail.com> wrote:

> Oops, I have found the problem.  I was looking at the wrong bit of code,
> and at the part where it's failing I was parsing XML *without* TagSoup, and
> it appears that the service I was calling was returning an HTML error page.
>
> Sorry to waste your time :)
>
>

Re: org.xml.sax.SAXParseException with XmlSlurper

Posted by Andrew Myers <am...@gmail.com>.
Oops, I have found the problem.  I was looking at the wrong bit of code,
and at the part where it's failing I was parsing XML *without* TagSoup,
and it appears that the service I was calling was returning an HTML error
page.

Sorry to waste your time :)
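
A rough sketch of how this kind of failure could be caught earlier: check
the response before handing it to a strict XML parser. The closure name
`looksLikeXml` and the sample values are assumptions for illustration only,
not anything from the original code, and the checks are a heuristic rather
than a complete validation.

```groovy
// Hypothetical guard: decide whether an HTTP response is plausibly XML
// before passing its body to a strict (non-TagSoup) parser.
def looksLikeXml = { int status, String contentType, String body ->
    def type = (contentType ?: '').toLowerCase()
    def text = (body ?: '').trim()
    // Require a 200 status, an XML content type that is not (X)HTML,
    // and a body that at least starts with markup.
    status == 200 && type.contains('xml') && !type.contains('html') && text.startsWith('<')
}

// Made-up sample responses: a normal XML reply and an HTML error page.
def ok    = looksLikeXml(200, 'application/xml; charset=UTF-8', '<?xml version="1.0"?><result/>')
def error = looksLikeXml(500, 'text/html', '<html><head><meta charset="utf-8"></head></html>')

println "XML reply accepted:      ${ok}"
println "HTML error page accepted: ${error}"
```

With a guard like this, an HTML error page from the remote service would be
reported as such instead of surfacing as a SAXParseException about an
unterminated "meta" element.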