Posted to dev@lucene.apache.org by "Alexandre Rafalovitch (JIRA)" <ji...@apache.org> on 2016/10/11 20:51:20 UTC

[jira] [Closed] (SOLR-772) malformed XML updates w/Resin's Stax parser doesn't trigger errors

     [ https://issues.apache.org/jira/browse/SOLR-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexandre Rafalovitch closed SOLR-772.
--------------------------------------
    Resolution: Cannot Reproduce

Ancient question about a no-longer-supported deployment method (Resin).

> malformed XML updates w/Resin's Stax parser doesn't trigger errors
> ------------------------------------------------------------------
>
>                 Key: SOLR-772
>                 URL: https://issues.apache.org/jira/browse/SOLR-772
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> Originally noted by yonik on the mailing list...
> {quote}
> Then I tried Resin 3.1.1 and 3.1.6....
> Things *seem* to mostly work... until you get to updating:
>    ...
> Now here is another really weird thing... post any garbage to the
> update URL, and you still get a success!  It successfully fails on
> jetty.  Mangled query requests correctly fail.  This perhaps initially
> points to something specific to the XML config in jetty?
> {quote}
> Followup from Hoss...
> {quote}
> Skimming the code in XmlUpdateRequestHandler, and testing out various inputs, this seems like a bug in com.caucho.xml.stream.XMLStreamReaderImpl.
> Using curl as yonik described...
> curl -i http://localhost:8080/solr/update --data-binary 'crap' -H 'Content-type:text/xml; charset=utf-8'
> ...resin-3.1.6 (on Linux) returns a success (incorrectly) but the request
> handler doesn't log any action taken. If we alter the payload ('crap')
> above we can see some different behaviors...
> 1) 'crap<add><doc><field name="id">hoss</field></doc></add>'
> Solr adds the doc, ignorant of the crap before the add command
> 2) 'crap<add><doc></doc></add>'
> Solr correctly complains about the missing id field (example configs require it)
> 3) 'crap<add>'
> Solr returns success even though it's not legal XML
> 4) 'crap<add'
> Get the following exception...
> {noformat}
> javax.xml.stream.XMLStreamException: :1:7 Expected > at 0xffffffff
>         at com.caucho.xml.stream.XMLStreamReaderImpl.error(XMLStreamReaderImpl.java:1268)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.readElementBegin(XMLStreamReaderImpl.java:689)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.readNext(XMLStreamReaderImpl.java:653)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
>         at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)
> {noformat}
> 5) '<add><doc>'
> This appears to hang ... the connection seems to be left open as if it's waiting for more data.
> ...
> None of these 5 things happen when testing with Jetty.
> I'm not really very familiar with this StAX stuff -- but I suspect what's happening here is that on "wacky" input Caucho's XMLStreamReaderImpl.next() is returning values we're not expecting instead of throwing exceptions ... and depending on the input, this is either causing the XmlUpdateRequestHandler.processUpdate loop/switch to ignore the garbage data, or get stuck in an infinite loop (when there is no END_DOCUMENT).
> The question is: Are we doing the right thing, and com.caucho.xml.stream.XMLStreamReaderImpl is broken; or is XMLStreamReaderImpl producing a legal sequence of parse events for those bad inputs and we're not dealing with it properly?
> FWIW: adding the following line to our web.xml seems to make everything "work" (by which I mean: "fail") as expected...
> <system-property javax.xml.stream.XMLInputFactory="com.ctc.wstx.stax.WstxInputFactory" />
> ...do we want to commit this?
> (It wouldn't be the first time we've had to put in settings to force Resin to use the XML Library we want because something doesn't work with theirs.)
> {quote}
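
For anyone who wants to check a StAX implementation against the five payloads quoted above, here is a minimal sketch. It is not Solr's XmlUpdateRequestHandler code; the class name StaxStrictCheck and the method check() are made up for illustration, and it runs against whatever XMLInputFactory the JVM resolves (the JDK's built-in parser by default, not Resin's or Woodstox). It treats a payload as accepted only if the reader reaches END_DOCUMENT, and it caps the number of events so that a reader that never reports END_DOCUMENT cannot spin forever.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Solr: reports whether a StAX reader
// accepts or rejects a payload.
public class StaxStrictCheck {

    /** Returns "ok" if the payload parses through END_DOCUMENT, else a short reason. */
    public static String check(String payload) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader =
                factory.createXMLStreamReader(new StringReader(payload));
            int event = reader.getEventType();
            int remaining = 10_000; // arbitrary cap, guards against a reader that never ends
            while (reader.hasNext()) {
                if (--remaining < 0) {
                    return "gave up: no END_DOCUMENT within the event cap";
                }
                event = reader.next();
            }
            reader.close();
            // A well-behaved reader should only stop at END_DOCUMENT.
            return event == XMLStreamConstants.END_DOCUMENT
                ? "ok"
                : "stream ended without END_DOCUMENT";
        } catch (XMLStreamException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String[] payloads = {
            "crap<add><doc><field name=\"id\">hoss</field></doc></add>",
            "crap<add><doc></doc></add>",
            "crap<add>",
            "crap<add",
            "<add><doc>",
        };
        for (String p : payloads) {
            System.out.println(p + "  =>  " + check(p));
        }
    }
}
```

None of the five payloads is well-formed XML, so a conforming parser would be expected to land in the "rejected" branch for each of them; any other result points at the parser rather than at the loop.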

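Since the workaround in the issue hinges on which XMLInputFactory the container hands out, a quick way to see what a given JVM actually resolves is sketched below. The class name WhichStaxFactory is hypothetical; the only API used is the standard XMLInputFactory.newInstance() lookup, which consults the javax.xml.stream.XMLInputFactory system property before falling back to other providers.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical diagnostic, not part of Solr: prints the StAX factory in use.
public class WhichStaxFactory {

    /** Class name of the XMLInputFactory implementation the JVM resolves. */
    public static String factoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // null here means no override was set via the system property
        System.out.println("javax.xml.stream.XMLInputFactory property = "
            + System.getProperty("javax.xml.stream.XMLInputFactory"));
        System.out.println("implementation in use = " + factoryClassName());
    }
}
```

Run inside the same container as the webapp (for example from a throwaway servlet or JSP), this would show whether an override like the one quoted above has taken effect.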

