Posted to solr-user@lucene.apache.org by Lance Norskog <go...@gmail.com> on 2008/11/01 20:31:56 UTC

DIH http input xpath syntax

The wiki page for the DIH handler mentions that the XML is parsed with a
streaming parser and that the xpath parser only handles a subset of the
xpath syntax. Which streaming parser is it and where would I find this
subset documented?  I tried a few things like "the first entry" and
"length of data > 3" and nothing worked.  "/a/b/c" was all that worked for
me.
 
Tx,
 
Lance

Re: DIH http input xpath syntax

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The parser is StAX, but the XPath implementation is custom. Certain
XPath features are hard to implement in a streaming way.
There is no documentation yet.
You can access attributes like /root/a/b/@a
Attribute values can be checked like
/root/a/b[@k]/x or
/root/a/b[@k][@m='n']/x
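
To illustrate why attribute checks like the ones above fit a streaming
parser, here is a rough sketch in Python (not the DIH code, and not StAX).
It assumes the standard library's xml.etree.ElementTree.iterparse as the
streaming parser; stream_match, steps, leaf and the SAMPLE document are
made-up names for this example. The point is only that a tag name and its
attributes are already known when the start tag is read, so the match can
be decided without buffering the document.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<root><a>
  <b k="1" m="n"><x>first</x></b>
  <b k="2"><x>second</x></b>
  <b m="n"><x>third</x></b>
</a></root>"""


def stream_match(xml_text, steps, leaf):
    """Yield the text of `leaf` elements found directly under the path `steps`.

    steps is a list of (tag, required_attrs) pairs. required_attrs maps an
    attribute name to the value it must have, or to None when the attribute
    only has to be present.
    """
    stack = []  # one flag per open element: does the path match down to here?
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth = len(stack)
            ok = depth == 0 or stack[-1]
            if depth < len(steps):
                tag, attrs = steps[depth]
                # Tag and attributes are available as soon as the start tag is seen.
                ok = ok and elem.tag == tag and all(
                    name in elem.attrib
                    and (value is None or elem.attrib[name] == value)
                    for name, value in attrs.items()
                )
            elif depth == len(steps):
                ok = ok and elem.tag == leaf
            else:
                ok = False
            stack.append(ok)
        else:
            matched = stack.pop()
            if matched and len(stack) == len(steps):
                yield elem.text  # text is complete once the end tag is seen
            elem.clear()  # release the element so memory use stays small


# Roughly /root/a/b[@k]/x
has_k = list(stream_match(SAMPLE, [("root", {}), ("a", {}), ("b", {"k": None})], "x"))
# Roughly /root/a/b[@k][@m='n']/x
has_k_and_m = list(
    stream_match(SAMPLE, [("root", {}), ("a", {}), ("b", {"k": None, "m": "n"})], "x")
)
```

With the sample above, has_k holds the text of the first two x elements
and has_k_and_m only the first. A test on the length of the text, by
contrast, cannot be decided at the start tag, which may be one reason
such features are harder to support in this style.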

A lot of what you achieve with advanced XPath can be achieved with a
custom Transformer. If I can get a use case, it would be helpful.
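
As a rough illustration of that idea (plain Python, not the DIH
Transformer API), the two checks from the question, "the first entry" and
"length of data > 3", could be applied to rows after a simple path such
as /a/b/c has extracted them. The function names, the row layout and the
field name "c" are assumptions made up for this sketch.

```python
def keep_long_values(rows, field, min_length=3):
    """Yield only the rows whose `field` value is longer than min_length characters."""
    for row in rows:
        value = row.get(field)
        if value is not None and len(value) > min_length:
            yield row


def first_entry(rows):
    """Return the first row, or None when there are no rows."""
    for row in rows:
        return row
    return None


# Hypothetical rows, as if each one came from a /a/b/c match.
rows = [{"c": "ab"}, {"c": "abcd"}, {"c": "xyz"}, {"c": "hello"}]

long_rows = list(keep_long_values(rows, "c"))
first_row = first_entry(rows)
```

Here long_rows keeps the rows with "abcd" and "hello", and first_row is
the row with "ab". A real Transformer would do the same kind of work on
each row inside the import, so the XPath itself can stay simple.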


All of the supported syntax is covered by test cases:
http://svn.apache.org/viewvc/lucene/solr/trunk/contrib/dataimporthandler/src/test/java/org/apache/solr/handler/dataimport/TestXPathRecordReader.java?revision=681182&view=markup






-- 
--Noble Paul