Posted to solr-user@lucene.apache.org by Chris Hostetter <ho...@fucit.org> on 2009/06/20 02:01:32 UTC

Re: XPath query support in Solr Cell

: Date: Wed, 20 May 2009 16:45:25 -0400
: From: Eric Pugh
: Subject: XPath query support in Solr Cell

Not sure if you figured this out, but your error is coming from curl, not 
from Solr.  curl has a "feature" where it can hit multiple URLs that 
differ only by a sequential number in a range, written in square 
brackets, so it tries to parse your "[1]" as a range and gives up.  
Check the "URL" section of "man curl" for all the details.

Full URI escaping of the square brackets ("[" to %5B and "]" to %5D) 
should work, however ... it works for me anyway.
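
For illustration, here is a minimal sketch of that escaping, assuming a 
POSIX shell with sed available.  The variable names (xpath, escaped, url) 
are made up for this example and are not part of your original command; 
the URL parameters are copied from your mail.  Quoting the URL also 
removes the need for the backslashes before each "&".

```shell
# Sketch only: percent-encode the predicate brackets so curl does not
# treat [1] as a URL range.  Variable names here are illustrative.
xpath='/xhtml:html/xhtml:body/xhtml:div[1]'

# Replace "[" with %5B and "]" with %5D.
escaped=$(printf '%s' "$xpath" | sed -e 's/\[/%5B/g' -e 's/]/%5D/g')

url="http://localhost:8983/solr/update/extract?ext.idx.attr=true&ext.def.fl=text&ext.map.div=foo_t&ext.capture=div&ext.literal.id=126&ext.xpath=${escaped}"

# Show the URL that would be requested.
printf '%s\n' "$url"

# With Solr running locally, the request itself would then be:
#   curl "$url" -F "tutorial=@tutorial.pdf"
```

curl's -g (--globoff) option switches off its range parsing entirely, 
which is another way to stop it from interpreting the brackets.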

: So I am trying to filter down what I am indexing, and the basic XPath queries
: don't work.  For example, working with tutorial.pdf this indexes all the
: <div/>:
: 
: curl
: http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text\&ext.map.div=foo_t\&ext.capture=div\&ext.literal.id=126\&ext.xpath=\/xhtml:html\/xhtml:body\/descendant:node\(\)
: -F "tutorial=@tutorial.pdf"
: 
: However, if I want to only index the first div, I expect to do this:
: 
: budapest:site epugh$ curl
: http://localhost:8983/solr/update/extract?ext.idx.attr=true\&ext.def.fl=text\&ext.map.div=foo_t\&ext.capture=div\&ext.literal.id=126\&ext.xpath=\/xhtml:html\/xhtml:body\/xhtml:div[1]
: -F "tutorial=@tutorial.pdf"
: 
: But I keep getting back an issue from curl.  My attempts to escape the [1]
: have failed.  Any suggestions?
: 
: curl: (3) [globbing] error: bad range specification after pos 174
: 
: Eric
: 
: PS,
: Also, this site seems to be okay as a place to upload your html and practice
: xpath:
: 
: http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm
: 
: I did have to strip out the namespace stuff though.
: 
: 
: 
: 
: -----------------------------------------------------
: Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 |
: http://www.opensourceconnections.com
: Free/Busy: http://tinyurl.com/eric-cal
: 
: 
: 



-Hoss