Posted to solr-user@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2012/07/19 06:48:17 UTC

Can I get DIH to skip fields that match empty text nodes?

Hello,

I have DIH reading an XML file and getting fields with empty values.
My definition is:
<field column="title" xpath="/database/document/item[@name='Title']/text"/>

Here /text is an actual element name, not the text() node test (e.g.
<item name='Title'><text/></item>).

Right now, the field (of type string) is indexed, stored, and returned
with an empty value. All the copyField targets receive the empty value
as well.

Can I get DIH to skip that field when it contains no actual text? I can
see how to do it with a custom transformer, but this seems like a common
problem, and I might just be missing a setting or some XPath secret.
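
For reference, a minimal sketch of the custom-transformer workaround mentioned above. It assumes DIH's documented convention that a transformer class may simply expose a transformRow(Map) method without extending the Transformer base class. The class name SkipEmptyFieldsTransformer is hypothetical, and treating whitespace-only values as empty is an assumption, not something DIH does by itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical class name, chosen only for this illustration.
public class SkipEmptyFieldsTransformer {

    // Assumption: DIH looks up and calls a transformRow(Map) method on
    // transformer classes that do not extend its Transformer base class.
    public Object transformRow(Map<String, Object> row) {
        Iterator<Map.Entry<String, Object>> it = row.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Object> entry = it.next();
            Object value = entry.getValue();
            if (value instanceof String) {
                // Single-valued field: drop it when it holds no text.
                if (isBlank((String) value)) {
                    it.remove();
                }
            } else if (value instanceof List) {
                // Multi-valued field: keep only the non-empty items.
                List<Object> kept = new ArrayList<Object>();
                for (Object item : (List<?>) value) {
                    boolean blank = item instanceof String && isBlank((String) item);
                    if (item != null && !blank) {
                        kept.add(item);
                    }
                }
                if (kept.isEmpty()) {
                    it.remove();
                } else {
                    entry.setValue(kept);
                }
            }
        }
        return row;
    }

    // Assumption: whitespace-only text counts as empty.
    private static boolean isBlank(String s) {
        return s.trim().isEmpty();
    }
}
```

The class would then be named in the transformer attribute of the entity in the DIH configuration (fully qualified if it lives in a package). Because empty values are removed from the row before the document is built, the copy fields should not receive them either.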

I also tried adding [node()], [text()], and .../text/text() at the end,
but each of those seems to make the XPathEntityProcessor skip the field
altogether.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)