Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2012/06/22 14:24:44 UTC

[Solr Wiki] Update of "ExtractingRequestHandler" by JanHoydahl

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "ExtractingRequestHandler" page has been changed by JanHoydahl:
http://wiki.apache.org/solr/ExtractingRequestHandler?action=diff&rev1=74&rev2=75

Comment:
literalsOverride

   * captureAttr=true|false - Index attributes of the Tika XHTML elements into separate fields, named after the element.  For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a". See the examples below.
   * xpath=<XPath expression> - When extracting, only return Tika XHTML content that satisfies the XPath expression.  See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML.  See also TikaExtractOnlyExampleOutput.
   * lowernames=true|false - Map all field names to lowercase with underscores.  For example, Content-Type would be mapped to content_type.
+  * literalsOverride=true|false - <!> [[Solr4.0]] When true, literal field values override other values with the same field name, such as metadata and content. Default: true
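
As a rough illustration of how the parameters above might be combined in a request, the following Python sketch builds a query string for the handler. The host, port, and /update/extract path are assumptions about a local default setup, and the helper name build_extract_url is hypothetical; the literal.<fieldname> parameters are the handler's usual way of supplying literal field values. This only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_extract_url(base_url, literals, literals_override=True,
                      lowernames=True, capture_attr=False):
    """Build an ExtractingRequestHandler URL (hypothetical helper).

    `literals` maps field names to literal values; each one is sent as
    a literal.<fieldname> parameter.
    """
    params = {
        # Solr expects the strings "true"/"false", not Python booleans.
        "literalsOverride": str(literals_override).lower(),
        "lowernames": str(lowernames).lower(),
        "captureAttr": str(capture_attr).lower(),
    }
    for name, value in literals.items():
        params["literal." + name] = value
    return base_url + "?" + urlencode(params)


# Assumed local Solr endpoint; adjust to the actual deployment.
url = build_extract_url(
    "http://localhost:8983/solr/update/extract",
    {"id": "doc1", "title": "My Title"},
    literals_override=False,
)
query = parse_qs(urlsplit(url).query)
print(url)
print(query["literalsOverride"], query["literal.title"])
```

With literalsOverride=false in the request, the literal title is no longer described as replacing a title that Tika extracts from the document itself, so the target field would typically need to accept more than one value.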
  
  If extractOnly is true, additional input parameters: