Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2012/06/22 14:24:44 UTC
[Solr Wiki] Update of "ExtractingRequestHandler" by JanHoydahl
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The "ExtractingRequestHandler" page has been changed by JanHoydahl:
http://wiki.apache.org/solr/ExtractingRequestHandler?action=diff&rev1=74&rev2=75
Comment:
literalsOverride
* captureAttr=true|false - Index attributes of the Tika XHTML elements into separate fields, named after the element. For example, when extracting from HTML, Tika can return the href attributes in <a> tags as fields named "a". See the examples below.
* xpath=<XPath expression> - When extracting, only return Tika XHTML content that satisfies the XPath expression. See http://lucene.apache.org/tika/documentation.html for details on the format of Tika XHTML. See also TikaExtractOnlyExampleOutput.
* lowernames=true|false - Map all field names to lowercase with underscores. For example, Content-Type would be mapped to content_type.
+ * literalsOverride=true|false - <!> [[Solr4.0]] When true, literal field values override other values with the same field name, such as metadata and content. Default: true
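
As a rough illustration of how the parameters listed above might be combined in one request, here is a minimal Python sketch. It only builds a URL and sends nothing. The host, port, and handler path (http://localhost:8983/solr/update/extract), the literal.id parameter used to supply a literal value, and both function names are assumptions made for this example and do not come from the page. The lowernames helper only approximates the mapping described above (Content-Type to content_type) by assuming that every character other than a letter or digit becomes an underscore.

```python
import re
from urllib.parse import urlencode

# Assumed endpoint for illustration only; adjust to your own Solr setup.
EXTRACT_URL = "http://localhost:8983/solr/update/extract"


def build_extract_url(doc_id, capture_attr=True, lowernames=True,
                      literals_override=True):
    """Build an extract request URL (hypothetical helper, not a Solr API)."""
    params = {
        # Assumption: literal values are passed as literal.<fieldname>.
        "literal.id": doc_id,
        # The page documents these as true|false, so send lowercase strings.
        "captureAttr": str(capture_attr).lower(),
        "lowernames": str(lowernames).lower(),
        "literalsOverride": str(literals_override).lower(),
    }
    return EXTRACT_URL + "?" + urlencode(params)


def approximate_lowernames(field_name):
    """Approximate the lowernames mapping: lowercase the name, then replace
    every character that is not a-z or 0-9 with an underscore (assumption)."""
    return re.sub(r"[^a-z0-9]", "_", field_name.lower())
```

Calling build_extract_url("doc1", literals_override=False) would produce a URL that asks for the behavior where literal values do not replace metadata or content values in a field of the same name. Leaving the argument out sends literalsOverride=true, which matches the documented default.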
If extractOnly is true, additional input parameters: