Posted to solr-dev@lucene.apache.org by Eric Pugh <ep...@opensourceconnections.com> on 2009/08/10 11:28:55 UTC

Doc Question for Solr Cell

I was refreshing my mind on the newly updated parameters on Solr Cell,
and noticed that the Configuration section on
http://wiki.apache.org/solr/ExtractingRequestHandler is out of date.
Before I fixed it, I wanted to confirm that

<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="ext.map.Last-Modified">last_modified</str>
    <bool name="ext.ignore.und.fl">true</bool>
  </lst>
</requestHandler>

Should be changed to map.Last-Modified only, and that the  
ignore.und.fl capability is now implemented via uprefix:

uprefix=<prefix> - Prefix all fields that are not defined in the  
schema with the given prefix. This is very useful when combined with  
dynamic field definitions. Example: uprefix=ignored_ would effectively  
ignore all unknown fields generated by Tika given the example schema  
contains <dynamicField name="ignored_*" type="ignored"/>
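
For reference, the updated wiki snippet might look roughly like the sketch below. It assumes the parameter names exactly as written in this message (map.Last-Modified and uprefix) and assumes the schema already defines a field type named "ignored"; neither has been checked against a released solrconfig.xml, so the names should be verified before copying.

```xml
<!-- solrconfig.xml (sketch): handler defaults using the renamed parameters.
     Parameter names are taken from this thread and are assumptions. -->
<requestHandler name="/update/extract"
                class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- Map Tika's Last-Modified metadata to the last_modified field -->
    <str name="map.Last-Modified">last_modified</str>
    <!-- Prefix fields not defined in the schema with "ignored_" -->
    <str name="uprefix">ignored_</str>
  </lst>
</requestHandler>

<!-- schema.xml (sketch): catch-all dynamic field for the prefixed names.
     Assumes a field type named "ignored" is already defined. -->
<dynamicField name="ignored_*" type="ignored"/>
```

With this combination, an unknown Tika field such as "foo" would be renamed to "ignored_foo" and matched by the dynamic field, rather than causing an undefined-field error.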

Eric



-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal





Re: Doc Question for Solr Cell

Posted by Grant Ingersoll <gs...@apache.org>.
On Aug 10, 2009, at 5:28 AM, Eric Pugh wrote:

> I was refreshing my mind on the newly updated parameters on Solr  
> Cell, and noticed that the Configuration section on http://wiki.apache.org/solr/ExtractingRequestHandler 
>  is out of date.  Before I fixed it, I wanted to confirm that
>
> <requestHandler name="/update/extract"  
> class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
> 	 <lst name="defaults">
> 	 <str name="ext.map.Last-Modified">last_modified</str>
> 	 <bool name="ext.ignore.und.fl">true</bool> </lst>
>
> Should be changed to map.Last-Modified only, and that the  
> ignore.und.fl capability is now implemented via uprefix:
>
> uprefix=<prefix> - Prefix all fields that are not defined in the  
> schema with the given prefix. This is very useful when combined with  
> dynamic field definitions. Example: uprefix=ignored_ would  
> effectively ignore all unknown fields generated by Tika given the  
> example schema contains <dynamicField name="ignored_*" type="ignored"/>

That is my understanding, yes.

Re: Doc Question for Solr Cell

Posted by Yonik Seeley <ys...@gmail.com>.
On Mon, Aug 10, 2009 at 5:28 AM, Eric Pugh <ep...@opensourceconnections.com> wrote:
> I was refreshing my mind on the newly updated parameters on Solr Cell, and
> noticed that the Configuration section on
> http://wiki.apache.org/solr/ExtractingRequestHandler is out of date.  Before
> I fixed it, I wanted to confirm that
>
> <requestHandler name="/update/extract"
> class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
>         <lst name="defaults">
>         <str name="ext.map.Last-Modified">last_modified</str>
>         <bool name="ext.ignore.und.fl">true</bool> </lst>
>
> Should be changed to map.Last-Modified only, and that the ignore.und.fl
> capability is now implemented via uprefix:

Yep.
Before 1.4 is released, I had wanted to add good default mappings for
common document types, along with the fields in the example schema,
and then just cut-n-paste the config from the example schema.  It would
be great if you had any recommendations for such default mappings.

-Yonik
http://www.lucidimagination.com