Posted to users@jackrabbit.apache.org by David Haines <dl...@umich.edu> on 2009/02/25 21:16:30 UTC

preventing text extraction

When using Jackrabbit 1.4.8 WebDAV with Mac OS X 10.5.6, the Mac sends
"invisible" resource fork files in addition to the expected visible
file.  Each resource fork is given the original file name prefixed with
'._'.  These files are sent with no Content-Type header, so Jackrabbit,
reasonably, infers the MIME type from the file extension.  That is not
the correct type for a resource fork, and it leads to many stack
traces in the text extractor code.  Although Apple's WebDAV client is
causing the problem, I need to find some way to avoid it on the server
side.

Is there some way to configure Jackrabbit not to do text extraction on
files whose names match particular patterns?  I'd like to be able to
tell it to skip indexing of files whose names start with '._',
regardless of the file extension.
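
To make the pattern concrete, here is a minimal Java sketch of the name
check I have in mind.  The class and method names are hypothetical, and
it is not tied to any Jackrabbit API; where such a check could be hooked
in on the server is the open question.

```java
// Hypothetical helper, not part of Jackrabbit: decides from the name
// alone whether a resource looks like a Mac resource fork file.
final class AppleDoubleNames {
    private AppleDoubleNames() {}

    /**
     * Returns true if the last segment of the given path starts with
     * "._" and has at least one more character after that prefix.
     */
    static boolean isResourceFork(String path) {
        if (path == null) {
            return false;
        }
        // Only the final path segment (the file name) is examined.
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.length() > 2 && name.startsWith("._");
    }
}
```

With this check, "/docs/._report.pdf" would be skipped while
"/docs/report.pdf" would still be indexed as usual.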

Thanks for any comments; I'm just getting started with Jackrabbit.

- Dave


David Haines
CTools Developer
Digital Media Commons
University of Michigan
dlhaines@umich.edu