Posted to users@jackrabbit.apache.org by Michael Neale <mi...@gmail.com> on 2009/02/03 06:29:30 UTC

Indexing...

Quick Q: textual content stored on a node is full-text indexed, but
what about an item that is stored via an InputStream (assuming it happens
to be text)? It wouldn't be indexed, would it?


Michael.

Re: Indexing...

Posted by Marcel Reutegger <ma...@gmx.net>.
Michael Neale wrote:
> Right, so it looks like it will index plain text by default?

It will, but please note that text extraction only works for the jcr:data binary
property of an nt:resource node. This restriction exists because additional
information is required to correctly extract text from a stream (namely the MIME
type and encoding). For plain text you have to set the jcr:mimeType property on
the nt:resource node to text/plain and set jcr:encoding to an appropriate value.
See also: http://wiki.apache.org/jackrabbit/nt:resource
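
To illustrate why the encoding has to be supplied alongside the binary, here is
a minimal sketch in plain Java. It uses no Jackrabbit or JCR classes, and the
class and method names (EncodingDemo, decode) are made up for this example. It
only shows that the same bytes decode to different text depending on the charset
used, which is the kind of information jcr:encoding carries for the extractor.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Jackrabbit.
class EncodingDemo {

    // Decode raw bytes with a charset supplied from outside.
    // The bytes themselves do not say which encoding produced them.
    static String decode(byte[] data, Charset encoding) {
        return new String(data, encoding);
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the e.
        // In UTF-8 the accented letter takes two bytes.
        byte[] data = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String right = decode(data, StandardCharsets.UTF_8);
        String wrong = decode(data, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8:      " + right.length() + " chars");
        System.out.println("decoded as ISO-8859-1: " + wrong.length() + " chars");
        System.out.println("same text? " + right.equals(wrong));
    }
}
```

With the wrong charset the text still decodes without an error, but the accented
letter turns into two unrelated characters, so a search for the original word
would likely miss it.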

regards
 marcel

Re: Indexing...

Posted by Michael Neale <mi...@gmail.com>.
Right, so it looks like it will index plain text by default?

On Tue, Feb 3, 2009 at 7:40 PM, Ard Schrijvers
<a....@onehippo.com> wrote:
>
>>
>> Quick Q: textual content stored on a node is full text
>> indexed - but what about item that is stored via InputStream
>> (assuming it happens to be text?) it wouldn't be indexed would it?
>
> Depends on whether you configure it to be indexed or not; see
> textFilterClasses at [1]. I think the comment on the SearchIndex element
> of repository.xml in trunk lists which textFilterClasses are
> available (pdf/html/text/xml etc.).
>
> [1] http://wiki.apache.org/jackrabbit/Search
>
> Regards Ard
>
>>
>>
>> Michael.
>>
>



-- 
Michael D Neale
home: www.michaelneale.net
blog: michaelneale.blogspot.com

RE: Indexing...

Posted by Ard Schrijvers <a....@onehippo.com>.
> 
> Quick Q: textual content stored on a node is full text 
> indexed - but what about item that is stored via InputStream 
> (assuming it happens to be text?) it wouldn't be indexed would it?

Depends on whether you configure it to be indexed or not; see
textFilterClasses at [1]. I think the comment on the SearchIndex element
of repository.xml in trunk lists which textFilterClasses are
available (pdf/html/text/xml etc.).

[1] http://wiki.apache.org/jackrabbit/Search
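
For orientation, a SearchIndex entry with textFilterClasses might look roughly
like the fragment below. This is a sketch, not a copy of the trunk file: the
extractor class names are recalled from the Jackrabbit 1.x default
repository.xml and should be checked against the repository.xml shipped with
your version.

```xml
<!-- Sketch only: verify class names against your Jackrabbit version. -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- Comma-separated list of text extractors to apply to binary content -->
  <param name="textFilterClasses" value="org.apache.jackrabbit.extractor.PlainTextExtractor,org.apache.jackrabbit.extractor.HTMLTextExtractor,org.apache.jackrabbit.extractor.XMLTextExtractor"/>
</SearchIndex>
```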

Regards Ard

> 
> 
> Michael.
>