Posted to dev@jackrabbit.apache.org by Adamo Bozzetti <ad...@abfidee.it> on 2007/01/18 14:15:09 UTC

Question about full text search together with normal search

Hi list,
I'm using Jackrabbit for a custom document management system. I have defined a 
node type that extends nt:file and adds new properties such as author.
Searching by property or by text works fine on its own, but I 
don't understand how to combine the two searches in a single query.
I can't find a way to join a node with its child.

I also tried defining a node type that extends nt:resource directly, 
but then the binary content does not seem to be indexed.

Has anyone else run into this?

Thanks in advance
Adamo



Re: Question about full text search together with normal search

Posted by Marcel Reutegger <ma...@day.com>.
Adamo Bozzetti wrote:
> Hi list,
> I'm using Jackrabbit for a custom document management system. I have defined a 
> node type that extends nt:file and adds new properties such as 
> author.
> Searching by property or by text works fine on its own, but I 
> don't understand how to combine the two searches in a single query.
> I can't find a way to join a node with its child.

As of the next Jackrabbit release you will be able to execute the following query:

//element(*, my:file)[jcr:contains(jcr:content, 'some text') and @author eq 'Adamo']

Previous releases do not support the child axis inside a predicate, so an 
expression such as jcr:contains(jcr:content, 'some text') will not work there.
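
For illustration, here is a minimal sketch of assembling that statement from user input in Java. The class and method names (CombinedQuery, quote, build) are made up for this example and are not part of Jackrabbit; the only assumption about the query language is that a single quote inside an XPath string literal is escaped by doubling it.

```java
/**
 * Builds the combined XPath statement shown above.
 * CombinedQuery, quote and build are illustrative names, not Jackrabbit API.
 */
final class CombinedQuery {

    /** Wraps a value in an XPath string literal, doubling embedded single quotes. */
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Full-text match on the jcr:content child plus an exact match on @author. */
    static String build(String nodeType, String text, String author) {
        return "//element(*, " + nodeType + ")"
                + "[jcr:contains(jcr:content, " + quote(text) + ")"
                + " and @author eq " + quote(author) + "]";
    }
}
```

The resulting string would then be handed to the standard JCR API, along the lines of session.getWorkspace().getQueryManager().createQuery(statement, Query.XPATH).execute(). Note that the text passed to jcr:contains is itself a search expression, so this sketch only handles the string-literal quoting and does not sanitize search operators.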

regards
  marcel

RE: Question about full text search together with normal search

Posted by Jaka Jaksic <ja...@telemach.net>.
There are several steps you need to take to ensure proper indexing of file
contents (you may have already done some of them):

1. Ensure that all nt:resource nodes have the jcr:mimeType property set
correctly, so that the indexing mechanism can recognize indexable content
types and use the appropriate text filter.

2. Configure the Workspace/SearchIndex/textFilterClasses property in
repository.xml, so that it includes text filters for all types you wish to
index, e.g.:
<param name="textFilterClasses"
value="org.apache.jackrabbit.core.query.MsExcelTextFilter,org.apache.jackrabbit.core.query.MsPowerPointTextFilter,org.apache.jackrabbit.core.query.MsWordTextFilter,org.apache.jackrabbit.core.query.PdfTextFilter,org.apache.jackrabbit.core.query.HTMLTextFilter,org.apache.jackrabbit.core.query.XMLTextFilter,org.apache.jackrabbit.core.query.RTFTextFilter,org.apache.jackrabbit.core.query.OpenOfficeTextFilter"/>

(Note: This only sets the default configuration for new workspaces - it does
not enable indexing in existing workspaces!)

3. Configure the text filters the same way in the workspace.xml of each of
your existing workspace folders to enable indexing in existing workspaces.

4. Make sure you have all the necessary jars on the classpath. Besides
jackrabbit-index-filters-*.jar, most text filters have their own
dependencies. For all of the above filters, you need the following
jars: nekohtml-0.9.5.jar, poi-2.5.1-final-20040804.jar, PDFBox-0.7.2.jar,
tm-extractors-0.4.jar. (I'm not sure about this, but if one dependency is
missing, the indexing process seems to fail for other file types too.)

5. Delete the index subfolder in each of your existing workspace folders, so
that the content will be reindexed.

This should do it. Now start the repository application and each workspace
should be reindexed the first time it is opened.
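
Steps 4 and 5 can be sketched as a small stand-alone Java helper. This is only a rough sketch under a few assumptions: the class and method names (IndexingChecklist, missingClasses, deleteIndexDirs) are invented for this example, the directory you pass in is the one that contains your workspace folders (typically "workspaces" under the repository home), and the repository is shut down while it runs. The classpath check only tests whether each filter class itself can be found; it does not prove that the filter's own dependencies are present.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative helper for steps 4 and 5; not part of Jackrabbit. */
final class IndexingChecklist {

    /** Returns the names from a comma-separated class list that cannot be loaded. */
    static List<String> missingClasses(String textFilterClasses) {
        List<String> missing = new ArrayList<>();
        for (String name : textFilterClasses.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                // Load without initializing; dependency jars are not verified here.
                Class.forName(trimmed, false, IndexingChecklist.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(trimmed);
            }
        }
        return missing;
    }

    /** Deletes the "index" subfolder of every workspace folder; returns what was deleted. */
    static List<Path> deleteIndexDirs(Path workspacesRoot) throws IOException {
        List<Path> workspaces;
        try (Stream<Path> children = Files.list(workspacesRoot)) {
            workspaces = children.filter(Files::isDirectory).collect(Collectors.toList());
        }
        List<Path> deleted = new ArrayList<>();
        for (Path workspace : workspaces) {
            Path index = workspace.resolve("index");
            if (!Files.isDirectory(index)) {
                continue;
            }
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(index)) {
                // Reverse order puts files and subfolders before their parents.
                entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path entry : entries) {
                Files.delete(entry);
            }
            deleted.add(index);
        }
        return deleted;
    }
}
```

You would call missingClasses with the same value you put into textFilterClasses, from an application that has the same classpath as the repository, and deleteIndexDirs with the path to your workspace folders before restarting.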


Regards,
Jaka

