Posted to users@jackrabbit.apache.org by Oleg <ov...@googlemail.com> on 2009/11/30 11:56:02 UTC

Full text search failed

Hello,

I have added a fairly ordinary MS Word document to the content repository and
tried to find some text in its content (full-text search). Before that I
configured SearchIndex as follows:

        <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
            
            
            
            
        </SearchIndex>

And now I get a warning:
A property claimed to start before zero, at -512! Resetting it to zero, and
hoping for the best

Full-text search doesn't work: nothing is found, although the query is
correct. Is this an Apache POI or a Jackrabbit problem?

Thanks.
Oleg.
-- 
View this message in context: http://n4.nabble.com/Full-text-search-failed-tp931175p931175.html
Sent from the Jackrabbit - Users mailing list archive at Nabble.com.

Re: Full text search failed

Posted by Oleg <ov...@googlemail.com>.
Done.

https://issues.apache.org/jira/browse/JCR-2416


Alexander Klimetschek wrote:
> 
> On Mon, Nov 30, 2009 at 17:02, Oleg <ov...@googlemail.com> wrote:
>> I have just found that Text.escapeIllegalXpathSearchChars doesn't work
>> correctly if the whole phrase is surrounded by \" (double quotes within a
>> String) and \" is the last character. The exception mentioned below,
>> org.apache.lucene.queryParser.ParseException, is thrown.
> 
> Could you provide a short code snippet or test case that proves this
> problem? And report an issue in our JIRA?
> http://wiki.apache.org/jackrabbit/QuestionsAndAnswers#Reporting_Problems
> 
> Thanks!
> Alex
> 
> -- 
> Alexander Klimetschek
> alexander.klimetschek@day.com
> 
> 


Re: Full text search failed

Posted by Alexander Klimetschek <ak...@day.com>.
On Mon, Nov 30, 2009 at 17:02, Oleg <ov...@googlemail.com> wrote:
> I have just found that Text.escapeIllegalXpathSearchChars doesn't work
> correctly if the whole phrase is surrounded by \" (double quotes within a
> String) and \" is the last character. The exception mentioned below,
> org.apache.lucene.queryParser.ParseException, is thrown.

Could you provide a short code snippet or test case that proves this
problem? And report an issue in our JIRA?
http://wiki.apache.org/jackrabbit/QuestionsAndAnswers#Reporting_Problems

Thanks!
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com

Re: Full text search failed

Posted by Oleg <ov...@googlemail.com>.
Hello,

I have just found that Text.escapeIllegalXpathSearchChars doesn't work
correctly if the whole phrase is surrounded by \" (double quotes within a
String) and \" is the last character. The exception mentioned below,
org.apache.lucene.queryParser.ParseException, is thrown.

Instead of calling Text.escapeIllegalXpathSearchChars(q).replaceAll("'", "''")
(see the recommendation at
http://wiki.apache.org/jackrabbit/EncodingAndEscaping), the simple
replacement replaceAll("'", "''") should be enough. Otherwise the method
Text.escapeIllegalXpathSearchChars should be fixed.

Best regards.
Oleg.
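
The workaround proposed in this message can be sketched roughly as follows.
This is only an illustration, under the assumption that doubling single
quotes is enough for the search text in question. It does not escape any
other characters that have a special meaning in a full-text expression. The
class and method names (ContainsQuery, escapeXPathLiteral, build) are
hypothetical and not part of Jackrabbit. The node type and property name are
taken from the queries in the log posted in this thread.

```java
public class ContainsQuery {

    /**
     * Doubles single quotes so the text can be placed inside a
     * single-quoted XPath string literal. Hypothetical helper; nothing
     * else is escaped.
     */
    public static String escapeXPathLiteral(String text) {
        return text.replace("'", "''");
    }

    /**
     * Builds an XPath statement with a jcr:contains predicate.
     * The node type and property are supplied by the caller.
     */
    public static String build(String nodeType, String property, String searchText) {
        return "//element(*, " + nodeType + ")[jcr:contains(" + property + ", '"
                + escapeXPathLiteral(searchText) + "')]";
    }

    public static void main(String[] args) {
        // A phrase search: the double quotes stay part of the search text.
        System.out.println(build("cssns:file", "jcr:content", "\"Have much fun\""));
    }
}
```

The resulting statement keeps the double quotes around the phrase unchanged,
so the phrase is passed to jcr:contains as one term.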


Oleg wrote:
> 
> Hello,
> 
> I have one more question related to full-text search. I use the
> instructions from http://wiki.apache.org/jackrabbit/EncodingAndEscaping to
> escape values in queries. However, the method
> Text.escapeIllegalXpathSearchChars makes some strange replacements. If I
> want to search for a whole phrase, I surround the text with double quotes.
> This is described in JSR 170 (section 6.6.5.2, jcr:contains Function): "A
> term may be either a single word or a phrase delimited by double quotes
> (")." I now write the String "\"Have much fun\"" and pass it through
> Text.escapeIllegalXpathSearchChars. An exception is thrown:
> 
> javax.jcr.RepositoryException: Exception building query:
> org.apache.lucene.queryParser.ParseException: Cannot parse '"Have much
> fun\"': Lexical error at line 1, column 17.  Encountered: <EOF> after :
> "\"Have much fun\\\""
> 
> How can I force a phrase to be searched as a whole?
> 
> Best regards.
> Oleg.
> 


Re: Full text search failed

Posted by Oleg <ov...@googlemail.com>.
Hello,

I have one more question related to full-text search. I use the instructions
from http://wiki.apache.org/jackrabbit/EncodingAndEscaping to escape values
in queries. However, the method Text.escapeIllegalXpathSearchChars makes some
strange replacements. If I want to search for a whole phrase, I surround the
text with double quotes. This is described in JSR 170 (section 6.6.5.2,
jcr:contains Function): "A term may be either a single word or a phrase
delimited by double quotes (")." I now write the String "\"Have much fun\""
and pass it through Text.escapeIllegalXpathSearchChars. An exception is
thrown:

javax.jcr.RepositoryException: Exception building query:
org.apache.lucene.queryParser.ParseException: Cannot parse '"Have much
fun\"': Lexical error at line 1, column 17.  Encountered: <EOF> after :
"\"Have much fun\\\""

How can I force a phrase to be searched as a whole?

Best regards.
Oleg.

Re: Full text search failed

Posted by Oleg <ov...@googlemail.com>.
Hi Alexander,

Full-text search is working now. I was searching immediately after adding
the node, and the background thread had not finished indexing. I now let the
thread sleep for about 5 seconds after adding the node, so that the content
of the document can be indexed. But the message above is still present. I
found that it is caused by Apache POI. I tried MS Word 2003 and 2007 with
quite simple documents, and the message was always present. At the moment it
doesn't bother me, but it would be interesting to know its cause. The log is
here (the DEBUG lines are written by my classes):


13:01:38,515 DEBUG [RepositoryAccessor] ==> Repository started.
13:01:38,687 INFO  [RepositoryImpl] Starting repository...
13:01:38,703 INFO  [LocalFileSystem] LocalFileSystem initialized at path
target\repository
13:01:38,890 INFO  [NodeTypeRegistry] no custom node type definitions found
13:01:38,906 INFO  [LocalFileSystem] LocalFileSystem initialized at path
target\version
13:01:43,937 INFO  [ConnectionRecoveryManager] Database: Apache Derby /
10.2.1.6 - (452058)
13:01:43,937 INFO  [ConnectionRecoveryManager] Driver: Apache Derby Embedded
JDBC Driver / 10.2.1.6 - (452058)
13:01:45,281 INFO  [RepositoryImpl] initializing workspace 'default'...
13:01:45,281 INFO  [LocalFileSystem] LocalFileSystem initialized at path
target\workspaces\default
13:01:49,578 INFO  [ConnectionRecoveryManager] Database: Apache Derby /
10.2.1.6 - (452058)
13:01:49,578 INFO  [ConnectionRecoveryManager] Driver: Apache Derby Embedded
JDBC Driver / 10.2.1.6 - (452058)
13:01:51,187 INFO  [MultiIndex] indexing...
/jcr:system/jcr:nodeTypes/nt:propertyDefinition/jcr:propertyDefinition[8]
(100)
13:01:51,250 INFO  [MultiIndex] Created initial index for 143 nodes
13:01:51,250 INFO  [SearchIndex] Index initialized: target/repository/index
Version: 3
13:01:51,250 INFO  [MultiIndex] Created initial index for 1 nodes
13:01:51,250 INFO  [SearchIndex] Index initialized:
target\workspaces\default/index Version: 3
13:01:51,250 INFO  [RepositoryImpl] workspace 'default' initialized
13:01:51,281 INFO  [RepositoryImpl] created system workspace: security
13:01:51,281 INFO  [RepositoryImpl] Repository started
13:01:51,281 INFO  [TransientRepository] Transient repository initialized
13:01:51,281 INFO  [RepositoryImpl] initializing workspace 'security'...
13:01:51,281 INFO  [LocalFileSystem] LocalFileSystem initialized at path
target\workspaces\security
13:01:55,656 INFO  [ConnectionRecoveryManager] Database: Apache Derby /
10.2.1.6 - (452058)
13:01:55,656 INFO  [ConnectionRecoveryManager] Driver: Apache Derby Embedded
JDBC Driver / 10.2.1.6 - (452058)
13:01:56,484 INFO  [MultiIndex] Created initial index for 1 nodes
13:01:56,484 INFO  [SearchIndex] Index initialized:
target\workspaces\security/index Version: 3
13:01:56,484 INFO  [RepositoryImpl] workspace 'security' initialized
13:01:56,500 INFO  [SimpleSecurityManager] init: using Repository
LoginModule configuration for Jackrabbit
13:01:56,500 INFO  [RepositoryImpl] SecurityManager = class
org.apache.jackrabbit.core.security.simple.SimpleSecurityManager
13:01:56,515 WARN  [AbstractLoginModule] No credentials available -> try
default (anonymous) authentication.
13:01:56,546 INFO  [TransientRepository] Session opened
13:01:56,546 DEBUG [RepositoryAccessor] ==> Default workspace 'default'
acquired.
13:01:56,906 DEBUG [RepositoryAccessor] ==> Opening new JCR Session for the
current thread.
13:01:56,906 DEBUG [RepositoryAccessor] ==> Try to create workspace
'ws_self_ova'.
13:01:56,937 DEBUG [RepositoryAccessor] ==> Workspace 'ws_self_ova' has been
created.
13:01:56,937 INFO  [RepositoryImpl] initializing workspace 'ws_self_ova'...
13:01:56,937 INFO  [LocalFileSystem] LocalFileSystem initialized at path
target\workspaces\ws_self_ova
13:02:00,812 INFO  [ConnectionRecoveryManager] Database: Apache Derby /
10.2.1.6 - (452058)
13:02:00,812 INFO  [ConnectionRecoveryManager] Driver: Apache Derby Embedded
JDBC Driver / 10.2.1.6 - (452058)
13:02:01,515 INFO  [MultiIndex] Created initial index for 1 nodes
13:02:01,515 INFO  [SearchIndex] Index initialized:
target\workspaces\ws_self_ova/index Version: 3
13:02:01,515 INFO  [RepositoryImpl] workspace 'ws_self_ova' initialized
13:02:01,531 INFO  [TransientRepository] Session opened
13:02:01,531 DEBUG [RepositoryAccessor] Namespace prefix 'cssns' has been
registered to the uri 'http://www.aluluei.net/cssns'
13:02:01,703 DEBUG [RepositoryAccessor] Core node types have been registered
13:02:01,703 DEBUG [RepositoryAccessor] ==> Closing JCR Session for the
current thread.
13:02:01,703 INFO  [TransientRepository] Session closed
13:02:01,718 DEBUG [RepositoryAccessor] ==> Opening new JCR Session for the
current thread.
13:02:01,718 INFO  [TransientRepository] Session opened
13:02:01,718 DEBUG [RepositoryAccessor] ==> Closing JCR Session for the
current thread.
13:02:01,718 INFO  [TransientRepository] Session closed
13:02:01,718 DEBUG [RepositoryAccessor] ==> Opening new JCR Session for the
current thread.
13:02:01,718 INFO  [TransientRepository] Session opened
13:02:01,718 DEBUG [RepositoryAccessor] ==> Closing JCR Session for the
current thread.
13:02:01,718 INFO  [TransientRepository] Session closed
13:02:01,734 DEBUG [RepositoryAccessor] ==> Opening new JCR Session for the
current thread.
13:02:01,734 INFO  [TransientRepository] Session opened
13:02:01,750 DEBUG [RepositoryAccessor] ==> Closing JCR Session for the
current thread.
13:02:01,750 INFO  [TransientRepository] Session closed
13:02:01,750 DEBUG [RepositoryAccessor] ==> Opening new JCR Session for the
current thread.
13:02:01,750 INFO  [TransientRepository] Session opened
13:02:02,312 DEBUG [CollaborativeContent] Query //element(*,
cssns:file)[jcr:like(@cssns:documentTitle,
'%Picture%')]/(@cssns:documentTitle | @jcr:created | @jcr:mimeType |
@cssns:size | @jcr:lastModified) order by @cssns:documentTitle ascending is
executing now ...
A property claimed to start before zero, at -512! Resetting it to zero, and
hoping for the best
A property claimed to start before zero, at -512! Resetting it to zero, and
hoping for the best
13:02:06,515 INFO  [MultiIndex] updating index with 1 nodes from indexing
queue.
13:02:09,484 DEBUG [CollaborativeContent] Query //element(*,
cssns:file)[jcr:contains(jcr:content, 'Have much fun')] is executing now ...
13:02:09,531 DEBUG [CollaborativeContent] Query //element(*,
cssns:file)[jcr:contains(jcr:content, 'Have much funn')] is executing now
...
13:02:09,531 DEBUG [CollaborativeContent] Query //element(*,
cssns:file)[jcr:contains(jcr:content, '29.11.2009 OR 01.01.2010')] is
executing now ...
13:02:09,781 DEBUG [RepositoryAccessor] ==> Closing JCR Session for the
current thread.
13:02:09,781 INFO  [TransientRepository] Session closed
13:02:09,781 INFO  [TransientRepository] Session closed
13:02:09,781 INFO  [RepositoryImpl] Shutting down repository...
13:02:09,781 INFO  [IndexMerger] IndexMerger terminated
13:02:09,781 INFO  [SearchIndex] Index closed: target/repository/index
13:02:09,781 INFO  [RepositoryImpl] shutting down workspace 'ws_self_ova'...
13:02:09,781 INFO  [ObservationDispatcher] Notification of EventListeners
stopped.
13:02:09,781 INFO  [IndexMerger] IndexMerger terminated
13:02:09,906 INFO  [SearchIndex] Index closed:
target\workspaces\ws_self_ova/index
13:02:11,203 INFO  [DerbyPersistenceManager] Database
'target\workspaces\ws_self_ova/db' shutdown.
13:02:11,203 INFO  [RepositoryImpl] workspace 'ws_self_ova' has been
shutdown
13:02:11,203 INFO  [RepositoryImpl] shutting down workspace 'security'...
13:02:11,203 INFO  [ObservationDispatcher] Notification of EventListeners
stopped.
13:02:11,203 INFO  [IndexMerger] IndexMerger terminated
13:02:11,218 INFO  [SearchIndex] Index closed:
target\workspaces\security/index
13:02:12,421 INFO  [DerbyPersistenceManager] Database
'target\workspaces\security/db' shutdown.
13:02:12,421 INFO  [RepositoryImpl] workspace 'security' has been shutdown
13:02:12,421 INFO  [RepositoryImpl] shutting down workspace 'default'...
13:02:12,421 INFO  [ObservationDispatcher] Notification of EventListeners
stopped.
13:02:12,421 INFO  [IndexMerger] IndexMerger terminated
13:02:12,437 INFO  [SearchIndex] Index closed:
target\workspaces\default/index
13:02:13,734 INFO  [DerbyPersistenceManager] Database
'target\workspaces\default/db' shutdown.
13:02:13,734 INFO  [RepositoryImpl] workspace 'default' has been shutdown
13:02:14,968 INFO  [DerbyPersistenceManager] Database 'target/version/db'
shutdown.
13:02:14,968 INFO  [RepositoryImpl] Repository has been shutdown
13:02:14,968 INFO  [TransientRepository] Transient repository shut down


Best regards.
Oleg.
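
The fixed sleep of about 5 seconds described in this message could be
replaced by polling with a timeout, so that the caller waits only as long as
indexing actually takes. The following is a minimal sketch, assuming the
search can be wrapped in a check that returns true once the document is
found. IndexWait and waitUntil are hypothetical names, not Jackrabbit API,
and the lambda in main is only a stand-in for executing the real query.

```java
import java.util.function.BooleanSupplier;

public class IndexWait {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or if the
     * waiting thread was interrupted.
     */
    public static boolean waitUntil(BooleanSupplier condition,
                                    long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        final int[] polls = {0};
        // Stand-in for "run the full-text query and check for a hit":
        // here the condition simply becomes true on the third poll.
        boolean found = waitUntil(() -> ++polls[0] >= 3, 5000, 50);
        System.out.println("found=" + found + " after " + polls[0] + " polls");
    }
}
```

In a real test the condition would execute the jcr:contains query and return
whether the result contains the expected node.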

Re: Full text search failed

Posted by Alexander Klimetschek <ak...@day.com>.
On Mon, Nov 30, 2009 at 11:56, Oleg <ov...@googlemail.com> wrote:
> And now I get a warning:
> A property claimed to start before zero, at -512! Resetting it to zero, and
> hoping for the best

Do you have a full stack trace? Or more info from the logs?

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com