Posted to dev@jackrabbit.apache.org by "Bertrand Lega (JIRA)" <ji...@apache.org> on 2005/08/25 19:38:08 UTC

[jira] Created: (JCR-198) jcr:contains doesn't return incomplete match

jcr:contains doesn't return incomplete match
--------------------------------------------

         Key: JCR-198
         URL: http://issues.apache.org/jira/browse/JCR-198
     Project: Jackrabbit
        Type: Bug
  Components: query  
    Reporter: Bertrand Lega


This behaviour is very strange. 

I have the following repository : 

   ... 
   + [node] mynode
        [prop] title = "my big title"

the following query doesn't return any node : //*[jcr:contains(@title, "bi")] wherea
revision : 234496

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


Re: [jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by Marcel Reutegger <ma...@gmx.net>.
Hi Bertrand,

Bertrand LEGA wrote:
> It goes without saying that I think this is a very useful feature :)
> 
> Marcel, do you think this modification requires good expertise 
> in lucene and/or jackrabbit ?

when you know where to look, it's not that complicated ;)

> If it's a little modification, I may assign somebody from my team to 
> have a look at it, or even do it myself.

those are the changes I think are needed:

- in LuceneQueryBuilder.visit(TextsearchQueryNode node, Object data) we 
need to switch to a different QueryParser that allows wildcards at the 
beginning of a term. The current QueryParser is the one shipped with lucene.
- Adapt the default lucene parser definition and integrate the generation 
of the new custom parser into the maven build process. You will notice that 
there is already a comment in the QueryParser.jj file shipped with the 
lucene source distribution that shows how to enable prefix queries.

hmm, I guess that's it already.

> If you feel that this is a sensitive area and that the change should be 
> done by a jackrabbit expert, then we would wait for the implementation 
> (provided the change is accepted).
> 
> What do you think ?

go ahead. any help is appreciated!

> By the way, what is the point of having both jcr:like and jcr:contains ? Why 
> not have one operation (and be able to specify whether the query is case 
> sensitive or not) ?

jcr:like was introduced to map the LIKE operator from SQL to XPath. And 
because LIKE is case sensitive in SQL it must behave the same in XPath.

whereas with jcr:contains the aim is to provide a fulltext facility 
which is among other things _not_ case sensitive.
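
To make the distinction concrete, here is a minimal plain-Java sketch. It is not Jackrabbit code: the class and method names are made up for illustration, and the whole-word, whitespace-based tokenizing is an assumption standing in for a real fulltext analyzer. It only contrasts a case-sensitive LIKE-style pattern match with a case-insensitive word match.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical illustration; neither method is part of Jackrabbit or JCR. */
final class LikeVersusContainsSketch {

    /** LIKE-style match: '%' is any run of characters, '_' is exactly one; case sensitive. */
    static boolean like(String value, String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }

    /** Fulltext-style match (assumed semantics): a whole word must match, ignoring case. */
    static boolean containsWord(String value, String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.equals(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "my big title";
        System.out.println(like(title, "%big%"));       // true
        System.out.println(like(title, "%BIG%"));       // false, LIKE is case sensitive
        System.out.println(containsWord(title, "BIG")); // true, case is ignored
    }
}
```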

regards
  marcel

Re: [jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by Bertrand LEGA <le...@yahoo.fr>.
It goes without saying that I think this is a very useful feature :)

Marcel, do you think this modification requires good expertise 
in lucene and/or jackrabbit ?
If it's a little modification, I may assign somebody from my team to 
have a look at it, or even do it myself.
If you feel that this is a sensitive area and that the change should be 
done by a jackrabbit expert, then we would wait for the implementation 
(provided the change is accepted).

What do you think ?

By the way, what is the point of having both jcr:like and jcr:contains ? Why 
not have one operation (and be able to specify whether the query is case 
sensitive or not) ?

Regards,
Bertrand.



	

	
		
___________________________________________________________________________ 
Free voice calls anywhere in the world with the new Yahoo! Messenger 
Download this version at http://fr.messenger.yahoo.com

Re: [jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by Philipp Bracher <ph...@obinary.com>.
>> So jcr:like(@title, "%N%") does the job but is case sensitive. And 
>> jcr:contains(@title, "*N*") is

I have the same problem, and I think this feature is used in nearly 
every application and is therefore important. Case-insensitive search with 
patterns is supported by most widely used storage systems. Searching for 
whole words may be OK for google, but not for smaller applications where 
you search for cities or whatever other entities.

Would be nice to get this feature ;-)

Philipp Bracher


Re: [jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by Marcel Reutegger <ma...@gmx.net>.
Bertrand LEGA wrote:
> The following query doesn't work :
> 
> //*[jcr:contains(., '*bi*')]
> 
> Why is it so ?

This is currently not supported because it requires a complete index 
scan, which can be quite expensive.

If there is a wider need for this kind of query we can implement this in 
jackrabbit.
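
To illustrate why a leading wildcard is costly, here is a rough sketch in plain Java. It assumes the index keeps its terms in a sorted dictionary, modelled here by a TreeSet. The class and method names are hypothetical and this is not how Jackrabbit or lucene is actually implemented. A prefix such as 'bi*' can seek into the sorted terms and stop early, while an infix such as '*bi*' has to inspect every term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical model of a sorted term dictionary; not lucene's actual data structure. */
final class TermScanSketch {

    /** Prefix lookup: seek to the first term >= prefix, stop once terms no longer start with it. */
    static List<String> prefixTerms(NavigableSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order guarantees no later term can match
            }
            hits.add(term);
        }
        return hits;
    }

    /** Infix lookup: the sort order gives no starting point, so every term is inspected. */
    static List<String> infixTerms(NavigableSet<String> terms, String infix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.contains(infix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(
                Arrays.asList("a", "answer", "big", "bit", "new", "question", "title"));
        System.out.println(prefixTerms(terms, "bi")); // [big, bit]
        System.out.println(infixTerms(terms, "n"));   // [answer, new, question]
    }
}
```

The prefix case touches only the matching terms plus one more, whereas the infix case grows with the total number of distinct terms in the index.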

thoughts?

> Basically, my need is to do the following search :
>   - search for a string anywhere in a property. For example, with the 
> contents above, searching for 'N' should return both nodes.
>   - case insensitive
> 
> So jcr:like(@title, "%N%") does the job but is case sensitive. And 
> jcr:contains(@title, "*N*") is not possible.
> I'm stuck here.
> 
> Do you have an idea ?

well, I'm afraid, this is currently not possible...

regards
  marcel

Re: [jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by Bertrand LEGA <le...@yahoo.fr>.
Indeed, it works.
Thanks for your answer.

The following query doesn't work :

//*[jcr:contains(., '*bi*')]

Why is it so ?

+ [node] node1
        [prop] title = "a new question"
+ [node] node2
        [prop] title = "answer"


Basically, my need is to do the following search :
   - search for a string anywhere in a property. For example, with the 
contents above, searching for 'N' should return both nodes.
   - case insensitive

So jcr:like(@title, "%N%") does the job but is case sensitive. And 
jcr:contains(@title, "*N*") is not possible.
I'm stuck here.

Do you have an idea ?
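
The semantics asked for above can be pinned down with a small in-memory sketch. This is plain Java over a map of node names to titles, not a JCR query, and the class and method names are invented for illustration. It only shows the expected result for the two example nodes: a case-insensitive substring test on the title property.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical in-memory stand-in for the repository content; not a JCR API. */
final class SubstringSearchSketch {

    /** Returns the names of all nodes whose title contains the needle, ignoring case. */
    static List<String> nodesWithTitleContaining(Map<String, String> titlesByNode, String needle) {
        String lowered = needle.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : titlesByNode.entrySet()) {
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(lowered)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> titles = new LinkedHashMap<>();
        titles.put("node1", "a new question");
        titles.put("node2", "answer");
        System.out.println(nodesWithTitleContaining(titles, "N")); // [node1, node2]
    }
}
```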



[jira] Resolved: (JCR-198) jcr:contains doesn't return incomplete match

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-198?page=all ]
     
Marcel Reutegger resolved JCR-198:
----------------------------------

    Resolution: Invalid

jcr:contains in XPath and CONTAINS in SQL are fulltext extensions to the two languages. The specification is not very strict about what exactly is supported. It merely defines the fulltext syntax that can be used in the contains function. Whether a term also matches a substring is not specified.

It is certainly possible to change the jackrabbit implementation so that jcr:contains also matches substrings. However, I'm not sure that is a good idea. E.g. if you search for 'bi' on google you don't get everything that starts with 'bi'.

What Jackrabbit already supports are wildcards in the jcr:contains function.

To make your query work you can use:
//*[jcr:contains(., 'bi*')]

which will match everything that starts with 'bi' or 'bi' itself.

or use:
//*[jcr:contains(., 'bi?')]

which will match any 3 letter word that starts with 'bi'

Keep in mind that this is not standardized and will probably not work on other implementations.
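
The wildcard behaviour described above can be modelled with a short plain-Java sketch. It assumes that the property value is split into lower-cased words and that each word is matched as a whole against the search term. The tokenizing on whitespace, the class name and the method names are all assumptions for illustration, not the Jackrabbit or lucene implementation.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical model of term-level wildcard matching; not Jackrabbit's actual code. */
final class WildcardTermSketch {

    /** Converts a wildcard term to a regex: '*' is any run of characters, '?' is exactly one. */
    static Pattern toRegex(String wildcardTerm) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcardTerm.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** True if any whitespace-separated, lower-cased word of the text matches the whole term. */
    static boolean contains(String text, String wildcardTerm) {
        Pattern pattern = toRegex(wildcardTerm);
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && pattern.matcher(token).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "my big title";
        System.out.println(contains(title, "bi"));  // false, no word equals "bi"
        System.out.println(contains(title, "big")); // true
        System.out.println(contains(title, "bi*")); // true
        System.out.println(contains(title, "bi?")); // true
    }
}
```

Under this model the reported query fails simply because "bi" is not a complete word in "my big title", while both wildcard forms match the word "big".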




[jira] Commented: (JCR-198) jcr:contains doesn't return incomplete match

Posted by "Bertrand Lega (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/JCR-198?page=comments#action_12320019 ] 

Bertrand Lega commented on JCR-198:
-----------------------------------

Arg, I posted the bug before I finished typing it !!

... continued

whereas the following returns my node as expected : //*[jcr:contains(@title, "big")]

So for now, I use jcr:like but jcr:like is case sensitive, so this is not a satisfying workaround for me.





[jira] Closed: (JCR-198) jcr:contains doesn't return incomplete match

Posted by "Stefan Guggisberg (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/JCR-198?page=all ]
     
Stefan Guggisberg closed JCR-198:
---------------------------------


closing resolved issue

