Posted to users@jackrabbit.apache.org by Sridhar Raman <sr...@gmail.com> on 2007/12/17 14:24:00 UTC

Querying on multi-value properties

This is the node structure I have:
bookType:
- authors (multiple)

I may have around 1000 nodes of bookType.  Now, I want to find all nodes of
type bookType that have "stephen king" in the property authors.

A simple jcr:contains query like this - jcr:contains(@authors, 'stephen
king') - doesn't work the way I want.  The reason is that if a node
happens to have "stephen" in one of the authors values and "king" in
another, that node is returned as a match.  But I want _only_ those nodes
that have "stephen" and "king" in the same value of the authors property.

Example:
nodeA:
- authors {"stephen king", "robert jordan"}

nodeB:
- authors {"good stephen", "royal king"}

On running the query, I get both nodeA and nodeB.  But I want only nodeA.
Is there any way of doing this?

Thanks,
Sridhar
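
The behaviour described in this post can be modelled in a few lines of
Python. This is only a sketch of the matching semantics, assuming that the
full-text index pools the tokens of all values of a multi-valued property
into a single field. It is not Jackrabbit code, and the function names
contains_pooled and contains_same_value are made up for illustration.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def contains_pooled(values, query):
    """Model of the observed behaviour: the tokens of all values form one bag,
    and every query term only has to occur somewhere in that bag."""
    pool = {t for value in values for t in tokens(value)}
    return all(term in pool for term in tokens(query))


def contains_same_value(values, query):
    """Model of the desired behaviour: all query terms must occur together
    in at least one single value."""
    terms = tokens(query)
    return any(all(term in tokens(value) for term in terms) for value in values)


node_a = ["stephen king", "robert jordan"]
node_b = ["good stephen", "royal king"]

for name, values in (("nodeA", node_a), ("nodeB", node_b)):
    print(name,
          "pooled:", contains_pooled(values, "stephen king"),
          "same value:", contains_same_value(values, "stephen king"))
```

Under the pooled model both nodes match, while the per-value model keeps
only nodeA, which is the result asked for here.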

Re: Querying on multi-value properties

Posted by Marcel Reutegger <ma...@gmx.net>.
Ard Schrijvers wrote:
>> Is this how it should work?
> 
> I am not sure whether this is according to the spec. Anybody else?

the spec does not clearly specify how multi-valued properties are handled in a 
jcr:contains() function. it even says that jcr:contains() on a property is optional.

I don't think we can or should change the current behaviour. as a workaround I 
suggest to use individual nodes for authors. e.g. using infamous same-name siblings:

/books/addison-wesley/1995/design-patterns/author[1]/name = 'Erich Gamma'
/books/addison-wesley/1995/design-patterns/author[2]/name = 'Richard Helm'
/books/addison-wesley/1995/design-patterns/author[3]/name = 'Ralph E. Johnson'

the following query will not return results, because 'erich' and 'helm' occur 
in the name properties of two different author nodes:

//element(*, my:book)[jcr:contains(author/@name, 'erich helm')]

or without using SNS:

/books/addison-wesley/1995/design-patterns/authors/gamma/name = 'Erich Gamma'
/books/addison-wesley/1995/design-patterns/authors/helm/name = 'Richard Helm'
/books/addison-wesley/1995/design-patterns/authors/johnson/name = 'Ralph E. Johnson'

you would then have the following equivalent query:

//element(*, my:book)[jcr:contains(authors/*/@name, 'erich helm')]

regards
  marcel
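
As a rough illustration of the layout without same-name siblings, the
following Python sketch splits a multi-valued authors property into one
uniquely named child node per author and then matches each child
separately. It assumes non-empty author values and derives the child node
name from the last word of the name; both helpers (to_author_nodes and
matches) and the numeric suffix used on name collisions are hypothetical
and not part of any JCR API.

```python
def to_author_nodes(authors):
    """Map each author value to its own child node, keyed by a unique node name."""
    nodes = {}
    for value in authors:
        base = value.split()[-1].lower()  # "Erich Gamma" -> "gamma"
        name, n = base, 1
        while name in nodes:              # avoid same-name siblings
            n += 1
            name = f"{base}{n}"
        nodes[name] = {"name": value}
    return nodes


def matches(author_nodes, query):
    """True if a single author node's name contains every query term."""
    terms = query.lower().split()
    return any(
        all(term in node["name"].lower().split() for term in terms)
        for node in author_nodes.values()
    )


authors = to_author_nodes(["Erich Gamma", "Richard Helm", "Ralph E. Johnson"])
print(list(authors))                    # child node names
print(matches(authors, "erich helm"))   # terms live in different nodes
print(matches(authors, "erich gamma"))  # both terms in one node
```

Because every name is a single-valued property on its own node, terms from
different authors can no longer combine into a match.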

RE: Querying on multi-value properties

Posted by Ard Schrijvers <a....@hippo.nl>.
> > You can use contains and 'and' the different terms, ie,
> 
> > "//*[jcr:contains(.,'stephen') and jcr:contains(.,'king')]"
> Herein lies the main problem.  Say I had a node with 2 values 
> for the property author, namely "stephen jordan" and "terry king".
> The above query would still flag this node as a matching 
> node, though the same property _actually_ doesn't contain 
> stephen AND king.  Probably the reason for this is that the 
> property is searched as a whole, and a special case for 
> multi-value is not made.

I understand your problem.

> 
> Regarding analysers, I actually have my own analyser, etc.  

Yes, but I do not think you have a property-specific analyzer, which could
still resolve your issue (the way I explained before).

> So that is not a problem.  Just that a search that I wish 
> would be enforced only on each single value of a multi-value 
> property doesn't happen.  The search is "spread" across the 
> property values.
> 
> Is this how it should work?

I am not sure whether this is according to the spec. Anybody else?

Regards Ard


Re: Querying on multi-value properties

Posted by Sridhar Raman <sr...@gmail.com>.
> You can use contains and 'and' the different
> terms, ie,

> "//*[jcr:contains(.,'stephen') and jcr:contains(.,'king')]"

Herein lies the main problem.  Say I had a node with 2 values for the
property author, namely "stephen jordan" and "terry king".
The above query would still flag this node as a matching node, though the
same property _actually_ doesn't contain stephen AND king.  Probably the
reason for this is that the property is searched as a whole, and a special
case for multi-value is not made.

Regarding analysers, I actually have my own analyser, etc.  So that is not a
problem.  Just that a search that I wish would be enforced only on each
single value of a multi-value property doesn't happen.  The search is
"spread" across the property values.

Is this how it should work?

On Dec 18, 2007 2:35 PM, Ard Schrijvers <a....@hippo.nl> wrote:

>
> > Isn't //*[@author='stephen king'] a case-sensitive search?
>
> You have fn:lower-case
>
> > Besides if I had a node with an author value as "stephen
> > edwin king", this query wouldn't work.  Right?
>
> True, but I did not know you were dealing with all kinds of variations of
> one and the same person's name. You can use contains and 'and' the different
> terms, i.e.,
>
> "//*[jcr:contains(.,'stephen') and jcr:contains(.,'king')]"
>
> Otherwise, you could check whether the synonym provider can be used for
> multi-term entries. If not, and you really want it to work the way you
> describe, you can always add your own AuthorAnalyzer (a combination of
> KeywordAnalyzer and SynonymAnalyzer) and then you can use
>
> //*[@author='stephen edward king']
> //*[@author='Stephen Edward King']
> //*[@author='stephen king']
> //*[@author='Stephen King']
>
> depending on how you organize your analyzer. It is not really trivial the
> first time, but it helps you understand Lucene analysis a bit, and how to
> optimize the searches you want to do.
>
> Regards Ard

RE: Querying on multi-value properties

Posted by Ard Schrijvers <a....@hippo.nl>.
> Isn't //*[@author='stephen king'] a case-sensitive search?  

You can use fn:lower-case, e.g. //*[fn:lower-case(@author)='stephen king']

> Besides if I had a node with an author value as "stephen 
> edwin king", this query wouldn't work.  Right?

True, but I did not know you were dealing with all kinds of variations of
one and the same person's name. You can use contains and 'and' the different
terms, i.e.,

"//*[jcr:contains(.,'stephen') and jcr:contains(.,'king')]"

Otherwise, you could check whether the synonym provider can be used for
multi-term entries. If not, and you really want it to work the way you
describe, you can always add your own AuthorAnalyzer (a combination of
KeywordAnalyzer and SynonymAnalyzer) and then you can use

//*[@author='stephen edward king']
//*[@author='Stephen Edward King']
//*[@author='stephen king']
//*[@author='Stephen King']

depending on how you organize your analyzer. It is not really trivial the
first time, but it helps you understand Lucene analysis a bit, and how to
optimize the searches you want to do.

Regards Ard
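
To make the analyzer idea a little more concrete, here is a small Python
model of what such an AuthorAnalyzer might do: treat each author value as
one keyword token, lowercase it, and map known name variants to a single
canonical form. This is a sketch of the concept only and does not use the
Lucene API; the SYNONYMS table, analyze and equals_any are assumptions made
up for the example.

```python
# Hypothetical synonym table: name variant -> canonical form.
SYNONYMS = {
    "stephen edwin king": "stephen king",
    "stephen edward king": "stephen king",
}


def analyze(value):
    """Keyword-style analysis: the whole value becomes one normalized token."""
    key = " ".join(value.lower().split())  # lowercase, collapse whitespace
    return SYNONYMS.get(key, key)


def equals_any(values, query):
    """Model of @author='...' on a multi-valued property: true if any single
    value equals the query after both sides are analyzed the same way."""
    wanted = analyze(query)
    return any(analyze(value) == wanted for value in values)


print(equals_any(["Stephen Edward King", "Robert Jordan"], "stephen king"))
print(equals_any(["good stephen", "royal king"], "Stephen King"))
```

Since each value is compared as a whole, "good stephen" and "royal king"
cannot combine into a match, while case and name variants are absorbed by
the normalization step.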


Re: Querying on multi-value properties

Posted by Sridhar Raman <sr...@gmail.com>.
Isn't //*[@author='stephen king'] a case-sensitive search?  Besides if I had
a node with an author value as "stephen edwin king", this query wouldn't
work.  Right?

On Dec 17, 2007 7:03 PM, Ard Schrijvers <a....@hippo.nl> wrote:

> Hello,
>
> //*[@author='stephen king'] should do it
>
> Regards Ard

RE: Querying on multi-value properties

Posted by Ard Schrijvers <a....@hippo.nl>.
Hello,

//*[@author='stephen king'] should do it

Regards Ard
