Posted to solr-user@lucene.apache.org by au...@francelabs.com on 2014/10/21 08:43:21 UTC

Nested documents in Solr

Hi,

I have a question about nested document queries.

For example, let’s say that I have many books, one of which is the 
following one:
Book_Title: Nested documents for dummies
Chapter1_Title: Introduction
Chapter1_Content: Nested documents are fun.
Chapter2_Title: Which technology should I use?
Chapter2_Content: Lucene of course!

First, I want to find books that contain an introduction and that are 
about Lucene. So I flatten my data into three multivalued fields 
(Book_Title, Chapter_Title and Chapter_Content), index my document, and 
get what I want when I run the following query: 
“chapter_title:Introduction AND chapter_content:Lucene”
But now I want to find books that contain “fun” in a chapter whose name 
is “Introduction”. My model is no longer valid, because in the flattened 
form Chapter2_content is no longer linked to Chapter2_title. That is why 
I change my data model and use nested documents:
I now have a parent with a single-valued field Book_title and several 
children with single-valued fields Chapter_title and Chapter_Content. Now, 
when I run the query “chapter_title:Introduction AND 
chapter_content:fun” I also get what I want. But what do I have to do if 
I want to run both kinds of query against a single data model?
Maybe the only way to do this is to use nested documents and to index the 
data both in the child documents and in a flattened form in the parent 
document. Then both queries could be run.
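
A minimal sketch of that combined model, written in plain Python rather than against Solr. The dictionary layout, the "children" key and the two helper functions are assumptions made for illustration, not Solr's API, and substring matching is only a crude stand-in for term matching. It shows that keeping a flattened copy on the parent alongside per-chapter children lets both kinds of query be answered:

```python
# Illustrative only: plain Python data, not Solr's indexing API.
chapters = [
    {"chapter_title": "Introduction",
     "chapter_content": "Nested documents are fun."},
    {"chapter_title": "Which technology should I use?",
     "chapter_content": "Lucene of course!"},
]

book = {
    "book_title": "Nested documents for dummies",
    # Flattened, multivalued copies kept on the parent.
    "chapter_title": [c["chapter_title"] for c in chapters],
    "chapter_content": [c["chapter_content"] for c in chapters],
    # Per-chapter children keep each title linked to its content.
    "children": chapters,
}


def contains(values, term):
    """Case-insensitive substring match against any of the values."""
    return any(term.lower() in v.lower() for v in values)


def matches_anywhere(doc, title_term, content_term):
    """Book-level query: the two terms may come from different chapters."""
    return (contains(doc["chapter_title"], title_term)
            and contains(doc["chapter_content"], content_term))


def matches_same_chapter(doc, title_term, content_term):
    """Chapter-level query: both terms must match within one child."""
    return any(
        contains([c["chapter_title"]], title_term)
        and contains([c["chapter_content"]], content_term)
        for c in doc["children"]
    )


print(matches_anywhere(book, "Introduction", "Lucene"))      # True
print(matches_same_chapter(book, "Introduction", "fun"))     # True
print(matches_same_chapter(book, "Introduction", "Lucene"))  # False
```

The cost of this layout is that every chapter value is stored twice, once on the parent and once on its child.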

Do you have any other (better) ideas?

Thank you,

Regards,

Aurélien

Re: Nested documents in Solr

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello Aurélien,

There is a lot of material on this problem. Start with this one:
https://www.youtube.com/watch?v=YCkkOyZ-zkM

On Wed, Oct 22, 2014 at 6:08 PM, <au...@francelabs.com> wrote:

> Hi Ramzi,
>
> Thank you, but I am not sure I understand your answer. In your
> example, I suppose that the indexed docs are flattened. If I want an AND
> query instead of an OR query (say, for example, 'chapter_title:Lucene
> AND chapter_content:fun'), how can I be sure that the terms "Lucene" and
> "fun" will be matched in the same chapter of the book, given that in this
> case chapter_content and chapter_title are multivalued fields?
>
> Regards,
>
> Aurélien


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>

Re: Nested documents in Solr

Posted by au...@francelabs.com.
Hi Ramzi,

Thank you, but I am not sure I understand your answer. In your 
example, I suppose that the indexed docs are flattened. If I want an AND 
query instead of an OR query (say, for example, 'chapter_title:Lucene 
AND chapter_content:fun'), how can I be sure that the terms "Lucene" and 
"fun" will be matched in the same chapter of the book, given that in this 
case chapter_content and chapter_title are multivalued fields?
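
The concern can be reproduced with a small sketch in plain Python (not Solr). The two-chapter book below is hypothetical sample data, and substring matching is only a rough stand-in for term matching. With flattened multivalued fields, each clause of the AND is checked against all values of its field, so the query matches even though "Lucene" and "fun" sit in different chapters:

```python
# Hypothetical book: "Lucene" and "fun" occur in different chapters.
chapter_titles = ["Introduction", "Why Lucene?"]
chapter_contents = ["Nested documents are fun.", "Because it is fast."]


def has_term(text, term):
    """Case-insensitive substring check, a crude stand-in for term matching."""
    return term.lower() in text.lower()


# Flattened multivalued fields: each clause looks at every value of its
# field independently, so the chapter boundary is lost.
flat_match = (
    any(has_term(t, "Lucene") for t in chapter_titles)
    and any(has_term(c, "fun") for c in chapter_contents)
)

# Per-chapter check: both clauses must hold for the same chapter.
same_chapter_match = any(
    has_term(t, "Lucene") and has_term(c, "fun")
    for t, c in zip(chapter_titles, chapter_contents)
)

print(flat_match)          # True: a false positive for the intended question
print(same_chapter_match)  # False: no single chapter has both terms
```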

Regards,

Aurélien

On 21.10.2014 19:59, Ramzi Alqrainy wrote:
> If I understand your question correctly: Solr supports several query
> syntaxes. Unless you explicitly specify an alternative query parser such
> as DisMax or eDisMax, you are using the standard Lucene query parser by
> default.
> 
> In your case, I think I can solve it by using this query
> chapter_title:Introduction ( chapter_title:Lucene OR 
> chapter_content:fun )



Re: Nested documents in Solr

Posted by Ramzi Alqrainy <ra...@gmail.com>.
If I understand your question correctly: Solr supports several query
syntaxes. Unless you explicitly specify an alternative query parser such as
DisMax or eDisMax, you are using the standard Lucene query parser by default.
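
As a rough sketch of how the parser choice shows up in a request: Solr's defType parameter selects the query parser, and leaving it out gives the standard Lucene parser. The host, the core name "books" and the helper function below are assumptions made for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the host and the core name "books" are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/books/select"


def build_select_url(query, parser=None, **extra):
    """Build a select URL; omit defType to get the default Lucene parser."""
    params = {"q": query, "wt": "json"}
    if parser is not None:
        params["defType"] = parser  # e.g. "dismax" or "edismax"
    params.update(extra)
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Standard Lucene parser (no defType), with fielded syntax in q.
default_url = build_select_url("chapter_title:Introduction")

# eDisMax, with the fields to search given through qf.
edismax_url = build_select_url("Introduction", parser="edismax",
                               qf="chapter_title")

print(default_url)
print(edismax_url)
```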

In your case, I think I can solve it by using this query
chapter_title:Introduction ( chapter_title:Lucene OR chapter_content:fun )

Here are some query examples demonstrating the query syntax.

*Keyword matching*

Search for the word "foo" in the title field.

title:foo

Search for the phrase "foo bar" in the title field.

title:"foo bar"

Search for the phrase "foo bar" in the title field AND the phrase "quick fox" in
the body field.

title:"foo bar" AND body:"quick fox"

Search for either the phrase "foo bar" in the title field AND the phrase
"quick fox" in the body field, or the word "fox" in the title field.

(title:"foo bar" AND body:"quick fox") OR title:fox

Search for the word "foo" and not "bar" in the title field.

title:foo -title:bar

*Wildcard matching*

Search for any word that starts with "foo" in the title field.

title:foo*

Search for any word that starts with "foo" and ends with "bar" in the title
field.

title:foo*bar

Note that the classic Lucene query parser does not, by default, allow a *
symbol as the first character of a search.






