Posted to solr-user@lucene.apache.org by Sathyakumar Seshachalam <Sa...@Trimble.com> on 2016/01/29 11:13:02 UTC

Nested documents and many-many relation

Hi,

I am investigating whether the Block Join query parser can be used in a many-to-many relation scenario.
My observation is that when a document is added as a child to more than one parent document (I use SolrJ to do this), I seem to get two copies of the child document. Can this be avoided? Is this by design?
Are there any articles on ways to model a many-to-many relationship (even if it's a hacky solution)?


Re: Nested documents and many-many relation

Posted by Jan Høydahl <ja...@cominvent.com>.
What about the new Parallel SQL feature in 6.0? It is also a query-time approach, built on top of streaming; I don't know how it performs...

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com



Re: Nested documents and many-many relation

Posted by Sathyakumar Seshachalam <Sa...@Trimble.com>.
Thanks. Query-time joins are not an option for me because of the size of
the index and hence the join performance.
I will look at Siren.
 



Re: Nested documents and many-many relation

Posted by Alessandro Benedetti <ab...@apache.org>.
If you are interested in a many-to-many relation, you may be
interested in the query-time join.
It was the first type of join integrated into Solr.
It allows you to avoid redundancy.
It's slower than block join, but it doesn't force you into any specific
indexing approach.
It has become less and less popular, but there are scenarios where it could be useful!
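
As a rough illustration of the query-time join, the sketch below only builds the
request parameters; it does not talk to a Solr server. The schema is an
assumption made for the example: every document has a unique `id`, and child
documents carry a multi-valued `parent_ids` field listing all of their parents,
so each child is indexed once. The helper name and the `color:red` query are
hypothetical.

```python
from urllib.parse import urlencode


def parents_with_matching_children(child_query: str) -> dict:
    """Build request parameters for a query-time join.

    {!join from=F to=T}Q runs Q, collects the values of field F from the
    matching documents, and returns the documents whose field T holds one
    of those values. Here F is the assumed `parent_ids` field on children
    and T is the `id` field on parents.
    """
    return {"q": "{!join from=parent_ids to=id}" + child_query, "fl": "id"}


params = parents_with_matching_children("color:red")
print(params["q"])        # the join query as sent in the q parameter
print(urlencode(params))  # the same parameters, URL-encoded for a GET request
```

Because the relation lives in a field on the child rather than in the index
layout, a child can point at any number of parents without being copied.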

Take a look at Siren as well. It could be interesting, but I am not sure it will
help you, as Siren will duplicate nested documents.
Cheers




-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Nested documents and many-many relation

Posted by Jack Krupansky <ja...@gmail.com>.
If you wish to change, add, or delete a child, or change the parent, you must
re-add the entire block, with both the parent and all of its children.
This is because the efficiency of Block Join comes from the documents being
adjacent in Lucene, and segments are immutable in Lucene, so the entire
block must be written to a new segment.
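
The adjacency requirement is also why a child shared by two parents shows up
twice. The following is a toy model in plain Python, not Solr or Lucene code:
it lays out each block as its children followed by the parent, which is the
order block join relies on. The function names and document ids are
hypothetical.

```python
def build_blocks(parents):
    """Flatten {parent_id: [child_id, ...]} into an index-order list.

    Each block is written as its children followed by the parent, so a
    child that belongs to two parents has to be written once per block.
    """
    index = []
    for parent_id, child_ids in parents.items():
        for child_id in child_ids:
            index.append(("child", child_id, parent_id))
        index.append(("parent", parent_id, None))
    return index


def copies_of(index, child_id):
    """Count how many times a child document appears in the flattened index."""
    return sum(
        1 for kind, doc_id, _ in index if kind == "child" and doc_id == child_id
    )


# c2 is a child of both p1 and p2
index = build_blocks({"p1": ["c1", "c2"], "p2": ["c2", "c3"]})
print(copies_of(index, "c2"))
```

In this model the only way for `c2` to sit next to both `p1` and `p2` is to be
written into both blocks, which matches the two copies observed in the
original question.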


-- Jack Krupansky


Re: Nested documents and many-many relation

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,
This implies that indexing extracts cliques of the bipartite graph. Then
every clique is indexed as a single block with a sentinel parent document, and
this parent document can carry the incidence matrix as, let's say, binary
docvalues. Then a set of custom components can handle this model.
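
A minimal sketch of that idea in plain Python, under a loud assumption: the
groups are computed here as connected components of the parent/child graph,
which is one possible reading of "cliques" and may not be what was intended.
The dictionary layout, the `sentinel` type, and all ids are hypothetical, and
nothing here is Solr API; the custom query components mentioned above would
still have to be written.

```python
from collections import defaultdict


def components(edges):
    """Group (parent, child) edges into connected components via union-find."""
    root = {}

    def find(node):
        root.setdefault(node, node)
        while root[node] != node:
            root[node] = root[root[node]]  # path halving
            node = root[node]
        return node

    for parent, child in edges:
        root[find(("P", parent))] = find(("C", child))

    groups = defaultdict(list)
    for parent, child in edges:
        groups[find(("P", parent))].append((parent, child))
    return list(groups.values())


def to_block(component):
    """Turn one component into a block: member docs, then a sentinel parent."""
    parents = sorted({p for p, _ in component})
    children = sorted({c for _, c in component})
    links = set(component)
    # incidence[i][j] == 1 when parents[i] is related to children[j]
    incidence = [[1 if (p, c) in links else 0 for c in children] for p in parents]
    sentinel = {
        "type": "sentinel",
        "parents": parents,
        "children": children,
        "incidence": incidence,
    }
    # The sentinel goes last, like the parent document of a block
    return {"docs": children + parents, "sentinel": sentinel}


edges = [("p1", "c1"), ("p1", "c2"), ("p2", "c2"), ("p3", "c3")]
blocks = [to_block(c) for c in components(edges)]
for block in blocks:
    print(block["docs"], block["sentinel"]["incidence"])
```

Each document is stored once per component, and the many-to-many links survive
only in the incidence matrix on the sentinel, so any change to one link means
re-indexing the whole component.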



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mk...@griddynamics.com>