Posted to java-user@lucene.apache.org by Jon Stewart <jo...@lightboxtechnologies.com> on 2014/02/10 15:05:21 UTC

Faceting and Query-time Joins

Hello,

tl;dr: I'd like to know how to do faceting over the result set of a
query-time join (JoinUtil). If it's not currently supported by
Lucene, I'd appreciate some pointers about what needs to be done.

I'm working on a greenfield project with Lucene 4.6. The application
treats its primary objects as a collection of child records. The child
records are of different types and, unfortunately, are not available
all at once (ruling out BlockJoinQuery). As the child records roll
into the system for indexing, they're represented as Lucene Document
objects that have the primary key of the parent object as a field. The
child records themselves never change, so there's no need for
re-indexing. I can use query-time joins on the parent ID field. So
far, so good.
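
The setup described above can be illustrated with a small sketch in
plain Java. It is only a conceptual model of a query-time join on a
parent ID field, not Lucene code: the Child class, its fields, and the
substring match standing in for a real query are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class QueryTimeJoinSketch {
    /** Hypothetical child record: the parent's key plus a typed payload. */
    static final class Child {
        final String parentId;
        final String type;
        final String text;

        Child(String parentId, String type, String text) {
            this.parentId = parentId;
            this.type = type;
            this.text = text;
        }
    }

    /** Pass 1: collect the parent IDs of children whose text contains the term. */
    static Set<String> matchingParentIds(List<Child> index, String term) {
        Set<String> ids = new TreeSet<>();
        for (Child c : index) {
            if (c.text.contains(term)) {
                ids.add(c.parentId);
            }
        }
        return ids;
    }

    /** Pass 2: select every child that carries one of those parent IDs. */
    static List<Child> join(List<Child> index, Set<String> parentIds) {
        List<Child> joined = new ArrayList<>();
        for (Child c : index) {
            if (parentIds.contains(c.parentId)) {
                joined.add(c);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Children arrive independently; each one only knows its parent's key.
        List<Child> index = Arrays.asList(
            new Child("p1", "email", "quarterly report"),
            new Child("p1", "image", "scanned invoice"),
            new Child("p2", "email", "lunch plans"),
            new Child("p2", "image", "holiday photo"));

        Set<String> ids = matchingParentIds(index, "report");
        System.out.println(ids + " -> " + join(index, ids).size() + " children");
    }
}
```

The two passes are the point here: nothing about the parent has to be
known at indexing time, which is why children can be added as they
arrive without re-indexing anything.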

The problem is that I also very much want to have faceting pertaining
to the parent objects. Googling around the past couple of days hasn't
revealed much discussion of how to combine facets with query-time
joins (except "nope": http://search-lucene.com/m/QTPadBcnv1). Is it
possible to combine these two features with the above constraints? If
so, how? If not in Lucene 4.6, is there related work in trunk? One
thing I was thinking about last night is that it wouldn't seem to be
too hard to do the faceting for this case by using updatable
NumericDocValues on a dummy parent object, since that shouldn't require
re-indexing.

TIA,

Jon
-- 
Jon Stewart, Principal
(646) 719-0317 | jon@lightboxtechnologies.com | Arlington, VA

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Faceting and Query-time Joins

Posted by Jon Stewart <jo...@lightboxtechnologies.com>.
Child values. There really isn't a true parent, other than to refer to
the collection of children. However, the children are all of different
types and I'd expect that a given facet would only pertain to a given
child, i.e., you're not going to get a second child which involves the
same facet.

Somewhat unusually, I care more about indexing performance than about
query latency. I don't want queries that take too long, of course, but
a second or two or three is fine, and I don't expect much, if any,
concurrency (nor is adding RAM a problem :-). So having to do a little
more work at query time isn't that big a concern if I can avoid
re-indexing.
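
To make "a little more work at query time" concrete, here is a rough
sketch in plain Java of counting a child-side facet over a joined
result set. It is not Lucene's facet API; the Child class, the field
map, and the facetCounts helper are hypothetical names used only for
illustration. Each parent is counted at most once per facet value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class JoinFacetSketch {
    /** Hypothetical child record: the parent's key plus its own fields. */
    static final class Child {
        final String parentId;
        final Map<String, String> fields;

        Child(String parentId, Map<String, String> fields) {
            this.parentId = parentId;
            this.fields = fields;
        }
    }

    /**
     * Counts, per value of facetField, how many of the given parents have
     * at least one child carrying that value.
     */
    static Map<String, Integer> facetCounts(List<Child> index,
                                            Set<String> parentIds,
                                            String facetField) {
        // Track distinct parents per value so a parent is never counted twice.
        Map<String, Set<String>> parentsByValue = new TreeMap<>();
        for (Child c : index) {
            String value = c.fields.get(facetField);
            if (value == null || !parentIds.contains(c.parentId)) {
                continue; // child lacks the facet or is outside the join result
            }
            parentsByValue.computeIfAbsent(value, k -> new HashSet<>())
                          .add(c.parentId);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : parentsByValue.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Child> index = Arrays.asList(
            new Child("p1", Collections.singletonMap("format", "pdf")),
            new Child("p1", Collections.singletonMap("sender", "alice")),
            new Child("p2", Collections.singletonMap("format", "pdf")),
            new Child("p3", Collections.singletonMap("format", "png")));

        // Pretend the join matched parents p1 and p2 only.
        Set<String> joined = new HashSet<>(Arrays.asList("p1", "p2"));
        System.out.println(facetCounts(index, joined, "format"));
    }
}
```

This scans every child per request, which trades query latency for
zero indexing-time cost; that matches the priorities stated above but
would need a smarter structure for large indexes.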


Jon


On Mon, Feb 10, 2014 at 12:04 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> Are you faceting on parent values or child values?
>
> Parent values should be easy; child values are not.
>
> Mike McCandless
>
> http://blog.mikemccandless.com





Re: Faceting and Query-time Joins

Posted by Michael McCandless <lu...@mikemccandless.com>.
Are you faceting on parent values or child values?

Parent values should be easy; child values are not.

Mike McCandless

http://blog.mikemccandless.com

