Posted to solr-user@lucene.apache.org by "Burkamp, Christian" <C....@Ceyoniq.com> on 2007/04/20 13:11:01 UTC

Avoiding caching of special filter queries

Hi,

I'm using filter queries to implement document-level security with Solr.
Caching filters separately from queries comes in handy,
and the system performs well once all the filters for the users of the
system are stored in the cache.
However, I'm storing the full document content in the index for the purpose
of highlighting. In addition to the standard snippet highlighting, I
would like to offer a feature that displays the highlighted full
document content. I can add a filter query to select just the needed
document by ID, but this filter would go into the filter cache as well,
possibly throwing out some of the other useful filters.
Is there a way to get the single document with highlighting info but
without polluting the filter cache?
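
To make the concern concrete, here is a toy sketch of the eviction effect. It is not Solr code: the LRUFilterCache class, its capacity of 3, and the filter strings are all made up for illustration, and it assumes the filter cache evicts the least recently used entry.

```python
from collections import OrderedDict


class LRUFilterCache:
    """Toy LRU cache keyed by filter query string (illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get_or_compute(self, fq):
        if fq in self.entries:
            self.entries.move_to_end(fq)      # mark as most recently used
            return True                       # cache hit
        self.entries[fq] = object()           # stand-in for a computed doc set
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return False                          # cache miss


cache = LRUFilterCache(capacity=3)
for user in ("erik", "mike", "christian"):
    cache.get_or_compute(f"user:{user}")      # warm the per-user filters

# Two one-off document filters push out two of the per-user filters.
cache.get_or_compute('id:"doc 1"')
cache.get_or_compute('id:"doc 2"')

remaining = list(cache.entries)
```

In this sketch only one per-user filter survives, so the next request from either of the other two users has to recompute its security filter.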

-- Christian


Re: Avoiding caching of special filter queries

Posted by "Burkamp, Christian" <C....@Ceyoniq.com>.
Hi Erik and Mike,

Thank you for clarifying.
Erik's pretty obvious solution works for now.
I will try the more complex approach if repeating the search turns out to be a performance issue.

Thanks a lot,

--Christian


Re: Avoiding caching of special filter queries

Posted by Mike Klaas <mi...@gmail.com>.
On 4/20/07, Burkamp, Christian <C....@ceyoniq.com> wrote:
> Hi Erik,
>
> No, what I need to do is
>
>     &q="my funny query"&fq=user:erik&fq=id:"doc Id"&hl=on ...
>
> This is because the StandardRequestHandler needs the original query to do proper highlighting.
> The user gets his paginated result page with his next 10 hits. He can then select one document for highlighting. Then I just repeat the last request with an additional filter query to select this one document and add the highlighting parameters.

Erik posted the way to do this that works with out-of-the-box Solr.  If
you want to do it with no additional querying (not even for the docid
filter), you can use an approach like this (from a previous email):

 - Turn on lazy field loading.  For best effect, compress the main text field.
 - Create a new request handler that is similar to dismax but uses
the query for highlighting only.  A separate parameter specifies the
document keys to highlight.
 - Highlighting requires the internal Lucene document id, not the
document key, and it can be slow to execute queries to get the ids.  I
created a custom cache that maps doc keys -> doc ids; I populate it
during the main query and grab the ids from it during the
highlighting step.
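
The key -> id cache from the last step could look roughly like the sketch below. This only illustrates the idea and is not the actual handler code: DocIdCache, populate, lookup and clear are hypothetical names, and a real Solr plugin would be written in Java against Solr's own cache interfaces.

```python
class DocIdCache:
    """Maps application document keys to internal index ids (hypothetical)."""

    def __init__(self):
        self._ids = {}

    def populate(self, hits):
        # hits: iterable of (doc_key, internal_id) pairs from the main query
        for doc_key, internal_id in hits:
            self._ids[doc_key] = internal_id

    def lookup(self, doc_key):
        # Returns None on a miss; the caller then falls back to a query.
        return self._ids.get(doc_key)

    def clear(self):
        # Internal ids are only valid for one view of the index,
        # so the mapping must be dropped whenever the index changes.
        self._ids.clear()


cache = DocIdCache()
cache.populate([("doc A", 17), ("doc B", 42)])  # main query step
hit = cache.lookup("doc B")                     # highlighting step
miss = cache.lookup("doc C")                    # key never seen by a query
```

Because internal Lucene ids can change when the index is modified, a cache like this has to be discarded along with the searcher it was filled from.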

-Mike

Re: Re: Avoiding caching of special filter queries

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Apr 20, 2007, at 10:02 AM, Burkamp, Christian wrote:
> No, what I need to do is
>
>     &q="my funny query"&fq=user:erik&fq=id:"doc Id"&hl=on ...

No, you don't. What you need is:

	&q="my funny query" AND id:"doc Id"&fq=user:erik&hl=on

> This is because the StandardRequestHandler needs the original query  
> to do proper highlighting.
> The user gets his paginated result page with his next 10 hits. He  
> can then select one document for highlighting. Then I just repeat  
> the last request with an additional filter query to select this one  
> document and add the highlighting parameters.

I think the above will suit this use case just fine.  No?
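
For anyone building this request in client code, a minimal sketch follows. The function name highlight_request and the host and handler path are assumptions for illustration; the parameters (q, fq, hl) are the ones from the request above. The user query is wrapped in parentheses, which is an addition here, so that the AND applies to the whole query.

```python
from urllib.parse import urlencode


def highlight_request(base_url, user_query, doc_id, user):
    # Escape backslashes and quotes so the id stays a single phrase term.
    safe_id = doc_id.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        # The document restriction goes into q, not into a second fq,
        # so it does not create a filter cache entry.
        ("q", f'({user_query}) AND id:"{safe_id}"'),
        ("fq", f"user:{user}"),  # security filter, still cached as before
        ("hl", "on"),
    ]
    return f"{base_url}?{urlencode(params)}"


url = highlight_request(
    "http://localhost:8983/solr/select",  # assumed host and handler path
    '"my funny query"',
    "doc Id",
    "erik",
)
```

urlencode takes care of the URL escaping, so the quotes and spaces in the query survive the trip to the server.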

	Erik




Re: Avoiding caching of special filter queries

Posted by "Burkamp, Christian" <C....@Ceyoniq.com>.
Hi Erik,

No, what I need to do is 

    &q="my funny query"&fq=user:erik&fq=id:"doc Id"&hl=on ...

This is because the StandardRequestHandler needs the original query to do proper highlighting.
The user gets his paginated result page with his next 10 hits. He can then select one document for highlighting. Then I just repeat the last request with an additional filter query to select this one document and add the highlighting parameters.

-- Christian



Re: Avoiding caching of special filter queries

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote:
> I'm using filter queries to implement document level security with  
> solr.
> The caching mechanism for filters separate from queries comes in handy
> and the system performs well once all the filters for the users of the
> system are stored in the cache.
> However, I'm storing full document content in the index for the  
> purpose
> of highlighting. In addition to the standard snippet highlighting I
> would like to offer a feature that displays the highlighted full
> document content. I can add a filter query to select just the needed
> document by ID, but this filter would go into the filter cache as well,
> possibly throwing out some of the other useful filters.
> Is there a way to get the single document with highlighting info but
> without polluting the filter cache?

Correct me if I'm wrong, but here's my understanding...

    &q=id:"doc id"&fq=user:erik

is what you'd want to do.  q=id:"doc id" won't go into the filter cache,
but rather into the query result cache, and the document itself into the
document cache.  So you won't risk bumping things out of the filter cache by
using queries.

	Erik