Posted to solr-user@lucene.apache.org by Grant Ingersoll <gs...@apache.org> on 2007/04/12 23:59:29 UTC

Results per user

Hi,

For a given user, I would like to submit a search and return back  
results that are filtered on a per user basis, for instance to remove  
results that have already been viewed.  I know I could post process  
the results from Solr to do this, but am wondering if a better  
solution is to implement my own request handler that takes in user id  
info and manages a cache of Filters that maintains the bit set info  
on the search side.  Is this a good approach?

Has anyone come across this kind of thing before?  A similar idea  
would be how to implement something like Google's personal search  
results.

Thanks,
Grant

Re: Results per user

Posted by "J.J. Larrea" <jj...@panix.com>.
I wrote the following after hurriedly reading Grant Ingersoll's 
question, and I completely missed the "to remove results that have 
already been viewed" bit.  Which leads me to think what I wrote may 
have no bearing on this issue...  but perhaps it may have bearing on 
someone else's issue?

- J.J.

-----
Under the assumption that there is an untokenized field, say 
UserAccess, with user names or IDs that for each document indicate 
which users can access them...

If you could trust the requesting client to modify the request based 
on the user name or ID, it could either

  - Add an fq=UserAccess:userName argument to every request

  - Create a RequestHandler configuration for each user, putting such 
a fq (with a hardwired username) in an 'appends' section, along with 
any other needed customization:

   <requestHandler name="/users/tony" class="solr.StandardRequestHandler">
     <lst name="appends">
       <str name="fq">UserAccess:tony</str>
     ...
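
A minimal sketch of the first option, assuming the client builds the request URL itself. The host, port, and the UserFilterUrl class are hypothetical; only the fq=UserAccess:userName parameter comes from the text above. User names containing query-syntax characters would need extra escaping, which is not shown.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: appends a per-user filter query to a select URL.
final class UserFilterUrl {
    static String withUserFilter(String selectUrl, String query, String userName) {
        // UserAccess is the untokenized field assumed in the message.
        String fq = "UserAccess:" + userName;
        return selectUrl
            + "?q=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
            + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical host and path, for illustration only.
        System.out.println(
            withUserFilter("http://localhost:8983/solr/select", "ipod", "tony"));
        // prints http://localhost:8983/solr/select?q=ipod&fq=UserAccess%3Atony
    }
}
```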

But if you cannot trust the requesting client and need to do the 
filtering on the Solr side of the divide, then I think you can simply 
subclass and deploy org.apache.solr.servlet.SolrDispatchFilter, such 
that in the execute() method you take the user (e.g. from 
request.getRemoteUser() or some other means), format an fq argument as 
above, and explicitly add it to the params in the SolrQueryRequest. 
While users can add filters to their queries, they would not be able 
to remove the servlet-supplied filter query.
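
The parameter merge described above can be sketched without any Solr classes. This is only a model: the map stands in for the request parameters, ForcedUserFilter is a hypothetical name, and how the merged parameters are pushed back into the SolrQueryRequest is deliberately left out because it depends on the Solr version.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: adds a server-side fq that the client cannot drop.
final class ForcedUserFilter {
    static Map<String, String[]> appendUserFilter(Map<String, String[]> params,
                                                  String remoteUser) {
        // Copy so the caller's parameter map is left untouched.
        Map<String, String[]> out = new LinkedHashMap<>(params);
        String[] existing = out.getOrDefault("fq", new String[0]);
        // Keep every fq the user sent and append the per-user one.
        String[] merged = Arrays.copyOf(existing, existing.length + 1);
        merged[existing.length] = "UserAccess:" + remoteUser;
        out.put("fq", merged);
        return out;
    }
}
```

Because the filter is appended after the client's parameters have been read, a client can narrow its own results further but has no parameter that removes the added fq.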

Regardless of how fq is specified, it would create a cached filter 
for each user. Obviously the filter cache size should be greater than 
the number of simultaneously active users plus the filters they use 
in their queries; inactive users' filters will be evicted and rebuilt 
the next time those users search.
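
To illustrate the sizing point, here is a toy least-recently-used cache, assuming LRU eviction. It is not Solr's filterCache implementation; LruFilterCache is a hypothetical class that only shows why a cache smaller than the set of active users keeps discarding filters that are still needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache keyed by filter query string (hypothetical, not Solr code).
final class LruFilterCache<V> extends LinkedHashMap<String, V> {
    private final int maxEntries;

    LruFilterCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        // Drop the least recently used entry once the cache is over capacity.
        return size() > maxEntries;
    }
}
```

With a capacity of 2, caching filters for users a and b, reading a again, and then adding c evicts b, the user who has been idle longest.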
-----


Re: Results per user

Posted by Chris Hostetter <ho...@fucit.org>.
: I don't use Filters very much so this might be a dumb question, but I
: could overcome the main drawback by hooking into the filter and
: updating its bits without affecting the caching, right?

Not really ... Solr doesn't use Filters the same way as
CachingWrapperFilter does ... it builds DocSets out of them and caches
those for the life of the IndexSearcher (or until the cache gets full and
it needs to expunge something). When a new IndexSearcher is opened, it
auto-warms the new filterCache by executing the existing Filters against
the new IndexSearcher.
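
A toy model of that lifecycle, with no Solr classes involved: SnapshotFilterCache is a hypothetical name, a BitSet stands in for a DocSet, and the evaluator function stands in for running a Filter against one IndexSearcher. It only shows that a cached result stays fixed until a new cache is warmed for a new searcher.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a filter cache that lives as long as one searcher.
final class SnapshotFilterCache {
    // Evaluates a filter against one index snapshot (stand-in for a searcher).
    private final Function<String, BitSet> evaluator;
    private final Map<String, BitSet> cache = new HashMap<>();

    SnapshotFilterCache(Function<String, BitSet> evaluator) {
        this.evaluator = evaluator;
    }

    // Computed once per filter, then served from the cache for this snapshot.
    BitSet get(String filter) {
        return cache.computeIfAbsent(filter, evaluator);
    }

    // Auto-warm: re-run every cached filter against the new snapshot.
    SnapshotFilterCache warmFor(Function<String, BitSet> newEvaluator) {
        SnapshotFilterCache next = new SnapshotFilterCache(newEvaluator);
        for (String filter : cache.keySet()) {
            next.get(filter);
        }
        return next;
    }
}
```

If the data behind a filter changes after the first get(), the old cache keeps returning the original bits; only the cache returned by warmFor() reflects the change.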

: I kind of think I have scaling issues no matter what.  If you do the
: post processing way, then you may have to make repeated fetches to
: Solr in order to get enough results to display.

anything you can do on the client side you can do in a custom request
handler (assuming you can do it in java) so that will at least save you
the overhead of HTTP back and forth with the Solr server ... i was just
trying to think of ways that existing features available to
SolrRequestHandlers could help you more.
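
The post-processing loop under discussion can be sketched as follows, whether it runs in the client or inside a custom request handler. Everything here is an assumption for illustration: UnviewedResults is a hypothetical class, the search function stands in for a paged query returning document ids, and the viewed set stands in for whatever store records what a user has seen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical helper: keeps fetching pages until enough unviewed docs are found.
final class UnviewedResults {
    // search.apply(start, rows) returns up to `rows` ids beginning at offset `start`.
    static List<String> collect(BiFunction<Integer, Integer, List<String>> search,
                                Set<String> viewed, int wanted, int pageSize) {
        List<String> out = new ArrayList<>();
        int start = 0;
        while (out.size() < wanted) {
            List<String> page = search.apply(start, pageSize);
            if (page.isEmpty()) {
                break; // no more results to fetch
            }
            for (String id : page) {
                if (!viewed.contains(id) && out.size() < wanted) {
                    out.add(id);
                }
            }
            start += page.size();
        }
        return out;
    }
}
```

The number of fetches grows with how many top-ranked documents the user has already viewed, which is the scaling concern raised in the quoted text.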




-Hoss


Re: Results per user

Posted by Grant Ingersoll <gs...@apache.org>.
I don't use Filters very much so this might be a dumb question, but I  
could overcome the main drawback by hooking into the filter and  
updating its bits without affecting the caching, right?

I kind of think I have scaling issues no matter what.  If you do the  
post processing way, then you may have to make repeated fetches to  
Solr in order to get enough results to display.

I think I may have to dig a bit deeper into both approaches.

On Apr 12, 2007, at 7:41 PM, Chris Hostetter wrote:

>
> : > results that are filtered on a per user basis, for instance to remove
> : > results that have already been viewed.  I know I could post process
> : > the results from Solr to do this, but am wondering if a better
> : > solution is to implement my own request handler that takes in user id
> : > info and manages a cache of Filters that maintains the bit set info
> : > on the search side.  Is this a good approach?
>
> : One issue with your approach would be scaling... if you have multiple
> : searchers, how do you communicate this user data between them?
>
> If the filtering logic can be implemented in a Filter class, you might
> just want to rely on the built-in filterCache (you'd still need a custom
> request handler that knows about your custom Filter)
>
> the plus side is you'd get all the benefits of Solr's filter cache (cached
> as long as the same searcher is used, autowarmed when a new searcher is
> opened)
>
> the down side is you'd get all the benefits of Solr's filter cache (cached
> as long as the same searcher is used -- so it wouldn't notice if you'd
> updated your datastore to remove a bunch of files from their filter)
>
> -Hoss
>



Re: Results per user

Posted by Chris Hostetter <ho...@fucit.org>.
: > results that are filtered on a per user basis, for instance to remove
: > results that have already been viewed.  I know I could post process
: > the results from Solr to do this, but am wondering if a better
: > solution is to implement my own request handler that takes in user id
: > info and manages a cache of Filters that maintains the bit set info
: > on the search side.  Is this a good approach?

: One issue with your approach would be scaling... if you have multiple
: searchers, how do you communicate this user data between them?

If the filtering logic can be implemented in a Filter class, you might
just want to rely on the built-in filterCache (you'd still need a custom
request handler that knows about your custom Filter)

the plus side is you'd get all the benefits of Solr's filter cache (cached
as long as the same searcher is used, autowarmed when a new searcher is
opened)

the down side is you'd get all the benefits of Solr's filter cache (cached
as long as the same searcher is used -- so it wouldn't notice if you'd
updated your datastore to remove a bunch of files from their filter)





-Hoss


Re: Results per user

Posted by Yonik Seeley <yo...@apache.org>.
On 4/12/07, Grant Ingersoll <gs...@apache.org> wrote:
> For a given user, I would like to submit a search and return back
> results that are filtered on a per user basis, for instance to remove
> results that have already been viewed.  I know I could post process
> the results from Solr to do this, but am wondering if a better
> solution is to implement my own request handler that takes in user id
> info and manages a cache of Filters that maintains the bit set info
> on the search side.  Is this a good approach?

I haven't done anything like that.
One issue with your approach would be scaling... if you have multiple
searchers, how do you communicate this user data between them?

Storing the user info in a database and post-processing might be the easiest.

-Yonik