Posted to solr-user@lucene.apache.org by Floyd Wu <fl...@gmail.com> on 2011/11/23 04:48:29 UTC

Separate ACL and document index

Hi there,

Is it possible to keep the ACL index separate from the document index
and still search by user role in Solr?

My current implementation indexes the ACL together with the document,
but the documents themselves change frequently. I have to rebuild the
index every time an ACL changes. That is heavy for the whole system
because there are so many documents and their content is huge.

Does anyone have a solution to this problem? I've been reading the
mailing list for a while, but there seems to be no suitable solution
for me.

I want each user's search to return only the results that his role
allows, but I don't want to re-index a document every time its ACL
changes.

Is it possible to perform a join, like in a database, to achieve
this? If so, how?

Thanks

Floyd

Re: Separate ACL and document index

Posted by Floyd Wu <fl...@gmail.com>.
I've been reading a lot about document-level security:
https://issues.apache.org/jira/browse/SOLR-1895
https://issues.apache.org/jira/browse/SOLR-1872
https://issues.apache.org/jira/browse/SOLR-1834

But I'm not fully sure that these patches solve my problem.
It seems that changing the ACL of an existing document would still
require rebuilding the index "with document content".

It makes no sense to rebuild when I only change the ACL.

Any ideas? Or am I just misunderstanding these patches?

Floyd




Re: Separate ACL and document index

Posted by Erick Erickson <er...@gmail.com>.
There's another approach that *may* help, see:
https://issues.apache.org/jira/browse/SOLR-2429

This is probably suitable if you don't have a zillion results
to sort through. The idea here is that you can specify a
filter query that only executes after all the other parts
of a query are done, i.e. it is only calculated for documents
that have been selected and have passed through the lower-cost
filter queries etc. You could create a custom component
that calculates, on the fly, whether a user has access to the
docs.
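
To make the evaluation order concrete, here is a minimal Python model of
the idea. It is not Solr code: in Solr the real thing would be a custom
Java plugin, and my understanding of SOLR-2429 is that such a filter is
marked non-cached and given a high cost so that it runs last. Every name
below (PERMISSIONS, can_read, search_with_post_filter, the sample docs)
is made up for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, object]
Predicate = Callable[[Doc], bool]


def search_with_post_filter(
    docs: Iterable[Doc],
    main_query: Predicate,
    cheap_filters: List[Predicate],
    acl_check: Predicate,
) -> Iterator[Doc]:
    """Yield docs matching the query and cheap filters, checking ACLs last."""
    for doc in docs:
        if not main_query(doc):
            continue
        if not all(f(doc) for f in cheap_filters):
            continue
        # The expensive check runs only for documents that survived
        # the main query and every cheaper filter.
        if acl_check(doc):
            yield doc


# Hypothetical permission store that lives outside the index.
PERMISSIONS = {"alice": {1, 3}, "bob": {2}}
acl_calls: List[object] = []  # records which docs the costly check saw


def can_read(user: str) -> Predicate:
    def check(doc: Doc) -> bool:
        acl_calls.append(doc["id"])
        return doc["id"] in PERMISSIONS.get(user, set())
    return check


DOCS: List[Doc] = [
    {"id": 1, "type": "pdf", "text": "solr acl"},
    {"id": 2, "type": "pdf", "text": "solr join"},
    {"id": 3, "type": "doc", "text": "solr acl"},
    {"id": 4, "type": "pdf", "text": "lucene"},
]

hits = list(search_with_post_filter(
    DOCS,
    main_query=lambda d: "solr" in str(d["text"]),
    cheap_filters=[lambda d: d["type"] == "pdf"],
    acl_check=can_read("alice"),
))
```

In this toy run the ACL check is called for only two of the four
documents, because the other two are rejected earlier. Changing
PERMISSIONS changes what a user sees without touching DOCS, which is
the point of doing the check at query time.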

Yet another approach is to define the problem away. If you can
define a reasonably small number of *groups* (where small
might be 100s), assign users to groups, and
grant/deny access based on group membership, you can
then change which groups a user belongs to and have their access
controlled without re-indexing the doc. You do have to get the list
of groups that the user belongs to from some external source and
use *that* as your filter.
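
A rough sketch of that filter, in Python, assuming each document was
indexed with a multi-valued field listing the groups allowed to read it.
The field name acl_group, the MEMBERSHIP dict and both function names
are assumptions for illustration, not anything Solr defines.

```python
from typing import Iterable


def escape_term(value: str) -> str:
    """Quote a value as a phrase, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_group_filter(groups: Iterable[str], field: str = "acl_group") -> str:
    """Return a filter query matching docs readable by any of the groups."""
    terms = sorted({escape_term(g) for g in groups})
    if not terms:
        # A user with no groups must see nothing, not everything.
        # "-*:*" is intended to match no documents.
        return "-*:*"
    return f"{field}:({' OR '.join(terms)})"


# Stand-in for the external source of group membership.
MEMBERSHIP = {"floyd": {"sales", "hr"}}

fq_before = build_group_filter(MEMBERSHIP["floyd"])

# Revoking a membership changes the next filter; no document is touched.
MEMBERSHIP["floyd"].discard("hr")
fq_after = build_group_filter(MEMBERSHIP["floyd"])
```

The resulting string would be sent as a filter query alongside the
user's search. Note the limit of the trick: it avoids re-indexing when
a user's memberships change, but changing which groups a document
belongs to still means updating that document.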

But the ACL problem is yucky when it gets very complex.

Best
Erick



Re: Separate ACL and document index

Posted by Floyd Wu <fl...@gmail.com>.
Thank you for sharing. My current solution is similar to 2).
But my problem is that the ACL is early-binding (meaning the ACL is
embedded in the document index when I build it). I don't want to
rebuild the full index (a Lucene/Solr Document with PDF content and
ACL) when the front end changes only the permission settings.

It seems solution 2) has the same problem.

Floyd



Re: Separate ACL and document index

Posted by Robert Stewart <bs...@gmail.com>.
I have used two different approaches:

1) Store the mapping from users to documents in an external database
such as MySQL.  At search time, look up the mapping from the user to
unique doc IDs or group IDs, and then build a query or doc set that
you can cache in the Solr process for some period.  Then use that as a
filter in your search.  This approach is more involved and non-trivial
to implement well, but it is better if you have lots of ACLs per user.
I used this in a system with over 100 million docs and approx. 20,000
ACLs per user.  The ACL mapped each user to a set of group IDs, and
each group could have 10,000+ documents.

2) Generate a query filter that you pass to Solr as part of the
search.  It could potentially be a pretty large query if the user has
granular ACLs over many documents or groups.  I've seen it work OK
with up to 1000 or so ACLs per user query.  You build that filter
query on the client, using some external database to look up the
user's ACLs before sending the request to Solr.
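
As a rough illustration of 2), with the "cache it for some period" idea
borrowed from 1), here is a small client-side Python sketch. It uses an
in-memory SQLite table as a stand-in for the external MySQL database.
The table user_groups, its columns, the index field group_id and the
class AclFilterCache are all hypothetical names, and the cache lives in
the client rather than inside the Solr process.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Tuple


def load_group_ids(conn: sqlite3.Connection, user_name: str) -> List[int]:
    """Look up the group IDs a user belongs to in the external database."""
    rows = conn.execute(
        "SELECT group_id FROM user_groups "
        "WHERE user_name = ? ORDER BY group_id",
        (user_name,),
    ).fetchall()
    return [row[0] for row in rows]


class AclFilterCache:
    """Build per-user filter queries and reuse them for ttl_seconds."""

    def __init__(self, conn: sqlite3.Connection, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._conn = conn
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def filter_for(self, user_name: str) -> str:
        now = self._clock()
        cached = self._entries.get(user_name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh, skip the database
        ids = load_group_ids(self._conn, user_name)
        if ids:
            fq = "group_id:(" + " OR ".join(str(i) for i in ids) + ")"
        else:
            fq = "-*:*"  # no groups: intended to match no documents
        self._entries[user_name] = (now, fq)
        return fq


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_groups (user_name TEXT, group_id INTEGER)")
conn.executemany("INSERT INTO user_groups VALUES (?, ?)",
                 [("floyd", 7), ("floyd", 42)])

fake_now = [0.0]  # controllable clock so the example is deterministic
cache = AclFilterCache(conn, ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.filter_for("floyd")
conn.execute("DELETE FROM user_groups WHERE user_name = 'floyd' "
             "AND group_id = 42")
stale = cache.filter_for("floyd")   # within the TTL: old filter is reused
fake_now[0] = 61.0
fresh = cache.filter_for("floyd")   # TTL expired: database is read again
```

The trade-off is visible in the stale value: a revoked permission keeps
working until the cached entry expires, so the TTL has to be chosen
with that delay in mind.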

Bob

