Posted to solr-user@lucene.apache.org by Sujatha Arun <su...@gmail.com> on 2011/06/14 09:18:56 UTC
Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Hello,
Our use case is as follows.
We run several Solr webapps in one JVM, each webapp catering to one client.
Each client has its own users, who can purchase products from the site. Once
they purchase, they have full access to the products; otherwise they can only
view details.
The products are not tied to the user at the document level, simply because
once the purchase duration of a product expires, the user will no longer
have access to that product.
So when a logged-in user searches only within the products that he has
access to, the search translates to something like the query below. The
product ids are obtained from the DB for that particular user and can run
into n number.
<search term> &fq=product_id:(100 10001 ......n number)
But we are currently running into the "too many boolean clauses" error. We
are also not able to tie the users into roles, as a user is simply anyone who
comes to the site and purchases a product.
We looked at two existing solutions. With SOLR-1872, we have to specify the
user in an ACL file, and querying for allow and deny also translates to what
we are trying to do above.
In the case of SOLR-1834, we are required to use a crawler (Apache ManifoldCF)
for indexing the permissions (and also the data) into the document and then
querying on it. This will also not work in our scenario: we have n webapps
with the same requirement, so it would be tedious to set this up for each
webapp, and there is also the requirement that once the user's permission for
a product is revoked, he should no longer be able to search on it within
his subscribed products.
Any pointers would be helpful, and sorry about the lengthy description.
Regards
Sujatha
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Peter,
Thanks for the clarification.
The reason I specifically asked is that we have many search instances
(200+) on a single JVM.
Each of these instances could have <n> users, and each user can subscribe to
<n> products. Now, according to your suggestion, I need to maintain an
in-memory list of all users and their subscribed products for each of the
instances and use this list to filter a given query. We are maintaining
the user and subscription details in a DB.
I was wondering whether it would instead make more sense (with respect to
memory) to dynamically get the subscribed product ids whenever a user
logs in (as access is only for the user session) and use this data to
filter the query?
And we really do not have the budget, and hence won't be able to contract LI
for this, though I will certainly need to get some Java experts' help within
my org.
Thanks for your time
Regards
Sujatha
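One way to picture the per-session idea in this message is a small cache that is filled when the user logs in and dropped when the session ends. The sketch below is only an illustration under stated assumptions: `SessionEntitlements`, `onLogin`, `onLogout` and the `loader` function (standing in for the DB lookup) are hypothetical names, not part of Solr or of SOLR-1872.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: holds product ids only for users with a live session.
class SessionEntitlements {
    private final Map<String, Set<Integer>> byUser = new ConcurrentHashMap<>();
    // Stands in for the DB query that returns a user's subscribed product ids.
    private final Function<String, List<Integer>> loader;

    SessionEntitlements(Function<String, List<Integer>> loader) {
        this.loader = loader;
    }

    // Called at login: fetch the user's current product ids once.
    void onLogin(String userId) {
        byUser.put(userId,
                Collections.unmodifiableSet(new TreeSet<>(loader.apply(userId))));
    }

    // Called at logout or session expiry: free the memory again.
    void onLogout(String userId) {
        byUser.remove(userId);
    }

    // Product ids to filter on; empty if the user has no live session.
    Set<Integer> productIdsFor(String userId) {
        return byUser.getOrDefault(userId, Collections.emptySet());
    }

    // Number of users currently held in memory.
    int liveSessions() {
        return byUser.size();
    }
}
```

With this shape, memory grows with the number of users logged in at the same time rather than with the total number of users in the DB, which is the trade-off the question is about.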
On Wed, Jun 15, 2011 at 11:29 PM, Peter Sturge <pe...@gmail.com>wrote:
> Hi,
>
> By in-memory, I mean you hold a list of users (+ some other parameters
> like order number, expiry, what ever else you need) in one of those
> Greek HashMaps, and use this list to determine what query
> parameters/results will be processed for a given search request
> (SOLR-1872 reads an acl file to populate such a list). So if you had
> 500 users who had purchased stuff at a given moment, you'd have 500
> entries in the table that hold the relevant data to filter/not filter
> searches/results.
> This won't cause a memory problem unless you have a million users and
> stored their autobiography in each entry.
> I wouldn't call this sort of thing a novice or even journeyman's task,
> you would definitely need to know about using and maintaining tables
> etc.
> Would you be able to contract someone to do the work on your behalf?
> There are some excellent resources around, and Lucid would certainly
> do a great job, but of course you'd need budget for this approach.
> Alternatively, maybe you can tap some java expertise within your
> organization to help out?
>
> HTH,
> Peter
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Peter Sturge <pe...@gmail.com>.
Hi,
By in-memory, I mean you hold a list of users (+ some other parameters
like order number, expiry, what ever else you need) in one of those
Greek HashMaps, and use this list to determine what query
parameters/results will be processed for a given search request
(SOLR-1872 reads an acl file to populate such a list). So if you had
500 users who had purchased stuff at a given moment, you'd have 500
entries in the table that hold the relevant data to filter/not filter
searches/results.
This won't cause a memory problem unless you have a million users and
stored their autobiography in each entry.
I wouldn't call this sort of thing a novice or even journeyman's task,
you would definitely need to know about using and maintaining tables
etc.
Would you be able to contract someone to do the work on your behalf?
There are some excellent resources around, and Lucid would certainly
do a great job, but of course you'd need budget for this approach.
Alternatively, maybe you can tap some java expertise within your
organization to help out?
HTH,
Peter
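To make the "table" idea in this message a little more concrete, here is a minimal Java sketch of such an in-memory list. The names `AccessTable`, `Purchase`, `grant` and `allowedProducts` are made up for illustration, and the expiry is modelled as a plain epoch-millisecond timestamp; this is not how SOLR-1872 itself is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical entry: one purchased product and when access to it ends.
class Purchase {
    final int productId;
    final long expiresAtMillis;

    Purchase(int productId, long expiresAtMillis) {
        this.productId = productId;
        this.expiresAtMillis = expiresAtMillis;
    }
}

// Hypothetical table: username -> purchases, kept in RAM.
class AccessTable {
    private final Map<String, List<Purchase>> table = new HashMap<>();

    // Record that 'user' may access 'productId' until 'expiresAtMillis'.
    synchronized void grant(String user, int productId, long expiresAtMillis) {
        table.computeIfAbsent(user, k -> new ArrayList<>())
             .add(new Purchase(productId, expiresAtMillis));
    }

    // Product ids the user may still search at time 'nowMillis'.
    synchronized Set<Integer> allowedProducts(String user, long nowMillis) {
        Set<Integer> allowed = new TreeSet<>();
        for (Purchase p : table.getOrDefault(user, new ArrayList<>())) {
            if (p.expiresAtMillis > nowMillis) {
                allowed.add(p.productId);
            }
        }
        return allowed;
    }
}
```

Each entry here is two numbers, so a table for 500 users with a handful of purchases each stays very small. Expired purchases simply stop being returned, without touching the index.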
On Wed, Jun 15, 2011 at 6:17 PM, Sujatha Arun <su...@gmail.com> wrote:
> Thanks, Peter.
>
> I am not a Java programmer, and hence the code seems all Greek and Latin to
> me. I do have some basic knowledge, but all this Map, HashMap,
> Hashlist, NamedList I don't understand.
>
> However, I would like to implement the solution that you have mentioned, so
> if you have any pointers for me, that would be great. I will also try to dig
> deeper into Java.
>
> What is meant by in-memory? Is it the RAM? So if I have <n>
> concurrent users, each having <n> products subscribed, what would be the
> impact on memory?
>
>
>
> Regards
> Sujatha
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Thanks, Peter.
I am not a Java programmer, and hence the code seems all Greek and Latin to
me. I do have some basic knowledge, but all this Map, HashMap,
Hashlist, NamedList I don't understand.
However, I would like to implement the solution that you have mentioned, so
if you have any pointers for me, that would be great. I will also try to dig
deeper into Java.
What is meant by in-memory? Is it the RAM? So if I have <n>
concurrent users, each having <n> products subscribed, what would be the
impact on memory?
Regards
Sujatha
On Tue, Jun 14, 2011 at 5:43 PM, Peter Sturge <pe...@gmail.com>wrote:
> SOLR-1872 doesn't add discrete booleans to the query, it does it
> programmatically, so you shouldn't see this problem. (if you have a
> look at the code, you'll see how it filters queries)
> I suppose you could modify SOLR-1872 to use an in-memory,
> dynamically-updated user list (+ associated filters) instead of using
> the acl file.
> This would give you the 'changing users' and 'expiry' functionality you
> need.
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Peter Sturge <pe...@gmail.com>.
SOLR-1872 doesn't add discrete booleans to the query, it does it
programmatically, so you shouldn't see this problem. (if you have a
look at the code, you'll see how it filters queries)
I suppose you could modify SOLR-1872 to use an in-memory,
dynamically-updated user list (+ associated filters) instead of using
the acl file.
This would give you the 'changing users' and 'expiry' functionality you need.
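The difference between listing every product id as a boolean clause and checking access in code can be sketched as below. This is not SOLR-1872's actual code: `ResultFilter` and `filterHits` are hypothetical names, and search hits are reduced to bare product ids to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical post-filter: keeps only hits whose product id the user owns.
class ResultFilter {
    static List<Integer> filterHits(List<Integer> hitProductIds, Set<Integer> allowed) {
        List<Integer> kept = new ArrayList<>();
        for (Integer id : hitProductIds) {
            // One set lookup per hit; the size of 'allowed' adds no query clauses.
            if (allowed.contains(id)) {
                kept.add(id);
            }
        }
        return kept;
    }
}
```

Because the allowed ids live in a set rather than in the query string, the clause limit does not come into play in this sketch, however many products a user has purchased.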
On Tue, Jun 14, 2011 at 10:08 AM, Sujatha Arun <su...@gmail.com> wrote:
> Thanks, Peter, for your input.
>
> I would really like a document- and schema-agnostic solution, as in
> SOLR-1872.
>
> Am I right in my assumption that SOLR-1872 is the same as the solution that
> we currently have, where we add a filter query of the products to the
> original query, and hence SOLR-1872 will also run into the "too many
> boolean clauses" error?
>
> Regards
> Sujatha
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Thanks, Peter, for your input.
I would really like a document- and schema-agnostic solution, as in
SOLR-1872.
Am I right in my assumption that SOLR-1872 is the same as the solution that
we currently have, where we add a filter query of the products to the
original query, and hence SOLR-1872 will also run into the "too many
boolean clauses" error?
Regards
Sujatha
On Tue, Jun 14, 2011 at 1:53 PM, Peter Sturge <pe...@gmail.com>wrote:
> Hi,
>
> SOLR-1834 is good when the original documents' ACL is accessible.
> SOLR-1872 is good where the usernames are persistent - neither of
> these really fit your use case.
> It sounds like you need more of an 'in-memory', transient access
> control mechanism. Does the access have to exist beyond the user's
> session (or the Solr vm session)?
> Your best bet is probably something like a custom SearchComponent or
> similar, that keeps track of user purchases, and either adjusts/limits
> the query or the results to suit.
> With your own module in the query chain, you can then decide when the
> 'expiry' is, and limit results accordingly.
>
> SearchComponents are pretty easy to write and integrate. Have a look at:
> http://wiki.apache.org/solr/SearchComponent
> for info on SearchComponent and its usage.
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Peter Sturge <pe...@gmail.com>.
Hi,
SOLR-1834 is good when the original documents' ACL is accessible.
SOLR-1872 is good where the usernames are persistent - neither of
these really fit your use case.
It sounds like you need more of an 'in-memory', transient access
control mechanism. Does the access have to exist beyond the user's
session (or the Solr vm session)?
Your best bet is probably something like a custom SearchComponent or
similar, that keeps track of user purchases, and either adjusts/limits
the query or the results to suit.
With your own module in the query chain, you can then decide when the
'expiry' is, and limit results accordingly.
SearchComponents are pretty easy to write and integrate. Have a look at:
http://wiki.apache.org/solr/SearchComponent
for info on SearchComponent and its usage.
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Constantijn,
I am aware of this, and we have already increased the max boolean clauses to
<3500> from the default <1200> for all our 200+ instances.
But the requirement is that we could have <n> products, running
to several thousands, for each of the instances. Since <n> is not defined
and could be different for each of our instances, this will not scale, and
there is also the performance impact of so many boolean clauses.
Regards
Sujatha
On Fri, Jun 17, 2011 at 2:58 PM, Constantijn Visinescu
<ba...@gmail.com>wrote:
> Just to chip in my 2 cents:
>
> You know you can increase the max number of boolean clauses in the
> configuration files?
> Depending on your situation it might not be a permanent fix, but it
> could provide some instant relief.
>
> Constantijn
>
>
> On Fri, Jun 17, 2011 at 11:19 AM, Peter Sturge <pe...@gmail.com> wrote:
> > You'll need to be a bit careful using joins, as the performance hit
> > can be significant if you have lots of cross-referencing to do, which
> > I believe you would given your scenario.
> >
> > Your table could be set up to use the username as the key (for fast
> > lookup), then map these to your own data class or collection or
> > similar to hold your other information: products, expiry etc.
> > By using your own data class, it's then easy to extend it later if you
> > want to add additional parameters. (for example: HashMap<String,
> > MyDataClass>)
> >
> > When a search comes in, the user is looked up to retrieve the data
> > class, then its contents (as defined by you) is examined and the query
> > is processed/filtered appropriately.
> >
> > You'll need a bootstrap mechanism for populating the list in the first
> > place. One thing worth looking at is lazy loading - i.e. the first
> > time a user does a search (you look up the user in the table, and it
> > isn't there), you load the data class (maybe from your DB, a file, or
> > index), then add it to the table. This is good if you have tens of
> > thousands or millions of users, but only a handful are actually
> > searching, some perhaps very rarely.
> >
> > If you do have millions of users, and your data class has heavy
> > requirements (e.g. many thousands of products + info etc.), you might
> > want to 'time-out' in-memory table entries, if the table gets really
> > huge - it depends on the usage of your system. (you can run a
> > synchronized cleanup thread to do this if you deemed it necessary).
> >
> >
> > On Fri, Jun 17, 2011 at 6:06 AM, Sujatha Arun <su...@gmail.com>
> wrote:
> >> Alexey,
> >>
> >> Do you mean that we have current Index as it is and have a separate
> core
> >> which has only the user-id ,product-id relation and at while querying
> ,do a
> >> join between the two cores based on the user-id.
> >>
> >>
> >> This would involve us to Index/delete the product as and when the user
> >> subscription for a product changes ,This would involve some amount of
> >> latency if the Indexing (we have a queue system for Indexing across the
> >> various instances) or deletion is delayed
> >>
> >> IF we want to go ahead with this solution ,We currently are using solr
> 1.3
> >> , so is this functionality available as a patch for solr 1.3?Would it
> be
> >> possible to do with a separate Index instead of a core ,then I can
> create
> >> only one Index common for all our instances and then use this instance
> to
> >> do the join.
> >>
> >> Thanks
> >> Sujatha
> >>
> >> On Thu, Jun 16, 2011 at 9:27 PM, Alexey Serba <as...@gmail.com> wrote:
> >>
> >>> > So a search for a product once the user logs in and searches for only
> the
> >>> > products that he has access to Will translate to something like this
> .
> >>> ,the
> >>> > product ids are obtained form the db for a particular user and can
> run
> >>> > into n number.
> >>> >
> >>> > <search term> &fq=product_id(100 10001 ......n number)
> >>> >
> >>> > but we are currently running into too many Boolean expansion error
> .We
> >>> are
> >>> > not able to tie the user also into roles as each user is mainly any
> one
> >>> who
> >>> > comes to site and purchases a product .
> >>>
> >>> I'm wondering if new trunk Solr join functionality can help here.
> >>>
> >>> * http://wiki.apache.org/solr/Join
> >>>
> >>> In theory you can index your products (product_id, ...) and
> >>> user_id-product many-to-many relation (user_product_id, user_id) into
> >>> signle/different cores and then do join, like
> >>> f=search terms&fq={!join from=product_id
> to=user_product_id}user_id:10101
> >>>
> >>> But I haven't tried that, so I'm just speculating.
> >>>
> >>
> >
>
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Constantijn Visinescu <ba...@gmail.com>.
Just to chip in my 2 cents:
You know you can increase the max number of boolean clauses in the
configuration files?
Depending on your situation it might not be a permanent fix, but it
could provide some instant relief.
Constantijn
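A rough Java sketch of where the limit bites: each product id in the filter becomes one Boolean clause, and the stock solrconfig.xml sets maxBooleanClauses to 1024. The class and method names here are hypothetical, and the constant only mirrors that stock value; your instance may be configured differently.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper: builds the product_id filter and checks its clause count. */
final class ProductFilter {
    // Assumption: mirrors the stock <maxBooleanClauses> value; check your solrconfig.xml.
    static final int DEFAULT_MAX_CLAUSES = 1024;

    /** Returns e.g. "product_id:(100 10001)"; each id contributes one Boolean clause. */
    static String build(List<Long> productIds) {
        if (productIds.isEmpty()) {
            throw new IllegalArgumentException("no product ids to filter on");
        }
        StringJoiner ids = new StringJoiner(" ", "product_id:(", ")");
        for (Long id : productIds) {
            ids.add(Long.toString(id));
        }
        return ids.toString();
    }

    /** True when the filter would need more clauses than the given limit allows. */
    static boolean exceedsLimit(List<Long> productIds, int maxClauses) {
        return productIds.size() > maxClauses;
    }
}
```

If exceedsLimit returns true for a typical user, raising the limit only postpones the problem, which is why the other approaches in this thread are worth a look.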
On Fri, Jun 17, 2011 at 11:19 AM, Peter Sturge <pe...@gmail.com> wrote:
> You'll need to be a bit careful using joins, as the performance hit
> can be significant if you have lots of cross-referencing to do, which
> I believe you would given your scenario.
>
> Your table could be set up to use the username as the key (for fast
> lookup), then map these to your own data class or collection or
> similar to hold your other information: products, expiry etc.
> By using your own data class, it's then easy to extend it later if you
> want to add additional parameters. (for example: HashMap<String,
> MyDataClass>)
>
> When a search comes in, the user is looked up to retrieve the data
> class, then its contents (as defined by you) are examined and the query
> is processed/filtered appropriately.
>
> You'll need a bootstrap mechanism for populating the list in the first
> place. One thing worth looking at is lazy loading - i.e. the first
> time a user does a search (you look up the user in the table, and it
> isn't there), you load the data class (maybe from your DB, a file, or
> index), then add it to the table. This is good if you have tens of
> thousands or millions of users, but only a handful are actually
> searching, some perhaps very rarely.
>
> If you do have millions of users, and your data class has heavy
> requirements (e.g. many thousands of products + info etc.), you might
> want to 'time-out' in-memory table entries, if the table gets really
> huge - it depends on the usage of your system. (you can run a
> synchronized cleanup thread to do this if you deem it necessary).
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Thanks, Peter.
This very much seems to be the solution that I should go forward with.
Thanks for your time and clear explanation.
Regards
Sujatha
On Fri, Jun 17, 2011 at 2:49 PM, Peter Sturge <pe...@gmail.com> wrote:
> You'll need to be a bit careful using joins, as the performance hit
> can be significant if you have lots of cross-referencing to do, which
> I believe you would given your scenario.
>
> Your table could be set up to use the username as the key (for fast
> lookup), then map these to your own data class or collection or
> similar to hold your other information: products, expiry etc.
> By using your own data class, it's then easy to extend it later if you
> want to add additional parameters. (for example: HashMap<String,
> MyDataClass>)
>
> When a search comes in, the user is looked up to retrieve the data
> class, then its contents (as defined by you) are examined and the query
> is processed/filtered appropriately.
>
> You'll need a bootstrap mechanism for populating the list in the first
> place. One thing worth looking at is lazy loading - i.e. the first
> time a user does a search (you look up the user in the table, and it
> isn't there), you load the data class (maybe from your DB, a file, or
> index), then add it to the table. This is good if you have tens of
> thousands or millions of users, but only a handful are actually
> searching, some perhaps very rarely.
>
> If you do have millions of users, and your data class has heavy
> requirements (e.g. many thousands of products + info etc.), you might
> want to 'time-out' in-memory table entries, if the table gets really
> huge - it depends on the usage of your system. (you can run a
> synchronized cleanup thread to do this if you deem it necessary).
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Peter Sturge <pe...@gmail.com>.
You'll need to be a bit careful using joins, as the performance hit
can be significant if you have lots of cross-referencing to do, which
I believe you would given your scenario.
Your table could be set up to use the username as the key (for fast
lookup), then map these to your own data class or collection or
similar to hold your other information: products, expiry etc.
By using your own data class, it's then easy to extend it later if you
want to add additional parameters. (for example: HashMap<String,
MyDataClass>)
When a search comes in, the user is looked up to retrieve the data
class, then its contents (as defined by you) are examined and the query
is processed/filtered appropriately.
You'll need a bootstrap mechanism for populating the list in the first
place. One thing worth looking at is lazy loading - i.e. the first
time a user does a search (you look up the user in the table, and it
isn't there), you load the data class (maybe from your DB, a file, or
index), then add it to the table. This is good if you have tens of
thousands or millions of users, but only a handful are actually
searching, some perhaps very rarely.
If you do have millions of users, and your data class has heavy
requirements (e.g. many thousands of products + info etc.), you might
want to 'time-out' in-memory table entries, if the table gets really
huge - it depends on the usage of your system. (you can run a
synchronized cleanup thread to do this if you deem it necessary).
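A minimal Java sketch of the table described above, under a few assumptions: UserProducts and UserTable are hypothetical names, the loader function stands in for whatever DB or file lookup you use, and the current time is passed in explicitly so the expiry logic is easy to follow (real code would more likely call System.currentTimeMillis()). Entries are loaded lazily on first lookup, and idle ones can be dropped by a periodic cleanup call.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-user data class; extend it with expiry dates etc. as needed. */
final class UserProducts {
    final Set<Long> productIds;
    volatile long lastAccessMillis;

    UserProducts(Set<Long> productIds, long now) {
        this.productIds = productIds;
        this.lastAccessMillis = now;
    }
}

/** In-memory table keyed by username, filled lazily on the first search. */
final class UserTable {
    private final ConcurrentHashMap<String, UserProducts> table = new ConcurrentHashMap<>();
    private final Function<String, Set<Long>> loader; // stand-in for the DB lookup
    private final long ttlMillis;

    UserTable(Function<String, Set<Long>> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    /** Looks the user up; loads and caches the entry if it is not in the table yet. */
    UserProducts get(String username, long now) {
        UserProducts entry = table.computeIfAbsent(
                username, u -> new UserProducts(loader.apply(u), now));
        entry.lastAccessMillis = now;
        return entry;
    }

    /** Drops entries idle for longer than the TTL; returns how many were removed. */
    int evictIdle(long now) {
        int removed = 0;
        for (Iterator<UserProducts> it = table.values().iterator(); it.hasNext();) {
            if (now - it.next().lastAccessMillis > ttlMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return table.size();
    }
}
```

A scheduled task could call evictIdle every few minutes. Note that idle eviction alone does not refresh an active user's entry, so a purchase that expires would still need an explicit reload or removal.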
On Fri, Jun 17, 2011 at 6:06 AM, Sujatha Arun <su...@gmail.com> wrote:
> Alexey,
>
> Do you mean that we have current Index as it is and have a separate core
> which has only the user-id ,product-id relation and at while querying ,do a
> join between the two cores based on the user-id.
>
>
> This would involve us to Index/delete the product as and when the user
> subscription for a product changes ,This would involve some amount of
> latency if the Indexing (we have a queue system for Indexing across the
> various instances) or deletion is delayed
>
> IF we want to go ahead with this solution ,We currently are using solr 1.3
> , so is this functionality available as a patch for solr 1.3?Would it be
> possible to do with a separate Index instead of a core ,then I can create
> only one Index common for all our instances and then use this instance to
> do the join.
>
> Thanks
> Sujatha
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Alexey,
We are not planning to upgrade our Solr version at the moment, as all is fine
with the current version so far, and hence we would not be able to try this
solution.
Regards
Sujatha
On Fri, Jun 17, 2011 at 3:47 PM, Alexey Serba <as...@gmail.com> wrote:
> > Do you mean that we have current Index as it is and have a separate core
> > which has only the user-id ,product-id relation and at while querying ,do a
> > join between the two cores based on the user-id.
> Exactly. You can index user-id, product-id relation either to the same
> core or to different core on the same Solr instance.
>
> > This would involve us to Index/delete the product as and when the user
> > subscription for a product changes ,This would involve some amount of
> > latency if the Indexing (we have a queue system for Indexing across the
> > various instances) or deletion is delayed
> Right, but I'm not sure if it's possible to achieve good performance
> requiring zero latency.
>
> > IF we want to go ahead with this solution ,We currently are using solr 1.3
> > , so is this functionality available as a patch for solr 1.3?
> No. AFAIK it's in trunk only.
>
> > Would it be
> > possible to do with a separate Index instead of a core ,then I can create
> > only one Index common for all our instances and then use this instance to
> > do the join.
> No, I don't think that's possible with join feature. I guess that
> would require network request per search req and number of mapped ids
> could be huge, so it could affect performance significantly.
>
> > You'll need to be a bit careful using joins, as the performance hit
> > can be significant if you have lots of cross-referencing to do, which
> > I believe you would given your scenario.
> As far as I understand join query would build bitset filter which can
> be cached in filterCache, etc. The only performance impact I can think
> of is that user-product relations table could be too big to fit into
> single instance.
>
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Alexey Serba <as...@gmail.com>.
> Do you mean that we have current Index as it is and have a separate core
> which has only the user-id ,product-id relation and at while querying ,do a
> join between the two cores based on the user-id.
Exactly. You can index the user-id, product-id relation either into the same
core or into a different core on the same Solr instance.
> This would involve us to Index/delete the product as and when the user
> subscription for a product changes ,This would involve some amount of
> latency if the Indexing (we have a queue system for Indexing across the
> various instances) or deletion is delayed
Right, but I'm not sure if it's possible to achieve good performance
requiring zero latency.
> IF we want to go ahead with this solution ,We currently are using solr 1.3
> , so is this functionality available as a patch for solr 1.3?
No. AFAIK it's in trunk only.
> Would it be
> possible to do with a separate Index instead of a core ,then I can create
> only one Index common for all our instances and then use this instance to
> do the join.
No, I don't think that's possible with the join feature. I guess that
would require a network request per search request, and the number of mapped
ids could be huge, so it could affect performance significantly.
> You'll need to be a bit careful using joins, as the performance hit
> can be significant if you have lots of cross-referencing to do, which
> I believe you would given your scenario.
As far as I understand, a join query would build a bitset filter which can
be cached in filterCache, etc. The only performance impact I can think
of is that the user-product relations table could be too big to fit into
a single instance.
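To picture the caching point: a filter can be held as a bitset with one bit per document, computed once and then reused for later queries. The toy class below uses java.util.BitSet and a plain map; it only illustrates the idea and is not Solr's actual filterCache implementation. All names are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy filter cache: filter string -> bitset of matching document numbers. */
final class ToyFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();

    /** Returns the cached bitset, computing it only on the first request. */
    BitSet filterFor(String filterQuery, Supplier<BitSet> compute) {
        return cache.computeIfAbsent(filterQuery, k -> compute.get());
    }

    /** Restricts the query matches to the filter without mutating either input. */
    static BitSet apply(BitSet queryMatches, BitSet filter) {
        BitSet result = (BitSet) queryMatches.clone();
        result.and(filter);
        return result;
    }
}
```

The expensive part (working out which documents a user may see) happens once per distinct filter; every later search with the same filter is a cheap bitwise AND.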
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Sujatha Arun <su...@gmail.com>.
Alexey,
Do you mean that we keep the current index as it is and have a separate core
that holds only the user-id, product-id relation, and at query time do a
join between the two cores based on the user-id?
This would require us to index/delete the product as and when the user's
subscription for a product changes. This would involve some amount of
latency if the indexing (we have a queue system for indexing across the
various instances) or deletion is delayed.
If we want to go ahead with this solution: we are currently using Solr 1.3,
so is this functionality available as a patch for Solr 1.3? Would it be
possible to do this with a separate index instead of a core? Then I could
create only one index common to all our instances and use that instance to
do the join.
Thanks
Sujatha
On Thu, Jun 16, 2011 at 9:27 PM, Alexey Serba <as...@gmail.com> wrote:
> > So a search for a product once the user logs in and searches for only the
> > products that he has access to Will translate to something like this . ,the
> > product ids are obtained from the db for a particular user and can run
> > into n number.
> >
> > <search term> &fq=product_id(100 10001 ......n number)
> >
> > but we are currently running into too many Boolean expansion error .We are
> > not able to tie the user also into roles as each user is mainly any one who
> > comes to site and purchases a product .
>
> I'm wondering if new trunk Solr join functionality can help here.
>
> * http://wiki.apache.org/solr/Join
>
> In theory you can index your products (product_id, ...) and
> user_id-product many-to-many relation (user_product_id, user_id) into
> single/different cores and then do a join, like
> q=search terms&fq={!join from=product_id to=user_product_id}user_id:10101
>
> But I haven't tried that, so I'm just speculating.
>
Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)
Posted by Alexey Serba <as...@gmail.com>.
> So a search for a product once the user logs in and searches for only the
> products that he has access to Will translate to something like this . ,the
> product ids are obtained from the db for a particular user and can run
> into n number.
>
> <search term> &fq=product_id(100 10001 ......n number)
>
> but we are currently running into too many Boolean expansion error .We are
> not able to tie the user also into roles as each user is mainly any one who
> comes to site and purchases a product .
I'm wondering if new trunk Solr join functionality can help here.
* http://wiki.apache.org/solr/Join
In theory you can index your products (product_id, ...) and
user_id-product many-to-many relation (user_product_id, user_id) into
single/different cores and then do a join, like
q=search terms&fq={!join from=product_id to=user_product_id}user_id:10101
But I haven't tried that, so I'm just speculating.
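For anyone who wants to experiment, here is a small Java sketch that assembles such a request string with proper URL encoding. JoinQueryUrl is a hypothetical helper, and the field names are the ones from this thread. As far as I understand the join parser, `from` is the field on the documents matched by the inner query (here the relation documents) and `to` is the field on the documents that are returned; treat that as an assumption and double-check the direction against the wiki page before relying on it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the q/fq parameter string for a join filter. */
final class JoinQueryUrl {

    /** Returns "q=...&fq=..." with both parameter values URL-encoded. */
    static String build(String terms, String fromField, String toField, long userId) {
        String fq = "{!join from=" + fromField + " to=" + toField + "}user_id:" + userId;
        return "q=" + encode(terms) + "&fq=" + encode(fq);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

For example, JoinQueryUrl.build("search terms", "user_product_id", "product_id", 10101L) produces a string you can append to the select URL of a Solr build that includes the join feature.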