Posted to solr-user@lucene.apache.org by Jos Janssen <jo...@websdesign.nl> on 2010/11/22 12:05:59 UTC

SOLR and secure content

Hi,

We are currently investigating how to set up a Solr server that fits our
goals.
The problem I'm running into is how to design the Solr setup so that we can
check whether a user is authorized to view a document.  Let me explain
the situation.

We have a website with some pages and documents which are accessible to
everyone (public).
We also have an extranet; its pages and documents are not
accessible to everyone.
In this extranet we have different user groups. Access is defined by the user
group.

What I'm looking for is some best practices for designing/configuring a Solr
setup for this situation.
I searched the internet but could not find any examples or documentation for
this situation.

Maybe I'm not looking for the right documentation, which is why I'm posting this
message.
Can someone give me some information on this?

Regards,

Jos 


-- 
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1945028.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR and secure content

Posted by Dennis Gearon <ge...@sbcglobal.net>.
Solr basically does ONE thing (and related things) very well. Doing all the 
error messaging that you want would be fighting all the specialization built 
into the Solr/Lucene code.

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.




Re: SOLR and secure content

Posted by Jos Janssen <jo...@websdesign.nl>.
Dennis,

We will be serving the content to the indexed websites. As I wrote, we will be
looking into setting up different cores, one core for each website. This will
make sure the content is separated for each individual indexed website.
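
One way to picture the core-per-website idea from the application side is the minimal Python sketch below. The site names, core names and the `SITE_CORES` table are made up for illustration, and the URL layout assumes a multi-core Solr where each core is reachable under its own path. The point is only that the application picks the core from a fixed table and never from user input.

```python
from urllib.parse import quote, urlencode

# Hypothetical mapping of website -> Solr core name (illustrative only).
SITE_CORES = {
    "www.example.com": "site_example",
    "intranet.example.org": "site_intranet",
}


def select_url(solr_base, site, params):
    """Build the /select URL for the core that belongs to `site`.

    Raises ValueError for unknown sites, so a request can never fall
    through to another website's core by accident.
    """
    try:
        core = SITE_CORES[site]
    except KeyError:
        raise ValueError(f"no core configured for site {site!r}") from None
    return f"{solr_base.rstrip('/')}/{quote(core, safe='')}/select?{urlencode(params)}"
```

A call such as `select_url("http://localhost:8983/solr/", "www.example.com", {"q": "report"})` would then only ever address the `site_example` core.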

The so-called "error" handling is only needed in case of bad programming on
the client side, to make sure the response/result won't contain content that
should not be returned for those parameters.
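
A rough sketch of that kind of safety net, done in the application layer rather than inside Solr. It assumes every indexed document carries a `website_hash` field and a multi-valued `access_groups` field (both names are hypothetical) and that these fields are returned with the results. It is a last line of defence against a badly built query, not a replacement for filtering in the query itself.

```python
def split_by_access(docs, website_hash, user_groups):
    """Split result documents into (safe, rejected) lists.

    A document is safe only if it belongs to the requesting website and
    shares at least one group with the requesting user.
    """
    allowed = set(user_groups)
    safe, rejected = [], []
    for doc in docs:
        same_site = doc.get("website_hash") == website_hash
        doc_groups = doc.get("access_groups") or []
        shares_group = bool(allowed.intersection(doc_groups))
        if same_site and shares_group:
            safe.append(doc)
        else:
            rejected.append(doc)
    return safe, rejected
```

Anything that ends up in the rejected list points at a bug in the client code and is probably worth logging.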

I hope this clarifies my goal.

Regards,

Jos
-- 
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1956807.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR and secure content

Posted by Jos Janssen <jo...@websdesign.nl>.

The setup of multiple cores is a good option, thanks for the advice.

I agree the "required" field check should be in the application layer, but I also
think some "error" handling should come from the Solr server to prevent
incorrect usage. If only I knew how to do this for each request.

Regards,

Jos
-- 
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1953726.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR and secure content

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Sounds like a good plan. I'd probably also set up separate cores, one for each
website. This could give you more accurate result scoring.

Good question about the "required" configuration option... any input?
On the other hand, this is a rule which seems to fit better in your
application's validation layer than in Solr.
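
As a sketch of what such a validation layer could look like for the indexing side, here is a small Python check that runs before a document is sent to Solr. The field names `website_hash` and `access_groups` are assumptions made for illustration. Solr's schema can also mark a field as required at index time, which could back this up on the server, but that only covers adds and not queries or deletes.

```python
# Hypothetical field names; adjust to whatever the real schema uses.
REQUIRED_FIELDS = ("id", "website_hash", "access_groups")


def validate_document(doc):
    """Return a list of problems; an empty list means the document may be indexed."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = doc.get(field)
        if value is None or value == "" or value == []:
            errors.append(f"missing required field: {field}")
    groups = doc.get("access_groups")
    if groups and not isinstance(groups, list):
        errors.append("access_groups must be a list of group names")
    return errors
```

The application would refuse to send any document for which the returned list is not empty.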

Re: SOLR and secure content

Posted by Dennis Gearon <ge...@sbcglobal.net>.
I can see no reason to keep separate websites' information in the same index. If 
it's not being served to a website at all, why have data from another website in 
'accidental' proximity to it? Someday, a coder WILL make a mistake, or a library 
upgrade will allow access.

Best to at least sort the data, and access to it, along easy-to-define borders 
like websites.

 Dennis Gearon



Re: SOLR and secure content

Posted by Geert-Jan Brits <gb...@gmail.com>.
> When making a query these fields should be required. Is it possible to
> configure handlers on the solr server so that these fields are required with
> each type of query? So for adding documents, deleting and querying?

Have a look at 'invariants' (and 'appends') in the example solrconfig.
They can be defined per request handler and do exactly what you describe (at
least for the search side of things).
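
To illustrate how these parameter sets behave, here is a toy Python model of the merge (it is not Solr code, only a sketch of the semantics): 'defaults' apply when the request leaves a parameter out, 'appends' are added on top of whatever the request sends, and 'invariants' replace whatever the request sends. The `website_hash` filter and the parameter values are made-up examples.

```python
def effective_params(request, defaults, appends, invariants):
    """Model how a request handler combines configured parameters with a request.

    All arguments map a parameter name to a list of values.
    """
    merged = {name: list(values) for name, values in defaults.items()}
    # Values sent by the client replace the configured defaults.
    for name, values in request.items():
        merged[name] = list(values)
    # Appended values are added in addition to the client's own values.
    for name, values in appends.items():
        merged.setdefault(name, []).extend(values)
    # Invariants win over anything the client sent.
    for name, values in invariants.items():
        merged[name] = list(values)
    return merged
```

With `appends={"fq": ["website_hash:abc123"]}`, a client that sends its own `fq` still gets the website filter added, and with `invariants={"fl": ["id,title"]}` a client asking for `fl=*` still only gets `id` and `title` back.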

Cheers,
Geert-Jan

Re: SOLR and secure content

Posted by Jos Janssen <jo...@websdesign.nl>.
Hi everyone,

This is how we think we should set it up.

Situation:
- Multiple websites indexed on 1 Solr server
- Results should be separated for each website
- Search results should be filtered on group access

Solution I think is possible with Solr:
- The Solr server should only be accessed through an API which we will write in PHP.
- Solr server authentication will be defined through IP address on the server side,
and a username and password will be sent through the API for each different
website.
- Extra document fields in the Solr server will contain:
1. Website hash to identify and filter results for each different website
(website authentication)
2. List of groups who can access the document (group authentication)

When making a query these fields should be required. Is it possible to
configure handlers on the Solr server so that these fields are required with
each type of request? So for adding documents, deleting and querying?

Am I correct? Any further advice is welcome.

Regards,

Jos



-- 
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1953071.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR and secure content

Posted by Robert Muir <rc...@gmail.com>.
On Tue, Nov 23, 2010 at 5:26 AM, Peter Sturge <pe...@gmail.com> wrote:
> Document-level access control can be a real 'can of worms', and it can
> be worthwhile spending a bit of time defining exactly what you need.

I agree, "document-level access control" is an anti-feature.

You can't just give someone access to a subset of documents, yet still
give them access to a huge amount of statistics about documents they
don't have access to (e.g. ranking based on IDF, spellchecker,
autosuggest, ...), especially since text tends to follow certain nice
statistical properties.

You must give someone access to an entire inverted index, or no access
to an inverted index at all. Otherwise they can use information from
documents they do have access to, to start reconstructing documents
they don't.
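
To make the statistics point concrete, here is a toy Python illustration. It uses a textbook-style IDF, 1 + ln(N / (df + 1)), which is an assumption for the sake of the example and not necessarily the exact formula a given Solr version applies; the documents are invented. The weight a user observes for a term depends on documents they cannot see.

```python
import math


def idf(term, docs):
    """Textbook-style inverse document frequency over a list of word sets."""
    doc_freq = sum(1 for words in docs if term in words)
    return 1.0 + math.log(len(docs) / (doc_freq + 1))


# Documents the user is allowed to see: "merger" occurs in one of four.
visible = [
    {"budget", "merger"},
    {"holiday", "schedule"},
    {"budget", "meeting"},
    {"office", "move"},
]
# Documents the user must not see: "merger" occurs in all of them.
hidden = [
    {"merger", "timeline"},
    {"merger", "legal"},
    {"merger", "price"},
    {"merger", "staff"},
]

idf_visible_only = idf("merger", visible)        # what the user would expect
idf_whole_index = idf("merger", visible + hidden)  # what a shared index uses
```

Here `idf_whole_index` comes out lower than `idf_visible_only`, so scores computed on the shared index tell the user that "merger" is far more common than their own documents suggest.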

Re: SOLR and secure content

Posted by Peter Sturge <pe...@gmail.com>.
Yes, as mentioned in the above link, there's SOLR-1872 for maintaining
your own document-level access control. Also, if you have access to
the file system documents and want to use their existing ACLs, have a
look at SOLR-1834.
Document-level access control can be a real 'can of worms', and it can
be worthwhile spending a bit of time defining exactly what you need.

Thanks,
Peter



On Mon, Nov 22, 2010 at 11:58 PM, Savvas-Andreas Moysidis
<sa...@googlemail.com> wrote:
> maybe this older thread on Modeling Access Control might help:
>
> http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1761482

Re: SOLR and secure content

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
maybe this older thread on Modeling Access Control might help:

http://lucene.472066.n3.nabble.com/Modelling-Access-Control-td1756817.html#a1761482

Regards,
-- Savvas

Re: SOLR and secure content

Posted by Jos Janssen <jo...@websdesign.nl>.
Hi,

We plan to build an application layer in PHP which will communicate with the
Solr server.

Direct calls will be made for administration purposes only.

regards,

jos
-- 
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-and-secure-content-tp1945028p1947970.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR and secure content

Posted by Savvas-Andreas Moysidis <sa...@googlemail.com>.
Hi,

Could you elaborate a bit more on how you access Solr? Are you making direct
Solr calls, or is the communication directed through an application layer?
